Installation¶
Requirements¶
- Java 17 or higher
- ONNX Runtime (included transitively)
Add the dependency¶
inference4j-core is the only dependency you need — it includes all task wrappers, preprocessing, and tokenizers.
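For Maven, the coordinates follow the same pattern used in the GPU snippet later on this page (substitute the latest release for the version property):

```xml
<dependency>
    <groupId>io.github.inference4j</groupId>
    <artifactId>inference4j-core</artifactId>
    <version>${inference4jVersion}</version>
</dependency>
```

For Gradle, the equivalent is a single `implementation 'io.github.inference4j:inference4j-core:<version>'` line.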
Generative AI¶
For text generation (Phi-3, DeepSeek-R1, etc.), add inference4j-genai instead:
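A sketch of the Maven coordinates, assuming the genai module shares the group and versioning scheme of `inference4j-core`:

```xml
<dependency>
    <groupId>io.github.inference4j</groupId>
    <artifactId>inference4j-genai</artifactId>
    <version>${inference4jVersion}</version>
</dependency>
```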
This is a separate module backed by onnxruntime-genai. See the Generative AI guide for details.
JVM flags¶
ONNX Runtime requires native access. Add this flag to your JVM arguments:
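On the command line this is passed directly to `java` (the application jar name here is a placeholder):

```bash
java --enable-native-access=ALL-UNNAMED -jar myapp.jar
```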
Or, if you're on the module path:
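On the module path, grant native access to the specific module rather than `ALL-UNNAMED`. The module and main-class names below are placeholders; check the `Automatic-Module-Name` declared by the ONNX Runtime jar you ship:

```bash
# <onnxruntime.module> is a placeholder -- use the module name declared by the ONNX Runtime jar
java --enable-native-access=<onnxruntime.module> -p mods -m com.example.app/com.example.app.Main
```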
Setting JVM flags in Gradle¶
```groovy
tasks.withType(JavaExec).configureEach {
    jvmArgs '--enable-native-access=ALL-UNNAMED'
}
tasks.withType(Test).configureEach {
    jvmArgs '--enable-native-access=ALL-UNNAMED'
}
```
```
Spring Boot¶
For Spring Boot applications, use the starter instead:
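A plausible set of coordinates, assuming the starter follows the naming convention of the other modules; verify the exact artifactId in the Spring Boot guide:

```xml
<dependency>
    <groupId>io.github.inference4j</groupId>
    <!-- assumed artifactId; confirm against the Spring Boot guide -->
    <artifactId>inference4j-spring-boot-starter</artifactId>
    <version>${inference4jVersion}</version>
</dependency>
```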
See the Spring Boot guide for configuration details.
GPU support¶
The default dependency includes CPU and CoreML (macOS) support. For CUDA (Linux/Windows), swap the ONNX Runtime dependency:
```xml
<dependency>
    <groupId>io.github.inference4j</groupId>
    <artifactId>inference4j-core</artifactId>
    <version>${inference4jVersion}</version>
    <exclusions>
        <exclusion>
            <groupId>com.microsoft.onnxruntime</groupId>
            <artifactId>onnxruntime</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>com.microsoft.onnxruntime</groupId>
    <artifactId>onnxruntime_gpu</artifactId>
    <version>${onnxruntimeVersion}</version>
</dependency>
```
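For Gradle builds, a sketch of the equivalent swap, assuming the same coordinates and that the version variables are defined in your build:

```groovy
implementation("io.github.inference4j:inference4j-core:${inference4jVersion}") {
    exclude group: 'com.microsoft.onnxruntime', module: 'onnxruntime'
}
implementation "com.microsoft.onnxruntime:onnxruntime_gpu:${onnxruntimeVersion}"
```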
See the Hardware Acceleration guide for usage details.