Configuration¶

Model cache¶

Models are downloaded from HuggingFace and cached locally. The cache directory is resolved in this order:

Priority	Method	Example
1	Constructor parameter	`new HuggingFaceModelSource(Path.of("/cache"))`
2	System property	`-Dinference4j.cache.dir=/path/to/cache`
3	Environment variable	`INFERENCE4J_CACHE_DIR=/path/to/cache`
4	Default	`~/.cache/inference4j/`

JVM flags¶

ONNX Runtime requires native access:

--enable-native-access=ALL-UNNAMED

Or, on the module path:

--enable-native-access=com.microsoft.onnxruntime

System properties¶

Property	Description	Default
`inference4j.cache.dir`	Model cache directory	`~/.cache/inference4j/`

Environment variables¶

Variable	Description	Default
`INFERENCE4J_CACHE_DIR`	Model cache directory	`~/.cache/inference4j/`

Spring Boot properties¶

See the Spring Boot guide for the full list of inference4j.* application properties.

ONNX Runtime session options¶

Session-level configuration is set via .sessionOptions() on each builder:

.sessionOptions(opts -> {
    opts.addCoreML();                                      // execution provider
    opts.setIntraOpNumThreads(4);                          // parallelism
    opts.setOptimizationLevel(SessionOptions.OptLevel.ALL_OPT); // graph optimization
})

See the Hardware Acceleration guide for execution provider details.