Sentiment Analysis¶
Classify text as positive or negative (or any custom label set) using a fine-tuned DistilBERT model.
Quick example¶
try (var classifier = DistilBertTextClassifier.builder().build()) {
List<TextClassification> results = classifier.classify("This movie was fantastic!");
// [TextClassification[label=POSITIVE, confidence=0.9998]]
}
Full example¶
import io.github.inference4j.nlp.DistilBertTextClassifier;
import io.github.inference4j.nlp.TextClassification;
import java.util.List;
public class SentimentAnalysis {
public static void main(String[] args) {
try (var classifier = DistilBertTextClassifier.builder().build()) {
List<String> reviews = List.of(
"This movie was fantastic!",
"Terrible experience, would not recommend.",
"It was okay, nothing special."
);
for (String review : reviews) {
List<TextClassification> results = classifier.classify(review);
TextClassification top = results.get(0);
System.out.printf("%-45s → %s (%.2f%%)%n",
review, top.label(), top.confidence() * 100);
}
}
}
}
Builder options¶
| Method | Type | Default | Description |
|---|---|---|---|
.modelId(String) |
String |
inference4j/distilbert-base-uncased-finetuned-sst-2-english |
HuggingFace model ID |
.modelSource(ModelSource) |
ModelSource |
HuggingFaceModelSource |
Model resolution strategy |
.sessionOptions(SessionConfigurer) |
SessionConfigurer |
default | ONNX Runtime session config |
.tokenizer(Tokenizer) |
Tokenizer |
auto-loaded WordPieceTokenizer |
Custom tokenizer |
.config(ModelConfig) |
ModelConfig |
auto-loaded from config.json |
Model config with labels |
.outputOperator(OutputOperator) |
OutputOperator |
auto-detected (softmax or sigmoid) | Output activation |
.maxLength(int) |
int |
512 |
Maximum token sequence length |
Result type¶
TextClassification is a record with:
| Field | Type | Description |
|---|---|---|
label() |
String |
Classification label (e.g., POSITIVE) |
classIndex() |
int |
Numeric class index |
confidence() |
float |
Confidence score (0.0 to 1.0) |
Using custom models¶
Any HuggingFace text classification model exported to ONNX will work, as long as it includes vocab.txt and config.json with id2label mappings.
try (var classifier = DistilBertTextClassifier.builder()
.modelId("your-org/your-model")
.build()) {
classifier.classify("Some text");
}
The output activation (softmax vs sigmoid) is auto-detected from config.json:
problem_type: "multi_label_classification"→ sigmoid- Everything else → softmax
Tips¶
- The default model is fine-tuned on SST-2 (movie reviews). For other domains (product reviews, support tickets), use a model fine-tuned on relevant data.
- Use
.classify(text, topK)to limit the number of returned classifications. - For multi-label classification (where multiple labels can be true simultaneously), use a model with
problem_type: "multi_label_classification"in itsconfig.json.