Skip to content

Machine Translation

Translate text between languages using MarianMT (fixed language pairs) or Flan-T5 (flexible, any-to-any).

Quick example

try (var translator = MarianTranslator.builder()
        .modelId("inference4j/opus-mt-en-fr")
        .build()) {
    String french = translator.translate("The weather is beautiful today.");
    System.out.println(french); // Le temps est beau aujourd'hui.
}
try (var translator = FlanT5TextGenerator.flanT5Base().build()) {
    String french = translator.translate("The weather is beautiful today.",
            Language.EN, Language.FR);
    System.out.println(french);
}

Full example

import io.github.inference4j.generation.GenerationResult;
import io.github.inference4j.nlp.MarianTranslator;

public class Translation {
    public static void main(String[] args) {
        try (var translator = MarianTranslator.builder()
                .modelId("inference4j/opus-mt-en-de")
                .maxNewTokens(200)
                .build()) {

            GenerationResult result = translator.translate(
                    "Machine learning is transforming how we build software.",
                    token -> System.out.print(token));

            System.out.println();
            System.out.printf("%d tokens in %,d ms%n",
                    result.generatedTokens(), result.duration().toMillis());
        }
    }
}

Flexible translation with Flan-T5

FlanT5TextGenerator implements the Translator interface and can translate between any pair of languages using a single model:

import io.github.inference4j.nlp.FlanT5TextGenerator;
import io.github.inference4j.nlp.Language;

try (var translator = FlanT5TextGenerator.flanT5Base()
        .maxNewTokens(200)
        .build()) {

    // English to French
    String french = translator.translate("Hello, how are you?",
            Language.EN, Language.FR);

    // English to German
    String german = translator.translate("Hello, how are you?",
            Language.EN, Language.DE);

    // French to Spanish
    String spanish = translator.translate("Bonjour, comment allez-vous?",
            Language.FR, Language.ES);
}

Supported languages

The Language enum provides constants for the most widely spoken languages. More languages will be added in future releases.

Constant Language
EN English
FR French
DE German
ES Spanish
PT Portuguese
PT_BR Brazilian Portuguese
IT Italian
NL Dutch
CA Catalan
SV Swedish
DA Danish
NO Norwegian
FI Finnish
PL Polish
CS Czech
HR Croatian
RO Romanian
RU Russian
UK Ukrainian
TR Turkish
JA Japanese
KO Korean
AR Arabic
ZH_CN Chinese Simplified
ZH_TW Chinese Traditional
HI Hindi

Each constant provides displayName() (e.g., "Brazilian Portuguese") and isoCode() (e.g., "pt-br").

Builder options

Method Type Default Description
.modelId(String) String — (required for MarianMT) HuggingFace model ID (e.g., inference4j/opus-mt-en-fr)
.modelSource(ModelSource) ModelSource HuggingFaceModelSource Model resolution strategy
.sessionOptions(SessionConfigurer) SessionConfigurer default ONNX Runtime session config
.tokenizerProvider(TokenizerProvider) TokenizerProvider SentencePieceBpeTokenizer Tokenizer construction strategy
.maxNewTokens(int) int 256 Maximum tokens to generate
.temperature(float) float 0.0 Sampling temperature
.topK(int) int 0 (disabled) Top-K sampling
.topP(float) float 0.0 (disabled) Nucleus sampling
.eosTokenId(int) int Auto-detected End-of-sequence token ID
.addedToken(String) String Register a special token for atomic encoding

Result type

GenerationResult is a record with:

Field Type Description
text() String The translated text
promptTokens() int Number of tokens in the input
generatedTokens() int Number of tokens generated
duration() Duration Wall-clock generation time

The convenience method translate(text) returns the translation as a plain String.

Using your own MarianMT model

The pre-exported models under inference4j/opus-mt-* work out of the box. If you want to use a different MarianMT language pair (e.g., Helsinki-NLP/opus-mt-en-ja), you'll need to export it yourself.

MarianTranslator expects the model directory to contain:

File Description
encoder_model.onnx Encoder ONNX model
decoder_model.onnx Decoder ONNX model
decoder_with_past_model.onnx Decoder with KV cache
config.json Model configuration
tokenizer.json HuggingFace fast tokenizer format

MarianMT models require tokenizer conversion

MarianMT models on HuggingFace ship with SentencePiece files (source.spm, target.spm) instead of tokenizer.json. You must build tokenizer.json using the model's vocab.json for vocabulary IDs and source.spm for BPE merges.

This is important because MarianMT merges source and target SentencePiece vocabularies into a shared vocab.json with ~65K entries. The raw SentencePieceExtractor produces SPM-internal IDs (0–31999) which differ from the model's actual IDs, so you must use vocab.json for the vocabulary mapping and only extract BPE merges from the SPM model.

import json
from huggingface_hub import hf_hub_download
from optimum.exporters.onnx import main_export
from transformers.convert_slow_tokenizer import SentencePieceExtractor
from tokenizers import Tokenizer
from tokenizers.models import BPE

model_id = "Helsinki-NLP/opus-mt-en-ja"

# 1. Export ONNX models
main_export(
    model_name_or_path=model_id,
    output="my-model/",
    task="text2text-generation-with-past",
)

# 2. Build tokenizer.json from vocab.json + source.spm merges
with open("my-model/vocab.json") as f:
    model_vocab = json.load(f)

extractor = SentencePieceExtractor("my-model/source.spm")
_, merges = extractor.extract(None)

tokenizer = Tokenizer(BPE(model_vocab, merges, unk_token="<unk>"))
tokenizer.save("my-model/tokenizer.json")

Only standard opus-mt-* models are supported. The newer opus-mt-tc-big-* variants require target language prefixes (e.g., >>por<<) which MarianTranslator does not handle.

Tips

  • MarianMT models are specialized for a single language pair (e.g., opus-mt-en-fr for English→French). They produce higher quality translations for their specific pair but require a separate model per direction.
  • Flan-T5 handles any language pair with a single model, making it more flexible but generally lower quality than a dedicated pair-specific model.
  • For bidirectional translation, you need two MarianMT models (e.g., opus-mt-en-fr and opus-mt-fr-en) — or use Flan-T5 which handles both directions.
  • Use greedy decoding (default temperature=0) for translation — sampling adds noise without improving quality.