
Integrating AI & Translation: External APIs and Service Layer Design

·1648 words·8 mins
Morten Jensen
Former chef with over 20 years in professional kitchens, now studying computer science
MiseOS Development - This article is part of a series.
Part 4: This Article

Week four. With the persistence layer finished, I moved up the stack into infrastructure and orchestration.

This week was about integration and architectural responsibility.

I integrated two external APIs:

  • DeepL for deterministic translation
  • Google Gemini 2.5 Flash for AI-powered ingredient normalization

At the same time, I began shaping a proper Service Layer — separating orchestration from domain logic and pushing business rules down into entities where they belong.


Remember Last Week’s Question?

Last week I wrote:

“If 3 cooks request ‘løg’, ‘onions’, and ‘Onion’, the Head Chef sees all three and creates one aggregated item ‘Onions — 15kg’. This is their job anyway.”

But then I thought: What if there are 40 ingredients? What if someone types “carots” (typo)? What if the staff is bilingual and half write in Danish, half in English?

Manual aggregation doesn’t scale. String comparison is cheap. Semantic understanding is not. That’s where AI becomes useful.

And then there’s the menu translation problem. The public menu (US-11, US-12) needs to display in both Danish and English. I could force Line Cooks to write everything twice, but that’s tedious and error-prone.

Time to automate both problems.


Problem 1: Menu Translation (DeepL Integration)

Requirement (US-12):
The public menu must be displayed in both Danish and English.
Line Cooks create dishes in Danish only.

Instead of forcing staff to write everything twice, I integrated the DeepL Translation API and automated the process.

Endpoint: POST https://api-free.deepl.com/v2/translate

Architecture Decision

I designed the DeepLTranslationClient as a thin infrastructure component. It has one job: talk to the API. It doesn’t know what a “Dish” is; it only knows how to turn a String into a translated String.
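To make the "one job" idea concrete, here is a minimal sketch of what such a thin client could look like. The names (sendRaw, buildRequestBody) and the hand-rolled JSON body are illustrative only; the real client serializes the Jackson DTOs shown below.

```java
// Hypothetical sketch of a thin infrastructure client.
// Assumes java.net.http; the hand-built JSON body stands in for Jackson serialization.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.stream.Collectors;

class DeepLTranslationClient {

    private static final String ENDPOINT = "https://api-free.deepl.com/v2/translate";
    private final HttpClient http = HttpClient.newHttpClient();
    private final String apiKey;

    DeepLTranslationClient(String apiKey) {
        this.apiKey = apiKey;
    }

    // Builds the payload DeepL expects: a list of texts and a target language
    static String buildRequestBody(List<String> texts, String targetLang) {
        String joined = texts.stream()
            .map(t -> "\"" + t.replace("\"", "\\\"") + "\"")
            .collect(Collectors.joining(","));
        return "{\"text\":[" + joined + "],\"target_lang\":\"" + targetLang + "\"}";
    }

    // One job: send Strings, get translated Strings back. No knowledge of "Dish".
    String sendRaw(List<String> texts, String targetLang) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(ENDPOINT))
            .header("Authorization", "DeepL-Auth-Key " + apiKey)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(buildRequestBody(texts, targetLang)))
            .build();
        return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

Because the client knows nothing about dishes or menus, swapping DeepL for another provider would only touch this one class.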

DeepL DTO Design

DeepL expects:

  • A list of texts
  • A target language
// DTOs for DeepL
public record DeepLRequestDTO(
    @JsonProperty("text") List<String> text,
    @JsonProperty("target_lang") String targetLanguage
) {}

@JsonIgnoreProperties(ignoreUnknown = true)
public record DeepLResponseDTO(
    @JsonProperty("translations") List<TranslationDTO> translations
) {}

public record TranslationDTO(String text) {}

Key takeaway: @JsonIgnoreProperties(ignoreUnknown = true) is a lifesaver. It prevents the ObjectMapper from crashing if DeepL decides to add new analytics fields to their response payload in the future. The client is also modular: for now the MVP only passes “EN” as the target-language argument, but in the future we could easily expand to more languages without changing the client code.

Usage after implementation of DeepL client:

  • Line Cooks write dish suggestions in Danish

  • MenuService calls DishTranslationService, which calls DeepLTranslationClient to translate the dish name and description into English

  • English translations are stored when the menu is published, avoiding unnecessary load on the translation API

  • The public menu supports multiple languages

This keeps the domain clean while external communication stays isolated in the infrastructure layer.

One important distinction:

DeepL is deterministic. Given the same input and target language, it will always return the same translation. This makes it ideal for user-facing content like menus, where predictability matters.

Gemini, on the other hand, is probabilistic — which required stricter defensive programming (covered below).


Problem 2: Ingredient Normalization

The Issue: Three cooks request the same ingredient with different spellings for the weekly shopping list:

  • “løg” (Danish)
  • “onions” (English, plural)
  • “Onion” (capitalized)

If we just group these by string matching, the head chef gets a messy list with three different entries for onions. I needed automated, semantic normalization.

Solution: Use Google Gemini 2.5 Flash to normalize ingredient names based on culinary standards.

However, integrating an LLM into a strongly typed Java backend is not trivial. LLMs are conversational by nature. My backend needs deterministic JSON. Bridging that gap required strict payload modeling and disciplined prompt engineering.

Endpoint: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent

Here is how I structured the request and response handling to get clean JSON output from Gemini, which I can then deserialize directly into Java objects:

The Request: What Google Expects

Google’s Generative Language API is multimodal (it can accept text, images, and video). Because of this, even a simple text prompt cannot just be sent as {"prompt": "Hello"}. It must be wrapped in a deeply nested hierarchy.

I modeled this strict structure using Java Records:

public record GeminiRequest(List<Content> contents) {}
public record Content(List<Part> parts) {}
public record Part(String text) {}

To build the HTTP request payload, the prompt should be wrapped in these layers:

private GeminiRequest buildGeminiRequest(String prompt) {
    // Wrapping the prompt in the required multimodal structure
    return new GeminiRequest(List.of(new Content(List.of(new Part(prompt)))));
}
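Serialized with Jackson, the records above produce the nested payload shape Google expects (a sketch of the minimal structure, omitting optional fields such as generation settings):

```json
{
  "contents": [
    { "parts": [ { "text": "Normalize these ingredient names..." } ] }
  ]
}
```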

The Art of the Prompt

To make the AI work as a reliable data-transformer, I had to be extremely strict in the prompt. I created a NormalizeTextPromptBuilder to encapsulate this logic:

public class NormalizeTextPromptBuilder {
    public static String buildNormalizeTextPrompt(String ingredientsJson, String languageName) {
        return String.format(
            """
            Normalize these ingredient names to standard %s culinary terminology.

            Return ONLY valid JSON in this exact format:
            {"ingredient1": "Normalized1", "ingredient2": "Normalized2"}

            Rules:
            - Singular form
            - Capitalize first letter
            - Fix spelling
            - Translate to %s
            - NO markdown, NO explanation

            Ingredients: %s

            JSON:""",
            languageName,
            languageName,
            ingredientsJson
        );
    }
}

Why this prompt works:

  • Schema anchoring — Providing the exact JSON structure reduces hallucinated formats.
  • Explicit transformation rules — Singular form, capitalization, spelling correction, and translation make the output predictable.
  • Token priming (“JSON:”) — Ending with JSON: biases the first generated token toward {, reducing conversational filler.

Treating the LLM as a strict data transformer — not a chatbot — was the key architectural shift.

The Response: Extracting the Data

Just like the request, Gemini’s response is buried deep within a nested structure: Response -> Candidates -> Content -> Parts -> Text.

@JsonIgnoreProperties(ignoreUnknown = true)
public record GeminiResponse(
    @JsonProperty("candidates")
    List<Candidate> candidates,

    @JsonProperty("usageMetadata")
    UsageMetadata usageMetadata
) {}

Even with strict instructions, Gemini frequently wrapped the JSON in Markdown code fences.
Since Jackson cannot parse markdown, I added a defensive sanitation step inside the client:

// 1. Traverse the nested DTOs to find the raw string
String geminiResponse = response.candidates().get(0).content().parts().get(0).text();

// 2. Strip the markdown backticks that crash Jackson
private String cleanGeminiResponse(String geminiResponse) {
    return geminiResponse.replace("```json", "").replace("```", "").trim();
}
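To show the sanitation step in isolation, here is a self-contained version of the same cleaning logic (GeminiResponseCleaner is a hypothetical standalone name for illustration; in MiseOS the method lives inside the Gemini client):

```java
class GeminiResponseCleaner {
    // Strips the markdown code fences Gemini sometimes adds,
    // leaving only the bare JSON object for Jackson to parse
    static String clean(String raw) {
        return raw.replace("```json", "").replace("```", "").trim();
    }
}
```

Fed a fenced response, it returns just the JSON object between the fences, with surrounding whitespace trimmed.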

Service-Level Usage

With the Client returning a clean JSON string, our AiService uses Jackson’s TypeReference to map the result directly into a Java Map<String, String>.

@Override
public Map<String, String> normalizeIngredientList(List<String> ingredients, String targetLanguage) {
    try {
        // 1. Build the prompt
        String prompt = NormalizeTextPromptBuilder.buildNormalizeTextPrompt(
            objectMapper.writeValueAsString(ingredients),
            targetLanguage
        );

        // 2. Get the clean JSON string from our dumb client
        String jsonResponse = aiClient.generateResponse(prompt);

        // 3. Map it to a dictionary (key: original, value: normalized)
        return objectMapper.readValue(jsonResponse, new TypeReference<Map<String, String>>() {});
    } catch (JsonProcessingException e) {
        throw new IllegalStateException("Failed to (de)serialize ingredient JSON", e);
    }
}

Usage: ShoppingListService passes ingredient strings through normalization. Result: “løg” + “onions” + “Onion” all map to “Onion” (or “Løg”, depending on target language), so quantities can be aggregated in plain Java without duplicates.
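That aggregation step needs no AI at all once the names are normalized. A minimal sketch (hypothetical names; assumes the Map<String, String> returned by the normalization service above):

```java
import java.util.LinkedHashMap;
import java.util.Map;

class IngredientAggregator {
    // Sums requested quantities per normalized name, so "løg" and "onions"
    // both land under the same canonical key, e.g. "Onion"
    static Map<String, Double> aggregate(Map<String, Double> requestedKg,
                                         Map<String, String> normalized) {
        Map<String, Double> totals = new LinkedHashMap<>();
        requestedKg.forEach((raw, kg) -> {
            // Fall back to the raw name if the AI returned no mapping for it
            String canonical = normalized.getOrDefault(raw, raw);
            totals.merge(canonical, kg, Double::sum);
        });
        return totals;
    }
}
```

With requests of 5 kg “løg”, 7 kg “onions”, and 3 kg “Onion” all normalized to “Onion”, the Head Chef sees a single 15 kg line item instead of three.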


Service Layer: Where Does Business Logic Live?

This is one of those architectural decisions that compounds over time.
If business rules live in services, they become scattered and bypassable.
If they live in entities, they become enforceable invariants.

The Deadline Problem

Business rule (US-06): Line Cooks can only submit dish suggestions before Thursday 12:00 of the week before the target week.

Target week 10 (Monday March 3) → Deadline is Thursday February 27.

Decision: Put Logic in Entity

@Entity
public class DishSuggestion {
    // Fields, constructor, etc.
    
    // Entity calculates its own deadline
    public LocalDate getDeadlineDate() {
        ValidationUtil.validateNotNull(targetWeek, "Target week");
        ValidationUtil.validateNotNull(targetYear, "Target year");

        LocalDate targetMonday = LocalDate.of(targetYear, 1, 1)
            .with(WeekFields.ISO.weekOfYear(), targetWeek)
            .with(WeekFields.ISO.dayOfWeek(), 1);

        return targetMonday.minusDays(4);  // Thursday before target week
    }
    
    // Entity enforces its own rules
    public void checkCreationAllowed(LocalDate today) {
        if (!today.isBefore(getDeadlineDate())) {
            throw new IllegalStateException("Deadline passed. Last chance was " + getDeadlineDate());
        }
    }
}
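A quick sanity check of the deadline math with plain java.time, using the week 10 example from above (assuming target year 2025, where ISO week 10 starts Monday, March 3):

```java
import java.time.LocalDate;
import java.time.temporal.WeekFields;

class DeadlineCheck {
    // Same calculation as the entity: Monday of the target ISO week,
    // minus 4 days, lands on the Thursday of the week before
    static LocalDate deadlineFor(int targetYear, int targetWeek) {
        LocalDate targetMonday = LocalDate.of(targetYear, 1, 1)
            .with(WeekFields.ISO.weekOfYear(), targetWeek)
            .with(WeekFields.ISO.dayOfWeek(), 1);
        return targetMonday.minusDays(4);
    }
}
```

deadlineFor(2025, 10) yields 2025-02-27, which is indeed a Thursday — matching the business rule.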

The Service Layer: The Orchestrator

The DishSuggestionService doesn’t know how to calculate a deadline. It just fetches the data, asks the entity if it’s allowed, and saves it.

public class DishSuggestionService {
    
    public DishSuggestionDTO submitSuggestion(DishCreateRequestDTO dto) {
        // Service validates IDs
        ValidationUtil.validateId(dto.stationId());
        ValidationUtil.validateId(dto.userCreatedById());

        // Service fetches related entities
        Station station = stationReader.getByID(dto.stationId());
        User user = userReader.getByID(dto.userCreatedById());

        // Entity validates authorization
        user.ensureIsKitchenStaff();

        // Service fetches allergens
        Set<Allergen> allergens = dto.allergenIds().stream()
            .map(allergenDAO::getByID)
            .collect(Collectors.toSet());

        // Create entity
        DishSuggestion dish = new DishSuggestion(
            dto.nameDA(),
            dto.descriptionDA(),
            dto.targetWeek(),
            dto.targetYear(),
            station,
            user,
            allergens
        );

        // Entity enforces deadline (service provides current date)
        dish.checkCreationAllowed(LocalDate.now());

        // Service persists
        DishSuggestion saved = dishSuggestionDAO.create(dish);
        return mapToDTO(saved);
    }
}

Pattern:

  • Service: Validates IDs, fetches entities, provides external dependencies (LocalDate.now()), persists
  • Entity: Enforces invariants, calculates business values, validates state transitions

Approve/Reject: Delegate to Entity

// Approve example
public DishSuggestionDTO approveDish(Long dishId, Long approverId) {
    ValidationUtil.validateId(dishId);
    ValidationUtil.validateId(approverId);

    DishSuggestion dish = dishSuggestionDAO.getByID(dishId);
    User approver = userReader.getByID(approverId);

    // Entity handles authorization + state transition
    dish.approve(approver);
    
    DishSuggestion updated = dishSuggestionDAO.update(dish);
    return mapToDTO(updated);
}

Why this works:

  • Entity owns business logic (approve() validates approver role and current status)
  • Service just orchestrates (fetch → delegate → persist)
  • Impossible to bypass validation (entity enforces it)
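The article doesn't show approve() itself, but the idea is roughly this. A self-contained sketch with hypothetical Role and Status enums, not the actual entity code:

```java
// Illustrative only: Role, Status, and isHeadChef() are assumptions,
// standing in for the real MiseOS entities.
class User {
    enum Role { LINE_COOK, HEAD_CHEF }
    private final Role role;
    User(Role role) { this.role = role; }
    boolean isHeadChef() { return role == Role.HEAD_CHEF; }
}

class DishSuggestion {
    enum Status { PENDING, APPROVED, REJECTED }
    private Status status = Status.PENDING;

    // Entity owns the rule: only a Head Chef may approve, and only from PENDING
    void approve(User approver) {
        if (!approver.isHeadChef()) {
            throw new IllegalStateException("Only the Head Chef can approve dishes");
        }
        if (status != Status.PENDING) {
            throw new IllegalStateException("Dish is not pending approval");
        }
        status = Status.APPROVED;
    }

    Status status() { return status; }
}
```

Because the status field is private and only mutable through approve(), no service can put the entity into an invalid state, even by accident.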

What Worked Well

  • DeepL API is straightforward — Simple request/response, good docs
  • Gemini free tier is plenty — 1,500 requests/day, far above my actual usage
  • Prompt engineering works — Right prompt = clean JSON output
  • Business logic in entities is clean — Service orchestrates, entity enforces
  • Records for DTOs — Immutable, concise, perfect for API responses

What Was Difficult

  • Gemini nested DTOs — dense documentation on modeling the request/response, and navigating candidates[0].content.parts[0].text
  • Markdown JSON wrapping — debugging why ObjectMapper crashed, realizing Gemini wraps its JSON in ```json fences, then building a cleaning method to strip them out
  • API key security — Initially put key in query params (logs, browser history risk)
  • Deciding service vs entity responsibility — Not always obvious where logic belongs
  • Trusting AI output — Even with strict prompts, defensive checks were necessary. Never trust external systems blindly.

What’s Next

Next week: Complete service layer and REST API

  • ShoppingListService (uses Gemini normalization)
  • MenuService (uses DeepL translation)
  • Simple services for Stations, Users, Allergens

This is part 4 of my MiseOS development log. Follow along as I build a tool for professional kitchens, one commit at a time.
