📑 Table of contents

Automatically translate your content with AI

Automatically translate your content with AI

Automatisation 🟡 Intermediate ⏱️ 16 min read 📅 2026-02-24

🤖 LLM vs traditional translators: the match

Before diving headfirst into AI, let's honestly compare the available options.

🌐 Google Translate: fast but limited

Google Translate has made huge progress thanks to deep learning. For a quick translation of an email or a restaurant menu, it's perfect. But for professional web content:

  • ❌ Loses the author's tone and style
  • ❌ Translates idiomatic expressions literally
  • ❌ Ignores SEO context (keywords, meta descriptions)
  • ❌ Often breaks markdown formatting
  • ✅ Free and instant
  • ✅ Supports 130+ languages

🔵 DeepL: the best "traditional" translator

DeepL is often considered the best machine translator for European languages. Its quality is clearly superior to Google Translate for French, German, and Spanish.

  • ✅ Excellent linguistic quality
  • ✅ API available (free up to 500,000 characters/month)
  • ✅ Custom glossaries
  • ❌ Doesn't understand the "business context" of your content
  • ❌ Can't adapt the tone on instruction
  • ❌ Limited in languages (31 languages)
  • ❌ Breaks complex markdown

🧠 LLM (Claude, GPT-4, Llama): contextual translation

LLMs change the game because they don't just translate word for word. They understand the meaning and can follow precise instructions:

  • ✅ Preserves the tone (formal, casual, technical...)
  • ✅ Adapts expressions culturally
  • ✅ Respects markdown/HTML formatting on instruction
  • ✅ Can optimize SEO in the target language
  • ✅ Translates AND improves simultaneously
  • ⚠️ Can "hallucinate" or add content
  • ⚠️ Variable cost depending on the model

📊 Comparison table

Criterion Google Translate DeepL LLM (Claude/GPT)
Raw quality ⭐⭐⭐ ⭐⭐⭐⭐ ⭐⭐⭐⭐⭐
Tone preservation ⭐⭐ ⭐⭐⭐ ⭐⭐⭐⭐⭐
SEO ⭐⭐ ⭐⭐⭐⭐⭐
Markdown/formatting ⭐⭐ ⭐⭐⭐ ⭐⭐⭐⭐
Cost Free Freemium Variable
Speed ⚡⚡⚡ ⚡⚡⚡ ⚡⚡
Customization ⭐⭐ ⭐⭐⭐⭐⭐
Number of languages 130+ 31 50+ (variable)

Verdict: For professional web content, LLMs are unbeatable. For a quick, low-stakes translation, DeepL or Google Translate are sufficient. The ideal? Combine the two: LLM for the translation, then DeepL/Google as a "second opinion" for verification.

🔧 Building your automated translation pipeline

A good translation pipeline isn't just about "sending the text to ChatGPT". Here is the complete 5-step architecture.

Step 1: Source language detection

Before translating, you need to know which language you are translating from. Even if it seems obvious, it's crucial for automation.

The logic consists of querying an LLM with the first 500 characters of your text to save tokens. The model returns a JSON object containing the ISO language code (for example {"lang": "fr", "confidence": 0.99}) with a confidence score between 0 and 1. This approach is more than enough to reliably identify the language before launching the translation.

Step 2: Content preparation

Before translation, we prepare the content to avoid errors. The protection logic relies on a substitution system: we temporarily replace sensitive elements with placeholder markers (like __CODE_BLOCK_0__ or __URL_1__), then restore them once the translation is complete.

Specifically, we identify and isolate three types of elements: complete code blocks (delimited by triple backticks), standard URLs (identified via their http prefix), and internal tracking links (like /out?id=X). This process is critical, because without it, the LLM might translate your URLs, modify your code blocks, or break your internal links.

Step 3: Translation with an optimized prompt

The secret to a good AI translation is the prompt. Here is the one we use at AI-master.dev:

The model calling logic separates the system from the user content. The system prompt defines the role of a professional translator specialized in tech/AI, the source and target languages, and imposes eight strict rules: preserve markdown formatting, do not translate protected markers, adapt idiomatic expressions, maintain the tone, keep technical terms in English (API, LLM, SEO...), optimize target SEO, never add or remove content, and leave emojis intact. The user prompt injects the prepared text with optional context. We use a low temperature (around 0.3) to balance fidelity and fluency, and a high maximum token limit for long articles.

Key points of the prompt:
- Low temperature for fidelity, high enough for fluency
- We explicitly list the terms not to translate
- We ask for markdown preservation in the system prompt (more reliable)

Step 4: Automatic quality review

Once the translation is done, we pass it through a second LLM for verification.

The logic consists of sending the original and the translation (truncated to 3000 characters to remain economical) to a different model than the one that translated. We ask it to evaluate five criteria (fidelity, fluency, formatting, cultural adaptation, SEO) and return a structured JSON. Example of expected output: {"scores": {"fidelite": 9, "fluidite": 8, "formatage": 10, "culture": 8, "seo": 9}, "score_global": 8.8, "problemes": [], "suggestions": [], "approved": true}. The translation is automatically approved if the overall score exceeds 7.5 out of 10.

Why a second model? Using a different model for the review reduces bias. If Claude translates, Gemini verifies (and vice versa). It's like having an independent proofreader.

Step 5: Complete pipeline

Let's put it all together.

The logic of the complete pipeline chains five steps sequentially. First, language detection identifies the source and checks that it doesn't match the target language. Next, preparation isolates fragile elements (code, URLs) and counts how many were protected. The translation is then launched with the optimized prompt, followed by a restoration of the substituted elements. Finally, the quality review calculates the overall score: if it is rejected, the pipeline automatically relaunches a translation by injecting the correction suggestions into the context. The final result always includes the translated text accompanied by its review report.

🎯 Preserving tone, SEO, and formatting

This is THE major challenge of machine translation. Here are the advanced techniques.

🎭 Preserving the tone

Tone is what differentiates a lively article from a robotic translation. A few rules:

For a casual tone (informal "you", expressions):
- "Tu vas voir, c'est super simple !" → ✅ "You'll see, it's super easy!" → ❌ "You will observe, it is remarkably simple."

For a technical/formal tone:
- "Cette implémentation nécessite une configuration préalable." → ✅ "This implementation requires prior configuration." → ❌ "You gotta set things up before using this."

Technique: Add an example of the desired tone to your prompt. LLMs are excellent at reproducing a style when given a sample.

📈 Preserving SEO

SEO translation is an art in its own right. It's not enough to just translate the keywords:

French Literal translation SEO translation
"intelligence artificielle" "artificial intelligence" "AI" (more searched)
"automatiser ses tâches" "automate your tasks" "task automation" (target keyword)
"guide complet" "complete guide" "ultimate guide" (English SEO pattern)
"pas à pas" "step by step" "step-by-step tutorial"

Specific SEO prompt:
- Before translating, identify the 5 main SEO keywords of the source text. Find their most searched equivalents in the target language (via your knowledge). Use these equivalents naturally in the translation.

📝 Preserving markdown

Markdown is fragile. One extra space and your heading becomes plain text. Here are the common pitfalls:

  • 🚀 Heading without space → the LLM removes the space after the emoji
  • bold textimproperly closed → the LLM forgets the **
  • link → the LLM translates the URL
  • | broken | table → the LLM changes the alignment

Solutions:
1. Explicitly request preservation in the system prompt
2. Validate the markdown after translation with a parser
3. Compare the structure (number of #, **, [], |) before/after

The markdown validation logic consists of counting the occurrences of each structural element (headings with #, bold with **, links with [](), code blocks with backticks, tables with |, and lists with - or *) in the original text and in the translation. If the number differs for a category, the script flags the anomaly (for example "headings: 5 → 4"), which allows you to quickly identify what the LLM has damaged.

💰 Free models for translation

Good news: you don't need GPT-4 to translate. Several free or very cheap models do an excellent job.

🏆 Model ranking for translation

Model Translation Quality Cost Speed Ideal for
Claude Sonnet 4 ⭐⭐⭐⭐⭐ ~$3/M tokens Fast Premium content, SEO
GPT-4o ⭐⭐⭐⭐⭐ ~$2.50/M tokens Fast Premium content
Gemini 2.0 Flash ⭐⭐⭐⭐ Free* Very fast Large volumes
Llama 3.3 70B ⭐⭐⭐⭐ Free (local) Medium Self-hosted, privacy
Mistral Large ⭐⭐⭐⭐ ~$2/M tokens Fast FR↔EN specifically
Gemma 2 27B ⭐⭐⭐ Free (local) Fast Small volumes, testing
Llama 3.1 8B ⭐⭐⭐ Free (local) Very fast Drafts, pre-translation

*Gemini Flash: free via the Google AI Studio API with rate limits.

⚡ Gemini Flash: the free champion

Google Gemini 2.0 Flash is probably the best quality/price ratio for translation. It is used via the OpenRouter API by configuring an OpenAI-compatible client with the OpenRouter base URL and your API key. You then send a request to the google/gemini-2.0-flash-001 model, passing the text to be translated along with the instruction to preserve the markdown. The model returns the translation with the formatting intact (emojis, bold, links preserved).

Llama 3.3: the self-hosted solution

If privacy is important (sensitive content, customer data), Llama 3.3 70B runs locally.

Installing Ollama is done by running the official installation script via curl, then downloading the desired model with the ollama pull command (for example ollama pull llama3.3:70b). Once ready, translation is launched directly from the command line with ollama run followed by the model name and your translation instruction.

Advantages of self-hosting:
- No data leaves your server
- Cost = electricity only
- No token limits

Disadvantages:
- Requires a powerful GPU (24GB+ VRAM for 70B)
- Slightly lower quality than closed models
- Maintenance is your responsibility

For a blog like AI-master.dev, here is the optimal strategy:

  1. Draft: Gemini Flash (free, fast) or Llama 8B
  2. Final translation: Claude Sonnet or GPT-4o (maximum quality)
  3. Review: Gemini Flash (free, sufficient for verification)
  4. SEO validation: Claude with a specialized prompt

Estimated cost per 3000-word article: ~$0.05 to $0.15. Compare this with a human translator at ~$90 for the same article.

⚡ Automating with OpenClaw

If you use OpenClaw, you can automate the entire translation pipeline directly from Telegram.

Pipeline configuration

OpenClaw allows you to create agents that execute tasks automatically. The configuration defines a trigger (such as the publication of a new article), then chains the pipeline steps: language detection, content preparation, translation via a fast model like Gemini Flash, quality review with a premium model like Claude Sonnet, markdown validation, and saving the result. The agent then notifies you on Telegram once the process is finished and can ask for your validation before publication.

Security and privacy

When translating sensitive content, it is essential to ensure that API keys are stored in environment variables, that translations are not logged in plain text, and that API access is protected by authentication. To host this type of service, a VPS at Hostinger offers a good performance/price ratio.

🧪 Practical case: translating a blog post FR→EN

Let's put this into practice with a concrete example.

The source content

Let's take an excerpt from a French article including a title with an emoji, a paragraph with figures and bold text, a blockquote, and a three-column table listing tools (OpenClaw, Zapier).

Result after the pipeline

The pipeline transforms the source content into English while fully preserving the structure: the title emoji stays in place, the key figure and bold text are kept, the quote is faithfully translated, and the table maintains its three columns with untranslated product names. The adaptations apply solely to the vocabulary ("tu peux" becomes "you can", "réseaux" becomes "social media", "Connexion d'apps" becomes "App integration" to match English SEO practices).

What was preserved:
- ✅ Emoji in the title (🚀)
- ✅ Markdown formatting (bold, blockquote, table)
- ✅ Casual tone
- ✅ Identical structure
- ✅ Untranslated product names

What was adapted:
- "tu peux" → "you can" (adaptation from informal "tu")
- "réseaux" → "social media" (more natural term in English)
- "Connexion d'apps" → "App integration" (English SEO term)

📋 Translation quality checklist

Before publishing a translation, systematically check:

Structure

  • [ ] Same number of headings (H1, H2, H3...)
  • [ ] Same number of paragraphs
  • [ ] Tables intact (columns, alignment)
  • [ ] Bulleted/numbered lists complete
  • [ ] Code blocks unmodified

Content

  • [ ] No missing or added paragraphs
  • [ ] Expressions adapted (no literal translation)
  • [ ] Correct technical terms in the target language
  • [ ] Emojis preserved
  • [ ] Faithful quotes

SEO

  • [ ] Main keywords present in the target language
  • [ ] Meta description translated and optimized
  • [ ] URLs and internal links intact
  • [ ] Image alt text translated
  • [ ] Slug adapted to the target language

Technical

  • [ ] Valid markdown (no broken tags)
  • [ ] Working links
  • [ ] No residual source text
  • [ ] Correct encoding (accents, special characters)

🚀 Going further: multilingual translation at scale

Once your pipeline is in place for two languages, extending it to 5 or 10 languages is simple. The logic consists of defining a dictionary of target languages with their ISO codes and names, then using asyncio to launch translations in parallel rather than in series. Each translation is an independent asynchronous task, and the gather function waits for all of them to finish before returning the results. A 3000-word article can thus be translated into 5 languages in less than 2 minutes.

Estimated costs for a multilingual blog

Volume Languages Model Monthly Cost
4 articles/month 2 Gemini Flash ~$0 (free)
4 articles/month 2 Claude Sonnet ~$1
10 articles/month 5 Gemini Flash ~$0 (free)
10 articles/month 5 Claude Sonnet ~$5
30 articles/month 10 Mix Flash + Sonnet ~$10

Compared to the rates of a translation agency (€500-€2000/month for the same volume in 2025), AI makes multilingual accessible to all budgets.

📌 The essentials

  • LLMs outperform Google Translate and DeepL for web content translation thanks to their contextual understanding.
  • A robust pipeline comprises five steps: language detection, preparation (protecting code and URLs), translation with an optimized prompt, automatic review by a second model, and markdown validation.
  • The translation prompt must explicitly require the preservation of formatting, tone, and technical terms.
  • Gemini Flash (free) and Claude Sonnet (premium) form the ideal combination in a hybrid strategy.
  • The estimated cost per 3000-word article is between $0.05 and $0.15, representing a reduction of over 95% compared to human translation.

❌ Common mistakes

Translating URLs and code blocks: This is the most frequent mistake. Without a prior protection step, the LLM will interpret your links and code snippets as standard text and translate them, rendering your links unusable and your code broken. The solution is to substitute these elements with placeholders before translation.

Using a temperature that is too high: With a temperature parameter above 0.5, the model takes too many liberties. It will rephrase, add transitions, or omit paragraphs. For translation, aim for a maximum of 0.2 to 0.3.

Ignoring SEO adaptation: Literally translating your French keywords into English is not enough. The term "intelligence artificielle" is less searched than "AI", and "guide complet" loses out to "ultimate guide". You need to ask the LLM to identify the most searched equivalents in the target language.

Having the same model proofread: If Claude translates and Claude proofreads, it won't see its own errors (confirmation bias). Always use a different model for the review, for example Gemini Flash to check Claude's work.

Publishing without structural validation: An LLM can perfectly translate the content while breaking a table or forgetting a ** symbol. Automated comparison of the number of markdown elements before and after translation is essential.

❓ FAQ

Can you translate HTML directly with an LLM?
Yes, but it's riskier than markdown. HTML contains attributes, classes, and nested tags that the LLM can easily alter. If you must translate HTML, first protect all tags using the same substitution system as for URLs.

Which model should you choose if you have no budget?
Gemini 2.0 Flash via the Google AI Studio API is the best free option in 2025. For content without SEO stakes, Llama 3.1 8B locally via Ollama also does the job very well.

How long does it take to translate a 3000-word article?
With Claude Sonnet or GPT-4o, expect 30 to 60 seconds for the translation itself. By adding detection, preparation, review, and validation, the complete pipeline runs in 2 to 3 minutes.

Is manual proofreading necessary after the pipeline?
Yes, especially at the beginning. The pipeline eliminates 90% of the work, but a quick proofread (5 minutes) helps spot cultural nuances or typos that the LLM doesn't catch. Over time, you will need to proofread less and less.

Can you translate into languages with different alphabets (Japanese, Arabic, etc.)?
Yes, LLMs handle non-Latin alphabets very well. However, text direction (right-to-left for Arabic) can cause issues in some markdown editors. Always test a short excerpt before running a full translation.

🎓 Conclusion

AI machine translation is no longer a gimmick: it's a mature production tool. With the right pipeline, you can:

  • Double your audience by publishing in multiple languages
  • Reduce your translation costs by 95%+
  • Accelerate your publishing (minutes instead of days)
  • Maintain quality thanks to automatic review

The key is not to settle for simply "copy-pasting into ChatGPT". A real pipeline with detection, preparation, translation, review, and validation makes all the difference.

Start simple: translate an article with Gemini Flash, check it manually, then gradually automate. In a week, you'll have a system running on its own. To go further in automating your workflow, you can also automate your commits and reviews with Git and AI, monitor your pipelines thanks to server monitoring with AI, or even automatically retrieve content to translate via intelligent scraping with AI. 🚀