The Need for a Deliberate Shift to Context-Rich Translation AI

Why and how global brands are ditching microtask-specific AI patches for AI-first workflows

By Kirti Vashee, Translated's Language Technology Evangelist

Rome – November 4, 2025

The Localization industry held three major conferences in September and October: AMTA (American Machine Translation Association), Localization World in Monterey, CA, and the TAUS conference in Salt Lake City. These conferences all provide perspectives on the challenges that industry players grapple with as they attempt to deploy AI in production settings and move beyond pilots.

An overarching and unifying theme across all three conferences was that teams that chase AI solely to cut costs or expect plug-and-play miracles usually fall short. Real impact comes from clear business goals, purposeful integration, and intelligent workflow design that properly defines and delivers value. Many are just beginning to understand that success with AI is evolutionary and gradual rather than revolutionary and immediate. The learning curve is steep and requires investment, deep expertise, and ongoing experimentation to find optimal production-ready strategies.

Here are the major themes that surfaced from LocWorld.

Localization is a Strategic Powerhouse for AI, Not Just a Cost Center

Attendees focused heavily on elevating localization from an operational cost center to a strategic powerhouse, drawing explicit parallels to how IT evolved in the 2000s and 2010s. The conference highlighted that localization teams are now solving some of the most complex AI problems across entire enterprises, e.g., addressing bias, hallucinations, cultural nuance, quality assurance, and interoperability at scale. Discussions emphasized the need to rethink the industry's 30-year legacy, embedding localization into global content operations rather than treating it as a back-office translation service.

AI Implementation is Complex, Evolutionary, and Requires Nuance

A central message, repeatedly emphasized, was that "the technology isn't the problem; our way of implementing it is." Attendees focused intensively on the practical challenges of deploying AI effectively rather than debating whether to adopt it. Sessions addressed inflated expectations, insufficient preparation, and the critical need to understand what AI can and cannot accomplish. Technical discussions explored the management of multilingual AI complexity, hallucinations, and the problems created by English-centric language models that perform unevenly across languages.

The focus was on moving beyond inflated expectations and AI "sprinkling" to achieve genuine business enhancement and AI maturity.

AI Unlocks Untapped Global Market Potential

The conference showcased how localization AI can dramatically expand market reach. By enabling contextual translation for previously underserved languages (e.g., African languages) and offering truly localized experiences, AI allows businesses to access new, largely untapped markets, moving beyond a "bigger slice of the pie" to "growing the pie for everyone." This theme challenged the scarcity mindset, advocating for universal localization across hundreds of languages to grow global reach and inclusivity. It positioned localization as essential for AI governance, brand trust, and economic growth in underrepresented language ecosystems.

Andrew Miller, Solutions Architect at Translated, who attended LocWorld, commented:

The lowered technical barriers to AI have clearly sparked creativity in our industry. However, the rush to "keep up" and layer AI within existing legacy workflows has unintentionally added unnecessary complexity with minimal benefit.

Much of what is currently labeled "agentic AI" is often just AI-washing of existing intelligent automation and Robotic Process Automation (RPA). Instead of fundamentally rethinking processes, we are using superficial "AI band-aids," making straightforward, well-understood procedures needlessly complicated.

As the industry continues experimenting with AI, it's important to remember the financial and environmental impacts of increased complexity, and that real quality comes from better data, optimized processes, and robust models. We must resist the urge to overcomplicate proven workflows with arbitrary, inconsistent, and unproven "AI magic dust".

LLM-based Machine Translation: Promising Research but Slow Industry Adoption

Experts at recent MT research conferences (e.g., WMT25) acknowledge that Translation AI (LLM-based MT) significantly outperforms Neural Machine Translation (NMT) in output quality, as demonstrated by transparent evaluations from the trusted WMT community. However, industry-wide adoption of LLM-based MT for production remains slow.

Based on insights from various conference presentations at both Localization World and AMTA, most industry players still favor domain-optimized NMT, often augmented with AI-supported quality estimation (MTQE) and automated post-editing (APE). They view this combination as a more reliable, predictable, and consistent pipeline for current production needs.

Slator provided a comprehensive summary of the takeaways from AMTA and noted a theme repeated in many of the presentations: “It’s Not Time To ‘Switch Gears’ [from NMT to LLM MT]”. This is possibly because many struggle to achieve consistent and reliable results from LLM MT using the legacy batch customization and TM-segment-focused approaches they are familiar with.

Translated's Leading Edge with Lara

A notable exception to this slow-adoption trend is Translated's Lara, an LLM-based MT system that is already being deployed at production scale. As of November 1, 2025, Lara has processed 3.14 trillion words! Currently, Lara is central to all MT use for Translated's top 25 languages, with plans to expand coverage to over 200 languages shortly.

This rapid adoption and technology upgrade were possible because 15 years of adaptive MT fine-tuning experience and data curation practice could be leveraged to drive reliable and consistent outcomes with the new LLM technology. Additionally, access to large amounts of GPU compute accelerated the Lara team’s learning and shortened the path to production viability.

In contrast to the committed shift to LLM MT at Translated, many speakers at AMTA emphasized that the future of translation technology is not simply replacing existing NMT systems with LLMs, and many speakers stated a preference for a “hybrid approach” described below:

  • Strategic & Selective LLM Use: LLMs should be used judiciously and strategically for specific tasks where their strengths truly add value (e.g., post-editing, improving fuzzy matches, enhancing terminology, refining style). They are not a universal solution for every translation task. Alex Yanishevsky of Smartling stated: “Just because you can do it [use an LLM], doesn’t necessarily mean that you should.”

  • Hybrid, Agentic Workflows: Many speakers stated that the most effective approach combines traditional NMT, valued for its consistency and fluency, with LLMs. The LLMs then act as intelligent agents that refine and adapt the MT output with deeper context awareness and terminology control (see the sketch below). Julian Hamm of Star stated: “The future lies in hybrid, agentic workflows, where LLMs refine and adapt MT output.” It is not time to “switch gears,” Hamm noted, but to level them up.

LLMs are increasingly used as quality advisors, error detectors, and benchmarks, routing edge cases to human experts. Open issues remain, such as gender bias, “AI voice,” and safety across languages, as well as evaluation frameworks that check comprehension and factual integrity, not just fluency. This confined and limited use of AI, however, also limits the technology's utility for mainstream, large-scale production deployment.
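To make the hybrid pattern described above more concrete, here is a minimal sketch of an NMT-first, LLM-refinement step. The nmt_translate and call_llm functions are hypothetical stand-ins for whatever MT engine and LLM endpoint a team already uses (not real vendor APIs), and the prompt wording is illustrative only.

```python
from typing import Dict

def nmt_translate(source: str) -> str:
    """Stand-in for an existing domain-tuned NMT engine (hypothetical)."""
    return f"[NMT draft of: {source}]"

def call_llm(prompt: str) -> str:
    """Stand-in for a generic LLM completion endpoint (hypothetical)."""
    return f"[LLM response to prompt of {len(prompt)} chars]"

def refine_with_llm(source: str, draft: str, glossary: Dict[str, str]) -> str:
    """Ask an LLM to post-edit NMT output while enforcing terminology."""
    terms = "\n".join(f"- {s} -> {t}" for s, t in glossary.items())
    prompt = (
        "You are a post-editor. Improve the draft translation below.\n"
        "Preserve the meaning of the source, fix fluency, and use these terms:\n"
        f"{terms}\n\n"
        f"Source: {source}\n"
        f"Draft: {draft}\n"
        "Improved translation:"
    )
    return call_llm(prompt)

def hybrid_translate(source: str, glossary: Dict[str, str]) -> str:
    draft = nmt_translate(source)                    # consistent, predictable NMT first pass
    return refine_with_llm(source, draft, glossary)  # LLM acts as the refining agent

print(hybrid_translate("The pump must be primed before start-up.",
                       {"pump": "Pumpe", "start-up": "Inbetriebnahme"}))
```

In such a setup, the refinement step is also where glossaries, style guides, or fuzzy-match candidates could plausibly be injected as additional context for the LLM.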

Other highlights from AMTA include:

  • Focus on Context, Quality, and Domain Adaptation for Enterprise AI: Presentations demonstrated fine-tuning approaches for specialized domains like legal, healthcare, and manufacturing, while highlighting how LLMs can serve as quality advisors on NMT output through automated quality assessment frameworks. However, speakers emphasized that current benchmarks inadequately measure multilingual reasoning and called for evaluation frameworks that test understanding beyond fluency. Overall, this points to a shift towards more intelligent data enrichment and automated quality assessment, except at Translated, where Lara replaces traditional NMT and integrates context-rich, context-driven LLM functionality into the core translation task, reducing the need for post-MT adjustments.

  • Advanced AI Architectures: Multi-Agent & Multimodal Futures: The conference showcased the transition toward multimodal translation systems that integrate visual, audio, and contextual information alongside text. Speech-to-speech translation grounded in visual cues demonstrated how AI interpreters could resolve ambiguities by "seeing" gestures and surroundings. The future of AI translation is envisioned as moving beyond monolithic models towards more complex, collaborative systems. This includes "multi-agent translation pipelines" where specialized LLM agents work together to improve outputs, and "multimodal interpreting systems" that integrate visual cues with speech-to-speech translation for richer contextual understanding. However, there are no compelling successful deployments to date.

  • Persistent Challenges and the Centrality of Human Expertise: Despite rapid advancements, significant challenges remain, including persistent bias (especially gender bias), the emergence of a distinct "AI style" in translations, and multilingual safety gaps. Critically, experts emphasize that human linguists remain indispensable. Their roles are evolving from mechanical post-editing to strategic functions like evaluating AI outputs, data labeling, guiding continuous model improvement, and providing critical thinking and domain expertise to shape effective human-AI collaboration.

TAUS Conference Highlights

At the TAUS conference, Łukasz Kaiser from OpenAI delivered a keynote on reasoning models and language translation. Łukasz, co-author of the landmark research introducing Transformers to the world, emphasized that a primary goal of LLM research has been building models that learn from less data, which was the same objective that drove the original MT-focused Transformer development.

He highlighted key characteristics of current LLM momentum, particularly "reasoning models," which are now in their infancy but will continue evolving. Kaiser demonstrated how reasoning models approach complex tasks like poetry translation by revealing the "thought process" the model employs, using an Edgar Allan Poe poem as an example to show how reasoning enhances translation quality.

Both Kaiser and Marco Trombetti stressed that segment-level constraints imposed by translation memory (TM) and Translation Management Systems (TMS) handicap Translation AI’s optimal capabilities. Łukasz stated several times during the conference: “How can you expect a technology that focuses on isolated segments to be the means for driving AI translation improvements? It cannot. If you want to leverage AI, drop this segment mindset and learn to use context and a new way to use relevant data.” (I paraphrase.) Contextually rich inference interactions will always produce better output from Language AI on all language-related tasks.

Wayne Bourland, Head of Localization at Dell, also declared a bold vision, stating that "TMS is dead" and that “Nobody will pay for these monolithic systems (TMSs) in the future”. This is a realization that appears to be gaining momentum.



Average TTE: MT + context vs. 100% TM match

AI technology experts in the translation industry increasingly question the content fragmentation caused by translation memory (TM) and translation management systems (TMS). These systems break content into pieces, losing context, and focus every translation task on “segments”. They rely heavily on 100% TM matches when working with machine translation, which limits the use of more valuable contextual information.

This legacy approach is now seen as an obstacle to faster improvement in translation capabilities. Translation AI can leverage a far wider range of contextual data and produce contextually accurate output with greater assurance, while also eliminating much of the post-MT agentic cleanup that hybrid approaches require.
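As a rough illustration of the contrast, the sketch below compares segment-by-segment prompting with document-level prompting. Again, call_llm is a hypothetical stand-in for any LLM completion call; the toy example shows how an isolated segment loses the referent ("It") that document-level context preserves.

```python
from typing import List

def call_llm(prompt: str) -> str:
    """Stand-in for a generic LLM completion endpoint (hypothetical)."""
    return f"[LLM response to prompt of {len(prompt)} chars]"

def translate_segmentwise(segments: List[str], target_lang: str) -> List[str]:
    """Legacy TM/TMS pattern: each segment is translated in isolation, losing context."""
    return [call_llm(f"Translate into {target_lang}:\n{seg}") for seg in segments]

def translate_with_document_context(document: str, target_lang: str,
                                     style_notes: str = "") -> str:
    """Context-rich pattern: the model sees the whole document plus guidance."""
    prompt = (
        f"Translate the following document into {target_lang}.\n"
        "Keep terminology, register, and pronoun references consistent across the text.\n"
        f"Style and audience notes: {style_notes or 'none provided'}\n\n"
        f"{document}"
    )
    return call_llm(prompt)

doc = "Press the reset button. It is located behind the panel."
print(translate_segmentwise(doc.split(". "), "German"))       # 'It' loses its referent
print(translate_with_document_context(doc, "German",
                                      style_notes="technical manual, formal tone"))
```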

Mathijs Sonnemans, Blackbird’s VP of Product, won the Innovation Pitch at the TAUS event. He showed how their orchestration platform gains flexibility and capability when the most useful linguistic data is liberated from a TMS and made available in a more AI-ready mode through BlackLake, a newly introduced, context- and metadata-rich data repository built on a data lake.

Other highlights from Marco Trombetti’s presentation:

  • Translated has production data metrics across millions of measurements, which show it is faster and easier to process context-rich Lara (LLM MT) output than 100% matches from TM. Over the last 25 years, Translated has collected billions of production measurements of TTE (Time To Edit) and uses them as a trusted KPI for production quality and efficiency. Marco showed a slide demonstrating that reviewing Lara's “raw” output from context-rich input is more efficient than the typical industry practice of using 100% TM matches plus MT only for segments below a 90% (or similar) match threshold. The evidence is also clear that translators prefer working with the complete document and its overall context rather than random, isolated segments. (A minimal sketch of how such a per-word TTE figure can be computed follows after this list.)


  • TMs are Dead

  • There is now clear evidence that the use of TM is slowing the rate of improvement in AI translation, and that providing more complete document context is more efficient, even in high-volume production scenarios.
  • Marco showed actual production TTE measurements demonstrating that using the best translators (who are paid fairly) is a more reliable way to drive ongoing improvements in AI output quality than using less capable translators paid at the lowest rates in the industry. In fact, the evidence suggests that using the lowest-cost translators in the market measurably undermines AI output quality.
  • He stated that as machine quality improves, we will see a much greater realization of latent demand, and we may be translating 100X to 1,000X more content. Better MT will not necessarily reduce the need for translators but rather provide an expanding, and sometimes different, role for the best translators who understand how to operate and manage Translation AI.
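For readers unfamiliar with the KPI, below is a minimal sketch of how a per-word TTE figure can be computed, assuming TTE is recorded as editing seconds per source word. The numbers are toy values for illustration only and are not Translated's production data.

```python
from typing import List, Tuple

def average_tte(samples: List[Tuple[float, int]]) -> float:
    """samples: (editing_seconds, source_word_count) pairs -> average seconds per word."""
    total_seconds = sum(sec for sec, _ in samples)
    total_words = sum(words for _, words in samples)
    return total_seconds / total_words if total_words else float("nan")

# Toy numbers for illustration only (not Translated's production data).
tm_100_matches  = [(42.0, 30), (55.0, 41), (38.0, 27)]   # reviewing 100% TM matches
mt_plus_context = [(35.0, 30), (47.0, 41), (33.0, 27)]   # reviewing context-rich MT output

print("TTE, 100% TM matches:", round(average_tte(tm_100_matches), 2), "s/word")
print("TTE, MT + context   :", round(average_tte(mt_plus_context), 2), "s/word")
```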

These conferences reveal an industry at a pivotal moment, grappling with how to move AI from pilots to production. A unifying theme was that success requires strategic implementation and process innovation, use-based learning, new skills, and new tools rather than expecting "plug-and-play" miracles.

These highlights together reflect a sector in rapid transition, where the success of technology hinges on strategic implementation and new, updated processes that transform and elevate translation production practices to improve communication and understanding across human languages.



Get Updated!

Subscribe to Translated's newsletter to receive more content like this.