
MinT

From mediawiki.org


MinT (Machine in Translation) is a machine translation service based on open-source neural machine translation models. The service is hosted on Wikimedia Foundation infrastructure and runs translation models that other organizations have released under open-source licenses. An open machine translation service can be a key part of the essential infrastructure of the free knowledge ecosystem. This page captures the initiatives to scale the service and make this infrastructure more widely available.

You can try MinT as part of projects such as Content Translation and translatewiki.net, or directly in a test instance.

Overview of MinT initiatives


Machine translation can be useful in different contexts. As more products use MinT for different purposes, it helps to distinguish those contexts. That way, when users report a bug, it is clearer where it needs to be fixed.

  • MinT Service. The backend service running open-source neural machine translation models.
    • MinT test instance. A basic interface to try the different translation models.
  • MinT for Translators. Initiative to integrate the MinT Service with tools that support other machine translation services, such as Content Translation and the Translate extension.
    • MinT Client for Content Translation. Client exposing the MinT Service as one of the machine translation services available in Content Translation.
    • MinT Client for Translate extension. Client exposing the MinT Service as one of the machine translation services available in the Translate extension.
  • MinT for Wiki Readers. Product to enable readers to use machine translation to read contents from other languages on a wiki.

You can read more below about each of the MinT initiatives.

Get involved


Feel free to share any feedback on the discussion page. For planned improvements, you can propose feature enhancements, track the progress of any task, and share your perspective on it. For completed work, you can also check the status updates below.

MinT Service


The MinT Service is designed to provide translations from multiple machine translation models. Currently, it uses the following models:

  • NLLB-200. The latest model from Meta's No Language Left Behind project supports 200 languages, including many that are not supported by other vendors.
  • OpusMT. The OPUS (Open Parallel Corpus) project from the University of Helsinki compiles multilingual content with a free license to train the OpusMT translation models. Anyone can easily help improve the translation quality by participating in the different projects that contribute data to OPUS. For example, when using Content Translation to create translations of Wikipedia articles, the data on published translations will be incorporated as a new resource to improve the translation quality for the next version of the model. Another quick way to contribute is to provide sentence translations with Tatoeba.
  • IndicTrans2. Translation models for Indian languages developed at the Indian Institute of Technology, Madras.
  • Softcatalà. Softcatalà is a non-profit organization that aims to improve the use of Catalan in digital products. As part of its work, translation models have been released.
  • MADLAD-400. MADLAD-400 is a multilingual machine translation model by Google Research that supports 419 languages.

MinT supports over 200 languages, with more than 70 languages not supported by other services (including 27 languages for which there is no Wikipedia yet). You can read more about the initial release of MinT and check some frequently asked questions in the summary page for the service.
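Since several models can cover the same language pair, a service like this needs some way to decide which model handles a request. The following is a minimal sketch of that idea, not MinT's actual implementation: the `ModelRegistry` class, the model identifiers, and the language pairs registered below are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ModelRegistry:
    """Maps (source, target) language pairs to the models that support them."""

    pairs: dict = field(default_factory=dict)

    def register(self, model: str, source: str, target: str) -> None:
        # Models are kept in registration order; the first one acts as the default.
        self.pairs.setdefault((source, target), []).append(model)

    def models_for(self, source: str, target: str) -> list:
        return list(self.pairs.get((source, target), []))

    def pick(self, source: str, target: str, preferred: str = None) -> str:
        available = self.models_for(source, target)
        if not available:
            raise LookupError(f"no model supports {source} -> {target}")
        # Honor the caller's preference only if that model covers the pair.
        if preferred in available:
            return preferred
        return available[0]


registry = ModelRegistry()
# Hypothetical coverage, for illustration only (not MinT's real configuration).
registry.register("nllb200", "en", "ig")
registry.register("opusmt", "en", "ca")
registry.register("softcatala", "en", "ca")
```

With this sketch, `registry.pick("en", "ca")` falls back to the first registered model, while `registry.pick("en", "ca", preferred="softcatala")` returns the requested one because it covers that pair.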

Technical details


The translation models have been optimized for performance to avoid the need for GPU acceleration. This makes it easier for organizations and individuals to build and run their own instances.

MinT provides a platform to run multiple translation models. To support different initiatives, aspects such as language detection, pre- and post-processing of content, and rich format support have been developed on top of the plain-text models.
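To illustrate what layering rich format support on top of a plain-text model can look like, here is a simplified sketch. It is an assumption about the general technique, not MinT's code: `translate_markup` and `fake_translate` are hypothetical names, and the stand-in "model" simply upper-cases text so the example runs without any translation library.

```python
import re
from typing import Callable

# Matches a single HTML-like tag; the capturing group keeps tags in re.split output.
TAG = re.compile(r"(<[^>]+>)")


def translate_markup(html: str, translate_text: Callable[[str], str]) -> str:
    """Translate only the text between tags, leaving the markup untouched."""
    out = []
    for part in TAG.split(html):
        if TAG.fullmatch(part) or not part.strip():
            # Tags and whitespace-only segments pass through unchanged.
            out.append(part)
            continue
        # Preserve surrounding whitespace so spacing around tags survives.
        lead = part[: len(part) - len(part.lstrip())]
        trail = part[len(part.rstrip()):]
        out.append(lead + translate_text(part.strip()) + trail)
    return "".join(out)


def fake_translate(text: str) -> str:
    # Stand-in for a plain-text translation model (hypothetical).
    return text.upper()


print(translate_markup("<p>Hello <b>world</b>!</p>", fake_translate))
```

Translating each text segment separately keeps the markup intact but hides sentence context from the model, so a production system would likely need a more careful approach to splitting and reassembling content.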

Test instance


The MinT test instance is a basic interface to try the different translation models. It allows users to translate content between the selected languages and to choose the preferred translation model when more than one is available. This lets different communities check how well the models support their language. This instance is intended for testing, so performance and availability may be reduced compared to other MinT-based products. You can check the availability status of the MinT test instance.

MinT for translators

Mobile translation using MinT

Translation is a common way for multilingual users to contribute in the Wikimedia ecosystem. Machine translation can provide a useful initial translation for users to review and improve. The Language team has developed tools that support translators in their workflows and can integrate different machine translation services to speed up the process. Once MinT was available, integrating it with these tools was a logical next step to amplify their impact. MinT is available in Content Translation and the Translate extension.


MinT for wiki readers


The number of topics and the amount of information a reader can learn about from Wikipedia and other wikis depend on the languages they speak. Machine translation can help people learn more about their topics of interest when the content is not available in their language.

This initiative explores how to surface the machine translation support from MinT in Wikipedia articles in a way that:

  • Allows readers to learn more about the topics of interest from other languages.
  • Clearly differentiates automatically generated content from community-created content.
  • Encourages readers to access and contribute to community-created content when possible.

At the moment the Language team is working on the designs. Learnings based on data and community input will determine the next steps for the initiative.

MinT more widely available


Working on the previous initiatives will help to polish and solidify the system. For now, the MinT API is only available for Wikimedia products. As the system gets ready, we'll consider a wider exposure. Providing a service that can be used by communities in innovative ways can be a very powerful tool. New initiatives to make MinT more widely available will be captured here in the future. Meanwhile, feel free to configure your own MinT instance to experiment with it.

Disclaimer

  1. Accuracy of MinT’s Translations - The accuracy of translations generated by MinT may vary. Translations may not be entirely accurate or may not always convey the intended meaning or context of the original content. Wikimedia makes no representations or warranties regarding the accuracy or adequacy of the automatically translated content.
  2. Limitation of Liability - Wikimedia, its affiliates, and employees are not liable for any direct, indirect, incidental, punitive, or consequential damages, including but not limited to damages for goodwill, use, data, or any other intangible losses arising out of or in connection with the use of MinT or translations generated with MinT.
  3. Creative Commons Compliance - Translations generated with MinT are considered derivative works under the applicable Creative Commons license governing the original content. Users shall comply with the terms of the applicable Creative Commons license when using translated content.
  4. Terms of Use and Privacy Policy - Use of MinT is subject to Wikimedia's Terms of Use and Privacy Policy.

Status updates

February 2024

January 2024

December 2023

November 2023

October 2023

September 2023

August 2023

July 2023
