OmegaT: Configuring a List of Common Errors

OmegaT is a great translation program with so many features that not everyone is aware of them all, even the very useful ones. This post describes one such feature, which makes quality assurance easier.

Finding Forbidden Phrases in Translation

OmegaT makes it possible to configure a list of words that you don’t want to appear in your translation. Whenever such a word appears in your translation, OmegaT highlights it in red. One common use of this function is creating a list of common errors. OmegaT checks your translation against this list and flags potential errors on the fly.

You can access this list by selecting Options => Tag Validation. It’s the last field, labeled “Fragment(s) that should be removed…”

You need to enter a regular expression that represents your list. Let’s start with a simple search string and make it more complex in a few steps.

1. Imagine that you translate into Russian and want to flag two common spelling errors, “в течении” and “по нажатию”:

(в течении|по нажатию)

2. But you’ll notice that when they occur at the beginning of a sentence and start with an upper-case letter, OmegaT doesn’t flag them. That’s because the regular expression is case-sensitive. To make sure the upper-case variant is caught as well, modify the search string like this:

([Вв] течении|[Пп]о нажатию)

3. Now you may want to add a punctuation error, e.g. an unnecessary comma after “однако”:

([Вв] течении|[Пп]о нажатию|[Оо]днако,)

4. Finally, let’s add a word that you don’t want to appear in your translation at all. The main challenge is that you need to include all applicable forms of this word in the search string. For example, if you want to avoid a literal translation of the plural “techniques” (“техники”), add its forms like this:

([Вв] течении|[Пп]о нажатию|[Оо]днако,|[Тт]ехни(ками|ки|кам|ках|к))
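
Before pasting an expression into OmegaT, it can help to try it on a few sample sentences. The following is a minimal sketch in Python rather than in OmegaT itself: it assumes that Python’s re module treats these particular patterns the same way as the Java regular expressions used by OmegaT, and the helper name flagged and the sample sentence are made up for illustration.

```python
import re

# Patterns copied verbatim from steps 1, 2 and 4 above.
STEP_1 = "(в течении|по нажатию)"
STEP_2 = "([Вв] течении|[Пп]о нажатию)"
STEP_4 = "([Вв] течении|[Пп]о нажатию|[Оо]днако,|[Тт]ехни(ками|ки|кам|ках|к))"


def flagged(pattern, text):
    """Return every fragment of `text` that `pattern` matches.

    Hypothetical helper: it only imitates what OmegaT would highlight.
    """
    return [m.group(0) for m in re.finditer(pattern, text)]


# Illustrative sentence containing three of the unwanted fragments.
sample = "В течении дня. Однако, эти техники устарели."

for name, pattern in [("step 1", STEP_1), ("step 2", STEP_2), ("step 4", STEP_4)]:
    print(name, flagged(pattern, sample))
```

In this sketch, the step 1 pattern finds nothing in the sample because “В течении” is capitalized, while the step 2 and step 4 patterns catch it. Note also that the pattern has no word boundaries, so the final alternative “к” matches inside the singular “техника” as well; whether that is desirable depends on your project.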


By following these simple rules, you can build a powerful list of common errors over time. You can also temporarily add project-specific words to make sure you don’t leave something unwanted in your current translation. Not only is this function a good reminder for yourself, but it can also be a great learning tool if you share your list with a team of translators. Some people on the team may be unaware of those errors, and OmegaT will conveniently draw their attention to them.
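
If the list grows or is shared with a team, one option is to keep the fragments in separate lists and join them into a single expression only when you paste it into OmegaT. This is a rough sketch under that assumption, not an OmegaT feature: the names BASE_ERRORS, PROJECT_TERMS, and build_pattern are hypothetical, and the syntax check uses Python’s re module, which may differ from OmegaT’s Java regular expressions in some details.

```python
import re

# Long-lived list of common errors (fragments taken from the steps above).
BASE_ERRORS = ["[Вв] течении", "[Пп]о нажатию", "[Оо]днако,"]

# Temporary, project-specific fragments to drop once the project is done.
PROJECT_TERMS = ["[Тт]ехни(ками|ки|кам|ках|к)"]


def build_pattern(*groups):
    """Join all fragments into one alternation wrapped in parentheses."""
    fragments = [fragment for group in groups for fragment in group]
    pattern = "(" + "|".join(fragments) + ")"
    re.compile(pattern)  # raises re.error if a fragment breaks the syntax
    return pattern


# The printed string is what you would paste into the OmegaT field.
print(build_pattern(BASE_ERRORS, PROJECT_TERMS))
```

Keeping the project-specific fragments in their own list makes it easy to remove them later without touching the shared part.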

Do you have any other ideas on how to use this function? Please share in the comments.

Another powerful function you may want to know about is filtering segments in OmegaT.


  • xinm says:

    “By following these simple rules, you can gradually build an impressive list of words that includes the most common errors.”
    Would you mind sharing your list? It would be interesting to take a look.

    • Hello! Of course, here is what the list looks like at the moment. It’s not much, but it’s what I have: ([Вв] течении|[Пп]о нажатию|[Оо]днако,|[Тт]ехни(ками|ки|кам|ках|к)|[Кк]б|Мб|Гб|[^,]\sнапример,|[Вв] том числе,|[Тт]ем не менее,|, например,|[Дд]оговора|млрд\.|млн\.|неполностью|таймаут| [Cс]читанн(ыми|ых|ые|ым)|штрих-код|%-ны|, в частности,|[Пп]о завершению|[Дд]иректоры|:$)
      I hope you find it useful. Comments and additions are very welcome. Let’s put together the best list of errors for OmegaT through our joint efforts! 🙂

