How to Make OmegaT Like Tags in TMs Produced by Other Tools

OmegaT fails to recognize a 100% match due to a mismatch in tags

OmegaT offers a fairly good level of compatibility with other translation memory programs, in particular SDL Trados and Wordfast Pro, but it isn’t 100% yet. One of the challenges is that OmegaT and other tools produce different tags. This post offers a few best practices for tackling this challenge in TTX, TXML, SDLXLIFF, and possibly other formats.

Example

A translator has a TTX file to translate and a TMX exported from Trados. When pre-translated in Trados, the TTX file contains many 100% matches, but when the same file is opened in OmegaT with that TMX, far fewer segments are recognized as 100% matches. This is what the Editor and Fuzzy Matches panes show for one of these segments:

Editor (OmegaT’s tags): The <t0>dog</t0> bit Johny

Fuzzy Matches (Trados TM): The <f1>dog</f1> bit Johny

Even though Trados views this segment as a 100% match, OmegaT doesn’t recognize it as such due to the different tags. Oftentimes, a translator will end up inserting this translation from the TM and adjusting the tags manually. This additional work is a complete waste of time, and this post will try to help you avoid it where possible.
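To make the mismatch concrete, here is a minimal Python sketch that compares the two example segments above. It is only an illustration, not how OmegaT computes matches internally. The `TAG` pattern and the `strip_tags` helper are assumptions introduced for this example; they cover only short tags of the form shown above (a letter or letters followed by a number).

```python
import re

# Hypothetical pattern for short tags such as <t0>, </t0>, <f1>, </f1>, <x2/>.
TAG = re.compile(r"</?[a-z]+\d+/?>")


def strip_tags(segment: str) -> str:
    """Return the segment text with all short tags removed."""
    return TAG.sub("", segment)


editor_segment = "The <t0>dog</t0> bit Johny"  # as shown in the Editor pane
tm_segment = "The <f1>dog</f1> bit Johny"      # as shown in the Fuzzy Matches pane

# The raw strings differ only in their tags...
print(editor_segment == tm_segment)
# ...while the translatable text is identical.
print(strip_tags(editor_segment) == strip_tags(tm_segment))
```

The text is the same in both segments; only the tag letter and number differ, which is enough to prevent an exact match.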

Keeping 100% Matches in the Files

The best thing you can do is to keep 100% matches in the source files. To do so, you need to pre-translate the source files using the original TM. For instance, if you have a TTX file, you’ll use SDL Trados, or ask the client to use it, to pre-translate the TTX with a Trados TM.

Doing so makes it possible for an OmegaT file filter to convert the tags in 100% matches automatically. At least two file filters currently support this conversion: TagEditor TTX files (Okapi) and Wordfast Pro TXML files (Okapi). As the names imply, they belong to the Okapi plugin and cover the TTX and TXML formats.

For SDLXLIFF, there’s no such smart file filter yet. The native SDLXLIFF file filter can only process source files in which the source has been copied to the target. You can try to use Okapi Rainbow to convert a pre-translated SDLXLIFF file, but chances are the tags produced by Okapi will be different from those in OmegaT. You can then try to optimize them as described in the next section.

When neither of the above methods is available, the last resort is to simply work on a pre-translated file, ignoring the fact that the file filter doesn’t support 100% matches. For example, you can use the XLIFF Files filter to work on pre-translated SDLXLIFF files (just make sure you copy source to target for the segments other than 100% matches). OmegaT will show the translation in the source part of the segment. This isn’t perfect by any means, but it’s better than nothing when no other options are available, especially when 100% matches aren’t supposed to be reviewed. After all, you don’t want to insert 100% matches manually in segments you aren’t paid for, do you?

Optimizing Fuzzy Matches

The second best practice is to optimize an exported TMX by editing it with a text editor. Some of the ideas include:

  1. Replacing the letter in the TMX tags with the letter used in the tags displayed in OmegaT. This is a good start, but some tags will also differ in their numbers.
  2. Removing the tags from the beginning or the end of the segments. These tags can cause mismatches because OmegaT’s file filters often don’t display these tags while the translations in a TMX do have them.
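The two ideas above can be sketched in Python instead of being done by hand in a text editor. This is a rough illustration under stated assumptions: it works on segment text in the form displayed in OmegaT (for example `<f1>dog</f1>`), not on the actual inline markup of a TMX file, which you would need to inspect first. The helper names `rename_tag_letter` and `strip_edge_tags` are hypothetical and not part of OmegaT or any library.

```python
import re

# Hypothetical patterns: one or more short tags (e.g. <f1>, </f1>, <x2/>)
# sitting at the very start or the very end of a segment.
LEADING_TAGS = re.compile(r"^(?:\s*</?[a-z]+\d+/?>)+\s*")
TRAILING_TAGS = re.compile(r"\s*(?:</?[a-z]+\d+/?>\s*)+$")


def rename_tag_letter(text: str, old: str, new: str) -> str:
    """Idea 1: swap the tag letter, e.g. <f1>...</f1> becomes <t1>...</t1>."""
    pattern = re.compile(r"<(/?)" + re.escape(old) + r"(\d+)(/?)>")
    return pattern.sub(
        lambda m: f"<{m.group(1)}{new}{m.group(2)}{m.group(3)}>", text
    )


def strip_edge_tags(text: str) -> str:
    """Idea 2: drop tags at the beginning and the end of a segment."""
    text = LEADING_TAGS.sub("", text)
    return TRAILING_TAGS.sub("", text)


print(rename_tag_letter("The <f1>dog</f1> bit Johny", "f", "t"))
print(strip_edge_tags("<f1>The dog bit Johny</f1>"))
```

Note that renaming the letter turns `<f1>` into `<t1>`, which still differs from OmegaT’s `<t0>` in the earlier example, so the numbers may need a separate adjustment. `strip_edge_tags` does not check tag pairing: if only one half of a pair sits at the edge of a segment, the other half stays behind, so review the result and work on a copy of the TMX.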

Summary

When you want to reduce the number of mismatches between the tags produced by OmegaT and other CAT tools, the first step is to keep 100% matches in your source files by pre-translating them. With fuzzy matches, you can adjust or remove the tags in the TMX using a text editor. While neither method is perfect, both are a good start compared to not thinking about mismatches at all.

If you have any other ideas for optimizing the tags, do tell us about them!

For more information about OmegaT, check our posts about other features this translation memory program has to offer. And add us to your RSS reader to make sure you don’t miss future posts about some of the advanced OmegaT functions.

2 comments

  • Mari says:

    In my situation, making glossaries is a waste of time. Of course, I have all these programs, but as a beginner translator I rarely translate documents with similar terms.
    I don’t know if my choice is good, but if not, I have some programs to convert documents into a translation memory. I think it could help me sometime.

About the Author

Roman Mironov
CEO & Founder

As the founder of Velior, Roman has had the privilege of being able to turn his passion for languages into a business. He has over 15 years of experience in the translation industry. Roman has helped dozens of clients increase sales by making their products appealing for speakers of other languages.