OmegaT offers a pretty good level of compatibility with other translation memory programs, in particular SDL Trados and Wordfast Pro. But it’s not 100% yet. One of the challenges in this respect is the difference between the tags produced by OmegaT and those produced by other tools. This post offers a few best practices to tackle this challenge in TTX, TXML, SDLXLIFF, and possibly other formats.
Suppose a translator has a TTX file to translate and a TMX exported from Trados. The pre-translated TTX file contains many 100% matches, but after it is opened in OmegaT, the number of 100% matches is much lower. This is what the Editor and Fuzzy Matches panes show for one of the 100% matches:
Editor (OmegaT’s tags): The <t0>dog</t0> bit Johny
Fuzzy Matches (Trados TM): The <f1>dog</f1> bit Johny
Even though Trados views this segment as a 100% match, OmegaT doesn’t recognize it as such due to the different tags. Oftentimes, a translator will end up inserting this translation from the TM and adjusting the tags manually. This additional work is a complete waste of time, and this post will try to help you avoid it where possible.
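To see why the match score drops, it helps to compare the two segments with and without their tags. The snippet below is only a rough illustration: it assumes the tags appear in the segment text in the shorthand form shown above (a letter followed by a number), and the `strip_tags` helper is a hypothetical name, not part of OmegaT.

```python
import re

# Assumption: tags look like <t0>, </t0>, <f1>, </f1> or <x2/> in the segment text.
TAG = re.compile(r"</?[a-zA-Z]+\d+/?>")


def strip_tags(segment: str) -> str:
    """Return the segment with all shorthand tags removed."""
    return TAG.sub("", segment)


editor_segment = "The <t0>dog</t0> bit Johny"  # as shown in the Editor pane
tm_segment = "The <f1>dog</f1> bit Johny"      # as stored in the Trados TM

# The raw strings differ, so the match is not exact.
same_with_tags = editor_segment == tm_segment

# Once the tags are removed, the text is identical.
same_without_tags = strip_tags(editor_segment) == strip_tags(tm_segment)

print(same_with_tags, same_without_tags)
```

The only difference between the two segments is the tag names, which is exactly what the practices below try to work around.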
Keeping 100% Matches in the Files
The best thing you can do is to keep 100% matches in the source files. To do so, you need to pre-translate the source files using the original TM. For instance, if you have a TTX, you’ll use SDL Trados, or ask the client to use it, to pre-translate your TTX with a Trados TM.
Doing so will make it possible for an OmegaT file filter to convert the tags in 100% matches automatically. At least two file filters currently support this conversion: TagEditor TTX files (Okapi) and Wordfast Pro TXML files (Okapi). As the names imply, they belong to the Okapi plugin and cover the TTX and TXML formats.
For SDLXLIFF, there’s no such smart file filter yet. The native SDLXLIFF file filter can only process source files in which the source has been copied to the target. You can try to use Okapi Rainbow to convert a pre-translated SDLXLIFF file, but chances are the tags produced by Okapi will be different from those in OmegaT. You can then try to optimize them as described in the next section.
When neither of the above methods is available, the last resort is to simply work on a pre-translated file, ignoring the fact that the file filter doesn’t support 100% matches. For example, you can use the XLIFF Files filter to work on pre-translated SDLXLIFF files (just make sure you copy source to target for the segments other than 100% matches). OmegaT will show the translation in the source part of the segment. This isn’t perfect by any means, but it’s better than nothing when no other options are available, especially when 100% matches aren’t supposed to be reviewed. After all, you don’t want to insert 100% matches manually in the segments you aren’t paid for, do you?
Optimizing Fuzzy Matches
The second best practice is to optimize an exported TMX by editing it with a text editor. Some of the ideas include:
- Replacing the letter in the TMX tags with the letter used in the tags displayed in OmegaT. This is a good start, but some tags will also differ in their numbers.
- Removing the tags from the beginning or the end of the segments. These tags can cause mismatches because OmegaT’s file filters often don’t display these tags while the translations in a TMX do have them.
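If you’d rather script these edits than do them by hand, the two ideas above could be sketched roughly as follows. This is only a sketch under a few assumptions: the tags appear in the segment text in the shorthand form shown earlier (such as <f1> and </f1>), which may not be how your TMX actually encodes them; the function names `rename_tags` and `strip_edge_tags` are made up for this example; and the `offset` parameter is a crude way to shift tag numbers that won’t fit every file.

```python
import re

# Assumption: tags look like <f1>, </f1> or <f2/> in the segment text.
SHORT_TAG = re.compile(r"<(/?)([a-zA-Z]+)(\d+)(/?)>")
LEADING_TAGS = re.compile(r"^(?:\s*</?[a-zA-Z]+\d+/?>)+\s*")
TRAILING_TAGS = re.compile(r"\s*(?:</?[a-zA-Z]+\d+/?>\s*)+$")


def rename_tags(segment: str, new_letter: str = "t", offset: int = 0) -> str:
    """Replace the letter in every shorthand tag and shift its number by offset."""
    def replace(match: re.Match) -> str:
        slash, _old_letter, number, self_close = match.groups()
        return f"<{slash}{new_letter}{int(number) + offset}{self_close}>"

    return SHORT_TAG.sub(replace, segment)


def strip_edge_tags(segment: str) -> str:
    """Remove tags at the very beginning and the very end of a segment."""
    segment = LEADING_TAGS.sub("", segment)
    return TRAILING_TAGS.sub("", segment)


# Turn the Trados-style tags into the ones OmegaT displays.
print(rename_tags("The <f1>dog</f1> bit Johny", new_letter="t", offset=-1))

# Drop a tag pair that wraps the whole segment.
print(strip_edge_tags("<f1>Click OK.</f1>"))
```

Note that `strip_edge_tags` doesn’t check whether a removed tag has a partner in the middle of the segment, so review the result before using the edited TMX, and always work on a copy of the file.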
When you want to reduce the number of mismatches between the tags produced by OmegaT and other CAT tools, the first step is to keep 100% matches in your source files by pre-translating them. With fuzzy matches, you can adjust or remove the tags in TMX using a text editor. While neither of the methods is perfect, it’s a good start compared to not thinking about mismatches at all.
If you have any other ideas for optimizing the tags, do tell us about them!