Last time, I looked at some of the areas where the Okapi XLIFF filter for OmegaT is superior to the native one. But if you already can’t wait to start using the new filter, there’s something else you should know that might make you change your mind.
The single most important drawback is tag numbering. While the native filter restarts the tag numbering at the beginning of each paragraph, the Okapi filter numbers the tags continuously from the beginning of the file, so every tag in the file gets a unique number. Here is some text as rendered by the native filter:
Close the door. For details, see section < x0/>.
Click < g0>OK< /g0> button. For details, see section < x0/>.
Call us if you have any problems< g1>.< /g1> For details, see section < x0/>.
Even though it represents a different section name in each case, the < x0/> tag is identical from OmegaT’s standpoint. As a result, OmegaT processes all three "For details, see section < x0/>." segments as non-unique segments (internal repetitions).
The same text rendered by the Okapi filter:
Close the door. For details, see section < x40/>.
Click < g130>OK< /g130> button. For details, see section < x131/>.
Call us if you have any problems< g1000>.< /g1000> For details, see section < x1002/>.
All three < x> tags are unique. As a result, OmegaT processes the three "For details, see section" segments as unique segments.
This results in two problems:
- You’ll lose some of the repetitions, which translates into more work and a higher risk of inconsistency.
- Since the tags are pretty much unique, you’ll have trouble inserting matches from previous translations. The tag numbers will likely mismatch every time.
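To see why the numbering scheme matters for repetitions, here is a minimal Python sketch. It is not how OmegaT or the Okapi filter works internally; the `renumber` and `count_repetitions` helpers are hypothetical names made up for this illustration, and the tags are written without the extra spaces used in the examples above. It counts exact repetitions among file-numbered segments as they are, and again after renumbering the tags from 0 within each segment, which is roughly what per-paragraph numbering gives you.

```python
import re
from collections import Counter

# Matches OmegaT-style placeholder tags such as <x0/>, <g12> and </g12>.
TAG = re.compile(r"<(/?)([a-z]+)(\d+)(/?)>")


def renumber(segment):
    """Renumber the tags in one segment from 0, in order of first appearance."""
    mapping = {}

    def repl(match):
        slash, name, num, self_close = match.groups()
        key = (name, num)
        if key not in mapping:
            mapping[key] = len(mapping)
        return f"<{slash}{name}{mapping[key]}{self_close}>"

    return TAG.sub(repl, segment)


def count_repetitions(segments):
    """Count segments that exactly repeat an earlier segment."""
    counts = Counter(segments)
    return sum(n - 1 for n in counts.values())


# The repeated sentence from the example, numbered from the start of the file.
okapi_segments = [
    "For details, see section <x40/>.",
    "For details, see section <x131/>.",
    "For details, see section <x1002/>.",
]

print(count_repetitions(okapi_segments))                         # file-wide numbering
print(count_repetitions([renumber(s) for s in okapi_segments]))  # per-segment numbering
```

With file-wide numbering, none of the three segments repeats another; once the tags are renumbered per segment, two of them become repetitions of the first. Keep in mind that this only illustrates the matching side of the problem.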
Minor (and probably temporary)
Right now, the filter doesn’t support UTF-8 files that start with a BOM (byte order mark). You need to convert the SDLXLIFF files to UTF-8 without BOM first. The Okapi developers are working on this, though.
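If you'd rather not do the conversion by hand in a text editor, a few lines of Python can remove the BOM. This is only a sketch: `strip_utf8_bom` is a hypothetical helper name, and it assumes the file is already UTF-8 and merely has the three BOM bytes at the start. A file in another encoding (UTF-16, for instance) would need a real re-encoding instead.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file in place.

    Returns True if a BOM was found and removed, False otherwise.
    """
    p = Path(path)
    data = p.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        p.write_bytes(data[len(codecs.BOM_UTF8):])
        return True
    return False
```

Since the function overwrites the file, run it on a copy, for example `strip_utf8_bom("copy_of_my_file.sdlxliff")`, and keep the original somewhere safe.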
Okay then, the new filter has both advantages and drawbacks. How do we make use of it?
- Editing the SDLXLIFF files translated by someone else. Editing such files with the native filter isn’t exactly a straightforward process. You need to create a TM in Trados from the translated file, overwrite the translations in the file with the source text, and then go through the hassle of inserting the 100% matches in OmegaT. This is where the Okapi filter is definitely superior because there’s no preparation involved at all.
- Translating smaller projects without any existing TM. In these cases, lost repetitions and mismatching tags from the previous TMs can’t be an issue.
For any projects with existing TMs, ongoing projects, or projects with many internal repetitions, the new Okapi filter should be used with caution. The mismatching tags might reduce the number of repetitions and matches with the TM.
One more thing
Last, but definitely not least: always remember to do a round-trip before actually translating the SDLXLIFF files with the Okapi filter. This will let you discover any potential problems with creating the translated documents before it’s too late. Folks forget about this quite often and only find out about the problem at the last minute.
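Part of the round-trip check can be automated. The Python sketch below compares the original SDLXLIFF file with the one OmegaT created from an untranslated project. It assumes that SDLXLIFF uses the XLIFF 1.2 namespace and that a healthy round trip keeps the same `trans-unit` ids in the same order; the function names are hypothetical. It is no substitute for opening the resulting file in the tool that has to consume it.

```python
import xml.etree.ElementTree as ET

# Assumption: SDLXLIFF files are XLIFF 1.2 documents with this default namespace.
XLIFF_NS = "{urn:oasis:names:tc:xliff:document:1.2}"


def trans_unit_ids(path):
    """Return the ids of all <trans-unit> elements in document order.

    ET.parse raises xml.etree.ElementTree.ParseError if the file is not
    well-formed XML, which is already a sign that the round trip failed.
    """
    root = ET.parse(path).getroot()
    return [tu.get("id") for tu in root.iter(f"{XLIFF_NS}trans-unit")]


def roundtrip_looks_sane(original_path, roundtripped_path):
    """True if both files contain the same trans-unit ids in the same order."""
    return trans_unit_ids(original_path) == trans_unit_ids(roundtripped_path)
```

A `False` result or a parse error means something went wrong when the translated document was created. A `True` result only tells you that the basic structure survived.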
For more information about translating the SDLXLIFF files and other file formats such as TXML or IDML in OmegaT, refer to the very well-maintained page on the Okapi wiki.
Do you think you’ll be able to make use of the new filter? If not, why would you prefer to stick to the native one?