Segmentation Is Important

Segmentation means breaking translatable text into the smallest translation-friendly logical pieces. Because computer-aided translation (CAT) tools segment texts using very basic rules by default (such as splitting a paragraph into sentences wherever a period ends one), the result is often suboptimal. It does not have to be this way.
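
To make the idea of a "very basic rule" concrete, here is a minimal Python sketch. It does not reproduce the rules of any particular CAT tool. The regular expression, the exception list, and the function names are all assumptions chosen for illustration. It shows how a break-after-every-period rule splits a sentence at an abbreviation, and how a small exception list repairs it.

```python
import re

# Naive default rule (assumed for illustration): a segment ends after
# ".", "!" or "?" whenever whitespace follows.
NAIVE_BREAK = re.compile(r"(?<=[.!?])\s+")

# Hypothetical exception list; a real project would tune it per language.
NO_BREAK_AFTER = ("e.g.", "i.e.", "No.", "Dr.")


def naive_segments(text):
    """Split text with the basic rule only."""
    return [s for s in NAIVE_BREAK.split(text.strip()) if s]


def tuned_segments(text):
    """Apply the basic rule, then undo breaks made after known abbreviations."""
    segments = []
    for piece in naive_segments(text):
        if segments and segments[-1].endswith(NO_BREAK_AFTER):
            # The previous piece ended in an abbreviation, so glue this one back on.
            segments[-1] = segments[-1] + " " + piece
        else:
            segments.append(piece)
    return segments


sample = "Pick a color, e.g. blue. Whatever you do, don't think of it."
for segment in naive_segments(sample):
    print("naive:", segment)
for segment in tuned_segments(sample):
    print("tuned:", segment)
```

The naive rule yields three segments, one of which is the stray fragment "blue.", while the tuned version yields two complete sentences. An exception list is only one possible refinement, and it would need care with abbreviations that can also end a sentence.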

Poor segmentation makes it difficult to do a good job.

First and foremost, suboptimal segmentation makes it difficult for a translation team to do a good job. While it is not as critical as, say, reference materials, bad segmentation still causes additional work and frustration, with a negative impact on the end result.

Suppose I have a sentence:

Whatever you do, don’t think of the color blue.

And it is erroneously presented to a translator segmented like this:

Whatever you do,

don’t think of the color

blue.

Instead of thinking about the best translation, the translator concentrates on how to work around this erroneous segmentation. If the project is riddled with such broken sentences, the translator loses concentration, leading to errors. Revising these bits and pieces is difficult, too, since an editor has to focus on finding what goes where, as much as on checking the translation itself.

Good segmentation: easier for translators

Optimized segmentation makes it easier for every stakeholder to work. While, for example, packing several sentences into one segment is not bad per se, breaking them into separate sentences often leads to better results. First, completing single sentences is psychologically easier. Second, shorter segments produce more repetitions and better translation memory (TM) matches.

Less risk of omitting things

Removing all “untranslatables” from the beginning and the end of segments is also important. Things like list numbers and bullets, tags, footnote asterisks, etc. are irrelevant for translation and only waste translators’ time. Moreover, it is easy to delete such untranslatables accidentally. Extracting them out of the segments that are visible on the translator’s editing screen produces cleaner segments, which are easier to work on.
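
As a rough illustration of moving untranslatables out of a segment, the following Python sketch separates a segment into a prefix, a translatable core, and a suffix. The patterns are assumptions made for this example (list numbers like "1)" or "2.", simple bullets, angle-bracket tags, and trailing asterisks). A real project would adapt them to its file formats.

```python
import re

# Hypothetical patterns for illustration only.
# Leading: list numbers or bullets followed by whitespace, or inline tags.
LEADING = re.compile(r"^(?:\s*(?:(?:\d+[.)]|[•*-])(?=\s)|<[^>]+>))+\s*")
# Trailing: footnote asterisks or inline tags at the very end.
TRAILING = re.compile(r"(?:\s*(?:\*+|<[^>]+>))+\s*$")


def split_untranslatables(segment):
    """Return (prefix, core, suffix); only core needs to reach the translator."""
    lead = LEADING.match(segment)
    prefix = lead.group(0) if lead else ""
    rest = segment[len(prefix):]
    trail = TRAILING.search(rest)
    suffix = trail.group(0) if trail else ""
    core = rest[:len(rest) - len(suffix)]
    return prefix, core, suffix


prefix, core, suffix = split_untranslatables(
    "1) Whatever you do, don't think of the color blue.*"
)
print(repr(prefix), repr(core), repr(suffix))
```

Because the prefix and suffix are kept rather than discarded, the original layout can be restored after translation by joining prefix, translated core, and suffix.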

Better leveraging in future jobs

Suboptimal segmentation dramatically decreases the probability of TM matches in future jobs. Compare these two sentences:

First project: 1) Whatever you do, don’t think of the color blue.*

Second project: Whatever you do, don’t think of the color blue.

Although the actual text is identical, this sentence will not be a 100% match in the second project because of the untranslatable garbage in the first one.

Or:

First project: Whatever you do,

don’t think of the color

blue.

Second project: Whatever you do, don’t think of the color blue.

In the first example, we at least get a fuzzy match; in the second, we get no match at all. The translator may even fail to find the earlier translation in the TM and end up retranslating the sentence from scratch, which is wasted work.
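
The two comparisons above can be approximated in a few lines of Python. This is only a sketch: it uses the character-based similarity from the standard library's difflib as a stand-in for a CAT tool's own match scoring, which works differently. The 75% fuzzy threshold is an assumed value, not a setting taken from any specific tool.

```python
from difflib import SequenceMatcher

FUZZY_THRESHOLD = 75  # assumed cut-off below which a TM hit is not offered


def match_score(tm_segment, new_segment):
    """Rough similarity in percent between a stored and a new segment."""
    return round(100 * SequenceMatcher(None, tm_segment, new_segment).ratio())


def best_match(tm_segments, new_segment):
    """Highest score that any stored segment reaches against the new one."""
    return max(match_score(s, new_segment) for s in tm_segments)


new = "Whatever you do, don't think of the color blue."

tm_clean = [new]
tm_with_debris = ["1) Whatever you do, don't think of the color blue.*"]
tm_fragmented = ["Whatever you do,", "don't think of the color", "blue."]

for name, tm in [("clean", tm_clean),
                 ("debris", tm_with_debris),
                 ("fragmented", tm_fragmented)]:
    score = best_match(tm, new)
    status = "offered" if score >= FUZZY_THRESHOLD else "not offered"
    print(f"{name}: {score}% ({status})")
```

With this rough measure, the clean TM gives an exact match, the TM with leftover list number and asterisk gives a high fuzzy match, and every fragment of the broken sentence stays below the assumed threshold.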

Drawbacks

Optimizing segmentation is a time-consuming and knowledge-intensive process. Creating a project in SDL Trados Studio and sending it to a translator is one thing, which almost anyone can do. But doing this in a way that is optimal is quite another thing. Some do not like the delay it causes in a project; others do not have the expertise to do it.

Summary

Optimizing segmentation deserves more attention than it currently gets, because its advantages include easier work for translators and better TM leveraging. Although it takes time and expertise, it brings sizeable benefits for all stakeholders.

About the Author

Roman Mironov
CEO & Founder

As the founder of Velior, Roman has had the privilege of being able to turn his passion for languages into a business. He has over 15 years of experience in the translation industry. Roman has helped dozens of clients increase sales by making their products appealing for speakers of other languages.