Webinar Video: Remove Leading and Trailing Tags Option in OmegaT

This is the second video from the OmegaT webinar. You can find the first one here.

This video explains the basics of using this option. When you use OmegaT, you face the following dilemma:

  • You can make OmegaT display all leading and trailing tags, including the ones that you need and the ones that you do not need, i.e. superfluous tags.
  • Or you can make OmegaT hide all these tags, but you will not be able to move these tags around if you need to do so.

Because the second option is not workable (you cannot translate without the tags that you need to move around), I recommend the first option. To deal with its drawback, the superfluous tags, you can use segmentation rules to move those tags out of segments. And when you need a specific leading or trailing tag back in a segment, you simply add a segmentation counter-rule that takes priority over your “tag extraction” rules.
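To make the idea of extraction rules and a counter-rule more concrete, here is a minimal Python sketch. It is only a simplified model, not OmegaT's actual segmenter: it assumes that rules are checked from top to bottom and that the first rule matching at a position decides whether a break happens there. The `Rule` class, the `segment` function, the patterns, and the sample tags such as `<f0>` are all hypothetical and chosen for illustration. OmegaT itself uses Java regular expressions, so real rules may need adjusting.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    is_break: bool  # True = break rule, False = counter-rule (no break here)
    before: str     # regex that must match the text ending at the position
    after: str      # regex that must match the text starting at the position


def segment(text, rules):
    """Split text wherever the first matching rule is a break rule."""
    segments, start = [], 0
    for pos in range(1, len(text)):
        for rule in rules:
            # "before" has to match right up to pos, "after" right from pos.
            before_ok = re.search("(?:" + rule.before + r")\Z", text[:pos])
            after_ok = re.match(rule.after, text[pos:])
            if before_ok and after_ok:
                if rule.is_break:
                    segments.append(text[start:pos])
                    start = pos
                break  # first matching rule decides; lower rules are skipped
    segments.append(text[start:])
    return segments


# Assumed shape of a run of tags such as <f0></f1> (hypothetical pattern).
TAG_RUN = r"(?:<[^>]+>)+"

# "Tag extraction" rules: split leading and trailing tags off the text.
EXTRACTION_RULES = [
    Rule(True, "^" + TAG_RUN, r"[^<]"),    # break after leading tags
    Rule(True, r"[^>]", TAG_RUN + r"\Z"),  # break before trailing tags
]

# Counter-rule for one specific segment where the leading tag is needed.
COUNTER_RULE = Rule(False, r"^<f0>", r"Warning:")
```

With only the extraction rules, `segment("<f0>Click <f1>OK</f1> to continue.</f0>", EXTRACTION_RULES)` puts `<f0>` and `</f0>` into their own segments and leaves the inner `<f1>` pair alone. For `"<f0>Warning:</f0> do not unplug."`, the same rules would strip the leading `<f0>` even though it belongs with `</f0>`. Placing `COUNTER_RULE` above the extraction rules keeps that one segment whole.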

If you like this video, add us to your RSS reader to make sure you don’t miss future videos about some of the advanced OmegaT functions.

4 comments

  • Hector says:

    Hi, this is a nice workaround, but I noticed that you may end up with unwanted orphan segments when making the overriding segmentation rules, unless you make them very segment-specific, which is not always possible. Don’t you think this could be resolved by making OmegaT able to re-insert leading/trailing tags on a segment-level basis? I imagine an option in the drop-down menu that opens when you right-click a segment to “Insert leading and trailing tags for current segment only”.

  • Hector says:

    Hi,

    I am using an “improved” set of rules that I want to share with everyone here. I store them in a separate set named “Leading and trailing tags off”:

    Rule 1

    Pattern before:

    (^|\n)(\s|\xA0)*((<[^>]+>)+(\s|\xA0)*)+

    Pattern after:

    [^]

    Rule 2

    Pattern before:

    [^]

    Pattern after:

    (\s|\xA0)*((<[^>]+>)+(\s|\xA0)*)+($|\n)

    This way you can make sure that any tags before and after your text make it to standalone segments. Actually, I was thinking that if this set of rules was the default, there would be no need for the “Remove leading and trailing tags” option. Instead, a “Hide tag-only segments” option would be better suited.

    Now, if you go further down this trail, you could pre-process your source files so that all line breaks get replaced by some tag. In theory, that would make it possible to join segments that currently cannot be joined due to line breaks. I have not tested that approach yet, but I would like to.

    Hope you like the improved regex!

    • Roman Mironov says:

      Hi Hector,

      This is great. It will be my pleasure to test your new rules.

      Would you care to explain what exactly is improved compared to what I suggested?

      Thank you so much for sharing.



About the Author

Roman Mironov
CEO & Founder

As the founder of Velior, Roman has had the privilege of being able to turn his passion for languages into a business. He has over 15 years of experience in the translation industry. Roman has helped dozens of clients increase sales by making their products appealing for speakers of other languages.