Recently, I wrote about how Kos Ivantsov’s scripts help make better use of OmegaT. Let’s look at one specific example now. We recently had a job that, in addition to new text, included some 40,000 full (100%) matches that we agreed with the client not to review. This job included DTP and post-DTP review. In the course of the post-DTP review, we noticed that the 100% matches we had not touched did not have non-breaking spaces after numbers. As a result, many lines ended with a number while the next word, such as a measurement unit, was pushed to the following line. This is incorrect. Example:
За прошедшую неделю программу с нашего сайта загрузили 157
раз. Это стало для нас рекордом.
We spent several hours fixing this problem, and this was for unpaid 100% matches. We definitely needed a better way to deal with this in the future, and one of Kos’s scripts provided the solution.
OmegaT does not have built-in search and replace functionality yet (a frequent and valid criticism), but you can use this script as a workaround. Unfortunately, like most other scripts in OmegaT, it is not as straightforward as searching and replacing through a dialog in Microsoft Word. Yet it does allow you to run multiple replacements in a single unattended pass. Automatic replacement requires caution, of course, but the ability itself is terrific, especially because you can create a list of regular replacements to apply across projects, saving the time it normally takes to do this manually.
The first step is figuring out what to search for and what to replace it with. With straightforward replacements, you can use literal strings. For example, I can use the script to change “Вашим” to “вашим” if I do not want this word capitalized in an English-to-Russian translation. But for my task of replacing regular spaces with non-breaking ones, I have to use regular expressions to cover all the possibilities:
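Example: 25 000 рублей
What to search for: (\d)\s([А-я])
What to replace with: $1 $2 (with a non-breaking space between $1 and $2)
What to search for:
(\d) matches any single digit.
\s matches a whitespace character, such as a regular space.
([А-я]) matches any Russian letter from А to я (this range does not include Ё and ё).
What to replace with:
$1 inserts the digit matched by (\d).
Then comes the non-breaking space.
$2 inserts the letter matched by ([А-я]).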
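To see what this search and replace pair does outside OmegaT, here is a minimal sketch in Python. It is not the script itself, only an illustration of the same regular expression; the function name add_nbsp is made up for this example, and Python builds the replacement from the captured groups directly instead of using $1 and $2.

```python
import re

NBSP = "\u00a0"  # non-breaking space

# A digit, one whitespace character, then a Cyrillic letter in the range А-я.
# Ё and ё are outside this range; extend the class if your text needs them.
PATTERN = re.compile(r"(\d)\s([А-я])")


def add_nbsp(text: str) -> str:
    """Replace the whitespace between a digit and a Russian letter with NBSP."""
    # Hypothetical helper for illustration; group 1 is the digit, group 2 the letter.
    return PATTERN.sub(lambda m: m.group(1) + NBSP + m.group(2), text)


if __name__ == "__main__":
    fixed = add_nbsp("25 000 рублей")
    # repr() makes the non-breaking space visible as \xa0
    print(repr(fixed))
```

Note that the space inside “25 000” stays as it is, because the pattern requires a letter after the space. Only the space before “рублей” is replaced.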
Using the script
- Download the script and save it into the scripts folder within the OmegaT program folder. If you are not comfortable with that, simply download the OmegaT package that I have fully configured for you with all scripts and settings.
- Create a text file called search_replace.ini in the root folder of your OmegaT project. You use this file to tell the script what replacements to make.
- Put the above search and replace strings on the first line: (\d)\s([А-я]) $1 $2 (with a non-breaking space between $1 and $2).
- Leave the second line empty. The last line of the file must always remain empty.
- In OmegaT, select Tools => Scripting, click the script in the left-hand panel, and click Run.
- The script will make the replacements and display the result window, allowing you to check the results.
Benefitting from the script across projects
Had we run this replacement before the unfortunate DTP job I mentioned, we could have saved a few hours. And this is what we are going to do in the future: build a list of replacements that reduces time-consuming manual work, such as adding non-breaking spaces or replacing quotation marks, and eliminates the risk of such problems. We will store it in a central location, copy it into each project, and run it towards the end of the project. I think this is a good idea for every OmegaT user.
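The idea of a reusable replacement list can be sketched in a few lines of Python. This is only an illustration of the approach, not how the OmegaT script works internally: the RULES list and the apply_rules function are hypothetical names, and the two rules are the examples from this post.

```python
import re

# Hypothetical shared rule list: (search pattern, replacement) pairs, applied in order.
# Python writes backreferences as \1 and \2 rather than $1 and $2.
RULES = [
    (r"(\d)\s([А-я])", "\\1\u00a0\\2"),  # non-breaking space between a digit and a Russian letter
    (r"\bВашим\b", "вашим"),             # literal replacement: lowercase the polite form
]


def apply_rules(text, rules=RULES):
    """Apply every rule in order; return the new text and the number of replacements made."""
    total = 0
    for pattern, replacement in rules:
        text, count = re.subn(pattern, replacement, text)
        total += count
    return text, total


if __name__ == "__main__":
    new_text, changed = apply_rules("Спасибо Вашим коллегам за 25 000 рублей")
    print(repr(new_text), changed)
```

Returning the replacement count is a deliberate choice here: with unattended replacements, a quick look at how many changes each run made is a cheap way to catch a rule that matches far more, or far less, than expected.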
From now on, the OmegaT version we provide as a download will include the latest version of our replacement list. The list is designed for Russian translation and includes explanations of the replacements, so you can remove those that you do not need in a particular project.
Kos, you are a genius. Make sure to read more about what this talented individual has been up to lately.
Now tell me: Is there a CAT tool, including the commercial ones, that can do this as effectively as OmegaT?