Velior's Corporate Blog about Translation and Translation Industry


Archive for the ‘Translation Technology’ Category

Filter Function in OmegaT

November 8th, 2011, Roman Mironov

OmegaT provides an incredibly powerful capability to filter segments in the editor pane. A similar function is available in other translation environment tools as well, e.g. Wordfast allows you to select just 100% matches or fuzzy matches for easy navigation. OmegaT, however, takes this functionality to a whole new level. This post describes some of its applications.

The basic idea of using a filter is to save time by limiting your scope of work to only those segments that require attention, while also making navigation between them instantaneous. To apply a filter, you need to open the Text Search window (Ctrl+F), perform a search to find all segments you need, and then click Filter in the lower right corner. The OmegaT editor pane will now display and make available for editing only those segments. To disable the filter, perform any other search, click Filter, and then Remove Filter.

  1. Perhaps the greatest benefit our translation company has derived from this feature is the ability to remove unpaid 100% matches from the scope of work. This ability is essential when a client wants to insert 100% matches into the current translation automatically and without any review, thus avoiding the costs associated with reviewing them. It then makes sense to exclude such 100% matches from the workflow to a reasonable extent. While translating, you can simply skip 100% matches by going to the next untranslated segment each time (Ctrl+U). For the editing step, however, this shortcut doesn’t work, because by then every segment is translated. This is when a filter comes in handy. All you need to do is come up with appropriate search criteria that will find only the segments changed in the course of translation. An example of such criteria is searching for all TM entries committed under a specific translator’s name. After finding them and applying the filter, you will be able to focus exclusively on the required segments.
  2. You often need to make global changes, e.g. to ensure a term is translated consistently. The straightforward way is to find the segments containing this term through the Text Search window and then start clicking the segments one by one to open and modify them in the editor. Clearly, the more occurrences of this term you have, the less efficient this navigation procedure gets. In such cases, we sometimes prefer to open the project’s TM in a text editor such as Notepad++ and make changes there in order to do it faster. The filter feature reduces the need for this type of workaround by allowing you to display only those segments that require changes and move through them with speed.
  3. To process TTX files that include already translated Context TM (Perfect Match, XU) segments, we use the great Toxic utility to convert the files to the format supported by OmegaT. The downside of this process is that OmegaT incorrectly provides the target text of such Context TM segments for translation as if it were the source text (due to Toxic’s method of conversion). Just as with the unpaid 100% matches, these segments can slow you down. To increase efficiency, you can use the filter feature to exclude them from the scope of work.
  4. Recently, I mentioned that OmegaT now provides the note feature. The filter function can help optimize this feature as well. After finding all segments with notes in the Text Search window, you can apply a filter to display these segments and move through them directly in the editor pane, rather than do this by clicking each segment in the Text Search window and switching back to the editor. Again, the more segments with notes you have, the more efficiency a filter will bring.
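
For readers who like to double-check their filter criteria outside OmegaT, here is a minimal Python sketch of the idea behind item 1: listing the segments whose translation was committed under a specific translator’s name. It is not an OmegaT feature. It assumes that the project TM records the author in a `changeid` attribute on the target `<tuv>` element, which you should verify against your own project_save.tmx. The function name and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET


def segments_by_author(tmx_text, author):
    """Return (source, target) pairs whose target was last changed by `author`.

    Assumption: the author is stored in a `changeid` attribute on the
    second <tuv> (the target) of each <tu>.
    """
    pairs = []
    for tu in ET.fromstring(tmx_text).iter("tu"):
        tuvs = tu.findall("tuv")
        if len(tuvs) < 2:
            continue  # skip entries without a target
        source, target = tuvs[0], tuvs[1]
        if target.get("changeid") == author:
            pairs.append((source.findtext("seg", default=""),
                          target.findtext("seg", default="")))
    return pairs


# Hypothetical sample data, not taken from a real project.
SAMPLE_TMX = """<tmx version="1.1">
  <header/>
  <body>
    <tu>
      <tuv lang="EN"><seg>Open the file.</seg></tuv>
      <tuv lang="FR" changeid="anna"><seg>Ouvrez le fichier.</seg></tuv>
    </tu>
    <tu>
      <tuv lang="EN"><seg>Save the file.</seg></tuv>
      <tuv lang="FR" changeid="auto"><seg>Enregistrez le fichier.</seg></tuv>
    </tu>
  </body>
</tmx>"""

for source, target in segments_by_author(SAMPLE_TMX, "anna"):
    print(source, "->", target)
```

For a real project you would read the TMX file from disk instead of using a sample string. The point is only that the author name is a convenient criterion for separating segments a translator actually touched from pre-inserted matches.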

Optimizing Project Glossaries

November 3rd, 2011, Roman Mironov

Creating and maintaining a glossary for a specific client is a best practice in the translation industry. A glossary contains the client’s terminology, making it easier to access approved translations and ensure consistency across this client’s projects. There are, however, a few common pitfalls that can render glossaries difficult to use or misleading. This post provides a few suggestions on how to avoid such pitfalls and get the most out of your glossaries.

Challenge: Many glossaries are created before the actual translation starts, by extracting and translating a list of frequent terms. This supposedly helps translators to access correct terms easily and avoid discrepancies in the course of the project. While the idea of creating a glossary in advance is undoubtedly reasonable, its actual implementation can be far from perfect. The terms in such a list are translated out of context, which inevitably results in a significant percentage of overly general translations or even mistranslations. The resulting glossary becomes misleading and can either cause errors in future translations or, in the case of a client-approved glossary, create delays in the project schedule, because the translation team will need to compile a list of suggested corrections and wait for the client to approve them.

Solution: I recommend creating your glossary along the way, so that it contains correct translations based on understanding rather than a translator’s guesses.

Challenge: Glossaries tend to contain words that are completely irrelevant to their purpose. These can be verbs, names of countries, general nouns, etc. Instead of helping translators, they are downright misleading. Each time a general verb or noun requires a translation different from the one provided in the glossary (and this happens extremely often!), a translator becomes confused, wondering whether it is okay to use a better translation instead of the glossary item that would make little sense in the current context. The more irrelevant words a glossary contains, the less usable it is.

Solution: It is best to have a glossary that is focused on real terms specific to this end client and avoid general words.

Challenge: It is not uncommon to regard a glossary as an “ultimate authoritative source” and comply with it blindly, regardless of what common sense might be telling you. For example, a translator following this principle may intentionally use a glossary translation, even if it doesn’t fit the current context or is obviously incorrect. An editor who spots a different translation may rush to overwrite it without considering possible reasons behind such a deviation from the glossary. Also, the same way of thinking may prevent the end client from allowing changes to the glossary.

Solution: It pays to be flexible about your glossary. A glossary is not always an ultimate authoritative source. We have been maintaining and meticulously improving some of our English to Russian translation glossaries for years, and they still don’t fit all contexts! Written communications are unique and create so many different contexts that it’s impossible and impractical to have all imaginable translations in a single glossary.

Working in Client’s Translation Environment Tools

October 27th, 2011, Roman Mironov

Although dozens of both free and commercial translation environment tools (TEnTs) are available on the market, translation buyers, including translation agencies and direct clients, sometimes choose to develop and use their own tools. In this scenario, particularly typical for larger companies, a client asks a translation vendor to use their company’s tool instead of whatever tool that vendor prefers. One widely known example is Translation Workspace used by the translation agency Lionbridge. We worked with quite a few tools of this kind over the years, and I would like to share some of our experiences with them.

Benefits

  1. Perhaps the biggest benefit of such tools is that they make things easier for a client through better integration of the translation process with the client’s content management system. The client is able to seamlessly integrate their translation vendor into the content production process, which increases productivity and makes the client less dependent on any specific translation vendor.
  2. Many tools of this kind are web-based, thus having various advantages provided by this valuable technology. A web-based system is typically very intuitive and has little or no prerequisites: all you need to start translating is a web-browser and your login credentials. It also simplifies management and communication by making file exchange over email irrelevant—the text to translate, translation memory, and glossary are all inside the tool.

Challenges

  1. From the translation vendor’s perspective, working in the client’s tool makes it impossible to fully utilize the strengths of their standard workflow. For example, while working in a client’s web-based system, we often can’t employ our quality assurance tool or have the translator check and approve the editor’s changes.
  2. Whenever any technical issue arises in the course of translation, the only way for the vendor to resolve it is to contact the client for help. This may take more time as compared to using third-party tools. For instance, we have extensive experience with the tools we use on a daily basis and can therefore resolve any issues immediately and without involving the client.
  3. Some of the client’s tools might be less productive than the latest translation environment tools such as OmegaT. One reason is that the clients don’t find it necessary to improve their tools as actively as developers of TEnTs do. A client may have developed their tool years ago and now has little motivation to improve it, because it already provides the basic functionality and the company’s employees are comfortable with it as it is.

In summary, client’s translation environment tools can be a very easy and time-saving way to provide translation for both clients and translation vendors. We are normally happy to provide English to Russian translation in such TEnTs, unless they make our quality assurance process completely impossible.

Ubuntu as an Operating System for Translators

October 19th, 2011, Roman Mironov

A few months ago, I tested Ubuntu 10.10 to see whether the savings generated by using this free operating system for translation purposes instead of Microsoft Windows are worthwhile. Over a period of six months, I used Ubuntu to handle dozens of translation projects. This post outlines some of my findings.

The good news is that you can definitely use Ubuntu for translation. In fact, this OS leaves a much better impression than I originally expected. Ubuntu supports at least three translation environment tools: OmegaT (also free), Wordfast Pro, and Swordfish. It also provides most of the tools essential to running a translation business: OpenOffice.org (a viable alternative to Microsoft Office), Adobe Reader, web-browser, email client, instant messaging (including Skype), and many more.

What’s the catch then?

Because Ubuntu is not as popular and user-friendly as Windows, it doesn’t offer a similarly high level of user experience. The three most serious issues I encountered are as follows:

  1. Getting all your programs to work properly might take quite some time and require knowledge beyond basic PC user skills. If you want to deploy Ubuntu in a translation agency environment, you will also need to invest significant time in basic user training.
  2. While many programs a translator might need either have Ubuntu versions or can be replaced with alternatives that Ubuntu provides, there is a good chance that at least some of your favorite programs won’t be available under this OS. For me, this list included:

    • Abbyy Lingvo (dictionary)

    • Abbyy FineReader (OCR tool)

    • Punto Switcher (automatically switches keyboard layout and quickly enters phrases that I use frequently)

    • Foxit Reader (there was an Ubuntu version at the time of testing, but it didn’t allow editing PDF files as the Windows version does)

    • QA Distiller (quality assurance tool)

  3. Another major roadblock to using Ubuntu for translation purposes is compatibility issues. Instead of Microsoft Office, which is used by most companies and individuals, you will have to use OpenOffice.org (OOO). Although OOO is a great program and in fact supports Office files without any conversion, I found that this support is in name only. The program can indeed handle small and plain Office files successfully. With larger and more complex files, however, a roundtrip (Office to OOO and back to Office) often results in corrupted formatting. This means that if you often translate DOCX or PPTX files, you will likely end up using Office to check and adjust formatting after translation, because OOO won’t be able to produce a file that will display correctly in Office. Thus, handling such files will be very difficult, unless you have a separate Windows machine with Microsoft Office installed.

Based on this testing experience, I came to the conclusion that using Ubuntu for translation purposes instead of Windows is certainly possible, but may not be worthwhile. The main reason to use a free, but less popular OS is to save on license costs. This economy, however, is heavily offset by the significant amount of time and effort you will need to invest in getting the programs to work properly and dealing with compatibility issues. Although Ubuntu is a great operating system, I just don’t think it has any economic advantage over Windows for translators.

OmegaT 2.5.0: Another Quantum Leap

October 4th, 2011, Roman Mironov

This post reviews some of the changes implemented in the latest version of OmegaT, 2.5.0, released this week. The new version introduces quite a few new features. This time, however, I will focus only on those that we are already using for English to Russian translation on a daily basis.

  1. OmegaT now supports multiple translations of identical segments (internal repetitions). Previously, whenever a repetition needed a translation different from the one used elsewhere in the project, we would insert a special mark in this segment. After creating the translated documents, we would search for the mark and adjust the translation as required. Hardly effective, this approach is now history. Whenever you want to use a different translation for an internal repetition, you can do this right away within the program by selecting the Create Alternative Translation command in the context menu available in the editor pane.
  2. You can now add notes to segments. Previously, we would add a special mark and insert a note directly inside a segment. Now, it is possible to enter or edit a note for any segment in a dedicated pane. When you re-open a segment that includes a note, this note will appear in the pane. To find all segments with notes, press Ctrl+F to open the search window, deselect all options except In notes, and perform search with an empty string. Needless to say, this is a very useful and long-awaited feature, especially for a translation agency environment like ours where a translator and an editor use notes to communicate with each other regarding their choices and questions.
  3. Another improvement is the ability to have project-specific file filters and segmentation rules. Prior to this version, you could only use one common set of filters and segmentation rules for all your projects, which was stored locally on your PC. In our case, this resulted in a problem each time a translator added a new segmentation rule to improve incorrect segmentation. Because the new rule was saved locally on this translator’s PC, opening this project on another PC, e.g. for editing, resulted in a different segmentation. Each time an editor had to either add the same rule manually or ask the translator to provide the segmentation rule file stored locally on this translator’s PC. Now, when adding a rule, you can make it project-specific by pressing Ctrl+E, selecting Segmentation and then Project-specific segmentation rules. The segmentation rule file will be saved to the omegat folder of this project. Whoever opens this project thereafter will get exactly the same segmentation based on the rules in this file. A further benefit of this improvement is that you can avoid automatic application of any rules set up in an older project to all new projects. Previously, all rules you once set up, however specific they might have been, continued to apply to all future projects, often ruining segmentation.
  4. The last new feature I want to cover today is the improvement to the process of detecting changes in the external translation memories. Prior to 2.5.0, if you made any changes to, or added, any TMXs in the tm folder of your project, you needed to reload the project to be able to see the changes. This was particularly time-consuming with multiple large TMXs. Now, the program automatically detects any changes or additions in the tm folder. One benefit we’ve already derived from this improvement is an optimized process of sharing TMs between two or more translators working on the same project simultaneously. Previously, whenever one translator placed their current TM to the tm folder, they had to ask a colleague to reload the project to be able to see the latest translations added by that translator to the TM. Now, you simply place the TM to the tm folder, and the other translator sees it immediately without any notification or reloading.

In summary, the new version is a big hit with us. We truly appreciate the continued efforts the OmegaT developers are putting into this highly successful free open-source tool. If you are in need of a high-performance translation environment tool to boost the productivity and quality of your work, I strongly encourage you to try out this new version. For more information about OmegaT, please read other related blog posts. We’ll also continue reviewing the remaining new features in future posts.

Quality Assurance in OmegaT

August 31st, 2011, Roman Mironov

Unlike other popular translation environment tools, OmegaT doesn’t provide a full spectrum of built-in quality assurance functions out of the box. Automated quality assurance is, however, paramount to high performance in our line of work, so translators or translation agencies working in OmegaT will need to use mainly external solutions. This post is intended to briefly discuss the most essential QA tools available to OmegaT users.

Spelling and Grammar Checker

OmegaT makes it possible to check spelling and catch various style or grammar errors from within the program. For information on how to enable these checks, please refer to our previous post. The checks are performed as you go through the file, with potential errors highlighted in the editor. You can’t, however, run them on the entire file at once as you can, say, in Microsoft Word or Wordfast. You will normally want to do so, because it’s way too easy to miss an error if you simply look through your file for highlighted errors rather than focus on them by running a specific type of check. For that purpose, you can use the external QA tools mentioned below.

Tag Checker

OmegaT provides a built-in tag checker. All you need to do is run it by pressing Ctrl+T. This is a very important step in your QA process, since OmegaT doesn’t protect tags. The risk of deleting a tag partially or completely and missing it during editing is therefore very high, so it is imperative to run this checker on each project that involves tagged text.

Quality Assurance Checker

There are at least three reliable tools that you can use to check for errors such as inconsistent translations, untranslated segments, numbers, glossary entries, and many more: QA Distiller, ApSIC Xbench, and CheckMate. The latter two are available for free. While we use mainly QA Distiller, the free ApSIC Xbench and CheckMate provide basically the same functionality.

Whichever program you choose, you start by loading a TMX file. This file is either the “project_save.tmx” that includes all translations committed to the translation memory in your current project or one of the TMX files created whenever you save target documents in OmegaT. After loading such a TMX file, you can process it in the QA tool just as you would any other bilingual file. If you find any errors during QA, you can return to your OmegaT project, use the search feature to find the respective segment, and then make the correction. QA Distiller has a built-in TMX editor, which makes it possible to conveniently edit the loaded file from within this program, i.e. without returning to OmegaT. By using this feature, you can close the project in OmegaT, load the “project_save.tmx” in QA Distiller, process it, and then make changes directly to the “project_save.tmx” by clicking an error and using the editor to change the translation. When you reopen the project after finishing QA, you will see the updated translations.
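
To give a feel for what such tools do with a TMX file, here is a minimal Python sketch of three typical checks: empty targets, targets identical to the source, and mismatched numbers. It is an illustration only, not how QA Distiller, ApSIC Xbench, or CheckMate actually work. It assumes each `<tu>` holds the source `<tuv>` first and the target `<tuv>` second, and the sample segments are invented.

```python
import re
import xml.etree.ElementTree as ET

DIGITS = re.compile(r"\d+")  # digit runs only, so "1.5" and "1,5" compare equal


def check_tmx(tmx_text):
    """Return a list of (issue, source, target) tuples for a bilingual TMX string."""
    issues = []
    for tu in ET.fromstring(tmx_text).iter("tu"):
        tuvs = tu.findall("tuv")
        if len(tuvs) < 2:
            continue  # nothing to compare
        source = tuvs[0].findtext("seg", default="")
        target = tuvs[1].findtext("seg", default="")
        if not target.strip():
            issues.append(("empty target", source, target))
        elif target == source:
            issues.append(("target identical to source", source, target))
        elif sorted(DIGITS.findall(source)) != sorted(DIGITS.findall(target)):
            issues.append(("number mismatch", source, target))
    return issues


# Hypothetical sample data for illustration.
SAMPLE_TMX = """<tmx version="1.1">
  <header/>
  <body>
    <tu><tuv lang="EN"><seg>Tighten 4 bolts.</seg></tuv>
        <tuv lang="FR"><seg>Serrez 4 boulons.</seg></tuv></tu>
    <tu><tuv lang="EN"><seg>Use 12 screws.</seg></tuv>
        <tuv lang="FR"><seg>Utilisez 21 vis.</seg></tuv></tu>
    <tu><tuv lang="EN"><seg>Power</seg></tuv>
        <tuv lang="FR"><seg>Power</seg></tuv></tu>
    <tu><tuv lang="EN"><seg>Close the lid.</seg></tuv>
        <tuv lang="FR"><seg></seg></tuv></tu>
  </body>
</tmx>"""

for issue, source, target in check_tmx(SAMPLE_TMX):
    print(f"{issue}: {source!r} / {target!r}")
```

A dedicated QA tool covers far more (glossary compliance, inconsistent translations, punctuation), so treat this as a way to understand the principle rather than a replacement.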

Summing it up, you can easily use various external tools to perform QA on OmegaT projects. Our experience of using OmegaT for English to Russian translations on a day-to-day basis shows that the quality of QA performed within these projects is just as high as with other translation environment tools that provide built-in capabilities. If you have any questions on the subject, our translation company will be happy to assist.

Managing Translation Memories in OmegaT

August 13th, 2011, Roman Mironov

This post continues the series of tips on using OmegaT as a professional tool for English to Russian translation or other language combinations for that matter. Today’s focus is on the translation memory (TM) feature.

We store all translation memories in a centralized manner on a file server, which makes it easier to maintain and access the TMs. This is quite crucial in a translation company environment like ours where ongoing projects from the same end clients account for a significant portion of business. By default, OmegaT offers a project structure that keeps the “tm” subfolder in the project folder. If you need to access any additional TMs, you put them into this folder (as TMX). What this means is that you have to copy all previous project TMs to this subfolder every time you create a new project for a returning end client. You might find this inefficient, and you also face the risk of losing a few TMs along the way. You can avoid this by storing all TMs for this end client in a centralized location. If you plan to store the TMs on the same PC, this can be any folder on this PC. If you prefer to store them on a network share, you need to connect such share as a network drive (OmegaT doesn’t work well with Windows network paths starting with \\). When you create your next project for this end client, change the default TM path to your centralized location path. Afterwards, you can simply copy the settings file (omegat.project) from this project to each new project. This file will always include the correct path to your centralized location.

When your project includes many 100% matches, you normally want to insert them into your translation automatically. You can’t do this from within OmegaT, because this tool currently allows inserting such matches one by one only. This may decrease your efficiency and also result in committing all these 100% matches to TM under your name, so you won’t be able to distinguish them if you need to do so (e.g. your client doesn’t pay for them and you want to skip them during editing). The workaround is to create a subfolder named “auto” in the TM folder used in this project (either local “tm” subfolder or a centralized location described above) and put the relevant TM there. When you launch the project next time, all 100% matches from this TM will appear in your translation immediately.
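
If you set up projects often, the “auto” workaround is easy to script. The following Python sketch simply creates the “auto” subfolder inside a TM folder and copies a TMX file into it. The function name and the paths in the usage comment are hypothetical, and the script does nothing OmegaT-specific beyond following the folder convention described above.

```python
import shutil
from pathlib import Path


def add_auto_tm(tm_dir, tmx_file):
    """Copy `tmx_file` into the "auto" subfolder of `tm_dir`, creating it if needed.

    Returns the path of the copied file.
    """
    auto_dir = Path(tm_dir) / "auto"
    auto_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(tmx_file, auto_dir))


# Example call with made-up paths:
# add_auto_tm("projects/client_x/tm", "archive/client_x_previous.tmx")
```

The same call works for a centralized TM location: pass that folder instead of the project’s local “tm” subfolder.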

Whenever you create the translated files, OmegaT also creates three TMs that include all current translations, providing a few useful abilities. The three TMs differ in format: level 1 (TMX 1.1), level 2 (TMX 1.4), and OmegaT (OmegaT’s native TMX 1.1).

1. The TMX 1.4 file provides a certain degree of compatibility between OmegaT and other translation environment tools (TEnTs) and is, therefore, ideal when you need to provide your TM to a translation agency client who will use it in a different TEnT such as SDL Trados.

2. You may want to change segmentation or correct errors in the source text. Often, this results in two translations in the project TM, one being current (after the change) and the other being obsolete (before the change). This might be inefficient for at least two reasons: (a) when you put this TM through a spell checker or QA process, these confusing obsolete segments appear in the results; and (b) if you re-use this TM at a later time for the same end client, these obsolete translations may mislead you during translation. In such situations, the native OmegaT TM file comes in handy, because it contains current translations only. You can use it instead of the main TM (project_save.tmx). Also, if you want to remove all obsolete translations from your main TM, you can simply rename this TM file as project_save.tmx and replace the current project_save.tmx in “omegat” subfolder with this clean file.
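
The replacement step in item 2 can be sketched in Python as follows. The sketch keeps a copy of the old main TM before overwriting it, which is my own precaution rather than anything OmegaT requires. The function name, the backup suffix, and the example paths are assumptions, and you would presumably run this with the project closed in OmegaT.

```python
import shutil
from pathlib import Path


def replace_project_tm(project_dir, clean_tmx):
    """Overwrite omegat/project_save.tmx with `clean_tmx`, keeping a backup.

    Returns the backup path, or None if there was no main TM to back up.
    """
    main_tm = Path(project_dir) / "omegat" / "project_save.tmx"
    backup = None
    if main_tm.exists():
        # Hypothetical backup name; choose whatever suits your archive.
        backup = main_tm.with_name(main_tm.name + ".before-cleanup")
        shutil.copy2(main_tm, backup)
    shutil.copy2(clean_tmx, main_tm)
    return backup


# Example call with made-up paths:
# replace_project_tm("projects/client_x", "projects/client_x/client_x-omegat.tmx")
```

Keeping the backup means you can still recover an obsolete translation if it later turns out that you needed it.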

If you have any questions about these tips, I will be happy to assist. For information about installing and configuring OmegaT, please read this post.

How to Save Money and Improve Translation Quality with OCR

March 2nd, 2011, Ekaterina Ilyushina

Despite the rise of content/translation management systems, the share of uneditable source texts in translation supply chains remains significant. One reason is simple unavailability of the original editable versions. An example could be a Soviet Union patent issued in the 1980s that now needs translation from Russian into English. Or a client may want to translate documentation that was provided in PDF format several years ago by a subcontractor with whom they have since lost contact. Because handling such files can be challenging, it is important to consider preparing them for translation before you proceed.

Best Way to Prepare Uneditable Files for Translation

It is our experience that the best approach to uneditable files is to prepare an editable copy through an optical character recognition (OCR) process. This is also a method of choice with many of our peers.

Now, the main dilemma is that clients don’t always agree with the OCR charges. Instead, they would often prefer to have a linguist translate into an empty DOCX file while referring to the original. The willingness to avoid the OCR costs is perfectly reasonable, particularly with small and less significant jobs. But let’s consider some of the reasons why OCR might be a better alternative.

Indirect Benefits of OCR for Clients

  1. It is usually easier and faster to translate an editable file than an uneditable one. A translator doesn’t get distracted by low-value formatting tasks, which often consume much time and energy that should be rather spent on high-value translation activities. An editable file eliminates the need to manually re-key numbers, company names, product names, addresses, and so forth. Translating into an empty DOCX file means you need to double-check whether all content was transferred from the source file correctly, while a properly OCRed file makes such check either marginal or completely unnecessary.
  2. Editable files make it possible to use a translation environment tool (TEnT). This provides at least two benefits in terms of quality: (a) easier bilingual editing process, especially for an editor, and (b) the ability to use an automatic QA tool.
  3. Using a TEnT also provides an efficient backup functionality, with a translation memory storing each translated segment and making disaster recovery easy. We learned this the hard way many years ago after losing a day’s worth of work by accidentally deleting a document which was created by putting translation into an empty file.

These performance improvements have a positive impact on speed and quality of translation, resulting in indirect benefits for clients.

Direct Benefits of OCR for Clients

  1. OCR enables advanced analysis of a source file against a translation memory, making it possible to detect internal repetitions, which would otherwise go unseen. This may result in significant discounts. One project we completed with OCR had as many as 25,000 repetitions (with 10,000 unique words only). Paying full rate for repetitions instead of paying just a fraction of costs for OCR would have been a major waste for our client.
  2. One of the common problems associated with uneditable files from the clients’ perspective is the uncertainty about the word count and the estimated time of completion. OCR essentially eliminates this headache, because as soon as an OCRed file is ready, a client can receive an accurate quote. If you are a project manager at a translation agency, you can then use this information to produce a more specific and meaningful quote for your end client. Using OCR can also put you in a particularly great light if your competitors submit vague or too expensive quotes based on guessing, rather than an accurate analysis.
  3. It is sometimes necessary to divide an uneditable file between several translators due to time constraints. Letting each translator take care of formatting on their own normally results in an inconsistently formatted final file. Such inconsistent layout may require rework or even go unnoticed and damage your reputation when seen by your clients or prospects.
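
To show why repetitions matter for the quote, here is a small Python sketch of the arithmetic behind item 1. It counts the words in unique segments and in repeated occurrences, then applies a discounted rate to the repeated part. The 25% repetition rate and the sample segments are hypothetical, and real TEnT analyses use their own segmentation and word-count rules.

```python
from collections import Counter


def repetition_stats(segments):
    """Return (unique_words, repeated_words) for a list of text segments.

    The first occurrence of a segment counts as unique; every further
    occurrence counts as a repetition.
    """
    counts = Counter(s.strip() for s in segments if s.strip())
    unique_words = sum(len(seg.split()) for seg in counts)
    repeated_words = sum(len(seg.split()) * (n - 1) for seg, n in counts.items())
    return unique_words, repeated_words


def weighted_words(unique_words, repeated_words, repetition_rate):
    """Payable word count when repetitions are charged at `repetition_rate`."""
    return unique_words + repeated_words * repetition_rate


segments = [
    "Press the red button.",
    "Close the cover.",
    "Press the red button.",
    "Press the red button.",
]
unique, repeated = repetition_stats(segments)
print(unique, repeated)                        # 7 8
print(weighted_words(unique, repeated, 0.25))  # 9.0
```

If the figures from item 1 are read as word counts, the same hypothetical rate gives weighted_words(10000, 25000, 0.25) = 16,250 payable words instead of 35,000 at full rate. None of this can be calculated until the file is editable.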

What are some of your approaches to uneditable files when you buy or provide translation?

Extending Basic OmegaT Functionality under Windows and Linux

February 16th, 2011, Roman Mironov

This post offers instructions on basic OmegaT setup. Although OmegaT comes ready to use, its out-of-the-box functionality can be improved significantly by taking just a few simple steps. This post is intended as a one-stop explanation of these steps so that any user can start benefiting from the extended functionality quickly instead of learning the hard way through trial and error. As the instructions are very basic, I also provide links to more detailed descriptions. The post is intended for translators who want to evaluate OmegaT and for reviewers who need to connect to an existing OmegaT-based translation workflow used by a translation company.

  1. You need to start by downloading OmegaT from SourceForge. It is important to download the latest beta version available in the Files > Latest section. I don’t think it makes sense to use the older stable version, because it’s obsolete. The beta version also seems safe to use. Choose the version appropriate to your operating system, normally without Java (unless you don’t have Java installed on your system). Under Windows, you will need to install and run OmegaT as you do with any other software. Under Linux, you need to unpack the downloaded archive and run the OmegaT shell script.
  2. The next step is to enable spell checking. You can do this by going to Options > Spell Checking. Enable Automatically check the spelling of text option. Create a dictionary file folder and browse to it in this window. Then, click Install to install the dictionaries for your target languages. Unlike many other translation environment tools, OmegaT will now check spelling as you translate, making it easier to write correctly from the start instead of having to return and correct errors.
  3. Now, add the Language Tool, which is a style and grammar checker. Go to OmegaT plugin page at the SourceForge. Select the Language Tool and download the latest version. Since the Language Tool is a platform-independent plugin, you can use the same version both under Windows and Linux. Create “plugins” subfolder in OmegaT installation folder and unpack the downloaded archive into this subfolder. Restart OmegaT, go to Options and make sure Language Checker option is enabled. The Language Tool suggestions will be underlined in blue as you translate. A detailed instruction is available in the Readme file that comes with this plugin.
  4. The next step is to install the tokenizer plugin by following a similar procedure. The tokenizer provides better fuzzy and glossary matches by finding other forms of a given word, such as a plural form. You can download it from the same OmegaT plugin page at SourceForge mentioned above. After unpacking, again place all files into the “plugins” subfolder in the OmegaT installation folder. If any files already exist, just overwrite them. Running OmegaT with the tokenizer enabled requires creating and editing a launch script, but this is not difficult. You need a separate launch script per source language. For instance, if your specialty is English to Russian translation and German to Russian translation, you need an English script and a German script. To proceed with the instructions below, you will likely need to read this HowTo page, which provides all necessary details.
  • Under Windows, you need to download (or create) the BAT script file provided by one of the OmegaT developers, Mr. Marc Prior, at the HowTo page mentioned above. Create a copy of this file and name it, for example, “OmegaT_EN.bat”. Open it with any text editor and add the tokenizer string after “OmegaT.jar”. The entire script content will be as follows:
  • java -jar OmegaT.jar %* --ITokenizer=org.omegat.plugins.tokenizer.SnowballEnglishTokenizer

  • From now on, use this launch script to run the program instead of the EXE file. For German, repeat this procedure to create “OmegaT_DE.bat”, replacing the English tokenizer with the respective German tokenizer.
  • Under Linux, you just need to create a copy of the OmegaT shell script, rename it to reflect the source language, and add the tokenizer string after “…OmegaT.jar”, e.g.:
  • …OmegaT.jar" $* --ITokenizer=org.omegat.plugins.tokenizer.SnowballEnglishTokenizer
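
If you need launch scripts for several source languages under Linux, the copy-and-edit procedure can be automated. The following is only a minimal sketch, not part of OmegaT: the function name make_launcher is made up for illustration, and it assumes that the original shell script starts OmegaT on a single line containing “OmegaT.jar” and that the tokenizer option is written with two plain hyphens (--ITokenizer). Check both assumptions against your own script and the HowTo page before relying on it.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with OmegaT):
#   make_launcher SCRIPT LANG_CODE TOKENIZER_CLASS
# Copies SCRIPT to SCRIPT_LANG_CODE and appends the tokenizer option to
# every line that mentions OmegaT.jar. Assumes the java command sits on
# one line, with no backslash line continuation.
make_launcher() {
    src="$1"     # existing OmegaT shell script
    code="$2"    # source language code used in the new file name, e.g. EN
    class="$3"   # fully qualified tokenizer class name
    out="${src}_${code}"

    # On lines containing "OmegaT.jar", add the option at the end of the line.
    sed "/OmegaT\.jar/ s|\$| --ITokenizer=${class}|" "$src" > "$out" || return 1

    # Make the new launcher executable and report its path.
    chmod +x "$out" || return 1
    echo "$out"
}
```

For example, assuming the original script is called OmegaT and sits in the current folder, running make_launcher ./OmegaT EN org.omegat.plugins.tokenizer.SnowballEnglishTokenizer would write a new script named OmegaT_EN next to it and leave the original untouched. Repeat with the German tokenizer class to get a German launcher.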

    Now that you have extended the basic OmegaT functionality, you can use it more efficiently. If you have any questions about these instructions or other OmegaT-related questions, please feel free to ask them in the comments. Velior will be happy to help.

    OmegaT Revisited: Overriding a Snap Judgment

    January 19th, 2011, Roman Mironov

    I am a great believer in free and open-source software, as it empowers people with the technology they need to be more efficient. The ability to use a free alternative instead of a commercial product can be of great value to any person or company, especially a small business like ours, which has to run a very lean operation in order to maintain a competitive edge. When a major production tool in an industry is available for free, it is arguably a blessing to many people engaged in this industry. One such tool in the translation industry is OmegaT.

    Sometimes Intuition May Be Misleading

    I first got my hands on OmegaT in 2009, and I must confess I wasn’t too impressed. I fell victim to what I now know was a snap judgment: the simplistic GUI and a philosophy that didn’t align with my previous experience with other translation environment tools (TEnTs) required a degree of flexibility I couldn’t muster at that time.

    A year later, I revisited OmegaT and rediscovered it in a way that now makes me feel bad about the previous snap judgment. In this post, I want to share a few general thoughts based on my recent experience. What I mention here is just the tip of the iceberg, and I hope to blog more about this tool in the future as Velior continues using it in our translation projects.

    How You Can Benefit from OmegaT

    1. Packing all essential TEnT features, including project management, translation memories, and glossaries, into a single tool, OmegaT is a full-fledged translation environment tool that provides a viable alternative to similar commercial products.
    2. For a freelance translator who is just starting a career in this industry and doesn’t have the knowledge and/or money necessary to buy a commercial TEnT, OmegaT lends a strong helping hand. For instance, it might be a good starting point for those English to Russian translators who are building their translation business from the ground up or seeking cost-efficient ways to improve productivity and quality.
    3. For an in-house translator, OmegaT gives the freedom of choice, making it possible to continue working on a project at home or on a laptop on the go just as easily as in the office.
    4. Although SDL essentially discontinued development and support of the TTX format, it remains among the most common in the industry. This means that you normally need the commercial SDL Trados package to accept TTX-based projects, which may be a roadblock limiting your availability to translation agencies. OmegaT, however, eliminates this barrier by allowing you to handle the TTX format, and many others for that matter.
    5. Personally, I also enjoy the feeling of the community-based development process that is open to requests concerning bugs and new features. You can watch the software maturing and may even feel a sense of ownership if you are somehow involved in the process.

    What Limitations Need to Be Considered

    Like many other open-source initiatives, OmegaT has a certain number of limitations. Similarly, OpenOffice.org is arguably less sophisticated than Microsoft Office, and Ubuntu is less mature than Windows. Probably inherent to free software, such limitations are often minor in the sense that you can live with them if you make up your mind to do so. What matters most is your mindset: if your chief aim is to save wherever possible, or you support the free software philosophy in general, you are likely to be okay with the limitations, finding and using a temporary workaround until they are fixed by the developers.

    I am not exactly advocating for using OmegaT, because it is just one of the options available on the market, and it has its limitations. My point is that OmegaT is a valuable alternative to commercial products that can be considered by many translators, and I am happy with the freedom of choice it adds to our industry. Hats off to this project’s team for their enthusiasm!

    Which free tools do you consider to be of great value in your work? Is OmegaT among them?
