How to Save Money and Improve Translation Quality with OCR

Despite the rise of content and translation management systems, the share of uneditable source texts in translation supply chains remains significant. One reason is that the original editable versions are simply unavailable. An example could be a Soviet patent issued back in the 1980s that now needs translation from Russian into English. Or a client may want to translate documentation that a subcontractor provided in PDF format several years ago, and the client has since lost contact with that subcontractor. Because handling such files can be challenging, it is important to consider preparing them for translation before you proceed.

Best Way to Prepare Uneditable Files for Translation

In our experience, the best approach to uneditable files is to prepare an editable copy through an optical character recognition (OCR) process. This is also the method of choice for many of our peers.

Now, the main dilemma is that clients don’t always agree with the OCR charges. Instead, they would often prefer to have a linguist translate into an empty DOCX file while referring to the original. The desire to avoid OCR costs is perfectly reasonable, particularly with small and less significant jobs. But let’s consider some of the reasons why OCR might be a better alternative.

Indirect Benefits of OCR for Clients

  1. It is usually easier and faster to translate an editable file than an uneditable one. A translator doesn’t get distracted by low-value formatting tasks, which often consume time and energy that would be better spent on high-value translation activities. An editable file eliminates the need to manually re-key numbers, company names, product names, addresses, and so forth. Translating into an empty DOCX file means you need to double-check whether all content was transferred from the source file correctly, while a properly OCRed file makes such a check either marginal or completely unnecessary.
  2. Editable files make it possible to use a translation environment tool (TEnT). This provides at least two benefits in terms of quality: (a) an easier bilingual editing process, especially for an editor, and (b) the ability to use an automatic QA tool.
  3. Using a TEnT also provides efficient backup functionality, with a translation memory storing each translated segment and making disaster recovery easy. We learned this the hard way many years ago after losing a day’s worth of work by accidentally deleting a document that had been created by typing the translation into an empty file.

These performance improvements have a positive impact on the speed and quality of translation, resulting in indirect benefits for clients.

Direct Benefits of OCR for Clients

  1. OCR enables advanced analysis of a source file against a translation memory, making it possible to detect internal repetitions that would otherwise go unnoticed. This may result in significant discounts. One project we completed with OCR had as many as 25,000 repetitions (with only 10,000 unique words). Paying the full rate for repetitions instead of paying just a fraction of that cost for OCR would have been a major waste for our client.
  2. One of the common problems associated with uneditable files from the clients’ perspective is the uncertainty about the word count and the estimated time of completion. OCR essentially eliminates this headache, because as soon as an OCRed file is ready, a client can receive an accurate quote. If you are a project manager at a translation agency, you can then use this information to produce a more specific and meaningful quote for your end client. Using OCR can also put you in a particularly good light if your competitors submit vague or overly expensive quotes based on guesswork rather than an accurate analysis.
  3. It is sometimes necessary to divide an uneditable file between several translators due to time constraints. Letting each translator take care of formatting on their own normally results in an inconsistently formatted final file. Such an inconsistent layout may require rework, or it may go unnoticed and damage your reputation when seen by your clients or prospects. Preparing a single OCRed file before splitting the work gives every translator the same formatting to start from.
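
To make the repetition analysis from point 1 more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the algorithm any particular TEnT uses: it assumes the OCRed text has already been split into segments, it treats only exact matches (ignoring case and extra spaces) as repetitions, and the function names, the per-word rate, and the repetition factor are hypothetical.

```python
from collections import Counter


def analyze_repetitions(segments):
    """Return (new_words, repeated_words) for a list of text segments.

    A segment counts as a repetition if an identical segment (after
    lowercasing and collapsing whitespace) appeared earlier in the list.
    """
    seen = Counter()
    new_words = 0
    repeated_words = 0
    for segment in segments:
        key = " ".join(segment.lower().split())  # normalize case and spacing
        if not key:
            continue  # skip empty segments
        words = len(key.split())
        if seen[key]:
            repeated_words += words
        else:
            new_words += words
        seen[key] += 1
    return new_words, repeated_words


def estimate_cost(new_words, repeated_words, rate_per_word, repetition_factor):
    """Weighted cost: repetitions are charged at a fraction of the full rate.

    rate_per_word and repetition_factor are hypothetical inputs, not real prices.
    """
    return (new_words + repeated_words * repetition_factor) * rate_per_word


# Hypothetical sample segments from an OCRed manual
segments = [
    "Tighten the bolt.",
    "Check the oil level.",
    "Tighten the bolt.",
    "tighten  the bolt.",
]
new_words, repeated_words = analyze_repetitions(segments)
print(new_words, repeated_words)  # 7 6
print(round(estimate_cost(new_words, repeated_words, 0.10, 0.25), 2))  # 0.85
```

With an uneditable source, none of this counting is possible, so every word would be quoted at the full rate. Real tools also detect partial (fuzzy) matches, which this sketch leaves out.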

If you want to take advantage of OCR for your English to Russian translation, request a free quote from us.

One comment

  • When calculating the price, we usually include the cost of OCR directly in the per-word translation rate.

About the Author

Roman Mironov
CEO & Founder

As the founder of Velior, Roman has had the privilege of being able to turn his passion for languages into a business. He has over 15 years of experience in the translation industry. Roman has helped dozens of clients increase sales by making their products appealing to speakers of other languages.