Best Practices for Re-creating Formatting Before Translation

Even though we are well into the information age, clients often provide documents for translation in an uneditable format, such as PDF or JPG. It is a very good idea for translators to re-create the original documents as Microsoft Word files first and only then translate them, rather than typing the translation straight into new DOCX files. Re-creating text and formatting in new files effectively and consistently requires following best practices. This post lists some of them, in no particular order.

Illegible text

Whenever you cannot read something in original files, whether due to low image quality or incomprehensible handwriting, do not omit it, but use the following mark:

Original: [image of illegible text]
Re-created file: < illegible >

Omitting it can confuse readers, because they will think something is missing.


By the same token, use the following symbols for signatures:

Original: [image of a signature]
Re-created file: < signature >

You could try to guess some of the names in the signatures, but this is often futile. Even if you can make out 9 signatures out of 10, you will still have to mark all 10 of them with the placeholder, for consistency.


Seals, or stamps, need to be spelled out. Example:

Original Re-created file



Parts of seals can be illegible, but you already know what to do about it.

Keeping pages one to one

Whereas with most documents it is okay for the translated text to end up on a different page than the corresponding original text, some documents require mirroring the original pages one to one. For example, if you have a two-page document (a graduate certificate on page 1 and its transcript on page 2), you cannot let your translation of the certificate extend onto page 2, because the certificate will then occupy two pages instead of one, and the transcript will start on page 3 instead of page 2. This is especially important when translating personal documents, such as the one in this example, and documents used as evidence in legal proceedings. Text arranged differently in the translation may confuse readers and prevent them from comparing it to the original effectively. From a technical standpoint, this means you need to use page breaks in re-created files and check translated files to make sure the text is arranged properly.
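Part of that check can be automated. Below is a minimal sketch, assuming the page breaks were inserted as manual breaks (Ctrl+Enter in Word), which a DOCX file stores as `<w:br w:type="page"/>` elements in `word/document.xml`. The function names are my own, and pages started by section breaks or by the "Page break before" paragraph setting are not counted.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_explicit_page_breaks(docx_file):
    """Count manual page breaks (<w:br w:type="page"/>) in a DOCX file.

    docx_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    # Plain <w:br/> elements are line breaks; only type="page" starts a page.
    breaks = root.iter(f"{{{W_NS}}}br")
    return sum(1 for br in breaks if br.get(f"{{{W_NS}}}type") == "page")


def same_page_break_count(source_docx, translated_docx):
    """Return True when both files contain the same number of manual page breaks."""
    return (count_explicit_page_breaks(source_docx)
            == count_explicit_page_breaks(translated_docx))
```

A matching count only shows that no page break was lost or added during translation. Translated text that is longer than the original can still spill onto the next page, so a visual comparison of the two files remains necessary.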


Numbered lists with letters

If your numbered lists use letters, it is a good idea to use the letters of the target language from the outset. Here is an example from a Russian to English translation:

Original Russian PDF document: а), б), в)
Re-created Russian DOCX document: a), b), c)


Because you will have to change the Russian letters to English ones one way or another, it is best to change them as early as possible: in the source files rather than in the translated files.
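If you script any part of the re-creation, one way to do the replacement is to relabel items by their position in the list rather than letter by letter, because a transliteration would turn в) into v) instead of c). The following is a small sketch of that idea; the function names and the (marker, text) representation of list items are hypothetical, chosen only for illustration.

```python
import string


def latin_list_markers(count, suffix=")"):
    """Return the first `count` lowercase Latin list markers: a), b), c), ..."""
    if count > len(string.ascii_lowercase):
        raise ValueError("more list items than single Latin letters")
    return [letter + suffix for letter in string.ascii_lowercase[:count]]


def relabel(items):
    """Replace each item's marker with a Latin marker based on its position.

    `items` is a list of (marker, text) pairs; the old marker is ignored,
    so it does not matter which letters the source alphabet skips.
    """
    markers = latin_list_markers(len(items))
    return [(new, text) for new, (_old, text) in zip(markers, items)]
```

For example, `relabel([("а)", "первый"), ("б)", "второй"), ("в)", "третий")])` keeps the Russian item text and changes only the markers to a), b), c).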

Headers and footers

Remember to use headers and footers for any text that repeats over a range of pages. If you need different headers or footers on different pages, add a section break and then break the link between the previous section and the new one. Use this article as guidance.
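To double-check that the link was actually broken, you can look at how the DOCX file stores its sections. This is a rough sketch, assuming a file without tracked changes: each section is a `w:sectPr` element in `word/document.xml`, and as far as I understand the format, a section that has no `w:headerReference` of its own inherits the header from the previous section. The function name is my own.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def sections_with_own_header(docx_file):
    """Return one boolean per section, in document order.

    True means the section defines its own header. False means it has none
    of its own, so it inherits the previous section's header (or, for the
    first section, has no header at all).
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    return [
        sect.find(f"{{{W_NS}}}headerReference") is not None
        for sect in root.iter(f"{{{W_NS}}}sectPr")
    ]
```

A result such as `[True, False, True]` would mean that the second section still uses the header of the first one, while the third section has a header of its own. The same approach works for footers with `w:footerReference`.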

What other best practices have you found to be useful for re-creation of original documents?

One comment

  • Artyom Vecherov says:

    ABBYY FineReader is pretty good software; however, the output is still far from perfect. The best practice for me is NOT to use the “Editable copy” option at all, but to save the document as “Plain text” and then re-create the necessary styles in Word. This is reasonable because ABBYY FineReader generates a lot of junk styles, increasing the number of useless tags in the resulting .xliff, while a typical document usually has just 4-5 styles, which can easily be created and applied manually. The final document contains no “tag soup” and has proper headings, a table of contents, etc. I also think that in most cases applying styles in Word is much faster than copying and pasting tags in an .xliff.

