Why Are PDFs So Difficult to Translate?

Documents stored in PDF format are more difficult to translate than documents in most other formats, for human translators and machine translation tools alike.

It’s comparatively easy just to translate the words and sentences in a PDF document, but problems arise when a professional translator uses any type of machine translation to aid the German to English translation process.

Most professional translators prefer to do German translations from documents in formats like PowerPoint or Word because they are easier to process. If PDF is the only format available, German translations can be painfully slow, and the client has to pay more to get a good translation.

A German translator generally relies on computer-assisted translation (CAT) tools and translation memories, both of which help to improve work efficiency. Unfortunately, many of these tools are of little use if the computer is unable to read the text because it is in a PDF file.

Machine translation tools often cannot read PDF files directly. Text that looks quite plain to the human eye is not the same when viewed by a machine. Scanned documents are difficult for a machine to read too, unless optical character recognition (OCR) software is used, and even that does not necessarily do a good job.
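To illustrate why text that looks plain to a reader is not plain to a machine: a PDF page typically stores text as short runs placed at page coordinates, in no guaranteed reading order. The sketch below is only an illustration under that assumption. The fragment tuples, the function name rebuild_lines and the tolerance value are all hypothetical, and a real extraction tool has to deal with far more (fonts, columns, rotated text).

```python
from typing import List, Tuple

# Hypothetical fragment: (x, y, text). As in PDF page space, y grows upward.
Fragment = Tuple[float, float, str]


def rebuild_lines(fragments: List[Fragment], y_tolerance: float = 2.0) -> List[str]:
    """Group positioned text fragments into lines, top to bottom, left to right.

    Fragments whose y position is within y_tolerance of the first fragment
    of the current line are treated as belonging to that line.
    """
    lines: List[List[Fragment]] = []
    # Visit fragments from the top of the page downward.
    for frag in sorted(fragments, key=lambda f: -f[1]):
        if lines and abs(lines[-1][0][1] - frag[1]) <= y_tolerance:
            lines[-1].append(frag)
        else:
            lines.append([frag])
    # Within each line, order fragments by their x position.
    return [
        " ".join(text for _, _, text in sorted(line, key=lambda f: f[0]))
        for line in lines
    ]


# Made-up fragments, stored out of reading order.
sample = [
    (120.0, 700.0, "Damen"),
    (72.0, 700.5, "Sehr"),
    (72.0, 680.0, "vielen"),
    (95.0, 700.0, "geehrte"),
    (110.0, 680.0, "Dank"),
]
print(rebuild_lines(sample))
```

Even this toy version needs a guess (the tolerance) to decide which fragments share a line, which is one reason automatically extracted text often arrives with broken sentences that a CAT tool cannot segment properly.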

There are a few documents that can easily be read automatically, but the tools that do this are certainly not perfect. The original alignment, which might be a requirement for the translation, is often difficult to maintain.

Another problem with PDF files is that they are often protected by a password, which means they have to be unlocked before any machine or other tool can read the content.
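A quick way to spot a protected file before starting work is to look for an encryption entry in the raw bytes. The sketch below is a crude heuristic, not a proper PDF parser: it assumes that an encrypted PDF declares an /Encrypt entry in its trailer, and the function name looks_encrypted is made up for this example. It can give a false positive if that byte sequence happens to appear elsewhere in the file.

```python
def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic check for an encrypted PDF.

    Returns True when the data starts like a PDF (header within the first
    1024 bytes) and contains an /Encrypt key somewhere in the file.
    This is a rough screen only and does not parse the file structure.
    """
    has_pdf_header = b"%PDF-" in pdf_bytes[:1024]
    return has_pdf_header and b"/Encrypt" in pdf_bytes


# Tiny hand-written stand-ins for real files (not valid, complete PDFs).
plain = b"%PDF-1.4\ntrailer\n<< /Size 1 /Root 1 0 R >>\n%%EOF"
locked = b"%PDF-1.4\ntrailer\n<< /Size 1 /Root 1 0 R /Encrypt 5 0 R >>\n%%EOF"

print(looks_encrypted(plain))
print(looks_encrypted(locked))
```

With a real file, the bytes would come from something like open(path, "rb").read(). A positive result only signals that the client should be asked for an unlocked copy or the password before any tool is run on the file.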

Sometimes there is only one option for translating a document in PDF format, and that is to do the whole project manually. This includes reading the file and rewriting the text, as well as translating, editing and proofreading. As a result, the translation costs grow sharply for the client.

This description makes PDF sound like the worst file format in existence. PDFs are, however, very useful and look particularly attractive to readers.