Survey reveals poor performance of PDF-to-EPUB conversion tools

The study, carried out by the Flemish Innovation Center for Graphic Communication (VIGC), found the average effectiveness score for tools used to convert library books to ebooks was just 30%, with the worst scoring just 10%.

The VIGC started with a perfect PDF/X-4 file and converted it to an EPUB file using 13 different tools. Each version was validated against the EPUB specifications using four different methods, and then checked visually in five types of viewer, such as the Amazon Kindle or Apple iBooks.
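To give a sense of what validating an EPUB file involves, here is a minimal Python sketch of one small class of checks on the ZIP container: the EPUB container format requires an uncompressed `mimetype` entry first in the archive and a `META-INF/container.xml` file. The function name `check_epub_container` is hypothetical, this is not one of the four methods the VIGC used, and a full validator checks far more than this.

```python
import zipfile

# Required content of the 'mimetype' entry in an EPUB container.
EPUB_MIMETYPE = b"application/epub+zip"

def check_epub_container(path):
    """Return a list of container-level problems (empty if none were found).

    `path` may be a filename or a seekable binary file object.
    Hypothetical helper for illustration only.
    """
    if not zipfile.is_zipfile(path):
        return ["not a ZIP archive"]
    problems = []
    with zipfile.ZipFile(path) as archive:
        entries = archive.infolist()
        if not entries or entries[0].filename != "mimetype":
            problems.append("'mimetype' is not the first entry")
        else:
            if entries[0].compress_type != zipfile.ZIP_STORED:
                problems.append("'mimetype' is compressed")
            if archive.read("mimetype") != EPUB_MIMETYPE:
                problems.append("'mimetype' has unexpected content")
        if "META-INF/container.xml" not in archive.namelist():
            problems.append("META-INF/container.xml is missing")
    return problems
```

Because each validator decides for itself which rules to enforce and how strictly, two tools can plausibly disagree about the same file.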

In total, 65 elements were tested, from simple italics to OpenType features, using a print-ready test file containing text and images.

The tests showed that converting ligatures – two or three letters joined together for aesthetic reasons – was particularly problematic: they were often not recognised and were therefore converted wrongly.
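As an illustration of the ligature problem, the Python sketch below shows how text containing precomposed Latin ligature characters (Unicode U+FB00 to U+FB06, such as "ﬁ") could be detected and expanded back into ordinary letters. It assumes the ligatures reach the converter as those Unicode characters, which may not hold for every PDF; the function names are hypothetical and do not describe how any of the tested tools work.

```python
import unicodedata

# Precomposed Latin ligatures in the Unicode block
# "Alphabetic Presentation Forms" (U+FB00 through U+FB06).
LIGATURE_RANGE = range(0xFB00, 0xFB07)

def find_ligatures(text):
    """Return (index, character) pairs for each precomposed ligature in text."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) in LIGATURE_RANGE]

def expand_ligatures(text):
    """Replace each precomposed ligature with its plain-letter equivalent.

    Only the ligature characters are normalised (NFKC), so other
    characters in the text are left exactly as they were.
    """
    return "".join(
        unicodedata.normalize("NFKC", ch) if ord(ch) in LIGATURE_RANGE else ch
        for ch in text
    )

sample = "\ufb01nal o\ufb03ce"      # "final office" written with ligatures
print(find_ligatures(sample))       # positions of the ligature characters
print(expand_ligatures(sample))     # text with ligatures expanded
```

A tool that skips a step like this may drop the ligature or substitute the wrong character, which is the kind of wrong conversion described above.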

VIGC general manager Eddy Hagen said each validation system returned different results. "Some tools could validate one EPUB file, while another tool couldn’t," he explained. "And the differences were inconsistent too – it wasn’t a case of one tool always being different from the other three. Based on our results, publishers face a big challenge in ensuring EPUB files – and subsequently the ebooks themselves – have been converted accurately."

"There’s no doubt that the popularity of ebooks is soaring. This trend leaves publishers facing a big challenge – they have to convert their whole back catalogue to make them available as ebooks."

Hagen said an obvious approach was to take the print-ready PDF files and convert them to EPUB files, the standard file format for ebooks. "For printers, this presents a potential new service they can offer their customers. On the internet you can find a lot of tools for converting PDFs to EPUB files – unfortunately, however, it’s not that straightforward."