Optical character recognition is a mature technology, and the machines it now runs on are immensely powerful compared with those used in the early days of OCR, so it's not unreasonable to expect any current program to recognise pages of scanned text perfectly. In this respect ABBYY FineReader Pro does not disappoint, but as nearly every scanner comes with a 'lite' OCR package for free, a program with a £100 price tag has to offer more than just accuracy, and it's here that FineReader Pro really excels.
As you'd expect, FineReader Pro can extract words from pages of mixed text and graphics; it can recognise columns, tables and boxed text; and it can reconstruct the layout of an original document in a number of formats including Word, Word XML, PDF and HTML.
More surprisingly, it has an easy familiarity with a range of input formats beyond the usual TIFF files: it can open PDF files, extract the text from documents photographed with digital cameras and recognise document images saved in several other popular image formats. It can even extract meaningful text from screen shots, which can be captured with a separate utility, of which more later.
Despite the program's flexibility, its user interface is remarkably uncluttered and the same toolbar copes with any type of recognition task. The controls will be familiar to anybody who has ever used a mainstream OCR package, consisting of four buttons to acquire a source image, recognise it, check it, and finally save it. For newcomers to the OCR game there's also a wizard to guide them through all four stages until they become familiar with the steps involved.
For big projects, a scheduled batch processing facility allows you to scan multiple documents and have FineReader Pro recognise them overnight or when you're away from your PC. There's also an optional fast recognition mode that more than doubles the recognition speed for documents with simple layouts.
FineReader does a good job of analysing complex pages to separate the text and graphics, but its tendency to turn graphical elements, such as coupons and logos, into text whenever it can is not always welcome. There is a manual override allowing you to determine for yourself how a page is divided into text, graphics, tables and barcodes, but it's probably not worth the effort unless FineReader makes a mess of things when left to its own devices.
We tested the program with a range of documents including photocopied faxes, screwed-up originals rescued from a waste bin, coloured text, shaded backgrounds and tissue-thin pages where the reverse side shows through. It handled all of them with alacrity, automatically rotating pages that had been scanned upside down or in the wrong orientation, and coping well with skewed scans.
The built-in spelling checker is comprehensive, and you can also ask FineReader to check your Microsoft Word custom dictionary if you use a lot of specialist words. The program supports 179 different languages and comes with spelling checkers for 30 of them. It had no problems coping with multilingual documents in French, Swedish and English but had to be told up front which languages to expect, not being able to work them out for itself.
FineReader's party tricks are impressive too. The PDF recognition facility is useful if you want to edit a PDF file that is only available in a locked version, and you can also extract individual pages, or groups of pages, and save them as a new PDF file.
The ability to recognise text that has been photographed with a digital camera is also impressive. Although the use of a 5-megapixel camera with good natural lighting is recommended, we achieved excellent results using a 3-megapixel pocket camera under poor ambient lighting conditions with flash. The results when using a low-resolution camera built into a PDA were hit and miss - and you can't expect them to be much better from any device with a fixed-focus lens - but the ability to capture documents inside libraries and public places, where scanning is not an option, is welcome.
The screen capture utility is equally nifty but less obviously practical. It captures text from any rectangular area of the screen and opens it in FineReader, Word or Excel. It's a neat trick, and it makes it possible to capture graphical text, such as that found on Web pages, for editing, but we can't say we've ever had the need. Maybe you know better.
Verdict
FineReader isn't cheap but it's worth every penny if you regularly need to convert paper or PDF documents into alternative editable formats. Great results can be achieved using the default settings, so FineReader starts saving you time and money from day one.