Article by Kevin Savetz

Copyright © by Kevin Savetz

Selling books and magazines via online auction presents a unique problem: you want to give potential buyers plenty of information about the item for sale, but you don't want to spend all day describing its contents. Sure, you can scan the cover and describe the condition of the binding, but will that really tell the buyer what's in the book?

One solution is to include information from the book itself -- for example, the table of contents, a description from the dust jacket, and the author biography. Unless you're a patient and speedy typist, entering that information is too much work. Without it, though, the buyer might not understand just how interesting the book is. What to do?

The answer is OCR software. OCR stands for optical character recognition. The software works with your scanner to turn printed matter into text that you can edit on the computer screen and paste into your auction listing. By selecting a page or two from your book or magazine, you can let it describe itself without working your fingers to the bone. Ta da -- here's the table of contents. Poof, there's the preface.

Instead of simply making a graphic of the scanned page, as a graphics program would do, OCR software takes a closer look and tries to figure out what each letter is. In just a few seconds, the program will serve up editable text that resembles the page on the scanner. OCR software can be tricked by similar-looking symbols (like the letter O and zero, or the letters P and R) and oddly formatted text. Some software is smart enough to look for clues to these problems, like using a dictionary to decide if PPOOF or PROOF is a better guess. You may have to touch up the text, but this will certainly take less time than typing the whole page yourself.
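
To make the dictionary trick concrete, here is a minimal sketch in Python. It is not how any particular OCR product works; the confusion table, the three-word dictionary, and the function names are all made up for illustration. The idea is simply to try every spelling you get by swapping look-alike characters and keep the one that turns out to be a real word.

```python
from itertools import product

# Hypothetical confusion table: each character maps to the characters
# an OCR engine might mistake it for (the character itself included).
CONFUSABLE = {
    "O": "O0",
    "0": "0O",
    "P": "PR",
    "R": "RP",
}

# Tiny stand-in word list; a real program would load a full dictionary.
DICTIONARY = {"PROOF", "POOR", "ROOF"}


def candidates(word):
    """Yield every spelling reachable by swapping look-alike characters."""
    options = [CONFUSABLE.get(ch, ch) for ch in word]
    for combo in product(*options):
        yield "".join(combo)


def best_guess(word):
    """Return a dictionary word that matches the OCR output, else the word unchanged."""
    if word in DICTIONARY:
        return word
    matches = [c for c in candidates(word) if c in DICTIONARY]
    if not matches:
        return word  # no clue found; leave it for a human to touch up
    # Prefer the candidate that changes the fewest characters.
    return min(matches, key=lambda c: sum(a != b for a, b in zip(c, word)))


print(best_guess("PPOOF"))
print(best_guess("PR00F"))
```

With this toy word list, both misread spellings resolve to PROOF, while a word with no dictionary match is passed through untouched, which is where your own proofreading comes in.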

Of course, you could just upload a graphic of the printed page in GIF or JPG format, but an image like that could weigh in at 100 K or more. OCR'd text for the same page takes up only a few kilobytes and will download to your impatient bidder in a fraction of the time.

OCR will work with just about any printed matter that you can slap down on a flatbed scanner: books and magazines, record album liner notes, cereal boxes, or a prenuptial agreement... Just about anything in a reasonably neat typeface. But OCR software isn't magic: the software can't read handwriting or smeared words. Black-on-white text with simple formatting is best; a random page out of Wired magazine could send it into shock.

You'll need a scanner and OCR software. If you already own a scanner, odds are an OCR program came bundled with it. (If one did, it's probably fairly simple, but should be adequate for grabbing a page here and there for your auction listings.) If you don't have OCR software, don't fret. Many commercial OCR programs are available -- one good choice is OmniPage Pro, which is available for Windows and MacOS.

