Shape-Free Statistical Information in Optical Character Recognition


Monday, April 2nd -- Scott Leishman


Abstract:
 

Optical Character Recognition (OCR) systems attempt to convert an image of a textual document into a symbolic character sequence using computer vision and machine learning algorithms. Traditionally, OCR research has focused on *bottom-up* processing, which treats OCR as a classification problem: the input is the pixels in the image of a single character, and the label is the character's identity. This approach performs poorly on documents set in an unknown font. Ideally, the document should serve as its own font model, since almost all characters appear in it at least once. In this talk, we will explore how far we can go towards this "model-free" goal by pursuing a top-down "codebreaking" approach, in which we assume nothing a priori about the shapes of the characters and use only statistical information about the sequence in which the glyphs occur.
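
To make the "codebreaking" idea concrete, here is a minimal sketch, not the method presented in the talk (the abstract does not describe it). It assumes the glyphs have already been grouped into anonymous cluster IDs, and it uses the simplest possible statistic, unigram frequency rank, as in a classic substitution-cipher attack. The function name `decode_by_frequency`, the cluster IDs, and the reference ordering are all hypothetical.

```python
from collections import Counter


def decode_by_frequency(glyph_ids, reference_order):
    """Guess a character for each glyph cluster by matching frequency ranks.

    glyph_ids: sequence of anonymous cluster IDs, one per glyph on the page
               (assumed to come from an earlier clustering step).
    reference_order: characters of the target language, most frequent first
               (assumed known from a text corpus, not from the document).
    """
    counts = Counter(glyph_ids)
    # Rank clusters by descending count; ties are broken by ID so the
    # result is deterministic.
    ranked = sorted(counts, key=lambda g: (-counts[g], g))
    # The i-th most frequent cluster gets the i-th most frequent character;
    # clusters beyond the end of the reference list are marked unknown.
    mapping = {
        g: reference_order[i] if i < len(reference_order) else "?"
        for i, g in enumerate(ranked)
    }
    decoded = "".join(mapping[g] for g in glyph_ids)
    return mapping, decoded


# Toy input: cluster 3 occurs three times, 7 twice, 9 once.
mapping, decoded = decode_by_frequency([7, 3, 3, 7, 3, 9], "eta")
print(mapping)
print(decoded)
```

Note that no pixel data enters the decoding step: only the order and counts of the cluster IDs are used. Frequency rank alone is a weak signal on short texts, so a practical decoder would presumably need richer sequence statistics than this sketch uses.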