Susan Laflin's Projects.

Project Number 7. Computer-based Handwriting

Before my retirement, I was engaged on research into methods of "automatic" recognition of handwritten documents. Since my retirement, I have been distracted from this by various historical researches, but some of my papers remain on my personal web page and there is scope for several M.Sc projects in this area.

My method of handwriting recognition requires the generation of word-images in a particular alphabet and handwriting; the splitting of a document into a string of individual word-images; and the comparison of the generated images with those extracted from the document in order to identify words or phrases from the document.

One obvious project would be to write the software to generate the word-images. This may use any one of the hands used in England over the centuries, or may use some other script such as Greek, Russian, Linear B, Egyptian Hieroglyphs or any other ancient or modern writing system. To build up the word-images, each key on the keyboard must correspond to the image of a character, so that as the word (or string of characters) is typed in, its image is built up on the screen. For a cursive script, it is also necessary to include a method of joining up the letters to form a word.
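As an illustration only, the idea of building a word-image one character at a time might be sketched in Python as follows. The tiny glyph bitmaps, the names GLYPHS and render_word, and the crude "joining stroke" drawn along one row are all invented for this example; a real project would use scanned or designed letter-forms from the chosen hand and a much better joining method.

```python
# Hypothetical 5-row glyph bitmaps: '#' is ink, '.' is background.
# In a real project these would come from the chosen hand or script.
GLYPHS = {
    "c": [".##", "#..", "#..", "#..", ".##"],
    "a": [".#.", "#.#", "###", "#.#", "#.#"],
    "t": ["###", ".#.", ".#.", ".#.", ".#."],
}


def render_word(word, glyphs=GLYPHS, gap=1, join_row=None):
    """Build a word-image by placing one glyph per typed character.

    gap      -- number of blank columns between adjacent letters
    join_row -- if given, that row of each gap is inked, as a very
                crude stand-in for a cursive joining stroke
    Raises KeyError if a character has no glyph.
    """
    height = len(next(iter(glyphs.values())))
    rows = [""] * height
    for index, char in enumerate(word):
        glyph = glyphs[char]
        for r in range(height):
            if index > 0:
                filler = "#" if r == join_row else "."
                rows[r] += filler * gap
            rows[r] += glyph[r]
    return rows


if __name__ == "__main__":
    for line in render_word("cat", join_row=4):
        print(line)
```

Each call to render_word returns the image as a list of text rows, so the same routine can serve both for display and for the later comparison with word-images taken from a document.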

Another project would provide the pre-processing necessary to split a large image of a whole page of the document into a string of word-images. The document would need to be split into lines of text and then the lines split into individual words. If time allows, this could also include separating the descenders from one line and the ascenders from the line below and replacing any gaps in the lines. There may also be a need to straighten crooked lines within the text. This is a project which can be extended as time permits.
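One simple way to approach the splitting, offered here only as a sketch, is the projection-profile technique: count the ink in each row to find the lines of text, then count the ink in each column of a line to find the words. The Python below assumes the page has already been reduced to a binary image (a list of rows of 0s and 1s) with straight, well-separated lines; the names ink_runs and split_page and the word_gap parameter are invented for the example. It does not attempt the harder tasks of untangling ascenders and descenders or straightening crooked lines.

```python
def ink_runs(profile, min_gap=1):
    """Return (start, end) pairs, end exclusive, for runs of non-zero values.

    Blank stretches shorter than min_gap do not end a run, so the
    small spaces between letters can be ignored when looking for words.
    """
    runs = []
    start = None
    gap = 0
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                runs.append((start, i - gap + 1))
                start = None
                gap = 0
    if start is not None:
        runs.append((start, len(profile) - gap))
    return runs


def split_page(image, word_gap=2):
    """Split a binary page image into word boxes (top, bottom, left, right)."""
    words = []
    row_profile = [sum(row) for row in image]
    for top, bottom in ink_runs(row_profile):
        band = image[top:bottom]
        # zip(*band) transposes the band so each item is one column.
        col_profile = [sum(col) for col in zip(*band)]
        for left, right in ink_runs(col_profile, min_gap=word_gap):
            words.append((top, bottom, left, right))
    return words


if __name__ == "__main__":
    page = [
        [1, 1, 0, 1, 0, 0, 1, 1],
        [1, 0, 0, 1, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0, 0, 0],
    ]
    print(split_page(page))
```

The boxes are returned in reading order, line by line, which gives the string of word-images required by the comparison stage. The value of word_gap would need tuning for each hand, since the spacing between words varies from writer to writer.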

A more specialised project would be to analyse an image of a page of a nineteenth-century census return where vertical and horizontal lines separate the various fields. These lines may be used to identify the text-images of the different fields, so that different techniques can be used to interpret each one. For example, the field containing the age is normally numeric but may contain some letters (e.g. an age given as "6 months").
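By way of illustration, the two ideas in this project might be sketched in Python as below. The first part treats any row or column that is almost entirely ink as a ruled line and returns the boxes lying between adjacent ruled lines; the second checks a transcribed age field. Everything here is an assumption made for the example: the binary image format, the names find_cells and parse_age, the 90% threshold, and the list of unit abbreviations. Real census pages have faint, broken and skewed rulings which this sketch would not cope with.

```python
import re


def ruling_positions(profile, length, min_fraction=0.9):
    """Indices whose ink count covers at least min_fraction of the length."""
    return [i for i, v in enumerate(profile) if v >= min_fraction * length]


def find_cells(image, min_fraction=0.9):
    """Return (top, bottom, left, right) boxes between ruled lines.

    image is a list of rows of 0s and 1s; the bottom and right limits
    are exclusive, and the ruled lines themselves are left out.
    """
    height, width = len(image), len(image[0])
    h_lines = ruling_positions([sum(r) for r in image], width, min_fraction)
    v_lines = ruling_positions([sum(c) for c in zip(*image)], height, min_fraction)
    cells = []
    for top, bottom in zip(h_lines, h_lines[1:]):
        for left, right in zip(v_lines, v_lines[1:]):
            # Adjacent indices belong to one thick line, not a cell.
            if bottom - top > 1 and right - left > 1:
                cells.append((top + 1, bottom, left + 1, right))
    return cells


# Assumed unit spellings; a real return may use others.
AGE_PATTERN = re.compile(
    r"^\s*(\d+)\s*(months?|mos?|weeks?|wks?|days?)?\.?\s*$", re.IGNORECASE
)


def parse_age(text):
    """Return (number, unit) for an age field, or None if it does not fit."""
    match = AGE_PATTERN.match(text)
    if match is None:
        return None
    unit = (match.group(2) or "years").lower()
    unit = {"m": "months", "w": "weeks", "d": "days"}.get(unit[0], "years")
    return int(match.group(1)), unit


if __name__ == "__main__":
    form = [
        [1, 1, 1, 1, 1, 1, 1],
        [1, 0, 1, 1, 0, 0, 1],
        [1, 0, 0, 1, 0, 1, 1],
        [1, 1, 1, 1, 1, 1, 1],
    ]
    print(find_cells(form))
    print(parse_age("6 months"), parse_age("42"))
```

Knowing which field a text-image came from narrows the possibilities considerably: an age cell need only be matched against digits and a handful of unit words, rather than against the whole vocabulary.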

These are some of the topics available in this area for a good student who wants a more challenging project. There is also scope for some straightforward projects which would be less risky given the limited time for M.Sc projects.