Optical Character Recognition

What is optical character recognition?
Part 1

Let's start with a simple example. Suppose you have a sheet of paper with handwritten notes that you want to put on your PC. You have two options:

- Save the document in a graphic format (for example, miofile.jpg) using a scanner

- Type the document into the PC using a word processing program (e.g. Word)

The first solution is certainly faster, but it does not let you make changes or search the text, because the notes are stored in a graphic format and the machine treats them as a picture.

The second solution is certainly more elegant, but it has a drawback: it forces you to read and retype the text yourself. OCR (Optical Character Recognition) is the field that studies how the machine can replace us in this tedious copying work.

In practice, OCR systems are programs dedicated to converting an image containing text into text that can be edited with a normal word processing program. The images are usually acquired with an image scanner or a digital camera. The text is converted to Unicode or ASCII codes (coding standards for storing characters) or, in more advanced systems, into a format that can also preserve the layout of the document. To do this, OCR programs use artificial intelligence techniques.

 

What can you do, and what is still a dream?

While the exact recognition of printed text in the Latin alphabet is now considered an almost perfectly solved problem, the recognition of freehand writing and of non-Latin alphabets still has no really good solution and remains a subject of study and research.

In other words, today you can buy software that takes a newspaper page read by a scanner and returns a nice document in text format (e.g. Word), relieving you of the burden of typing it by hand. But you cannot "feed" it four scribbles taken down in a hurry, perhaps during a university lecture, and expect an OCR program to produce a nice, clean, well-written text document. Unfortunately, this is still a dream!

One area where research is producing real results, using artificial neural networks, is the recognition of individual handwritten characters. These studies are applied, for example, to palmtop devices, where the software can recognize the characters drawn, after an appropriate training phase (typical of almost all systems based on neural networks), with an accuracy exceeding 98%.

 
Why is handwriting recognition so difficult?

The main problem we face when recognizing a character is that a character never has the same geometry or the same identifying features twice. A letter can vary in shape, in size and in the accuracy with which it is drawn and, not least, it varies according to who writes it!

For example, recognizing my handwriting would require a scribble recognizer!

If, instead of isolated characters, we must recognize the words of a text written freehand in cursive, things become enormously more complicated. It is in fact very difficult to isolate the individual characters, because it is not easy to see where one ends and the next begins. In these cases, research is directed toward recognizing a word as a whole. To do so, the OCR engine is flanked by grammatical, spelling and contextual checks. Knowing the context removes many ambiguities. For example, a document about history contains many dates, so in a sentence like "The Second World War started in 197S" the system assumes that the curve next to 197 is a 5 and not an S.
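The contextual check described above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the table of look-alike characters and the "mostly digits" rule are hypothetical, and are not taken from any real OCR engine.

```python
import re

# Hypothetical table of letters that a recognizer may confuse with digits.
LOOKALIKE_DIGITS = {"S": "5", "O": "0", "I": "1", "l": "1", "B": "8", "Z": "2"}

def fix_digits_in_numeric_tokens(text):
    """Replace look-alike letters with digits inside tokens that are mostly digits."""
    def fix_token(match):
        token = match.group(0)
        digit_count = sum(ch.isdigit() for ch in token)
        # Only touch tokens where digits are the clear majority, e.g. "197S".
        if digit_count * 2 <= len(token):
            return token
        return "".join(LOOKALIKE_DIGITS.get(ch, ch) for ch in token)
    return re.sub(r"[A-Za-z0-9]+", fix_token, text)

print(fix_digits_in_numeric_tokens("The Second World War started in 197S"))
# The Second World War started in 1975
```

Ordinary words such as "Second" are left alone because they contain no digits. A real system would combine a rule like this with dictionaries and grammar checks.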

 

 

The Software

The software we will develop certainly cannot compete with powerful commercial tools. Its aim is purely educational: to show how to use an artificial neural network to build an OCR program.

The language we will use is VB6. Someone might say to me:
"You want to use a language with a rapid development environment, fine, but at least use VB.NET!"

Well, the point is that, frankly, I still have not studied this new environment from Uncle Bill! There is never time! But let's leave time aside, otherwise we would have to drag relativity into the discussion!
Back to us, and let's start.

The interface of our program is shown in the picture above.
Good, but how do you use it?

The first thing to do, of course, is to download the executable file!


Program Instructions


The first thing to do is to train the network to recognize the characters you define. You can do this in two ways:
load a previous training or create a new one.

- Load a training (from the file supply.txt) by pressing the LOAD button.

- Train the network to recognize "i", "L", "C", "E" and "F" (loaded in the previous step) by pressing the ADDESTRA button.

The network is now ready to recognize the five characters.

- Press the "CLEAR MATRIX" button.
- Draw an F in the 5 x 6 matrix (as in Fig. 1).

To draw:
move the mouse over the matrix while holding down the
left mouse button, or click on a point to blacken it.

To clear:
hold down the Shift key on the keyboard and move the
mouse over the grid while holding down the left mouse button
or, still holding Shift down, click on
a point to make it white.

- Press the RICONOSCIMENTO button to perform the recognition.

If everything goes smoothly, the letter "F" appears in the orange box.

You can try drawing "i", "E", "L" or "C" to verify that the network recognizes these characters too. Finally, you can try to spoil the characters at some points to see up to what point the network can still discriminate a particular character.
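To make the idea of the drawing matrix concrete, here is a minimal Python sketch of how a drawn character could be turned into the list of inputs a network sees. The orientation (6 rows by 5 columns) and the "#" / "." notation are my assumptions for the example; the original program is written in VB6 and may store the grid differently.

```python
# A drawing of "F" on a grid of 6 rows by 5 columns (the orientation is assumed).
F_DRAWING = [
    "#####",
    "#....",
    "####.",
    "#....",
    "#....",
    "#....",
]

def grid_to_vector(rows):
    """Flatten the grid row by row: 1 for a blackened cell, 0 for a white one."""
    return [1 if cell == "#" else 0 for row in rows for cell in row]

f_vector = grid_to_vector(F_DRAWING)
print(len(f_vector))  # 30 inputs, one per cell
print(sum(f_vector))  # 13 blackened cells
```

Each of the 30 values would feed one input of the network, so two drawings that differ in a single cell differ in exactly one input.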

Note 1:

The network is very rudimentary: there are no software modules that automatically center, scale or rotate the characters, so you must always draw a character so that it fills as much of the matrix as possible. Narrow letters such as "i" must also always be placed at the center of the matrix!
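The program does not include an auto-centering module, but such a step is easy to picture. The following Python sketch is an assumption of mine, not part of the original software: it moves the bounding box of the drawn cells to the middle of the grid, without scaling or rotating.

```python
def center_glyph(rows, blank="."):
    """Shift the inked cells so their bounding box sits in the middle of the grid."""
    height, width = len(rows), len(rows[0])
    inked = [(r, c) for r in range(height) for c in range(width) if rows[r][c] != blank]
    if not inked:
        return list(rows)  # nothing drawn, nothing to move
    top = min(r for r, _ in inked)
    bottom = max(r for r, _ in inked)
    left = min(c for _, c in inked)
    right = max(c for _, c in inked)
    # Offsets that move the bounding box to the center (rounded down).
    dr = (height - (bottom - top + 1)) // 2 - top
    dc = (width - (right - left + 1)) // 2 - left
    out = [[blank] * width for _ in range(height)]
    for r, c in inked:
        out[r + dr][c + dc] = rows[r][c]
    return ["".join(line) for line in out]

# An "i" drawn against the left edge instead of in the middle.
off_center_i = [
    "#....",
    ".....",
    "#....",
    "#....",
    "#....",
    ".....",
]
for line in center_glyph(off_center_i):
    print(line)
```

With this grid the "i" is moved two columns to the right, into the middle column. Scaling and rotation would need separate steps.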

Curiosity: roto-translation operations are normally performed even when we move our eyes. The recognition of a given image by the brain must, in fact, be independent of the position that the image occupies on our retina. Evolved biological organisms like us have developed a neural network that can learn this transformation from retinal coordinates to skull-centered coordinates. This type of network can be simulated artificially with a particular type of reinforcement learning, a variant of the ARP (associative reward-penalty) algorithm (Mazzoni, Andersen and Jordan, 1991).

Drawing from memory and character corruption


If you like, you can quickly draw a character by choosing it from those available. To do this, select a character from the list labeled "N. Cicli". The selected character is drawn directly into the matrix.

Now you can try to "destroy" the character to see how well the network still recognizes it. Recognition, as before, is done by pressing the "RICONOSCI" button.
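In the program the character is spoiled by hand with the mouse. The same experiment can be imagined in code: the Python sketch below inverts a chosen number of cells at random. The function name and the 0/1 representation are assumptions made for this example.

```python
import random

def corrupt(vector, flips, seed=None):
    """Return a copy of a 0/1 vector with `flips` randomly chosen cells inverted."""
    rng = random.Random(seed)
    noisy = list(vector)
    # sample() picks distinct positions, so exactly `flips` cells change.
    for index in rng.sample(range(len(noisy)), flips):
        noisy[index] = 1 - noisy[index]
    return noisy

# The letter "F" on an assumed grid of 6 rows by 5 columns, flattened row by row.
clean = [1, 1, 1, 1, 1,
         1, 0, 0, 0, 0,
         1, 1, 1, 1, 0,
         1, 0, 0, 0, 0,
         1, 0, 0, 0, 0,
         1, 0, 0, 0, 0]
noisy = corrupt(clean, flips=3, seed=0)
print(sum(a != b for a, b in zip(clean, noisy)))  # 3 cells differ
```

Raising `flips` step by step and recognizing each result would show at what level of damage the network starts to confuse the character.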

Network training


If you are not tired yet, you can run a new training (instead of loading the one I had fun preparing).
Proceed as follows:

1. Restart the program.

2. Draw a character in the matrix.

3. In the text box below the grid, enter the character you want to associate with the drawing. It will be shown in the orange box.

4. Press "MEMO" to store the association. The character will be shown in the "N° CICLI" and "NEURONI ASSOCIATI" list boxes.

5. Repeat steps 2, 3 and 4 for each character the network has to learn.

6. Press the "ADDESTRAMENTO" button to train the network.
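The procedure above (store drawing/character pairs, then train) can be sketched in Python as follows. This is a sketch under stated assumptions, not the code of the VB6 program: I assume one output neuron per character and the classic perceptron learning rule, and all function names are invented for the example.

```python
def to_vector(rows):
    """Flatten a drawing into 0/1 inputs ("#" is a blackened cell)."""
    return [1 if cell == "#" else 0 for row in rows for cell in row]

def activation(weights, bias, x):
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def train(examples, max_cycles=100):
    """examples: list of (label, vector) pairs. One output neuron per label."""
    labels = sorted({label for label, _ in examples})
    size = len(examples[0][1])
    weights = {label: [0.0] * size for label in labels}
    bias = {label: 0.0 for label in labels}
    for cycle in range(1, max_cycles + 1):
        errors = 0
        for label, x in examples:
            for neuron in labels:
                target = 1 if neuron == label else 0
                output = 1 if activation(weights[neuron], bias[neuron], x) > 0 else 0
                delta = target - output
                if delta:
                    # Perceptron rule: move the weights toward the desired answer.
                    errors += 1
                    weights[neuron] = [w + delta * xi for w, xi in zip(weights[neuron], x)]
                    bias[neuron] += delta
        if errors == 0:  # every stored association is reproduced
            return weights, bias, cycle
    return weights, bias, max_cycles

def recognize(weights, bias, x):
    """Return the label of the neuron with the strongest response."""
    return max(weights, key=lambda label: activation(weights[label], bias[label], x))

L_DRAWING = ["#....", "#....", "#....", "#....", "#....", "#####"]
F_DRAWING = ["#####", "#....", "####.", "#....", "#....", "#...."]
examples = [("L", to_vector(L_DRAWING)), ("F", to_vector(F_DRAWING))]
weights, bias, cycles = train(examples)

# An "F" spoiled by one extra black cell in the bottom-right corner.
spoiled_f = to_vector(["#####", "#....", "####.", "#....", "#....", "#...#"])
print(recognize(weights, bias, spoiled_f))  # F
```

Only two characters are trained here to keep the example short. Adding more pairs to `examples` corresponds to repeating steps 2, 3 and 4 before pressing the training button.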

The end of the training is confirmed by the message:

"Addestramanto Neuroni Completato"

Verification:

You must verify that the associations have been learned properly. Select the first character from the "N° Cicli" list and click "RICONOSCI". If the character that appears in the orange box is correct, go ahead and repeat the operation for the next character. If the character is not recognized correctly, the network has failed to learn the transformation, and you must provide another example of the same character. To do this, proceed as follows:

1. Make sure that the character in question is displayed in the matrix; otherwise select it from the "N° Cicli" list. Suppose it is a "7".

2. Enter "7" in the text box under the grid.

3. Modify the "7" in the grid so as to give the network a similar example of the same character, or leave the character unchanged (unless you want the network to learn a variant of the character in question).

4. Press "MEMO" to store the association (the "N° Cicli" list now contains two examples of the same character).

5. Press the training button to run the training.

6. Select either "7", press the "RICONOSCI" button and check whether the character is recognized correctly. If not, repeat this procedure until it is.

Once you have verified that all characters are recognized correctly, you can save the associations by pressing the "SAVE" button.
The file Supply.txt will be updated (or created if it does not exist). If you want to keep the file Supply.txt from a previous training, first rename it, for example to Supplyold.txt. When you need to reload the old training, simply give Supplyold.txt its original name back!
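If you prefer not to rename the file by hand every time, the backup can be scripted. The Python sketch below is only a suggestion: the function name is invented, and it copies the file rather than renaming it, so Supply.txt stays in place for the program.

```python
import shutil
import tempfile
from pathlib import Path

def backup_training(folder, name="Supply.txt", backup_name="Supplyold.txt"):
    """Copy the current training file aside so that SAVE cannot overwrite it."""
    source = Path(folder) / name
    if not source.exists():
        return None  # nothing to protect yet
    target = Path(folder) / backup_name
    shutil.copy2(source, target)
    return target

# Demonstration in a throwaway folder, so no real training file is touched.
with tempfile.TemporaryDirectory() as folder:
    (Path(folder) / "Supply.txt").write_text("placeholder training data")
    saved = backup_training(folder)
    print(saved.name)  # Supplyold.txt
```

To go back to the old training, copy Supplyold.txt over Supply.txt before pressing LOAD.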

NB: the "N. Cicli" list can hold at most 21 entries.

Network Limits


The neural network used is very simple (it has only one layer of neurons).

As the number of characters grows, so does the likelihood that several neurons respond simultaneously to a given input stimulus. This phenomenon is called interference. To remove it, you must present the same training pair to the network several times. In this way, however, while the selectivity of the network increases on the one hand, on the other its capacity to generalize decreases, and with it its ability to recognize corrupted characters!
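A toy Python example may help to picture interference. Everything in it is an assumption made for illustration: each neuron simply stores one drawing as a template and fires when at least 70% of its template is covered by the input. This is not the activation rule of the program.

```python
def cells(rows):
    """Set of (row, column) positions that are blackened in a drawing."""
    return {(r, c) for r, row in enumerate(rows) for c, ch in enumerate(row) if ch == "#"}

# One untrained "template" neuron per character (hypothetical setup).
TEMPLATES = {
    "E": cells(["#####", "#....", "####.", "#....", "#....", "#####"]),
    "F": cells(["#####", "#....", "####.", "#....", "#....", "#...."]),
    "L": cells(["#....", "#....", "#....", "#....", "#....", "#####"]),
}

def firing_neurons(drawing):
    """Labels of the neurons whose template is at least 70% covered by the input."""
    inked = cells(drawing)
    return sorted(label for label, template in TEMPLATES.items()
                  if 10 * len(inked & template) >= 7 * len(template))

f_input = ["#####", "#....", "####.", "#....", "#....", "#...."]
print(firing_neurons(f_input))  # ['E', 'F']
```

The "F" drawing covers most of the "E" template, so two neurons answer at once. Training has to push the "E" neuron away from cells that belong only to "F", and the more this is done, the less tolerant the neuron becomes of spoiled drawings.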

To overcome these limitations, it is necessary to use a more powerful (but also more complex) neural network, such as a Back Propagation network.

The rest of the article explains how the program works and which type of neural network it uses. I will not go into detail about that here, because it would take many pages and too much time. Time is a precious resource!

For those not interested in the technical aspects, however, this article ends here!

If instead you want to take the path that I took myself, that of madness, go ahead and think no more!

 

Part 2

Rome, 15/07/2007

 

By Fabio Pacioni

 

 


Installing and running

percept-OCR 
Setup

 

The program runs on Microsoft Windows.

For Windows XP Service Pack 2:

- Download the file Perceptive-OCR.zip

- Unzip everything into a folder

- Run Perceptron.exe


For Windows 98 you need the VB6.0 run-time libraries. Unless they are already installed on the machine, you have to download the full installation package:

- Download setup.zip

- Unzip everything into a folder

- Launch the Setup.exe file

- Follow the instructions

- Run Perceptive-ocr.exe
