Tuesday, October 28, 2008

Current Status of Bangla OCR | 29th October, 2008

For the past few days I have felt an urgent need to write about the current status of our research and development. It has been quite a long time (more than 40 days) since I last posted anything about the status of our Bangla OCR on this blog. In the last post I wrote about our segmentation success and showed a color-segmented image. After that we tried to combine everything and work towards the 2nd release of Bangla OCR, which will actually be a Tesseract-based Bangla OCR. Seeing the actual demand, we plan to develop it for both the Windows and Linux platforms. Thanks go to Joyonto da, Firoj Alam and Murtoza, who encouraged me to think about and move towards both platforms.

I am glad to report that we have made significant progress in our development work. Shouro is finishing the Ubuntu GUI. I faced several difficulties, including integrating tessnet2.dll, handling a buffer overrun problem, and adding the post-processor, and spent the past two weeks on these. I was depressed because of the problems with loading tessnet2.dll. At last Remi Thomas (the developer of tessnet2.dll) assured me that the buffer overrun problem I was facing was in Tesseract itself. So I moved my focus back to tesseract.exe and tried to include it with my Windows application and run it as a hidden background process. Now I am successful, because I can see the output.

A few things are yet to be included in our application. I hope we can finish soon and release the 2nd version of Bangla OCR.
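For readers curious about the "hidden background process" idea mentioned in this post, here is a rough sketch in Python. It is only an illustration under stated assumptions, not the project's actual Windows code: the function names (`build_tesseract_command`, `hidden_window_kwargs`, `run_ocr`) are made up for this example, the executable is assumed to be reachable as `tesseract` on the PATH, and the language code passed with `-l` depends on whichever trained data is installed.

```python
import os
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang=None, exe="tesseract"):
    """Build the command line: tesseract <image> <output base> [-l <lang>]."""
    cmd = [exe, str(image_path), str(output_base)]
    if lang:
        # The language code is an assumption; use the name of your trained data.
        cmd += ["-l", lang]
    return cmd


def hidden_window_kwargs():
    """Extra subprocess arguments that suppress the console window on Windows."""
    if os.name != "nt":
        return {}  # Nothing to hide on other platforms.
    startupinfo = subprocess.STARTUPINFO()
    startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
    startupinfo.wShowWindow = subprocess.SW_HIDE
    return {"startupinfo": startupinfo}


def run_ocr(image_path, output_base, lang=None, exe="tesseract"):
    """Run tesseract in the background and return the recognized text."""
    cmd = build_tesseract_command(image_path, output_base, lang, exe)
    subprocess.run(
        cmd,
        check=True,                 # Raise if tesseract exits with an error.
        stdout=subprocess.DEVNULL,  # Discard console chatter.
        stderr=subprocess.DEVNULL,
        **hidden_window_kwargs(),
    )
    # Tesseract writes its result to "<output base>.txt".
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

A call such as `run_ocr("page.tif", "page_out", lang="ben")` would then return the text written to `page_out.txt`, provided the executable and the matching language data are installed. Running the executable as a separate process also keeps any crash inside Tesseract from taking down the calling application.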

2 comments:

জয়ন্ত said...

We are waiting for the second release. Great work. I will be happy if I can contribute anything to this project, but my VB and VC knowledge is very poor and I have forgotten all my coding. This is great work ......

Md. Abul Hasnat said...

Dear Joyonto da,
Nice to see your comment. We hope to be able to release a version in the first week of December... Please stay with us and keep giving us your valuable feedback and suggestions.