Tuesday, October 28, 2008
Current Status of Bangla OCR | 29th October, 2008
For the past few days I have felt the need to write about the current status of our research and development. It has been quite a long time (more than 40 days) since I last posted anything about the status of our Bangla OCR on the blog. In the last post I wrote about our segmentation success and showed a color-segmented image. After that we tried to combine everything and work towards the 2nd release of Bangla OCR, which will actually be a Tesseract-based Bangla OCR. Seeing the actual demand, we plan to develop it for both the Windows and Linux platforms. Thanks go to Joyonto da, Firoj Alam and Murtoza, who encouraged me to think about and move towards both platforms.

I would like to report that we have made significant progress in our development work. Shouro is finishing the Ubuntu GUI. I faced several difficulties with issues such as including tessnet2.dll, handling a buffer overrun problem and adding the post-processor, and I spent the past two weeks on these. I was depressed because of the problems with loading tessnet2.dll. At last Remi Thomas (the developer of tessnet2.dll) assured me that the buffer overrun problem I was facing was in Tesseract itself. So I moved my focus back to tesseract.exe and tried to include it in my Windows application and run it as a hidden background process. Now I am successful, because I can see the output.

A few things are yet to be included in our application. I hope we can finish them soon and release the 2nd version of Bangla OCR.
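
For readers curious about the idea of running tesseract.exe as a hidden background process, here is a minimal sketch in Python. It is not the code from our Windows application. It only illustrates the general approach of launching the command-line tool without a console window and collecting its result. The function names, the file names, and the language code `ben` are assumptions made for illustration; use whatever code matches your installed Bangla trained data.

```python
import subprocess
import sys


def build_tesseract_command(image_path, output_base, lang="ben", exe="tesseract"):
    """Build the argument list: tesseract <image> <output base> -l <lang>.

    The default language code and executable name are assumptions; adjust
    them to match your own Tesseract installation.
    """
    return [exe, image_path, output_base, "-l", lang]


def run_hidden(command, timeout=120):
    """Run a command without showing a console window.

    Returns a tuple of (exit code, stdout text, stderr text).
    """
    kwargs = {}
    if sys.platform == "win32":
        # On Windows, CREATE_NO_WINDOW stops a console window from
        # appearing for the child process.
        kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
    result = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout, **kwargs
    )
    return result.returncode, result.stdout, result.stderr


# Hypothetical usage (requires Tesseract to be installed and on the PATH):
#   cmd = build_tesseract_command("page.tif", "page_out")
#   code, out, err = run_hidden(cmd)
#   # Tesseract writes the recognized text to page_out.txt
```

Calling the executable this way keeps the application independent of a wrapper DLL, at the cost of starting a new process for every page and reading the result back from a text file.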