This project is part of the Bengali.AI Handwritten Grapheme Classification Kaggle competition. The organizers of the competition provide the following motivation for the task:

Bengali is the 5th most spoken language in the world, with hundreds of millions of speakers. It's the official language of Bangladesh and the second most spoken language in India. Considering the language's reach, there is significant business and educational interest in developing AI that can optically recognize images of the language's handwritten script. This challenge hopes to improve on existing approaches to Bengali handwriting recognition.

Optical character recognition is particularly challenging for Bengali. While the script has 49 letters (to be more specific, 11 vowels and 38 consonants) in its alphabet, there are also 18 potential diacritics. This means that there are many more graphemes, the smallest units in a written language. The added complexity results in ~13,000 different grapheme variations (compared to English's 250 graphemic units).

(Figure: a sample of the Bengali alphabet.)

Bangladesh-based non-profit Bengali.AI is focused on solving this problem. They build and release crowdsourced, metadata-rich datasets and open source them through research competitions.
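
To make the grapheme count a little more concrete, here is a minimal Python sketch. It shows that a single written grapheme can be built from several Unicode code points (a consonant, an optional conjunct, and a vowel sign), and it reproduces a figure close to the ~13,000 mentioned in the motivation. The sample characters are picked purely for illustration, and the class counts (168 grapheme roots, 11 vowel diacritic classes, 7 consonant diacritic classes, each diacritic count including a "none" class) are an assumption based on the competition's label scheme rather than something stated in this post.

```python
import unicodedata

# Illustrative characters only; any consonant / vowel sign would do.
KA = "\u0995"            # BENGALI LETTER KA
SSA = "\u09B7"           # BENGALI LETTER SSA
VOWEL_SIGN_I = "\u09BF"  # BENGALI VOWEL SIGN I (a diacritic)
HASANTA = "\u09CD"       # BENGALI SIGN VIRAMA, joins consonants into a conjunct


def describe(grapheme):
    """Return the Unicode name of every code point in a grapheme."""
    return [unicodedata.name(ch) for ch in grapheme]


# A consonant plus one vowel diacritic: two code points, one written unit.
simple = KA + VOWEL_SIGN_I

# A conjunct (two consonants joined by hasanta) plus a vowel diacritic:
# four code points, still one written unit.
conjunct = KA + HASANTA + SSA + VOWEL_SIGN_I

# Assumed class counts from the competition's label scheme (not from this post).
GRAPHEME_ROOTS = 168
VOWEL_DIACRITIC_CLASSES = 11      # includes a "no vowel diacritic" class
CONSONANT_DIACRITIC_CLASSES = 7   # includes a "no consonant diacritic" class

possible_graphemes = (
    GRAPHEME_ROOTS * VOWEL_DIACRITIC_CLASSES * CONSONANT_DIACRITIC_CLASSES
)

if __name__ == "__main__":
    print(simple, describe(simple))
    print(conjunct, describe(conjunct))
    print("Possible root/diacritic combinations:", possible_graphemes)
```

Under these assumptions the product of the three class counts is 12,936, which lines up with the "~13,000" figure. It also hints at why predicting the three components separately can be more tractable than treating every combination as its own class.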