Nepali Unicode Keyboard Layout Standardization based on Genetic Algorithm
Our attempt to standardize the Nepali Unicode Keyboard Layout based on a Genetic Algorithm.
We analyzed a Nepali corpus of 7,797 articles, consisting of 2,452,937 words, to obtain the frequencies of monographs and digraphs (single characters and two-character sequences). The resulting frequency data is then processed with a Genetic Algorithm to produce an optimized Nepali keyboard layout.
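The frequency step can be illustrated with a minimal sketch. This is not the attached freqcalculate.py; it assumes that counting is done per Unicode code point and that digraphs are counted only within words, and the function name `count_ngrams` and the two sample words are hypothetical.

```python
from collections import Counter

def count_ngrams(words):
    """Count single characters (monographs) and adjacent pairs (digraphs).

    Assumption: characters are Unicode code points, and digraphs do not
    span word boundaries.
    """
    monographs = Counter()
    digraphs = Counter()
    for word in words:
        # Each code point in the word counts once as a monograph.
        monographs.update(word)
        # zip pairs every character with its successor inside the word.
        digraphs.update(a + b for a, b in zip(word, word[1:]))
    return monographs, digraphs

# Toy input standing in for the real corpus.
sample = ["नेपाल", "नेपाली"]
mono, di = count_ngrams(sample)
print(mono.most_common(3))
print(di.most_common(3))
```

Counting by code point treats a consonant and its vowel sign as two separate characters, which is roughly how they are typed on a Unicode keyboard.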
The following criteria have been considered:
- Learnability
- Load Distribution
- Modifier Overhead
- Hand Alternation
- Consecutive Usage of Same Finger
- Big steps by fingers of the same hand
- Hit direction
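As a rough illustration of how criteria like these could be turned into a single score, the sketch below penalizes digraphs typed with the same hand and, more heavily, with the same finger. It covers only two of the criteria; the weights, the `(hand, finger)` encoding, and the name `layout_cost` are assumptions made for this example, not values from the report.

```python
def layout_cost(layout, digraph_freq, w_same_hand=1.0, w_same_finger=3.0):
    """Return a penalty for a layout; lower is better.

    layout:       maps a character to a (hand, finger) tuple (hypothetical encoding)
    digraph_freq: maps a two-character string to its corpus frequency
    """
    cost = 0.0
    for pair, freq in digraph_freq.items():
        a, b = pair
        if a not in layout or b not in layout:
            continue  # skip characters the layout does not place
        hand_a, finger_a = layout[a]
        hand_b, finger_b = layout[b]
        if hand_a == hand_b:
            # No hand alternation for this digraph.
            cost += w_same_hand * freq
            if finger_a == finger_b and a != b:
                # Two different keys pressed consecutively by one finger.
                cost += w_same_finger * freq
    return cost

# Toy data: two keys on the left index finger, one on the right index finger.
layout = {"क": ("L", 1), "ख": ("L", 1), "म": ("R", 1)}
digraphs = {"कख": 10, "कम": 5}
print(layout_cost(layout, digraphs))
```

The remaining criteria (load distribution, modifier overhead, big steps, hit direction, learnability) would add further weighted terms of the same kind.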
The report has also been submitted to Nepali Language in Information Technology, HLCIT for further review. Wish us luck ;).
The source code and corpus data will be available very soon.
NOTE:
The keyboard layout is currently in draft form. The Genetic Algorithm code we've written takes around 2 minutes to compute one generation. We project that 2000 generations would give us an optimized layout, which would take 2 × 2000 = 4000 minutes, or roughly 3 days, to compute. Since 3 days of continuous electricity seems almost impossible for now (due to load-shedding), we have not been able to compute the layout yet. The layout will be available as soon as we get the resources or the electricity supply becomes more consistent.
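To make "one generation" concrete, here is a minimal generic genetic-algorithm loop over key orderings. It is a sketch under assumptions and not the code described above: it uses truncation selection with swap mutation and no crossover, and `evolve`, `toy_cost`, and the toy key set are hypothetical.

```python
import random

def evolve(keys, cost, generations=50, pop_size=20, seed=0):
    """Search for a low-cost ordering of keys with a very simple GA."""
    rng = random.Random(seed)
    # Start from random permutations of the keys.
    population = [rng.sample(keys, len(keys)) for _ in range(pop_size)]
    for _ in range(generations):
        # One generation: rank, keep the better half, mutate copies of it.
        population.sort(key=cost)
        survivors = population[: pop_size // 2]
        children = []
        for parent in survivors:
            child = parent[:]
            i, j = rng.sample(range(len(child)), 2)
            child[i], child[j] = child[j], child[i]  # swap two key positions
            children.append(child)
        population = survivors + children
    return min(population, key=cost)

# Toy problem: the cost is the number of keys not in a target position.
keys = list("abcdef")
target = list("fedcba")

def toy_cost(layout):
    return sum(1 for i, k in enumerate(layout) if k != target[i])

best = evolve(keys, toy_cost)
print(best, toy_cost(best))
```

Because the survivors are carried over unchanged, the best cost never gets worse from one generation to the next. In a real run the cost function is evaluated against the corpus frequencies for every candidate, which is where most of the per-generation time would go.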
UPDATE:
This research is now being continued at Open Technologies Resource Center (OTRC).
| Attachment | Size |
|---|---|
| nepalikeyboardlayout_report_draft.pdf | 498.67 KB |
| nepali_corpus_frequency_result.ods | 44.99 KB |
| freqcalculate.py.txt | 1.82 KB |