Statistical Model for Identifying Unclear and Doubtfully Restored Signs of the Indus Script
Project by Polygence alum Varun
Project's result
Varun presented his project at the Seventh Symposium of Rising Scholars.
Summary
The Indus Valley civilization of the Indian subcontinent developed a writing system between 2500 and 1800 BCE that remains undeciphered. The Indus script texts recovered so far from archaeological digs are limited in number, and many come from damaged artifacts with unclear or missing signs. Identifying these signs and extending the text corpus would benefit further research. This work aims to predict the missing and unclear signs with n-gram Markov chain models built from the ICIT Indus text corpus. First, we analyzed patterns and concordances of single signs, pairs, triplets, and longer n-grams to discover how signs behave with respect to their positions in the texts. With that understanding, we built n-gram Markov chain language models augmented with positional probability. Since a sign can be missing at any location in a text, we devised and implemented sign fill-in models on top of these Markov chain models. Using the language models and the sign fill-in models, we predicted single missing signs in a test dataset and tuned our parameters, reaching a match accuracy of about 63%. We then filled in the actual unclear texts with our predicted signs. We hope that the statistical models and results from this work add to the Indus text corpus, aid in understanding the Indus script, and contribute to the decipherment effort.
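The fill-in idea described in the summary can be illustrated with a minimal sketch. This is not the project's actual implementation: it assumes a plain bigram model with add-k smoothing, leaves out the positional probability term, and uses placeholder sign labels ("A", "B", ...) rather than real ICIT sign identifiers. The function names `train_bigrams`, `prob`, `fill_in`, and `masked_accuracy` are hypothetical. A single missing sign is ranked by how well each candidate fits both its left and right neighbors.

```python
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical text-boundary markers


def train_bigrams(texts):
    """Count sign bigrams, padding each text with boundary markers."""
    counts = defaultdict(Counter)
    vocab = set()
    for text in texts:
        padded = [START, *text, END]
        vocab.update(text)
        for prev, cur in zip(padded, padded[1:]):
            counts[prev][cur] += 1
    return counts, vocab


def prob(counts, vocab, prev, cur, k=0.1):
    """Add-k smoothed P(cur | prev); END counts as one extra outcome."""
    total = sum(counts[prev].values())
    return (counts[prev][cur] + k) / (total + k * (len(vocab) + 1))


def fill_in(counts, vocab, text, gap, top=3):
    """Rank candidates for the single missing sign at index `gap`.

    Assumes exactly one gap (marked None) whose neighbors are known.
    Score = P(candidate | left neighbor) * P(right neighbor | candidate).
    """
    padded = [START, *text, END]
    left, right = padded[gap], padded[gap + 2]
    scores = {
        s: prob(counts, vocab, left, s) * prob(counts, vocab, s, right)
        for s in vocab
    }
    # Sort by descending score; break ties by label for determinism.
    return sorted(vocab, key=lambda s: (-scores[s], s))[:top]


def masked_accuracy(counts, vocab, test_texts):
    """Hide each sign in turn and check whether the top guess restores it."""
    hits = trials = 0
    for text in test_texts:
        for i, sign in enumerate(text):
            masked = [*text[:i], None, *text[i + 1:]]
            hits += fill_in(counts, vocab, masked, i, top=1) == [sign]
            trials += 1
    return hits / trials if trials else 0.0


# Toy corpus with placeholder labels, not real Indus sign sequences.
corpus = [["A", "B", "C"], ["A", "B", "C"], ["A", "D", "C"], ["B", "C"]]
counts, vocab = train_bigrams(corpus)
print(fill_in(counts, vocab, ["A", None, "C"], gap=1, top=2))
```

Scoring a candidate against both neighbors is what lets the same model fill a gap at the start, middle, or end of a text; the boundary markers stand in for the missing neighbor at either edge. A positional term, as mentioned in the summary, could be multiplied into the score, but its exact form is not given in the source.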
Ali
Polygence mentor
PhD Doctor of Philosophy
Subjects
Literature, Computer Science, Languages
Expertise
Natural Language Processing, Artificial Intelligence, Arabic Linguistics
Varun
Student
School
Dublin High School
Graduation Year
2024
Project review
“I was able to learn Natural Language Processing and complete a research paper.”
About my mentor
“I was very delighted to work with Dr. Ali. I learnt a lot from him.”