A Modified Markov-Based Maximum-Entropy Model for POS Tagging of Odia Text

Sagarika Pattnaik, Ajit Kumar Nayak
Copyright: © 2022 | Pages: 24
DOI: 10.4018/IJDSST.286690

Abstract

POS (part-of-speech) tagging, a vital step in diverse natural language processing (NLP) tasks, has not drawn much attention for Odia, a computationally under-developed language. This article proposes a robust hybrid POS tagger for Odia. Given the rich morphology of the language and the lack of a sufficiently large annotated text corpus, the tagger combines machine learning with linguistic rules. It is trained on a tagged text corpus from the tourism domain and obtains a perceptible improvement in accuracy, and an appreciable performance is also observed on news article texts from varied domains. Experiments on Odia show that the proposed algorithm outperforms existing methods such as the rule-based, hidden Markov model (HMM), maximum entropy (ME), and conditional random field (CRF) approaches.
Article Preview

This section discusses selected works related to the proposed methodology and the present state of the art of taggers developed for Odia. The rule-based technique, a primitive method, has been adopted effectively in building English taggers (Brill, 1992; Pham, 2020). For Indian languages such as Hindi (Garg et al., 2012), the method has also proved efficient, giving a noticeable accuracy of 87.55%. In due course, however, researchers observed that developing taggers governed solely by linguistic rules is a difficult task, and the field shifted towards statistical methods that rely more on the axioms of probability and proved comparatively more efficient.
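
As a rough illustration of the statistical family of taggers mentioned above (HMM, ME, CRF), the following Python sketch shows a bigram hidden Markov model tagger with Viterbi decoding over a tiny hypothetical corpus. The toy training sentences, tag set, and add-one smoothing are assumptions made purely for illustration; this is not the modified Markov/maximum-entropy model proposed in the article.

# Minimal sketch of a bigram HMM POS tagger with Viterbi decoding.
# Illustrative only: the tagged sentences and tag names below are hypothetical.
from collections import defaultdict
import math

train = [
    [("the", "DET"), ("tourist", "NOUN"), ("visits", "VERB"), ("puri", "NOUN")],
    [("the", "DET"), ("temple", "NOUN"), ("is", "VERB"), ("old", "ADJ")],
]

# Count tag-to-tag transitions and tag-to-word emissions from the training data.
trans = defaultdict(lambda: defaultdict(int))   # transitions: previous tag -> tag
emit = defaultdict(lambda: defaultdict(int))    # emissions: tag -> word
tags = set()
for sent in train:
    prev = "<s>"
    for word, tag in sent:
        trans[prev][tag] += 1
        emit[tag][word] += 1
        tags.add(tag)
        prev = tag

def log_prob(table, given, outcome, n_outcomes):
    # Add-one smoothed log probability of `outcome` given `given`.
    total = sum(table[given].values())
    return math.log((table[given][outcome] + 1) / (total + n_outcomes))

def viterbi(words):
    vocab = {w for t in list(emit) for w in emit[t]}
    n_words, n_tags = len(vocab) + 1, len(tags)
    # best[i][tag] = (score of best path ending in tag at position i, backpointer)
    best = [{t: (log_prob(trans, "<s>", t, n_tags)
                 + log_prob(emit, t, words[0], n_words), None)
             for t in tags}]
    for i in range(1, len(words)):
        col = {}
        for t in tags:
            score, back = max(
                (best[i - 1][p][0]
                 + log_prob(trans, p, t, n_tags)
                 + log_prob(emit, t, words[i], n_words), p)
                for p in tags)
            col[t] = (score, back)
        best.append(col)
    # Trace back the highest-scoring tag sequence.
    last = max(best[-1], key=lambda t: best[-1][t][0])
    seq = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        seq.append(last)
    return list(reversed(seq))

# Prints the predicted tag sequence for an unseen sentence.
print(viterbi(["the", "tourist", "visits", "puri"]))

In practice the toy counts are replaced by estimates from a large annotated corpus, and ME or CRF variants replace the generative transition/emission probabilities with feature-based conditional models.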
