Introduction
Semantics, in its basic form, is the meaning of a sentence. In a computer-driven world of automation, it has become necessary for machines to understand the meaning of a given text for applications such as automatic answer evaluation, summary generation and translation systems. In linguistics, semantic analysis is the process of relating syntactic structures, from the words and phrases of a sentence, to their language-independent meaning. Given a sentence, one way to perform semantic analysis is to identify the relation of each word to the action (verb) of the sentence. For example, in "Rohit ate ice cream", the agent of the action is Rohit and the object on which the action is performed is ice cream. This type of association creates a predicate-argument relation between the verb and its constituents. In Sanskrit, this association is achieved through kAraka analysis. Understanding of a language is guided by its semantic interpretation, and semantic analysis in Sanskrit is guided by the six basic semantic roles given by pAninI as kAraka values.
pAninI, an ancient Sanskrit grammarian, set out nearly 4000 rules, called sutras, in a book called ashtAdhyAyI, meaning "eight chapters". These rules describe a transformational grammar, which transforms a root word into a number of dictionary words by adding the proper suffix, prefix or both to the root. The suffix to be added depends on the category, gender and number of the word. Structured tables containing the suffixes, called declension tables, are maintained for this purpose. They are designed so that the position of each suffix in the table is defined by number, gender and kAraka value. Words with similar endings follow the same declension. For example, rAma is an a-ending root word, and the words generated from the a-ending declension table are rAmaH, rAmau and rAmAH, formed by adding the suffixes H, au and AH to rAma, respectively. The suffix-based information of a word thus reveals not only its syntactic properties but also provides a way to find the semantic relation of the word to the verb using kAraka theory.
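The declension-table lookup described above can be sketched in a few lines of Python. This is only an illustrative fragment under stated assumptions, not the tables used in this work. It covers just the nominative and accusative rows of the a-ending masculine declension. The names `A_ENDING_MASC`, `decline` and `analyse` are hypothetical. The sketch also assumes that each ending replaces the final "a" of the root, so that the dual comes out as rAmau.

```python
# Illustrative fragment of an a-ending masculine declension table.
# Keys are (case, number); each value replaces the final "a" of the root.
# Only two cases are listed; the replace-final-vowel convention and the
# informal transliteration are assumptions made for this sketch.
A_ENDING_MASC = {
    ("nominative", "singular"): "aH",
    ("nominative", "dual"): "au",
    ("nominative", "plural"): "AH",
    ("accusative", "singular"): "am",
    ("accusative", "dual"): "au",
    ("accusative", "plural"): "An",
}


def decline(root, case, number):
    """Generate the inflected form of an a-ending root for one table cell."""
    if not root.endswith("a"):
        raise ValueError("this table only covers a-ending roots")
    return root[:-1] + A_ENDING_MASC[(case, number)]


def analyse(word, root):
    """Return every (case, number) cell whose generated form equals `word`."""
    return [cell for cell in A_ENDING_MASC if decline(root, *cell) == word]
```

For example, `decline("rAma", "nominative", "plural")` yields `rAmAH`, while `analyse("rAmau", "rAma")` returns both the nominative dual and the accusative dual cells. A suffix alone can therefore be ambiguous, so a system that assigns kAraka values may need more than the table position.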
The next section describes the Sanskrit language and kAraka theory. Section three states the problem definition, followed by the neural network (NN) model for semantic analysis. Features extracted from a corpus of pre-annotated text are supplied as input to the system, with the objective of making the system learn the six kAraka defined by pAninI. This paper presents the concept of neural networks, work done in the field of NN and natural language processing, the algorithm, the annotated corpus and the results obtained.
Sanskrit And Karaka Theory
The Sanskrit language, with its well-defined grammatical and morphological structure, not only presents the relation of suffixes and affixes to the word, but also provides syntactic and semantic information about the words in a sentence. Due to its rich inflectional morphology, it is expected to be suitable for computer processing. Work at NASA on the Sanskrit language reported that the triplets (role of the word, word, action) generated from this language are equivalent to a semantic net representation (Briggs 1995).
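To make the triplet idea concrete, the following minimal sketch stores (role of the word, word, action) triplets and groups them into a small net keyed by action. It is not the representation used in the cited work. The class `Triple`, the helpers `to_triples` and `to_semantic_net`, and the English role labels "agent" and "object" are hypothetical, and the sketch assumes at most one word per role for each action.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    role: str    # semantic role of the word, e.g. "agent"
    word: str    # the word that fills the role
    action: str  # the verb the word relates to


def to_triples(action, tagged_words):
    """Turn (word, role) pairs for one action into a set of triples."""
    return {Triple(role, word, action) for word, role in tagged_words}


def to_semantic_net(triples):
    """Group triples as action -> role -> word.

    Assumes each role is filled at most once per action; a later
    triple with the same role would overwrite an earlier one.
    """
    net = {}
    for t in triples:
        net.setdefault(t.action, {})[t.role] = t.word
    return net


# Role-tagged words for "Rohit ate ice cream" (tags supplied by hand).
triples = to_triples("ate", [("Rohit", "agent"), ("ice cream", "object")])
net = to_semantic_net(triples)
```

Because the triples are held in a set, listing the tagged words in a different order produces the same triples and the same net.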
Sanskrit grammar was developed over the years by three main individuals: pAninI, Katyayana and Patanjali. pAninI was the first to identify nearly 4000 rules to define the language (Kak, 1987). Sanskrit is one of the 22 official languages of India and a standardized dialect of Old Indo-Aryan. It is a rule-based language that has both flexibility and precision. It is an order-free language: changing the position of a word does not change the meaning of the sentence. Processing techniques for order-free languages cannot be derived from those for rigid-order languages like English. pAninI’s idea of semantic interpretation was based on identifying the relations of words, as agent, object etc., with the action in a sentence. This was called kAraka theory. In Sanskrit, this relationship is developed by adding specific, pre-set syllables, known as case-endings or vibhakti, to the basic noun form. Natural language processing concept in pAninI way describes the processing of language