Parsing Bangla Grammar Using Context Free Grammar

Al-Mahmud (Khulna University of Engineering and Technology, Bangladesh), Bishnu Sarker (Khulna University of Engineering and Technology, Bangladesh) and K. M. Azharul Hasan (Khulna University of Engineering and Technology, Bangladesh)
DOI: 10.4018/978-1-4666-6042-7.ch044

Parsing plays a prominent role in computational linguistics, and parsing Bangla sentences is a basic requirement of Bangla language processing. This chapter describes a Context Free Grammar (CFG) for Bangla and proposes a Bangla parser based on that grammar. The approach is simple to apply to Bangla sentences and rests on a well-accepted parsing method. The proposed parser is a predictive parser that works top-down, and a parse table is constructed for recognizing Bangla grammar. The parse table serves to detect syntactic mistakes in Bangla sentences: a sentence is rejected when the table has no entry for a terminal encountered during parsing. Once a natural language can be parsed successfully, grammar checking for that language becomes possible. A CFG used for top-down parsing suffers from a major problem called left recursion; the technique of left factoring is applied to avoid the problem.
Chapter Preview


In computing, a parser is one of the components of an interpreter or compiler. It checks for correct syntax and builds a data structure (often a parse tree, abstract syntax tree, or other hierarchical structure) that is implicit in the input tokens. Parsing can be defined as the process in which a parser algorithm determines whether a given input string is grammatically correct with respect to a given grammar. Parsing is a fundamental problem in language processing for both machines and humans. In general, the parsing problem includes the definition of an algorithm that maps any input sentence to its associated syntactic tree structure (Saha, 2006). The parser often uses a separate lexical analyzer to create tokens from the sequence of input characters. Parsers may be written by hand or, for some programming languages, generated automatically or semi-automatically by a tool.
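The division of labor described above (a lexical analyzer producing tokens, then a parser deciding whether the token sequence is grammatical) can be sketched with a small table-driven predictive recognizer. This is only an illustration under stated assumptions: the three-rule grammar, the romanized three-word lexicon, and the names `LEXICON`, `TABLE`, `tokenize`, and `recognize` are hypothetical and are not taken from the chapter's Bangla grammar or parse table.

```python
# Toy grammar (an assumption for illustration, not the chapter's grammar):
#   S  -> NP VP
#   NP -> pron | noun
#   VP -> NP verb | verb
NONTERMINALS = {"S", "NP", "VP"}

# Hypothetical lexicon: romanized Bangla word -> terminal category.
LEXICON = {"ami": "pron", "bhat": "noun", "khai": "verb"}

# Predictive parse table: (non-terminal, lookahead terminal) -> right-hand side.
# A missing entry means the input is rejected as a syntax error.
TABLE = {
    ("S", "pron"): ["NP", "VP"],
    ("S", "noun"): ["NP", "VP"],
    ("NP", "pron"): ["pron"],
    ("NP", "noun"): ["noun"],
    ("VP", "pron"): ["NP", "verb"],
    ("VP", "noun"): ["NP", "verb"],
    ("VP", "verb"): ["verb"],
}


def tokenize(sentence):
    """Lexical analysis: map each word to a terminal; unknown words become 'unk'."""
    return [LEXICON.get(word, "unk") for word in sentence.split()]


def recognize(tokens):
    """Return True if the token list is derivable from S, using the parse table."""
    tokens = list(tokens) + ["$"]      # "$" marks the end of the input
    stack = ["$", "S"]                 # start symbol sits above the end marker
    pos = 0
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top == "$":
            return lookahead == "$"    # accept only if all input was consumed
        if top in NONTERMINALS:
            rhs = TABLE.get((top, lookahead))
            if rhs is None:
                return False           # no table entry: syntactic mistake
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == lookahead:
            pos += 1                   # terminal matched, advance the input
        else:
            return False
    return False
```

Under these assumptions, `recognize(tokenize("ami bhat khai"))` accepts the subject-object-verb sentence, while `recognize(tokenize("khai ami"))` is rejected because the table has no entry for `("S", "verb")`.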

A parse tree for a grammar is a tree in which the root is the start symbol of the grammar, the interior nodes are non-terminals, and the leaf nodes are terminals. The children of a node, read from left to right, correspond to the symbols on the right-hand side of some production for that node's non-terminal. Every valid parse tree represents a string generated by the grammar (Yarowsky, 1995).
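The definition above can be made concrete with a minimal sketch of a parse tree data structure. The grammar, the `Node` class, and the helpers `tree_yield` and `conforms` are hypothetical names introduced here for illustration; the grammar is a toy one, not the chapter's Bangla grammar.

```python
from dataclasses import dataclass, field
from typing import List

# Toy grammar (an assumption): non-terminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["pron"], ["noun"]],
    "VP": [["NP", "verb"], ["verb"]],
}


@dataclass
class Node:
    symbol: str
    children: List["Node"] = field(default_factory=list)


def tree_yield(node):
    """Collect the leaf symbols from left to right: the string the tree represents."""
    if not node.children:
        return [node.symbol]
    result = []
    for child in node.children:
        result.extend(tree_yield(child))
    return result


def conforms(node, grammar=GRAMMAR):
    """Check that leaves are terminals and every interior node matches a production."""
    if not node.children:
        return node.symbol not in grammar        # a leaf must be a terminal
    rhs = [child.symbol for child in node.children]
    if rhs not in grammar.get(node.symbol, []):  # children must spell a right-hand side
        return False
    return all(conforms(child, grammar) for child in node.children)


# S -> NP VP, NP -> pron, VP -> NP verb, NP -> noun
example_tree = Node("S", [
    Node("NP", [Node("pron")]),
    Node("VP", [Node("NP", [Node("noun")]), Node("verb")]),
])
```

Here `tree_yield(example_tree)` gives the terminal sequence `pron noun verb`, and `conforms(example_tree)` holds because each interior node's children match one production of the toy grammar.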
