InfoScipedia
A Free Service of IGI Global Publishing House
Below is a list of definitions for the selected term,
drawn from multiple scholarly research resources.

What is an N-Gram?

Handbook of Research on Digital Libraries: Design, Development, and Impact
An n-gram is a substring of a word, where n is the number of characters in the substring; typical cases are bigrams (n=2) and trigrams (n=3).
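A minimal sketch of this character-level definition, in Python. The function name `char_ngrams` and the sample word are illustrative assumptions, not taken from the chapter.

```python
def char_ngrams(word: str, n: int) -> list[str]:
    """Return every contiguous substring of `word` that is exactly n characters long."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # A word of length L has L - n + 1 substrings of length n (none if L < n).
    return [word[i:i + n] for i in range(len(word) - n + 1)]


print(char_ngrams("term", 2))  # ['te', 'er', 'rm']
print(char_ngrams("term", 3))  # ['ter', 'erm']
```

A word shorter than n yields an empty list, which is one reasonable convention; some applications pad the word with boundary markers instead.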
Published in Chapter:
Standardization of Terms Applying Finite-State Transducers (FST)
Carmen Galvez (University of Granada, Spain)
DOI: 10.4018/978-1-59904-879-6.ch010
Abstract
This chapter presents the different methods for standardizing terms under the two basic approaches, nonlinguistic and linguistic techniques, and sets out to justify the application of processes based on finite-state transducers (FST). Standardization of terms is the procedure of matching and grouping together variants of the same term that are semantically equivalent. A term variant is a text occurrence that is conceptually related to an original term and can be used to search for information in a text database. The uniterm and multiterm variants can be considered equivalent units for the purposes of automatic indexing. This chapter describes the computational and linguistic base of the finite-state approach, with emphasis on the influence of formal language theory on the standardization of uniterms and multiterms. Lemmatization and syntactic pattern-matching, through equivalence relations represented in FSTs, are emerging methods for the standardization of terms.
More Results
Machine Learning in Text Analysis
Grouping ‘n’ words from a sequence so that together they convey something meaningful.
Natural Language Processing Applications in Language Assessment: The Use of Automated Speech Scoring
An n-gram is a series of n adjacent letters or syllables in a language dataset, or of n adjacent phonemes obtained from a speech-recording dataset.
Analyzing Process Data from Problem-Solving Items with N-Grams: Insights from a Computer-Based Large-Scale Assessment
A contiguous sequence of n items from a given sequence of text or speech in the fields of computational linguistics and probability. Items can be phonemes, syllables, letters, words, or actions depending on the application.
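Because this definition allows any kind of item, the same idea can be sketched over an arbitrary sequence. The example below is a rough illustration, assuming a made-up log of test-taker actions; the function name `ngrams` and the action labels are hypothetical and do not come from the chapter.

```python
from collections import Counter
from typing import Hashable, Sequence


def ngrams(items: Sequence[Hashable], n: int) -> list[tuple]:
    """Return all contiguous runs of n items, each as a tuple."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


# Hypothetical action log; the labels are invented for illustration.
actions = ["open", "click", "type", "click", "type", "submit"]

# Count how often each pair of adjacent actions occurs.
bigram_counts = Counter(ngrams(actions, 2))
print(bigram_counts.most_common(1))  # [(('click', 'type'), 2)]
```

Returning tuples keeps the items hashable as a unit, so the n-grams can be counted directly whether the items are phonemes, syllables, letters, words, or actions.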
An Extensive Text Mining Study for the Turkish Language: Author Recognition, Sentiment Analysis, and Text Classification
N-grams are the n-element subsequences of a word. When N equals 1, 2, or 3, the N-gram is called a unigram, bigram, or trigram, respectively.
Using Computational Text Analysis to Explore Open-Ended Survey Question Responses
A contiguous sequence of “n” items (words), from unigram (one-gram) to bigram, three-gram, four-gram, and so on.
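As a rough illustration of word n-grams of increasing order, the sketch below lists the unigrams through four-grams of one short response. The whitespace tokenization, the function name `word_ngrams`, and the sample sentence are simplifying assumptions; real survey text would usually need proper tokenization first.

```python
def word_ngrams(text: str, n: int) -> list[str]:
    """Return all runs of n adjacent words, using naive whitespace tokenization."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical open-ended response, invented for illustration.
response = "the survey was too long"

for n in range(1, 5):
    print(n, word_ngrams(response, n))
# 1 ['the', 'survey', 'was', 'too', 'long']
# 2 ['the survey', 'survey was', 'was too', 'too long']
# 3 ['the survey was', 'survey was too', 'was too long']
# 4 ['the survey was too', 'survey was too long']
```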