InfoScipedia
A Free Service of IGI Global Publishing House
Below please find a list of definitions for the term that
you selected from multiple scholarly research resources.

What is Regex

Machine Learning for Societal Improvement, Modernization, and Progress
An abbreviation of the term “regular expression”: a string of characters that defines a pattern for matching, locating, and managing text. The concept was developed in theoretical computer science and formal language theory.
Published in Chapter:
Deep Learning for Information Extraction From Digital Documents: An Innovative Approach to Automatic Parsing and Rich Text Extraction From PDF Files
Yavuz Kömeçoğlu (Kodiks Bilişim, Turkey), Serdar Akyol (Kodiks Bilişim, Turkey), Fethi Su (Kodiks Bilişim, Turkey), and Başak Buluz Kömeçoğlu (Gebze Technical University, Turkey)
DOI: 10.4018/978-1-6684-4045-2.ch009
Abstract
Print-oriented PDF documents are excellent at preserving the position of text and other objects but are difficult to process. Processable PDF documents can meet the distinct needs of different sectors by enabling innovations such as searching within documents, linking across documents, or restructuring content into a format that improves the reading experience. This chapter presents a deep learning-based system design that aims to export clean text content, separate all visual elements, and extract rich information from the content without losing the integrated structure of content types. An F-RCNN model using the Detectron2 library was used to extract the layout, cosine similarities between the word2vec representations of the texts were used to identify related clips, and transformer language models were used to classify the clip type. On the 200-sample data set created by the researchers, performance was measured as 1.87 WER and 2.11 CER for headings and 0.22 WER and 0.21 CER for paragraphs.
More Results
Intelligent CALL: Using Pattern Matching to Learn English
Regex and Regexp stand for regular expressions, which are powerful search expressions that can match characters, words and/or strings.
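The definitions above describe regular expressions as patterns for matching, locating, and managing text. A minimal sketch of those three uses follows, written with Python's standard `re` module. The sample sentence and the date pattern are illustrative assumptions and do not come from the cited chapters.

```python
import re

# Illustrative sample text (an assumption, not taken from the cited sources).
text = "Order A-1042 shipped on 2023-05-17; order B-77 ships on 2023-06-02."

# Hypothetical pattern: dates written as four digits, two digits, two digits.
# The parentheses capture year, month, and day as groups 1, 2, and 3.
date_pattern = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")

# Matching: find the first date in the text (None if there is no match).
first = date_pattern.search(text)

# Locating: record each matched date with its start and end offsets.
locations = [(m.group(0), m.start(), m.end()) for m in date_pattern.finditer(text)]

# Managing: rewrite every date from YYYY-MM-DD to DD/MM/YYYY.
rewritten = date_pattern.sub(r"\3/\2/\1", text)

print(first.group(0))
print(locations)
print(rewritten)
```

Note that the order codes `A-1042` and `B-77` are left untouched, because neither fits the full digit-hyphen-digit shape the pattern requires.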