Information Extraction of Protein Phosphorylation from Biomedical Literature

M. Narayanaswamy; K. E. Ravikumar; Z. Z. Hu; K. Vijay-Shanker; C. H. Wu

doi:10.4018/978-1-60566-274-9.ch009

Source Title: Information Retrieval in Biomedicine: Natural Language Processing for Knowledge Integration

Abstract

Protein posttranslational modification (PTM) is a fundamental biological process, yet few text mining systems currently focus on PTM information extraction. A rule-based text mining system, RLIMS-P (Rule-based LIterature Mining System for Protein Phosphorylation), was recently developed by our group to extract protein substrates, kinases and phosphorylated residues/sites from MEDLINE abstracts. This chapter covers the evaluation and benchmarking of RLIMS-P and highlights some novel and unique features of the system. The extraction patterns of RLIMS-P capture a range of lexical, syntactic and semantic constraints found in sentences expressing phosphorylation information. RLIMS-P also has a second phase that puts together information extracted from different sentences. This is an important feature, since the kinase, substrate and site of phosphorylation are not commonly mentioned in the same sentence. Small modifications to the rules for extraction of phosphorylation information have also allowed us to develop systems for extraction of two other PTMs, acetylation and methylation. A thorough evaluation of these two systems has yet to be completed. Finally, an online version of RLIMS-P with enhanced functionalities, namely phosphorylation annotation ranking, evidence tagging, and protein entity mapping, has been developed and is publicly accessible.

Introduction

Protein posttranslational modification (PTM), a molecular event in which a protein is chemically modified during or after translation, is essential to many biological processes. Protein phosphorylation, one of the most common PTMs, involves the addition of a phosphate group to serine, threonine or tyrosine residues of a protein and is fundamental to cell metabolism, growth and development. Many cellular signal transduction pathways are activated through phosphorylation of specific proteins that initiate a cascade of protein-protein interactions, leading to specific gene regulation and cellular response. It is estimated that one third of the mammalian genome coding sequences code for phosphoproteins. The phosphorylation state of cellular proteins is also highly dynamic. Detection, quantification and functional analysis of the dynamic phosphorylation status of proteins, and of the kinases involved, are essential for understanding the regulatory networks of biological pathways and processes, which are under extensive investigation in many areas of biological research.

While PTMs are fundamental to our understanding of cellular processes, the experimental PTM data are largely buried in free-text literature. For example, a recent PubMed query for protein phosphorylation returned 103,478 papers. Although PTMs, especially phosphorylation, are among the most important protein features annotated in protein databases, only a limited amount of data is currently annotated in a few resources, such as the UniProt Knowledgebase (UniProtKB) (Wu et al., 2006) and specialized databases including Phospho.ELM and PhosphoSite, which cannot keep up with the fast-growing literature. With the increasing volume of scientific literature now available electronically, efficient text mining tools will greatly facilitate the extraction of information buried in free text. Extraction of PTM information on the specific proteins, the sites/residues being modified, and the enzymes involved in the modification is particularly useful: it can assist database curation of protein site features and related pathway or disease information, allow users to quickly browse and analyze the literature, and help other bioinformatics software integrate a text mining component into pathway and network analysis.

Many BioNLP relation extraction systems have been developed in the past few years. Some of these employ special rule/pattern-based approaches (e.g., Blaschke et al., 1999; Pustejovsky et al., 2002). Other approaches for extracting protein-protein interactions include detecting co-occurring proteins (Proux et al., 2000; Stapley and Benoit, 2000; Stephens et al., 2001), or using a text parser tailored for the specialized language typically found in the biology literature (e.g., Friedman et al., 2001; Daraselia et al., 2004). The rule-based approach involves designing patterns to extract specific types of information, while the parser approach requires the development of grammars, methods for disambiguation, and further effort to map parse information to the objects involved in the relation. More modern approaches employ machine learning for relation extraction (e.g., Bunescu and Mooney; Gioliana et al.). Such methods require an annotated corpus in which the sentences are manually marked with the relation and the related objects. Machine learning techniques are then employed to learn a model that extracts relations from unseen text.
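
To make the contrast between co-occurrence detection and pattern-based extraction concrete, the following is a minimal sketch in Python. It is not the RLIMS-P rule set: the single pattern, the function names `extract` and `cooccurring_pairs`, and the example sentences are assumptions chosen purely for illustration.

```python
import re
from itertools import combinations

# Hypothetical residue/site expression, e.g. "Ser133" or "tyrosine 15".
SITE = r"(?:Ser|Thr|Tyr|serine|threonine|tyrosine)[- ]?\d+"

# One illustrative active-voice pattern:
#   "<kinase> phosphorylates <substrate> [at|on <site>]"
# A real rule-based system would need many such patterns.
PATTERN = re.compile(
    r"(?P<kinase>[A-Za-z0-9-]+) phosphorylates (?P<substrate>[A-Za-z0-9-]+)"
    r"(?: (?:at|on) (?P<site>" + SITE + r"))?"
)


def extract(sentence):
    """Return the kinase, substrate and site roles, or None if the pattern fails."""
    match = PATTERN.search(sentence)
    if match is None:
        return None
    return {
        "kinase": match.group("kinase"),
        "substrate": match.group("substrate"),
        "site": match.group("site"),
    }


def cooccurring_pairs(sentence, protein_names):
    """Co-occurrence baseline: pair up known protein names found in the sentence."""
    found = [
        name for name in protein_names
        if re.search(r"\b" + re.escape(name) + r"\b", sentence)
    ]
    return list(combinations(found, 2))


active = "PKA phosphorylates CREB at Ser133"
passive = "CREB is phosphorylated by PKA"
print(extract(active))
print(extract(passive))
print(cooccurring_pairs(passive, ["PKA", "CREB"]))
```

In this sketch the pattern assigns roles (kinase, substrate, site) but misses the passive sentence, whereas the co-occurrence baseline finds the protein pair in either sentence without indicating which protein is the kinase. This is the trade-off that motivates designing patterns around the lexical and syntactic variations of a specific relation.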
