Serialized Co-Training-Based Recognition of Medicine Names for Patent Mining and Retrieval

Na Deng, Caiquan Xiong
Copyright: © 2020 | Volume: 16 | Issue: 3 | Pages: 21
ISSN: 1548-3924 | EISSN: 1548-3932 | EISBN13: 9781799804994 | DOI: 10.4018/IJDWM.2020070105
Cite Article

MLA

Deng, Na, and Caiquan Xiong. "Serialized Co-Training-Based Recognition of Medicine Names for Patent Mining and Retrieval." International Journal of Data Warehousing and Mining (IJDWM), vol. 16, no. 3, 2020, pp. 87-107. http://doi.org/10.4018/IJDWM.2020070105

APA

Deng, N., & Xiong, C. (2020). Serialized Co-Training-Based Recognition of Medicine Names for Patent Mining and Retrieval. International Journal of Data Warehousing and Mining (IJDWM), 16(3), 87-107. http://doi.org/10.4018/IJDWM.2020070105

Chicago

Deng, Na, and Caiquan Xiong. "Serialized Co-Training-Based Recognition of Medicine Names for Patent Mining and Retrieval." International Journal of Data Warehousing and Mining (IJDWM) 16, no. 3 (2020): 87-107. http://doi.org/10.4018/IJDWM.2020070105

Abstract

Chinese word segmentation and named entity recognition are key steps in the retrieval and mining of traditional Chinese medicine (TCM) patents. However, many traditional Chinese medicines are known by multiple aliases, which poses a major challenge to both tasks and directly degrades the results of patent mining. Because a comprehensive thesaurus of Chinese herbal medicine names is lacking, conventional thesaurus-based word segmentation and named entity recognition are poorly suited to identifying medicines in TCM patents. To address this, a modified and serialized co-training method is proposed that uses the linguistic and structural characteristics of TCM patent texts to recognize medicine names in patent abstracts. Experiments show that the method maintains high accuracy at relatively low time complexity. The method can also be extended to recognize other named entities in TCM patents, such as disease names and preparation methods.
