Anatomical Therapeutic Chemical Classification (ATC) With Multi-Label Learners and Deep Features


Loris Nanni, Sheryl Brahnam, Gianluca Maguolo
Copyright: © 2020 |Pages: 14
DOI: 10.4018/IJNCR.2020070102


Automatic anatomical therapeutic chemical (ATC) classification predicts an unknown compound's therapeutic and chemical characteristics. Predicting the organs/systems an unidentified compound will act on could expedite drug development and research. Because a given compound can affect multiple organs/systems, automatic ATC classification is a complex problem. In this paper, the authors experimentally develop a multi-label ensemble for ATC prediction. The proposed approach extracts a 1D feature vector based on a compound's chemical-chemical interaction and its structural and fingerprint similarities to other compounds, as defined by the ATC coding system. This 1D vector is reshaped into 2D matrices and fed into seven pre-trained convolutional neural networks (CNNs). A bidirectional long short-term memory network (BiLSTM) is trained on the 1D vector. Features extracted from both deep learners are then used to train multi-label classifiers, and the classifiers' results are fused. The best system proposed here is shown to outperform other methods reported in the literature.
Article Preview


The cost of developing a single new drug is now estimated to exceed two billion dollars, and the approval process often takes decades. According to a recent study by Wong and Siah (2019), only about fourteen percent of all newly developed drugs pass through clinical trials. Many of these drugs fail because they lack efficacy or have adverse side-effects (Pitts, 2014). The problem of accurately predicting the therapeutic indications and side-effects of new drugs has led many researchers to work on developing machine learning (ML) systems to classify compounds based on Anatomical Therapeutic Chemical (ATC) classes. ML systems that filter out drugs with a significant probability of failing in clinical trials offer the promise of accelerating drug development and reducing costs.

The ATC coding system, controlled by the World Health Organization, categorizes drugs into overlapping classes at five different levels based on their therapeutic, pharmacological, and chemical properties and on the organs or systems the drugs act on. The first level identifies the broad anatomical groups a drug targets by coding it with one of fourteen letters:

  • A: Alimentary tract and metabolism

  • B: Blood and blood-forming organs

  • C: Cardiovascular system

  • D: Dermatologicals

  • G: Genitourinary system and sex hormones

  • H: Systemic hormonal preparations, excluding sex hormones and insulins

  • J: Anti-infectives for systemic use

  • L: Antineoplastic and immunomodulating agents

  • M: Musculoskeletal system

  • N: Nervous system

  • P: Antiparasitic products, insecticides, and repellents

  • R: Respiratory system

  • S: Sensory organs

  • V: Various

Levels 2 through 4 of the ATC coding system mostly represent pharmacological subgroups, whereas level 5 identifies the chemical substances. A drug is given one or more ATC codes based on its memberships in the different classes contained in these five levels. Acetylsalicylic acid, or Aspirin, for example, has three ATC codes since it functions as a local oral treatment (level 1 group A), as a platelet inhibitor (level 1 group B), and as an analgesic and antipyretic (level 1 group N).
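As a rough illustration of how the five levels nest inside a single code, the following Python sketch splits a full ATC code into its level prefixes. The seven-character layout (letter, two digits, letter, letter, two digits) and the three Aspirin codes used here are taken from the published WHO ATC index rather than from this article, and `atc_levels` is a hypothetical helper name.

```python
import re

# Assumed layout of a full ATC code such as "B01AC06":
# level 1 = one letter, level 2 = two digits, level 3 = one letter,
# level 4 = one letter, level 5 = two digits.
ATC_PATTERN = re.compile(r"^([A-Z])(\d{2})([A-Z])([A-Z])(\d{2})$")


def atc_levels(code):
    """Return the five nested level prefixes of a full ATC code."""
    match = ATC_PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a full 7-character ATC code: {code!r}")
    parts = match.groups()
    # Level k is the concatenation of the first k components.
    return ["".join(parts[:k]) for k in range(1, 6)]


# Aspirin's three codes (from the WHO index, not from this article).
aspirin_codes = ["A01AD05", "B01AC06", "N02BA01"]

for code in aspirin_codes:
    print(code, "->", atc_levels(code))
```

The first element of each returned list is the level 1 anatomical group, which is the part of the code the classification task described here tries to predict.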

Although the ATC classification system has become an essential tool for providing guidance to drug developers regarding the potential clinical value of a compound, only a fraction of all pharmaceuticals has been assigned ATC codes. That so few drugs have been labeled is due to the labor-intensive experimental methods required to identify a new drug's ATC classes. This bottleneck has resulted in the proposal of many ML methods and the establishment of some web servers capable of performing automatic ATC classification (Dunkel, Günther, Ahmed, Wittig, & Preissner, 2008; Wu, Ai, Liu, & Fan, 2013). Most research in this area, including the study presented here, focuses on identifying the fourteen organ/system classes at the first level of the ATC coding system. As illustrated with Aspirin, predicting a compound’s level 1 classification is a complex multi-label problem.
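To make the multi-label framing concrete, the sketch below encodes a compound's level 1 memberships as a 14-element binary target vector, one position per anatomical group. This is only an assumed, conventional multi-hot encoding; the article preview does not specify how the authors represent their labels, and `encode_level1` is a hypothetical helper name.

```python
# The fourteen level 1 group letters, in the order listed above.
LEVEL1_CLASSES = "ABCDGHJLMNPRSV"


def encode_level1(groups):
    """Map a collection of level 1 letters to a 14-element 0/1 vector."""
    labels = {g.upper() for g in groups}
    unknown = labels - set(LEVEL1_CLASSES)
    if unknown:
        raise ValueError(f"unknown level 1 groups: {sorted(unknown)}")
    return [1 if letter in labels else 0 for letter in LEVEL1_CLASSES]


# Aspirin belongs to groups A, B, and N, so three positions are set.
aspirin_target = encode_level1(["A", "B", "N"])
print(aspirin_target)
```

A single-label classifier would have to pick just one of these positions, whereas a multi-label learner is expected to predict the whole vector, which is why a compound such as Aspirin cannot be handled as an ordinary 14-way classification.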
