Development of an Information Research Platform for Data-Driven Agriculture

Takahiro Kawamura, Tetsuo Katsuragi, Akio Kobayashi, Motoko Inatomi, Masataka Oshiro, Hisashi Eguchi
DOI: 10.4018/IJAEIS.302908

Abstract

Comprehensive research data are widely acknowledged as necessary for accelerating research, and research institutes and universities are developing research data management systems. The National Agriculture and Food Research Organization of Japan (NARO) developed NARO-linked databases (Narolin DBs) in addition to a supercomputer. In the Narolin DB, various research data on agriculture are cataloged using common metadata. The relationships among complicated data in natural science are described in RDF, property graph, or RDB format to facilitate the application of statistical analysis and machine learning. Our system is unique in that it connects a data catalog, a private cloud database, a supercomputer for data analysis, and a data/service portal for business applications, so that the whole can be used as a data pipeline. Through the development of agricultural information research platforms, NARO will accelerate data-driven agricultural research at various stages in the agricultural supply chain, ranging from genome analysis to plant breeding, cultivation, food processing, and food distribution.

Introduction

In recent years, the use of information and communication technology (ICT) has increased even in agriculture, and research data have been digitized in large volumes. By appropriately collecting and managing these research data, sharing them to the extent necessary, and applying data science and AI technology, such as statistical analysis and machine learning, higher efficiency and profitability in agriculture and the creation of interdisciplinary agricultural research beyond conventional fields and regions are expected. These approaches are called “smart agriculture” or “data-driven agriculture.” They also require the strategic utilization of research data that have not previously been shared: opening the data to external organizations, especially in the area of international cooperation (open data); sharing them among consortium parties, such as those in national projects (disclosed data); or concealing them for competitive utilization (closed data). Smart agriculture and data-driven agriculture have been around for a while; however, they are still not widely practiced, and the use of data remains insufficient. One reason is that many people, such as researchers, agricultural companies, and farmers, do not know what data are available, what they are for, and how they can be used. The research issue addressed in this paper is therefore: How can we make better use of data in the agricultural sector? In other words, as a national research institute, what kind of comprehensive mechanism should we prepare to promote data utilization?

The National Agriculture and Food Research Organization (NARO), the largest agricultural research body, composed of several institutes, was established in FY 2016 through the integration of multiple institutes under the Ministry of Agriculture, Forestry and Fisheries (MAFF) in Japan. NARO found it necessary to establish a unified research data strategy and infrastructure to promote data-driven agricultural research and development while also coordinating research data across diverse research fields. These fields include genomic information on animals, plants, and microorganisms; breeding information; cultivation and growth management information, including viruses and pest control; food processing and distribution information; robotic tractor design; and environmental information, such as climate change and soil information. According to a preliminary survey conducted in 2019, approximately 60% of the research data were stored on researchers' PCs and hard disk drives (HDDs), which delayed data sharing inside and outside NARO.

Therefore, NARO developed an agricultural information research platform that includes NARO-linked databases of agriculture-related data (Narolin DB), a supercomputer that uses the databases for AI operations such as machine learning, and a data distribution system called WAGRI. In response to the research issue above, the platform is provided as a pipeline that collects data, adds unified metadata in the primary DB to make the data easy to discover, prepares the data format in the secondary DB as pre-processing for analysis, makes the data easy for non-ICT experts in the agricultural sector to use through the data analysis environment provided by the AI supercomputer, and provides data via API through WAGRI to private companies that serve farmers as end users. In addition to the development of these computational systems, an integrated mechanism was created that includes institutional design for human systems, such as the formulation of terms of use for research data and AI education. Thus, the contribution of this paper is to share the knowledge and experience acquired in the design and implementation of a comprehensive mechanism to promote data utilization in the agricultural sector.
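
The first two pipeline stages described above (cataloging data with unified metadata so that it can be discovered, then preparing a uniform format for analysis) can be illustrated with a minimal Python sketch. This is not NARO's implementation: the class and function names, the metadata fields, and the sample values are all hypothetical, and the three access levels simply mirror the open, disclosed, and closed categories mentioned in the introduction.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Access(Enum):
    """Sharing levels named in the text (open / disclosed / closed)."""
    OPEN = "open"            # opened to external organizations
    DISCLOSED = "disclosed"  # shared among consortium parties
    CLOSED = "closed"        # concealed for competitive utilization


@dataclass
class CatalogRecord:
    """Hypothetical common-metadata record; field names are assumptions."""
    dataset_id: str
    title: str
    domain: str
    access: Access
    keywords: list[str] = field(default_factory=list)


class PrimaryCatalog:
    """Toy stand-in for a primary DB that makes datasets discoverable."""

    def __init__(self) -> None:
        self._records: dict[str, CatalogRecord] = {}

    def register(self, record: CatalogRecord) -> None:
        if record.dataset_id in self._records:
            raise ValueError(f"duplicate dataset_id: {record.dataset_id}")
        self._records[record.dataset_id] = record

    def search(self, keyword: str,
               allowed: frozenset = frozenset({Access.OPEN})) -> list[CatalogRecord]:
        """Return records visible at the allowed access levels that match
        the keyword in the title (substring) or keyword list (exact)."""
        kw = keyword.lower()
        return [
            r for r in self._records.values()
            if r.access in allowed
            and (kw in r.title.lower() or kw in (k.lower() for k in r.keywords))
        ]


def to_analysis_rows(record: CatalogRecord, raw_rows: list[dict]) -> list[dict]:
    """Toy stand-in for secondary-DB pre-processing: normalize column names
    and tag every row with the dataset it came from."""
    return [
        {"dataset_id": record.dataset_id,
         **{key.strip().lower(): value for key, value in row.items()}}
        for row in raw_rows
    ]


if __name__ == "__main__":
    catalog = PrimaryCatalog()
    catalog.register(CatalogRecord("d1", "Rice yield trials", "cultivation",
                                   Access.OPEN, ["rice", "yield"]))
    catalog.register(CatalogRecord("d2", "Rice genome variants", "genome",
                                   Access.CLOSED, ["rice"]))
    for rec in catalog.search("rice"):
        print(rec.dataset_id, rec.title)
```

In this sketch, an external user who searches with the default access level sees only the open dataset, while the same query with a wider `allowed` set would also return the closed one. A real catalog would additionally need persistent storage, authentication, and an agreed metadata vocabulary.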

The rest of this paper is organized as follows. The second section outlines the systems and services related to research data. The third section describes the structure of NARO’s information research platform and NARO’s common metadata. The fourth section introduces the supercomputer for AI operations, and the fifth section introduces WAGRI for data distribution to businesses. In the sixth section, the authors compare their platform with existing systems and present case studies. The final section provides a summary and explores future issues and institutional measures for disseminating the data-driven approach.
