A E-Business Case of Study: Modelling the Quality of the Wine using its Physicochemical and Qualitative Properties

Maria Vargas-Vera (Adolfo Ibanez University, Viña del Mar, Chile), Camilo Salles (Adolfo Ibanez University, Viña del Mar, Chile), Joaquin Parot (Adolfo Ibanez University, Viña del Mar, Chile) and Sebastian Letelier (Adolfo Ibanez University, Viña del Mar, Chile)
Copyright: © 2017 | Pages: 20
DOI: 10.4018/IJKSR.2017070101


The main purpose of this research was to find relations between the chemical composition of wines and wine tasters' opinions of wine quality. Our study used a dataset containing examples of red wine from Vinho Verde, Portugal. First, we analysed the attributes of the examples in the dataset to find correlations between the quantitative and qualitative properties of the wines. Second, we performed clustering using the k-means and x-means algorithms. Additionally, we used the J48 algorithm to build a decision tree, from which we then extracted first-order logic rules. We concluded that there is a relation between the physicochemical properties and the quality of wines. This result opens the possibility of further analysis, which could perhaps reduce the number of wine tasters needed and thereby benefit the wine industry.

Many centuries ago, wine was a luxury good; nowadays, however, the cost of wine has dropped, and wine consumption has therefore spread to wider sectors of the population worldwide. To our knowledge, the wine industry has invested in modern technologies to improve wine production and sale processes (Ferrer et al., 2008). Two important aspects considered by the wine industry are “wine certification” and “quality assessment.” Certification is the legal aspect of the wine industry: it prevents the illegal adulteration of wines and assures quality for the wine market. Wine certification is generally assessed by physicochemical and sensory tests (Ebeler, 1999). Physicochemical laboratory tests used to characterize wine include the determination of density, alcohol or pH values, while sensory tests (such as taste, colour, smell and texture, among others) rely mainly on human experts. Taste is the least understood of the human senses (Smith & Margolskee, 2006), so wine classification is a difficult task even for humans. The relationships between physicochemical and sensory analysis are complex and not fully understood (Legin et al., 2003). Even so, we believe that the main goal (analysis of the quality of the wine) relies on finding relationships between the qualitative and quantitative properties of wine. In this way, we could predict the quality of a wine a priori, without testing it. The motivation behind this project is thus to contribute technology to the development of the wine industry.

Advances in information technology have made it possible to store, manage and process big datasets. In particular, Data Mining plays an important role by helping users understand their data and find relevant patterns in it. Data Mining techniques aim at extracting high-level knowledge from raw data. However, using Data Mining methods requires both variable selection and model selection. Variable selection is useful for discarding irrelevant inputs, leading to simpler models that are easier to interpret and that usually give better performance. Model selection matters because complex models may over-fit the data, losing the capability to generalize, while a model that is too simple may have limited learning capabilities (Agrawal et al., 1993).
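As a rough illustration of variable selection, the sketch below ranks input attributes by the absolute value of their Pearson correlation with the quality score and keeps those above a cut-off. This is only one simple filter-style approach and is not claimed to be the procedure used in the study. The function names, the 0.5 threshold and the toy values are all assumptions made for illustration; the numbers are made up and are not taken from the wine dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / sqrt(var_x * var_y)


def select_variables(columns, target, threshold):
    """Keep the inputs whose |correlation| with the target reaches the threshold."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    return kept, scores


# Made-up toy values (hypothetical, NOT from the wine dataset); they make no
# claim about which attributes actually matter for wine quality.
toy_columns = {
    "alcohol": [9.0, 10.0, 11.0, 12.0, 13.0],
    "pH": [3.2, 3.4, 3.3, 3.4, 3.2],
}
toy_quality = [4, 5, 6, 7, 8]

kept, scores = select_variables(toy_columns, toy_quality, threshold=0.5)
print(kept)
```

A linear correlation filter of this kind misses non-linear relations, which is one reason to combine it with the clustering and decision tree analyses described in the paper.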

The experiments presented in this paper were carried out using the wine dataset available from (Lichman, 2013). We selected Portuguese wine for two reasons: a) Portugal was one of the top ten wine exporting countries, with 3.17% of the market share in 2005 (FAOSTAT, 2005), and b) the dataset was at our disposal (Lichman, 2013). The software used in the experiments was WEKA, an open-source tool that contains several algorithms for Data Mining (Hall et al., 2009). Our main contribution is an analysis of wine quality that could later be used locally by the Chilean wine industry. The reason behind this decision is that Chile is also among the top ten wine producers (FAOSTAT, 2005). Therefore, we believe that research on wine quality (finding relationships between qualitative and quantitative properties) could greatly benefit the Chilean wine industry. The rest of the paper is organized as follows. First, it provides an overview of related work. Second, it presents our solution to the wine quality problem using clustering and classification algorithms. Additionally, it presents the first-order logic rules that show relations between the quantitative and qualitative properties of wine. These rules were obtained from the decision tree generated by the decision tree induction algorithm (J48). Finally, it gives our conclusions and future work.
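To make the idea of obtaining rules from a decision tree more concrete, here is a minimal sketch in which every root-to-leaf path of a tree becomes one IF-THEN rule. The nested-dictionary tree format, the function name, the attributes tested and the thresholds are all hypothetical and chosen only for illustration; they are not WEKA's J48 output format and not rules reported in the paper.

```python
def extract_rules(node, conditions=()):
    """Return one IF-THEN rule string per root-to-leaf path of the tree."""
    if "label" in node:  # leaf node: emit the accumulated conditions
        body = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {body} THEN quality = {node['label']}"]
    attr, thr = node["attribute"], node["threshold"]
    # Left branch: attribute <= threshold; right branch: attribute > threshold.
    rules = extract_rules(node["left"], conditions + (f"{attr} <= {thr}",))
    rules += extract_rules(node["right"], conditions + (f"{attr} > {thr}",))
    return rules


# Hypothetical tree: attributes and thresholds are invented for illustration
# and do NOT come from the wine dataset or from the paper's results.
toy_tree = {
    "attribute": "alcohol",
    "threshold": 10.5,
    "left": {"label": "low"},
    "right": {
        "attribute": "pH",
        "threshold": 3.3,
        "left": {"label": "high"},
        "right": {"label": "medium"},
    },
}

rules = extract_rules(toy_tree)
for rule in rules:
    print(rule)
```

Each rule is the conjunction of the tests along one path, so a tree with three leaves yields three rules; the same traversal would apply to a tree exported from a tool such as WEKA once it is converted into a comparable nested structure.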
