Missing data is a common problem in real data sets, and imputation techniques are normally used to alleviate it. Imputation fills in missing data with plausible values to produce a complete data set. In this chapter, we analyze the performance of several traditional data imputation methods and propose a new fuzzy imputation approach based on ordered weighted averaging (OWA) operators and the majority concept. To model the majority concept, we propose the use of neat OWA operators and linguistic quantifiers, with two fusion strategies for the aggregation operators.
Key Terms in this Chapter
Fuzzy (related to fuzzy set theory): Fuzzy set theory is an extension of classical set theory used in fuzzy logic. In classical set theory, the membership of elements in relation to a set is assessed in binary terms according to a crisp condition; an element either belongs or does not belong to the set. In contrast, fuzzy set theory permits the gradual assessment of the membership of elements in relation to a set; this is described with the aid of a membership function.
MA-OWA: MA-OWA is an aggregation operator of the neat OWA family. This operator obtains representative values so that the majority of the elements are considered without omitting the other groups.
Majority Imputation: It is a modification of traditional imputation methods where the majority concept is included through the OWA operators and fuzzy quantifiers.
Average: In mathematics, an average or central tendency of a data set refers to a measure of the middle of the data set. There are many different descriptive statistics that can be chosen as a measurement of the central tendency. The most common method, and the one generally referred to simply as the average, is the arithmetic mean.
Majority: In its political sense (majoritarianism), it is a philosophy or agenda that asserts that a majority (sometimes categorized by religion, language, or some other identifying factor) of the population is entitled to a certain degree of primacy in society, and has the right to make decisions that affect the society.
Semantics: Semantics refers to the aspects of meaning that are expressed in a language, code, or other form of representation.
Normalization: It is a mathematical process that adjusts for differences among data from varying sources in order to create a common basis for comparison.
Quantification: It is a construct that specifies the quantity of individuals of the domain of discourse that apply to (or satisfy) an open formula.
Imputation: Imputation is the prediction of a missing value based on some procedure, using a mathematical model in combination with available information.
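OWA (Ordered Weighted Averaging): It is an aggregation technique lying between the logical "or" and "and". Formally, it is defined in dimension $n$ as a mapping $F: \mathbb{R}^n \to \mathbb{R}$ with an associated $n$-vector $W = (w_1, \dots, w_n)$ such that $w_j \in [0, 1]$ and $\sum_{j=1}^{n} w_j = 1$, where $F(a_1, \dots, a_n) = \sum_{j=1}^{n} w_j b_j$ and $b_j$ is the $j$-th largest of the $a_i$.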
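As an illustration of the OWA term, the following is a minimal Python sketch of the standard OWA operator: the arguments are sorted in descending order and combined with a weighting vector whose entries lie in [0, 1] and sum to 1. The function names `owa`, `quantifier_weights`, and `impute_with_owa` are hypothetical, and the quantifier Q(r) = r^alpha is assumed here only as one common way to derive weights from a linguistic quantifier. It is not the MA-OWA or majority imputation method proposed in the chapter.

```python
import math


def owa(values, weights):
    """Standard OWA: weighted sum of the values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 or w > 1 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    ordered = sorted(values, reverse=True)  # b_j = j-th largest argument
    return sum(w * b for w, b in zip(weights, ordered))


def quantifier_weights(n, alpha):
    """Weights w_i = Q(i/n) - Q((i-1)/n) for the assumed quantifier Q(r) = r**alpha."""
    return [(i / n) ** alpha - ((i - 1) / n) ** alpha for i in range(1, n + 1)]


def impute_with_owa(column, alpha=1.0):
    """Fill missing entries (None) with an OWA aggregate of the observed values.

    Illustrative only: a single aggregated value replaces every missing entry.
    """
    observed = [v for v in column if v is not None]
    fill = owa(observed, quantifier_weights(len(observed), alpha))
    return [fill if v is None else v for v in column]


# The weights move the operator between the logical "or" (max) and "and" (min).
data = [3.0, 1.0, 2.0]
as_max = owa(data, [1.0, 0.0, 0.0])      # all weight on the largest value
as_min = owa(data, [0.0, 0.0, 1.0])      # all weight on the smallest value
as_mean = owa(data, [1 / 3, 1 / 3, 1 / 3])  # equal weights give the arithmetic mean

# With alpha = 1 the quantifier weights are uniform, so the fill is the mean.
filled = impute_with_owa([1.0, None, 3.0], alpha=1.0)
```

With weights (1, 0, ..., 0) the operator returns the maximum, with (0, ..., 0, 1) the minimum, and with equal weights the arithmetic mean, which is the sense in which OWA lies between "or" and "and".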