The aim of this paper is to present a typology of career paths in France, obtained with the Kohonen algorithm, and to extend the use of Self Organizing Maps (SOMs) to a clustering method for life history analysis. Several methods have previously been proposed for transforming qualitative information into quantitative information so that clustering algorithms based on the Euclidean distance, such as SOMs, can be applied. Our approach consists in a quantitative encoding based on the proximities between labor market situations across time. Because SOMs preserve topology, they also make it possible to check whether this new encoding preserves the particularities of the life history according to our economic approach to careers. Lastly, this quantitative encoding, a preprocessing step which can easily be applied to life history analysis methods, completes the set of methods extending the use of SOMs to qualitative data.
Several methods are generally used to study the dynamic aspects of careers. Methods of the first kind, which estimate reduced-form transition models, have been extensively used in labor microeconometrics, with event-history models for continuous-time data or Markov processes for discrete-time panel data. Those of the second kind, which include the method presented here, are sequence analysis methods dealing with complex information about individual labor market histories, such as the various states undergone, the duration of the spells, multiple transitions between the states, etc. The idea is to empirically generate a statistical typology of sequences by performing cluster analysis (Lebart, 2006). This approach thus makes it possible to define “cluster paths”, which can be treated as endogenous variables and explained in terms of individual characteristics such as gender, educational level or parental socio-economic status. The optimal matching method, which has been widely used in social science since the pioneering paper by Abbott (Abbott & Hrycak, 1990), is an attractive solution for analysing longitudinal data of this kind. The basic idea underlying this method is to take a pair of sequences and calculate the cost of transforming one into the other by performing a series of elementary operations (insertion, deletion and substitution). However, this method has been heavily criticized because it may be difficult to determine the costs of these elementary operations. Here we adopt another strategy. First, in order to classify sequences into groups, we define a measure of the distance between trajectories that is coherent with our data and with some well-known theoretical hypotheses in the field of labor economics. We then use Self Organizing Maps (the Kohonen algorithm) for classification purposes.
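To make the optimal matching idea mentioned above concrete, the following minimal sketch computes the cost of transforming one sequence of states into another by dynamic programming. It is an illustration only and not the method adopted in this paper: the state labels ("E" for employment, "U" for unemployment), the function name and the default costs are all assumptions chosen for the example.

```python
def optimal_matching_distance(a, b, indel_cost=1.0, sub_cost=None):
    """Minimum total cost of turning sequence a into sequence b.

    indel_cost and sub_cost are illustrative assumptions; choosing them
    is exactly the difficulty for which the method is criticized.
    """
    if sub_cost is None:
        # Assumed default: substituting two different states costs 2.
        sub_cost = lambda x, y: 0.0 if x == y else 2.0
    n, m = len(a), len(b)
    # d[i][j] holds the cheapest way to turn a[:i] into b[:j].
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel_cost          # delete every state of a[:i]
    for j in range(1, m + 1):
        d[0][j] = j * indel_cost          # insert every state of b[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + indel_cost,                        # deletion
                d[i][j - 1] + indel_cost,                        # insertion
                d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitution
            )
    return d[n][m]


# Two hypothetical four-period careers.
print(optimal_matching_distance("EEUU", "EEEU"))  # 2.0
# The same pair under a cheaper, equally arbitrary substitution cost.
cheap = lambda x, y: 0.0 if x == y else 1.0
print(optimal_matching_distance("EEUU", "EEEU", sub_cost=cheap))  # 1.0
```

The two calls compare the same pair of careers and return different distances, which shows how strongly the resulting typology can depend on the chosen operation costs.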
Self Organizing Maps (see Kohonen, 2001; Fort, 2006) are known to be a powerful clustering and projection method. Since this method accounts efficiently for changes occurring with time, SOMs yield accurate predictions (see for example Cottrell, Girard & Rousset, 1998; Dablemont, Simon, Lendasse, Ruttiens, Blayo & Verleysen, 2003; Souza, Barreto & Mota, 2005). Life histories can be considered as a qualitative record of information, whereas SOMs are based on the Euclidean distance. Many attempts have been made to transform qualitative variables into quantitative ones, for example by using the Burt description (see the KACM presentation in Cottrell & Letremy, 1995) or multidimensional scaling (Miret, Garcia-Lagos, Joya, Arazoza & Sandoval, 2005). In our approach, the quantitative recoding focuses on the proximity between items, taking into account the particularities of the data (a life history) according to our economic approach. Once this recoding has been performed as a preprocessing step, Self Organizing Maps are a useful clustering tool, first because of the aforementioned clustering and projection qualities and also because of their ability to reveal the efficiency of our new encoding.
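As a rough illustration of the pipeline described above (numeric recoding followed by a Kohonen map), the sketch below trains a small one-dimensional SOM on encoded sequences. It rests on stated assumptions: the numeric codes in `STATE_CODE` are a hypothetical placeholder and not the proximity-based encoding developed in this paper, and the map size, learning rate and neighbourhood schedule are arbitrary choices made for the example.

```python
import math
import random

# Hypothetical codes placing "training" (T) between employment (E) and
# unemployment (U); the paper's actual encoding is not reproduced here.
STATE_CODE = {"E": 0.0, "T": 0.5, "U": 1.0}


def encode(sequence):
    """Turn a string of states into a numeric vector."""
    return [STATE_CODE[s] for s in sequence]


def sq_dist(u, v):
    """Squared Euclidean distance (same minimiser as the Euclidean distance)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda k: sq_dist(weights[k], x))


def train_som(data, n_units=4, n_iter=500, seed=0):
    """Online Kohonen algorithm on a string (1-D) map of n_units units."""
    rng = random.Random(seed)
    # Initialise each unit with a randomly drawn observation.
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    for t in range(n_iter):
        frac = t / n_iter
        lr = 0.5 * (1.0 - frac)                          # decreasing step
        radius = max(n_units / 2.0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
        x = rng.choice(data)
        bmu = best_matching_unit(weights, x)
        for k in range(n_units):
            # Gaussian neighbourhood measured on the map, not in data space.
            h = math.exp(-((k - bmu) ** 2) / (2.0 * radius ** 2))
            weights[k] = [w + lr * h * (xi - w) for w, xi in zip(weights[k], x)]
    return weights


# Hypothetical four-period careers.
careers = ["EEEE", "EEEU", "UUUU", "UUUE", "TTEE", "TEEE"]
data = [encode(c) for c in careers]
som = train_som(data)
clusters = {c: best_matching_unit(som, encode(c)) for c in careers}
print(clusters)
```

Because neighbouring units are updated together, careers assigned to adjacent units tend to be close under the chosen encoding, which is the topology preservation property that can be used to judge whether an encoding behaves sensibly.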