Introduction
Resumes play an important role in a person's career. Job seekers use them to find jobs, and HR staff use them to select candidates. Resumes record demographic information and work experience, so many valuable patterns can be mined from them.
Visualization techniques, which are good at storytelling, have already been used to assist analysis in the humanities (Ware, 2014). Previous work presents different aspects of humanities visualization, such as visualizing traffic data (Shi et al., 2018), table tennis data (Wu et al., 2018), and e-commerce data (Kim et al., 2019). To the best of our knowledge, however, almost no visualization work is dedicated to resumes.
Resumes are commercial secrets, so it is very difficult to obtain them in large numbers, which is the biggest obstacle to resume mining and visualization. Through offline cooperation with a social insurance payment company, we collected 372,829 resumes of Chinese people working in Beijing, with rich and valuable attributes that are not easily available, such as income and Chinese ID number.
The people in the resumes differ in age and major, so decade characteristics and major characteristics should be considered when analyzing and visualizing resumes. To assist resume data mining, we collected 1,837,281 documents from the People’s Daily from May 1946 to December 2015 and the national college entrance examination scores of 42 majors in 27 Beijing universities from 2005 to 2015 to build a multi-source dataset. From observing this dataset, we find that (1) the 1950s, 1960s, 1970s, and 1980s each have distinct characteristics; and (2) as the decades pass, different majors receive different degrees of attention.
We develop an interactive visualization system called ResumeVis to explore the correlations among the attributes. From the perspective of job seekers, users can learn from the successful experiences of their predecessors, for example, what abilities and work experience highly paid people have. From the perspective of human resources, users can check the attribute distributions. Based on the data observation above, the attention degree of a major, reflected by its number of occurrences in the People’s Daily and by the national college entrance examination scores, is integrated into the system to help users predict trends. User-friendly interactions, such as filtering elements, reordering attributes, and brushing and linking, are integrated to provide an easy-to-use interface.
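As a rough illustration of the attention-degree idea, the sketch below counts, per decade, how many documents mention a given major. The `(year, text)` document format, the function name `attention_by_decade`, and the toy data are assumptions made for this example; the measure actually used in ResumeVis may be computed differently.

```python
from collections import Counter


def attention_by_decade(documents, keyword):
    """Count the documents that mention `keyword`, grouped by decade.

    `documents` is an iterable of (year, text) pairs. This format is a
    hypothetical stand-in for a news archive, not the paper's schema.
    """
    counts = Counter()
    for year, text in documents:
        if keyword in text:
            decade = (year // 10) * 10  # e.g. 1956 -> 1950
            counts[decade] += 1
    # Return a plain dict ordered by decade for easy plotting.
    return dict(sorted(counts.items()))


# Toy data invented for illustration only.
docs = [
    (1956, "steel production and metallurgy"),
    (1958, "metallurgy students join the factories"),
    (1984, "computer science courses expand"),
    (1987, "metallurgy and computer science"),
]

print(attention_by_decade(docs, "metallurgy"))
print(attention_by_decade(docs, "computer science"))
```

A per-decade count of this kind could then be plotted next to the entrance examination scores of the same major to compare the two signals.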
The main contributions include:
- A multi-source dataset is constructed. It includes 372,829 Chinese resumes with rich and valuable attributes, such as income and Chinese ID number; 1,837,281 documents from the People’s Daily from May 1946 to December 2015; and the national college entrance examination scores of 42 majors in 27 Beijing universities from 2005 to 2015. Decade characteristics and major characteristics are observed in the multi-source dataset to assist resume data mining.
- A complete interactive visualization system called “ResumeVis” is developed to explore different aspects of resumes through multi-source data analysis, especially the attribute correlations. In the system, parallel coordinates with multi-valued attributes are proposed, which can show multiple values on one attribute axis to suit the characteristics of resumes.
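To make the notion of a multi-valued attribute concrete, the sketch below shows one possible way to prepare such a record for a parallel coordinates plot: the record is expanded into one polyline per combination of its values. This is an assumption made for illustration, not necessarily the design proposed for ResumeVis, and the record fields and the function name `expand_record` are hypothetical.

```python
from itertools import product


def expand_record(record, axes):
    """Expand a record into one polyline per combination of axis values.

    An attribute holding a list or tuple is treated as multi-valued;
    any other attribute is treated as a single value.
    """
    values_per_axis = []
    for axis in axes:
        value = record[axis]
        if isinstance(value, (list, tuple)):
            values_per_axis.append(list(value))
        else:
            values_per_axis.append([value])
    # Each combination is one polyline crossing every axis exactly once.
    return [tuple(combo) for combo in product(*values_per_axis)]


# Hypothetical resume record; "skills" is the multi-valued attribute.
resume = {
    "age": 35,
    "major": "Computer Science",
    "skills": ["Java", "SQL"],
    "income": "high",
}
axes = ["age", "major", "skills", "income"]

for line in expand_record(resume, axes):
    print(line)
```

With this expansion, a resume listing two skills yields two polylines that share every point except the one on the skills axis.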
The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 describes the multi-source data collection. Section 4 describes the observation of the multi-source data to explore the decade characteristics and major characteristics. Section 5 presents the design of the ResumeVis system. Section 6 shows case studies. Finally, Section 7 draws the conclusions.