Introduction
In recent years, with the rapid development of the mobile application market, competition among mobile applications (apps) has become increasingly fierce. In this context, user-driven software evolution has become a focus for researchers. However, most current research concentrates on user reviews, software profiles, and other explicit features that reflect user experience. Keertipati et al. (2016) found that app reviews contain valuable feedback about which features should be fixed and improved. Several studies aim to improve user experience by mining user reviews. For example, Guzman & Maalej (2014) used natural language processing techniques to identify fine-grained app features in reviews. Wu et al. (2021) proposed an approach that leverages app software profiles and user reviews to identify the key features of a given app.
In addition to user reviews and software profiles, Shu et al. (2018) pointed out that user operations such as viewing or downloading provide rich information about users' preferences and usage habits, which has great potential to advance app recommendation. However, existing studies are limited in their ability to process and analyze huge volumes of app user operation data efficiently and accurately.
In earlier studies, some commercial frameworks were applied to evaluate the usability of mobile devices. For example, Flurry Analytics (2012) provided data and analytics tools that captured how users used their mobile apps and how the apps performed on different phones. Subsequently, Lettner and Holzmann (2012) pointed out that to use these frameworks, developers need to manually add source code to each application, which means that developers have to spend a lot of time studying the structure of both the framework and the target application. They also noted that these statistics reflect how the user uses the app in general, such as how many times the user opens the app in a week and how much traffic is consumed, but they cannot reflect the details of user operations within the app, such as which functions users often use and what content they focus on and search for.
Balagtas and Hussmann (2009) proposed a method to simplify the analysis of a large number of logs with the help of visualization and third-party tools. Lettner and Holzmann (2012) proposed a method to automatically collect and analyze logs. This method further simplified the work of researchers and can effectively collect usability data. However, McMillan et al. (2015) argued that most logs were limited to recording coarse-grained user operations. For example, these log-based methods were able to answer which application was opened or closed, but specific details of the user operations inside the application were difficult to capture. Users can use WeChat to transfer files, update their status, participate in group chats, or use any other function that WeChat contains, yet such logs record only that WeChat was opened or closed.
Krieter and Breiter (2018) used a Python script to convert high-quality screen recording videos into pictures. They then analyzed the user operations in the app through image processing and defined event picture templates. Several studies on user operation data analysis have built on this method. For example, Liu et al. (2019) defined app software functions through screenshots and established the connection between app software functions and user operations in a case study of Snapchat. However, because the user interfaces of different apps differ greatly, manually defining app software functions from screen captures does not generalize. Similarly, combining screen recording, timed screen capture, and image comparison technologies, Xin (2021) designed and developed a user operation capture function inside the app, and selected the NetEase Youdao Dictionary app as an application case to observe and analyze the operations of a sample of 13 users for a month. Through semi-automated data analysis, Xin found user experience insights and requirements hidden in the details of user operations, which provide stronger support for product design, development, and maintenance. However, Xin (2021) also pointed out the disadvantages of this method. For example, high-quality screen recording occupies a large amount of mobile storage space, and processing screenshots is time consuming. Xin (2021) tried to reduce the processing time by compressing the images, but found that this reduced the accuracy of user operation recognition.
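
The trade-off between image compression and recognition accuracy can be illustrated with a minimal sketch of image comparison between consecutive screen captures. This is not the implementation used by Krieter and Breiter (2018) or Xin (2021), whose details are not given here; it is a simplified frame-differencing example under stated assumptions. Frames are assumed to be small grayscale grids, and the function names (`changed_fraction`, `detect_events`, `downsample`) and the threshold values are hypothetical choices made only for illustration.

```python
from typing import List

Frame = List[List[int]]  # assumed format: rows of grayscale pixels, 0-255


def changed_fraction(a: Frame, b: Frame, pixel_threshold: int = 30) -> float:
    """Fraction of pixels whose brightness differs by more than pixel_threshold."""
    total = 0
    changed = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += 1
            if abs(pa - pb) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def detect_events(frames: List[Frame], pixel_threshold: int = 30,
                  frame_threshold: float = 0.05) -> List[int]:
    """Indices of frames that differ noticeably from the preceding frame."""
    return [
        i for i in range(1, len(frames))
        if changed_fraction(frames[i - 1], frames[i], pixel_threshold) > frame_threshold
    ]


def downsample(frame: Frame, factor: int) -> Frame:
    """Crude compression: replace each factor x factor block with its mean."""
    height, width = len(frame), len(frame[0])
    out = []
    for r in range(0, height - height % factor, factor):
        row = []
        for c in range(0, width - width % factor, factor):
            block = [frame[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


# Hypothetical example: a small 2x2 highlight appears after a tap.
blank = [[0] * 8 for _ in range(8)]
tapped = [row[:] for row in blank]
for r in range(2):
    for c in range(2):
        tapped[r][c] = 80

full_res_events = detect_events([blank, tapped])
compressed_events = detect_events([downsample(blank, 4), downsample(tapped, 4)])
```

In this toy case the small highlight is detected at full resolution but is averaged away after downsampling, so no event is reported for the compressed frames. This mirrors, in a very simplified way, how compressing images can save processing time at the cost of missing fine-grained user operations.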