Introduction
Automatic code generation is one means of improving the efficiency and quality of software development. Many researchers have refined code generation methods to raise the quality of the generated code and to better meet programmers' expectations. Automatic code generation has therefore long been a core concern for practitioners and researchers (Hu et al., 2019).
In recent years, advances in artificial intelligence have given strong impetus to research on automatic code generation, and the field has developed rapidly (Yang et al., 2020). Some of this research has already been applied in real development. Tools that implement a particular code generation method are usually embedded in integrated development environments as plugins. For example, IntelliJ IDEA, Eclipse, and PyCharm all support code generation plugins that help programmers improve development efficiency.
Across automatic code generation techniques, model capability is evaluated with indicators such as the accuracy of next-token prediction and the mean reciprocal rank (MRR) used in information retrieval. These indicators cannot be converted into one another directly because the automated evaluation standards differ (Hu et al., 2019). As a result, existing research on automatic code generation focuses on improving model capability and neglects measuring the quality of the generated code. In addition, existing research on software code quality measurement is based on the content of the written code and ignores information about the programmer's programming behavior. Yet programming behavior information cannot be ignored when measuring automatically generated code. Because this information changes continuously and dynamically, measuring automatically generated code is itself a continuous, dynamic process. Traditional software quality measurement methods are therefore not suitable for code produced by automatic code generation, and it is of great significance to study a metric method for such code.
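To make the two evaluation indicators named above concrete, the following minimal Python sketch computes next-token (top-1) accuracy and MRR for ranked completion suggestions. The function names and the sample data are illustrative assumptions, not taken from the cited work.

```python
from typing import Sequence


def top1_accuracy(ranked: Sequence[Sequence[str]], targets: Sequence[str]) -> float:
    """Fraction of predictions whose first-ranked candidate equals the target token."""
    hits = sum(1 for cands, target in zip(ranked, targets) if cands and cands[0] == target)
    return hits / len(targets)


def mean_reciprocal_rank(ranked: Sequence[Sequence[str]], targets: Sequence[str]) -> float:
    """Average of 1/rank of the target in each candidate list (0 if it is absent)."""
    total = 0.0
    for cands, target in zip(ranked, targets):
        if target in cands:
            total += 1.0 / (list(cands).index(target) + 1)
    return total / len(targets)


# Hypothetical data: four completion requests, each with a ranked candidate list.
ranked_suggestions = [
    ["foo", "bar", "baz"],  # target at rank 1
    ["x", "y", "z"],        # target at rank 2
    ["a", "b", "c"],        # target at rank 3
    ["p", "q"],             # target missing from the list
]
expected_tokens = ["foo", "y", "c", "r"]

accuracy = top1_accuracy(ranked_suggestions, expected_tokens)
mrr = mean_reciprocal_rank(ranked_suggestions, expected_tokens)
```

Accuracy rewards only an exact first-place hit, whereas MRR also gives partial credit to lower-ranked hits, which is one reason the two values are not directly interchangeable.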
To account for the fact that measuring automatically generated code is a continuous, dynamic process, this paper proposes a metric method for automatic code generation based on a dynamic abstract syntax tree (the DAST method). Specifically, the authors build a dynamic abstract syntax tree, extract metric-related content from it, and then complete the metric for automatic code generation from the extracted content. The experimental results show that the method can effectively measure automatic code generation. Compared with the automatic generation metric method (Zhang, 2021) (the MAST method), which is constructed from all programming code and programming records, the method in this paper improves convergence speed by 80% during model training and shortens the time needed for automatic code generation metric prediction by an average of 46%.
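As a rough illustration of the general idea of tracking a syntax tree that changes while code is being written, the sketch below parses successive editor snapshots with Python's standard `ast` module and records how the node-type counts change between them. It is not the DAST method itself: the snapshot list, the function names, and the choice of node-type counts as the extracted content are all assumptions made for illustration.

```python
import ast
from collections import Counter
from typing import Dict, List


def node_type_counts(source: str) -> Counter:
    """Count AST node types in one code snapshot."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def snapshot_deltas(snapshots: List[str]) -> List[Dict[str, int]]:
    """For each parsable snapshot, report node-count changes since the last parsable one."""
    deltas: List[Dict[str, int]] = []
    previous: Counter = Counter()
    for source in snapshots:
        try:
            current = node_type_counts(source)
        except SyntaxError:
            # Code captured mid-edit often does not parse; skip it (an assumption).
            continue
        changed = {
            name: current[name] - previous[name]
            for name in set(current) | set(previous)
            if current[name] != previous[name]
        }
        deltas.append(changed)
        previous = current
    return deltas


# Hypothetical editing session: the second snapshot is incomplete and is skipped.
session = [
    "x = 1\n",
    "x = 1\ny = x +\n",
    "x = 1\ny = x + 2\n",
]
deltas = snapshot_deltas(session)
```

Recording only the differences between consecutive trees, rather than re-processing every full snapshot, is one plausible way to keep such a representation compact as the code evolves.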
The contributions of this paper are summarized as follows: