Personalized Education Resource Recommendation Method Based on Deep Learning in Intelligent Educational Robot Environments

This article addresses the problems of low computational efficiency and error propagation in entity recognition and relation extraction. It proposes XMAMBLSTM, a personalized education resource recommendation algorithm framework based on deep learning in an intelligent education robot environment. XMAMBLSTM uses XLNet to assign word vectors to text sequences, employs a Multi-Bi-LSTM layer to represent the complex information of the word vectors, and combines a multi-headed attention layer to realize the weight distribution of each word vector. The experimental results show that, compared with the traditional collaborative filtering algorithm, the comprehensive evaluation index of the proposed method in the intelligent education robot environment is 5.05% and 17.3% higher on the two platforms, respectively.


INTRODUCTION
With the advancement of Internet technology, the development of the network education mode is gradually progressing in the direction of rich resources and diverse teaching methods; thus, the transformation of the higher education mode is receiving increasing attention (Coral & Bernuy, 2022). As more teachers and students embrace the online education cloud platform, classroom management is also exhibiting an information trend. Among these, the smart classroom is an important component of the current university information environment, and the platform also highlights the issue of information overload (Zhang, 2021). This paper aims to address this issue by employing a recommendation algorithm based on deep learning in the context of an intelligent educational robot. To address the issues of insufficient data extraction and the low accuracy of the platform recommendation system, the XMAMBLSTM model is proposed based on an intelligent education robot environment. It consists of the pre-training layer XLNet for the input information word vectors, the Multi-Bi-LSTM layer for extracting context information, and the CRF layer for extracting entity information. The context information and entity relationships of the word vectors are transferred to the next layer to achieve effective entity recognition and relationship extraction.
There are two prerequisites for effective course information recommendation: information overload and the inability to precisely describe the demand with keywords. In recommendation systems, collaborative filtering is one of the most important algorithms. It is based on the idea that "birds of a feather flock together," requires no specialized knowledge, and is simple to implement in engineering (Murad et al., 2020). Therefore, it has become the focus of many experts and scholars. However, the collaborative filtering-based recommendation algorithm faces the issues of cold start and sparse data (Chae et al., 2020; Wei et al., 2020). It is unable to filter and recommend courses based on the actual circumstances of users, and its data scalability is low, so it cannot recommend effective courses to new users. The content-based recommendation algorithm can recommend courses that are highly relevant to the user's interests. Azizi and Do (2018) proposed a recommendation algorithm based on content and collaborative filtering, which modeled users' interests and search targets. However, this algorithm is incapable of dynamic adjustment, and the recommended courses tend to be overly specialized. Hybrid algorithms were created by combining several types of recommendation algorithms; they can partially compensate for the shortcomings of the individual algorithms. Mohammadpour et al. (2019) proposed a demand prediction method based on a hybrid algorithm to obtain user contact and context information. However, this method has a number of drawbacks, including complicated calculations and low recommendation efficiency.
In recent years, deep learning has become the main tool for entity recognition, relationship extraction, and other tasks (Keser & Aghalarova, 2022; Shanshan et al., 2021; Tarus et al., 2018). As a result, incorporating deep learning into personalized recommendations can alleviate the issues caused by information overload, accurately extract the potential relationships between word vectors, and enhance the precision and diversity of recommendations.
This paper studies a recommendation algorithm for the school education cloud platform based on deep learning in the environment of an intelligent educational robot in order to solve the problem of entity recognition and relationship extraction. The method utilizes the Multi-Head-Attention model and combines it with the Multi-Bi-LSTM and XLNet layers. It enables effective extraction of the relevance of user input data and the recommendation of courses with high precision. The basic ideas are as follows: 1) use XLNet to assign word vectors to the text sequence; 2) use the Multi-Bi-LSTM layer to characterize the complex information of the word vectors; 3) use the Multi-Headed Attention layer to realize the assignment of the weight of each word vector. Compared to the recommendation method of the traditional collaborative filtering algorithm, the proposed method's innovations include:
• Since the proposed recommendation algorithm based on deep learning in the environment of intelligent educational robots is implemented on distributed computing nodes with limited resources, it eliminates the diffusion and propagation of errors between different network layers that occur in traditional step-by-step identification methods.
• An improved joint entity recognition and relationship extraction method is proposed based on an intelligent education robot environment. Compared to the conventional recommendation algorithm, the proposed algorithm makes the correlation between word vectors more apparent and is therefore more efficient and scalable.
The second section introduces related research, including the traditional course recommendation algorithm, the machine learning-based recommendation algorithm, and the deep learning-based course recommendation algorithm. The third section introduces the intelligent recommendation algorithm for personalized curriculum resources based on deep learning on the online education cloud platform, as well as the algorithm's overall architecture, XLNet embedding layer, Multi-Bi-LSTM model, and Multi-Head Attention model. The fourth section provides an overview of the experiment and analysis, including the experimental environment, experimental data set, evaluation index, model training, and experimental comparison and analysis. The conclusion is the fifth section.

RELATED WORKS
A personalized course recommendation system is one of the typical applications of data mining (Premalatha et al., 2018) in the field of education, with the goal of developing individualized learning resources or path services for target learners.

Traditional Course Recommendation Algorithm
Traditional recommendation algorithms can be categorized as collaborative-filtering-based recommendation methods, content-based recommendation methods, demographic-based recommendation methods, knowledge-based recommendation methods, and mixed recommendation methods.
Among them, the collaborative-filtering-based recommendation algorithm (Jalili et al., 2018) was the first to be proposed and is used in a variety of fields. Since the algorithm is simple and efficient, it is widely studied and utilized. As online learning and education continue to evolve, some researchers have begun to apply recommendation system methods to the field of education and to improve the algorithms for a greater recommendation effect. Collaborative filtering-based course recommendation algorithms can be divided into neighborhood-based methods (Beniwal et al., 2021; Kużelewska, 2020) and model-based methods (Zarzour et al., 2020). The neighborhood-based method relies on computing the similarity of users or items to recommend the relevant courses of adjacent users. The model-based method extracts the latent characteristics of users and items to determine the user's preference for courses. Huang et al. (2019) developed a cross-user-domain collaborative filtering algorithm to accurately predict each student's score in an elective course by utilizing the course score distribution of the most similar senior students. However, this method's user domain is relatively limited, and its scalability must be demonstrated further. Ma and Ye (2018) proposed an improved collaborative-filtering method to calculate the correlation between career goals and courses in order to mitigate the cold start problem and recommend professional courses to users. However, this method relies on prior feature analysis and classification, which affects the accuracy of the results. Obeidat et al. (2019) combined collaborative filtering and the Apriori algorithm to generate high-quality association rules based on the courses chosen by students. However, this method is limited by the Apriori algorithm's iterative layer-by-layer search, and the solution may be slow for large data sets. In summary, the collaborative-filtering method needs the evaluation information of users, and the scalability of the data set is low. It cannot handle new users, a problem known as cold start.
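As a concrete illustration of the neighborhood-based method discussed above, the following sketch scores a user's unrated courses by the cosine-similarity-weighted ratings of the k most similar users. The rating matrix and user names are invented for illustration; this is a minimal sketch of the general technique, not the implementation of any cited work.

```python
import math

def cosine(u, v):
    # Cosine similarity between two rating vectors (0 = unrated).
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(ratings, target, k=2, top_n=2):
    # ratings: user -> list of course ratings. Score each course the
    # target has not rated by similarity-weighted neighbor ratings.
    sims = sorted(
        ((cosine(ratings[target], r), u) for u, r in ratings.items() if u != target),
        reverse=True,
    )[:k]
    scores = {}
    for c, rating in enumerate(ratings[target]):
        if rating == 0:  # unrated course: candidate for recommendation
            scores[c] = sum(s * ratings[u][c] for s, u in sims)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical 3-user, 4-course rating matrix.
ratings = {
    "u1": [5, 4, 0, 1],
    "u2": [4, 5, 3, 0],
    "u3": [1, 0, 5, 4],
}
print(recommend(ratings, "u1"))  # course index 2 is u1's only unrated course
```

Because the user behavior matrix is typically sparse, the similarities computed this way degrade for users with few ratings, which is exactly the cold start weakness described above.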
In light of the aforementioned issues, previous research has proposed a solution to alleviate the cold start and sparsity problems by employing ontology domain knowledge and a sequential learner mode in the absence of sufficient evaluation (John et al., 2017). Ramadhan and Musdholifah (2021) utilized the cosine similarity method to update the recommended content based on the various knowledge levels and learning styles of learners. Due to the lack of course text information, this method is unable to effectively analyze and process other course factors, which can easily result in unsatisfactory recommendation results and the possibility of over-specialization of the recommended results.

Recommendation Algorithm Based on Machine Learning
Due to the poor scalability, prior data requirements, and cold start of traditional recommendation algorithms, machine-learning algorithms, such as clustering and transfer learning, have been introduced into the field of personalized recommendation. Clustering refers to the unsupervised method that divides a data set into different classes or clusters based on a certain criterion (such as a distance criterion), so that the similarity of data objects in the same cluster is as high as possible and the difference between data objects in different clusters is as large as possible. After clustering, data of the same class are grouped together as much as possible, while data of different classes are separated as much as possible. Relying on this method, Venkatesh and Sathyalakshmi (2020) proposed an E-learning Personalized Bee Recommender (PBReL) based on Artificial Bee Colony (ABC) optimization to build a recommendation structure using K-means clustering. This scheme is limited by ABC's inherent characteristics, such as poor local search capability and low precision. Valcarce et al. (2018) proposed a recommendation algorithm that combined posterior probability clustering and collaborative filtering. This method improves the scalability of the algorithm but imposes greater pressure on computing resources to a certain extent. Kużelewska (2020) proposed a collection of clustering schemes based on multi-clustering that emphasized accurate modeling of the neighborhood of active users (users who generate recommendations). However, the quality of this scheme is affected by the granularity of the lowest layer of the grid structure and lacks consideration of the relationship between grid cells.
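The clustering criterion described above can be sketched as a plain K-means loop: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. The two synthetic "learner feature" clusters are invented for illustration; this is the generic algorithm, not PBReL itself.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    # Plain K-means on rows of X.
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Pairwise distances, shape (n_points, k); nearest centroid wins.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Two well-separated synthetic clusters of 2-D learner features.
X = np.vstack([np.zeros((5, 2)), np.ones((5, 2)) * 10.0])
labels, _ = kmeans(X, 2)
print(labels)
```

Because each point is compared against every centroid, the cost per iteration grows linearly with both the number of users and the number of clusters, which is one reason clustering-based schemes can pressure computing resources at scale.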
A small sample size prevents a deep learning model from receiving sufficient training and makes it susceptible to over-fitting. For small sample learning, a traditional method is based on model fine-tuning. Typically, this method conducts model pre-training on a large data set, transfers to a small target data set, and then fine-tunes some layers of the neural network model (Ruder et al., 2019; Wang et al., 2019; Zhuang et al., 2020). Comparing the performance of various classification networks on image classification data sets, Kornblith et al. (2019) demonstrated that ImageNet architectures could be effectively generalized across data sets. However, this approach suffers from poor scalability. Ge and Yu (2017) adopted a selective joint fine-tuning deep transfer-learning scheme to enhance the performance of deep learning tasks when insufficient training data was available. However, this scheme requires a substantial amount of computing resources and has a relatively slow processing speed. Guo et al. (2019) proposed a selective joint fine-tuning method to improve the model's performance when training data is scarce. However, it is prone to negative transfer due to the limited amount of training data.
The above-mentioned machine learning methods are mainly aimed at the training process of small sample data sets.For a large number of training data, a more effective learning algorithm for feature extraction and data set classification is required to achieve a more effective personalized course recommendation scheme.

Course Recommendation Algorithm Based on Deep Learning
Deep learning methods are widely used, including in rating prediction (Seo et al., 2017), text recommendation (Jin et al., 2018), image recommendation (Yu et al., 2018), and location recommendation (Dadoun et al., 2019; Zhang et al., 2019). Deep Neural Networks (DNN) automatically learn implicit representations of data via non-linearity, thereby producing denser abstract semantics at the highest level. In recommendation systems, DNNs are primarily used to learn implicit features of users and items (Dargan et al., 2020). They typically reconstruct the information associated with users or items (including rating data, text, images, etc.) in order to obtain implicit representations of users or items, thereby enhancing the vector expression of users' interests and preferences. DNNs adapt to complex nonlinear relationships by identifying and learning deep-seated features, and are usually used to extract the implicit feature relationships existing in user-item interaction data (Huang et al., 2021; Safran & Lee, 2022). Classic models in recommendation systems include the collaborative filtering model and the Click-Through Rate (CTR) model. He et al. (2017) proposed a Neural Collaborative Filtering (NCF) model, which employed a multi-layer perceptron instead of the simple inner product of the matrix decomposition model to discover the implicit relationship between users and items. However, it only considers the rating matrix of users and items, ignoring supplementary data. Rosa et al. (2018) proposed a Knowledge-Based Recommendation System (KBRS) for user emotion detection using a convolutional neural network and a Bidirectional Long Short-Term Memory (BiLSTM) Recurrent Neural Network (RNN). However, RNNs inevitably suffer from vanishing and exploding gradients, which greatly affects the results. Fu et al. (2018) proposed a collaborative filtering model based on a deep learning prior to address problems that the feed-forward neural network process cannot solve, but the cold start and sparsity problems under this scheme have not been adequately addressed. Manogaran et al. (2018) proposed Multi-Kernel Learning with a deep learning method, the Adaptive Neuro-Fuzzy Inference System (MKL with ANFIS), but MKL degraded the performance of the AMD platform to a certain degree, significantly limiting the scalability of this method. It can be seen that course recommendation methods based on deep learning can effectively process online learning data, further capture the deep-seated features of learners and courses, and avoid the issues of data sparsity and cold start in traditional recommendation algorithms. However, the disadvantage of deep learning models is that they require a large amount of computation and are prone to producing results that are difficult to interpret.

INTELLIGENT RECOMMENDATION ALGORITHM OF PERSONALIZED COURSE RESOURCES
The demand of online education platforms for entity recognition and relationship extraction is growing rapidly. In general, the process of identifying model entities and extracting entity relationships can be divided into two steps. The first is Named Entity Recognition (NER), which identifies user-entered information using named entity recognition technology, and the second is Relationship Extraction (RE), which uses classification models to detect specific relationships between entities. Although the step-by-step architecture can handle information in a flexible manner, it is difficult to eliminate errors, which will diffuse further. To address this issue, this paper builds an intelligent recommendation algorithm for personalized course resources on the online education cloud platform using deep learning to realize the simultaneous processing of model entity recognition and relationship extraction, as well as the sharing of parameters between the two steps of the joint model. The model first uses the XLNet model to obtain the word vectors of the text, uses the Multi-Bi-LSTM unit for context encoding, and finally uses the Multi-Headed Attention model to extract the correlations in the information and predict the current information.

Overall Architecture of the Algorithm
The XMAMBLSTM architecture can accomplish model entity recognition and relationship extraction simultaneously. The specific architecture of the algorithm is shown in Figure 1. The input to the model is the text sequence entered by the user; XLNet is then used to assign word vectors to the text sequence. The Multi-Bi-LSTM layer is used to characterize the complex information of the word vectors. Then, the Multi-Headed Attention layer is used to assign the weight of each word vector. Finally, recognition of text sequences and relationship extraction are performed.

XLNet Embedded Layer
The XLNet-based named entity recognition model is used to enhance the entity recognition capability of the online education cloud platform.The model extracts word vector features from the input data and forwards them to the next level after combining them.
XLNet implements bidirectional prediction using an autoregressive (AR) language model. In the Transformer computation, it achieves bidirectional prediction of sequence feature information by recombining the Attention Mask matrix, significantly reducing the errors introduced during information extraction by the Mask mechanism of the BERT model.
The word vector extraction mechanism of XLNet is shown in Figure 2. A circle indicates that the information can be collected; a cross indicates that it cannot be collected; and the dotted box indicates that each position vector can only utilize the previous hidden state information. Suppose there is an input sequence w = (w1, w2, w3, w4, w5) whose recombined word vector sequence is, for example, y = (y4, y2, y1, y3, y5). Since y4 is located in the first position of the recombined sequence, it cannot use the information of any other position.
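The recombination idea can be made concrete by building the attention mask implied by a sampled factorization order: each position may attend only to positions that come strictly earlier in that order. The particular order below is hypothetical, chosen so that the fourth position comes first and, as in the example above, can attend to nothing. This is a sketch of the general permutation-mask mechanism, not XLNet's two-stream implementation.

```python
import numpy as np

def permutation_mask(order):
    # order lists sequence positions in the sampled factorization order.
    # mask[q, k] = 1 means position q may attend to position k, which
    # requires k to appear strictly earlier than q in that order.
    n = len(order)
    rank = {pos: i for i, pos in enumerate(order)}
    mask = np.zeros((n, n), dtype=int)
    for q in range(n):
        for k in range(n):
            if rank[k] < rank[q]:
                mask[q, k] = 1
    return mask

# Illustrative order where position 3 (the fourth word) comes first,
# so its mask row is all zeros: it can attend to no other position.
mask = permutation_mask([3, 1, 2, 0])
print(mask)
```

Averaging over many sampled orders, every position eventually sees both left and right context, which is how the AR model obtains bidirectional information without BERT's explicit Mask tokens.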
As depicted in Figure 3, the mutually independent relationship between word vectors is resolved by omitting the Mask mark. For lengthy text input tasks, XLNet can conceal context information in the model while maintaining left-to-right generation at the surface, which is more conducive to the model learning the interdependence between input characters.

Multi-Bi-LSTM Model
The learning of resource information between input characters is achieved by constructing multiple paths in the input layer of the neural network in order to improve relation extraction performance. The overall framework of the proposed Multi-channel Bi-directional Long Short-Term Memory neural network (Multi-Bi-LSTM) is shown in Figure 4. The model is composed of four layers: input, multi-channel, Bi-LSTM, and output.
The information of the input layer is composed of the word vector, part-of-speech feature vector, position value vector, and dependency syntax vector over the entire data set. Its length is defined as Length; if the input is too long, only the information at the beginning is retained. In the relation extraction task, the input layer enables the model to acquire a more comprehensive understanding of the meaning of the word vector and the hidden relation information. The multi-channel layer combines the word feature, as the main body, with the part-of-speech feature, position feature, and dependency syntax feature to generate multiple channels as the network model's input.
Long Short-Term Memory (LSTM) is an improvement of the recurrent neural network. The implicit information U_i and the storage location L_i in the model are functions of the implicit information U_{i-1} and the storage location L_{i-1} in the previous step and, in the bidirectional case, of the implicit information U_{i+1} and the storage location L_{i+1} in the subsequent step. The output information of the three-layer Bi-LSTM network is taken as the mean and variance of the input, and the final value is normalized and regularized.
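The dependence of the state at step i on its neighboring states can be made concrete with a single (unidirectional) LSTM step in NumPy; here h plays the role of the implicit information U_i and c the storage location L_i. All weights are random placeholders rather than trained parameters, so this is a sketch of the recurrence only, not the paper's trained model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # One LSTM step: gates are computed from the current input x and the
    # previous hidden state h_prev; c is the cell (storage) state.
    z = W @ x + U @ h_prev + b               # stacked gate pre-activations
    d = len(h_prev)
    i, f, o = (sigmoid(z[k * d:(k + 1) * d]) for k in range(3))
    g = np.tanh(z[3 * d:])                   # candidate cell update
    c = f * c_prev + i * g                   # forget old, write new
    h = o * np.tanh(c)                       # exposed hidden state
    return h, c

rng = np.random.default_rng(0)
d_in, d_hid = 4, 3
W = rng.normal(size=(4 * d_hid, d_in)) * 0.1
U = rng.normal(size=(4 * d_hid, d_hid)) * 0.1
b = np.zeros(4 * d_hid)

h = c = np.zeros(d_hid)
for x in rng.normal(size=(5, d_in)):         # a length-5 input sequence
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)
```

A bidirectional layer runs this recurrence once left-to-right and once right-to-left and concatenates the two hidden states, which is how U_{i+1} and L_{i+1} also influence position i.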

Multi-Head Attention Model
The attention mechanism calculates and assigns the attention weight of the word vector at each position during the encoding process. The implicit vector representation of the entire input is then computed as a weighted sum. However, excessive concentration of attention is an issue when encoding the current position. To allow different word vectors to share the same attention mechanism, the Multi-Head Attention model transforms the queries, keys, and values with different projections obtained through independent learning, and then combines the converted vectors. The objective of the pooling operation is to prevent over-fitting of the model. Finally, the outputs of the attention pooling are combined to generate the final output. The Multi-Head Attention model is shown in Figure 5.

Assuming a query q ∈ R^dq, a key k ∈ R^dk, and a value v ∈ R^dv, each attention head t (t = 1, ..., T) is calculated as:

head_t = f(W_t^(q) q, W_t^(k) k, W_t^(v) v)

where f is the function representing attention pooling, such as additive attention or scaled "dot product" attention. The outputs of the T heads are concatenated and subjected to a linear transformation to characterize the combined result of the T heads. The model may thus focus on different input word vectors and can represent more complex functions than a simple weighted average.
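A minimal NumPy sketch of the head computation above, assuming scaled dot-product attention as the pooling function f; the projection matrices are random placeholders standing in for the independently learned projections, and the input word vectors are invented.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(Q, K, V, heads, rng):
    # Each head applies its own (here random, untrained) projections,
    # pools with scaled dot-product attention, and the head outputs are
    # concatenated and linearly transformed.
    d = Q.shape[-1]
    dh = d // heads
    outs = []
    for _ in range(heads):
        Wq, Wk, Wv = (rng.normal(size=(d, dh)) / np.sqrt(d) for _ in range(3))
        q, k, v = Q @ Wq, K @ Wk, V @ Wv
        att = softmax(q @ k.T / np.sqrt(dh))   # (n_q, n_k) attention weights
        outs.append(att @ v)                   # weighted sum of values
    Wo = rng.normal(size=(heads * dh, d)) / np.sqrt(d)
    return np.concatenate(outs, axis=-1) @ Wo  # final linear transformation

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 8))                    # 6 word vectors of dimension 8
out = multi_head_attention(X, X, X, heads=2, rng=rng)
print(out.shape)
```

With Q = K = V = X, this is self-attention over the word vectors, matching the weight-assignment role the layer plays in XMAMBLSTM.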

EXPERIMENT AND ANALYSIS

Experimental Environment
To verify the accuracy and related performance of the algorithm proposed in this paper, the experimental environment and hardware-related configuration are shown in Table 1.

Experimental Data Set
This paper uses the Aminer and Himalayan platforms as verification data sources for the proposed personalized intelligent recommendation algorithm. The Aminer platform selects the user data of 100 users.

Evaluation Index
This paper uses precision and recall to evaluate the recommendation results of the personalized intelligent recommendation algorithm. Precision α is defined as the proportion of recommended courses that meet specific requirements among all recommended courses. Its expression is as follows:

α = A1 / (A1 + A2)

where A1 represents the number of recommended courses meeting the specific requirements, and A2 represents the number of recommended courses that do not meet them.
Recall β is defined as the proportion of courses that meet the specific requirements and are recommended among all courses that meet the requirements. Its expression is as follows:

β = A1 / (A1 + A3)

where A3 represents the number of courses that meet the specific requirements but are not recommended.
Precision and recall evaluate the performance of the algorithm from two aspects. To avoid the contradiction caused by a low value of one index in extreme cases, a comprehensive evaluation index F is defined to evaluate the performance of the algorithm. The specific expression is as follows:

F = 2αβ / (α + β)

This index takes both precision and recall into account: the larger its value, the better the recommendation result.
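The three indexes can be computed directly from the counts A1, A2, and A3. The numbers below are invented for illustration, and the comprehensive index is taken as the standard harmonic mean of precision and recall, consistent with "takes both precision and recall into account."

```python
def evaluation(a1, a2, a3):
    # a1: recommended courses that meet the requirement
    # a2: recommended courses that do not meet it
    # a3: courses that meet it but were not recommended
    precision = a1 / (a1 + a2)
    recall = a1 / (a1 + a3)
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f

# Hypothetical counts: 100 courses recommended, 80 of them relevant,
# and 40 relevant courses missed.
p, r, f = evaluation(80, 20, 40)
print(round(p, 3), round(r, 3), round(f, 3))
```

Because F is a harmonic mean, it is dragged down sharply by whichever of the two indexes is lower, which is exactly the "extreme case" the comprehensive index guards against.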

Model Training
In the experiment, the data sets of the Aminer and Himalayan platforms were used to train the four recommendation models of collaborative filtering (Fu et al., 2018), XLNet (Wang et al., 2021), Multi-Bi-LSTM (Liu & Du, 2019), and XMAMBLSTM.The trained models were comprehensively compared and analyzed in terms of precision, recall, and comprehensive evaluation index.The performance of the recommendation algorithm was intuitively evaluated further through data analysis, and its shortcomings were analyzed.
The comparison of precision, recall, and comprehensive evaluation index of each network model is shown in Table 2.
Based on the calculation results, the precision of the collaborative filtering model is relatively low, at only 0.812, whereas the precision of the XMAMBLSTM model is the highest, at 0.893. Although the precision of the four models on the same data set is not identical, the difference is not significant. This is because the implementation of the model includes a pre-training procedure that can prevent gradient explosion. The mean and standard deviation calculated over small batches are used to dynamically adjust the scaling of the middle-layer outputs in deep learning, so that the intermediate output of each layer is more stable, thus improving the precision of the model.
In terms of the comprehensive evaluation index, collaborative filtering has the lowest value because it relies on the user's behavior to make recommendations. For new items, it is impossible to calculate similar items; this is the cold start problem. At the same time, a typical user only performs operations on a small number of target objects, so the user's operation behavior matrix is extremely sparse. The inaccuracy of the target-object similarities calculated from the sparse behavior matrix ultimately affects the recommendation evaluation results.

Experimental Comparison and Analysis
All algorithms were run in the same software and hardware environment. Inputs include the user's attention list, search list, learning results, etc. Different feature vectors are generated based on the format of the input data; the data enters the gradient node and is then corrected using the loss function. After calculating the gradient values of all parameters, the stochastic gradient descent optimizer performs back-propagation via the update node, constrained by the learning rate, thus completing the update of the feature vectors. After repeated training, the values of each parameter are confirmed, and the likelihood of users selecting courses can be predicted and sorted. Finally, a list of recommended courses is obtained.
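The update step described above, with the learning rate (0.02) and gradient clipping coefficient (5.0) quoted for the experiments, can be sketched as global-norm clipping followed by a plain SGD update. The parameter and gradient values below are invented for illustration.

```python
import numpy as np

def sgd_step(params, grads, lr=0.02, clip=5.0):
    # Clip the global gradient norm to `clip`, then apply a plain SGD update.
    norm = np.sqrt(sum(float((g ** 2).sum()) for g in grads))
    scale = min(1.0, clip / norm) if norm > 0 else 1.0
    return [p - lr * scale * g for p, g in zip(params, grads)]

w = [np.array([1.0, -2.0]), np.array([0.5])]
g = [np.array([30.0, 40.0]), np.array([0.0])]  # global norm 50 -> clipped to 5
w2 = sgd_step(w, g)
print(w2[0])
```

Clipping caps the effective step size of any single noisy batch, which is what keeps the back-propagated feature-vector updates stable over the repeated training rounds described above.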
Collaborative filtering (Fu et al., 2018) was selected as the control group.The epoch of the comparative deep learning model was set to 40, the maximum number of iterations was 30, the learning rate was 0.02, and the gradient clipping coefficient was 5.0.The test results are shown in Table 3.
The table demonstrates that the experimental group is 5.05% higher than the control group on the Aminer platform and 17.3% higher on the Himalayan platform. This demonstrates that combining a collaborative filtering recommendation algorithm with a deep learning-based content recommendation algorithm can effectively improve the accuracy of the collaborative filtering algorithm.
When different numbers of recommended courses "n" are set, different Epoch values have an impact on the comprehensive evaluation index, as shown in Figure 6. As the value of Epoch increases, so does the comprehensive evaluation index. When Epoch exceeds 30, the growth rate of the comprehensive evaluation index slows down. The comprehensive evaluation index also performs better as more courses are recommended; as the number of courses increases gradually, the comprehensive evaluation index tends toward a fixed value. When Epoch = 40, the comprehensive evaluation index is 5.1% greater with eight recommended courses than with only two.

CONCLUSION
This paper designs a course recommendation system for online education cloud platforms based on an intelligent education robot environment in order to address the bottleneck of improving recommendation algorithm performance under massive data. A framework for the recommendation algorithm XMAMBLSTM, based on deep learning, is proposed for intelligent educational robots. Experiments indicate that this algorithm is capable of achieving efficient recognition and relationship extraction of character sequences, as well as enhancing the input reliability of subsequent network layer calculations. Compared with the traditional method, the comprehensive evaluation index is 5.05% and 17.3% higher on the two platforms, respectively.
However, because the cloud platform includes many types of courses, more advanced algorithms are required to solve the technical problems of rapid collaborative recommendations of multi-type courses.In future research, additional evaluation factors can be incorporated into the design of cloud platforms.Such an algorithm will provide more comprehensive and efficient multiple recommendation services for users and creators on the cloud platform for school education.

Figure 1. Intelligent recommendation algorithm architecture for personalized course resources
Figure 2. XLNet word vector extraction mechanism
Figure 4. Multi-Bi-LSTM network model structure diagram
Figure 5. Multi-Head-Attention network model structure diagram

Table 1. Experimental environment

The total number of course participants on the Aminer platform is 2,104. Data from 80 users were obtained on the Himalayan platform, with a total of 109 courses and 1,275 participants in all courses.