Predictors of Usability of a Mobile Intelligent Agent Information Provider for College Students

This study determined the factors that influenced the usability of a mobile-based intelligent agent called "AskRed." The design-related factors were evaluated in terms of performance, accuracy, responsiveness, aesthetics, and completeness. The usability of the software was determined in terms of satisfaction and intention to reuse the software. The software received favorable ratings from the students. The experts' software evaluation recommended strengthening the design of the intelligent agent in terms of security, performance, completeness, and ease of use. Multiple regression analyses showed that performance and completeness influenced both satisfaction and intention to reuse. Aesthetics and responsiveness influenced satisfaction but not intention to reuse, with responsiveness having a negative impact on satisfaction. The regression models explained 73% of the variance in satisfaction and 58% of the variance in intention to reuse. This study provides empirical evidence on the predictors of the usability of an intelligent agent used in a university setting.

An emerging application of intelligent agents is improving school administrative services by providing information (Lee et al., 2019; Bendici, 2018). Through intelligent agents, schools can improve their communication channels, compliance with standards, and retention of students (Bendici, 2018). Integrating intelligent agents as information providers is consistent with the management goal of schools, i.e., to provide students with the highest level of service (Seeman & O'Hara, 2006). For instance, an intelligent agent can assist students in identifying pre-enrollment requirements, provide information related to financial aid, remind students about upcoming events, and locate school buildings (Bendici, 2018; Hussain, 2018; Lee et al., 2019). Because of these capabilities, intelligent agents could reduce the administrative workload of staff (Lee et al., 2019).
Despite the importance of intelligent agents in improving university services, very few studies have examined the factors that influence their use and usability in a university setting (Lee et al., 2019). A lack of understanding of the factors that affect chatbot usability may lead to an unusable chatbot, which may then lead to the discontinuance of its use. As pointed out by Janssen et al. (2021), chatbot usage failure is attributed to poor content, the wrong use case, and ignored user requirements. Thus, it is critical to understand the design factors and how they affect chatbot usability to ensure that the technology is optimally used. Understanding how these factors influence a chatbot's usability in a university setting could aid in identifying design aspects that meet the information needs of its users.
In light of this research gap, this study was conducted. The study developed AskRed (subsequently referred to as the software), a mobile-based intelligent agent that responds to college students' queries relating to the policies, services, and general information of the university. It serves as a single access point for students' queries through their mobile devices. It accepts queries in the form of text or voice and processes them using natural language processing (NLP). This chatbot could make it easier for students to gather information. It may also reduce the workload of university personnel who provide the information (Patel et al., 2019).
Moreover, the study identified the factors that influence the usability of AskRed. Toward this goal, this study sought answers to the following questions. 1) What are the design-related factors in terms of performance, accuracy, responsiveness, aesthetics, and completeness? 2) What is the usability of the software in terms of satisfaction and intention to re-use the software? 3) Do design-related factors influence the usability of the software?
The rest of this paper is divided into seven sections. The second section is the Literature Review, which is divided into two subsections. The third section defines the research variables and states the null hypotheses. The fourth section, AskRed Architecture, describes the software's framework. The fifth section is the Methodology, which is further divided into four subsections. The study's findings are presented in the sixth section. The performance and completeness of the chatbot were found to influence satisfaction and the intention to reuse. Aesthetics and responsiveness influenced satisfaction but not the intention to reuse, and responsiveness had a negative impact on satisfaction. These findings are discussed in detail in the Discussion section. The eighth section presents the Conclusion, Recommendations, and Future Works.

Usability Measures and Design-Related Factors
Usability has no universal definition (Sindhuja & Dastidar, 2009) because it depends on the nature of the object under investigation (Granic et al., 2011) and the context of use (Scheller & Kühn, 2015). Nonetheless, different studies have attempted to propose a framework for evaluating usability. According to the International Organization for Standardization (ISO) 9241 standard (ISO, 1998), usability is composed of efficiency (resources spent in performing tasks), effectiveness (the ability of users to complete tasks using technology with the assurance that the output of that task has quality), and satisfaction (subjective evaluation of contentment in using the technology). However, it has been shown that effectiveness can be a design factor rather than an indicator of usability (Bringula, 2016).
In another study, Zhang and Adipat (2005) proposed a list of usability attributes for mobile-based applications: learnability, efficiency, memorability, errors, satisfaction, effectiveness, simplicity, readability, and learning performance. One of these attributes relevant to this study is effectiveness. Zhang and Adipat (2005) defined effectiveness as the completeness and accuracy with which users achieve certain goals using a mobile application. Arthur and Stevens (1992) defined completeness as a set of documentation in which all required information is present. Furthermore, accuracy was defined as the degree to which the system is complete, timely, and free from errors (Arthur & Stevens, 1992; Yuniarto et al., 2018). Another attribute relevant to this study is satisfaction; Zhang and Adipat (2005) and ISO (1998) define satisfaction in the same way.
In a similar study, Coursaris and Kim (2011) conducted a meta-analysis of 100 empirical mobile usability studies published from 2000 to 2010. They found more than 30 attributes investigated in mobile usability research. Three attributes were found to be relevant to this study: functionality, responsiveness, and aesthetics. According to McNamara and Kirakowski (2006), functionality is the assessment of the performance, reliability, and durability of a system. In a website study, performance refers to the overall user preference rating considering the loading speed of a webpage (Schmidt et al., 2009). Reliability is the feeling that a product is dependable or fit to be trusted (Baharuddin et al., 2013). Responsiveness is a desired characteristic of mobile devices whereby they can perform well, especially during time-constrained service (Kleijnen et al., 2007). It also refers to the ability of a system or device to respond promptly to the actions invoked by users (Ho & Lee, 2007).
Aesthetically designed software is usable (Tractinsky, Katz, & Ikar, 2000). Aesthetics refers to the visual qualities of the interface (Lindgaard & Dudek, 2002). The effects of aesthetics on usability have been widely investigated in web and mobile application research. For example, Teoh et al. (2009) and Wells et al. (2011) disclosed that uniformity of design, appropriate graphics, organized patterns, good color combinations, and text were desirable qualities of a website. Aesthetics in terms of colors and graphics on a website were found to be significant predictors of commercial website quality (Wells et al., 2011). Similarly, color affects the usability of mobile applications. Silvennoinen, Vogel, and Kujala (2014) found that color improved the hedonic and pragmatic qualities of task- and entertainment-oriented applications. However, in the study by Tuch et al. (2012), aesthetics did not affect the perceived usability of online shopping. This discrepancy can be attributed to the age of the users (Djamasbi et al., 2011; Huang et al., 2017; Punchoojit, 2022).
Another usability indicator is the intention to reuse, which results from previous positive experiences with using a service or product (Nigam, 2012; Oliver, 1999). Alalwan (2020) used satisfaction and the intention to reuse as indicators of the usability of mobile food-ordering apps and found that online reviews, online ratings, online tracking, performance expectancy, hedonic motivation, and price value predicted e-satisfaction and continued intention to reuse. In another study, Kim et al. (2020) revealed that aesthetics and quality of service influenced guest satisfaction with smartphone applications. Satisfaction, in turn, influenced the intention to reuse the application.

Usability of Intelligent Agents
Different studies have reported on the usability of intelligent agents used in various areas. In the field of education, Mabanza and De Wet (2013) investigated the usability of pedagogical educational agents in assisting adult learners to acquire basic computer skills. Data from the control group (i.e., traditional teaching methods) and the experimental group (i.e., participants who used a pedagogical agent) revealed that adult learners in the latter group performed better than those in the former. Liang, Liang, and Tseng (2019) investigated the usability of an intelligent agent in e-commerce. The intelligent agent acted as a price negotiator for prospective buyers. The use of the intelligent agent reduced the effort required to collect buyer information, lowered transaction costs, and eased negotiation with sellers. User satisfaction and perceived fairness increased, while performance risk was reduced, after using the intelligent agent. In another study, Bogers et al. (2019) gathered and analyzed the perceptions of 357 Danish-speaking respondents on the usability of intelligent personal agents (IPAs). They found usability issues with IPAs that include reliability, poor voice recognition, unnatural dialogue responses, and the inability to support mixed-language speech recognition.

Ren et al. (2019) conducted a systematic mapping study of 19 usability articles on chatbots (a type of intelligent agent). They presented their findings in four categories: usability techniques, usability characteristics, research methods, and types of chatbots. Questionnaires (e.g., the System Usability Scale and ad hoc questionnaires), interviews, think-aloud sessions, direct observations, and cognitive walkthroughs were the most popular usability techniques employed. Ad hoc questionnaires are context-dependent questionnaires used to assess satisfaction. A field study is another method employed in usability studies (Duh et al., 2006); Ren et al. (2019) did not report a study that employed a field study method. Effectiveness, efficiency, and satisfaction were the primary usability characteristics investigated. This finding is consistent with the usability definition of ISO (1998).
Finally, Balsa et al. (2020) developed and evaluated an intelligent anthropomorphic virtual assistant to support older people with type 2 diabetes mellitus. The intelligent agent reminded users of their medications and lifestyle changes. Twenty purposively selected people participated in the study. The System Usability Scale (SUS) and qualitative responses were used to determine the usability of the intelligent agent. In terms of SUS, the system rating was between good and excellent. Both positive aspects and areas for improvement were identified through thematic analysis. The study recommended investigating efficiency, effectiveness, and satisfaction as key attributes of the usability of intelligent agents.
In the literature review by Ren et al. (2019) on the usability of chatbots, SUS was the most utilized survey instrument; the authors reported sixteen studies that used it. However, SUS cannot reflect the actual strengths and weaknesses of the specific design features of a piece of software. Brooke (2013), the developer of the instrument, pointed out this limitation. Moreover, there is a methodological limitation in the study of Ren et al. (2019): the chatbots investigated were not necessarily implemented in a school setting.
In a recent study, Federici et al. (2021) evaluated the usability of a chatbot for eGLU-Box Pro, a web-based usability-testing tool intended to assist Italian Public Administration practitioners in creating remote usability tests. The tool can also analyze participants' answers and interaction data after they complete usability tasks. Bio-behavioral evaluation methods (e.g., eye tracking, electroencephalography, and facial expression recognition) were employed to assess the users' experience with the chatbot. The study revealed that the automatic usability assessment procedures had no significant effect on the quality of interaction in terms of user experience, except for emotions.
Studies investigating the factors that affect the usability of chatbots in a university setting remain scarce. Very few studies have investigated the usability of chatbots for university purposes. For instance, Von Wolff et al. (2020) evaluated a chatbot they developed for the acquisition of information at a German university. They distributed a questionnaire that contained three parts: (1) general questions about the participant; (2) questions about the students' current or previous procedure for acquiring information and their satisfaction with it; and (3) questions about their experience with and valuation of chatbots, as well as topics to support and issues to answer (e.g., "How would you rate the characteristics of a chatbot?").
Two related studies described the functionalities of chatbots intended for university use (Dibitonto et al., 2018; Patel et al., 2019). Dibitonto et al. (2018) presented their initial findings on the design and implementation of "LiSA" (Link Student Assistant), a mobile-based chatbot that helps students by providing information about a university, including enrollment, scholarships, opportunities, events, and schedules. In a similar study, Patel et al. (2019) described the infrastructure and functionalities of their web-based chatbot, called Unibot. Unibot contains information regarding departmental syllabi, events, admission procedures and fees, basic university details, class timetables, and important circulars. Neither study evaluated the usability of its chatbot or determined which factors could influence that usability.

Synthesis, Definitions of Research Variables, and Null Hypotheses
Two studies reported the capabilities of chatbots used in university settings (e.g., Dibitonto et al., 2018; Patel et al., 2019). The two studies did not investigate the factors that could influence the usability of the chatbots. This leaves a gap in the literature on how to develop a usable chatbot for a university setting. To address this gap, AskRed was developed. Moreover, a researcher-made instrument was used in this study to capture the nature and context of use of the chatbot being investigated. Furthermore, the self-created instrument was used because the current study aims to identify which specific design-related factors of a chatbot could influence its usability.
Numerous variables can be considered in evaluating the usability of a system. Taking the nature of the software being investigated into account, only five design factors were considered in this study. One of these variables is accuracy. The ability of the chatbot to provide accurate results for students' queries is a critical design component of the software. Students may not continue using the chatbot if the information provided to them is inaccurate. However, it is not enough for the chatbot to produce accurate results. The results must also provide answers that are complete, comprehensive, and up-to-date. Therefore, completeness was another design factor considered.
Performance is another suitable criterion for evaluating AskRed. The goal of the performance criterion is to determine whether the software can execute inquiries via text or voice inputs and perform its intended functions. The results of the inquiries must be provided as promptly as possible. If the answer to a query is not in the database, the software must be able to direct users to the appropriate department for a response. This feature must be taken into account when designing a chatbot.
The last factor considered is aesthetics. Although there are competing findings on whether aesthetics influences usability, this criterion was included considering the types of users, i.e., college students, who are relatively young (16 to 20 years old). Thus, the aesthetic appeal of the software may influence the usability of the chatbot.
Satisfaction was chosen as a usability indicator for two reasons. First, due to the technical aspects of their devices, students may not be able to rate the efficiency of the system. Second, effectiveness has been considered a design consideration rather than the result of a usable system (Bringula, 2016). As a result, only the students' satisfaction with the chatbot's features was evaluated. Meanwhile, as Alalwan (2020) points out, a satisfied user will use the software again. This study agreed with this finding and hypothesized that design factors might influence the reuse intentions of the students.
The synthesis of the related literature served as the basis for the selection of the research variables for this study. Their definitions and how they were adapted in this study are discussed below.
The following are the design-related factors:
1) Accuracy - Based on the studies of Arthur and Stevens (1992) and Yuniarto et al. (2018), this refers to the degree to which the processes of the software are timely and free from errors. The completeness component of accuracy was treated as a distinct factor in this study.
2) Aesthetics - This variable was adopted from the study of Lindgaard and Dudek (2002). Additionally, this study viewed aesthetics as a feeling of ease when using the software.
3) Completeness - This was adapted from Arthur and Stevens (1992) and operationally defined as the ability of the software to provide the requested information.
4) Performance - The software can perform its intended functions and meet the user's goals. This was adapted from Schmidt et al. (2009).
5) Responsiveness - This variable was based on the definitions of Kleijnen et al. (2007) and Ho and Lee (2007).

Usability consisted of two indicators:
1) Satisfaction with use - This study adopted the definition of Zhang and Adipat (2005) and ISO (1998).
2) Intention to reuse - This concept was adapted from Alalwan (2020), Nigam (2012), and Oliver (1999). It refers to the user's inclination to use the software again.
This study tested two null hypotheses:
H0a: Design-related factors do not influence the usability of the software in terms of satisfaction.
H0b: Design-related factors do not influence the usability of the software in terms of intention to reuse.

The AskRed Architecture
The literature review shows a plethora of studies advocating various usability and system design considerations. Performance, accuracy, responsiveness, aesthetics, and completeness were purposively selected as design considerations for AskRed because they were deemed appropriate for its development. This section describes the software and the knowledge database.

Software Description
AskRed can provide information related to the university. The information is categorized into 22 areas (e.g., admission requirements and procedures, enrollment/registration requirements and procedures, school fees, etc.). These areas were based on the initial study by Lizaso et al. (2016). The content of AskRed is more comprehensive than that of existing studies (e.g., Dibitonto et al., 2018; Patel et al., 2019) and extends the use cases of chatbots (Hussain, 2018). A user may input a text or voice query or command in English into the system (Table 1). An automatic speech recognition system converts a voice query into text (Figure 1); the Google Speech-to-Text API is utilized for this conversion (Hinton et al., 2012). All textual inputs are then fed into the Query Classifier (QC), which parses the text. The parsed texts are then analyzed in the Question-Answering module, where they are used to search for answers in the database. The Question-Answering module is based on the OpenEphyra framework (Ferrucci et al., 2010; Figure 2). If the answer is not in the knowledge database, the question is sent to a knowledge expert, i.e., any authorized university employee. For example, if the question pertains to an examination schedule, it is sent to the Office of the Registrar. The response of the knowledge expert is then saved to the knowledge database, and the answer is sent to the student. A sample input-output interface is shown in Figure 3.
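To make this flow concrete, the following Python sketch mirrors the steps described above. It is illustrative only: all names (KnowledgeDB, speech_to_text, classify_query, handle_query) are hypothetical stand-ins for the Google Speech-to-Text API, the QC, and the OpenEphyra-based Question-Answering module, and the in-memory dictionary stands in for the actual knowledge database.

```python
# Minimal, self-contained sketch of the AskRed query flow.
# All names are hypothetical stand-ins for the components named in the text.

class KnowledgeDB:
    """In-memory stand-in for the knowledge database."""
    def __init__(self):
        self.answers = {"school policies": "See the Student Manual, Section 2."}

    def search(self, query):
        # Return the stored answer whose key appears in the query, if any.
        for key, answer in self.answers.items():
            if key in query:
                return answer
        return None

    def save(self, query, answer):
        self.answers[query] = answer


def speech_to_text(audio):
    # Stand-in for automatic speech recognition (Google Speech-to-Text
    # in the actual system); here the "audio" is already a string.
    return audio


def classify_query(text):
    # Stand-in for the Query Classifier (QC): normalizes the input so it
    # can be matched against the knowledge database.
    return text.lower().strip(" .?!")


def handle_query(raw_input, input_type, db, route_to_expert):
    # 1. Voice queries are converted to text first.
    text = speech_to_text(raw_input) if input_type == "voice" else raw_input
    # 2. The QC parses the text; 3. the QA module searches the database.
    query = classify_query(text)
    answer = db.search(query)
    if answer is not None:
        return answer
    # 4. Unanswered questions go to a knowledge expert (e.g., the Office
    #    of the Registrar); the reply is saved and sent back to the student.
    expert_answer = route_to_expert(query)
    db.save(query, expert_answer)
    return expert_answer


db = KnowledgeDB()
print(handle_query("Show me the school policies.", "voice", db,
                   route_to_expert=lambda q: "Forwarded to the Registrar."))
```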

Knowledge Database
An initial survey was conducted to build the initial knowledge database of AskRed (Lizaso et al., 2016). The survey used a self-administered questionnaire built with Google Forms, which was distributed to the official Facebook groups of the different colleges of the university. Four hundred students answered the form. Respondents were asked to rate the importance of information relating to university services (e.g., admission procedures, enrollment procedures, etc.) on a five-point Likert scale, where 1 (not important) denoted the most negative response and 5 (very important) the most positive. Based on the survey, the top five important pieces of information were related to school policies and regulations, scholarships, examination dates, on-the-job training and graduation information, enrollment/registration requirements and procedures, and class suspensions and holidays. In addition, general information about the university (e.g., its history and administration officials) was included in the knowledge database. The pieces of information in these categories were collected from the different departments or official records (e.g., the student manual) and manually encoded into the system.

[Table 1 lists the types of inputs, sample inputs, and process actions; for example, the voice command "Show me the school policies."]

The Research Design, Locale, Sample Size, Sampling Design, and Participants
This quantitative-descriptive study evaluated the design-related factors of the software and determined whether these factors influenced its usability (Bogers et al., 2019). The software was utilized at one university in Manila, which served as the research locale of the study. Using Soper's calculator with the following parameters: effect size = 0.15, power level = 0.80, number of predictors = 5, and probability level = 0.05, a minimum sample size of 91 was computed (Soper, 2021). Twenty participants were selected from each of the six colleges of the university through convenience sampling. They were chosen through their classroom assignments: classrooms were written on pieces of paper and drawn at random. The researchers visited each selected classroom and requested that the students (that is, the participants) take part in the study. Participation was entirely voluntary, and the participants received no incentives or demerits. One hundred students participated in the study, and the data collected from these 100 participants were used in the analysis. The respondents were mostly male (54%) and third-year students (40%), with an average age of 18.8 years. The number of students who were invited and who participated in the study is shown in Table 2.
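As a sanity check, the a-priori computation can be reproduced numerically. The sketch below uses Cohen's noncentrality convention for the multiple regression F test; calculators differ slightly in this convention, so the result may differ from Soper's figure by a case or two. The function name is ours.

```python
# A-priori sample size for multiple regression: f^2 = 0.15, alpha = 0.05,
# power = 0.80, 5 predictors. Uses Cohen's convention lambda = f^2 (u + v + 1);
# other calculators use slightly different noncentrality definitions.
from scipy import stats

def min_sample_size(f2=0.15, predictors=5, alpha=0.05, target_power=0.80):
    u = predictors                      # numerator degrees of freedom
    n = u + 2                           # smallest n with positive error df
    while True:
        v = n - u - 1                   # denominator (error) degrees of freedom
        lam = f2 * (u + v + 1)          # noncentrality parameter
        f_crit = stats.f.ppf(1 - alpha, u, v)
        power = stats.ncf.sf(f_crit, u, v, lam)  # P(F' > F_crit)
        if power >= target_power:
            return n
        n += 1

print(min_sample_size())  # close to the reported minimum of 91
```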

The Research Instrument
A self-made survey form was the research instrument (Ren et al., 2019). Google Forms was used to construct and distribute the survey. The survey form has three parts. The first part collected the demographics of the students in terms of age, year level, and gender. The second part measured the design-related factors of the software in terms of completeness, performance, accuracy, aesthetics, and responsiveness. The third part measured the usability of the software through satisfaction and intention to reuse; in this study, the intention to reuse was considered a component of the usability indicator. A five-point Likert scale was utilized to answer the items in the second and third parts of the survey form (Table 3). A faculty member with usability research experience and an information technology practitioner validated the content of the survey form (Balsa et al., 2020); they also served as expert evaluators. Furthermore, the survey form was pilot-tested with 40 respondents. The pilot testers were also students at the university but were not part of the actual survey. Factor analysis revealed that the factors were valid (factor loading ≥ 0.50) (Pallant, 2001); an item is said to load highly on a factor if its factor loading is at least 0.40 (Pituch & Stevens, 2016). Cronbach's alpha analysis disclosed that all items were acceptable (α ≥ 0.45) (Taber, 2018). A lower Cronbach's alpha threshold was used because there is not yet an established questionnaire for intelligent agents employed in an academic setting, and low levels of alpha may still be useful depending on the context of the study (Hair et al., 2016). For usability studies, low Cronbach's alpha values are acceptable when the selected variables are deemed to have practical contributions to explaining usability (Schrepp, 2020). This is consistent with Schmitt (1996), who indicated that there is no agreed-upon level of acceptable (or unacceptable) alpha. The items, factor loadings, and Cronbach's alpha values of the survey form are shown in Table 4. The variance inflation factor (VIF) was used to detect multicollinearity among the factors. All VIFs were less than the threshold value of 10, indicating that all factors were independent of one another (Hair et al., 1995).
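For readers who wish to replicate these instrument checks, the following sketch computes Cronbach's alpha for one scale and the VIF for each design factor. The data here are simulated and the column names are hypothetical; in the actual study, the columns would hold the pilot-test and survey responses.

```python
# Sketch of the instrument checks reported above: Cronbach's alpha per scale
# and variance inflation factors across the five design factors.
# Data are simulated for illustration; column names are hypothetical.
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

def cronbach_alpha(items: pd.DataFrame) -> float:
    """items: one column per questionnaire item of a single scale."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(0)

# Simulated 5-point Likert items that share a common component,
# mimicking one scale (e.g., performance) from the pilot test (n = 40).
base = rng.integers(2, 5, size=(40, 1))
items = pd.DataFrame(np.clip(base + rng.integers(-1, 2, size=(40, 3)), 1, 5),
                     columns=["perf_1", "perf_2", "perf_3"])
print("Cronbach's alpha:", round(cronbach_alpha(items), 2))

# Simulated per-respondent scale scores for the five design factors (n = 100).
factors = pd.DataFrame(
    rng.normal(3.5, 0.5, size=(100, 5)),
    columns=["performance", "accuracy", "responsiveness",
             "aesthetics", "completeness"],
)

# VIF per factor; values below 10 suggest no multicollinearity
# (Hair et al., 1995). A constant column is included, as is standard.
X = np.column_stack([np.ones(len(factors)), factors.values])
for i, name in enumerate(factors.columns, start=1):
    print(name, round(variance_inflation_factor(X, i), 2))
```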

Data Gathering Procedure
The students used the software for five consecutive days at their convenience. This was done to simulate an authentic setting, that is, a point in time and place when a student requires information and the software is the only source (or provider) of information available. They were requested to create an account within the software using their smartphones. An orientation on the use of the software was provided, in which students were given an overview of the software and its functionalities. They were asked to use text and voice search commands, and five sample queries were given (see Table 1 and Figure 3). They were also instructed to pose their own queries to test the capability of the software. After using the software for five consecutive days, the students rated it using the survey form. The survey form was automatically sent to their accounts once they completed the five-day testing; participants needed to complete at least five queries (one query per day) before the survey form automatically opened. The software was also subjected to expert evaluation. In this process, the expert evaluators used and analyzed the software for at least an hour and provided textual feedback using the same criteria as the design-related factors. Their consensus comments were written on a piece of paper. One of the researchers analyzed, categorized, and labeled the comments. The result of this process was presented to the group members, and the labels were revised until a consensus was reached.

Statistical Treatment of Data
Frequency counts, means, and standard deviations were used to describe the data. Multiple regression analysis was employed to determine which of the design-related factors influenced the usability of the software. A 0.05 level of significance was used as a threshold value to determine the significant predictors of the usability of the software.
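The analysis described above can be reproduced with any standard statistics package. The sketch below uses statsmodels OLS on simulated, standardized scores, so the coefficients are directly comparable betas like those reported in Tables 7 and 8; variable names and data are illustrative only.

```python
# Sketch of the regression analysis: a usability indicator regressed on the
# five design-related factors. Data are simulated; in the study, the rows
# would be the 100 respondents' scale scores.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
data = pd.DataFrame(
    rng.normal(3.5, 0.5, size=(100, 6)),
    columns=["performance", "accuracy", "responsiveness",
             "aesthetics", "completeness", "satisfaction"],
)

# Standardize so coefficients are read as betas (standard-deviation units).
z = (data - data.mean()) / data.std(ddof=1)

X = sm.add_constant(z[["performance", "accuracy", "responsiveness",
                       "aesthetics", "completeness"]])
model = sm.OLS(z["satisfaction"], X).fit()

print(model.summary())                      # betas, p-values, Adj. R^2, F-test
print(model.pvalues.drop("const") < 0.05)   # significant at the 0.05 level
```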

Design-Related Factors and Usability of AskRed
Completeness had the highest mean rating of all the design-related factors, while aesthetics had the lowest (Table 5). Nevertheless, the respondents agreed that the software was developed in line with the design-related factors. The respondents were satisfied with the use of the software and expressed a desire to use it again. The standard deviations were less than one, implying that the ratings were not widely dispersed. Table 6 shows the experts' qualitative evaluation of the software. The software can still be improved in the areas of security, performance, ease of use, completeness, and platform. In terms of security, the evaluators agreed that only officially enrolled students should have authorized access to the software. Another security consideration was displaying only the offices relevant to students' information needs; for example, a payroll department does not hold information relevant to students. Another comment involved notifying the students when the knowledge experts had responded to their pending questions. Ease of use also emerged from the comments; it refers to the suggestion to include a list of frequently asked questions.

Regression of AskRed Usability on Design-Related Factors
Four of the five design-related factors were significant predictors of satisfaction (Table 7). Performance was the strongest predictor of satisfaction. Responsiveness had a negative influence on satisfaction: a one standard deviation increase in responsiveness corresponds to a 0.201 standard deviation decrease in satisfaction. The four significant predictors explained 73% (Adj. R² = 0.73) of the variation in satisfaction, and the result of the regression analysis was unlikely to have arisen from sampling error (F-value = 53.881, p < 0.05). Accuracy was not a significant predictor of satisfaction. Hence, the first null hypothesis is partially rejected.

Table 8 displays the results of the multiple regression analysis of the intention to reuse on the design-related factors. Performance and completeness were significant predictors of the intention to reuse the software, with completeness being the stronger of the two. More than 50% (Adj. R² = 0.58) of the variation in the intention to reuse the software was attributed to the performance and completeness of the software, and the results were unlikely to be due to sampling error (F-value = 28.732, p < 0.05). Accuracy, responsiveness, and aesthetics were not significant predictors of the intention to reuse. Hence, the second null hypothesis is partially rejected.

Table 6. Experts' evaluation comments by category

Security:
• "Only officially enrolled students should have access to the software."
• "Only offices directly related to the needs of students should be provided."
• "Provide an information kiosk within the campus."
• "Provide different layers of users: students, parents, and guests. This will add security to the system."
• "Generate a one-time password as an added security feature."

Performance:
• "Provide notifications for responses received from knowledge experts."

Ease of Use:
• "Make a list of frequently asked questions."
• "Classify the data gathered from experts and display it appropriately."

Completeness:
• "Consider guest users, such as parents and prospective enrollees, in the system in future research."

Discussion
This study determined the usability of a mobile-based intelligent agent named AskRed and established whether design-related factors predicted that usability. As shown in Table 5, completeness received the highest mean rating, while aesthetics received the lowest (but still acceptable) rating. This finding reflects the ability of the software to answer the students' questions completely, and the software provided a pleasant user experience. Similarly, the accuracy aspect of the software received a favorable rating. The positive ratings on completeness and accuracy can be attributed to the comprehensive and up-to-date knowledge database of the software. The initial survey was found to be a helpful step in the development of the knowledge database; an initial survey of students is therefore a necessary step to ensure a comprehensive and accurate knowledge domain for a university-based intelligent agent. In addition, the acceptable rating of the software can be attributed to its ability to update its knowledge. The software allows knowledge experts to input information that is not yet included in the knowledge database. This mechanism builds the knowledge of the intelligent agent, which, in turn, boosts its performance and responsiveness.
The usability of the software received positive feedback from the respondents: both satisfaction and the intention to reuse were positively rated. The respondents were pleased with the capabilities of the software, were willing to recommend it to their fellow students, and agreed that they would use the software again. However, some comments from the software evaluators are worth noting. The evaluators emphasized improving the software in relation to security, performance, completeness, and ease of use. Email notification, an item of the performance design consideration, was also raised in the evaluation. Despite these drawbacks, the software received an overall favorable rating.
Multiple regression analysis explains the relationship between the design-related factors and the usability of the software. Performance, aesthetics, and completeness had a positive effect on satisfaction, with performance as the strongest predictor. This finding not only confirms prior studies (McNamara & Kirakowski, 2006; Schmidt et al., 2009) but also shows that the best predictor of the usability of information-provider software like AskRed lies in its ability to perform its intended functions. Additionally, the completeness of the information could also influence satisfaction with the software, thereby confirming the study of Zhang and Adipat (2005). Along with conformance to the intended performance and completeness of the software, it is equally important to design the software with visual appeal. This finding agrees with the studies of Tractinsky et al. (2000) and Silvennoinen et al. (2014).
Meanwhile, responsiveness was the only significant predictor with a negative impact on satisfaction, contradicting the studies of Ho and Lee (2007) and Kleijnen et al. (2007). This negative impact, and the contradiction with existing studies, can be attributed to the context of the software. Even if the information is not readily available in the knowledge database, the software can still respond to the students' queries; for example, it may inform the students that it has no available answer yet (a "do not know" response) or that their inquiries will be forwarded to the respective department (a "redirecting" response). These responses are not helpful, as they indicate that the information needs of the students are not being met and that the desired information is not being provided instantly.

Equally important are the results of the regression analysis of the intention to reuse the software on the design-related factors. Performance and completeness had a positive influence on the intention to reuse; in this regression, completeness was the stronger predictor. Apart from this, aesthetics was no longer a contributing factor to the intention to reuse the software, which is consistent with the prior work of Tuch et al. (2012).
The study did not provide empirical evidence on the relationship between the accuracy and usability of the software. The importance of accuracy in the design of intelligent agents cannot be dismissed outright. Accuracy might have an indirect impact on usability. Therefore, other statistical tests (e.g., Kim et al., 2020) can be employed to uncover this unexplored relationship.
The regression results offer clear contributions to intelligent agent usability studies and development. First, the usability of intelligent agents tends to decrease when they fail to provide definite and instant information. To avoid this pitfall, intelligent agent developers are encouraged to make the knowledge databases of intelligent agents comprehensive. However, it is not suggested to exclude the "do not know" or "redirect" responses of intelligent agents since, at the very least, they provide feedback on the query. Instead, it is suggested to limit the chances of users encountering these responses. Strengthening the completeness of the knowledge database of the software can address this design issue.
The second contribution of the paper is the identification of predictors of the usability of intelligent agents employed in the context of a university setting. Previous studies (e.g., Dibitonto et al., 2018; Patel et al., 2019) reported the development of chatbots in a university setting but did not evaluate the impact of design features on the usability of the chatbots. In this study, it was found that both performance and completeness are consistent predictors of satisfaction and intention to reuse, but to varying degrees: performance is the primary predictor of satisfaction, while completeness is the strongest predictor of intention to reuse. Therefore, intelligent agent developers should focus on both design considerations. From a theoretical perspective, the use of both variables as indicators of usability is encouraged, as multiple usability indicators are needed for a better understanding of the usability of an intelligent agent.
Another contribution of the paper is extending the use cases of campus chatbots proposed by Hussain (2018). This study, along with the studies of Dibitonto et al. (2018) and Patel et al. (2019), confirms that chatbots are indeed information providers. To become a dependable information provider, a comprehensive database of answers should be incorporated into the system.
Lastly, this study clarifies the conflicting pieces of evidence on the effects of aesthetics on usability. In the context of the usability of intelligent agents, satisfaction depends on the aesthetic components of the intelligent agent; the intention to reuse, on the other hand, does not. In other words, depending on which usability dimension is measured, aesthetics may or may not influence the usability of the software.

Limitations
This research is subject to several limitations. The first limitation is the selection of independent variables. The two regression models explained 73% of the variance in satisfaction and 58% of the variance in intention to reuse, implying that other variables not included in this study may increase the models' predictive power. The usability attributes listed by Zhang and Adipat (2005) may be considered in future studies. Another limitation is the statistical tool used in the study. Other statistical tests may establish the relationship between the accuracy and usability of an intelligent agent; researchers are therefore advised to explore other statistical tests in future usability studies on intelligent agents in a university setting.

Conclusion, Recommendations, and Future Works
This study evaluated the design-related factors and usability of AskRed, a mobile-based intelligent agent for university services. Based on the findings, the software received favorable ratings, indicating that it was well designed with respect to the design-related factors. In addition, multiple indicators of usability are needed to obtain a full picture of the usability of intelligent agents. The conflicting results regarding the influence of aesthetics on usability are clarified in this study: aesthetics may or may not influence usability depending on which indicators of usability are measured.
As the consistent and strongest predictors of usability indicators, performance and completeness should be the topmost design factors in the development of intelligent agents for university-related services. Negative responses (e.g., "do not know" and "redirect" feedback) are important design aspects of the software. However, the chances of experiencing these forms of feedback should be minimized to maintain the acceptable usability status of the software. Strengthening the comprehensiveness of the knowledge database through an initial survey can address this issue.
The findings of the study inform chatbot developers that the usability of chatbots intended for university services is influenced by specific design factors (e.g., performance, responsiveness, aesthetics, and completeness). Other design features, such as security and ease of use, may also be considered. The survey instrument in this study could therefore serve as a guideline in the development of other mobile-based chatbots for university settings. Furthermore, chatbot developers should note that other variables not considered in the study (e.g., security and ease of use) could also influence the usability of a chatbot.
Future work may address the limitations of the study. Language input is one design limitation of the software: it can only accept inputs in English. Future research may develop an intelligent agent capable of accepting voice or text inputs in users' native languages (Bogers et al., 2019). The expert evaluators' design considerations may be incorporated into future designs of intelligent agents for a university setting. Other usability tests and analyses are also encouraged (Balsa et al., 2020; Ren et al., 2019); for example, future studies may collect and analyze interface-interaction data. Lastly, an examination of the relationship between accuracy and the usability indicators using path analysis is encouraged.