A Systematic Review of Gamification and Its Assessment in EFL Teaching

The aim of this study is to examine the satisfaction of EFL teachers with gamification platforms as well as to investigate how EFL teachers perceive gamification and its effects on pupils’ motivation and learning outcomes. Five major databases (ERIC, Scopus, WoS, EBSCO, ProQuest) and Google Scholar were used to search for relevant studies. The study followed the PRISMA guidelines and the PICO framework. Inter-rater reliability analyses were performed for both study selection and study quality assessment. Eleven relevant quantitative or mixed studies were identified. The findings indicate that EFL teachers perceived a positive effect of gamification on pupils’ motivation and are satisfied with the applicability of gamification platforms. The findings revealed that internet and technology issues and a lack of teachers’ skills are the most prominent negative factors when implementing gamification. Further experimental research is needed to provide evidence of the EFL teacher-perceived effectiveness of gamification on learning outcomes.


INTRODUCTION
The modernisation of English as a foreign language (EFL) education has been fundamentally intertwined with the rapid technological development of digital content, which has been frequently utilised to enhance the learning process and promote positive learning experiences. One of these technological innovations that could increase the effectiveness of EFL teaching is gamification. Gamification refers to a methodical strategy that facilitates a gameful experience to motivate and engage users (Hamari, 2019). Gamification utilises leaderboards, digital badges, achievements or in-game digital currency to reward users for their efforts and provide valuable feedback (Flores, 2015; Kapp, 2012). Nowadays, gamification elements are often used to amplify the effectiveness of other innovative language learning methods that utilise modern technology, such as mobile-assisted language learning (MALL), computer-assisted language learning (CALL) as well as an assortment of learning management systems facilitating blended learning and flipped classrooms (Chwo et al., 2018; Dicheva et al., 2015; Somova & Gachkova, 2022). Applications such as Duolingo, Kahoot! or Quizizz have incorporated gamification into their core designs and found great success in EFL teaching. Moreover, gamification has been implemented not only for learning new materials or revising old ones, but also for providing assessments and obtaining feedback from learners via digital response systems (Tan & Saucerman, 2017).
However, despite all the potential practical uses of gamification in EFL learning that have been investigated over the past decade, significant knowledge gaps still exist in how EFL teachers perceive gamification and gamification platforms. In this regard, Zhang and Hasim's (2023) systematic review highlighted the dire need for further research to explore the perspectives on gamification applications in EFL education. This need is compounded by the absence of studies that take a comprehensive view of EFL teachers' perspectives across all gamification applications that are implemented in EFL teaching, and by the absence of a systematic review that provides a more rigorous overview of quantitative and mixed studies examining the effects of gamification from EFL teachers' perspectives, especially at lower secondary and secondary education levels. Additionally, previous systematic reviews often either limited their searches to a small number of databases or used quite a narrow list of search terms.
In this study, the authors' systematic review aimed to address these limitations and fill the critical knowledge gap in gamified EFL teaching. Therefore, the main objectives were:
1. Analysing teachers' satisfaction with gamification platforms in relation to their applicability in EFL teaching.
2. Analysing teachers' perceptions of gamification in relation to learning outcomes in EFL teaching.
3. Analysing teachers' perceptions of gamification in relation to learners' motivation in EFL teaching.
Based on these objectives, the authors formulated the following research questions:
1. How satisfied are teachers with the applicability of gamification platforms in EFL teaching?
2. How do teachers perceive gamification in relation to learning outcomes in EFL teaching?
3. How do teachers perceive gamification in relation to learners' motivation in EFL teaching?
Several systematic and literature reviews have already explored the effectiveness of gamification from EFL learners' perspectives (Al-Dosakee & Ozdamli, 2021; Boudadi & Gutiérrez-Colón, 2020; Dehghanzadeh et al., 2019; Ortiz et al., 2017). The results of reviewed studies suggest a positive effect of gamification on learners' motivation and engagement. On the other hand, there is a significant lack of clear connection between gamification and learning outcomes. These findings are also supported in Kaya and Sagnak's (2022) and Zhang and Hasim's (2023) recent systematic reviews, where most studies reported a positive effect of gamification on learners' language skills, attitudes, and motivation, and found that gamification provides an authentic learning environment. This also applies to systematic reviews of studies on the effectiveness of gamification in mobile- and computer-assisted language learning (Ishaq et al., 2021; Lin & Lin, 2019; Su et al., 2021), blended learning (Ramalingam et al., 2022), and flipped learning (Ekici, 2021), where positive effects of gamification were found on learners' motivation, academic performance, and engagement. Additionally, some systematic reviews provided a more detailed analysis of individual gamification platforms in EFL learning (Dehghanzadeh & Dehghanzadeh, 2020; Shortt et al., 2021; Wang & Tahir, 2020). Dehghanzadeh and Dehghanzadeh (2020) investigated the use of gamification across different applications and found that Duolingo and Kahoot! were used the most, adding that both applications support effective language learning. This is also supported by Shortt et al. (2021), who found a positive relation between the use of Duolingo and learners' academic performance.
On the other hand, some systematic and literature reviews have taken English teachers' perspectives into consideration, for instance Lim and Yunus's (2021) systematic review. Their findings suggest that EFL teachers generally perceive the gamification platform Quizizz positively on the basis of its perceived effectiveness, ease of use, and motivating capabilities. However, the study also suffers from several limitations, the most important of which is the limited range of databases used. As to literature reviews, only two studies were found that provided a brief outline of publications in the area of gamification of EFL teaching, namely Singh et al. (2020) and Degirmenci (2021). Singh et al.'s (2020) literature review revealed that most studies examined EFL teachers' perspectives qualitatively. Nonetheless, the results of both studies suggest that EFL teachers generally agree that gamification platforms such as Quizizz provide an engaging environment that motivates learners and improves their attitudes towards language learning.

Protocols and Preregistration
To ensure a high quality of the systematic review, the authors followed the standardised guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (Page et al., 2021). The population, intervention, control, and outcomes (PICO) review framework was used to formulate the literature search strategy, research questions, and eligibility criteria (Davies, 2011). The core participants for this study were EFL teachers from lower secondary to higher education levels. The intervention in question was gamification in comparison to nongamified EFL teaching. The outcomes of interest were the self-reported scores of individual scales/questionnaires measuring the perceived effectiveness of gamification and EFL teachers' satisfaction. A preregistration was made on the Open Science Framework Web site on April 28th, 2022, adopting the preregistration form of the standard international prospective register of systematic reviews, PROSPERO (Centre for Reviews and Dissemination, University of York, n.d.), to ensure the validity of the systematic review (Helvich et al., 2022). Additionally, all study code, data, and supplementary materials as well as the completed PRISMA checklist were uploaded with free public access to the Open Science Framework Web site.

Inclusion Criteria
The authors included studies conducted at the lower secondary to higher education levels, which corresponds to learners aged 11-26, coinciding with preintermediate to advanced English levels (A2-C1). Additionally, they included studies that analysed only the effect of digital gamification on nonnative English teaching; in other words, digital gamification of EFL teaching. Regarding the second group of criteria, only peer-reviewed quantitative and mixed English-written studies published in journals or conference proceedings between 2011 and the day of the search, May 1st, 2022, were included, as stated in the preregistration form. Concerning the research design, the authors included quantitative and mixed studies that exclusively used English questionnaires/scales examining EFL teachers' perspectives.

Exclusion Criteria
The authors excluded studies conducted at preschool and primary education, which correspond to learners younger than 11, known as young learners, pertaining to beginner and elementary English levels. Seeing that the systematic review aimed to analyse studies where gamification was used in more versatile and intricate teaching applications, the authors decided to exclude studies that focused on the initial stages of English teaching. For a similar reason, studies that focused on andragogy were also excluded. Additionally, the authors excluded qualitative studies, non-peer-reviewed studies, studies published earlier than 2011, studies written in languages other than English, and studies that did not focus on EFL teaching. The systematic review did not include books, book chapters, theses or other systematic reviews. Furthermore, the authors excluded studies that did not use English questionnaires/scales or that focused on assessments of learners' or parents' perspectives.

Databases and Search Strategy
In order to search for relevant studies, five major databases were used: ERIC, Scopus, WoS, EBSCO, and ProQuest. The authors decided to limit the year of publication to 2011 onwards, in connection with the first gamification summit held in San Francisco in 2011 and the Gartner Research Hype Cycle for Emerging Technologies published in 2011 (Fenn & LeHong, 2011). These databases were chosen based on their ranking, broad coverage of studies, and retrieval optimizations. Google Scholar was also included as a secondary source, as it is able to retrieve relevant studies outside the selected databases. According to Bramer et al. (2017), Google Scholar, combined with other databases, performed best and guaranteed efficient coverage. However, due to its capricious recall and limited search interface, the authors decided to restrict its use to increase the reproducibility of the search. Therefore, only the first 300 Google Scholar search results were included, based on Haddaway et al.'s (2015) utility analysis of Google Scholar. According to their findings, academic journal papers appear sooner in search results than any other type of publication. However, Google Scholar has another significant limitation regarding its search line capacity, which does not allow searches that exceed 32 search words, including Boolean operators. The authors searched individual databases using selected search terms and Boolean operators adjusted to the search interfaces of the databases, but a different search strategy had to be designed for Google Scholar. The authors also tracked the use of advanced search filters and limiters in databases.

Search Terms and Syntaxes
In order to cover as many relevant search term variations as possible, the authors used a preliminary search, thesaurus.com, and merriam-webster.com as primary resources to scan for synonyms. Additionally, they included other keywords closely related to the field of study of their systematic review, terms such as MALL, CALL, or game-based learning. Although referring to rather diverse types of technology-based approaches, these terms occasionally overlap with the term gamification or, in some cases, are even used interchangeably. For this reason, the authors extended the search term list to encompass a broader range of related studies. Moreover, due to the large scope of existing gamification platforms, the authors decided to isolate the currently most prominent gamification platforms and cluster the other less prominent ones under the previously stated search terms. In the case of Google Scholar, condensed search term lists were used and two searches were conducted to circumvent Google Scholar's word count limitation. For the databases ProQuest, Scopus, and EBSCO, the authors selected title, abstract, and keywords to scan for search terms, and for Web of Science, they scanned all fields. In the case of ERIC and Google Scholar, neither of the interfaces provides such broad search settings.
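The 32-word ceiling on Google Scholar searches can be checked mechanically before a search is run. The following Python sketch is purely illustrative and does not reproduce the authors' search syntax. The term lists and the helper names (count_words, build_query, split_queries) are hypothetical, as is the assumption that every whitespace-separated token, Boolean operators included, counts as one word. The sketch greedily packs synonyms into as few queries as fit under the limit.

```python
MAX_WORDS = 32  # limit reported in the text, Boolean operators included


def count_words(query: str) -> int:
    """Count whitespace-separated tokens, ignoring parentheses and quotes.

    Assumption: this mirrors how the search interface counts words.
    """
    cleaned = query.replace("(", " ").replace(")", " ").replace('"', " ")
    return len(cleaned.split())


def build_query(group_a: list[str], group_b: list[str]) -> str:
    """Combine two synonym groups as (a1 OR a2 ...) AND (b1 OR b2 ...)."""
    return "(" + " OR ".join(group_a) + ") AND (" + " OR ".join(group_b) + ")"


def split_queries(group_a: list[str], group_b: list[str],
                  limit: int = MAX_WORDS) -> list[str]:
    """Greedily pack the terms of group_a into queries that respect the limit.

    A single term that is too long on its own is still emitted, so the
    result should be re-checked with count_words.
    """
    queries, current = [], []
    for term in group_a:
        candidate = current + [term]
        if current and count_words(build_query(candidate, group_b)) > limit:
            queries.append(build_query(current, group_b))
            current = [term]
        else:
            current = candidate
    if current:
        queries.append(build_query(current, group_b))
    return queries


# Hypothetical term lists, for illustration only.
intervention_terms = ["gamification", "gamified", "Kahoot", "Quizizz",
                      "Duolingo", "MALL", "CALL", '"game-based learning"']
population_terms = ["EFL", '"English teaching"', "teacher"]

for q in split_queries(intervention_terms, population_terms):
    print(count_words(q), q)
```

With longer synonym lists the function returns several queries, which corresponds to the idea of running more than one condensed search.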

Interrater Reliability Analyses
The authors explored interrater reliability in two steps. The first analysis was done after the study selection process between the authors JH and PM. The second analysis was performed after the study quality assessment between the authors JH and LN. Both analyses were executed in the R programming environment (The R Foundation, 2002). The degree of agreement in both steps was initially evaluated via percent agreement and Cohen's kappa using the irr package (Gamer et al., 2019) and the psych package (Revelle, 2016). Supplementary Material 2 (Helvich et al., 2022) provides further details.

Study Selection Process
The studies, including abstracts and full texts, were first imported into the program Zotero and then exported into an Excel spreadsheet for subsequent eligibility analysis. For the study selection process, two authors (JH, PM) independently evaluated the studies based on the exclusion and inclusion criteria and tracked the frequency of individual exclusion criteria. If the questionnaire/scale was not present in the full text, the authors of the particular studies were contacted by the research team via email and asked to provide the questionnaire/scale items. If the authors could not be reached, the study was excluded. This also applies to studies where full texts were not accessible. If discrepancies occurred during the study selection process, a discussion ensued, and the third author (LN) resolved the issue. Afterwards, the research team explored interrater reliability between the authors JH and PM.

Study Quality Assessment
For the study quality assessment process, the authors used the Quality Assessment Tool for Observational Cohort and Cross-Sectional Studies that the National Heart, Lung, and Blood Institute (NHLBI, 2021) developed in 2013. This tool was chosen due to its wide range of quality assessment criteria and its feasible applicability in the field of social studies. Ma et al. (2020) compared available systematic methodological risk of bias assessment tools and recommended implementing NHLBI's quality assessment tool for both cohort and cross-sectional studies. The tool contains 14 yes/no question items, with the third option not applicable/cannot determine/not reported if a particular criterion is not suitable for the quality assessment. The tool also includes a detailed manual that ensures the validity of the quality assessment process and provides a detailed explanation of each question item. Based on this, the authors excluded question item 8, as different levels of exposure to gamification had not been preestablished. Both authors JH and LN assessed the studies independently, and, if any discrepancies occurred, a discussion between the authors ensued. Then, each study received a final quality assessment rating (poor/fair/good) according to its final percentile score. If the total quality assessment score of a study was higher than the 0.75 percentile, the authors labelled the study as "good." They labelled a study as "fair" if its total score fell between the 0.50 and 0.75 percentiles, and as "poor" if its total score fell below the 0.50 percentile. Afterwards, the research team explored interrater reliability between the authors JH and LN.
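The percentile-based rating rule can be sketched in a few lines of Python. The text does not spell out how the percentiles were computed, so the sketch assumes that they are taken over the distribution of total scores across the included studies. It also assumes that a score lying exactly on a cut-off falls into the lower of the two adjacent categories at the 0.75 cut-off and into "fair" at the 0.50 cut-off. The function name rate_studies and the scores are hypothetical and are not the data of this review.

```python
from statistics import quantiles


def rate_studies(total_scores: dict[str, int]) -> dict[str, str]:
    """Label each study poor/fair/good from its total quality score.

    Assumption: cut-offs are the median and the third quartile of the
    scores of all rated studies (inclusive quantile method).
    """
    _, median, q3 = quantiles(sorted(total_scores.values()), n=4,
                              method="inclusive")
    ratings = {}
    for study, score in total_scores.items():
        if score > q3:
            ratings[study] = "good"   # above the 0.75 percentile
        elif score >= median:
            ratings[study] = "fair"   # between the 0.50 and 0.75 percentiles
        else:
            ratings[study] = "poor"   # below the 0.50 percentile
    return ratings


# Hypothetical total scores (number of "yes" answers) for eleven studies.
scores = {"S1": 2, "S2": 3, "S3": 3, "S4": 4, "S5": 4, "S6": 4,
          "S7": 5, "S8": 5, "S9": 6, "S10": 6, "S11": 7}

for study, rating in rate_studies(scores).items():
    print(study, scores[study], rating)
```

Under this reading, a rating is relative to the other included studies, so a study can be labelled "good" while still satisfying fewer than half of the tool's criteria.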

Data Extraction
For each study that met all the inclusion eligibility criteria, the authors extracted general information regarding the publication, such as title, authors' names, and year of publication. Afterwards, data concerning the study sample, such as the number of participants and other socio-demographic data (e.g., age, sex or country of origin), were also extracted. These also included information detailing the level of education at which the respondent group had been teaching. Regarding the questionnaires/scales, the authors extracted both the questionnaire/scale results and the conclusions of the authors of individual studies that measured the overall EFL teachers' satisfaction with the applicability of gamification platforms and the teacher-perceived effect of gamification on learners' motivation and learning outcomes.

RESULTS
In total, the authors identified 728 studies across all selected databases and Google Scholar, of which 98 were duplicates; 3 full-text records could not be retrieved. Ultimately, the authors selected 11 studies for quality assessment. Figure 2 shows the PRISMA flow diagram.

Study Selection
The authors JH and PM reached an agreement of 93% for the study selection, implying a substantial level of consensus. However, Cohen's kappa test indicated a surprisingly low degree of agreement: K = 0.26, 95% CI [0.11-0.40], p < 0.001. It is possible that such a small kappa value was the result of the first kappa paradox (Warrens, 2010; Zec et al., 2017), which might occur when the real degree of agreement is high (Gwet, 2008). Previous studies (Cibulka & Strube, 2021; Wongpakaran et al., 2013; Xie, 2013; Zec et al., 2017) indicated that an alternative agreement measure (i.e., Gwet's AC1 statistic) should provide valid results if the degree of agreement is in fact high. For this reason, the authors estimated the degree of agreement via Gwet's AC1 statistic using the irrCAC package (Gwet, 2019). The interpretation of Gwet's AC1 is similar to Fleiss' generalised kappa. Gwet's AC1 coefficient revealed a very high agreement between JH and PM: 0.93, 95% CI [0.90-0.95], p < 0.001. Taken together, the results of interrater reliability for the study selection process implied a high agreement between the raters.
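The divergence between Cohen's kappa and Gwet's AC1 under high agreement can be illustrated with a small worked example. The Python sketch below is not the authors' analysis, which was run in R with the irr, psych, and irrCAC packages. It applies the standard two-rater formulas to a hypothetical include/exclude table in which nearly all records are excluded by both raters. The counts and the function name agreement_stats are made up for illustration.

```python
def agreement_stats(table: list[list[int]]) -> tuple[float, float, float]:
    """Return (percent agreement, Cohen's kappa, Gwet's AC1).

    table[i][j] is the number of records that rater 1 put in category i
    and rater 2 put in category j (square table, two raters).
    """
    q = len(table)
    n = sum(sum(row) for row in table)
    p_o = sum(table[k][k] for k in range(q)) / n           # observed agreement
    row = [sum(table[k]) / n for k in range(q)]             # rater 1 marginals
    col = [sum(table[i][k] for i in range(q)) / n for k in range(q)]  # rater 2

    # Cohen's kappa: chance agreement from the product of the marginals.
    pe_kappa = sum(row[k] * col[k] for k in range(q))
    kappa = (p_o - pe_kappa) / (1 - pe_kappa)

    # Gwet's AC1: chance agreement from the averaged category prevalence.
    pi = [(row[k] + col[k]) / 2 for k in range(q)]
    pe_ac1 = sum(p * (1 - p) for p in pi) / (q - 1)
    ac1 = (p_o - pe_ac1) / (1 - pe_ac1)
    return p_o, kappa, ac1


# Hypothetical screening decisions: rows = rater 1, columns = rater 2,
# category order = (include, exclude).
hypothetical = [[10, 20],
                [20, 550]]

p_o, kappa, ac1 = agreement_stats(hypothetical)
print(round(p_o, 3), round(kappa, 3), round(ac1, 3))  # 0.933 0.298 0.926
```

Because almost every record falls into the "exclude" category, the chance agreement assumed by kappa is itself above 0.90, which leaves little room for agreement beyond chance and depresses the coefficient. AC1 assumes a much smaller chance agreement in this situation and therefore stays close to the observed percent agreement.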

Quality Assessment
The authors JH and LN reached an agreement of 96% for the quality assessment; Gwet's AC1 statistic also revealed a very high agreement between the two raters, with a coefficient value of 0.93, 95% CI [0.87-0.99], p < 0.001. Thus, the results of interrater reliability for the study quality assessment process suggested a high agreement between the authors. Supplementary Material 3 (Helvich et al., 2022) provides the quality assessment table, which includes individual question items, scores for respective studies, and the conclusive quality rating for each study. Figure 2 shows that the overall quality of the included studies is very low. Although the authors labelled three studies as "good", only one study reached half of the total quality assessment score of 14. Table 2 reveals that the majority of respondents across all studies were female teachers, mostly from Asia and the Middle East. Most studies examined teachers' perceptions of gamification in secondary education, but only a few were in lower secondary or higher education levels. Four studies examined teachers' perceptions across different education levels, but only some disclosed the percentages of participants at each level, and only two included higher education. Moreover, a number of studies omitted some of the key sociodemographic information such as sex or age, which consequently decreased the final quality assessment score.

SYNTHESIS AND DISCUSSION
The data synthesis has provided sufficient evidence regarding EFL teachers' satisfaction with gamification platforms, but it also revealed several external and internal limitations that may obstruct implementing gamification lessons. Moreover, the findings indicate a positive teacher-perceived effect of gamification on learners' motivation. However, the analysed studies did not provide sufficient evidence of the teacher-perceived effect of gamification on learning outcomes.

Satisfaction with Applicability
The applicability of gamification platforms is generally measured with regard to their perceived usefulness, practical implementation, and intrinsic limitations. However, these dimensions influence not only the overall teachers' satisfaction, but also one another. Asiri (2019) examined how perceived usefulness and positive attitude affect the EFL teachers' proclivity to implement gamification applications in their classes. The results revealed that positive perceptions of usefulness showed a high correlation with behavioural intention to use gamification applications, followed by a positive attitude with a medium relation level. One way to positively foster these aspects is through exposure and use. In Mahbub's (2020) research, 70.3% of the participants would recommend using the gamification application Kahoot! more often in EFL lessons after a year of use, and only 11.1% stated otherwise. This is supported by Portero and Rodríguez's (2022) study, where the majority of 95 preservice EFL teachers agreed that they would implement gamification as teachers in the future after the exposure. In Azar and Tan's (2020), Mahbub's (2020), Pham and Pham's (2022), and Qing and Halim's (2021) studies, most respondents agreed that using gamification platforms is fun and interactive, and creates pedagogical value. Rahayu and Wirza (2020) examined the applicability of gamification platforms during the COVID-19 pandemic and provided more details on the matter. Among the most used digital learning systems in the study, 78.5% used gamification. Out of 102 participants, 72.2% stated that they would use the digital learning systems in the future, as 80.4% deemed them useful for teaching, but only 39.2% maintained that digital learning systems make teaching easier. Nevertheless, 88.1% of participants saw themselves involved with the platforms, and 77.2% agreed to modify their lessons to take advantage of them. As to versatility, in both Azar and Tan's (2020) and Qing and Halim's (2021) research, the majority of the participants agreed that gamification can be implemented at different levels of knowledge and that gamification can be used as a supplement at all levels of education.
Concerning the last dimension, i.e., the intrinsic limitations, Boonmoh et al. (2021) and Qing and Halim (2021) provided a general overview of factors and challenges that negatively affect the implementation of gamification applications. Among the external factors, lack of facilities, Internet issues, and technology issues were the most prominent negative factors, whereas time constraints, overall workload, and learners' interest and readiness were less impactful. However, in Pham and Pham's (2022) study, about half of the respondents agreed that utilising gamification can be time demanding. In terms of internal factors, respondents rated teachers' skills and interest as the most disruptive. In Rahayu and Wirza's (2020) research, only 43.6% of the teachers regarded online learning systems as easy to use, with 47.6% of the respondents finding them unclear and hard to understand. In Bajasilova and Abdykhalykova's (2019) work, the respondents maintained that gamification applications such as Kahoot! are convenient for both teachers and learners, and only 5% stated that gamification applications have more disadvantages than advantages. Therefore, the findings indicate that EFL teachers are generally satisfied with the applicability of gamification platforms and would recommend their use. However, it is important to consider the negative factors limiting their implementation in EFL lessons.
In their systematic review, Lim and Yunus (2021) examined English teachers' perspectives on the gamification platform Quizizz and focused on three aspects connected to applicability: feasibility, difficulty, and willingness to use. Similar to the systematic review in this study, Lim and Yunus's findings implied that English teachers perceive the platform positively, mostly because it provides many benefits that facilitate learners' academic performance and create an engaging learning experience. This, of course, does not exclusively apply to Quizizz, but also to other gamification applications. Degirmenci's (2021) literature review revealed similar findings on Quizizz, as both language teachers and learners perceived the platform favourably and reported a positive effect on English learning and teaching. In their systematic review on Duolingo, Shortt et al. (2021) also reported a relatively high level of satisfaction and enjoyment when using the app. In this regard, the authors of this study bridged the data related to applicability and, in addition, provided evidence on EFL teachers' satisfaction across the gamification platforms. Dehghanzadeh and Dehghanzadeh (2020) shared a similar sentiment on Duolingo: they found Duolingo to be the most implemented gamification platform among the reviewed studies, Kahoot! being second, with most studies concluding that Duolingo made the foreign language learning process more effective and engaging. As to the limitations of use, Lim and Yunus's (2021) and Singh et al.'s (2020) results coincide with the findings in this study: Internet and technology issues were the most prominent external negative factors when implementing gamification, and teachers' skills and interest were the most voiced internal negative factors, especially among senior teachers. Dehghanzadeh and Dehghanzadeh (2020) and Zhang and Hasim (2023) also identified similar issues, with the majority of the studies reporting the Internet, logistics, and insufficient technology as the main challenges of using gamification platforms.

Perceived Effect on Motivation
One of the key incentives for implementing gamification in everyday lessons is its presumed effect on learners' motivation. In Qing and Halim's (2021) study, teachers agreed that the utilisation of gamification increases learners' motivation for language learning. This notion is supported by Portero and Rodríguez's (2022) research, where most participants confirmed that gamification increases learners' motivation and interest in the subject and fosters curiosity and a willingness to learn. Aldahash and Alenezi (2021) examined gamification and its effects on five dimensions, one of them being learners' motivation. Their findings suggested that gamification energises learners, increases learners' motivation toward learning, and encourages learners to achieve set goals. Furthermore, the findings implied that gamification fosters positive reactions toward learning and promotes an overall positive learning experience. Mahbub (2020) examined the teacher-perceived effect of Kahoot! on learners' motivation; a significant majority of respondents agreed that Kahoot! increased learners' motivation and engagement, facilitated the learning experience, and promoted collaboration. Furthermore, 92.6% of the teachers agreed that learners were more attentive while using the application. Bajasilova and Abdykhalykova (2019) also suggested that gamification applications such as Kahoot! have a positive effect on learners' motivation, rather than a negative one, and added that these applications are also suitable for motivating adult learners. However, only Pham and Pham (2022) and Toy and Buyukkarci (2020) examined the effect of gamification and gamification applications on learners' motivation when learning a specific language skill. All respondents in Toy and Buyukkarci's (2020) research agreed that using the gamification application Quizlet for vocabulary learning increased learners' motivation, and 90% saw Quizlet promoting the enjoyment of learning vocabulary. On the other hand, in Pham and Pham's (2022) study, 80% of the teachers agreed that gamification is designed to motivate, teach, and entertain, and that gamification is a motivating and fun approach to teaching grammar, particularly for weak learners.
Taken together, the data synthesis provided sufficient evidence of the positive effect of gamification on learners' motivation from EFL teachers' perspectives. Lim and Yunus (2021) also analysed data detailing how Quizizz motivates learners. Lim and Yunus concluded that English teachers see how Quizizz can potentially motivate learners to learn more effectively. Besides, they commented on other benefits, such as creating a fun and comfortable environment, increasing attention to the lesson content or establishing friendly competition among peers. This coincides with Dehghanzadeh et al.'s (2019) and Singh et al.'s (2020) conclusions, as both identified several studies that reported positive outcomes of gamification on learners' motivation and engagement. Zhang and Hasim's (2023) systematic review revealed that both learners and teachers held a positive attitude toward gamification, mainly because of its capabilities to increase learners' motivation and stimulate their interest and engagement in learning English. Al-Dosakee and Ozdamli (2021) and Kaya and Sagnak (2022) presented similar findings, as they reported a positive effect on learners' motivation, engagement, and overall enjoyment. These findings underscore the significance of the authors' data in this study, since both learners and EFL teachers have by now perceived the positive effect of gamification on learners' motivation in EFL classes. Nevertheless, further research is needed to determine how digital gamification is capable of influencing both the extrinsic and intrinsic motivation of EFL learners.

Perceived Effect on Learning Outcomes
The majority of the studies analysed, except for two, contained only a narrow scope of data on the teacher-perceived effect of gamification on learning outcomes. In Azar and Tan's (2020), Bajasilova and Abdykhalykova's (2019), and Mahbub's (2020) studies, most of the respondents agreed that gamification or gamification applications such as Kahoot! improve learners' academic performance and stimulate English language acquisition. In Toy and Buyukkarci's (2020) research, the respondents confirmed that gamification applications could help learners pick up new vocabulary and enable them to develop other language skills, such as writing, reading, and listening. However, only 10 out of 50 teachers maintained that Quizlet is suitable for developing speaking skills. Pham and Pham (2022) reported a similar notion, as 60.2% of the respondents agreed that gamification can help learners improve not only their grammar, but also other language skills. Aldahash and Alenezi (2021) investigated comprehension and creative thinking as two dimensions connected to learning outcomes. Their findings indicated that gamification helps the learners achieve their educational goals, increases their level of mastering the skills of the lesson, improves problem-solving skills, widens imagination, and helps learners innovate.
The analysed studies did not present enough conclusive evidence regarding teachers' perceptions of the effect of gamification on learning outcomes. Both Lim and Yunus (2021) and Singh et al. (2020) identified several studies that reported the effectiveness of gamification and gamification platforms on learning outcomes. However, their research is primarily descriptive, rather than experimental, and thus does not provide a causal link between the gamification elements and learning outcomes (Dehghanzadeh et al., 2019). Furthermore, even if this link were to be established, the data still do not clarify which gamification elements affect learning outcomes the most or which learning outcome is being most impacted by which gamification element. Shortt et al. (2021) also mentioned similar issues, as very few studies focused on how Duolingo contributed to improving learners' academic performance. Al-Dosakee and Ozdamli (2021) also supported this, as they identified insufficient data in relation to learning outcomes. Boudadi and Gutiérrez-Colón (2020) concurred; they stated that only a few studies established clear interconnections between gamification and learning outcomes. In this systematic review, the authors identified several studies examining how gamification affects learning outcomes from EFL teachers' perspectives, which revealed another research area that suffers from a lack of high-quality research.

IMPLICATIONS AND SUGGESTIONS
The systematic review uncovered several research gaps. The first is the absence of high-quality quantitative studies that would provide more compelling and reliable data on the subject matter. The analysed studies show that the research topic is relatively new, since the studies focused on EFL teachers' perceptions date back no further than 2019. Additionally, studies generally scored very poorly in the quality assessment, so the overall strength of evidence is low. Therefore, future research should adopt better study designs with more robust methodologies to avoid biases and thus improve the overall quality of studies. This includes clearly defining the study population and providing justification for sample size, power description or variance and effect estimates. Subsequently, the key potential confounding variables should be measured and adjusted statistically for their impact on the relationship between exposure and outcome. Furthermore, in order to accurately measure the teacher-perceived effects of gamification, it is important to measure the exposure before the outcome is measured. Lastly, the exposure should be assessed more than once to investigate how EFL teachers' perceptions change over time.
The questionnaires/scales in the analysed studies rarely provided in-depth examinations, especially concerning learning outcomes. This issue is often a consequence of poorly constructed assessment tools, which contain items that do not provide clear and definite data or are too general to imply any specific outcomes. Hence, valid and more reliable questionnaires/scales that would provide more accurate data are needed. In this systematic review, the authors identified studies with a multidimensional scope on applicability and motivation, but none that would assess the teacher-perceived effect of gamification on learning outcomes. Thus, more research is needed to measure the impact of gamification on specific language skills as well as other learning outcomes. Additionally, other more objective sources of data should be considered when examining the effect of gamification on learning outcomes, such as test scores and grades before, during, and after gamification use, which none of the analysed studies included in their study design.
Another aspect that researchers in previous studies have repeatedly overlooked is the assessment of individual gamification applications, which would provide significant data for comparing how each application is perceived. Studies often structure their questionnaires/scales to assess the implementation of gamification, but rarely examine the platforms themselves. For this reason, future research should measure not only the underlying effect of gamification, but also how EFL teachers perceive various application elements, such as the range of available activities, interface, distribution of gamification elements, ease of operation, and other relevant features. This would provide crucial data for both teachers and developers.
In practice, EFL teachers at all education levels should consider gamification for motivating and encouraging learners, but they should also be aware of its external and internal limitations, especially in socioeconomically disadvantaged areas. Nowadays, gamification platforms have been streamlined so that teachers of all ages can effortlessly access them and reap the benefits they provide. However, teachers should be cautious about the longevity of the presumed effects of gamification as well as how reliably it impacts other aspects of learners' academic performance, if at all.

STRENGTHS AND LIMITATIONS
In this systematic review, the authors formulated a rigorous research design with an extensive range of search terms and databases (Supplementary Material 1) (Helvich et al., 2022). The findings provided essential data about the quality and prevalence of research in the field of EFL education and digital gamification. Moreover, the authors also discussed and provided solutions to some of the underlying limitations, including the Google Scholar character count limit, the paradox of Cohen's kappa (Supplementary Material 2) (Helvich et al., 2022), and study quality assessment for education-related studies (Supplementary Material 3) (Helvich et al., 2022). The high quality of this systematic review was achieved by adopting the PRISMA checklist, the PICO framework, the NHLBI guidelines for quality assessment, the PROSPERO preregistration form, and open access to the research data.
The first limitation stems from the fact that this systematic review included only studies written in English that contained only English questionnaires/scales. Therefore, the authors omitted data from studies written in other languages that could have provided more information on the topic. Secondly, the study is inevitably prone to various selection biases that may have influenced the authors during the course of this systematic review.
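The paradox of Cohen's kappa mentioned above can be illustrated with a minimal sketch. The 2x2 screening counts below are hypothetical and are not taken from this review or its supplementary materials; the function name `agreement_coefficients` is likewise an illustrative assumption. The sketch uses the standard two-rater, two-category formulas for Cohen's kappa and Gwet's AC1 to show how a very uneven split between included and excluded records can depress kappa even when the raters agree on almost every record.

```python
def agreement_coefficients(a, b, c, d):
    """Two raters, two categories (include / exclude).

    a: both raters include      b: only rater 1 includes
    c: only rater 2 includes    d: both raters exclude
    Returns (observed agreement, Cohen's kappa, Gwet's AC1).
    Assumes chance agreement is below 1 (otherwise the ratio is undefined).
    """
    n = a + b + c + d
    p_o = (a + d) / n              # observed agreement
    r1 = (a + b) / n               # rater 1 "include" proportion
    r2 = (a + c) / n               # rater 2 "include" proportion

    # Cohen's kappa: chance agreement from the product of the raters' marginals.
    pe_kappa = r1 * r2 + (1 - r1) * (1 - r2)
    kappa = (p_o - pe_kappa) / (1 - pe_kappa)

    # Gwet's AC1: chance agreement from the average "include" proportion.
    pi = (r1 + r2) / 2
    pe_ac1 = 2 * pi * (1 - pi)
    ac1 = (p_o - pe_ac1) / (1 - pe_ac1)
    return p_o, kappa, ac1


# Hypothetical screening of 100 records in which almost all are excluded.
p_o, kappa, ac1 = agreement_coefficients(a=2, b=3, c=3, d=92)
print(round(p_o, 2), round(kappa, 2), round(ac1, 2))  # 0.94 0.37 0.93
```

With these illustrative counts the raters agree on 94% of records, yet kappa is only about 0.37, whereas AC1 stays close to the observed agreement. This is the kind of discrepancy that can motivate reporting Gwet's AC1 alongside or instead of kappa when one category dominates.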

CONCLUSION
The aim of this systematic review was to investigate EFL teachers' perceptions of digital gamification and its effect on learners' motivation and learning outcomes, and to examine how satisfied EFL teachers are with the applicability of gamification platforms. The results indicated that gamification has a positive effect on learners' motivation, fosters a positive learning experience, and encourages them throughout the lesson. Furthermore, the findings uncovered various dimensions of the applicability of gamification applications and how they impact teachers' overall satisfaction as well as each other. The findings suggested that teachers find gamification applications a valuable educational asset and would recommend their use in EFL lessons. The systematic review also made it possible to identify various perceived obstacles that may impact the integration of gamification in everyday EFL lessons. Regarding learning outcomes, the results provided inconclusive data on how gamification impacts academic performance, primarily due to the poor quality of the studies or insufficient data to infer any potential benefits. Finally, the high Gwet's AC1 inter-rater reliability coefficient for the study selection process indicates that the authors precisely formulated the inclusion and exclusion criteria, thus making the findings reliably replicable.

Figure 4. Measurement focus of studies

Table 1. Inclusion and exclusion eligibility criteria items (columns: Criteria Category and Items; Inclusion Criteria; Exclusion Criteria)
Supplementary Material 1 (Helvich et al., 2022) provides a complete summary of all search syntaxes, the filters used for individual databases and Google Scholar, and the search results.