Semi-Supervised Event Extraction Incorporated With Topic Event Frame

Supervised meta-event extraction suffers from two limitations: (1) the extracted meta-events contain only local semantic information and do not present the core content of the text; (2) model performance degrades easily because labeled samples are insufficient in number and poor in quality. To overcome these limitations, this study presents an approach called frame-incorporated semi-supervised topic event extraction (FISTEE), which aims to extract topic events containing global semantic information. Inspired by frame-based knowledge representation, a topic event frame is developed to integrate multiple meta-events into a topic event. Combined with the tri-training algorithm, a strategy for selecting unlabeled samples is designed to expand the training sets, and labeling models based on conditional random field (CRF) are constructed to label meta-events. The experimental results show that the event extraction performance of FISTEE is better than that of supervised learning-based approaches. Furthermore, the extracted topic events can present the core content of the text.

1. We design a topic event frame inspired by frame-based knowledge representation, which organizes the meta-events representing different facets of the text to form a topic event.
2. We propose to improve the performance of meta-event extraction by training the sequence labeling model with not only the human-labeled samples but also the samples automatically selected by tri-training.
3. We generate a trigger table by adopting the entropy-based feature selection algorithm and use triggers to filter irrelevant information, which improves the efficiency of meta-event extraction during the extraction phase.
4. We conduct experiments on real-world datasets to validate the effectiveness of the proposed approach.
The remainder of this study is organized as follows. Section 2 reviews the related work. Section 3 presents the proposed approach. Section 4 presents the experimental results. Finally, Section 5 provides the conclusions and future work.

RELATED WORK
In this section, we review related work on event extraction, including meta-event extraction and topic event extraction. In addition, we summarize the related research on tri-training.
Earlier approaches for meta-event extraction mainly relied on pattern matching. The AutoSlog system, developed by Riloff (1993), is the earliest known event extraction system based on pattern matching and was used to extract terrorist events. AutoSlog exploited a small number of linguistic patterns and a manually labeled corpus to obtain event patterns. Yangarber et al. (2000) designed the ExDisco system, which does not require the corpus to be annotated or pre-categorized and needs only a small number of seed patterns, from which it automatically learns high-quality matching patterns. Compared with AutoSlog, ExDisco greatly reduces manual intervention while achieving better event extraction performance. Valenzuela-Escárcega et al. (2015) defined a rule-based event extraction framework that is simple, powerful, robust, and fast. They used the framework to develop a grammar for the biochemical domain, which approached human performance. Cao et al. (2015) proposed a pattern expansion approach that imports frequent patterns extracted from external corpora to boost event extraction performance. To solve the problem that some frequent expressions involving event triggers do not appear in the training corpus, Cao et al. (2018) introduced expert patterns from TABARI to boost event extraction performance. In addition, many other approaches have been proposed to facilitate automatic pattern construction by designing machine learning algorithms that learn new patterns from a few seed patterns (Li et al., 2016; Tadesse et al., 2020).
Machine learning-based meta-event extraction can be divided into pipeline-based and joint learning-based models. The former regards trigger extraction and event argument extraction as different sequential processes, and models meta-event extraction as a task pipeline. Trigger extraction can determine the meta-events and their types, and the extraction performance of triggers directly determines the performance of subsequent argument extraction. Chen and Ji (2009) proposed training a trigger labeling model using text features such as lexical, syntactic, and semantic features. However, it cannot recognize unknown triggers that do not appear in the training set. To address this issue, He et al. (2014) used "Synonymy Thesaurus" to expand triggers to help obtain unknown triggers. Liu et al. (2016) designed two types of global information, event-event correlation and topic event correlation, to construct a probabilistic soft logic (PSL) model to assist in the decision-making of the output of each subtask model to determine the final extraction results. C.  proposed a CRF-based sequence labeling model to identify triggers from complex sentences that can consider the local context of text, achieving a good generalization performance. However, all pipeline-based approaches have a drawback; that is, the error of the previous subtask carries over to the next subtask. The joint learning-based approaches coordinate two subtasks, that is, the uncertain information in the previous subtask is transmitted to the next subtask, and the valuable information generated in the next subtask is allowed to be fed back to the previous subtask. Li et al. (2013) solved the problem of event extraction from the perspective of structured learning and designed a joint learning algorithm by combining sentence-level local and document-level global features. 
Specifically, they presented a joint framework based on structured prediction that extracts triggers and arguments together so that the local predictions can be mutually improved.
The above-mentioned meta-event extraction approaches all involve feature engineering to some degree: they either rely on existing prior knowledge to design complex features or use NLP tools to extract relevant features. Therefore, they are not only laborious and inefficient, but may also introduce incorrect information, which decreases event extraction performance. Compared with traditional machine learning, modern machine learning, namely, deep learning, adopts an end-to-end modeling scheme and avoids feature engineering. Therefore, in recent years, researchers have proposed various deep learning-based meta-event extraction approaches. These approaches can be divided into sequence-based methods and graph-based methods. For the sequence-based methods, Chen et al. (2015) developed a dynamic multi-pooling convolutional neural network (DMCNN) that automatically extracts lexical-level and sentence-level features to obtain the most important information for each part of the sentence. Z.  proposed a joint event extraction approach based on convolutional neural networks (CNNs) that can extract triggers and arguments simultaneously. Kodelja et al. (2019) constructed a representation of the global context using a bootstrapping approach and integrated the representation into a CNN model for event extraction. These methods take text as sequence data and can automatically learn feature representations. For the graph-based methods, recent studies (Cui et al., 2020; Xie et al., 2022; Liu et al., 2021; Huang et al., 2020) employ graph convolutional networks and graph attention networks on the dependency graph generated by syntactic dependency parsers. These methods can capture long-distance dependencies between words related to event extraction. Deep learning-based meta-event extraction approaches can not only avoid feature engineering, but also obtain excellent event extraction performance.
As the name suggests, ontology-based topic event extraction is centered on the characteristics of the ontology. It first extracts meta-events and their related entity information from text according to the concepts specified by the ontology and then correlates the entity information of different meta-events based on the relations given by the ontology. It includes three steps: construction of the domain ontology, text semantic annotation, and event extraction. Lee et al. (2003) developed an ontology-based event extraction system that involved an ontology model with four layers, namely, domain, category, event, and extended concept layers. Their experimental results showed that the ontology model was effective for the extraction of Chinese meteorological news events. A system that extracts events from documents is generally domain-dependent. To avoid this dependency as much as possible, Sahnoun et al. (2020) proposed applying an open information extraction approach for modeling any event type ontology representation. Their experimental results confirmed the effectiveness of this approach.
Event frame-based topic event extraction approaches show the same effectiveness as ontology-based topic event extraction approaches. A hierarchical and structured frame is designed in place of the ontology to guide topic event extraction. First proposed by Minsky (1974), frames have now become a common knowledge representation approach for describing the outline of related concepts. The basic idea is that the human brain stores a large number of typical scenarios, and when people face new scenarios, a basic empty knowledge structure called a frame is selected from memory. Wu et al. (2020) classified meta-events according to the information presented by the meta-events in the topic and designed a frame model based on these meta-events, representing the topic events in the form of hierarchical meta-events. They developed CRF-based labeling models to extract topic events from various court verdicts.

Semi-Supervised & Distant-Supervised Event Extraction
In recent years, owing to the lack of available training data, semi-supervised and distantly supervised methods have gradually been proposed.
Following the successful application of distant supervision to relation extraction tasks, many researchers have also tried to apply distant supervision to event extraction. Zheng et al. (2019) used a Transformer and sequence labeling for sentence-level entity extraction to obtain event arguments and then, in the document-level extraction phase, continuously added event arguments to the event table by constructing a directed acyclic graph to complete the extraction of events. Zhu et al. (2021) developed DE-PNN, an encoder-decoder model for document-level event extraction, whose encoder and decoder are based on document-level encoding and multi-granular decoding, respectively. However, distant supervision requires aligning a background knowledge base with an accompanying natural language document corpus. For event extraction, such a data source is often not readily available. Zhou and Zhong (2015) proposed a semi-supervised learning framework based on hidden topics for biomedical event extraction. In this framework, sentences in the unannotated corpus are automatically assigned event annotations based on their distances to the sentences in the annotated corpus. Zajec and Mladenić (2022) iteratively labeled unlabeled data using semi-supervised learning combined with probabilistic soft logic, inferring the pseudo-tokens of each instance from the predictions of multiple base learners. Their methodology was applied to Wikipedia pages about earthquakes and terrorist attacks in a cross-lingual setting.

Tri-Training
Co-training, first proposed by Blum and Mitchell (1998), requires the dataset to have two sufficient and redundant views. However, it is often difficult for actual datasets to meet these two conditions. Goldman and Zhou (2000) proposed a collaborative training algorithm that does not require two sufficient and redundant views. They utilized different decision tree algorithms to train two different classifiers, adopted cross-validation to label unlabeled samples, and combined the two learning approaches to form the final prediction. Owing to the extensive use of cross-validation, the algorithm has a high time complexity. Tri-training, which is also a semi-supervised learning algorithm based on collaborative training, requires neither sufficient and redundant views nor the use of different learning algorithms. The difference among the training classifiers is guaranteed by using different sample subsets extracted from the original sample set. Because tri-training places no constraints on the attribute set or the learning algorithm used by the classifier and does not require cross-validation, it has a wider application scope and higher efficiency.
Tri-training is generally applied to classification tasks and is relatively rare in sequence labeling tasks. This is because a classification task yields a single classification result, whereas the labeling result of a sequence labeling task is a sequence. When tri-training is combined with sequence labeling, it is necessary to obtain a consistent label sequence from the labeling results of multiple models as pseudo labels. Chen et al. (2006) developed a consistency calculation approach called S2A1D that calculates the consistency rate of the labeled results of two models for a sequence and selects the sequence with the highest consistency rate to add to the pseudo-label sample set. Chou et al. (2016) selected a sample from the unlabeled sample set, labeled it with different models, and selected the top m label sequences with the highest probability from the labeling results of each model. If the label sequences generated by two models are exactly the same, then the probability sum of the two is calculated, and the label sequence with the maximum probability sum is selected as the pseudo label.

FRAME-INCORPORATED SEMI-SUPERVISED TOPIC EVENT EXTRACTION
Here we present our design process for topic event extraction from text, which includes two core parts, as shown in Figure 1. The first part includes the steps of frame design, labeled corpus construction, and trigger table generation. The second part is a semi-supervised meta-event extraction based on tri-training.

Topic Event Frame
Meta-events only contain local semantic information and do not present the core content of the text from a global perspective. In contrast, a topic event is composed of multiple states and actions, including multiple meta-events related to text topics, and their global semantic information can effectively present the core content of the text. However, the description information of the topic event is usually scattered in the document; thus, most meta-event extraction approaches hardly meet the needs of topic event extraction. Inspired by the frame-based knowledge representation, we design a topic event frame that aims to integrate multiple meta-events representing different facets of information into a topic event, thereby representing a topic event in the form of meta-event sets. A topic event generally has the following characteristics:
1. Separation: A topic event often involves multiple facets of information. A facet refers to a type of meta-event set, and different facets are semantically separated.
2. Cohesion: A topic event includes a topic core meta-event and other facet meta-events. The topic core meta-event describes the topic information, and all other facet meta-events are related to the topic through the topic information.
A topic event and its meta-events are closely related. Thus, this study designs a meta-event-based knowledge representation frame to describe a topic event. The frame regards a topic event as a set of multiple meta-events. The triggers and event arguments in the meta-events are extracted to structure the meta-events, after which different types of meta-events are combined to present a topic event hierarchically. The meta-event types include topic information and facet meta-event types, which are formally defined as follows:
1. Definition 1: TInfo (topic information) is a general description of a topic event, including the most basic information of the topic event, such as time, location, and people.
2. Definition 2: TEF (topic event facet), $TEF_i = \{ME_{i1}, ME_{i2}, \ldots\}$, where $TEF_i$ refers to a facet of the topic event, and $ME_{i1}, ME_{i2}, \ldots$ in $TEF_i$ denote meta-events of the same type, each composed of a trigger and event arguments.
3. Definition 3: TE (topic event) is described by topic information and multiple facets of the topic event, that is, $TE = \{TInfo, TEF_1, TEF_2, \ldots\}$.
The topic event frame based on the meta-events is shown in Figure 2. A topic event is a frame structure, and its slot values include topic information and information on various facets. Facet information is a type of meta-event set. Each meta-event itself is also a sub-frame, and its slot values are a trigger and event arguments.
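The frame structure described above can be sketched as a small set of nested record types. This is only an illustrative sketch: the class and field names (`MetaEvent`, `TopicEventFacet`, `TopicEvent`, `facet_type`) and the sample values are assumptions introduced here, not part of the original frame definition.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetaEvent:
    """Sub-frame: slot values are one trigger and its event arguments."""
    trigger: str
    arguments: Dict[str, str] = field(default_factory=dict)


@dataclass
class TopicEventFacet:
    """TEF_i: a set of meta-events that share the same type."""
    facet_type: str
    meta_events: List[MetaEvent] = field(default_factory=list)


@dataclass
class TopicEvent:
    """TE = {TInfo, TEF_1, TEF_2, ...}."""
    t_info: Dict[str, str]
    facets: List[TopicEventFacet] = field(default_factory=list)

    def facet(self, facet_type: str) -> List[MetaEvent]:
        """Return all meta-events stored under the requested facet type."""
        return [me for f in self.facets if f.facet_type == facet_type
                for me in f.meta_events]


# Illustrative values only; they are not drawn from the experimental corpus.
liability = TopicEventFacet(
    "Liability",
    [MetaEvent(trigger="bears", arguments={"party": "defendant"})],
)
te = TopicEvent(t_info={"location": "Beijing"}, facets=[liability])
```

Keeping each facet as a typed list of meta-events mirrors the separation property, while the shared `t_info` slot plays the role of the topic information that ties the facets together.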

Labeled Corpus Construction
In this section, we manually construct a small labeled corpus, which is part of the preparation for the subsequent meta-event extraction. The process is illustrated in Figure 3. First, we use web crawler technology to obtain a large amount of unlabeled text from specific websites. Then, we preprocess the obtained text, including section acquisition, corpus grouping, word segmentation, part-of-speech (POS) tagging, and dependency parsing. Finally, we annotate the various meta-event corpora after preprocessing and obtain labeled training sets of the various meta-events.

Trigger Table Generation
Considering the co-occurrence of triggers and event arguments in meta-events, in the meta-event extraction phase, we use triggers to locate the description information of meta-events in the text set and filter out irrelevant information, which improves the efficiency of event extraction. We utilize the entropy-based feature selection algorithm (Dash & Liu, 2000) to obtain triggers and regard trigger extraction as a clustering feature extraction problem. Because of the high computational complexity of this algorithm, to reduce the number of words involved in the calculation, we group the description sentences of each type of meta-event into a set, namely, $I = \{i_1, i_2, \ldots, i_n\}$, where each element of I represents a description sentence of a meta-event, and n is the number of sentences. As the POS of a trigger is a noun or a verb, we filter out all words except nouns and verbs and collect the remaining words in I into a word set, where m is the number of words.
We calculate the entropy value of I as E through Equation 1:
$$E = -\sum_{i=1}^{n} \sum_{j=1}^{n} \left( S_{ij} \log S_{ij} + (1 - S_{ij}) \log (1 - S_{ij}) \right) \quad (1)$$
where $S_{ij} = e^{-\alpha D_{ij}}$ is the similarity function, $D_{ij}$ is the Euclidean distance between $i_i$ and $i_j$, $\alpha$ is a positive number whose value is $-\ln 0.5 / \bar{D}$, and $\bar{D}$ is the average distance among all $i_i$. The entropy is evaluated for each candidate word, and we select the top 10 words associated with the largest increase in the value of E as the candidate seed triggers. Repeating the above steps, we obtain candidate seed triggers for each type of meta-event. Then, we match the candidate seed triggers with the real triggers in the training set and determine the top three words as seed triggers from the matching results. Finally, we add them to the trigger table according to the meta-event types. In addition, in the meta-event extraction phase, we add the newly identified unknown triggers in the labeled results to the trigger table to complete an iterative update of the trigger table. In the next meta-event extraction, we always utilize the updated trigger table to filter irrelevant information.
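The entropy measure can be sketched in a few lines of Python. The sketch assumes the standard distance-based form of the entropy, with similarity exp(-alpha * D_ij) and alpha = -ln 0.5 / (mean distance). It further assumes, purely for illustration, that each description sentence is encoded as a numeric vector with one dimension per candidate word and that a word is scored by the change in E when its dimension is removed; the function names are hypothetical.

```python
import math
from typing import List


def euclidean(a: List[float], b: List[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def entropy(vectors: List[List[float]]) -> float:
    """Distance-based entropy E summed over all ordered pairs i != j."""
    n = len(vectors)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    if not pairs:
        return 0.0
    dists = [euclidean(vectors[i], vectors[j]) for i, j in pairs]
    mean_d = sum(dists) / len(dists)
    if mean_d == 0.0:
        return 0.0  # all sentences coincide, so every similarity equals 1
    alpha = -math.log(0.5) / mean_d
    total = 0.0
    for d in dists:
        s = math.exp(-alpha * d)
        if 0.0 < s < 1.0:  # terms with s == 0 or s == 1 contribute nothing
            total -= s * math.log(s) + (1 - s) * math.log(1 - s)
    return total


def rank_words(vectors: List[List[float]], vocab: List[str]) -> List[str]:
    """Order words by the entropy increase observed when each is removed."""
    base = entropy(vectors)
    gains = []
    for k, word in enumerate(vocab):
        reduced = [v[:k] + v[k + 1:] for v in vectors]
        gains.append((entropy(reduced) - base, word))
    return [word for _, word in sorted(gains, reverse=True)]
```

Under these assumptions, taking the first ten entries of `rank_words(...)` would correspond to the candidate seed triggers for one meta-event type.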

Tri-training Theory
Tri-training guarantees the difference among the training classifiers by using different sample subsets extracted from the original labeled sample set. The general steps of the tri-training algorithm are as follows. First, the labeled sample set, denoted as L, is sampled to generate three training subsets, which are used to train three classifiers, namely, $h_i$, $h_j$, and $h_k$. In each round, $h_j$ and $h_k$ ($j, k \neq i$) are used to select a newly labeled sample set $L_i$ from the unlabeled sample set U for $h_i$, subject to the condition given in Equation 2, where $e_i$ is the error rate of classifier $h_i$ on $L_i$. Because $L_i$ is selected from U through classifiers $h_j$ and $h_k$, it is difficult to evaluate the error rate directly. Assuming that U and L are identically distributed, $e_i$ can be estimated using the classification error rate of $h_j$ and $h_k$ on L, as shown in Equation 3.
Assuming that the initial classification error rate $e'_i = 0.5$, the initial value of $|L'_i|$ can be calculated using Equation 7, and the value of $|L_i|$ in each round can be calculated using Equations 4 and 5. The last step of each round is to combine L and $L_i$ to retrain $h_i$. The above process is iterated until Equation 2 no longer holds.
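As a rough illustration of the round-level check, the sketch below follows the commonly used tri-training formulation, in which the joint error of the two helper classifiers is estimated on the labeled set and an update is accepted only if the error rate drops and the product of error rate and newly labeled set size shrinks. Whether this matches the exact form of Equations 2 and 3 is an assumption, and both function names are hypothetical.

```python
from typing import Sequence


def joint_error(pred_j: Sequence[str], pred_k: Sequence[str],
                gold: Sequence[str]) -> float:
    """Estimate e_i on L: among samples where h_j and h_k agree,
    the fraction on which their shared prediction is wrong."""
    agreed = [(a, g) for a, b, g in zip(pred_j, pred_k, gold) if a == b]
    if not agreed:
        return 0.5  # assumed fallback: same as the initial error rate
    wrong = sum(1 for a, g in agreed if a != g)
    return wrong / len(agreed)


def should_update(e_now: float, size_now: int,
                  e_prev: float, size_prev: int) -> bool:
    """Assumed acceptance test: e_i < e'_i and e_i * |L_i| < e'_i * |L'_i|."""
    return e_now < e_prev and e_now * size_now < e_prev * size_prev
```

For example, `should_update(0.2, 10, 0.5, 5)` accepts the round because 0.2 * 10 is below 0.5 * 5, whereas an error rate of 0.4 on the same ten samples would be rejected.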

Algorithms
We treat the identification of triggers and event arguments from meta-events as a sequence labeling task, introduce tri-training into the sequence labeling process, and propose a semi-supervised event extraction approach based on tri-training and CRF. This approach is divided into two phases: the training phase (Tri-Training-CRFs, as shown in Algorithm 1) and the testing phase (Testing-CoLabeling, as shown in Algorithm 2).
Step 1: Training the initial sequence labeling model: Use the bootstrap algorithm to obtain three different training subsets from L, and train three initial sequence labeling models separately based on CRF (refer to lines 1-5 in Algorithm 1).
Step 2: Obtaining a new training sample set: After the models are trained in the previous round, two conditions must be met simultaneously before iterative training continues in the next round: 1) … is smaller than U.
We assume that the algorithm stops at N iterations, and hence, the overall time complexity of Tri-Training-CRFs is $O(N \times |U| \times S \times N_l^2)$. In the Tri-Training-CRFs algorithm, to ensure that the new training samples selected in each round have a high degree of confidence, we design a strategy for selecting unlabeled samples, namely, Training-CoLabeling, as shown in Function 1 (refer to line 12 in Algorithm 1).
Function 1: Training-CoLabeling
Input: U: unlabeled sample set; $h_j$, $h_k$: CRF labeling models ($j, k \neq i$)
Step 1: Selecting the consistent label sequence: Use $h_j$ and $h_k$ to label each sample x in U to obtain $Y_j$ and $Y_k$, and take the intersection of the two to obtain Y, where $Y_j$, $Y_k$, and Y represent sets made up of multiple label sequences corresponding to x (refer to lines 7-10 in Function 1).
Step 2: Selecting the samples whose probability sum meets the threshold condition: For each label sequence y in Y, calculate the sum of $P_j(y|x)$ and $P_k(y|x)$ as $p_{jk}$. If the maximum value of $p_{jk}$ satisfies the given threshold condition, we take the label sequence obtained when $p_{jk}$ takes the maximum value as the label y of the sample x, and add (x, y) to the set NewIns, where $P_j(y|x)$ and $P_k(y|x)$ denote the conditional probabilities of the sample x being labeled with the label sequence y by $h_j$ and $h_k$, respectively (refer to lines 11-22 in Function 1).
Step 3: Sorting by probability sum in descending order: We sort the instances in NewIns in descending order according to the value of psum (refer to lines 24-25 in Function 1).
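The three steps of Training-CoLabeling can be sketched as follows. This is a minimal sketch under stated assumptions: each model's n-best output for a sample is given as a dictionary from label sequence to probability, the threshold condition is taken to be "probability sum at least `threshold`", and all names (`co_label`, `training_co_labeling`, `NBest`) are hypothetical rather than taken from Function 1.

```python
from typing import Dict, List, Optional, Tuple

LabelSeq = Tuple[str, ...]
NBest = Dict[LabelSeq, float]  # label sequence -> P(y | x) under one model


def co_label(nbest_j: NBest, nbest_k: NBest,
             threshold: float) -> Optional[Tuple[LabelSeq, float]]:
    """Steps 1-2: intersect the two n-best sets, then keep the sequence
    with the largest probability sum if that sum reaches the threshold."""
    shared = nbest_j.keys() & nbest_k.keys()
    if not shared:
        return None
    best = max(shared, key=lambda y: nbest_j[y] + nbest_k[y])
    p_jk = nbest_j[best] + nbest_k[best]
    return (best, p_jk) if p_jk >= threshold else None


def training_co_labeling(unlabeled: List[Tuple[str, NBest, NBest]],
                         threshold: float) -> List[Tuple[str, LabelSeq, float]]:
    """Return (x, y, p_jk) triples sorted by probability sum, descending."""
    new_ins = []
    for x, nbest_j, nbest_k in unlabeled:
        picked = co_label(nbest_j, nbest_k, threshold)
        if picked is not None:
            y, p_jk = picked
            new_ins.append((x, y, p_jk))
    new_ins.sort(key=lambda item: item[2], reverse=True)  # Step 3
    return new_ins


# Toy inputs; the label names and probabilities are made up for illustration.
samples = [
    ("x1", {("B-T", "O"): 0.7, ("O", "O"): 0.2}, {("B-T", "O"): 0.6, ("O", "B-A"): 0.3}),
    ("x2", {("O",): 0.5}, {("B-A",): 0.9}),
    ("x3", {("O", "O"): 0.9, ("B-T", "O"): 0.05}, {("O", "O"): 0.8, ("B-T", "O"): 0.1}),
]
selected = training_co_labeling(samples, threshold=1.2)
```

Requiring an exact match between whole label sequences is stricter than token-level agreement, which is one way to keep the pseudo-labeled samples at a high degree of confidence.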
In the meta-event extraction phase, we first utilize the triggers in the trigger table to filter the irrelevant information in the text and preprocess the description sentences of each type of meta-event. Then, we use the three sequence labeling models constructed in the training phase to label the preprocessed text at the same time. The process of determining the label for each unlabeled sequence is presented in Algorithm 2.
Algorithm 2: Testing-CoLabeling
Input: T: the testing sample set; $h_i$, $i \in \{1, 2, 3\}$: CRF labeling models
Step 1: Use $h_1$, $h_2$, and $h_3$ to label each sample x in T to obtain $Y_1$, $Y_2$, and $Y_3$, and take the intersection of the labeled results to obtain Y (refer to lines 6-10 in Algorithm 2).
Step 2: Determining the label of the unlabeled sample: For each label sequence y in Y, calculate the sum of $P_1(y|x)$, $P_2(y|x)$, and $P_3(y|x)$ as $p_{123}$. If the maximum value of $p_{123}$ satisfies the threshold condition, the label of sample x is the label sequence y obtained when $p_{123}$ takes the maximum value (refer to lines 12-20 in Algorithm 2). Otherwise, the sum of $P_i(y|x)$ and $P_j(y|x)$ is calculated as $p_{ij}$ ($i \neq j$). If the maximum value of $p_{ij}$ satisfies the threshold condition, the label of sample x is the label sequence y obtained when $p_{ij}$ takes the maximum value (refer to lines 21-32 in Algorithm 2). If neither threshold condition is met, the label of the sample x is the label sequence y obtained when $p_i$ ($i \in \{1, 2, 3\}$) takes the maximum value (refer to lines 33-41 in Algorithm 2).
Through the above steps, the time complexity of Testing-CoLabeling is $O(T \times S \times N_l^2)$, where T is the number of samples in the test set, S is the length of the sequence to be labeled, and $N_l$ is the number of category labels.

EXPERIMENTS
In this section, we analyze the experimental results of FISTEE. First, we introduce the dataset. We then elaborate on the experimental settings and measurements. Next, we present the comparison approaches and compare the experimental results in detail. Finally, we summarize the error-extraction process.

Experimental Dataset
In this study, we utilize court verdicts in the legal field as an experimental corpus. A court verdict is a type of long text data that roughly includes five parts: basic information, legal roles, indictment, case information, and judgment. The content of the case information is lengthy, complex, and diverse, and it usually contains factual information on multiple aspects of the case. Thus, we take the extraction of topic events from case information as an example to verify the effectiveness of the proposed algorithm.
As there is currently no publicly available training corpus of court verdicts, we crawled court verdicts on various motor vehicle accidents, issued by courts of different regions in China, from "China Judgment Online". We randomly selected 1,800 court verdicts, used regular expressions to match full-text information, and obtained descriptive sentences of case information as experimental data. Then, we divided them into training and test corpora in a ratio of 16:2. After conducting a statistical analysis of all the corpora, we found that the case information contains four aspects: "Topic information," "Liability," "Insurance," and "Disability." Under the guidance of the topic event frame, we regard them as the four facets of the case topic event, and each facet contains a type of meta-event set.
Through the above analysis, we define a unique event representation frame for each type of meta-event, as shown in Table 1. To obtain the labeled corpus and unlabeled corpus of each type of meta-event, we process the case information as follows. First, we manually label the description sentences in the training corpus, using the meta-event types in Table 1 as labels. Then, we extract the labeled description sentences and classify them according to the category of labels. Next, we use "LTP" (Che et al., 2010) to perform word segmentation, POS tagging, and dependency parsing on the training corpus to obtain the feature vector set of the training corpus. Subsequently, we divide the feature vector set into two subsets in a ratio of 1:1 and choose one subset, denoted as U. Finally, we manually label the other subset to obtain a labeled sample set denoted as $L_1$, and label U to obtain another labeled sample set denoted as $L_2$. The label sets of the various meta-events are listed in Table 2. In addition, we adopt the entropy-based feature selection algorithm to generate the trigger table. In the meta-event extraction phase, we utilize triggers to filter irrelevant content in the case information to improve the efficiency of meta-event extraction.

Experimental Settings and Measurement
In this study, we use "CRF++" to train the sequence labeling models. We need to specify the feature template and set the value of the hyper-parameter c. Hyper-parameter c is used to balance the degree to which the model fits the training samples. The larger the value of c, the higher the degree of fitting. For c, we set six different values: 1, 1.5, 2, 2.5, 3, and 4. For feature templates, we formulated five feature templates by analyzing the features contained in the meta-event training set. Among them, Template01 contained only word and POS features, and the context window was 3. Template02 and Template03 add dependency features based on Template01, and their context windows are 3 and 5, respectively. Template04 combines Template01 and Template02 to form multiple feature templates, whereas Template05 adds the POS joint dependency feature on the basis of Template04 to form a multiple cross-feature template. To obtain the optimal sequence labeling model, we combined different feature templates and parameters to conduct experiments using ten-fold cross-validation. Based on the extraction results of meta-events obtained by various models, we determined the feature templates and parameters. The experimental results of ten-fold cross-validation on the training set of each type of meta-event showed that the labeling results of the models were optimal when the value of c was 1.5, and the template was Template05.
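To make the notion of a feature template concrete, the sketch below generates unigram feature lines in the CRF++ template syntax (`U<id>:%x[row,col]`). It assumes that column 0 holds the word and column 1 the POS tag, and that a context window of 3 means the offsets -1, 0, and 1; the actual contents of Template01 to Template05 are not given in the text, so the output is only an illustration in the spirit of Template01.

```python
from typing import List


def unigram_template(columns: List[int], window: int) -> str:
    """Build CRF++ unigram template lines for the given feature columns."""
    half = window // 2
    lines = []
    idx = 0
    for col in columns:
        for offset in range(-half, half + 1):
            # %x[row,col]: token at relative position `row`, feature column `col`
            lines.append(f"U{idx:02d}:%x[{offset},{col}]")
            idx += 1
    return "\n".join(lines)


# Word (column 0) and POS (column 1) features with a context window of 3.
template_text = unigram_template(columns=[0, 1], window=3)
print(template_text)
```

A template file of this kind is passed to `crf_learn` together with the training file, and the hyper-parameter c is set with the `-c` option, for example `-c 1.5`.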
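In this study, the extraction results of meta-events are evaluated in terms of precision (P), recall (R), and $F_1$, as shown in Equations 8, 9, and 10.
$$P = \frac{N_r}{N_r + N_e} \quad (8)$$
$$R = \frac{N_r}{N_{num}} \quad (9)$$
$$F_1 = \frac{2 \times P \times R}{P + R} \quad (10)$$
where $N_r$ and $N_e$ are the numbers of case texts extracted correctly and incorrectly, respectively, and $N_{num}$ is the number of case texts in the meta-event standard set.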
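The three measures can be computed directly from the counts defined above. The sketch assumes the usual definitions, namely that precision divides the correct extractions by all extractions (correct plus incorrect) and recall divides them by the size of the standard set; the function name and the zero-division conventions are choices made here for illustration.

```python
from typing import Tuple


def precision_recall_f1(n_correct: int, n_incorrect: int,
                        n_standard: int) -> Tuple[float, float, float]:
    """Return (P, R, F1) from the counts N_r, N_e, and N_num."""
    extracted = n_correct + n_incorrect
    p = n_correct / extracted if extracted else 0.0
    r = n_correct / n_standard if n_standard else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f1
```

For instance, 8 correct and 2 incorrect extractions against a standard set of 10 give P = R = 0.8.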

Comparison Approaches
To verify the effectiveness of our approach, we compared the performance of FISTEE with the following comparison approaches.
1. BasicCRF: BasicCRF ignores trigger information. Based on the CRF, the sequence labeling model of event arguments is trained on the training set $L_1$+$L_2$ and used to label a given case text. Arguments in meta-events are obtained by combining the words labeled "-A."
2. SEE (subject event extraction) (Wu et al., 2020): SEE regards the identification of triggers and event arguments in meta-events as a sequence labeling task. On a specific training set, a joint sequence labeling model of triggers and event arguments is trained based on the CRF.
In this study, we trained two SEE models. Specifically, we used the labeled training set $L_1$ to train one SEE model, namely, SEE($L_1$). Similarly, we used the labeled training set $L_1$+$L_2$ to train another SEE model, namely, SEE($L_1$+$L_2$).

Experimental Results Analysis
We chose Template05 as the feature template and set the value of c to 1.5. Experiments were performed on the four meta-event test sets of the case information. The experimental results of our algorithm and the comparison approaches are shown in Tables 3-7.
1. Unlike BasicCRF(L1+L2), the models SEE(L1), SEE(L1+L2), and FISTEE(L1+U) are all joint sequence labeling models of triggers and event arguments, and their extraction results are improved in terms of P, R, and F1. On the one hand, because a trigger carries rich contextual semantic information, it can promote the performance of the joint sequence labeling model; on the other hand, using triggers filters irrelevant information out of the text and thus reduces noise interference.
2. Compared with SEE(L1), FISTEE(L1+U) has better extraction results on the four meta-event test sets. The overall extraction results are improved by 17.2%, 16.8%, and 17% in P, R, and F1, respectively, which demonstrates that FISTEE can improve the extraction performance by taking advantage of unlabeled samples. This is because SEE(L1) relies only on the manually labeled training set to construct the sequence labeling models, whereas the labeled training set is limited and may suffer from data sparsity, which leads to poor generalization performance of the models. To increase the coverage of labeled samples, FISTEE selects high-confidence pseudo-labeled samples from U via tri-training and adds them to L1 when training the models, thereby improving the effectiveness of the models.
3. Compared with SEE(L1+L2), FISTEE(L1+U) has better extraction results on the four meta-event test sets, and its overall extraction results are improved by 14.9%, 14.4%, and 14.7% in P, R, and F1, respectively, indicating that FISTEE can improve the performance of meta-event extraction while reducing the size of the manually labeled corpus. This is because SEE(L1+L2) uses all labeled training sets, including L2, to construct the sequence labeling models, which induces overfitting in the models and decreases the extraction performance, whereas FISTEE gradually selects samples with high confidence and adds them to the training sets until the models converge, thus avoiding overfitting of the models.
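The selection of high-confidence samples from U discussed above can be illustrated with the usual tri-training agreement rule: a sample is offered to one model only when the other two models assign it the same label sequence. The sketch below is a minimal illustration under stated assumptions, not the exact FISTEE strategy. The function names, the `min_confidence` threshold, the BIO-style tags, and the toy keyword labelers (standing in for trained CRF models) are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# A labeler maps a token sequence to (label sequence, confidence in [0, 1]).
# In the paper this role is played by a CRF model; here it is a hypothetical callable.
Labeler = Callable[[Sequence[str]], Tuple[List[str], float]]


def select_pseudo_labeled(
    target: int,
    labelers: Sequence[Labeler],
    unlabeled: Sequence[Sequence[str]],
    min_confidence: float = 0.9,  # assumed threshold, not taken from the paper
) -> List[Tuple[List[str], List[str]]]:
    """Return samples for labelers[target] on which the other two labelers agree."""
    if len(labelers) != 3:
        raise ValueError("tri-training uses exactly three labelers")
    peers = [m for i, m in enumerate(labelers) if i != target]
    selected = []
    for tokens in unlabeled:
        (labels_a, conf_a), (labels_b, conf_b) = (peer(tokens) for peer in peers)
        # Keep the sample only if both peers agree and both are confident.
        if labels_a == labels_b and min(conf_a, conf_b) >= min_confidence:
            selected.append((list(tokens), labels_a))
    return selected


def keyword_labeler(triggers, confidence):
    """Toy stand-in for a trained model: tags tokens found in `triggers`."""
    def label(tokens):
        return ["B-TRG" if t in triggers else "O" for t in tokens], confidence
    return label


labelers = [
    keyword_labeler({"collision"}, 0.95),
    keyword_labeler({"collision", "injured"}, 0.95),
    keyword_labeler({"collision"}, 0.92),
]
unlabeled = [["a", "collision", "occurred"], ["driver", "was", "injured"]]
new_for_model_0 = select_pseudo_labeled(0, labelers, unlabeled)
```

In a full loop, each model would be retrained on its labeled set plus the samples selected for it, and the process would repeat until no model changes. Those steps are omitted here.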

The Effect of Feature Template on the Performance of the Sequence Labeling Model
To observe the effect of the feature template on the performance of the sequence labeling model, we set the hyper-parameter c to 1.5 and used different feature templates to construct the sequence labeling models based on the FISTEE algorithm. To compare the models built with different feature templates intuitively, we connect each model's P, R, and F1 values to form a broken line. Although the broken line itself has no special meaning, its trend clearly presents the difference in performance between the sequence labeling models. Figure 4(a-d) shows the comparative experimental results of the different sequence labeling models on the "Topic information," "Liability," "Disability," and "Insurance" meta-event test sets, respectively. It can be seen from the broken lines' trend that the sequence labeling  sets of the case information. Figure 5(a-d) depicts the extraction performance on the four meta-event test sets with different values of c, from which we can observe that a change in c has less influence on the performance than a change in the feature template. On the four meta-event test sets ("Topic information," "Liability," "Disability," and "Insurance"), the models' labeling results differ only slightly, and the connecting lines of P, R, and F1 are very close. When c was set to 1.5, the performance of FISTEE on the four meta-event test sets was slightly better than with the other values. In general, the experimental results indicate that FISTEE is stable under various values of c.
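For reference, the P, R, and F1 values compared across feature templates and values of c can be computed from gold and predicted label sequences roughly as follows. This is a minimal sketch that assumes BIO-style tags and exact span matching. The section does not state the label scheme or the matching criterion, so both are assumptions, as are the function names and the example tags.

```python
from typing import List, Sequence, Set, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, type)


def extract_spans(labels: Sequence[str]) -> Set[Span]:
    """Collect typed spans from a BIO label sequence (stray I- tags are ignored)."""
    spans: Set[Span] = set()
    start, kind = None, None
    for i, lab in enumerate(list(labels) + ["O"]):  # sentinel closes the last span
        continues = lab.startswith("I-") and kind == lab[2:]
        if not continues:
            if kind is not None:
                spans.add((start, i, kind))
            if lab.startswith("B-"):
                start, kind = i, lab[2:]
            else:
                start, kind = None, None
    return spans


def precision_recall_f1(
    gold: Sequence[Sequence[str]], pred: Sequence[Sequence[str]]
) -> Tuple[float, float, float]:
    """Micro-averaged P, R, and F1 over all sentences, using exact span match."""
    tp = n_pred = n_gold = 0
    for g, p in zip(gold, pred):
        g_spans, p_spans = extract_spans(g), extract_spans(p)
        tp += len(g_spans & p_spans)
        n_pred += len(p_spans)
        n_gold += len(g_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


gold: List[List[str]] = [["B-TRG", "O", "B-ARG", "I-ARG"]]
pred: List[List[str]] = [["B-TRG", "O", "B-ARG", "O"]]
p, r, f1 = precision_recall_f1(gold, pred)
```

In this toy example the trigger span matches but the argument span is truncated, so only one of the two spans counts as correct.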

Error Analysis
From statistics on the erroneous meta-event extraction results, we find three types of error: the returned extraction result is empty, the returned extraction result is incomplete, and the returned extraction results are semantically duplicated. The main reasons for these three types of error are as follows.
1. The description sentences of meta-events are not recognized, resulting in an empty extraction result. This is because some description sentences containing triggers were not added to the test sets. For example, in the "Topic information" meta-event, the test set does not contain the description sentence of the trigger word "collision," so the extraction results of the related meta-events are empty.
2. The sequence labeling model does not recognize the description words of event arguments, resulting in incomplete meta-event extraction results. For example, the vehicle type is not recognized in the "Topic information" meta-event and the person's name is not recognized in the "Liability" meta-event, so a complete set of event arguments cannot be extracted from the meta-event set.
3. The meta-event extraction results are semantically duplicated. For example, the event argument of the "traffic accident" extracted from the "Insurance" meta-event is the "vehicle causing the accident," but the same argument is also extracted as "the vehicle of a certain license plate." If the "vehicle causing the accident" is correctly extracted in the "Topic information" meta-event, the "vehicle causing the accident" and "the vehicle of a certain license plate" can be merged, which resolves the semantic duplication. However, if errors 1 and 2 occur when extracting arguments from the "Topic information" meta-event, the "vehicle causing the accident" cannot be merged with its corresponding license plate number, which causes semantic duplication of the event arguments extracted from the meta-events.
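The three error types can be made concrete with a small checker. This is an illustrative sketch only. It assumes an extracted meta-event is a mapping from argument roles to lists of mentions, that each role should refer to a single entity, and that the "Topic information" meta-event supplies an alias table linking a description to its license plate. The names `classify_extraction`, `required_roles`, and `aliases`, and the sample values, are hypothetical.

```python
from typing import Dict, List, Sequence


def classify_extraction(
    arguments: Dict[str, List[str]],
    required_roles: Sequence[str],
    aliases: Dict[str, str],
) -> str:
    """Classify one extracted meta-event as empty, incomplete, duplicated, or ok."""
    # Error 1: nothing was extracted at all.
    if not any(arguments.values()):
        return "empty"
    # Error 2: at least one required argument role has no mention.
    if any(not arguments.get(role) for role in required_roles):
        return "incomplete"
    # Error 3: a role keeps several mentions that cannot be merged into one entity.
    for mentions in arguments.values():
        resolved = {aliases.get(m, m) for m in mentions}
        if len(resolved) > 1:
            return "duplicated"
    return "ok"


insurance_event = {
    "vehicle": ["vehicle causing the accident", "vehicle with plate X"],
}
# Alias table as it might be derived from a correct "Topic information" extraction.
topic_aliases = {"vehicle causing the accident": "vehicle with plate X"}

without_topic_info = classify_extraction(insurance_event, ["vehicle"], {})
with_topic_info = classify_extraction(insurance_event, ["vehicle"], topic_aliases)
```

With the alias table available, the two mentions resolve to one entity and the duplication disappears, which mirrors the dependence of error 3 on errors 1 and 2 described above.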

THEORETICAL AND PRACTICAL CONTRIBUTIONS
In this section, we discuss the theoretical and practical contributions of our work. In theory, we design a topic event frame, which organizes the meta-events representing different facets of the text to form a topic event. We improve event extraction performance with a semi-supervised approach in which the tri-training algorithm automatically selects new training samples for the sequence labeling model.
At present, there is little research on semi-supervised event extraction. Zhou and Zhong (2015) used sentence structure and hidden topic embeddings of sentences to define a distance, and annotated sentences in an unannotated corpus based on the distance between sentences. Compared to our work, the predefined trigger table in Zhou and Zhong (2015) is fixed, which limits the performance of the model because a large number of unlabeled words are missing from the trigger table. Moreover, when generating a new annotated sentence, the model of Zhou and Zhong (2015) considers only the similarity of content and structure while overlooking the rich contextual information between words, which may leave the model lacking in semantic interpretation. Ferguson et al. (2018) proposed a method for self-training event extraction systems that exploits parallel mentions of the same event instance in news text. This method labels each event cluster to assign trigger labels, which adds new training samples to the dataset. However, to generate event clusters, this method relies only on the entities mentioned in common by different news articles from the same day to compute the weights that form an event cluster, which causes some sentences in a cluster to be unrelated to the others. Our work considers all the meta-events related to the topic in the document to avoid this situation. Zajec and Mladenić (2022) used a semi-supervised method that integrates cross-language data into the learning process and enhances pseudo-annotation with probabilistic soft logic. Moreover, to avoid manually annotating data when extracting event arguments, Zajec and Mladenić (2022) combined Wikipedia and Wikidata to obtain the labeled data. The method takes into account the subject, language, and argument in the annotation corpus, but it focuses only on argument extraction and ignores what we consider the most important and fundamental task of event extraction: event detection.
In practice, our event extraction method is oriented to the legal field and can help with tasks such as predicting case judgments. Existing event extraction frameworks are mainly applied to the financial and medical fields (Shun et al., 2019), while very little work applies event extraction to the legal field. In recent years, the legal field has gradually promoted intelligent legal management, which faces difficulties in automatic conviction and sentencing, in handling large-scale judgment documents, and in analyzing legal issues in legal forums. Event extraction extracts the fine-grained key events of a case, and legal judgments can then be made based on the extracted event information. Attempts (Shen et al., 2020; Li et al., 2020; Feng et al., 2022) were made to apply event extraction to the legal field using a supervised approach. However, due to the confidentiality of legal documents, it is difficult to annotate a large number of new legal documents. Therefore, we propose a semi-supervised event extraction method that can obtain high-confidence extraction results with only a small number of annotated documents, which makes it better suited to the special field of justice.

CONCLUSION
In this study, inspired by frame-based knowledge representation, we design a topic event frame to integrate all the topic-related meta-events scattered in the document to form a topic event. We present a semi-supervised approach to improve the performance of topic event extraction models via tri-training. We propose a strategy for introducing tri-training into the sequence labeling task, which selects a certain number of high-confidence samples from the unlabeled sample set as new training samples. The selected samples are used together with the human-labeled samples to train better sequence labeling models. The effectiveness of this approach was verified experimentally. Furthermore, an extracted topic event is represented by different types of structured meta-events, thus presenting the core content of the text from a global perspective.
In future work, we plan to apply our approach to other fields, such as finance, education, and sports, to further verify its effectiveness. Moreover, because new training samples are introduced during iterative learning, the training set inevitably contains noise, which degrades the performance of the sequence labeling model. Therefore, we intend to design a data editing algorithm that identifies mislabeled samples, in order to optimize the training set and further improve the performance of the sequence labeling model.