Knowledge Extraction From National Standards for Natural Resources: A Method for Multi-Domain Texts

National standards for natural resources (NSNR) play an important role in promoting the efficient use of China's natural resources, setting standards for many domains such as marine and land resources. Revising NSNR is difficult because standards in different domains may overlap or conflict. To facilitate revision, this paper extracts structural knowledge from NSNR files. NSNR files are multi-domain texts, on which traditional knowledge extraction methods can fall short in recalling multi-domain entities. To address this issue, this paper proposes a knowledge extraction method for multi-domain texts, comprising a sub-domain relation discovery (SRD) module and a domain semantic features fusion (DSFF) module. SRD splits NSNR into sub-domains to facilitate relation discovery. DSFF integrates relation features into a conditional random field (CRF) model to improve multi-domain entity recognition. Experimental results demonstrate that the proposed method can effectively extract structural knowledge from NSNR.


INTRODUCTION
National Standards for Natural Resources (NSNR) (Huang, C. et al., 2018; Wang, S. et al., 2018) play an important role in the utilization of China's natural resources such as marine, land and urban resources. NSNR contains the standards of many important industries and guides the future of national development and natural resource utilization. NSNR sets standards for multiple domains (Huang, C. et al., 2018), and the standards of each domain are organized from the perspectives of different disciplines, as shown in Figure 1. This characteristic can lead to duplicated and conflicting standards, making the revision of NSNR difficult. Therefore, knowledge extraction from NSNR (Wu, X. et al., 2015) is significant for its analysis and revision.
For domain knowledge extraction (Chen, Y., 2018), entity recognition (Li, B., 2019) and relationship extraction (Wu, W. et al., 2019) are essential steps. Traditional methods of domain entity recognition and relationship extraction (Wang, B. et al., 2019) mainly include rule-based domain expert systems (Jun-Ke, Z. et al., 2019) and machine learning based methods (Chen, H. et al., 2008; Chen, Y. et al., 2020; Liu, Z., & Chen, H., 2017). These methods mostly target domains with a single industry or subject rather than multiple domains. An expert system has higher accuracy, but it is time-consuming and requires high labor cost (Eftimov, T. et al., 2017). Moreover, the rule designing process is complicated: many detailed features of domain entities and relationships must be considered, requiring extensive manual labor. In addition, this kind of method has poor portability; an expert system is only applicable to one domain and must be redesigned when the domain changes. Machine learning based or hybrid methods (Jiang, Y., 2019; Savova, G. K. et al., 2010) are more automatic (Qi, Y. et al., 2020) than expert systems, but they usually rely heavily on a tagged corpus or a domain relational database. For NSNR, there is little tagged corpus, so it is difficult to apply methods that demand large amounts of labeled domain data.
In this context, knowledge extraction from NSNR has the following challenges and features: (1) NSNR is a cross-industry domain (Liu, J. et al., 2019). It includes the standards of various industries, which contain multi-domain entities and relations (Mei, W., 2019; Yang, X. et al., 2019), making domain entity and relation extraction difficult. (2) The standards of an industry (Jun-Lin, W., 2018; Zhang, Y., 2020) are interdisciplinary, i.e., they are described from the perspectives of various disciplines, as shown in Figure 1, so the subjects among standard files of the same industry are usually very different. (3) There is no expert-tagged corpus for NSNR. (4) The entity relationships in NSNR tend to express recommendations or constraints in semantics, such as "should" and "must". In response to these features and challenges, a multi-dimension featured domain knowledge extraction method (Huang, X., 2017) that depends less on tagged corpus is of great significance for knowledge extraction from NSNR.
Traditional methods have limitations for knowledge extraction from NSNR due to the aforementioned features. Because of the cross-industry characteristics of NSNR, domain ontology design becomes difficult. Moreover, knowledge extraction methods that depend on the entity database of one certain domain also fall short due to the lack of tagged corpus. Besides, a sub-domain may include multiple industries, which makes it infeasible to split sub-domains based on industry themes alone. To solve these problems, this paper proposes a knowledge extraction method for multi-domain texts, which includes a sub-domain relation discovery (SRD) module for relation extraction and a domain semantic features fusion (DSFF) module for entity recognition. For SRD, an indicator SDs_Dbw is designed to correctly split sub-domains from NSNR, and a feedback mechanism integrates the indicator to iteratively discover the optimal sub-domain splitting. For DSFF, a conditional random field (CRF) based model integrates domain relation features and semantic structures to improve the recall of multi-domain entities. It can recognize more entities through the representative relations (due to feature (4) of NSNR) and the unique semantic patterns learned from a small set of tagged data.
The rest of this paper is organized as follows. Section II introduces previous works for domain knowledge extraction. Section III presents the overall framework of the proposed method. Section IV introduces the Sub-Domain Relation Discovery Model. Section V introduces the entity recognition model with domain semantic feature fusion. Section VI shows the experimental results and analyses. Section VII summarizes the contributions of this paper. Section VIII concludes this paper.

BACKGROUND
The focus and themes of domain knowledge extraction vary from domain to domain (Kwan, M. & Cheung, P., 2012; Chua et al., 2012). In recent years, many domain knowledge extraction methods (Chen et al., 2021; Netti et al., 2021) have been proposed in various industries; most of them focus on specific domains rather than multi-dimensional ones. Zhao et al. (2019) use a rule description method to recognize medical entities in unstructured medical data. They designed patterns of medical entities and extracted entity labels through semantic analysis, which improved the accuracy of entity recognition in the medical domain. However, their method focuses on rule design at the lexical level of medical terms, which demands extensive expert domain analysis and is not portable to other domains. Lin (2019) proposes a dependency tree-CRF based model to automatically extract patent information terms, together with an improved k-means algorithm (Yao, Y., & Chen, H., 2018) for clustering analysis (Soares, R. G. F., Chen, H., & Yao, X., 2017) on term tags to obtain the category characteristics of entities. The method aims to guarantee retrieval accuracy in the patent field. It realizes automatic feature mining through label analysis of domain entities, but the model lacks filtering of predictions, so its effect depends strongly on the recognition ability of the CRF model. Zhao et al. (2020) design a domain entity feature learning method based on CBOW and CEW. It takes full account of the multi-word composition of professional medical terms and, through an entity feature fusion RNN model, improves the recognition accuracy of medical entities. Other automated extraction approaches follow similar lines (Qi et al., 2020). These methods mainly target domains with single subjects, and they are not effective in multi-dimension featured domains, which consist of various industries or subjects.
For knowledge extraction from multi-domain texts, Huang (2017) proposes a multi-dimensional semantic knowledge fusion framework for government website information analysis. It focuses on domain ontology construction in multi-dimension featured data sets. The framework uses clustering methods to study the relationships among various domains and realizes the ontology construction of the "Haze Prevention and Cure" domain. However, it is too complex for lightweight multi-domain knowledge extraction and requires a great deal of labor for domain ontology design (Kwahk et al., 2007). In summary, single-subject knowledge extraction methods, which focus on the entities and relationships of only one subject, are not suited to multi-domain texts, and existing methods for multi-domain texts mostly incur a high cost of domain ontology design.

OVERALL FRAMEWORK
This section briefly introduces the main components of the proposed method, i.e., the sub-domain relation discovery (SRD) and domain semantic features fusion (DSFF) modules, as shown in Figure 2.
SRD consists of three parts: feedback based sub-domain clustering, sub-domain relationship clustering and sub-domain relation extraction, briefly introduced as follows:
1) Sub-domain clustering clusters the themes of the multi-domain documents, splitting the documents into sub-domains based on the similarity distances among them. However, the total number of sub-domain clusters is unknown in advance. In response, a feedback mechanism is proposed that adjusts the critical distance of hierarchical clustering (Liu, X. et al., 2017) according to the effect of sub-domain relationship clustering. An indicator, SDs_Dbw, is designed to measure this effect, and on this basis the feedback mechanism adjusts the sub-domain clustering iteratively to produce the optimal splitting (detailed in Section IV).
2) Sub-domain relationship clustering performs relation discovery within sub-domains. Dependency parsing (Jian, Z. et al., 2020) is executed on each sub-domain's texts to obtain their dependency trees, and word embedding (Nikfarjam et al., 2015; Segura-Bedmar et al., 2015; Xu, J. et al., 2016) is performed on the Chinese word segmentation (Feng, L. et al., 2020) of the dependency trees. Certain components of the dependency trees are extracted as relation words. The relation representations are then clustered and filtered to obtain each sub-domain's relation trigger words.
3) Relation trigger word matching is performed for sub-domain relation extraction, and the extracted relation sentences are filtered through a semantic correcting gate. Some of the filtered relation sentences, defined as clear relation sentences, are chosen as to-be-tagged data for entity recognition.
DSFF is designed to extract entities in sentences with less tagging work. A domain semantic feature fusion method is proposed that learns the construction features of dependency trees. Different sub-domains have similar syntactic characteristics, which are reflected in their dependency trees. To realize syntactic feature fusion, a conditional random field (CRF) model is integrated with the dependency tree by taking syntactic components as properties of word sequences. In this way, the CRF model can learn domain semantic patterns. On this basis, the predicted entities are re-completed (introduced in Section V) through an entity correcting gate.

SUB-DOMAIN RELATION DISCOVERY MODEL
This section introduces the Sub-Domain Relation Discovery Model (SRD Model) in three parts: the sub-domain clustering part, the sub-domain relationship clustering part and the sub-domain relation extraction part.

Feedback Based Sub-Domain Clustering
The feedback based sub-domain clustering aims to divide the multi-dimension featured domain into sub-domains according to industries. It uses the cosine distance between tf-idf vectors (Zhou, X., 2020) as the inter-document distance for clustering.
To analyze the similarity between documents in the multi-dimension featured domain for sub-domain clustering, the documents are transformed into vectors of tf-idf values over their Chinese word groups, which represent the sub-domain features of the documents. Let the multi-dimension featured domain documents be D = {d_1, d_2, ..., d_n}, where d_i represents each document in the domain. Chinese word segmentation is processed on each document d_i and stop-words are removed, giving the Chinese word group sequence of d_i, supposed as W_{s_i}, where each w_j in W_{s_i} represents a Chinese word group in the word sequence of d_i. The set of all appearing word groups is supposed as W_domain = {g_1, g_2, ..., g_{T_num}}, where g_j represents each appearing Chinese word group in the multi-dimension featured domain and T_num is the number of distinct Chinese word groups. To represent the semantic features of one document through word frequency, the tf value of word group g_j in document d_i is

tf(g_j, d_i) = TF(g_j, d_i) / |W_{s_i}|,

where TF(g_j, d_i) represents the number of times word group g_j appears in document d_i. The idf value represents the weight of each word group:

idf(g_j) = log(n / DF(g_j)),

where DF(g_j) represents the number of documents in which word group g_j appears. The tf-idf value of word group g_j in document d_i is then

tf-idf(g_j, d_i) = tf(g_j, d_i) × idf(g_j).

For sub-domain clustering, a domain semantic feature matrix is needed, where each row represents the semantic feature of one document: the tf-idf matrix of the multi-dimension featured domain is the n × T_num matrix M_domain^{tf-idf} whose (i, j) entry is tf-idf(g_j, d_i). For clustering, the tf-idf matrix is turned into a matrix of pairwise cosine distances, which represents the similarity of each pair of documents.
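The tf-idf construction above can be sketched in plain Python. This is a minimal illustration assuming the documents are already segmented into word-group lists; the paper's pipeline additionally performs Chinese word segmentation and stop-word removal before this step.

```python
import math

def tfidf_matrix(docs):
    """Build a tf-idf matrix for tokenized documents (each a list of word groups).
    Rows correspond to documents d_i, columns to word groups g_j in W_domain."""
    vocab = sorted({w for d in docs for w in d})          # W_domain
    n = len(docs)
    df = {g: sum(1 for d in docs if g in d) for g in vocab}  # DF(g_j)
    mat = []
    for d in docs:
        row = []
        for g in vocab:
            tf = d.count(g) / len(d)          # TF(g_j, d_i) / |W_si|
            idf = math.log(n / df[g])         # log(n / DF(g_j))
            row.append(tf * idf)
        mat.append(row)
    return vocab, mat

def cosine_distance(u, v):
    """1 - cosine similarity, used as the inter-document distance."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv) if nu and nv else 1.0
```

The pairwise `cosine_distance` values over the rows of this matrix give the distance matrix used by the hierarchical clustering below.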
Then hierarchical clustering is processed on the cosine-distance matrix of M_domain^{tf-idf} and a dendrogram is generated. The dendrogram shows the clustering process of the domain, and a critical distance will be used with a feedback mechanism to obtain the sub-domain clusters from the dendrogram. The root node of the dendrogram is supposed as DG_{tf-idf}. The clustering procedure, HiCluster, is shown in Algorithm 1. The details of the dendrogram are as follows. Suppose N_0 is the root node of the dendrogram and the child nodes of N_i are N_{2i+1} and N_{2i+2}. Each leaf node N_leaf of the dendrogram corresponds to one document d_i in the multi-dimension featured domain. Each non-leaf node N_nleaf represents a cluster, made up of the leaf nodes of the sub-tree rooted at N_nleaf; it also represents one merging step between the two closest clusters (its two child nodes). Therefore, each node N_i in the dendrogram has a distance property Dis(N_i), the distance of the two merged clusters. The list of all nodes in the sub-tree rooted at N_i is supposed as NodeList(N_i), and the list of all leaf nodes in that sub-tree as LeafNodes(N_i); both are used in HiCluster (Algorithm 1).
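Assuming SciPy is available, the dendrogram construction can be sketched as follows. The helper name `build_dendrogram` and the choice of average linkage are our assumptions; the paper does not specify the linkage criterion.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

def build_dendrogram(tfidf_rows):
    """Hierarchical clustering of documents from their tf-idf rows.
    Returns the SciPy linkage matrix Z, where each row of Z records one
    merge: [cluster_a, cluster_b, merge_distance, merged_size] -- the
    merge_distance column plays the role of Dis(N_i) in the text."""
    X = np.asarray(tfidf_rows, dtype=float)
    dists = pdist(X, metric="cosine")        # condensed cosine-distance matrix
    return linkage(dists, method="average")  # average linkage (an assumption)
```

The returned linkage matrix encodes the same merge tree that the text describes with nodes N_i and distances Dis(N_i).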
The critical distance CrD_domain determines the clustering results obtained from the dendrogram: nodes merged at a distance lower than CrD_domain are viewed as one cluster. According to CrD_domain, each node N_j that satisfies

Dis(N_j) ≤ CrD_domain and Dis(Parent(N_j)) > CrD_domain

is a root node of a cluster (sub-domain). To obtain the clusters (sub-domains) from the dendrogram according to CrD_domain, the function GGCD is designed, as shown in Algorithm 2. After the sub-domains D_t are obtained, CrD_domain is adjusted according to the feedback validation SDs_Dbw of relationship clustering in D_t. The best CrD_domain value is then obtained, and GGCD is repeated to produce the most reliable sub-domains, supposed as D_sub.
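A compact stand-in for GGCD uses SciPy's `fcluster` to cut the dendrogram at CrD_domain; the function name `clusters_at` is ours, not the paper's.

```python
from scipy.cluster.hierarchy import fcluster

def clusters_at(Z, crd_domain):
    """Cut a linkage matrix Z at the critical distance crd_domain.
    Nodes whose merge distance is at most crd_domain stay in one cluster,
    matching the cluster-root condition in the text. Returns a flat label
    array: labels[i] is the cluster (sub-domain) id of document i."""
    return fcluster(Z, t=crd_domain, criterion="distance")
```

Given the sub-domain labels, the documents of each sub-domain S_i can be gathered by grouping indices with the same label.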
D_sub = {S_1, S_2, ..., S_c} denotes the text data of the sub-domains. The details of the feedback mechanism are as follows. The feedback mechanism makes the sub-domain clustering adapt to the relationship clustering, so CrD_domain is adjusted according to an evaluation metric that can evaluate sub-domain relation clustering. The s_Dbw validation (Jianhua, T., & Hongzhou, T., 2009) is a well-established clustering validity index (the lower, the better). But in this situation it cannot directly evaluate the sub-domain relationship clustering, because of the scale of sub-domains. When a sub-domain is large, with many documents, it very possibly contains various industries, which leads to a broader word-vector model space (Chen, H. et al., 2013a, 2013b, 2014). Relation words of different industries usually have greater word-vector distances, and s_Dbw will be lower than it would be if they were placed in different sub-domains. In response, the scale of each sub-domain cluster is considered as a negative factor in the validation, yielding the proposed SDs_Dbw. SDs_Dbw is determined by CrD_domain, because CrD_domain determines the sub-domain clustering, and the sub-domain relationship clustering depends on the sub-domain clustering result. Let SUB_CLU represent the sub-domain clustering process above and SDS represent the SDs_Dbw calculation. The lower SDs_Dbw is, the better the sub-domain relationship clustering is, and hence the better the sub-domain clustering is. So CrD_domain takes the value that makes SDs_Dbw lowest.
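The feedback loop over CrD_domain can be sketched as a simple search over candidate values. `sub_cluster` (the SUB_CLU step) and `sds_dbw` (the SDS step) are supplied as callables, since the SDs_Dbw formula itself (the paper's Equation (14)) is not reproduced here.

```python
def best_critical_distance(crd_candidates, sub_cluster, sds_dbw):
    """Feedback search: for each candidate critical distance, split the
    domain into sub-domains (SUB_CLU), score the resulting sub-domain
    relation clustering with SDs_Dbw (SDS), and keep the CrD_domain with
    the lowest (best) score."""
    best_crd, best_score = None, float("inf")
    for crd in crd_candidates:
        sub_domains = sub_cluster(crd)   # SUB_CLU: sub-domains at this CrD
        score = sds_dbw(sub_domains)     # SDS: evaluate relation clustering
        if score < best_score:
            best_crd, best_score = crd, score
    return best_crd, best_score
```

In the experiments below this search bottoms out around CrD_domain = 0.81, where the SDs_Dbw curve reaches its minimum.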

Sub-Domain Relationship Clustering
The sub-domain relationship clustering targets the data within one sub-domain S_i in D_sub. The details are presented as follows.
To obtain operable data from the documents, the sub-domain data S_i is split into sentences, supposed as S_data. To obtain the relation words of the sentences, dependency parsing is processed for each sentence s_i in S_data, and its dependency tree is obtained, supposed as T_i.
The relation word set is supposed as R_base = {r_1, r_2, ...}, where r_i represents each relation word in R_base. The relation words will be transformed into vectors for relationship clustering through word embedding. For word embedding, the nodes of each T_i are taken in order, supposed as Nodes(T_i), as the word group sequence W_i, and the word sequence of the source data is

W_total = [Nodes(T_1), Nodes(T_2), ..., Nodes(T_m)].

The embedding model is trained with the skip-gram process, supposed as skip_gram(W_total). In the process of word embedding, suppose the one-hot encoding length is V and the length of the embedding layer is N; then the word vector of each word group r_i in the relation word set R_base is obtained from the embedding layer matrix M_total ∈ R^{V×N}, which contains the representations of the embedded words learned by the unsupervised skip-gram model. Index(r_i) represents the word vector corresponding to r_i. The relation word-vector set is supposed as V_r.
Then sub-domain relationship clustering is performed on the word vectors that carry the relation features. Hierarchical clustering is processed on the relation word-vector set V_r according to cosine distances. The dendrogram of the hierarchical clustering is obtained, and its root node is supposed as DG_ROOT. The cosine distance matrix of V_r can be obtained in the same way as in the sub-domain clustering part; the symbols and the dendrogram have been introduced in sub-section A. The critical distance of the hierarchical clustering here is supposed as CrD. The ideal critical distance is a little higher than the densest range of cluster-merge distances, so that the relation words are clustered into high-density groups. But in some sub-domains that have only a small number of relation words, this critical distance causes confusion: different types of relation words will be clustered into the same group, because the word vectors usually have large distances, close to the maximum. In this situation, CrD takes the following value.

CrD = MIN(CrD_1, CrD_2),

where CrD_1 is the value slightly above the densest range of cluster-merge distances (for Situ. 1, defined below), and CrD_2 is a value kept below Dis(N_0) (for Situ. 2), Dis(N_0) being the merge distance at the root node N_0 of DG_ROOT. The clustering result is supposed as C_r.
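One illustrative way to realize this minimum-of-two-validations rule is sketched below. The median-plus-margin and maximum-minus-margin terms are our simplified stand-ins for the paper's two designed validations, not the exact formulas.

```python
def relation_crd(merge_distances, margin=0.05):
    """Pick the relation-clustering critical distance CrD as the minimum of
    two candidate validations (simplified stand-ins):
      - a value a little above the dense range of merge distances (Situ. 1),
      - a value kept below the root merge distance Dis(N_0) (Situ. 2),
    so sub-domains with few, uniformly distant relation words are not
    collapsed into one cluster. `margin` is an illustrative parameter."""
    dists = sorted(merge_distances)
    dense = dists[len(dists) // 2] + margin   # a little above the dense merges
    cap = dists[-1] - margin                  # strictly below the root merge
    return min(dense, cap)
```

With many low merge distances and one large root merge, the first term wins; with few, uniformly large merges, the cap keeps CrD below the maximum.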

Sub-Domain Relation Extraction
For each T_i of s_i in S_data, the node ROOT(T_i) and the node ADV(ROOT(T_i)) are matched against the sub-domain's relation trigger words. With a list of non-clear right entity word groups, the non-specific relations are filtered out, a process supposed as FIL(s). The clear relation sentences of sub-domain S_i are then obtained, supposed as

RI_{S_i} = {s | s ∈ S_data, f_match(s) = True, FIL(s) = True},

where f_match(s) indicates that the relation trigger word matching succeeds for sentence s.
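The matching-and-filtering step can be sketched as follows. This is a simplified stand-in: it matches trigger words anywhere in the token sequence rather than at the ROOT/ADV nodes of a real dependency tree, and the names `extract_relation_sentences` and `vague_entities` are ours.

```python
def extract_relation_sentences(sentences, trigger_words, vague_entities):
    """Keep sentences whose relation word matches a sub-domain trigger word
    (f_match), then drop those whose right-hand side contains a non-clear
    entity word group (FIL) -- a simplified semantic correcting gate."""
    clear = []
    for tokens in sentences:
        hit = next((w for w in tokens if w in trigger_words), None)
        if hit is None:
            continue                                  # f_match(s) = False
        right = tokens[tokens.index(hit) + 1:]        # words after the relation
        if any(w in vague_entities for w in right):
            continue                                  # FIL(s) = False
        clear.append(tokens)
    return clear
```

The surviving sentences correspond to RI_{S_i}, the clear relation sentences used as to-be-tagged data for entity recognition.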
For each T_i of s_i, the entity nodes are tagged, supposed as Ne(T_i). The tagging process is supposed as Tag(s_i), and the tag sequence of s_i, supposed as tg_i, is obtained. Then CRF training and prediction are processed to obtain the entities.

Model = Train(seqs), PreE = SeqInfer(Model, RI_{S_i})
Train(seqs) represents CRF training on the input sequences seqs, and SeqInfer represents sequence inference. The predicted entities PreE are composition-completed through the entity correcting gate, which is designed with attention to the syntactic component sequence. Some clear relation sentences have one entity separated into several parts. For example, in {管理, 系统, 和, 支撑, 环境, 是, 数据, 存储, 、, 管理, 和, 运行, 维护, 的, 软硬件, 及, 网络, 条件} ({management, system, and, support, environment, is, data, storage, 、, management, and, executing, maintenance, hardware and software, and, network, conditions}), the entities 数据管理 (data management), 数据运行 (data executing) and 数据维护 (data maintenance) are separated. In response, the composition completing gate connects the separated parts through constraints on the syntactic components of the dependency trees. Figure 5 shows the process of the Domain Semantic Features Fusion Model.
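The syntactic feature fusion can be illustrated by the kind of feature function a CRF toolkit (e.g. sklearn-crfsuite) consumes: each token is described by its word and its dependency-tree syntactic component. The feature names and the (word, dependency-label) input format are our assumptions, not the paper's exact design.

```python
def word2features(sent, i):
    """Feature dict for token i, fusing the dependency-tree syntactic
    component (dep label, e.g. 'ROOT', 'ADV', 'SBV') into the usual CRF
    word features, so the model can learn sub-domain syntactic patterns.
    `sent` is a list of (word, dep_label) pairs for one sentence."""
    word, dep = sent[i]
    feats = {"word": word, "dep": dep, "word+dep": word + "|" + dep}
    if i > 0:
        prev_word, prev_dep = sent[i - 1]
        feats["-1:word"], feats["-1:dep"] = prev_word, prev_dep
    else:
        feats["BOS"] = True       # beginning-of-sentence marker
    if i < len(sent) - 1:
        next_word, next_dep = sent[i + 1]
        feats["+1:word"], feats["+1:dep"] = next_word, next_dep
    else:
        feats["EOS"] = True       # end-of-sentence marker
    return feats
```

A plain CRF baseline would use the same function without the `dep` features; the comparison in the experiments below contrasts exactly that kind of difference.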

EXPERIMENTAL RESULTS AND ANALYSES

Dataset
The proposed method is evaluated on publicly available documents of NSNR, which can be obtained from the official website of the Ministry of Natural Resources of the People's Republic of China. We used 172 of these files, including national standards for land, geology and minerals, national marine standards, and national surveying and mapping standards.

Sub-Domain Clustering
To show the multi-dimension features of the NSNR domain and to perform sub-domain clustering, hierarchical clustering of tf-idf vectors is processed on the standard documents. The resulting dendrogram is shown in Figure 6. It can be seen that most documents have a great cosine distance in tf-idf vectors from one another. This multi-dimension feature leads to the result that even standard documents in the same industry may have different subjects. For example, "基础地理信息数据库基本规定" ("Basic rules of basic geographic information database") and "地理信息质量评价过程" ("Evaluation process of geographic information quality") both describe standards of geographic information, but one is about database rules and the other focuses on evaluation; they are described from different professional perspectives. So the reliable CrD_domain ranges from 0.7 to 0.9.
The feedback mechanism on the sub-domain clustering CrD_domain is shown in Figure 7. The CrD-SDs_Dbw curve first shows a downward trend and then an upward trend, which accords with the actual case. The best CrD_domain is 0.81, which is in line with expectations, a little greater than the densest cluster-merge distances. The sub-domain clusters in this case have the best relationship clustering effect. It can also be seen that when CrD_domain moves away from the extreme value 0.81, the SDs_Dbw value tends to rise. Here we discuss the extreme cases of CrD_domain and their influence on the proposed metric SDs_Dbw. When CrD_domain is very high (close to 1), all documents are gathered into one sub-domain; the weight of the penalty item DN (sub-domain document numbers) in Equation (14) becomes extremely large (the number of total documents), and SDs_Dbw becomes bad (high). On the other hand, when CrD_domain is extremely low (close to 0), every sub-domain contains only one document, and the metric of relation clustering must be extremely bad; the metric s_Dbw of each sub-domain in Equation (14) becomes high, and thus SDs_Dbw becomes bad (high) as well. This discussion of extreme cases reflects that the proposed metric SDs_Dbw, which evaluates the relation clustering of a multi-dimensional domain, is reasonable and reliable.
The proposed metric SDs_Dbw is not only reasonable but also flexible. The underlying clustering metric s_Dbw can be replaced by any other reasonable metric, such as AMI, ARI or V-measure, as long as it can evaluate the relation discovery of each sub-domain; candidates are not limited to clustering metrics. The kernel of the metric SDs_Dbw is its penalty term, the scale of each sub-domain. When a relation discovery metric is used for which higher is better (unlike s_Dbw), the penalty item can be moved into the denominator to stay consistent with the metric. This pattern can be applied to any multi-dimensional task in which the scale of each sub-cluster influences subsequent steps.

Relationship Clustering
There are usually two types of situations in the hierarchical clustering of sub-domain data.
Situ. 1: The sub-domain has obviously various relation types with many relation words. The cluster-merge distances are mostly low and only individually high, because distances between relation word-vectors of the same type are small while distances between different types are large.
Situ. 2: The sub-domain contains only a few documents, perhaps only one or two. It has some relation types, each with only a few relation words. Therefore, the distances among the relation word-vectors are mostly large, even close to the largest one in the dendrogram.
The two situations are shown in Figure 8. CrD in Situ. 1 should lie between the mostly low distances and the individually high ones, while CrD in Situ. 2 should be lower than the highest distance. In response, the CrD of the relation hierarchical clustering takes the minimum of the two designed validations. Usually Situ. 2 is more difficult to deal with. Table I shows a relation clustering sample of a sub-domain in Situ. 2, using the CrD above. It can be seen that the relation words meaning 'use' and 'consist' are clustered together, and the relation words meaning 'define' are clustered together, which is close to manual classification.

Entity Recognition on Domain Semantic Features Fusion Model
To test the effect of the semantic fusion entity recognition model, we tag some clear relation sentences in a sub-domain. Clear relation sentences in one document are taken as training data, and those in four other documents are testing data. An entity counting recognition task is processed to contrast the performance of the DSFF model with the traditional CRF model. The labeled entities in the training documents are fed to the models, and the predicted entities in the testing data are counted, both correct and erroneous ones. Precision, recall and F-measure are used for comparison. A sequence of about 4000 words is taken as training data and split into 10 equal parts in order. Different scales of training data are taken randomly from the 10 parts and fed into the DSFF model, in 5, 6 and 7 parts respectively.

P = TP / (TP + FP), R = TP / (TP + FN), F = 2PR / (P + R),

where TP, FP and FN denote true positives, false positives and false negatives respectively.
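Precision, recall and F-measure can be computed directly from the prediction counts:

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from true-positive, false-positive
    and false-negative counts, guarding against division by zero."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```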
Table II compares the semantic feature fusion model, the word group sequence model without properties, and the POS tags fusion model. We compare performance on two tasks: one on entity types, which focuses on the recognized types of entities and ignores how many times each entity appears; the other on entity numbers, which counts the appearances of each entity. The semantic feature fusion model has more advantage than the other two models in entity type recognition, especially in recall, which confirms that the semantic feature fusion model has a stronger ability to extract new entities with similar syntactic constructions. The POS fusion model sometimes has higher recall on the total numbers of extracted entities but always has much lower precision than the other two models, which indicates that the single POS fusion model may cause confusion in recognizing the subject entity of a relation instance, because it cannot learn syntactic features.
To test the DSFF model's entity recognition ability with less training data, we use part of the tagged data CDUG as training data for the domain semantic feature fusion (DSFF) model. The proportions of tagged data used are 50%, 60% and 70%, and the performances of the DSFF model and the traditional CRF model are shown in Table III. With 60% of the training data, the DSFF model performs better in recall and F-measure than the traditional CRF model on all tested documents, which confirms that the DSFF model has a stronger ability to recognize entities through syntactic construction features. Compared to the traditional entity recognition model, it can reach the same or better performance with less training data.

CONTRIBUTIONS
The contributions of this paper are as follows:
1. This paper extracts structural knowledge from National Standards for Natural Resources (NSNR), which facilitates the revision and execution of NSNR and helps make better use of China's natural resources.
2. A multi-domain splitting model with a feedback mechanism is proposed for relation extraction from NSNR, which can automatically split multi-domain texts into sub-domain texts with little demand for expert effort.
3. A new evaluation metric based on sub-domain relationship clustering is designed to better split multi-domain texts, and is experimentally demonstrated to be effective.
4. An entity recognition method with semantic features fusion is designed to learn multi-dimensional semantic features; it reaches good performance with less training data and greatly reduces the workload of manual tagging.

CONCLUSION
This paper proposes a knowledge extraction method for multi-domain texts to extract structural knowledge from National Standards for Natural Resources (NSNR). The proposed method addresses several challenges of knowledge extraction from NSNR: 1) the domain is made up of different sub-domains; 2) each sub-domain is interdisciplinary; 3) different sub-domains are partly similar; 4) tagged corpus is lacking. Concretely, a multi-dimension splitting method with a feedback mechanism is proposed for knowledge discovery in multi-domain texts. For the feedback mechanism, the SDs_Dbw validation is designed, with consideration of wrong-merging risks, to evaluate sub-domain relation clustering. In sub-domain relation clustering, a CrD valuing method is designed for automatic relation discovery. For entity recognition, we use a semantic feature fusion CRF model based on dependency parsing feature fusion. Experimental results demonstrate that the proposed method performs well on relation discovery and entity recognition for NSNR. However, the method still has certain limitations: when the data covered by the multi-dimensional domain is too extensive, certain entity features of some sub-domains could be lost. In future work, entity frequency-related indicators could be explored and integrated to overcome this limitation.