The Impact of News on Public-Private Partnership Stock Price in China via Text Mining Method

Poshan Yu, Zhenyu Xu
Copyright: © 2021 | Pages: 23
DOI: 10.4018/978-1-7998-4963-6.ch013

In data analytics, text analysis is challenging, particularly when mining Chinese-language text. This study uses user-generated micro-blog data to conduct text mining and to analyze its impact on stock market performance in China. Based on Li's instance labeling method, this chapter examines the correlation between social media information and the stock prices of public-private partnership (PPP)-related companies. The authors crawled the EastMoney platform with a web crawler and obtained a total of 79,874 language data items posted from 10 January 2017 to 28 November 2019. The total material data obtained is 79,616 items, which the authors use for specific training in the financial corpus. The findings of this chapter indicate that investor sentiment has a certain impact on the stock price movement of selected stocks in the PPP sector.
Chapter Preview

Market Overview of China’s ML Industry

Definition and Classification of ML

Machine learning (ML) is the discipline that studies how computers simulate or implement human learning behaviors to acquire new knowledge or skills, allowing them to reorganize their existing knowledge structure and continuously improve their performance. ML is data-driven: it searches for rules by studying sample data and then predicts future data based on the rules obtained. ML is the core of artificial intelligence (AI) and underpins a wide range of AI fields such as data mining, computer vision, natural language processing (NLP), and biometric recognition.

  • (1) According to the learning mode, ML can be divided into supervised learning, unsupervised learning, and reinforcement learning:

① The training data for supervised learning carry classification labels. The more accurate the labels, the more accurate the learned model. Supervised learning fits a function model to the given training data so that it can map new data to labels. Supervised learning tasks include regression and classification, and application areas include NLP, information retrieval, text mining, handwriting recognition, and spam detection.

② Unsupervised learning uses a limited amount of unlabeled data to describe the structure or regularities hidden in the data. Its typical algorithm is clustering. Because unsupervised learning does not need manually labeled data as training samples, it avoids the classification errors caused by positive and negative sample offset. Application areas of unsupervised learning include economic forecasting, anomaly detection, data mining, image processing, and pattern recognition.

③ Reinforcement learning is a learning mode in which an agent maximizes its return through interaction with the environment. Its purpose is for the agent to achieve the best evaluation from the external environment. Reinforcement learning is widely used in robot control, unmanned driving, industrial control, and other fields.
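The difference between the first two modes can be illustrated with a minimal Python sketch. It is not taken from the chapter: the toy one-dimensional data, the nearest-centroid classifier, and the helper names `fit_centroids`, `predict`, and `two_means` are all assumptions chosen for illustration. The supervised part uses the labels to build the model; the unsupervised part sees only the raw points and groups them by clustering.

```python
from statistics import mean


def fit_centroids(points, labels):
    """Supervised: use the labels to compute one centroid per class."""
    classes = sorted(set(labels))
    return {c: mean(p for p, l in zip(points, labels) if l == c) for c in classes}


def predict(centroids, x):
    """Map a new point to the label of the nearest class centroid."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))


def two_means(points, iterations=10):
    """Unsupervised: split unlabeled 1-D points into two clusters (k-means, k=2).

    Assumes the points contain at least two distinct values.
    """
    lo, hi = min(points), max(points)  # initial centers at the extremes
    for _ in range(iterations):
        left = [p for p in points if abs(p - lo) <= abs(p - hi)]
        right = [p for p in points if abs(p - lo) > abs(p - hi)]
        lo, hi = mean(left), mean(right)  # move each center to its cluster mean
    return sorted(left), sorted(right)


# Hypothetical toy data: the same points, with and without labels.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
labels = ["low", "low", "low", "high", "high", "high"]

centroids = fit_centroids(points, labels)       # needs the labels
label_for_new_point = predict(centroids, 4.0)   # annotation of new data
clusters = two_means(points)                    # never sees the labels
```

On this toy data the clustering recovers the same two groups that the labels describe, but it cannot name them; only the supervised model can attach "low" or "high" to a new point.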

  • (2) According to the depth of the algorithm network, ML can be divided into shallow learning and deep learning:

① A shallow learning network has few hidden layers and a simple algorithmic framework, and it does not need to extract multi-level abstract features. Typical shallow learning algorithms include support vector machines and logistic regression.

② Deep learning is a self-learning method based on multi-layer neural networks that takes large amounts of data as input. It relies on the large amount of actual behavior data provided to it, that is, the training data set, to adjust the parameters and rules of the model. A deep learning network has many hidden layers and a complex algorithmic structure. Compared with shallow learning, deep learning places more emphasis on feature learning. Typical deep learning algorithms include convolutional neural networks and recurrent neural networks.
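As a rough illustration of the shallow end of this spectrum, the sketch below trains a logistic regression model with plain gradient descent. It is an assumption-laden toy rather than anything from the chapter: the data, the learning rate, the epoch count, and the function names `sigmoid` and `train_logistic` are hypothetical. The point is that the model maps the input directly to an output through one weight and one bias, with no hidden layers in between.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit a 1-D logistic regression (weight w, bias b) by gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean log-loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical toy data: negative inputs belong to class 0, positive to class 1.
xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(xs, ys)
predictions = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in xs]
```

A deep network would insert several layers of learned intermediate features between `x` and the final sigmoid; here the only "feature" is the raw input itself.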

Key Terms in this Chapter

Deep Learning: A data-based learning algorithm built on artificial neural networks. Within ML, deep learning is a form of representation learning from data.

Crawler: Crawlers or web crawlers, also known as web spiders, are web bots that automatically browse the World Wide Web. Web crawlers save the pages they visit so that search engines can later generate indexes from them.

Public-Private Partnership (PPP): A co-operation between government and social capital that serves as an operation mode for public infrastructure projects. Under this model, private enterprises, private capital, and the government are encouraged to participate in the construction of public infrastructure. Broadly speaking, PPP refers to the participation of non-public sector resources in the provision of public products and services through co-operation between the government and the private sector, which generates more beneficial results for the partners than would be expected from acting individually.

Artificial Intelligence (AI): A system or machine that imitates human intelligence to perform tasks and can continuously improve itself according to the information it collects.

Machine Learning (ML): ML is a branch of artificial intelligence and a way to realize AI; by using ML, users may solve AI problems. ML theory is mainly about designing and analyzing algorithms that enable computers to “learn” automatically. ML algorithms, as a means, allow users to predict unknown data.

Natural Language Processing (NLP): NLP allows computers to convert input language into symbols and relationships of interest, which are then processed according to purpose. NLP explores how to handle natural languages, which involves many aspects and steps, including cognition, understanding, and generation. Natural language generation systems convert computer data into natural language.

Text Mining: Text mining (also called text analysis) is the process of converting unstructured text data into meaningful and actionable information. Text mining uses different AI technologies to automatically process data and generate valuable insights, enabling companies to make data-driven decisions.
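To make the idea of turning unstructured text into usable information concrete, here is a minimal lexicon-based sentiment sketch in Python. It is not the chapter's method (the chapter builds on Li's instance labeling method and Chinese-language posts); the English sample posts, the tiny `POSITIVE` and `NEGATIVE` word lists, and the function names are hypothetical and serve only as an illustration.

```python
import re
from collections import Counter

# Hypothetical miniature sentiment lexicon; a real financial lexicon is far larger.
POSITIVE = {"gain", "rise", "strong", "profit"}
NEGATIVE = {"loss", "fall", "weak", "risk"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_score(text):
    """Return a score in [-1, 1]: (positive - negative) / matched lexicon words."""
    counts = Counter(tokenize(text))
    pos = sum(counts[w] for w in POSITIVE)
    neg = sum(counts[w] for w in NEGATIVE)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


# Hypothetical posts standing in for user-generated micro-blog text.
posts = [
    "Strong profit this quarter, shares rise",
    "Heavy loss and more risk ahead",
    "Board meeting scheduled for Monday",
]
scores = [sentiment_score(p) for p in posts]
```

Each post is reduced to a single number that could, in principle, be aligned with price data by date. Chinese text would additionally require a word segmentation step before counting, since words are not separated by spaces.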
