This chapter illustrates the importance of human factors in building natural language processing (NLP) systems. The authors examine different NL interface styles and processing approaches and correlate them with human factors such as domain, interface, text style and medium of communication. They verify their assumption by presenting an NLP system that was built as a proof of concept. However, because of semantics and the very nature of language, the authors also discuss their concern about possible abuse by unscrupulous persons who would attempt to exploit NLP systems for reasons other than legitimate information exchange.
Introduction
Techniques of automatic natural language processing (NLP) have been under development since the earliest computing machines, and in recent years these techniques have proven to be robust, reliable and efficient enough to lead to commercial products in many areas. Applications include machine translation, natural language interfaces and the stylistic analysis of texts, but NLP techniques have also been applied to other computing tasks.
A natural language (NL) interface accepts user input in natural language, allowing interaction with some system, typically a retrieval system, which then returns adequate responses to the input NL text or query statements. However, these systems can also be used subsequently to facilitate communication with the outside world. Hence, a natural language interface should be able to translate unrestricted natural language statements into appropriate actions for the system, and safeguards should be in place in case the system is used to communicate freely with others.
This type of unrestricted NL interface is an attractive choice because, if it could be built, it would offer many advantages. Firstly, it does not require any learning or training, because its structure and vocabulary are already familiar to the user. Secondly, natural language allows users to encode composite meanings. Thirdly, this type of interface is text-based, making it suitable for all types of devices and media. In contrast, form-based or graphical user interfaces need more sophisticated and specific resources.
Incorporating an NL interface requires translating ambiguous user inputs into clear intermediate representations. Two main problems are associated with building such systems: the first is handling linguistic knowledge and the second is handling domain knowledge.
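To make the distinction between the two kinds of knowledge concrete, the following is a minimal sketch, not the system described in this chapter. It assumes a toy travel domain; the names `LEXICON`, `DOMAIN_ENTITIES`, `DOMAIN_CITIES`, `Frame` and `to_frame` are all hypothetical and introduced only for illustration. The lexicon stands in for linguistic knowledge, the entity and city sets stand in for domain knowledge, and the frame is one possible intermediate representation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical linguistic knowledge: surface words mapped to canonical concepts.
LEXICON = {
    "show": "list", "list": "list", "find": "list",
    "flight": "flight", "flights": "flight",
    "hotel": "hotel", "hotels": "hotel",
}

# Hypothetical domain knowledge: the entities and cities this toy domain knows.
DOMAIN_ENTITIES = {"flight", "hotel"}
DOMAIN_CITIES = {"amman", "cairo"}


@dataclass
class Frame:
    """An illustrative intermediate representation of a user request."""
    action: Optional[str] = None
    entity: Optional[str] = None
    filters: Dict[str, str] = field(default_factory=dict)


def to_frame(text: str) -> Frame:
    """Map free text to a Frame using the lexicon and the domain sets."""
    frame = Frame()
    for token in text.lower().replace("?", " ").split():
        concept = LEXICON.get(token)       # linguistic lookup
        if concept == "list":
            frame.action = "list"
        elif concept in DOMAIN_ENTITIES:   # domain check on the concept
            frame.entity = concept
        elif token in DOMAIN_CITIES:       # domain check on the raw token
            frame.filters["city"] = token
    return frame


print(to_frame("Show flights to Amman?"))
```

Words that appear in neither resource are simply ignored in this sketch. A real interface would also have to handle ambiguity, spelling variation and morphology, which is where most of the development effort lies.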
A study of the current workplace shows that deployed NL interface systems are rare and that most of them are only prototypes. This problem is not related to the openness or restrictiveness of the domain. Although most task-oriented activities are domain-specific, we do not yet find any operational systems based on a restricted NL interface, especially for under-resourced languages such as Arabic. Not all languages have received equal investment in linguistic resources and tool development (Riloff, Schafer et al. 2002). As an example, most of the published research on information extraction (IE) discusses problems related to English, which is a resource-rich language. While the performance of some existing English-based systems is comparable to that of human experts, natural language processing for the Arabic language is still in its initial stage (Hammo et al. 2002).
NL-based systems have a reputation for high development cost and low quality. Our goal in this chapter is to show that the most important issue in building NL-based systems is the incorporation of human factors in the development, regardless of how rich the target language is in resources, the type or complexity of the domain, or even the cleanliness of the input text. If this approach is combined with treating an NLP project as an engineering problem, and not only as a traditional linguistic problem, it is almost guaranteed to produce a system with industrial quality and high extensibility, with the minimum resources possible.
The key success factors in building NL systems are understanding how people encode their thoughts, and finding the right representation to model the concerned domain knowledge. Furthermore, to help explore Socially Aware Language Understanding, it is important to identify a set of tasks and task performance goals, target groups of language users, and a set of situations, and then evaluate the effectiveness of meeting increasingly challenging goals under different approaches and styles of computationally mediated language usage.
In this chapter, we first give background information about natural language interface styles and processing. Then, we present the human factors that are involved in building an NLP system. Methods used in incorporating human factors are also discussed. Finally, we discuss, as a case study, a system that can handle spontaneous Arabic SMS text and show that human factors were key success elements.