In this chapter the authors discuss a particular approach to the creation of socio-technical systems for the meeting domain. Besides presenting a methodology, the chapter presents applications that have been built on the basis of the method as well as applications that can be envisioned. Throughout the chapter, illustrations are drawn from research on the development of meeting support tools. The chapter concludes with a section on implications and considerations for the ongoing development of socio-technical systems in general and for the meeting domain in particular.
Assimilation into the Borg Collective might be inevitable, but we can still make it a more human place to live.
Socio-technical computing inherits the complexity related to software engineering and system integration whilst embedding the human in the loop. It also inherits the difficulties of understanding and modeling human-human and human-computer interaction in the context of a changing environment (see Clancey, 1997). In this chapter we will outline an approach to the development of socio-technical systems, with the focus on meeting support. This approach can be characterized as theory-informed and data-driven. In essence, the method consists of the following four steps.
Step 1: Collection of a multimodal corpus of social activity signals
Step 2: Annotation of the collected material, describing the many aspects of system-relevant activities
Step 3: Discovery of interdependencies between recorded signals and annotations, annotations and annotations, and signals and signals (e.g. by means of machine learning)
Step 4: System creation based on knowledge obtained from the previous steps
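The four steps above can be sketched as a minimal, self-contained pipeline. The features, labels, and the simple threshold "learner" below are invented for illustration only; they are not the actual tooling of the projects described in this chapter.

```python
# Minimal sketch of the four-step method; data and names are illustrative.

# Step 1: a "corpus" of recorded signal segments (here just toy features).
corpus = [
    {"speech_energy": 0.9, "gaze_at_speaker": 0.8},
    {"speech_energy": 0.1, "gaze_at_speaker": 0.2},
    {"speech_energy": 0.8, "gaze_at_speaker": 0.7},
    {"speech_energy": 0.2, "gaze_at_speaker": 0.1},
]

# Step 2: human annotations of a system-relevant activity per segment.
annotations = ["discussion", "idle", "discussion", "idle"]

# Step 3: discover a dependency between a signal and the annotations.
# Here: a decision threshold placed midway between the class means.
def learn_threshold(corpus, annotations, feature):
    by_label = {}
    for segment, label in zip(corpus, annotations):
        by_label.setdefault(label, []).append(segment[feature])
    means = [sum(vals) / len(vals) for vals in by_label.values()]
    return sum(means) / len(means)

threshold = learn_threshold(corpus, annotations, "speech_energy")

# Step 4: a "system" that applies the learned knowledge to new input.
def classify(segment):
    return "discussion" if segment["speech_energy"] >= threshold else "idle"

print(classify({"speech_energy": 0.85}))  # discussion
```

In a real smart meeting room the features would come from audio-visual processing and the learner would be a statistical model trained on a large annotated corpus, but the division of labour between the four steps is the same.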
In the collection and annotation steps, the process relies heavily on insights provided by the social sciences, in particular sociology, social psychology, and linguistics. In return, the annotated collection and the machine learning effort may provide important insights for social theorizing, as the annotated corpus provides the researcher with statistics about the occurrence and distribution of certain phenomena, along with interesting correlations. Increased insight into how people behave can point out problems they encounter in their activities, problems that may be relieved by technologies based on the understanding derived through Steps 1 to 3. This means that these steps can be viewed both as a way into requirements engineering and as providing the basic data and algorithms needed to build the tools that can solve some of these problems.
Technology that inherits these possibilities can be said to be social for three reasons. The first is the way in which the system supports social activities. The second is the way the technology can provide insight into social processes, as happens when correlations between phenomena are found. The third is the way social theories underlie the construction of the technical applications: given theories on how humans ‘operate’, the technology is, as it were, equipped with the manual it needs to understand and support that operating.
As the example case for this chapter, our focus is on small business meetings. Currently several projects worldwide are investigating the ways in which technology can support the needs of people in meetings and relieve them of some of the frustrations that meetings seem to impose upon them. Examples in this chapter will be drawn mainly from studies in a series of European projects on meeting analysis and meeting support: M4, AMI, and AMIDA. These projects investigated how human-centred computing techniques can detect and interpret activities of participants in smart meeting rooms, and how these techniques can be used to design tools that support meeting participants in their encounters and activities.
This chapter discusses a variety of methodological issues and charts several results showing the rationale behind the scientific drive to develop technological support for social gatherings and events. The chapter also contains a short discussion on ethical issues and potential pitfalls on the road ahead.
Key Terms in this Chapter
Smart Meeting Room: A smart meeting room uses multi-modal sensors to detect and capture the verbal and nonverbal behavior of meeting participants. This is done in order to provide real-time support to these participants and to record meeting activity for off-line intelligent browsing and retrieval. Modeling multi-party human-to-human interaction, e.g. by using machine learning approaches, helps to recognize important activities and events during a meeting.
Sensor Information: Sensors in smart environments provide us with information about their inhabitants, their activities, and their interactions. Cameras and microphones allow audio-visual processing of perceived activity. Proximity and pressure sensors tell us about the location of inhabitants. Such sensors allow us to track the inhabitants and their activities in the environment. Devices that measure physiological information, including brain activity, can provide detailed information about the affective state of a user.
Machine Learning: This subfield of artificial intelligence is concerned with the design, analysis, implementation and applications of programs that learn from experience. The discovery of general rules from large data sets using computational and statistical methods is an important application area. Such large data sets can, for example, be corpora that contain audio- and video-recorded human-human or human-computer interaction.
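The notion of "learning from experience" can be illustrated with one of the simplest possible learners, a 1-nearest-neighbour classifier in pure Python. The feature vectors and labels below are invented for illustration.

```python
# Minimal sketch of learning from experience: a 1-nearest-neighbour
# classifier. Training data and labels are illustrative only.

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def nearest_neighbour(train, query):
    """Return the label of the training example closest to the query."""
    best = min(train, key=lambda example: distance(example[0], query))
    return best[1]

# "Experience": (feature vector, label) pairs, e.g. (pitch, loudness).
train = [((0.9, 0.8), "excited"), ((0.1, 0.2), "calm"),
         ((0.8, 0.9), "excited"), ((0.2, 0.1), "calm")]

print(nearest_neighbour(train, (0.7, 0.7)))  # excited
```

The program discovers no explicit rule, but its behaviour on new input is determined entirely by the examples it has seen, which is the essence of the definition above.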
Multimodal Interface: Interface to a computer system (from a mobile device to a smart environment) that allows multiple modes of interaction. Among the modalities can be speech, touch, gaze, or gestures. Modalities can supplement one another, but also complement one another. Combining different input modalities is called fusion. It allows a system to disambiguate user input in order to get a more complete understanding of a user’s commands or behavior.
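A minimal sketch of such fusion, assuming a simple late-fusion scheme over per-modality confidence scores; the commands, scores, and weights below are invented for illustration.

```python
# Sketch of late fusion: combining per-modality confidence scores to
# disambiguate user input. All scores and commands are illustrative.

def fuse(speech_scores, gesture_scores, w_speech=0.6, w_gesture=0.4):
    """Weighted late fusion over a shared command vocabulary."""
    commands = set(speech_scores) | set(gesture_scores)
    fused = {c: w_speech * speech_scores.get(c, 0.0)
                + w_gesture * gesture_scores.get(c, 0.0)
             for c in commands}
    return max(fused, key=fused.get)

# Speech alone is ambiguous between "zoom" and "pan" ...
speech = {"zoom": 0.45, "pan": 0.45, "rotate": 0.10}
# ... but a pinch gesture strongly suggests "zoom".
gesture = {"zoom": 0.8, "pan": 0.1, "rotate": 0.1}

print(fuse(speech, gesture))  # zoom
```

Here the gesture modality resolves an ambiguity that the speech modality cannot resolve on its own, which is exactly the disambiguation role of fusion described above.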
Corpus-based Research: Traditionally a corpus is a collection of language examples: written or spoken examples of words, sentences, phrases or texts. Nowadays a corpus can be any collection of examples, for example, human-human interactions, protein interactions, video fragments, maintenance information, etc. A corpus is collected in order to learn from it, that is, to extract domain-specific information. Examples can be analysed and rules and models underlying the examples can be discovered. Machine learning algorithms are used to extract relationships between examples. Manual structuring of such data (annotation) allows the integration of human preferences and knowledge in machine learning algorithms.
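One of the simplest ways to learn from an annotated corpus is to extract occurrence and distribution statistics from it. The utterance records and dialogue-act labels below are invented for illustration.

```python
# Sketch: occurrence statistics over an annotated corpus.
# The speaker and dialogue-act annotations are illustrative only.
from collections import Counter

corpus = [
    {"speaker": "A", "dialogue_act": "inform"},
    {"speaker": "B", "dialogue_act": "question"},
    {"speaker": "A", "dialogue_act": "inform"},
    {"speaker": "B", "dialogue_act": "inform"},
    {"speaker": "A", "dialogue_act": "question"},
]

# Distribution of a phenomenon over the whole corpus ...
act_counts = Counter(u["dialogue_act"] for u in corpus)
# ... and conditioned on a second annotation layer (the speaker).
per_speaker = Counter((u["speaker"], u["dialogue_act"]) for u in corpus)

print(act_counts["inform"])          # 3
print(per_speaker[("A", "inform")])  # 2
```

Counts like these are the raw material both for the social-scientific insights mentioned earlier in the chapter and for training statistical models on the corpus.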
Nonverbal Behaviour: Nonverbal behaviour not only supports verbal communication. By observing nonverbal behavior, the observer, whether it is a computer system or a human observer, can learn about the intentions, the attitudes and the feelings of the human partner. Nonverbal behavior includes gaze behavior, facial expressions, body posture, gestures, and prosodic information, but it can also include physiological information. Hence, supporting verbal communication, issuing nonverbal commands, and allowing our human or computer partners to learn about our feelings, intentions, and preferences are the main reasons for needing to detect and interpret nonverbal behavior.
Annotation Process: A corpus of examples, whether these are language examples or examples of different kinds of interaction, can be annotated with human knowledge that makes it possible to distinguish characteristics of these examples. Machine learning algorithms can be guided and supported by such annotations, and machine learning results provide feedback on our intuitions and heuristics concerning which features of the examples help to divide them into classes. To support human annotators, tools are developed that visualize and otherwise emphasize characteristics of the examples in the corpus.
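Concretely, an annotation layer is often a set of time-aligned labels over a recorded signal. The sketch below shows one possible shape for such a layer; the field names and activity labels are invented for illustration and do not follow any particular annotation tool's format.

```python
# Sketch of a time-aligned annotation layer over a recorded meeting.
# Field names and labels are illustrative only.

annotations = [
    {"start": 0.0, "end": 4.2, "label": "monologue"},
    {"start": 4.2, "end": 9.0, "label": "discussion"},
    {"start": 9.0, "end": 12.5, "label": "presentation"},
]

def label_at(annotations, t):
    """Look up which annotated activity covers time t (in seconds)."""
    for a in annotations:
        if a["start"] <= t < a["end"]:
            return a["label"]
    return None  # no annotation covers this time

print(label_at(annotations, 5.0))  # discussion
```

Aligning such layers with the recorded signals is what lets machine learning algorithms relate features of the raw recordings to the human judgments expressed in the annotations.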