Traditional user interface design generally deals with the problem of enhancing the usability of a particular mode of user interaction, and a large body of literature exists concerning the design and implementation of graphical user interfaces. Under the additional constraints introduced by smaller mobile devices, such as mobile phones and personal digital assistants (PDAs), an intuitive and heuristic user interface design is more difficult to achieve. Multimodal user interfaces employ several modes of interaction; these may include text, speech, visual gesture recognition, and haptics. To date, systems that employ speech and text for application interaction appear to be the mainstream multimodal solutions. There is some work on the design of multimodal user interfaces for general mobility, accommodating laptops or desktop computers (Sinha & Landay, 2002). However, advances in multimodal technology to accommodate the needs of smaller mobile devices, such as mobile phones and PDAs, are still emerging. Mobile phones are now commonly equipped with the means for visual browsing of Internet applications, although their small screens and cumbersome text input methods pose usability challenges. The use of a voice interface together with a graphical interface is a natural solution to several challenges that mobile devices present. Such interfaces enable the user to exploit the strengths of each mode, making it easier to enter and access data on small devices. Furthermore, the flexibility offered by multiple modes for one application allows users to adapt their interactions to their preferences and to the environmental setting. For instance, hands-free speech operation may be used while driving, whereas graphical interaction can be adopted in noisy surroundings or when private data, such as a password, must be entered in a public environment.
In this article, we discuss multimodal technologies that address the technical and usability constraints of the mobile phone or PDA. These environments pose several additional challenges over general mobility solutions, including the limited computational power of the device, bandwidth constraints, and screen size restrictions. We outline the requirements of mobile multimodal solutions involving cellular phones. Drawing upon several trial deployments, we summarize the key design points from both a technology and a usability standpoint, and identify the outstanding problems in these designs. We also outline several future trends in how this technology is being deployed in various application scenarios, ranging from simple voice-activated search engines to comprehensive mobile office applications.