In this chapter, we introduce alternative ways to access digital audio collections. We give an overview of existing applications based on two-dimensional, map-like representations of music collections. Further, we explain two applications for accessing audio files that are based on the Self-Organising Map, an unsupervised neural network model. These two applications, PlaySOM and PocketSOM, are explained in greater detail, with special attention paid to their unique properties and their implementations for several mobile devices. These examples are intended to raise the reader's interest in alternative interfaces to large audio collections. We also hope to show that alternative interfaces are feasible for both desktop computers and mobile devices and offer a practical approach to pressing issues in accessing digital collections.
Digital Libraries Of Audio Collections
An increasing number of users adopt new technologies such as MP3 players and manage their audio collections digitally. The increasing private use of digital media files (i.e., audio and, more recently, video files) is driven not only by the wide availability of personal audio devices such as the Apple iPod™; the music industry is also starting to adopt new distribution channels, and at the same time, an increasingly large user base is buying its music online. This makes the need for advanced methods to browse and search for music more pressing than ever. The demand from a rising number of users and providers for feasible means of navigating ever-growing numbers of digital entities shows the great potential that new and more sophisticated approaches could hold.
Text-based searches for artist and track names, as well as browsing through collections that are hierarchically structured according to artist, album, and track categories, constitute the de facto standard for accessing music collections on PCs and mobile devices. At the same time, new means for visualising and exploring large audio collections are being developed based on automatic analysis of the acoustic content of the audio files. Different visualisations have been proposed, many using some two-dimensional landscape onto which the music files are mapped. Most of them incorporate some kind of clustering (i.e., mapping from a high-dimensional feature space to a usually two-dimensional output space). A particularly interesting effect of using such unsupervised learning techniques is the potential to overcome problems stemming from manually assigned genre tags, which may not suit every user or may simply be wrong (Pachet & Cazaly, 2000). The approaches offer varying interaction possibilities, such as drawing trajectories on the map, selecting the music underneath, or marking regions on the map, which are discussed in detail in this chapter.
Disc and rectangle visualisations for displaying and manipulating playlists were proposed by Torrens, Hertzog, and Arcos (2004). The disc visualisation gave a better visual idea of the proportions within the collection, whereas zooming was more usable with the rectangle visualisation.
Several teams have been working on user interfaces based on the Self-Organising Map (SOM). The SOM is an unsupervised neural network that provides a topology-preserving mapping from a high-dimensional feature space onto a two-dimensional map, such that data points close to each other in the input space are mapped onto adjacent areas of the output space. The SOM has been used extensively to provide visualisations of and interfaces to a wide range of data, ranging from control interfaces for industrial processing plants (Kohonen, Oja, Simula, Visa, & Kangas, 1996) to access interfaces for digital libraries of text documents (Rauber & Merkl, 2003).
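To make the mapping described above more concrete, the following is a minimal sketch of SOM training, assuming the classic online update rule with a Gaussian neighbourhood and linearly decaying learning rate and neighbourhood radius. It is not the implementation used by any of the systems cited in this chapter. The function names (train_som, best_matching_unit), all parameter values, and the three-dimensional toy vectors, which merely stand in for real audio features, are assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the unit whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda u: sum((wk - xk) ** 2 for wk, xk in zip(weights[u], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM (illustrative sketch, hypothetical parameters)."""
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # One weight vector per map unit, initialised randomly in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)  # present the inputs in random order each epoch
        for i in order:
            x = data[i]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # radius shrinks, floor 0.5
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                # Squared distance on the map grid, not in feature space.
                d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                # Pull the unit towards x, more strongly the nearer it is to the BMU.
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


# Toy data: two well-separated groups of "songs" in a 3-D feature space.
data_rng = random.Random(1)


def jitter(centre):
    return [centre + data_rng.uniform(-0.05, 0.05) for _ in range(3)]


data = [jitter(0.1) for _ in range(20)] + [jitter(0.9) for _ in range(20)]
weights = train_som(data, rows=4, cols=4)

bmu_a = best_matching_unit(weights, [0.1, 0.1, 0.1])
bmu_b = best_matching_unit(weights, [0.9, 0.9, 0.9])
print("group A maps to unit", bmu_a, "and group B maps to unit", bmu_b)
```

In this sketch, each input pulls its best-matching unit and that unit's neighbours on the grid towards itself, which is how similar inputs end up on adjacent map units. In a music interface, each unit would then hold the tracks whose feature vectors it matches best.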
A SOM-based interface for digital libraries of music, the SOM-enhanced JukeBox (SOMeJB), was first proposed in Rauber and Frühwirth (2001); more advanced visualisations and improved feature sets were presented in Pampalk, Rauber, and Merkl (2002) and Rauber, Pampalk, and Merkl (2003). Since then, several other systems have been created based on these principles, such as the MusicMiner (Mörchen, Ultsch, Nöcker, & Stamm, 2005), which uses an emergent SOM. A very appealing three-dimensional user interface is presented in Knees, Schedl, Pohle, and Widmer (2006), which automatically creates a three-dimensional musical landscape via a SOM for small private music collections. Navigation through the map is done via a video game pad, and additional information such as labelling is provided using Web data and album covers.
A mnemonic SOM (i.e., a Self-Organising Map of a certain shape other than a rectangle) is used to cluster the complete works of the composer Wolfgang Amadeus Mozart to create the Map of Mozart (Mayer, Lidy, & Rauber, 2006). The shape of the SOM is a silhouette of the composer, leading to interesting clusterings (for example, the accumulation of string ensembles in the region of Mozart’s right ear). An online demo is available at http://www.ifs.tuwien.ac.at/mir/mozart.