1. Introduction
Over the last decade, the photo and video capturing capabilities of smartphones have improved significantly. These days, state-of-the-art smartphones capture images at 16 megapixels with optical sensor-based image stabilization, and record 4K video at very high frame rates (up to 240 frames per second). Furthermore, current smartphones provide multi-touch interaction on high-resolution screens, as well as force-enabled touch interaction (e.g., soft tap and hard tap).
What significantly lags behind, however, is the software for video interaction – particularly when users want to search for content in videos. Even on current devices, video interaction is clearly optimized for simple playback and becomes very inconvenient when users want to search for content in videos (a need of growing importance as more and longer videos are recorded). As shown in Figure 1, the standard interface of mobile video players uses an interaction model very similar to the one developed years ago for desktop PCs with mouse and keyboard: buttons for play, pause, and jump forward/backward, and a scrubbing-bar – also known as seeker-bar or thumb-bar – for navigation/random access in the video.
Figure 1. Default video player interfaces on current smartphones (Left: Android 5, Right: iOS 8)
The core problem, however, is that users interacting with their finger cannot smoothly navigate in the video and therefore cannot conveniently search for content. In a long video, moving the scrubbing-bar with the finger by just a little already makes the player jump by seconds or even minutes. This behavior makes content search an awkward and frustrating process on mobile devices. Moreover, when using the scrubbing-bar on iOS (right part of Figure 1), which is placed at the top of the screen, the hand occludes the content of the video, which significantly hinders seeking for content in a video.
It has to be noted, however, that on Apple’s iOS the scrubbing-bar provides a special ‘fine-scrubbing’ feature that allows users to change the navigation resolution by dragging the virtual scrubbing-bar down to different vertical positions (see the textual hint below the scrubbing-bar at the top right of Figure 1). This navigation feature is very similar to the method proposed by Hürst et al. for personal digital assistants a decade ago (Hürst et al., 2004). The problem with the current iOS implementation, though, is that when users recognize they have searched in the wrong area, they have to release the finger in order to move to a different area of the video and perform fine scrubbing there. Since iOS version 9.0, the default video player also displays very small thumbnails behind the seeker-bar, similar to the ‘interactive navigation summaries’ proposed in (Schoeffmann et al., 2010), which help preserve the navigation context.
Even with these improvements, navigating video on smartphones remains a tedious and cumbersome process. This is especially true when the smartphone is held in portrait orientation, as users often prefer for convenience, since the horizontal space available for the scrubbing-bar is very small and hence limits the navigation resolution. As smartphone users record and store an increasing number of videos on their devices and often want to share specific scenes, we need more efficient navigation means rather than pure playback features with inconvenient navigation support.
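The limited navigation resolution can be illustrated with simple arithmetic. The following sketch (with assumed, illustrative values for video length and scrubbing-bar width, not taken from any particular device) computes the smallest time step reachable by a one-pixel thumb movement:

```python
def seconds_per_pixel(video_seconds: float, bar_width_px: int) -> float:
    """Smallest time step reachable by moving the scrubbing thumb one pixel."""
    return video_seconds / bar_width_px

# Assumed example: a 60-minute video on a ~300 px-wide portrait scrubbing-bar.
step = seconds_per_pixel(60 * 60, 300)
print(f"{step:.0f} seconds per pixel")  # prints "12 seconds per pixel"
```

Under these assumptions, even a single-pixel movement skips 12 seconds of video, so finer-grained targets simply cannot be reached via the scrubbing-bar alone.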
To this end, we investigated alternative interaction models for video navigation on mobile devices with touchscreens, such as navigation by wipe gestures (Schoeffmann et al., 2014), by flick gestures (Schoeffmann et al., 2016), by vertical keyframe lists (Hudelist et al., 2013), and by interactive hierarchical visualizations of keyframes at different levels of granularity (Hudelist et al., 2015).