Only a few decades ago, human-computer interaction was based on a rudimentary text user interface; a convenient method compared to the punch-card era, but too tedious and unappealing for nonspecialists, and thus unsuitable for the mass market. Later, the multimedia era arrived: personal computers and other devices gained powerful graphics capabilities and began presenting users with full-colour pictures. Although images made interaction with computers more pleasant, their use posed a new challenge for electronic engineers. While text needs only a few bytes (typically one byte per character in extended ASCII), images require far more data: a “raw” (i.e., uncompressed) representation of a colour image needs as many as three bytes per pixel (picture element, each dot forming the image). There was therefore a clear need to reduce the number of bytes required to encode an image, mainly to avoid an excessive increase in the memory and network bandwidth needed to store and transmit images, which would otherwise limit or prevent their use in practice. In general, compression is almost mandatory for handling images efficiently in an information system. Fortunately, most images (especially natural and synthetic ones) are highly redundant signals, since neighbouring pixels tend to be very similar; this redundancy, often called spatial redundancy, can be reduced through compression, yielding a more compact representation.
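The storage cost of raw images mentioned above is easy to quantify. The sketch below assumes, purely for illustration, a Full HD frame; the resolution is not taken from the chapter:

```python
# Back-of-envelope storage cost of an uncompressed ("raw") RGB image.
# The 1920x1080 resolution is an illustrative assumption.
width, height = 1920, 1080          # a Full HD frame
bytes_per_pixel = 3                 # one byte each for R, G and B
raw_size = width * height * bytes_per_pixel
print(raw_size)                     # → 6220800 bytes, i.e. roughly 5.9 MiB
```

A single uncompressed frame thus needs about six million bytes, which is why compression is nearly unavoidable in practice.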
The main contribution of this chapter is a brief survey of ISO standards for image coding. The chapter is organised as follows. For lossy compression of continuous-tone still images, it reviews the classic ISO JPEG standard and the newer ISO JPEG 2000; for lossless compression, ISO JPEG-LS is presented. In addition, the authors review the JBIG standard, which targets binary image coding and is widely used for fax transmission.
For continuous-tone images (e.g., those from a digital camera), each pixel takes a value in a (nonbinary) range; typically any value in [0..2⁸−1] (i.e., 0 to 255) for a greyscale image. For colour images, three bytes per pixel are commonly used (one byte per colour component: red, green, and blue, known as the RGB colour space), so each pixel can represent one of up to 2²⁴ colours. The well-known JPEG standard (JPEG, 1992) focuses on this type of image. Its sequential mode, widely used throughout the entertainment industry, is based on removing information that is hardly perceived by a human viewer (in particular, high-frequency components are encoded less accurately). Although the decompressed image is not identical to the original when compared pixel by pixel, the perceptual quality can be nearly the same, provided that the compression is not too heavy. An evolution of this standard is JPEG 2000 (JPEG 2000, 2000), based on the same ideas but more efficient and flexible. However, the higher complexity of JPEG 2000 and the current widespread use of JPEG make the success of this new version in the mass market uncertain.
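The idea of discarding high-frequency information can be illustrated with a toy 8×8 block: transform it with a 2-D DCT, drop the high-frequency coefficients, and transform back. The pure-Python sketch below is only a crude stand-in for JPEG's actual quantisation tables and entropy coding; the smooth test block and the "keep only u + v < 2" rule are illustrative assumptions:

```python
import math

N = 8  # JPEG processes the image in 8x8 blocks

def dct_1d(x):
    """Orthonormal 1-D DCT-II of an N-sample list."""
    out = []
    for k in range(N):
        c = math.sqrt((1 if k == 0 else 2) / N)
        out.append(c * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                           for n in range(N)))
    return out

def idct_1d(X):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    return [sum(math.sqrt((1 if k == 0 else 2) / N) * X[k]
                * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N))
            for n in range(N)]

def separable_2d(block, f):
    """Apply a 1-D transform f to every row, then to every column."""
    rows = [f(list(r)) for r in block]
    cols = [f(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# A smooth (highly spatially redundant) test block: a diagonal grey ramp.
block = [[r + c for c in range(N)] for r in range(N)]

coeffs = separable_2d(block, dct_1d)
# Crude "quantisation": discard every coefficient except the three
# lowest-frequency ones (u + v < 2), mimicking the coarse treatment
# that JPEG gives to high frequencies.
kept = [[coeffs[u][v] if u + v < 2 else 0.0 for v in range(N)] for u in range(N)]
recon = separable_2d(kept, idct_1d)

max_err = max(abs(block[r][c] - recon[r][c]) for r in range(N) for c in range(N))
print(f"61 of 64 coefficients dropped, max pixel error = {max_err:.2f}")
```

For this smooth block, 61 of the 64 coefficients can be zeroed while keeping every reconstructed pixel within one grey level of the original, which is precisely the spatial redundancy that transform coding exploits.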
This type of coding process, in which the decoded data differ from the data that were encoded, is called lossy compression. It is important to emphasise that lossy encoders cannot be compared on final image size alone; visual quality must also be considered, in a sort of cost/benefit trade-off (usually represented as a rate/distortion curve).
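Distortion in such rate/distortion comparisons is commonly measured with the peak signal-to-noise ratio (PSNR), derived from the mean squared error between original and decoded pixels. A minimal sketch (the sample pixel values below are invented for illustration):

```python
import math

def psnr(original, decoded, max_value=255):
    """Peak signal-to-noise ratio (in dB) between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    return float("inf") if mse == 0 else 10 * math.log10(max_value ** 2 / mse)

orig = [52, 55, 61, 66, 70, 61, 64, 73]
dec  = [53, 55, 60, 66, 71, 60, 64, 72]   # hypothetical decoder output
print(round(psnr(orig, dec), 2))          # → 50.17
```

Plotting PSNR against the encoded bit rate yields the rate/distortion curve mentioned above; higher curves indicate better encoders.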
If exact recovery of the original image is important, for example, for legal reasons in medical imaging, or in image editing, lossless compression can be used at the cost of lower compression performance (but with no quality loss). Although JPEG 2000 has a lossless mode, specific standards such as JPEG-LS (JPEG-LS, 1997) offer better performance. Another well-known lossless image format is GIF (CompuServe Incorporated, 1987). Frequently used on the Internet, it is intended for images whose colours have first been reduced to a palette of at most 256 entries.
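The palette idea can be sketched briefly: each pixel stores a one-byte index into a table of up to 256 RGB entries instead of three bytes of RGB. The toy four-entry palette and the nearest-entry mapping below are illustrative assumptions, not part of the GIF specification:

```python
# Indexed ("palette") colour: pixels become indices into a small RGB table.
palette = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]  # toy palette

def to_indices(pixels, palette):
    """Replace each RGB pixel by the index of its nearest palette entry."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(palette)), key=lambda i: dist2(p, palette[i]))
            for p in pixels]

pixels = [(250, 10, 5), (2, 240, 3), (254, 252, 250)]
print(to_indices(pixels, palette))   # → [0, 1, 3]
```

With a full 256-entry palette, the index stream needs one byte per pixel instead of three, before any further lossless compression (such as GIF's LZW coding) is applied.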
Key Terms in this Chapter
Lossy Coding: An image encoder may modify the source image in order to achieve higher compression ratios while trying to keep the perceived quality unaltered. This is the case in lossy image compression, in which the decoded pixels do not necessarily retain the values they had when encoded.
Entropy Coding: An entropy coder is a general lossless data compression method that assigns shorter codewords to more probable symbols; ideally, a symbol of probability p is encoded with about −log₂ p bits.
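A classic entropy coder is the Huffman code, used in JPEG's baseline mode. The toy probabilities below are an assumed example; for powers of two the Huffman code lengths match the ideal −log₂ p exactly:

```python
import heapq
import math

def huffman_lengths(probs):
    """Codeword length per symbol for a Huffman code built over `probs`."""
    heap = [(p, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        # Merge the two least probable subtrees; every symbol inside
        # them moves one level deeper, so its codeword grows by one bit.
        p1, s1 = heapq.heappop(heap)
        p2, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, s1 + s2))
    return lengths

probs = [0.5, 0.25, 0.125, 0.125]               # assumed toy symbol model
print(huffman_lengths(probs))                   # → [1, 2, 3, 3]
print([round(-math.log2(p), 1) for p in probs]) # ideal lengths: [1.0, 2.0, 3.0, 3.0]
```

The expected code length here is 1.75 bits per symbol, equal to the entropy of the model, which is the best any lossless symbol-by-symbol code can achieve.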
DC Component: When the two-dimensional discrete cosine transform (DCT) is computed, the DC coefficient corresponds to the mean value of the pixels in the block, scaled according to the normalisation used in the transform (in the 8×8 DCT applied in JPEG, the DC coefficient is the mean value multiplied by 8).
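For the orthonormal 2-D DCT of an N×N block, the (0,0) coefficient has the closed form X(0,0) = (1/N)·Σ pixels, which equals N times the block mean. A quick check (the test block values are arbitrary):

```python
# The orthonormal 2-D DCT's DC coefficient is X(0,0) = (1/N) * sum(all pixels),
# i.e. N * mean - so 8 times the mean for JPEG's 8x8 blocks.
N = 8
block = [[128 + ((r * c) % 5) for c in range(N)] for r in range(N)]  # arbitrary block
total = sum(sum(row) for row in block)
mean = total / N ** 2
dc = total / N                 # closed form of the (0,0) coefficient
print(dc == N * mean)          # → True
```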
Luminance: The brightness of an image is determined by the luminance component (usually referred to as Y). It is the main component in a YCbCr colour space, because the HVS is more sensitive to it than to the chrominance components. It can be computed from the RGB components as Y = 0.2126 R + 0.7152 G + 0.0722 B (the Rec. 709 weights; JPEG/JFIF traditionally uses the Rec. 601 weights 0.299, 0.587, and 0.114).
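The weighted sum above is trivial to compute; note how heavily green dominates, reflecting the HVS's greater sensitivity to it:

```python
# Luminance from RGB using the Rec. 709 weights quoted in the definition above.
def luminance(r, g, b):
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

print(round(luminance(255, 255, 255), 1))  # white → 255.0 (weights sum to 1)
print(round(luminance(255, 0, 0), 1))      # pure red → 54.2
print(round(luminance(0, 255, 0), 1))      # pure green → 182.4
```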
Human Visual System (HVS): In order to improve coding performance, a lossy encoder removes the information that is not seen by the human eye. The human visual system is the part of the nervous system that allows us to see. It has been modelled to determine its behaviour, and hence, to identify the information that can be removed without noticeable artifacts.
Lossless Coding: In lossless image compression, the decoded pixels are exactly the same as those that were encoded. For this reason, it is considered that there is no loss of data.
Chrominance: The difference between a colour component and the luminance component is called chrominance. The Cb and Cr components are the blue-difference and red-difference chrominances; in JPEG/JFIF they are scaled down by factors of 1.772 and 1.402, respectively, and offset so that they fit the same range as the luminance (Pennebaker & Mitchell, 1992).
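A short sketch of the full RGB-to-YCbCr conversion as specified by JFIF (which uses the Rec. 601 luminance weights), on the usual 0-255 scale where the chrominance offset becomes 128:

```python
# JFIF RGB -> YCbCr conversion (Rec. 601 luminance weights, 0-255 scale).
def rgb_to_ycbcr(r, g, b):
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 + (b - y) / 1.772          # blue-difference chrominance
    cr = 128 + (r - y) / 1.402          # red-difference chrominance
    return y, cb, cr

y, cb, cr = rgb_to_ycbcr(128, 128, 128)  # a neutral grey
print(round(y), round(cb), round(cr))    # → 128 128 128
```

As expected, a neutral grey has zero colour difference, so both chrominance components sit at the midpoint of their range; this is why heavily compressed chrominance costs little perceived quality.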