International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)
Web Site: www.ijettcs.org Email: [email protected], [email protected]
Volume 3, Issue 1, January – February 2014
ISSN 2278-6856
Image Compression using Discrete Cosine
Transform and Adaptive Huffman Coding
Deepak Kumar Jain1, Devansh Gaur2, Kavya Gaur3, Neha Jain4
1,2 Central Electronics Engineering Research Institute, Pilani, Rajasthan, India, 333031
3 Rajasthan Institute of Engineering and Technology, Jaipur, Rajasthan, India, 302026
4 GICTS Group of Institution, Gwalior, M.P., India, 474002
Abstract: Image compression means reducing the volume of data needed to represent an image. Image compression techniques are divided into two main categories: transform techniques (DCT, JPEG, FFT, and Wavelet) and non-transform techniques (PCM, DPCM). In this paper we present the DCT transform to compress the image with high quality. In achieving those results, the wavelet scheme tries to identify the wavelet filter that would result in the smallest number of non-zero coefficients (thus giving a high compression ratio), uses an HVS-based processing module to adaptively quantize wavelet coefficients, and efficiently codes the processed wavelet transform.
Keywords: DCT, Compression, Quantization, Adaptive
Huffman coding
1. INTRODUCTION
Compressing an image is significantly different from compressing raw binary data. Of course, general-purpose compression programs can be used to compress images, but the result is less than optimal. This is because images
have certain statistical properties which can be exploited
by encoders specifically designed for them. Also, some of
the finer details in the image can be sacrificed for the
sake of saving a little more bandwidth or storage space.
This also means that lossy compression techniques can be
used in this area.
Lossless compression involves compressing data which, when decompressed, will be an exact replica of the original data. This is the case when binary data such as executables, documents, etc. are compressed. They need to be exactly reproduced when decompressed. On the other
hand, images (and music too) need not be reproduced
'exactly'. An approximation of the original image is
enough for most purposes, as long as the error between
the original and the compressed image is tolerable.
Compression can be either lossy or lossless. Lossless
compression reduces bits by identifying and eliminating
statistical redundancy. No information is lost in lossless
compression. Lossy compression reduces bits by
identifying unnecessary information and removing it.[3]
The process of reducing the size of a data file is popularly
referred to as data compression, although its formal name
is source coding (coding done at the source of the data
before it is stored or transmitted).[4]
A discrete cosine transform (DCT) expresses a finite sequence of data points in terms of a sum of cosine functions oscillating at different frequencies. DCTs are important to numerous applications in science and engineering, from lossy compression of audio (e.g. MP3) and images (e.g. JPEG), where small high-frequency components can be discarded, to spectral methods for the numerical solution of partial differential equations. The use of cosine rather than sine functions is critical in these applications: for compression, it turns out that cosine functions are much more efficient (as described below, fewer functions are needed to approximate a typical signal), whereas for differential equations the cosines express a particular choice of boundary conditions.
In particular, a DCT is a Fourier-related transform
similar to the discrete Fourier transform (DFT), but using
only real numbers. DCTs are equivalent to DFTs of
roughly twice the length, operating on real data with even
symmetry (since the Fourier transform of a real and even
function is real and even), where in some variants the
input and/or output data are shifted by half a sample.
There are eight standard DCT variants, of which four are
common.
The most common variant of the discrete cosine transform is the type-II DCT, which is often called simply "the DCT"; its inverse, the type-III DCT, is correspondingly often called simply "the inverse DCT" or "the IDCT". Two related transforms are the discrete sine transform (DST), which is equivalent to a DFT of real and odd functions, and the modified discrete cosine transform (MDCT), which is based on a DCT of overlapping data.
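To make the energy-compaction argument concrete, here is a minimal Python/NumPy sketch (not from the original paper; the function name is illustrative) that evaluates the orthonormal type-II DCT of a smooth 8-point signal. Most of the resulting coefficients are close to zero, which is exactly what makes the transform attractive for compression.

```python
import numpy as np

def dct_type_ii(x):
    """Orthonormal type-II DCT of a 1-D signal (direct O(N^2) evaluation)."""
    N = len(x)
    n = np.arange(N)
    coeffs = np.array([np.sum(x * np.cos(np.pi * (2 * n + 1) * k / (2 * N)))
                       for k in range(N)])
    scale = np.full(N, np.sqrt(2.0 / N))
    scale[0] = np.sqrt(1.0 / N)
    return scale * coeffs

# A smooth 8-point ramp: nearly all of the energy lands in the first few coefficients.
x = np.linspace(0.0, 1.0, 8)
print(np.round(dct_type_ii(x), 3))
```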
Compression is useful because it helps reduce resource usage, such as data storage space or transmission capacity. Because compressed data must be decompressed before it can be used, this extra processing imposes computational or other costs through decompression; the situation is far from being a free lunch. Data compression is subject to a space-time complexity trade-off. For instance, a
compression scheme for video may require expensive hardware for the video to be decompressed fast enough to be viewed as it is being decompressed, and the option to decompress the video in full before watching it may be inconvenient or require additional storage. The design of data compression schemes involves trade-offs among various factors, including the degree of compression, the amount of distortion introduced (e.g., when using lossy data compression), and the computational resources required to compress and decompress the data.[5]
Lossless data compression algorithms usually exploit statistical redundancy to represent data more concisely without losing information. Lossless compression is possible because most real-world data has statistical redundancy. For example, an image may have areas of color that do not change over several pixels; instead of coding "red pixel, red pixel, ...", the data may be encoded as "279 red pixels". This is a basic example of run-length encoding; there are many schemes to reduce file size by eliminating redundancy.
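As a toy illustration of the run-length idea described above (a sketch added here for illustration, not part of the original paper), the following Python snippet encodes a sequence of pixel values as (value, run length) pairs:

```python
from itertools import groupby

def run_length_encode(pixels):
    """Encode a flat sequence of pixel values as (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(pixels)]

row = ["red"] * 279 + ["blue"] * 2
print(run_length_encode(row))  # [('red', 279), ('blue', 2)]
```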
The Lempel–Ziv (LZ) compression methods are among
the most popular algorithms for lossless storage.[6]
DEFLATE is a variation on LZ optimized for
decompression speed and compression ratio, but
compression can be slow. DEFLATE is used in PKZIP,
Gzip and PNG. LZW (Lempel–Ziv–Welch) is used in GIF
images. Also noteworthy is the LZR (Lempel-Ziv–Renau)
algorithm, which serves as the basis for the Zip method.
LZ methods use a table-based compression model where
table entries are substituted for repeated strings of data.
For most LZ methods, this table is generated dynamically
from earlier data in the input. The table itself is often
Huffman encoded (e.g. SHRI, LZX). A current LZ-based
coding scheme that performs well is LZX, used in
Microsoft's CAB format.
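Since LZW is mentioned above, the following minimal Python sketch (illustrative only; real implementations add code-width management and table-reset logic) shows the table-based idea: repeated strings are replaced by dictionary codes, and the table is built up dynamically from earlier input.

```python
def lzw_encode(data: bytes):
    """Toy LZW encoder: emits dictionary codes, growing the table as it goes."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    next_code = 256
    w = b""
    output = []
    for byte in data:
        wc = w + bytes([byte])
        if wc in table:
            w = wc                      # keep extending the current match
        else:
            output.append(table[w])     # emit the code for the longest known prefix
            table[wc] = next_code       # add the new string to the table
            next_code += 1
            w = bytes([byte])
    if w:
        output.append(table[w])
    return output

print(lzw_encode(b"TOBEORNOTTOBEORTOBEORNOT"))
```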
The best modern lossless compressors use probabilistic
models, such as prediction by partial matching. The
Burrows–Wheeler transform can also be viewed as an
indirect form of statistical modeling.[7]
The class of grammar-based codes is gaining popularity because they can compress highly repetitive text extremely well, for instance, biological data collections of the same or related species, huge versioned document collections, internet archives, etc. The basic task of grammar-based codes is constructing a context-free grammar deriving a single string. Sequitur and Re-Pair are practical grammar compression algorithms for which public codes are available.
In a further refinement of these techniques, statistical
predictions can be coupled to an algorithm called
arithmetic coding. Arithmetic coding, invented by Jorma
Rissanen, and turned into a practical method by Witten,
Neal, and Cleary, achieves superior compression to the
better-known Huffman algorithm and lends itself
especially well to adaptive data compression tasks where
the predictions are strongly context-dependent.
Arithmetic coding is used in the bi-level image
compression standard JBIG, and the document
compression standard DjVu. The text entry system
Dasher is an inverse arithmetic coder.[8]
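To illustrate the interval-narrowing idea behind arithmetic coding, here is a simplified float-based Python sketch (added for illustration; practical coders use integer arithmetic and adaptive models). It narrows the interval [0, 1) for a short message under a fixed symbol model; any number inside the final interval identifies the message.

```python
def arithmetic_interval(message, probs):
    """Return the [low, high) interval that encodes `message` under fixed probabilities."""
    cumulative = {}
    total = 0.0
    for symbol in sorted(probs):
        cumulative[symbol] = total
        total += probs[symbol]
    low, high = 0.0, 1.0
    for symbol in message:
        width = high - low
        high = low + width * (cumulative[symbol] + probs[symbol])
        low = low + width * cumulative[symbol]
    return low, high

print(arithmetic_interval("ABAC", {"A": 0.6, "B": 0.3, "C": 0.1}))
```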
Lossy data compression is the converse of lossless data
compression. In these schemes, some loss of information
is acceptable. Dropping nonessential detail from the data
source can save storage space. Lossy data compression
schemes are informed by research on how people perceive
the data in question. For example, the human eye is more
sensitive to subtle variations in luminance than it is to
variations in color. JPEG image compression works in
part by rounding off nonessential bits of information.[9]
There is a corresponding trade-off between information
lost and the size reduction. A number of popular
compression formats exploit these perceptual differences,
including those used in music files, images, and video.
Lossy image compression can be used in digital
cameras, to increase storage capacities with minimal
degradation of picture quality. Similarly, DVDs use the
lossy MPEG-2 Video codec for video compression.
In lossy audio compression, methods of psychoacoustics
are used to remove non-audible (or less audible)
components of the signal. Compression of human speech
is often performed with even more specialized techniques;
speech coding, or voice coding, is sometimes
distinguished as a separate discipline from audio
compression. Different audio and speech compression
standards are listed under audio codecs. Voice
compression is used in Internet telephony, for example, while audio compression is used for CD ripping and is decoded by audio players.[10]
2. SYSTEM OVERVIEW
The flowchart for the proposed model is shown in Fig. 1. In this model, we first take an input image and determine its color space, then compute its discrete cosine transform, quantize the transform coefficients, and encode the image. Finally, the Huffman algorithm is applied and the file is decoded.
3. PROPOSED METHOD
The proposed method addresses the requirement of a luminosity-independent method and a completely automated active speaker localization system. The gradual development of the method is discussed step by step.
3.1 Image Compression
The first and foremost step is to transform the image from its spatial-domain representation to a different type of representation using some well-known transform, and then to code the transformed values (coefficients). This method provides greater data compression compared to predictive methods, although at the expense of greater computational requirements.
Figure 1: System Flow Chart
3.1.1 Image compression using Discrete Cosine
Transform
JPEG stands for the Joint Photographic Experts Group, a standards committee that had its origins within the International Organization for Standardization (ISO). JPEG provides a compression method that is capable of compressing continuous-tone image data with a pixel depth of 6 to 24 bits with reasonable speed and efficiency. JPEG may be adjusted to produce very small, compressed images that are of relatively poor quality in appearance but still suitable for many applications. Conversely, JPEG is capable of producing very high-quality compressed images that are still far smaller than the original uncompressed data.
JPEG is primarily a lossy method of compression. JPEG was designed specifically to discard information that the human eye cannot easily see. Slight changes in color are not perceived well by the human eye, while slight changes in intensity (light and dark) are. Therefore JPEG's lossy encoding tends to be more frugal with the gray-scale part of an image and more frivolous with the color [21]. DCT separates images into parts of different frequencies, where less important frequencies are discarded through quantization and important frequencies are used to retrieve the image during decompression. Compared to other input-dependent transforms, DCT has many advantages: (1) it has been implemented in a single integrated circuit; (2) it has the ability to pack the most information into the fewest coefficients; (3) it minimizes the block-like appearance, called blocking artifact, that results when boundaries between sub-images become visible [11].
The forward 2D-DCT transformation is given by the following equation:

c(u, v) = D(u)\,D(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y) \cos\left[\frac{(2x+1)u\pi}{2N}\right] \cos\left[\frac{(2y+1)v\pi}{2N}\right]   ...(1)

where u, v = 0, 1, 2, 3, ..., (N-1), and

D(u), D(v) = \begin{cases} \sqrt{\frac{1}{N}}, & u, v = 0 \\ \sqrt{\frac{2}{N}}, & u, v = 1, 2, 3, ..., (N-1) \end{cases}

The inverse 2D-DCT transformation is given by the following equation:

f(x, y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} D(u)\,D(v)\,c(u, v) \cos\left[\frac{(2x+1)u\pi}{2N}\right] \cos\left[\frac{(2y+1)v\pi}{2N}\right]   ...(2)

where x, y = 0, 1, 2, ..., (N-1).
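As a sanity check on equations (1) and (2), the following Python sketch (added for illustration; not part of the original paper) implements both double sums directly and verifies that the inverse transform reproduces the original N-by-N block.

```python
import numpy as np

def D(u, N):
    """Normalization factor from the piecewise definition above."""
    return np.sqrt(1.0 / N) if u == 0 else np.sqrt(2.0 / N)

def forward_dct(f):
    """Equation (1): c(u, v) for an N-by-N block f(x, y)."""
    N = f.shape[0]
    c = np.zeros((N, N))
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (f[x, y]
                              * np.cos((2 * x + 1) * u * np.pi / (2 * N))
                              * np.cos((2 * y + 1) * v * np.pi / (2 * N)))
            c[u, v] = D(u, N) * D(v, N) * total
    return c

def inverse_dct(c):
    """Equation (2): reconstruct f(x, y) from the coefficients c(u, v)."""
    N = c.shape[0]
    f = np.zeros((N, N))
    for x in range(N):
        for y in range(N):
            total = 0.0
            for u in range(N):
                for v in range(N):
                    total += (D(u, N) * D(v, N) * c[u, v]
                              * np.cos((2 * x + 1) * u * np.pi / (2 * N))
                              * np.cos((2 * y + 1) * v * np.pi / (2 * N)))
            f[x, y] = total
    return f

block = np.random.rand(8, 8)
assert np.allclose(inverse_dct(forward_dct(block)), block)
```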
3.1.2 JPEG Process
The JPEG process is shown below (a small sketch of steps 1-4 and 8 follows this list):
1. The original image is divided into blocks of 4-by-4 or 8-by-8.
2. Pixel values of a black-and-white image range from 0 to 255, but DCT is designed to work on pixel values ranging from -128 to 127. Therefore each block is shifted into this range.
3. Equation (1) is used to calculate the DCT matrix.
4. DCT is applied to each block by multiplying the modified block by the DCT matrix on the left and the transpose of the DCT matrix on its right.
5. Each block is then compressed through quantization.
6. The quantized matrix is then entropy encoded.
7. The compressed image is reconstructed through the reverse process.
8. Inverse DCT is used for decompression.
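The following compact Python sketch of steps 1-4 (and the inverse in step 8) is added for illustration and uses hypothetical function names: the DCT matrix T is built from the normalization factors and cosines of equation (1), and the block transform is T B T^T.

```python
import numpy as np

def dct_matrix(n=8):
    """DCT transform matrix T with T[u, x] = D(u) * cos((2x+1) * u * pi / (2n))."""
    u = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    T = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    T[0, :] = np.sqrt(1.0 / n)
    return T

block = np.random.randint(0, 256, (8, 8)).astype(float)  # step 1: one 8-by-8 block
shifted = block - 128.0                                   # step 2: level shift
T = dct_matrix(8)                                         # step 3: DCT matrix
coeffs = T @ shifted @ T.T                                # step 4: 2-D DCT of the block
restored = T.T @ coeffs @ T + 128.0                       # step 8: inverse DCT (no quantization)
assert np.allclose(restored, block)
```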
3.1.3 Quantization
Quantization is achieved by compressing a range of
values to a single quantum value. When the number of
discrete symbols in a given stream is reduced, the stream
becomes more compressible. A quantization matrix is
used in combination with a DCT coefficient matrix to
carry out transformation. Quantization is the step where
most of the compression takes place. DCT really does not
compress the image because it is almost lossless.
Quantization makes use of the fact that higher frequency
components are less important than low frequency
components. It allows varying levels of image
compression and quality through selection of specific
quantization matrices. Thus quality levels ranging from 1 to 100 can be selected, where 1 gives the poorest image quality and highest compression, while 100 gives the best quality and lowest compression. As a result, the quality-to-compression ratio can be selected to meet different needs. The JPEG committee suggests the matrix with quality level 50 as the standard matrix. For obtaining quantization matrices with other quality levels, scalar multiplications
of the standard quantization matrix are used. Quantization is achieved by dividing the transformed image matrix by the quantization matrix used. Values of the resultant matrix are then rounded off. In the resultant matrix, coefficients situated near the upper left corner have lower frequencies. The human eye is more sensitive to lower frequencies. Higher frequencies are discarded, and lower frequencies are used to reconstruct the image.
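The following sketch illustrates the quantization step described above. The 8-by-8 table is the widely published JPEG luminance quantization matrix for quality level 50, assumed here to correspond to the "standard matrix" the text refers to, and the quality scaling uses one common convention; both are illustrative rather than taken from the paper.

```python
import numpy as np

# Widely published JPEG luminance quantization table (quality level 50).
Q50 = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
], dtype=float)

def quantization_matrix(quality):
    """Scale the base table to another quality level (one common convention)."""
    if quality == 50:
        return Q50.copy()
    scale = 50.0 / quality if quality < 50 else (100.0 - quality) / 50.0
    return np.clip(np.round(Q50 * scale), 1, 255)

def quantize(dct_block, q_matrix):
    """Divide the DCT coefficients by the quantization matrix and round off."""
    return np.round(dct_block / q_matrix)

def dequantize(quantized_block, q_matrix):
    """Approximate the original coefficients by multiplying back."""
    return quantized_block * q_matrix
```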
3.1.4 Huffman Algorithm
The basic idea in Huffman coding is to assign short code
words to those input blocks with high probabilities and
long code words to those with low probabilities. A
Huffman code is designed by merging together the two
least probable characters, and repeating this process until
there is only one character remaining. A code tree is thus
generated and the Huffman code is obtained from the
labeling of the code tree.
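A minimal Python sketch of the static Huffman construction described above (added for illustration; the adaptive variant named in the title additionally updates the tree as symbols arrive): the two least probable entries are merged repeatedly, and the code words fall out of the labeling of the resulting tree.

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Build a Huffman code by repeatedly merging the two least probable nodes."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie-breaker, {symbol: code-so-far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # least probable subtree
        w2, _, right = heapq.heappop(heap)   # second least probable subtree
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

print(huffman_codes("this is an example of a huffman tree"))
```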
3.1.5 Peak Signal to Noise Ratio
The formula for calculating the peak signal to noise ratio is:

\mathrm{SNR_{dB}} = 10 \log_{10}\left(\frac{A_{signal}}{A_{noise}}\right)^{2} = 20 \log_{10}\left(\frac{A_{signal}}{A_{noise}}\right)

where A_signal and A_noise are the root mean square (RMS) amplitudes of the signal and the noise, respectively.
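A direct implementation of the formula above (an illustrative sketch; the function name is not from the paper), where the noise is taken as the difference between the original and the reconstructed image:

```python
import numpy as np

def snr_db(original, reconstructed):
    """SNR in decibels from RMS amplitudes, following the formula above."""
    original = original.astype(np.float64)
    reconstructed = reconstructed.astype(np.float64)
    a_signal = np.sqrt(np.mean(original ** 2))                    # RMS of the signal
    a_noise = np.sqrt(np.mean((original - reconstructed) ** 2))   # RMS of the error
    return 20.0 * np.log10(a_signal / a_noise)
```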
Figure 3: (Left, bottom) Lena: 8-by-8 DCT, 4-by-4 DCT. (Right, bottom) Apple: 8-by-8 DCT, 4-by-4 DCT.
4. RESULTS AND DISCUSSION
In the JPEG image compression algorithm, the input
image is divided into 4-by-4 or 8-by-8 blocks, and the
two-dimensional DCT is computed for each block. The
DCT coefficients are then quantized, coded, and
transmitted. The JPEG receiver (or JPEG file reader)
decodes the quantized DCT coefficients, computes the
inverse two-dimensional DCT of each block, and then
puts the blocks back together into a single image. For
typical images, many of the DCT coefficients have values
close to zero; these coefficients can be discarded without
seriously affecting the quality of the reconstructed image.
The example code below computes the two-dimensional
DCT of 8-by-8 blocks in the input image, discards (sets to
zero) all but 10 of the 64 DCT coefficients in each block,
and then reconstructs the image using the two-dimensional inverse DCT of each block. The transform matrix computation method is used.
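The original listing is not reproduced here; the following Python/NumPy sketch (an assumed-equivalent illustration with hypothetical function names) follows the steps just described: a blockwise 8-by-8 DCT computed with the transform matrix, all but 10 of the 64 coefficients in each block set to zero, and reconstruction with the inverse DCT. With keep=10, roughly 84% of the coefficients are discarded, consistent with the "almost 85%" mentioned in the caption of Figure 2.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT transform matrix (the transform matrix computation method)."""
    u = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    T = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    T[0, :] = np.sqrt(1.0 / n)
    return T

def compress_blocks(image, keep=10, n=8):
    """Blockwise DCT, keep only `keep` low-frequency coefficients, then inverse DCT."""
    T = dct_matrix(n)
    # Mask that keeps the `keep` coefficients closest to the top-left (lowest frequencies).
    order = np.add.outer(np.arange(n), np.arange(n))            # u + v as a frequency proxy
    mask = np.zeros(n * n)
    mask[np.argsort(order, axis=None, kind="stable")[:keep]] = 1.0
    mask = mask.reshape(n, n)
    h, w = image.shape
    out = np.zeros((h, w))
    for i in range(0, h - h % n, n):
        for j in range(0, w - w % n, n):
            block = image[i:i + n, j:j + n].astype(float)
            coeffs = T @ block @ T.T                             # forward 2-D DCT
            out[i:i + n, j:j + n] = T.T @ (coeffs * mask) @ T    # inverse 2-D DCT
    return out

image = np.random.randint(0, 256, (64, 64))   # stand-in for a grayscale test image
approx = compress_blocks(image, keep=10)
```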
TABLE 1: Compression PSNR Results for Adaptive Huffman Coding

         Image of Mona Lisa         Image of Lena
         8-by-8 DCT   4-by-4 DCT    8-by-8 DCT   4-by-4 DCT
JPEG1    6.70%        6.31%         2.97%        2.76%
JPEG2    6.24%        4.86%         2.47%        1.73%
JPEG3    6.14%        4.43%         2.29%        1.56%
JPEG4    6.04%        4.17%         2.14%        1.34%
JPEG5    5.19%        3.76%         1.51%        1.26%
JPEG6    4.47%        3.20%         1.26%        0.93%
JPEG7    3.79%        2.44%         1.11%        0.69%
JPEG8    3.02%        1.63%         0.81%        0.44%
JPEG9    2.25%        0.00%         0.26%        0.00%
Figure 2: Although there is some loss of quality in the reconstructed image, it is clearly recognizable, even though almost 85% of the DCT coefficients were discarded.

5. CONCLUSION
DCT is used for transformation in the JPEG standard. DCT performs efficiently at medium bit rates. A disadvantage of DCT is that only the spatial correlation of the pixels inside the single 2-D block is considered, and the correlation of the pixels of the neighboring blocks is
neglected. Blocks cannot be de-correlated at their
boundaries using DCT.
A new lossless image compression scheme based on the DCT was developed. This method caused a significant reduction in entropy, thus making it possible to achieve compression using a traditional entropy coder. The method performed well when compared to popular lossless compression schemes.
References
[1] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Second edition, pp. 411-514, 2004.
[2] D. Coltuc, "Low Distortion Transform for Reversible Watermarking," IEEE Trans. on Image Processing, vol. 21, no. 1, pp. 412-417, Jan. 2012.
[3] A. K. Jain, Fundamentals of Digital Image Processing, Prentice-Hall, Englewood Cliffs, NJ, 1989.
[4] Ronald A. DeVore, Bjorn Jawerth, and Bradley J. Lucier, "Image Compression Through Wavelet Transform Coding," IEEE Trans. on Information Theory, vol. 38, no. 2, pp. 719-746, March 1992.
[5] Amir Averbuch, Danny Lazar, and Moshe Israeli, "Image Compression Using Wavelet Transform and Multiresolution Decomposition," IEEE Trans. on Image Processing, vol. 5, no. 1, January 1996.
[6] N. D. Memon and K. Sayood, "Lossless Image Compression: A Comparative Study," IS&T/SPIE Electronic Imaging Conference, San Jose, CA (University of New Mexico), February 1995.
[7] D. S. Taubman and M. W. Marcellin, JPEG2000: Image Compression Fundamentals, Standards and Practice, Kluwer Academic Publishers, 2002.
[8] P. J. Oonincx and P. M. de Zeeuw, "Adaptive lifting for shape-based image retrieval," Pattern Recognition, vol. 36, pp. 2663-2672, 2003.
[9] Zhao Yu-qian, Gui Wei-hua, Chen Zhen-cheng, Tang Jing-tian, and Li Ling-yun, "Medical Images Edge Detection Based on Mathematical Morphology," Engineering in Medicine and Biology Society, IEEE-EMBS 2005, 27th Annual International Conference, 17-18 Jan. 2006, pp. 6492-6495.
[10] Sonja Grgic, Mislav Grgic, and Branka Zovko-Cihlar, "Performance Analysis of Image Compression Using Wavelets," IEEE Trans. on Industrial Electronics, vol. 48, no. 3, June 2001.
[11] C. G. Boncelet, "Block arithmetic coding for source compression," IEEE Transactions on Information Theory, vol. IT-39, pp. 1546-1554, Sept. 1993.
[12] Lossless Data Compression, Recommendation for Space Data System Standards, CCSDS 121.0-B-1, May 1997.
[13] Image Data Compression, Recommendation for Space Data System Standards, CCSDS 122.0-B-1, November 2005.
[14] J. Feng, I. Lin, C. Tsai, et al., "Reversible Watermarking: Current Status and Key Issues," International Journal of Network Security, vol. 2, no. 3, pp. 161-171, May 2006.
[15] K. Chung, Y. Huang, P. Chang, et al., "Reversible Data Hiding-Based Approach for Intra-Frame Error Concealment in H.264/AVC," IEEE Trans. on Circuits and Systems.
[16] J. Fridrich and M. Goljan, "Lossless Data Embedding for All Image Formats," in SPIE Proceedings of Photonics West, Electronic Imaging, Security and Watermarking of Multimedia Contents, vol. 4675, pp. 572-583, San Jose, Jan. 2002.
[17] L. Luo, Z. Chen, M. Chen, et al., "Reversible Image Watermarking Using Interpolation Technique," IEEE Trans. on Information Forensics and Security, vol. 5, no. 1, pp. 187-193, Mar. 2010.
[18] T. Kalker and F. M. Willems, "Capacity Bounds and Code Constructions for Reversible Data-Hiding," Proc. of 14th International Conference on Digital Signal Processing (DSP2002), pp. 71-76, 2002.
AUTHORS
Deepak Kumar Jain received a Bachelor of Engineering in 2010 and a Master of Technology in 2012 from Jaypee University of Engineering and Technology, Guna Campus, India.

Devansh Gaur received a Bachelor of Engineering from Birla Institute of Technology, Mesra, Ranchi in 2010.

Kavya Gaur received a Bachelor of Technology from Rajasthan Institute of Engineering and Technology, Jaipur, Rajasthan in 2013.

Neha Jain received a Bachelor of Engineering in 2008 and a Master of Technology in 2012 from Jaypee University of Engineering and Technology, Guna Campus, India.