Introduction to Digital Image Processing
Digital image processing (DIP) is the field concerned with processing digital images by means of a digital computer. A digital image is a two-dimensional function, f(x,y), where the amplitude at any coordinate pair (x,y) represents the intensity or color of the image at that point. These individual points are known as pixels, the most widely used term for them. A digital image is made up of a finite number of these pixels, each with a specific location and value. The intensity of a monochrome image is also referred to as its gray level. An image is converted to digital form by digitizing both the spatial coordinates and the amplitude values.

Applications of Digital Image Processing
DIP has a wide range of applications across various technical fields.
● Remote Sensing: Satellite imagery is used to track Earth resources, create geographical maps, predict agricultural crops, monitor urban growth and weather, and aid in flood and fire control.
● Space Imagery: DIP is used for recognizing and analyzing objects in images captured during deep space-probe missions.
● Medical Processing: Applications include processing chest X-rays, cine angiograms, transaxial tomography projection images, and medical images from radiology, nuclear magnetic resonance (NMR), and ultrasonic scanning.
● Other Fields: DIP is used in robotics and automated inspection of industrial parts, as well as in systems like RADAR, SONAR, and acoustic image processing. It is also applied in business for image transmission and storage, teleconferencing, office automation (facsimile images), and security monitoring systems.

Components of an Image Processing System
An image processing system is made up of several key components that work together to capture and process images.
● Image Sensors: These are physical devices that are sensitive to the energy radiated by an object, such as light, radar, or X-rays. They convert this energy into a voltage waveform, which is then digitized to create a digital image.
● Specialized Image Processing Hardware: This includes a digitizer and other hardware components, such as an arithmetic logic unit (ALU), that can perform operations on entire images in parallel.
● Computer: A general-purpose computer, anything from a PC to a supercomputer, depending on the application. For some applications, specially designed computers are used to reach the required performance.
● Software: This consists of specialized modules for specific tasks and can include facilities for users to write their own code.
● Mass Storage: This is crucial for image processing applications because of the large amount of data involved. Storage is categorized into short-term (for use during processing), on-line (for fast retrieval), and archival (for long-term storage).
● Image Displays: Most modern displays are color TV monitors driven by image and graphics cards that are part of the computer system.
● Hardcopy Devices: These are used for recording images and include laser printers, film cameras, inkjet units, and digital media such as optical and CD-ROM disks.
● Networking: Because image data is large, networking is a key consideration, particularly with regard to transmission bandwidth.

Fundamental Steps in Digital Image Processing
The fundamental steps in DIP can be divided into two main categories: methods where the output is an image, and methods where the output is an attribute extracted from the image.
1. Image Acquisition: This is the initial step, which can be as simple as being given a pre-existing digital image. It often involves preprocessing such as scaling.
2. Image Enhancement: This is a subjective process aimed at improving the visual appearance of an image or highlighting specific features.
The input and output are both images.
3. Image Restoration: This is an objective process that also aims to improve an image's appearance, but it is based on mathematical or probabilistic models of image degradation. The input and output are images.
4. Color Image Processing: This field deals with color models and their implementation in image processing, and it has gained importance due to the use of digital images on the internet.
5. Wavelets and Multiresolution Processing: These techniques are used to represent an image at various levels of resolution.
6. Compression: This involves techniques to reduce the storage space needed for an image or the bandwidth required to transmit it. There are two main approaches: lossless and lossy compression.
7. Morphological Processing: This involves tools for extracting image components that are useful for representing and describing the shape and boundaries of objects. It is often used in automated inspection.
8. Segmentation: This is the process of partitioning an image into its constituent parts or objects. The output is an attribute extracted from the image.
9. Representation and Description: This step follows segmentation and involves converting the raw pixel data of an object's boundary or region into a form suitable for computer processing. The output is an extracted attribute.
10. Object Recognition: The final step, which assigns a label to an object based on its descriptors. It is a high-level process that uses artificial intelligence.
11. Knowledge Base: Knowledge about a problem domain is coded into the system to guide the processing steps, for example, by detailing regions of interest to limit the search area.

Image Sensing and Acquisition
Image sensing and acquisition involves a source of illumination and a sensor that detects energy reflected from or transmitted through a scene. The sensor converts this energy into a voltage, which is then digitized.
● Single Sensor: A single sensor, such as a photodiode, can be used to acquire a two-dimensional image by moving it in both the x and y directions relative to the object being imaged. This method is slow but can achieve high-resolution images.
● Sensor Strips: An in-line arrangement of sensors forms a sensor strip, which provides imaging in one direction. Motion perpendicular to the strip completes the other dimension of the image. Flatbed scanners use this method.
● Sensor Arrays: This is a two-dimensional arrangement of individual sensors, often a CCD array in digital cameras. The main advantage is that a complete image can be obtained by focusing the energy pattern onto the array's surface, so no motion is necessary.

Image Sampling and Quantization
To create a digital image from continuous data, two processes are needed: sampling and quantization.
● Sampling: This is the process of digitizing the continuous spatial coordinates (x and y) of an image. It divides the continuous image into a grid of rows and columns and so decides how many pixels represent the image (spatial resolution).
● Quantization: This is the process of digitizing the amplitude values (intensity or gray levels) of the image. Each pixel is assigned one of a limited set of discrete gray levels or colors, which decides how many intensity levels are available (brightness or color resolution).
The result of sampling and quantization is a matrix of numbers, where each element is a pixel.

Relationships Between Pixels
The spatial relationships between pixels are fundamental to image processing operations.
● Neighbors of a Pixel: A pixel p at coordinates (x,y) has four horizontal and vertical neighbors with coordinates (x+1,y), (x−1,y), (x,y+1), and (x,y−1). These are called the 4-neighbors, denoted N4(p). It also has four diagonal neighbors at (x+1,y+1), (x+1,y−1), (x−1,y+1), and (x−1,y−1), denoted ND(p). The combination of all eight neighbors is called the 8-neighbors, denoted N8(p).
● Adjacency and Connectivity: Adjacency is defined with respect to a set of gray-level values, V. Two pixels, p and q, with values from V, are 4-adjacent if q is in the set N4(p). They are 8-adjacent if q is in N8(p). M-adjacency (mixed adjacency) is a modified version of 8-adjacency that resolves ambiguities by ensuring there is only one path between two pixels. Two pixels in a subset S are connected in S if a path exists between them composed entirely of pixels from S. A connected set of pixels is called a region.
● Distance Measures: The distance between two pixels, p at (x,y) and q at (s,t), can be measured in different ways. Common choices are the Euclidean distance, sqrt((x−s)^2 + (y−t)^2); the city-block distance, D4 = |x−s| + |y−t|; and the chessboard distance, D8 = max(|x−s|, |y−t|).

Image Transforms
Image transforms are mathematical operations that convert an image from its spatial domain to another domain, often a frequency domain, to reveal different information.

Fourier Transform
The 2-D Discrete Fourier Transform (2-D DFT) is a key transform for image processing. It transforms a signal or image from the spatial domain to the frequency domain. A significant property of the 2-D Fourier transform is its separability, which means it can be implemented as a sequence of 1-D Fourier transforms applied along the rows and then the columns of an image.
● Properties: Key properties of the 2-D Fourier Transform include:
○ Linearity: A linear combination of functions in the spatial domain corresponds to the same linear combination of their transforms in the frequency domain.
○ Shifting: Shifting a function in the spatial domain multiplies its Fourier transform by a linear phase term; the magnitude of the transform is unchanged.
○ Modulation: Multiplying a function by a complex exponential in the spatial domain shifts its transform in the frequency domain.
○ Convolution: Convolution in the spatial domain is equivalent to multiplication in the frequency domain.
○ Multiplication: Multiplication in the spatial domain is equivalent to convolution in the frequency domain.
○ Separability: If a function can be written as the product of two single-variable functions, its transform is the product of their individual transforms.

Other Transforms
● Walsh Transform: This transform uses a kernel with values of +1 or −1 and results in a real symmetric matrix. The 2-D Walsh transform is a straightforward extension of the 1-D transform. The inverse transform is identical to the forward transform, apart from a multiplicative factor.
● Hadamard Transform: Similar to the Walsh transform, the 2-D Hadamard transform also has a kernel with values of +1 or −1. The inverse transform has the same form as the forward transform.
● Discrete Cosine Transform (DCT): The DCT is used to separate an image into parts of differing importance. It transforms a signal from the spatial domain to the frequency domain, similar to the Fourier transform, but uses only real-valued cosine basis functions. For most images, the low-frequency information is in the upper-left corner of the DCT matrix, while the high-frequency components in the lower-right are often small enough to be discarded for compression.
● Discrete Wavelet Transform (DWT): The DWT represents an image at various degrees of resolution. The 2-D DWT is separable, meaning it can be applied to all rows and then all columns. This process decomposes an image into four components: an approximation (LL), and details in the horizontal (LH), vertical (HL), and diagonal (HH) orientations. The low-pass (LL) component is the recognizable portion of the image, while the high-pass detail components are almost invisible because most of their values are small.
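The separability of the 2-D DFT described above can be checked numerically. The following is a minimal sketch in pure Python, not a production implementation: the helper names (dft_1d, dft_2d_direct, dft_2d_separable, max_abs_diff) and the 3×4 test image are illustrative assumptions, and the naive transforms are only practical for tiny inputs. It computes the 2-D DFT once from the double-sum definition and once as 1-D transforms along the rows followed by 1-D transforms along the columns, then compares the two results.

```python
import cmath


def dft_1d(seq):
    """Naive O(N^2) 1-D DFT of a sequence of numbers."""
    n = len(seq)
    return [
        sum(seq[k] * cmath.exp(-2j * cmath.pi * u * k / n) for k in range(n))
        for u in range(n)
    ]


def dft_2d_direct(img):
    """2-D DFT computed straight from the double-sum definition."""
    rows, cols = len(img), len(img[0])
    return [
        [
            sum(
                img[x][y]
                * cmath.exp(-2j * cmath.pi * (u * x / rows + v * y / cols))
                for x in range(rows)
                for y in range(cols)
            )
            for v in range(cols)
        ]
        for u in range(rows)
    ]


def dft_2d_separable(img):
    """2-D DFT as 1-D DFTs along every row, then along every column."""
    row_pass = [dft_1d(row) for row in img]
    # zip(*...) walks the columns of the row-transformed data
    col_pass = [dft_1d(list(col)) for col in zip(*row_pass)]
    # col_pass is indexed [v][u]; transpose back to [u][v]
    return [list(r) for r in zip(*col_pass)]


def max_abs_diff(a, b):
    """Largest element-wise magnitude difference between two 2-D arrays."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


# Hypothetical 3x4 gray-level image used only for the comparison
image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
]

direct = dft_2d_direct(image)
separable = dft_2d_separable(image)
separability_error = max_abs_diff(direct, separable)  # floating-point noise only
dc_term = separable[0][0]  # zero-frequency term equals the sum of all pixels
```

For real images a library FFT (for example numpy.fft.fft2) would replace the naive loops; the point of the sketch is only that both routes yield the same coefficients.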
Image Enhancement
Image enhancement improves an image's quality for human or machine interpretation by manipulating pixels directly (spatial domain) or by modifying the image's Fourier transform (frequency domain).

Basic Intensity Transformations
● Image Negative: This transformation inverts the gray levels of an image. For an image with gray levels in the range [0, L−1], the negative is calculated as s = L − 1 − r, where r and s are the input and output gray levels. It is useful for enhancing white or gray details embedded in dark regions.
● Logarithmic Transformations: These transformations expand the range of dark pixels and compress the range of high-intensity pixels. The formula is s = c log(1 + r), where c is a constant.
● Power-Law Transformations (Gamma Correction): This transformation is defined by the formula s = c r^γ, where c and γ are positive constants. Different values of γ produce various effects: γ < 1 expands dark values and compresses bright values, similar to log transformations, while γ > 1 compresses dark values and expands bright values. When c = γ = 1, it becomes the identity transformation.

Histogram Processing
A histogram of a digital image is a function that shows the number of pixels at each gray level. A normalized histogram gives the probability of occurrence of each gray level.
● Dark Image: The histogram components are concentrated on the low (dark) side of the gray scale.
● Bright Image: The histogram components are concentrated on the high (bright) side of the gray scale.
● Low-Contrast Image: The histogram is narrow and centered in the middle of the gray scale.
● High-Contrast Image: The histogram components cover a broad range of the gray scale, resulting in more visible detail.
Histogram equalization is a technique that enhances image appearance by spreading out the gray levels to produce a more uniform histogram.
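The intensity transformations and histogram equalization described above can be sketched in a few lines of pure Python. This is an illustration under stated assumptions rather than a reference implementation: it assumes 8-bit images (L = 256) stored as lists of lists, picks the log constant c so that the brightest input maps to the brightest output, applies the power law on intensities normalized to [0, 1] with c = 1, and uses the common cumulative-distribution mapping s_k = round((L − 1) · CDF(r_k)) for equalization. All function names are hypothetical.

```python
import math

L = 256  # number of gray levels (assumption: 8-bit image)


def negative(r):
    """Image negative: s = L - 1 - r."""
    return L - 1 - r


def log_transform(r):
    """s = c * log(1 + r), with c chosen so that r = L-1 maps to s = L-1."""
    c = (L - 1) / math.log(L)
    return round(c * math.log(1 + r))


def gamma_transform(r, gamma):
    """Power law s = r^gamma on intensities normalized to [0, 1], c = 1."""
    return round((L - 1) * (r / (L - 1)) ** gamma)


def equalize(img):
    """Histogram equalization via the cumulative distribution of gray levels."""
    pixels = [p for row in img for p in row]
    hist = [0] * L
    for p in pixels:
        hist[p] += 1
    # Running sum of the normalized histogram gives the CDF
    lut, running = [], 0
    for count in hist:
        running += count
        lut.append(round((L - 1) * running / len(pixels)))
    return [[lut[p] for p in row] for row in img]


# Hypothetical low-contrast image: all values squeezed into 100..103
low_contrast = [
    [100, 101, 102, 103],
    [100, 101, 102, 103],
]
equalized = equalize(low_contrast)

flat = [p for row in equalized for p in row]
input_range = 103 - 100
output_range = max(flat) - min(flat)  # much wider than input_range
```

With γ < 1 a dark input such as 64 is mapped to a brighter value, and with γ > 1 to a darker one, which matches the behavior described in the power-law bullet.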
Spatial Filtering
Spatial filtering manipulates pixels directly within a local neighborhood, defined by a mask, to create a processed image. The mask is a small 2-D array of coefficients. The operator is applied at each pixel location, using the pixel values within the neighborhood.
● Smoothing Spatial Filters: These are used to reduce image noise and blur sharp transitions. In frequency terms, they attenuate the high-frequency components of the image.
● Sharpening Spatial Filters: These filters enhance edges and fine details by attenuating low-frequency components and preserving high-frequency information. A high-pass filter is the complement of a low-pass filter: it passes what the low-pass filter attenuates.

Image Enhancement in the Frequency Domain
Image enhancement in the frequency domain is achieved by modifying the image's Fourier transform.
● Ideal Low-Pass Filter (ILPF): This filter cuts off all high-frequency components that are at a distance greater than a certain cutoff frequency (D0) from the origin. This results in blurring and noise reduction, but it can also cause "ringing" artifacts in the output image due to the sharp cutoff.
● Butterworth Low-Pass Filter (BLPF): This filter has a smoother transition between passed and filtered frequencies than the ILPF. The cutoff frequency, D0, is defined as the point where the filter's value is 0.5 (half of its maximum). This smooth transition typically prevents the ringing artifacts seen with the ILPF, especially for lower-order filters.
● Gaussian Low-Pass Filter (GLPF): This filter's transfer function is a smooth Gaussian curve. Since the inverse Fourier transform of a Gaussian is also a Gaussian, there are no ringing artifacts, making it a desirable filter for smoothing images.
● High-Pass Filters: High-pass filters (IHPF, BHPF, GHPF) are derived from their low-pass counterparts using the formula H_hp(u,v) = 1 − H_lp(u,v).
They sharpen images by passing high-frequency information while attenuating low-frequency components. The Ideal High-Pass Filter also suffers from ringing artifacts, while the Butterworth and Gaussian high-pass filters produce smoother results with less distortion.
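The frequency-domain procedure described above (transform, multiply by a filter transfer function, transform back) can be sketched as follows. This is a minimal pure-Python illustration, not an optimized implementation: the function names, the 8×8 test image, and the cutoff d0 = 1.5 are assumptions chosen for the example. It uses the usual Gaussian low-pass form H(u,v) = exp(−D²(u,v) / (2·D0²)) and derives the high-pass filter as 1 − H_lp. Because the DFT here is not centered, D(u,v) is measured with wrap-around from the zero-frequency term at index (0, 0).

```python
import cmath
import math


def dft2(img, inverse=False):
    """Naive 2-D DFT (or inverse DFT) of a list of lists; for tiny images only."""
    m, n = len(img), len(img[0])
    sign = 1 if inverse else -1
    out = [
        [
            sum(
                img[x][y]
                * cmath.exp(sign * 2j * cmath.pi * (u * x / m + v * y / n))
                for x in range(m)
                for y in range(n)
            )
            for v in range(n)
        ]
        for u in range(m)
    ]
    if inverse:
        out = [[val / (m * n) for val in row] for row in out]
    return out


def gaussian_lowpass(m, n, d0):
    """Transfer function H(u,v) = exp(-D^2 / (2 * d0^2)) on an uncentered grid."""
    h = []
    for u in range(m):
        du = min(u, m - u)  # wrap-around distance to the zero-frequency row
        row = []
        for v in range(n):
            dv = min(v, n - v)  # wrap-around distance to the zero-frequency column
            row.append(math.exp(-(du * du + dv * dv) / (2 * d0 * d0)))
        h.append(row)
    return h


def apply_filter(img, h):
    """Multiply the image spectrum by h and return the real part of the inverse."""
    spectrum = dft2(img)
    filtered = [
        [s * k for s, k in zip(srow, hrow)] for srow, hrow in zip(spectrum, h)
    ]
    return [[val.real for val in row] for row in dft2(filtered, inverse=True)]


# Hypothetical 8x8 image: dark left half, bright right half (one vertical edge)
image = [[0] * 4 + [255] * 4 for _ in range(8)]

h_lp = gaussian_lowpass(8, 8, d0=1.5)
h_hp = [[1 - k for k in row] for row in h_lp]  # H_hp(u,v) = 1 - H_lp(u,v)

smooth = apply_filter(image, h_lp)  # edge is blurred, mean brightness kept
sharp = apply_filter(image, h_hp)   # only the edge detail remains, zero mean
```

Since H_lp + H_hp = 1 at every frequency, the low-pass and high-pass outputs add back up to the original image, which is a convenient sanity check for any low-pass/high-pass pair built this way.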