Transforms Using Matlab - Digital Images Processing Using Matlab
Transforms Using Matlab - Digital Images Processing Using Matlab

Fan-Beam Projection Data

    Note For information about creating projection data from line integrals along parallel paths, see Radon Transform. To convert fan-beam projection data to parallel-beam projection data, use the fan2para function.

Fan-Beam Projection Data Definition

The fanbeam function computes projections of an image matrix along specified directions. A projection of a two-dimensional function f(x,y) is a set of line integrals. The fanbeam function computes the line integrals along paths that radiate from a single source, forming a fan shape. To represent an image, the fanbeam function takes multiple projections of the image from different angles by rotating the source around the center of the image. The following figure shows a single fan-beam projection at a specified rotation angle.

Fan-Beam Projection at Rotation Angle Theta


Computing Fan-Beam Projection Data

To compute fan-beam projection data, use the fanbeam function. You specify as arguments an image and the distance between the vertex of the fan-beam projections and the center of rotation (the center pixel in the image). The fanbeam function determines the number of beams, based on the size of the image and the settings of fanbeam parameters.

The FanSensorGeometry parameter specifies how sensors are aligned. If you specify the value 'arc' for FanSensorGeometry (the default), fanbeam positions the sensors along an arc, spacing the sensors at 1 degree intervals. Using the FanSensorSpacing parameter, you can control the distance between sensors by specifying the angle between each beam. If you specify the value 'line' for FanSensorGeometry parameter, fanbeam position sensors along a straight line, rather than an arc. With 'line' geometry, the FanSensorSpacing parameter specifies the distance between the sensors, in pixels, along the x axis.

fanbeam takes projections at different angles by rotating the source around the center pixel at 1 degree intervals. Using the FanRotationIncrement parameter you can specify a different rotation angle increment.

The following figures illustrate both these geometries. The first figure illustrates geometry used by the fanbeam function when FanSensorGeometry is set to 'arc' (the default). Note how you specify the distance between sensors by specifying the angular spacing of the beams.

Fan-Beam Projection with Arc Geometry

The following figure illustrates the geometry used by the fanbeam function when FanSensorGeometry is set to 'line'. In this figure, note how you specify the position of the sensors by specifying the distance between them in pixels along the x axis.

Fan-Beam Projection with Line Geometry


Reconstructing an Image from Fan-Beam Projection Data

To reconstruct an image from fan-beam projection data, use the ifanbeam function. With this function, you specify as arguments the projection data and the distance between the vertex of the fan-beam projections and the center of rotation when the projection data was created. For example, this code recreates the image I from the projection data P and distance D.

I = ifanbeam(P,D);

By default, the ifanbeam function assumes that the fan-beam projection data was created using the arc fan sensor geometry, with beams spaced at 1 degree angles and projections taken at 1 degree increments over a full 360 degree range. As with the fanbeam function, you can use ifanbeam parameters to specify other values for these characteristics of the projection data. Use the same values for these parameters that were used when the projection data was created. For more information about these parameters, see Computing Fan-Beam Projection Data.

The ifanbeam function converts the fan-beam projection data to parallel-beam projection data with the fan2para function, and then calls the iradon function to perform the image reconstruction. For this reason, the ifanfeam function supports certain iradon parameters, which it passes to the iradon function. See The Inverse Radon Transformation for more information about the iradon function.

Example: Reconstructing a Head Phantom Image

The commands below illustrate how to use fanbeam and ifanbeam to form projections from a sample image and then reconstruct the image from the projections. The test image is the Shepp-Logan head phantom, which can be generated by the phantom function. The phantom image illustrates many of the qualities that are found in real-world tomographic imaging of human heads.

  1. Generate the test image and display it.

    P = phantom(256);  imshow(P)

  2. Compute fan-beam projection data of the test image, using the FanSensorSpacing parameter to vary the sensor spacing. The example uses the fanbeam arc geometry, so you specify the spacing between sensors by specifying the angular spacing of the beams. The first call spaces the beams at 2 degrees; the second at 1 degree; and the third at 0.25 degrees. In each call, the distance between the center of rotation and vertex of the projections is constant at 250 pixels. In addition, fanbeam rotates the projection around the center pixel at 1 degree increments.

    D = 250; dsensor1 = 2; F1 = fanbeam(P,D,'FanSensorSpacing',dsensor1); dsensor2 = 1; F2 = fanbeam(P,D,'FanSensorSpacing',dsensor2); dsensor3 = 0.25 [F3, sensor_pos3, fan_rot_angles3] = fanbeam(P,D,... 'FanSensorSpacing',dsensor3);
  3. Plot the projection data F3. Because fanbeam calculates projection data at rotation angles from 0 to 360 degrees, the same patterns occur at an offset of 180 degrees. The same features are being sampled from both sides. Compare this plot to the plot of the parallel-beam projection data of the head phantom using 90 projections in Example: Reconstructing an Image from Parallel Projection Data.

    figure, imagesc(fan_rot_angles3, sensor_pos3, F3) colormap(hot); colorbar xlabel('Fan Rotation Angle (degrees)') ylabel('Fan Sensor Position (degrees)')
  4. Reconstruct the image from the fan-beam projection data using ifanbeam. In each reconstruction, match the fan sensor spacing with the spacing used when the projection data was created in step 2. The example uses the OutputSize parameter to constrain the output size of each reconstruction to be the same as the size of the original image |P|.

    output_size = max(size(P)); Ifan1 = ifanbeam(F1,D, ... 'FanSensorSpacing',dsensor1,'OutputSize',output_size); figure, imshow(Ifan1) Ifan2 = ifanbeam(F2,D, ... 'FanSensorSpacing',dsensor2,'OutputSize',output_size); figure, imshow(Ifan2) Ifan3 = ifanbeam(F3,D, ... 'FanSensorSpacing',dsensor3,'OutputSize',output_size); figure, imshow(Ifan3)

    The following figure shows the result of each transform. Note how the quality of the reconstruction gets better as the number of beams in the projection increases. The first image, Ifan1, was created using 2 degree spacing of the beams; the second image, ifan2, was created using 1 degree spacing of the beams; the third image, ifan3, was created using 0.25 spacing of the beams.

    Reconstructions of the Head Phantom Image from Fan-Beam Projections

Inverse Radon Transform Definition

The iradon function inverts the Radon transform and can therefore be used to reconstruct images.

As described in Radon Transform, given an image I and a set of angles theta, the radon function can be used to calculate the Radon transform.

R = radon(I,theta);

The function iradon can then be called to reconstruct the image I from projection data.

IR = iradon(R,theta);

In the example above, projections are calculated from the original image I.

Note, however, that in most application areas, there is no original image from which projections are formed. For example, the inverse Radon transform is commonly used in tomography applications. In X-ray absorption tomography, projections are formed by measuring the attenuation of radiation that passes through a physical specimen at different angles. The original image can be thought of as a cross section through the specimen, in which intensity values represent the density of the specimen. Projections are collected using special purpose hardware, and then an internal image of the specimen is reconstructed by iradon. This allows for noninvasive imaging of the inside of a living body or another opaque object.

iradon reconstructs an image from parallel-beam projections. In parallel-beam geometry, each projection is formed by combining a set of line integrals through an image at a specific angle.

The following figure illustrates how parallel-beam geometry is applied in X-ray absorption tomography. Note that there is an equal number of n emitters and n sensors. Each sensor measures the radiation emitted from its corresponding emitter, and the attenuation in the radiation gives a measure of the integrated density, or mass, of the object. This corresponds to the line integral that is calculated in the Radon transform.

The parallel-beam geometry used in the figure is the same as the geometry that was described in Radon Transform. f(x,y) denotes the brightness of the image and is the projection at angle theta.

Parallel-Beam Projections Through an Object

Another geometry that is commonly used is fan-beam geometry, in which there is one source and n sensors. For more information, see Fan-Beam Projection Data. To convert parallel-beam projection data into fan-beam projection data, use the para2fan function.

Improving the Results

iradon uses the filtered backprojection algorithm to compute the inverse Radon transform. This algorithm forms an approximation of the image I based on the projections in the columns of R. A more accurate result can be obtained by using more projections in the reconstruction. As the number of projections (the length of theta) increases, the reconstructed image IR more accurately approximates the original image I. The vector theta must contain monotonically increasing angular values with a constant incremental angle Dtheta. When the scalar Dtheta is known, it can be passed to iradon instead of the array of theta values. Here is an example.

IR = iradon(R,Dtheta);

The filtered backprojection algorithm filters the projections in R and then reconstructs the image using the filtered projections. In some cases, noise can be present in the projections. To remove high frequency noise, apply a window to the filter to attenuate the noise. Many such windowed filters are available in iradon. The example call to iradon below applies a Hamming window to the filter. See the iradon reference page for more information. To get unfiltered backprojection data, specify 'none' for the filter parameter.

IR = iradon(R,theta,'Hamming');

iradon also enables you to specify a normalized frequency, D, above which the filter has zero response. D must be a scalar in the range [0,1]. With this option, the frequency axis is rescaled so that the whole filter is compressed to fit into the frequency range [0,D]. This can be useful in cases where the projections contain little high-frequency information but there is high-frequency noise. In this case, the noise can be completely suppressed without compromising the reconstruction. The following call to iradon sets a normalized frequency value of 0.85.

IR = iradon(R,theta,0.85);

Example: Reconstructing an Image from Parallel Projection Data

The commands below illustrate how to reconstruct an image from parallel projection data. The test image is the Shepp-Logan head phantom, which can be generated using the phantom function. The phantom image illustrates many of the qualities that are found in real-world tomographic imaging of human heads. The bright elliptical shell along the exterior is analogous to a skull, and the many ellipses inside are analogous to brain features.

  1. Create a Shepp-Logan head phantom image.

    P = phantom(256);  imshow(P)

  2. Compute the Radon transform of the phantom brain for three different sets of theta values. R1 has 18 projections, R2 has 36 projections, and R3 has 90 projections.

    theta1 = 0:10:170; [R1,xp] = radon(P,theta1); theta2 = 0:5:175; [R2,xp] = radon(P,theta2); theta3 = 0:2:178; [R3,xp] = radon(P,theta3);
  1. Display a plot of one of the Radon transforms of the Shepp-Logan head phantom. The following figure shows R3, the transform with 90 projections.

figure, imagesc(theta3,xp,R3); colormap(hot); colorbar xlabel('\theta'); ylabel('x\prime');

Radon Transform of Head Phantom Using 90 Projections

Note how some of the features of the input image appear in this image of the transform. The first column in the Radon transform corresponds to a projection at 0 that is integrating in the vertical direction. The centermost column corresponds to a projection at 90, which is integrating in the horizontal direction. The projection at 90 has a wider profile than the projection at 0 due to the larger vertical semi-axis of the outermost ellipse of the phantom.

  1. Reconstruct the head phantom image from the projection data created in step 2 and display the results.

    I1 = iradon(R1,10); I2 = iradon(R2,5); I3 = iradon(R3,2); imshow(I1) figure, imshow(I2) figure, imshow(I3)

    The following figure shows the results of all three reconstructions. Notice how image I1, which was reconstructed from only 18 projections, is the least accurate reconstruction. Image I2, which was reconstructed from 36 projections, is better, but it is still not clear enough to discern clearly the small ellipses in the lower portion of the image. I3, reconstructed using 90 projections, most closely resembles the original image. Notice that when the number of projections is relatively small (as in I1 and I2), the reconstruction can include some artifacts from the back projection.

    Inverse Radon Transforms of the Shepp-Logan Head Phantom


Radon Transformation Definition

The radon function computes projections of an image matrix along specified directions.

A projection of a two-dimensional function f(x,y) is a set of line integrals. The radon function computes the line integrals from multiple sources along parallel paths, or beams, in a certain direction. The beams are spaced 1 pixel unit apart. To represent an image, the radon function takes multiple, parallel-beam projections of the image from different angles by rotating the source around the center of the image. The following figure shows a single projection at a specified rotation angle.

Parallel-Beam Projection at Rotation Angle Theta

For example, the line integral of f(x,y) in the vertical direction is the projection of f(x,y) onto the x-axis; the line integral in the horizontaldirection is the projection of f(x,y) onto the y-axis. The following figure shows horizontal and vertical projections for a simple two-dimensional function.

Horizontal and Vertical Projections of a Simple Function

Projections can be computed along any angle [[THETA]]. In general, the Radon transform of f(x,y) is the line integral of f parallel to the y-axis


The following figure illustrates the geometry of the Radon transform.

Geometry of the Radon Transform


Plotting the Radon Transform

You can compute the Radon transform of an image I for the angles specified in the vector theta using the radon function with this syntax.

[R,xp] = radon(I,theta);  

The columns of R contain the Radon transform for each angle in theta. The vector xp contains the corresponding coordinates along the x′-axis. The center pixel of I is defined to be floor((size(I)+1)/2); this is the pixel on the x′-axis corresponding to .

The commands below compute and plot the Radon transform at 0 and 45 of an image containing a single square object. xp is the same for all projection angles.

I = zeros(100,100); I(25:75, 25:75) = 1; imshow(I) [R,xp] = radon(I,[0 45]); figure; plot(xp,R(:,1)); title('R_{0^o} (x\prime)')

Radon Transform of a Square Function at 0 Degrees

figure; plot(xp,R(:,2)); title('R_{45^o} (x\prime)')

Radon Transform of a Square Function at 45 Degrees

Viewing the Radon Transform as an Image

The Radon transform for a large number of angles is often displayed as an image. In this example, the Radon transform for the square image is computed at angles from 0 to 180, in 1 increments.

theta = 0:180; [R,xp] = radon(I,theta); imagesc(theta,xp,R); title('R_{\theta} (X\prime)'); xlabel('\theta (degrees)'); ylabel('X\prime'); set(gca,'XTick',0:20:180); colormap(hot); colorbar

Radon Transform Using 180 Projections

Detecting Lines Using the Radon Transform

The Radon transform is closely related to a common computer vision operation known as the Hough transform. You can use the radon function to implement a form of the Hough transform used to detect straight lines. The steps are

  1. Compute a binary edge image using the edge function.

    I = fitsread('solarspectra.fts'); I = mat2gray(I); BW = edge(I); imshow(I), figure, imshow(BW)

  2. Compute the Radon transform of the edge image.

    theta = 0:179; [R,xp] = radon(BW,theta); figure, imagesc(theta, xp, R); colormap(hot); xlabel('\theta (degrees)'); ylabel('x\prime'); title('R_{\theta} (x\prime)'); colorbar

    Radon Transform of an Edge Image

  3. Find the locations of strong peaks in the Radon transform matrix. The locations of these peaks correspond to the locations of straight lines in the original image.

In the following figure, the strongest peaks in R correspond to and . The line perpendicular to that angle and located at is shown below, superimposed in red on the original image. The Radon transform geometry is shown in black. Notice that the other strong lines parallel to the red line also appear as peaks at in the transform. Also, the lines perpendicular to this line appear as peaks at .

Radon Transform Geometry and the Strongest Peak (Red)


DCT Definition

The discrete cosine transform (DCT) represents an image as a sum of sinusoids of varying magnitudes and frequencies. The dct2 function computes the two-dimensional discrete cosine transform (DCT) of an image. The DCT has the property that, for a typical image, most of the visually significant information about the image is concentrated in just a few coefficients of the DCT. For this reason, the DCT is often used in image compression applications. For example, the DCT is at the heart of the international standard lossy image compression algorithm known as JPEG. (The name comes from the working group that developed the standard: the Joint Photographic Experts Group.)

The two-dimensional DCT of an M-by-N matrix A is defined as follows.

The values Bpq are called the DCT coefficients of A. (Note that matrix indices in MATLAB always start at 1 rather than 0; therefore, the MATLAB matrix elements A(1,1) and B(1,1) correspond to the mathematical quantities A00 and B00, respectively.)

The DCT is an invertible transform, and its inverse is given by

The inverse DCT equation can be interpreted as meaning that any M-by-N matrix A can be written as a sum of MN functions of the form

These functions are called the basis functions of the DCT. The DCT coefficients Bpq, then, can be regarded as the weights applied to each basis function. For 8-by-8 matrices, the 64 basis functions are illustrated by this image.

The 64 Basis Functions of an 8-by-8 Matrix

Horizontal frequencies increase from left to right, and vertical frequencies increase from top to bottom. The constant-valued basis function at the upper left is often called the DC basis function, and the corresponding DCT coefficient B00 is often called the DC coefficient.


The DCT Transform Matrix

There are two ways to compute the DCT using Image Processing Toolbox software. The first method is to use the dct2 function. dct2 uses an FFT-based algorithm for speedy computation with large inputs. The second method is to use the DCT transform matrix, which is returned by the function dctmtx and might be more efficient for small square inputs, such as 8-by-8 or 16-by-16. The M-by-M transform matrix T is given by

For an M-by-M matrix A, T*A is an M-by-M matrix whose columns contain the one-dimensional DCT of the columns of A. The two-dimensional DCT of A can be computed as B=T*A*T'. Since T is a real orthonormal matrix, its inverse is the same as its transpose. Therefore, the inverse two-dimensional DCT of B is given by T'*B*T.


DCT and Image Compression

In the JPEG image compression algorithm, the input image is divided into 8-by-8 or 16-by-16 blocks, and the two-dimensional DCT is computed for each block. The DCT coefficients are then quantized, coded, and transmitted. The JPEG receiver (or JPEG file reader) decodes the quantized DCT coefficients, computes the inverse two-dimensional DCT of each block, and then puts the blocks back together into a single image. For typical images, many of the DCT coefficients have values close to zero; these coefficients can be discarded without seriously affecting the quality of the reconstructed image.

The example code below computes the two-dimensional DCT of 8-by-8 blocks in the input image, discards (sets to zero) all but 10 of the 64 DCT coefficients in each block, and then reconstructs the image using the two-dimensional inverse DCT of each block. The transform matrix computation method is used.

I = imread('cameraman.tif'); I = im2double(I); T = dctmtx(8); dct = @(block_struct) T * * T'; B = blockproc(I,[8 8],dct); mask = [1 1 1 1 0 0 0 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]; B2 = blockproc(B,[8 8],@(block_struct) mask .*; invdct = @(block_struct) T' * * T; I2 = blockproc(B2,[8 8],invdct); imshow(I), figure, imshow(I2)

Although there is some loss of quality in the reconstructed image, it is clearly recognizable, even though almost 85% of the DCT coefficients were


Definition of Fourier Transform

The Fourier transform is a representation of an image as a sum of complex exponentials of varying magnitudes, frequencies, and phases. The Fourier transform plays a critical role in a broad range of image processing applications, including enhancement, analysis, restoration, and compression.

If f(m,n) is a function of two discrete spatial variables m and n, then the two-dimensional Fourier transform of f(m,n) is defined by the relationship

The variables ω1 and ω2 are frequency variables; their units are radians per sample. F(ω1,ω2) is often called the frequency-domain representation of f(m,n). F(ω1,ω2) is a complex-valued function that is periodic both in ω1 and ω2, with period . Because of the periodicity, usually only the range is displayed. Note that F(0,0) is the sum of all the values of f(m,n). For this reason, F(0,0) is often called the constant component or DC component of theFourier transform. (DC stands for direct current; it is an electrical engineering term that refers to a constant-voltage power source, as opposed to a power source whose voltage varies sinusoidally.)

The inverse of a transform is an operation that when performed on a transformed image produces the original image. The inverse two-dimensional Fourier transform is given by

Roughly speaking, this equation means that f(m,n) can be represented as a sum of an infinite number of complex exponentials (sinusoids) with different frequencies. The magnitude and phase of the contribution at the frequencies (ω1,ω2) are given by F(ω1,ω2).

Visualizing the Fourier Transform

To illustrate, consider a function f(m,n) that equals 1 within a rectangular region and 0 everywhere else. To simplify the diagram, f(m,n) is shown as a continuous function, even though the variables m and n are discrete.

Rectangular Function

The following figure shows, as a mesh plot, the magnitude of the Fourier transform,

of the rectangular function shown in the preceding figure. The mesh plot of the magnitude is a common way to visualize the Fourier transform.

Magnitude Image of a Rectangular Function

The peak at the center of the plot is F(0,0), which is the sum of all the values in f(m,n). The plot also shows that F(ω1,ω2) has more energy at high horizontal frequencies than at high vertical frequencies. This reflects the fact that horizontal cross sections of f(m,n) are narrow pulses, while vertical cross sections are broad pulses. Narrow pulses have more high-frequency content than broad pulses.

Another common way to visualize the Fourier transform is to display

as an image, as shown.

Log of the Fourier Transform of a Rectangular Function

Using the logarithm helps to bring out details of the Fourier transform in regions where F(ω1,ω2) is very close to 0.

Examples of the Fourier transform for other simple shapes are shown below.

Fourier Transforms of Some Simple Shapes


Discrete Fourier Transform

Working with the Fourier transform on a computer usually involves a form of the transform known as the discrete Fourier transform (DFT). A discrete transform is a transform whose input and output values are discrete samples, making it convenient for computer manipulation. There are two principal reasons for using this form of the transform:

  • The input and output of the DFT are both discrete, which makes it convenient for computer manipulations.

  • There is a fast algorithm for computing the DFT known as the fast Fourier transform (FFT).

The DFT is usually defined for a discrete function f(m,n) that is nonzero only over the finite region and . The two-dimensional M-by-N DFT and inverse M-by-N DFT relationships are given by


The values F(p,q) are the DFT coefficients of f(m,n). The zero-frequency coefficient, F(0,0), is often called the "DC component." DC is an electrical engineering term that stands for direct current. (Note that matrix indices in MATLAB always start at 1 rather than 0; therefore, the matrix elements f(1,1) and F(1,1) correspond to the mathematical quantities f(0,0) and F(0,0), respectively.)

The MATLAB functions fft, fft2, and fftn implement the fast Fourier transform algorithm for computing the one-dimensional DFT, two-dimensional DFT, and N-dimensional DFT, respectively. The functions ifft, ifft2, and ifftn compute the inverse DFT.

Relationship to the Fourier Transform

The DFT coefficients F(p,q) are samples of the Fourier transform F(ω1,ω2).

Visualizing the Discrete Fourier Transform

  1. Construct a matrix f that is similar to the function f(m,n) in the example in Definition of Fourier Transform. Remember that f(m,n) is equal to 1 within the rectangular region and 0 elsewhere. Use a binary image to represent f(m,n).

    f = zeros(30,30); f(5:24,13:17) = 1; imshow(f,'InitialMagnification','fit')

  2. Compute and visualize the 30-by-30 DFT of f with these commands.

    F = fft2(f); F2 = log(abs(F)); imshow(F2,[-1 5],'InitialMagnification','fit'); colormap(jet); colorbar

    Discrete Fourier Transform Computed Without Padding

    This plot differs from the Fourier transform displayed in Visualizing the Fourier Transform. First, the sampling of the Fourier transform is much coarser. Second, the zero-frequency coefficient is displayed in the upper left corner instead of the traditional location in the center.

  3. To obtain a finer sampling of the Fourier transform, add zero padding to f when computing its DFT. The zero padding and DFT computation can be performed in a single step with this command.

    F = fft2(f,256,256);

    This command zero-pads f to be 256-by-256 before computing the DFT.

    imshow(log(abs(F)),[-1 5]); colormap(jet); colorbar

    Discrete Fourier Transform Computed with Padding

  4. The zero-frequency coefficient, however, is still displayed in the upper left corner rather than the center. You can fix this problem by using the function fftshift, which swaps the quadrants of F so that the zero-frequency coefficient is in the center.

    F = fft2(f,256,256);F2 = fftshift(F); imshow(log(abs(F2)),[-1 5]); colormap(jet); colorbar

    The resulting plot is identical to the one shown in Visualizing the Fourier Transform.

Applications of the Fourier Transform

This section presents a few of the many image processing-related applications of the Fourier transform.

Frequency Response of Linear Filters

The Fourier transform of the impulse response of a linear filter gives the frequency response of the filter. The function freqz2 computes and displays a filter's frequency response. The frequency response of the Gaussian convolution kernel shows that this filter passes low frequencies and attenuates high frequencies.

h = fspecial('gaussian');  freqz2(h)

Frequency Response of a Gaussian Filter

See Designing and Implementing 2-D Linear Filters for Image Data for more information about linear filtering, filter design, and frequency responses.

Fast Convolution

A key property of the Fourier transform is that the multiplication of two Fourier transforms corresponds to the convolution of the associated spatial functions. This property, together with the fast Fourier transform, forms the basis for a fast convolution algorithm.

    Note The FFT-based convolution method is most often used for large inputs. For small inputs it is generally faster to use imfilter.

To illustrate, this example performs the convolution of A and B, where A is an M-by-N matrix and B is a P-by-Q matrix:

  1. Create two matrices.

    A = magic(3);  B = ones(3);
  2. Zero-pad A and B so that they are at least (M+P-1)-by-(N+Q-1). (Often A and B are zero-padded to a size that is a power of 2 because fft2 is fastest for these sizes.) The example pads the matrices to be 8-by-8.
    A(8,8) = 0;  B(8,8) = 0;
  3. Compute the two-dimensional DFT of A and B using fft2, multiply the two DFTs together, and compute the inverse two-dimensional DFT of the result using ifft2
    C = ifft2(fft2(A).*fft2(B));
  4. Extract the nonzero portion of the result and remove the imaginary part caused by roundoff error.
    C = C(1:5,1:5);  C = real(C)

    This example produces the following result.

    C = 8.0000 9.0000 15.0000 7.0000 6.0000 11.0000 17.0000 30.0000 19.0000 13.0000 15.0000 30.0000 45.0000 30.0000 15.0000 7.0000 21.0000 30.0000 23.0000 9.0000 4.0000 13.0000 15.0000 11.0000 2.0000

Locating Image Features

The Fourier transform can also be used to perform correlation, which is closely related to convolution. Correlation can be used to locate features within an image; in this context correlation is often called template matching.

This example illustrates how to use correlation to locate occurrences of the letter "a" in an image containing text:

  1. Read in the sample image.

    bw = imread('text.png');
  2. Create a template for matching by extracting the letter "a" from the image.

    a = bw(32:45,88:98);

    You can also create the template image by using the interactive version of imcrop.

    The following figure shows both the original image and the template.

    imshow(bw);  figure, imshow(a);

    Image (left) and the Template to Correlate (right)

  3. Compute the correlation of the template image with the original image by rotating the template image by 180o and then using the FFT-based convolution technique described in Fast Convolution.

    (Convolution is equivalent to correlation if you rotate the convolution kernel by 180o.) To match the template to the image, use the fft2 and ifft2 functions.

    C = real(ifft2(fft2(bw) .* fft2(rot90(a,2),256,256)));

    The following image shows the result of the correlation. Bright peaks in the image correspond to occurrences of the letter.

    figure, imshow(C,[]) % Scale image to appropriate display range.

    Correlated Image

  4. To view the locations of the template in the image, find the maximum pixel value and then define a threshold value that is less than this maximum. The locations of these peaks are indicated by the white spots in the thresholded correlation image. (To make the locations easier to see in this figure, the thresholded image has been dilated to enlarge the size of the points.)

    max(C(:)) ans = 68.0000 thresh = 60; % Use a threshold that's a little less than max. figure, imshow(C > thresh) % Display pixels over threshold.

    Correlated, Thresholded Image Showing Template Locations