Thursday, August 6, 2009

Act 10: Preprocessing Text

In this activity, we use the given Untitled_0001.jpg and crop a portion from it. I chose this particular segment

The image was binarized so that it could be processed, and tilted using mogrify so that the lines become horizontal. Performing an FFT on the tilted image yields the frequency-domain image

We can make an appropriate mask for the above image. I made use of

The masked frequency-domain image was returned to the spatial domain using ifft, yielding the image
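The masking step above can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the code used for this activity: the synthetic image of ruled lines, the mask shape (blocking the central column of the shifted spectrum while keeping the DC term) and the function name `filter_in_frequency_domain` are all assumptions made for the example.

```python
import numpy as np

def filter_in_frequency_domain(image, mask):
    """Multiply the centred spectrum of `image` by `mask`, then transform back."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))   # zero frequency moved to the centre
    filtered = spectrum * mask                        # zeros in the mask remove frequencies
    restored = np.fft.ifft2(np.fft.ifftshift(filtered))
    return np.abs(restored)

# Hypothetical stand-in for the cropped form: horizontal ruled lines every 8 rows.
rows, cols = 64, 64
image = np.zeros((rows, cols))
image[::8, :] = 1.0

# Horizontal lines do not vary along x, so their energy lies on the vertical
# axis of the shifted spectrum (the centre column). Block that column but keep
# the DC term so the mean brightness survives. (Assumed mask, for illustration.)
mask = np.ones((rows, cols))
mask[:, cols // 2] = 0.0
mask[rows // 2, cols // 2] = 1.0

cleaned = filter_in_frequency_domain(image, mask)
print(cleaned.min(), cleaned.max())   # the lines are flattened to a uniform level
```

On a real scan the handwriting has energy away from the blocked column, so it would mostly survive the mask while the ruled lines are suppressed.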

By applying the appropriate threshold, the image is binarized

Using the closing operator with a structuring element consisting of a 4x1 matrix of ones, i.e. [1; 1; 1; 1], the image below was obtained
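For reference, closing is a dilation followed by an erosion with the same structuring element. Below is a minimal NumPy sketch for a vertical column of ones. The helper names `shift_rows` and `close_vertical`, the choice of origin inside the element, and the zero padding at the borders are assumptions of this example, not details of the routine I actually ran.

```python
import numpy as np

def shift_rows(img, dy):
    """Shift a boolean image down by dy rows (up if dy < 0), filling with False."""
    out = np.zeros_like(img)
    if dy > 0:
        out[dy:, :] = img[:-dy, :]
    elif dy < 0:
        out[:dy, :] = img[-dy:, :]
    else:
        out[:] = img
    return out

def close_vertical(binary, length=4):
    """Morphological closing with a length x 1 column of ones."""
    img = np.asarray(binary, dtype=bool)
    # Pad so that the dilation is not clipped at the top and bottom borders.
    padded = np.pad(img, ((length, length), (0, 0)))
    offsets = [k - length // 2 for k in range(length)]   # e.g. -2, -1, 0, 1
    dilated = np.zeros_like(padded)
    for d in offsets:                                    # dilation: OR of shifted copies
        dilated |= shift_rows(padded, d)
    eroded = np.ones_like(padded)
    for d in offsets:                                    # erosion: AND of opposite shifts
        eroded &= shift_rows(dilated, -d)
    return eroded[length:-length, :]

# Broken vertical stroke: a 2-row gap (rows 4-5) and a 5-row gap (rows 8-12).
stroke = np.zeros((20, 3), dtype=bool)
stroke[2:4, 1] = True
stroke[6:8, 1] = True
stroke[13:15, 1] = True
closed = close_vertical(stroke, length=4)
print(closed[:, 1].astype(int))
```

Vertical gaps shorter than the element are bridged, while longer gaps are left open, which is why the element size matters for reconnecting broken strokes.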

Using bwlabel, we can mark the blobs
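To illustrate what the labelling does, here is a small Python sketch that gives every connected group of white pixels its own integer label. It is only a stand-in for bwlabel: the function name `label_blobs` and the use of 8-connectivity are assumptions of this example.

```python
from collections import deque
import numpy as np

def label_blobs(binary):
    """Label 8-connected blobs of True pixels; returns (label image, blob count)."""
    binary = np.asarray(binary, dtype=bool)
    rows, cols = binary.shape
    labels = np.zeros((rows, cols), dtype=int)
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r, c] and labels[r, c] == 0:
                count += 1                      # new blob found, flood-fill it
                labels[r, c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and binary[ny, nx] and labels[ny, nx] == 0):
                                labels[ny, nx] = count
                                queue.append((ny, nx))
    return labels, count

demo = np.array([
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
])
labels, n_blobs = label_blobs(demo)
print(labels)
print(n_blobs)
```

Displaying the label image with a colormap then gives each blob its own colour.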


For the last part of the activity, we are to find all the instances of the word 'DESCRIPTION' in Untitled_0001.jpg. We apply the same routine used in the Template Matching using Correlation section of Activity 5. We start by rotating the image using mogrify so that the image is in its proper orientation. We also crop a part of the image containing the word 'DESCRIPTION' and place it on a black background with the same size as the original image, like so

Again, we use the same routine as in Activity 5, where we apply fft2 on both images. We then correlate the above image with the binarized and rotated Untitled_0001.jpg. The algorithm yields the image below:
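The correlation step can be sketched as below, using the correlation theorem: multiply the FFT of the image by the complex conjugate of the FFT of the template and transform back. This is a hedged NumPy illustration with a tiny made-up pattern standing in for the word; the function name `correlate_fft`, the pattern and its positions are all hypothetical, and the Activity 5 routine may differ in details such as shifting or normalization.

```python
import numpy as np

def correlate_fft(image, template):
    """Circular cross-correlation of two equally sized arrays via the FFT."""
    F_image = np.fft.fft2(image)
    F_template = np.fft.fft2(template)
    return np.real(np.fft.ifft2(F_image * np.conj(F_template)))

# Hypothetical 3x3 pattern standing in for the cropped word.
pattern = np.array([[1, 0, 1],
                    [0, 1, 0],
                    [1, 1, 1]], dtype=float)

# "Document": black background with the pattern pasted at three places.
image = np.zeros((32, 32))
for r, c in [(4, 5), (20, 10), (12, 25)]:
    image[r:r + 3, c:c + 3] = pattern

# Template: the pattern on a black background of the same size as the image.
template = np.zeros((32, 32))
template[0:3, 0:3] = pattern

corr = correlate_fft(image, template)
normalized = corr / corr.max()
peaks = np.argwhere(normalized > 0.99)   # bright spots in the correlation image
print(peaks)
```

Because the template here sits in the top-left corner, each peak lands on the top-left corner of a match. If the template is placed elsewhere on the black background, the peaks are displaced by that offset.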

It can be seen from the image that there are three white spots. Looking back at Untitled_0001.jpg, the spots appear at the locations of the word 'DESCRIPTION'. Thus, we can say that there are three instances where the word appears.

I tried to implement an algorithm for automatically counting the number of instances of the word in the image. I used a for loop to check the whole normalized image, element by element, for values sufficiently close to 1, and counted them. Fortunately, a value of 3 instances was returned. Unfortunately, I have yet to check whether those 3 actually correspond to the correlation peaks. I'll try to get back to this problem sometime later.
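One caveat with counting elements one by one is that a single bright spot can contain several pixels above the threshold, so the raw count need not equal the number of spots. A possible way around this, sketched in Python with NumPy, is to take the largest value, record it, blank out its neighbourhood, and repeat. The function name `count_peaks`, the threshold of 0.9 and the radius of 2 are assumptions chosen for this toy example, not values from my actual run.

```python
import numpy as np

def count_peaks(normalized, threshold=0.9, radius=2):
    """Return one (row, col) per bright spot whose value is >= threshold."""
    work = np.array(normalized, dtype=float)   # working copy, the input is untouched
    peaks = []
    while True:
        r, c = np.unravel_index(np.argmax(work), work.shape)
        if work[r, c] < threshold:
            break
        peaks.append((int(r), int(c)))
        # Blank out the neighbourhood so the same spot is not counted twice.
        work[max(r - radius, 0):r + radius + 1,
             max(c - radius, 0):c + radius + 1] = -np.inf
    return peaks

# Toy normalized correlation image: three spots, two of them two pixels wide.
toy = np.zeros((20, 20))
toy[3, 4], toy[3, 5] = 1.0, 0.95
toy[10, 10] = 0.97
toy[15, 2], toy[15, 3] = 0.92, 0.91

raw_count = int(np.sum(toy >= 0.9))   # element-by-element count
spots = count_peaks(toy)
print(raw_count, len(spots), spots)
```

Comparing the returned positions against the locations of the word in the image would also settle whether the counted values really are the correlation peaks.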

For this activity I will give myself a grade of 8/10 since the final processed image seems rougher than I'd want (i.e. the handwriting isn't exactly 1-pixel thick), but I have properly done the last part. I would like to thank Earl for the helpful discussions, and Mimie for teaching me how to properly use bwlabel in tandem with jetcolormap in imshow.
