Tuesday, September 8, 2009

Act 17: Photometric Stereo

In this activity, we attempt to reconstruct a 3-D surface from four images of the same synthetic sphere. The four images correspond to four different locations of the point source, namely:

V1 = {0.085832, 0.17365, 0.98106}
V2 = {0.085832, -0.17365, 0.98106}
V3 = {0.17365, 0, 0.98481}
V4 = {0.16318, -0.34202, 0.92542}

The image as displayed in Matlab is shown below:


First, we compute the surface normals of the image. The normal at each point is given by

$\hat{n} = \dfrac{g}{|g|}$

where $g$ is computed from the matrix $V$ of source directions above and the vector $I$ of the four measured intensities at that point:

$g = (V^T V)^{-1} V^T I$

From the obtained surface normals, $n_x$, $n_y$ and $n_z$ were used to obtain the partial derivatives of $f(u,v)$ with respect to both $x$ and $y$ using the following equations:

$\dfrac{\partial f}{\partial x} = -\dfrac{n_x}{n_z}, \qquad \dfrac{\partial f}{\partial y} = -\dfrac{n_y}{n_z}$

To get $f$, the equation below was used:

$f(u,v) = \int_0^u \dfrac{\partial f}{\partial x}\,dx' + \int_0^v \dfrac{\partial f}{\partial y}\,dy'$

The Scilab function cumsum can be used in lieu of integration. The reconstructed surface was then plotted using plot3d, and the reconstruction is shown below.
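For reference, here is a minimal Scilab sketch of the whole procedure. It assumes the four images have already been loaded and flattened into the rows of a 4xN matrix I, and that each image is 128x128 (both assumptions on my part):

// V: 4x3 matrix of source directions (rows are V1 to V4)
V = [0.085832 0.17365 0.98106;
     0.085832 -0.17365 0.98106;
     0.17365 0 0.98481;
     0.16318 -0.34202 0.92542];

// I: 4xN matrix, each row a flattened intensity image (assumed loaded beforehand)
g = inv(V'*V)*V'*I;                 // least-squares solution of I = V*g
gmag = sqrt(sum(g.^2, 'r'));        // |g| at every pixel
gmag(find(gmag == 0)) = 1e-10;      // guard against division by zero
n = g ./ (ones(3,1)*gmag);          // unit surface normals

dfx = -n(1,:) ./ (n(3,:) + 1e-10);  // df/dx = -nx/nz
dfy = -n(2,:) ./ (n(3,:) + 1e-10);  // df/dy = -ny/nz

// reshape to the (assumed) 128x128 image size, then integrate via cumsum
fx = matrix(dfx, 128, 128);
fy = matrix(dfy, 128, 128);
f = cumsum(fx, 'c') + cumsum(fy, 'r');
plot3d(1:128, 1:128, f);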

I will give myself a grade of 10/10 since this activity was accomplished within the period. I would like to thank Earl for his cooperation and Martin for helping us with the final parts of the code.

Wednesday, September 2, 2009

Act 16: Neural Networks

In this activity, we again classify objects, except this time we make use of neural networks. A neural network is a computational model of how the neurons in a brain work. In comparison with linear discriminant analysis, there is no need to set rules for classification. Rather, a neural network learns the rules from examples, which it then applies to the objects to be classified.

For this activity, we make use of the two classes from Activity 15. Cole's neural network code was used. After setting a seed so that each run is consistent, we first make a training set:

x = [0.38 0.39;
0.35 0.38;
0.33 0.38;
0.31 0.36;
0.13 0.27;
0.12 0.27;
0.09 0.2;
0.09 0.2]

Note that the first column is the pixel area/1000 of the object and the second column is the red color channel value of the object. We set the values for classifying the objects as

[0 0 0 0 1 1 1 1]

This means that the first four items correspond to class 0, the white squash seeds, while the remaining items correspond to class 1, the small round seeds. Once the neural network has learned how to classify the objects from this training set, we can run the test set. Our inputs for the test set are

x1 = [0.35 0.38;
0.31 0.36;
0.12 0.27;
0.11 0.25;
0.11 0.24;
0.1 0.23;
0.1 0.22;
0.09 0.22;
0.09 0.2;
0.09 0.2;
0.33 0.38;
0.1 0.24;
0.3 0.36;
0.28 0.35;
0.26 0.31;
0.24 0.3;
0.13 0.27;
0.38 0.39;
0.27 0.35;
0.27 0.34]

We set the learning rate to 1.0 and the number of training cycles to 1000. By manual classification, the output should be

[0 0 1 1 1 1 1 1 1 1 0 1 0 0 0 0 1 0 0 0]

The output of the neural network is:

[0.0081117 0.0258541 0.9832197 0.9897846 0.9904403 0.9936995 0.9940926 0.9957981 0.9962910 0.9962910 0.0134392 0.9932781 0.0346688 0.0675553 0.1581799 0.2973589 0.9757485 0.0039843 0.0923583 0.0977971]

Rounded off, this becomes:

[0 0 1 1 1 1 1 1 1 1 0 1 0 0 0 0 1 0 0 0]

We can see that the neural network has correctly classified the objects.
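For reference, here is a minimal sketch of how the training and classification might look using the feed-forward functions of Scilab's ANN toolbox. This is an assumption on my part; Cole's code may differ in its details:

// minimal sketch assuming the ANN toolbox is loaded;
// x and x1 are the training and test matrices defined above
rand('seed', 0);                           // fix the seed so each run is consistent

N  = [2, 2, 1];                            // 2 inputs, 2 hidden neurons, 1 output
W  = ann_FF_init(N);                       // random initial weights
lp = [1.0, 0];                             // learning parameters: rate = 1.0
T  = 1000;                                 // number of training cycles

t = [0 0 0 0 1 1 1 1];                     // desired outputs for the training set x
W = ann_FF_Std_online(x', t, N, W, lp, T); // online backpropagation training
y = ann_FF_run(x1', N, W);                 // classify the test set x1
disp(round(y));                            // round off to get the class labels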

The value of the learning rate was then adjusted. I found that decreasing the learning rate decreases the accuracy of the network. When it is set too low, the output values all become zero. Setting the number of training cycles too low has the same effect.

I will give myself a grade of 10 for this activity. Again, I would like to thank Earl for helping me collect the data in Activity 15, since it was used for this activity.

Act 15: Probabilistic Classification

In this activity, we make use of the results of Activity 14 and segregate two classes of objects in the image. In linear discriminant analysis (LDA), this is done in two steps. First, a set of features that best distinguishes the objects is chosen, and then a classification rule or model is used to separate them. Much information for this activity was taken from http://people.revoledu.com/kardi/tutorial/LDA/.

We make use of a test image similar to that of Activity 14, except that we eliminate all the largest seeds so that we are left with only two groups.

The method for extracting the features of the objects was also similar to that of Activity 14. From these data, we make use of LDA to classify the objects into two separate classes, as in the sketch below. If LDA works, the separator between the two groups should be a line. Classifying the two groups with two discriminant functions yields the plot below:
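For reference, a minimal Scilab sketch of the discriminant computation as described in the linked tutorial. The variable names are hypothetical, with x1 and x2 holding the feature vectors (one object per row) of the two classes:

// x1, x2: feature matrices of the two classes (rows = objects), assumed given
n1 = size(x1, 1); n2 = size(x2, 1); n = n1 + n2;
u1 = mean(x1, 'r');                 // class means (row vectors)
u2 = mean(x2, 'r');

// mean-corrected data and the pooled covariance matrix
a1 = x1 - ones(n1, 1)*u1;
a2 = x2 - ones(n2, 1)*u2;
C  = (a1'*a1 + a2'*a2)/n;
Ci = inv(C);

p1 = n1/n; p2 = n2/n;               // prior probabilities

// discriminant value of one class for a test feature vector xk (a row);
// xk is assigned to the class giving the larger value of f
function f = discrim(xk, u, Ci, p)
    f = u*Ci*xk' - 0.5*u*Ci*u' + log(p);
endfunction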

It can be seen that the graph can easily be separated by a line. In fact, the two classes are quite distinct from each other.

I will give myself a grade of 10/10 for finishing this activity. I would like to thank Earl for his help in both data collection and programming.

Wednesday, August 26, 2009

Act 14: Pattern Recognition

In this activity, we extract patterns from a given image using image processing in order to define a set of features that will allow us to separate the set into classes, and to find the most suitable classifier for the task. Since some objects can have similar parameters, we are prompted to search for other parameters that may be unique. We take the mean of the numerical values of each of these parameters for every class, and then compare each object's parameters to the respective class means, as in the sketch below. For an object to belong to a certain class, its parameters should be close to that class's mean.
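A minimal sketch of this minimum-distance classification in Scilab, under the assumption that features is an nx2 matrix holding each object's parameters and mu is a 3x2 matrix of class means (hypothetical names, both computed beforehand):

// features: nx2 matrix of object parameters; mu: 3x2 matrix of class means
nobj = size(features, 1);
class = zeros(nobj, 1);
for k = 1:nobj
    d = zeros(3, 1);
    for c = 1:3
        d(c) = norm(features(k, :) - mu(c, :));  // distance to each class mean
    end
    [dmin, idx] = min(d);                        // assign to the nearest class
    class(k) = idx;
end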

We make use of an assembly of three different kinds of bird seeds: small round seeds, white squash seeds and large reddish-brown seeds.

The parameters taken into account were the size of the seeds and their red color channel values. The graph below summarizes the objects' parameters:

The red dot stands for the mean value of the small round seeds, the blue dot for the white squash seeds, and the remaining dot for the large reddish-brown seeds. It can be seen that similar objects are clustered together, save for a few deviations in color between the class of the small round seeds and the white squash seeds. The color discrepancy may be attributed to the uneven illumination of the objects when the image was taken. However, the sizes of the two smaller seed types are distinct enough from each other that we can still infer the class they correctly belong to. The largest seeds are very far apart from the other two classes in terms of size, so their somewhat large deviations from the mean do not matter.

I will give myself a grade of 10/10 for this activity since the code has successfully classified the objects. I would like to thank Earl for his assistance in data collection and help in the code.

Monday, August 17, 2009

Act 13: Correcting Geometric Distortion

In this activity, we correct the barrel distortion in a given image, in this case a grid, using two methods: bilinear interpolation and the nearest neighbor method. The image to be reconstructed is shown below:

We compare the image with an ideal grid constructed computationally in Scilab, which shall serve as the reference image.

The vertices of the grid will serve as reference points for the reconstruction. First, the least distorted portion of the grid was located in the image, and the number of pixels down and across one box was counted. For each box, the coefficients $c_1$ to $c_8$ were computed from the four corner points of the box. These coefficients relate the ideal coordinates $(u, v)$ to the distorted coordinates $(x, y)$ via

$x = c_1 u + c_2 v + c_3 uv + c_4$
$y = c_5 u + c_6 v + c_7 uv + c_8$

which, in matrix-vector notation for the four corners of a box, is

$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} u_1 & v_1 & u_1 v_1 & 1 \\ u_2 & v_2 & u_2 v_2 & 1 \\ u_3 & v_3 & u_3 v_3 & 1 \\ u_4 & v_4 & u_4 v_4 & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{bmatrix}$

or $\mathbf{x} = T\mathbf{c}$ (and similarly for $\mathbf{y}$ with $c_5$ to $c_8$). Since we are looking for the coefficients, we manipulate the above equation to

$\mathbf{c} = T^{-1}\mathbf{x}$

We then determine, for each point $(u, v)$ in the ideal rectangle, its corresponding location in the distorted image using the two equations above.
The corner locations in the distorted image were found using Scilab's locate function. The computed locations were also rounded off so that they become integer-valued.
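A minimal Scilab sketch of the per-box computation, assuming ui, vi hold the four ideal corner coordinates of one box and xd, yd the corresponding clicked corners in the distorted image (hypothetical names, all 4x1 column vectors):

// build the transformation matrix from the four ideal corners
T = [ui vi ui.*vi ones(4,1)];
c14 = inv(T)*xd;                  // c1..c4: maps (u,v) to x
c58 = inv(T)*yd;                  // c5..c8: maps (u,v) to y

// map an ideal pixel (u, v) inside this box into the distorted image
u = 10; v = 12;                   // example coordinates (hypothetical)
x = c14(1)*u + c14(2)*v + c14(3)*u*v + c14(4);
y = c58(1)*u + c58(2)*v + c58(3)*u*v + c58(4);

// nearest neighbor: copy the gray value of the nearest distorted pixel
// (img is the distorted image, recon the preallocated output;
// rounded indices are assumed to stay within the image bounds)
recon(v, u) = img(round(y), round(x));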

Using nearest neighbor method, the reconstructed image looks like this:

while using bilinear interpolation yields the image below.

As expected, the grid reconstructed using bilinear interpolation has better quality than the one reconstructed using the nearest neighbor method.

For this activity, I will give myself a grade of 10/10. I would like to thank Gilbert for his tremendous help.

Friday, August 7, 2009

Act 12: Color Image Segmentation

In this activity, we try to single out a particular region in a given image by taking a cropped picture of the region's surface. We implement this using both parametric and nonparametric segmentation methods and compare their outcomes. We make use of the image below:

Our region of interest is the surface of the silver mouse:

Applying parametric segmentation on the original image using the above region of interest, we get the image below:

As said in the activity manual, histogram backprojection is similar to what was done in Activity 4, except that the lookup histogram is now two-dimensional. Thus, for the nonparametric part, the following histograms were obtained from the r and g chromaticity values of the ROI:

Applying histogram backprojection yields the result below


This time, we make use of a portion of the wooden table as a region of interest:

Using parametric segmentation:

The histogram for the nonparametric segmentation is shown below

And the resulting image from histogram backprojection is:

Note that the histograms above correspond properly to the normalized color chromaticity space shown below. The silver mouse appears much more bluish and shinier, hence its histogram lies closer to both the white and blue regions of the chromaticity space. On the other hand, the wood ROI, while not exactly white, falls in a region where the colors mix.

It can be seen from the images that parametric segmentation makes the region of interest much more visible. However, nonparametric segmentation somewhat recovers the specular reflections, as can be seen in both images. Parametric segmentation is more useful when you need to clearly outline the parts of the image that correspond to your region of interest. Nonparametric segmentation may be more useful for manipulating your image, such as changing the region of interest's color, hue, brightness, etc. A sketch of both methods follows.
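A minimal Scilab sketch of both methods, assuming img and roi are RGB images already loaded (e.g. via the SIP/SIVP toolbox's imread) with values scaled to [0, 1]:

// normalized chromaticity coordinates (NCC) of the ROI
R = roi(:,:,1); G = roi(:,:,2); B = roi(:,:,3);
I = R + G + B; I(find(I == 0)) = 1e5;     // guard against division by zero
r = R ./ I; g = G ./ I;

// NCC of the whole image
Ri = img(:,:,1); Gi = img(:,:,2); Bi = img(:,:,3);
Ii = Ri + Gi + Bi; Ii(find(Ii == 0)) = 1e5;
ri = Ri ./ Ii; gi = Gi ./ Ii;

// --- parametric: fit independent Gaussians to the ROI's r and g values ---
mr = mean(r); sr = stdev(r);
mg = mean(g); sg = stdev(g);
pr = exp(-((ri - mr).^2)/(2*sr^2))/(sr*sqrt(2*%pi));
pg = exp(-((gi - mg).^2)/(2*sg^2))/(sg*sqrt(2*%pi));
seg = pr .* pg;                           // joint probability map p(r)p(g)

// --- nonparametric: 2D r-g histogram of the ROI, then backprojection ---
BINS = 32;
rb = round((BINS - 1)*r) + 1; gb = round((BINS - 1)*g) + 1;
hist2 = zeros(BINS, BINS);
for k = 1:length(rb)
    hist2(rb(k), gb(k)) = hist2(rb(k), gb(k)) + 1;
end
rbi = round((BINS - 1)*ri) + 1; gbi = round((BINS - 1)*gi) + 1;
backproj = zeros(ri);
for k = 1:length(rbi)
    backproj(k) = hist2(rbi(k), gbi(k));  // look up each pixel's bin count
end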

I will give myself a grade of 8/10 for this activity: I finished the exercise, but the reduced grade is for not completely understanding it. Hopefully, it will all be explained in the next meeting. I would like to thank Gilbert for helping me with the code in this exercise.

Thursday, August 6, 2009

Act 11: Color Image Processing

In this activity, we intentionally take two wrongly balanced images and apply two algorithms to properly balance them: the white patch algorithm and the gray world algorithm.

The first image is an ensemble of colorful objects:

Applying the white patch algorithm:

Applying the gray world algorithm:

For this image, it seems better to use the gray world algorithm; the white patch-rendered image is too bright and oversaturated, while the gray world-rendered image looks more natural. This is due to the fact that the gray world algorithm takes the average of all the colors in the image, and there is a good balance of colors in the assembled ensemble of objects. A sketch of both algorithms follows.
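A minimal Scilab sketch of both algorithms, assuming img is a loaded RGB image scaled to [0, 1] and patch is a cropped region known to be white (hypothetical names):

R = img(:,:,1); G = img(:,:,2); B = img(:,:,3);

// --- white patch: divide each channel by the white patch's average ---
Rw = R/mean(patch(:,:,1));
Gw = G/mean(patch(:,:,2));
Bw = B/mean(patch(:,:,3));
Rw(find(Rw > 1)) = 1;             // clip values that overflow past white
Gw(find(Gw > 1)) = 1;
Bw(find(Bw > 1)) = 1;

// --- gray world: divide each channel by its own image-wide average ---
Rg = R/mean(R); Gg = G/mean(G); Bg = B/mean(B);
m = max([max(Rg) max(Gg) max(Bg)]);
Rg = Rg/m; Gg = Gg/m; Bg = Bg/m;  // rescale back into [0, 1]

// reassemble the two balanced images
wp = img; wp(:,:,1) = Rw; wp(:,:,2) = Gw; wp(:,:,3) = Bw;
gw = img; gw(:,:,1) = Rg; gw(:,:,2) = Gg; gw(:,:,3) = Bg;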

For the next test, we make use of an image with different hues of green:

The image below is rendered using the white patch algorithm:

The image below is rendered using the gray world algorithm:

For this case, it seems better to use the white patch algorithm, since the saturation of the green hues skews the overall average color of the image. As a result, the color balance of the gray world image is worse than that of the original.

I will give myself a grade of 8/10 for this activity, for being able to accomplish a good part of it; I was hard-pressed for time and could not take images set up under other lighting conditions. I would like to thank Earl for his cooperative work on this activity, as well as Rommel for helping me set up the objects for the image.