To successfully complete this assignment, you must follow all the instructions in this notebook and upload your edited .ipynb file to D2L with your answers on or before 11:59pm on Friday, March 12th.

BIG HINT: Read the entire homework before starting.

Homework 3: Color based image segmentation

Image segmentation is the process of separating the stuff you are interested in (foreground) from the stuff you are not interested in (background). Color is often used as an easy way to segment images: thresholds are chosen so that pixels within a certain color range are labeled foreground and everything else is labeled background.
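
As a toy illustration of this kind of fixed-range color thresholding (the pixel values and thresholds below are made up for illustration only; they are not part of the assignment):

```python
import numpy as np

# A made-up 2x2 "image" of RGB pixels (values 0-255).
img = np.array([[[200, 150, 120], [30, 60, 200]],
                [[210, 160, 130], [10, 10, 10]]])

# Hypothetical per-channel lower and upper thresholds for "foreground" colors.
lo = np.array([180, 130, 100])
hi = np.array([255, 180, 150])

# A pixel is foreground only if every channel falls inside its range.
mask = np.all((img >= lo) & (img <= hi), axis=-1)
print(mask)  # [[ True False]
             #  [ True False]]
```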

Faces come in a diverse range of colors, making them difficult to segment using traditional Red, Green and Blue values.

In this question we are going to use eigenvectors and Principal Component Analysis (PCA) to learn a new "skin tone colorspace" designed to make it much easier to segment faces (skin colors) from the background.
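
In outline, the PCA steps we will walk through below (mean-centering, forming the Gram matrix, eigendecomposition, and a change of basis) look like the following sketch. It is run here on synthetic random points, not the skin dataset, so the numbers are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic n x 3 point cloud standing in for RGB values.
X = rng.normal(size=(500, 3)) @ np.array([[3.0, 0.0, 0.0],
                                          [1.0, 1.0, 0.0],
                                          [0.0, 0.0, 0.1]])

mn = X.mean(axis=0)             # mean of each coordinate
A = X - mn                      # center the cloud at the origin
C = A.T @ A                     # 3x3 Gram matrix
vals, vecs = np.linalg.eigh(C)  # eigenvalues in ascending order (C is symmetric)
T = vecs[:, ::-1].T             # rows = eigenvectors, largest eigenvalue on top
Y = A @ T.T                     # points expressed in the new coordinate system

# The variance should now be concentrated along the first new axis.
print(Y.var(axis=0))
```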

In [1]:
#Some python packages you may need in this homework.

%matplotlib inline
import matplotlib.pylab as plt
import numpy as np
import sympy as sym
sym.init_printing(use_unicode=True)

To start we are going to download a bunch of color values from the following website.

https://archive.ics.uci.edu/ml/datasets/skin+segmentation#

The file we are interested in is linked below:

https://archive.ics.uci.edu/ml/machine-learning-databases/00229/Skin_NonSkin.txt

The file contains thousands of colors sampled from a diverse population of face images. Note that these colors are stored in BGR order (Blue, Green, Red). The file also contains BGR colors sampled from non-face regions (these non-face colors are used in machine learning but are discarded for this homework). The fourth number in each row is a label: one (1) indicates a skin color and two (2) indicates a non-skin color.

The following cells download the file, remove all of the non-skin values, reorder the points to RGB, and plot the skin points as a scatter plot in traditional R G B space.

In [2]:
#get the data file from the internet:
from urllib.request import urlopen, urlretrieve

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/00229/Skin_NonSkin.txt'
file = "Skin_NonSkin.txt"

response = urlopen(url)
data = response.read()      # a `bytes` object
text = data.decode('utf-8') 
lines = text.split('\r\n')

data = []

#Read in file line by line
for line in lines:
    try:
        if line:
            data.append(list(map(int, line.split('\t'))))
    except ValueError:
        print('invalid line of data:',line)
response.close()
In [3]:
#Convert the file to a list of points
P = np.matrix(data)
P.shape
Out[3]:
$$\left ( 245057, \quad 4\right )$$
In [4]:
#Mask out only face values and keep just the RGBs
mask = np.array(P[:,3]==1)
mask = mask.flatten()
points = P[mask,:]

## Change order to Red, Green, Blue
points = points[:,(2,1,0)]
In [5]:
# Plot the points in 3D using their actual color values
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

ax.scatter(points[:,0], points[:,1], points[:,2], c=points/255)

ax.set_xlabel('Red');
ax.set_ylabel('Green');
ax.set_zlabel('Blue');

Question 1: (10 points) Calculate the average (mean) Red, Green and Blue values of the points plotted in the above figure. Store these values in a vector named mn.

In [6]:
#Put your answer to the above question here
In [7]:
from answercheck import checkanswer
checkanswer.detailedwarnings=False
checkanswer.vector(mn,"1fa50978a380472875752d3d083afa41");

Question 2: (10 points) Subtract the mean values (mn) from each of the points (i.e., center the points around the origin) and store the result in a new matrix called A. (This is the first step in PCA.)

In [8]:
#put your answer to the above question here.
In [9]:
from answercheck import checkanswer
checkanswer.detailedwarnings = False;
checkanswer.matrix(A, "968ac30b396e941c60b6fcfeade0335c");

Question 3: (5 points) Plot the adjusted points again and make sure they are approximately centered around the origin. NOTE: keep the color input to scatter the same so we see the original colors in the graph.

In [10]:
# YOUR CODE HERE
raise NotImplementedError()

Question 4: (10 points) Calculate the $3 \times 3$ Gram matrix $C = A^TA$.

In [11]:
##Put your answer to the above question here
In [12]:
from answercheck import checkanswer
checkanswer.matrix(C, "267893b255a1b2035403c91c74443a63");

Question 5: (10 points) Calculate the eigenvalues and eigenvectors of the Gram matrix $C$.

In [13]:
# YOUR CODE HERE
raise NotImplementedError()

Question 6: (10 points) Use the eigenvectors found in Question 5 to create a $3 \times 3$ orthonormal color transformation matrix T that can transform points from RGB space into the new principal component's color space. I.e. the rows of the transformation matrix should consist of the three normalized eigenvectors sorted from largest to smallest eigenvalues (largest on top).

In [14]:
# Put your answer here  
In [15]:
from answercheck import checkanswer
checkanswer.matrix(T, "dca594755c0e0df561f15b04bff2d091");

Now let's download and view an example image that contains faces and see if we can segment out the faces using skin color.

In [16]:
from imageio import imread

url = 'https://hecatedemeter.files.wordpress.com/2013/12/diverse-crowd.jpg'
file = 'Faces.jpg'

urlretrieve(url,file);

im = imread(file)
plt.imshow(im);

The following code "unwraps" this image and puts it into a new im_points matrix:

In [17]:
#Unwrap the image into an n x 3 matrix of pixel colors
im_points = im.reshape((im.shape[0]*im.shape[1],3))
im_points.shape
Out[17]:
$$\left ( 277596, \quad 3\right )$$

Question 7: (10 points) Now take the image points and center them using the mean vector mn calculated above, storing the result in a new matrix im_A.

In [18]:
#Put your answer to the above question here
In [19]:
from answercheck import checkanswer
checkanswer.matrix(im_A, "0394347f996826c68245671d07e5bcf4");

Question 8: (5 points) Plot the centered im_A in the R G B space as above. NOTE: make sure you use the original image points for the color variable in scatter.

In [20]:
# YOUR CODE HERE
raise NotImplementedError()

Question 9: (10 points) Use the $3 \times 3$ color transformation matrix (T) calculated above to transform the im_A points into the new face PCA colorspace. Name the new points color_points which should have the same shape as im_points.

In [21]:
# Put your answer to the above question here.
In [22]:
##Checking size only.
assert(color_points.shape == im_A.shape)

The next step transforms the color_points back into image matrices:

In [23]:
# Turn each component back into a square image
principal_axis_1 = np.array(color_points[:,0].reshape(im[:,:,0].shape))
principal_axis_2 = np.array(color_points[:,1].reshape(im[:,:,0].shape))
principal_axis_3 = np.array(color_points[:,2].reshape(im[:,:,0].shape))

Because we are dealing with pictures, we should be able to visualize how each color point in the image falls along the eigenvectors (aka principal component vectors). The following code shows each principal axis as a grayscale image, along with the combined picture in a "skin tone" colorspace (where the first principal axis is mapped to Red, the second to Green and the third to Blue). This last image doesn't really tell us that much, but it is interesting.

In [24]:
f, ax = plt.subplots(1,4, figsize=(20,10))

ax[0].imshow(principal_axis_1, cmap='gray')
ax[0].axis('off')
ax[0].set_title('first principal axis')
ax[1].imshow(principal_axis_2, cmap='gray')
ax[1].axis('off')
ax[1].set_title('second principal axis')
ax[2].imshow(principal_axis_3, cmap='gray');
ax[2].axis('off')
ax[2].set_title('third principal axis');

combined = im.copy()
combined[:,:,0] = principal_axis_1
combined[:,:,1] = principal_axis_2
combined[:,:,2] = principal_axis_3

ax[3].imshow(combined);
ax[3].axis('off')
ax[3].set_title('Combined');

Now, if we assume we did everything right, the leftmost picture above should represent the values along the first principal axis. The second and third images are orthogonal to the first. If we assume that most of the variance in the face colors is captured by this first axis, then we can model faces (skin) as the values close to this axis.

Put another way, points closest to the first axis are more like skin colors, and points farther away from it are less like skin. Points farther from the first axis will have large values in the second and third principal axes.

Question 10: (10 points) Create a new matrix (the same size as the image) whose values equal the Euclidean distance of each of the PCA points to the first principal axis. In other words, write code to do the following, where $p_2$ is the second principal axis and $p_3$ is the third:

$$ distance = \sqrt{p_2^2 + p_3^2}$$
In [25]:
#Put your answer to the above question here.
In [26]:
from answercheck import checkanswer
checkanswer.matrix(distance, "8e1e05f148bc760af2e4d43c3f816cdc");

We can then display this distance using the following code:

In [27]:
plt.imshow(distance, cmap='viridis')
plt.colorbar()

Low distances in the above distance matrix should represent colors close to "skin" and larger distances should represent colors farther away from skin.

Use the following code to pick a threshold to try and best highlight the pixels in the image that represent skin.

In [28]:
distance_threshold = 20
In [29]:
segment =  distance < distance_threshold

f, ax = plt.subplots(1,2, figsize=(20,10))

ax[0].imshow(segment, cmap='gray')
ax[0].axis('off')
ax[0].set_title('Segmented Image Mask')
ax[1].imshow(im)
ax[1].axis('off')
ax[1].set_title('Original Image')

If we did everything right, the left-hand picture above should be a mask of all pixels in the right-hand image that are skin tones. Obviously this is not a perfect model.

Question 11: (10 points) The above model fails to find really dark skin tones. Results like these are often viewed as racist. What are the main causes of bias in this model, and what could be done to ensure that a more representative set of skin tones is included in an updated model?

YOUR ANSWER HERE


Congratulations, we're done!

Written by Dirk Colbry, Michigan State University Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.