HoG classifier

This exercise builds on the HOG exercises from the previous exercise, but it can also be completed independently.

List of individual tasks
  • Task 1: Implement a classifier
  • Task 2: Sliding window
  • Task 3: Classifying the test image
  • Task 4: Reflections

The cell below imports the necessary libraries, loads an example image, extracts its HOG features, and plots them alongside the original image.

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from itertools import chain

from skimage import data, transform, feature, exposure
from skimage.feature import hog
from sklearn.datasets import fetch_lfw_people
from skimage.color import rgb2gray
from sklearn.feature_extraction.image import PatchExtractor
from sklearn.svm import LinearSVC
#from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

image = data.astronaut()

fd, hog_image = hog(image, orientations=8, pixels_per_cell=(8, 8),
                    cells_per_block=(2, 2), visualize=True, channel_axis=-1)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 6), sharex=True, sharey=True)

ax1.axis('off')
ax1.imshow(image, cmap=plt.cm.gray)
ax1.set_title('Input image')

# Rescale histogram for better display
hog_image_rescaled = exposure.rescale_intensity(hog_image, in_range=(0, 10))

ax2.axis('off')
ax2.imshow(hog_image_rescaled, cmap=plt.cm.gray)
ax2.set_title('Histogram of Oriented Gradients')
plt.show()

Creating the training data

In the following steps you will construct the training data, which consists of HOG features extracted from images with and without faces. The resulting set will be used to train a binary classification model.

Face Images (Positive class)

You will use the "Labeled Faces in the Wild" dataset ("LFW") for the images containing faces as provided by scikit-learn. The cell below loads the dataset, extracts the facial images, and displays a selection of these images for visualization.

Info

When first loading the dataset, please be aware that the code may require a few minutes to finish execution.

faces = fetch_lfw_people()

face_images = faces.images
print(face_images.shape) # 13233 face images to use for training

# Visualize Positive Set (face images)

fig, ax = plt.subplots(4,6)
for i, axi in enumerate(ax.flat):
    axi.imshow(face_images[500 * i], cmap='gray')
    axi.axis('off')
(13233, 62, 47)

Non-face Images (Negative class)

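For the non-face images, other images from the scikit-image library are used. The cell below loads these images and converts any color images to grayscale. The PatchExtractor class from scikit-learn then samples random patches from each image at several scales, which augments the negative set.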

# negative data samples
imgs_to_use = ['camera', 'text', 'coins', 'moon',
               'page', 'clock', 'immunohistochemistry',
               'chelsea', 'coffee', 'hubble_deep_field']
# images = []

images = [data.camera(), data.coins(),data.text(),data.moon(),data.page(),data.clock(),data.coffee(),data.hubble_deep_field()]
img = []
for im in images:
    if len(im.shape)==3:
        img.append(rgb2gray(im))
    else:
        img.append(im)

### Make patches of the different negative samples to generate a larger dataset

def extract_patches(img, N, scale=1.0, patch_size=face_images[0].shape):
    extracted_patch_size = tuple((scale * np.array(patch_size)).astype(int))
    extractor = PatchExtractor(patch_size=extracted_patch_size,
                               max_patches=N, random_state=0)
    patches = extractor.transform(img[np.newaxis])
    if scale != 1:
        patches = np.array([transform.resize(patch, patch_size)
                            for patch in patches])
    return patches

non_face_images = np.vstack([extract_patches(im, 1000, scale)
                              for im in img for scale in [0.5,1.0,2.0]]) # [0.5, 1.0, 2.0]
print(non_face_images.shape)
(24000, 62, 47)

The cell below visualizes the newly created non-face image patches:

# Visualize Negative Set

fig, ax = plt.subplots(4,6)
for i, axi in enumerate(ax.flat):
    axi.imshow(non_face_images[600 * i], cmap='gray')
    axi.axis('off')

The cell below combines the face and non-face images, then extracts the HOG features from the combined set. It also creates the target labels (1 for faces, 0 for non-faces) for the newly assembled training dataset.

X_train = np.array([hog(im, orientations=8, pixels_per_cell=(8, 8), cells_per_block=(2, 2), visualize=False)
                    for im in chain(face_images, non_face_images)])


Y_train = np.zeros(X_train.shape[0])
Y_train[:face_images.shape[0]] = 1
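
As a sanity check on the shape of X_train, the length of one HOG descriptor can be worked out by hand. The sketch below assumes the usual HOG layout (only complete cells are counted, and blocks slide over the cell grid one cell at a time), which is our understanding of how scikit-image's hog behaves. The helper name hog_feature_length is hypothetical and not part of any library.

```python
def hog_feature_length(image_shape, orientations=8,
                       pixels_per_cell=(8, 8), cells_per_block=(2, 2)):
    """Length of a flattened HOG descriptor for a single-channel image."""
    rows, cols = image_shape
    # Only complete cells are counted; leftover border pixels are ignored.
    cells_r = rows // pixels_per_cell[0]
    cells_c = cols // pixels_per_cell[1]
    # Blocks slide over the cell grid one cell at a time.
    blocks_r = cells_r - cells_per_block[0] + 1
    blocks_c = cells_c - cells_per_block[1] + 1
    return (blocks_r * blocks_c
            * cells_per_block[0] * cells_per_block[1] * orientations)


# 62x47 patch: 7x5 cells -> 6x4 blocks, each with 2*2 cells of 8 bins.
print(hog_feature_length((62, 47)))  # 6 * 4 * 2 * 2 * 8 = 768
```

Under these assumptions each 62x47 image yields 768 features, so X_train should have one row per image and 768 columns.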

Training a classifier

In the following task you will train a linear classifier on the HOG features.

Task 1: Implement a classifier

The cell below contains the classification model. Your task is to:

  1. Fit the model to the training data.

  2. Calculate and print the accuracy of the classifier on the training data.

  3. Construct and plot a confusion matrix of the model predictions on the training set.

  4. Based on the accuracy and the confusion matrix, discuss how effective the model is.

model = LinearSVC(C=1.0, class_weight=None, dual=False, fit_intercept=True,
           intercept_scaling=1, loss='squared_hinge', max_iter=1000,
           multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
           verbose=0)

# implement model from scikit-learn
0.9962130368221739
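
To make the accuracy and confusion matrix concrete, here is a minimal sketch on toy labels that computes both by hand with NumPy. The arrays y_true and y_pred are made-up stand-ins for Y_train and the model's predictions, and the helper name binary_confusion_matrix is hypothetical. The layout (rows are true labels, columns are predicted labels) follows the convention of sklearn.metrics.confusion_matrix, which is already imported above.

```python
import numpy as np


def binary_confusion_matrix(y_true, y_pred):
    """2x2 count matrix: rows are true labels (0, 1), columns are predictions."""
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    cm = np.zeros((2, 2), dtype=int)
    # Add one count at position (true label, predicted label) for every sample.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


# Toy labels (assumed for illustration only).
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 1])

cm = binary_confusion_matrix(y_true, y_pred)
# Accuracy is the share of samples on the diagonal (correct predictions).
accuracy = np.trace(cm) / cm.sum()
print(cm)
print(accuracy)
```

The off-diagonal entries separate the two kinds of error: non-faces predicted as faces (top right) and faces predicted as non-faces (bottom left). A single accuracy number hides this split.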

Testing on new images

In this section you will evaluate the trained model on a new image. You will implement a function that slides over the image and breaks it down into smaller patches. Then you will extract the HOG features from these patches and classify each patch as either a face (1) or a non-face (0).

# Load a test image
test_img = data.astronaut()
test_img = rgb2gray(test_img)
test_img = transform.rescale(test_img, 0.5)
test_img = test_img[:120, 60:160]


plt.imshow(test_img, cmap='gray')
plt.axis('off')
print(test_img.shape)
(120, 100)
Task 2: Sliding window
  1. Implement the function sliding_window that takes the test image and extracts image patches from it.

    • The function should iteratively slide over the entire image and extract image patches of the same size as the inputs the classifier was trained on.
    • The function should return the coordinates of the upper-left corner of each patch, together with the patches themselves.
# Sliding window function: goes over the image patch by patch
# and returns the upper-left corner and the pixels of each patch.
# Choose values for the step sizes istep (rows) and jstep (columns).

def sliding_window(img, patch_size=face_images[0].shape, istep=, jstep=):
    indices = []
    patches = []
    ...
    return indices, patches

# Write your code iterating over the test image here.
# The easiest approach is to use a for loop for each image dimension.
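
The windowing idea described in Task 2 can be illustrated on a toy array. The sketch below is one possible approach, not the reference solution: the generator name iter_windows, the toy array, the patch size, and the step sizes are all assumptions made for illustration.

```python
import numpy as np


def iter_windows(img, patch_size, istep, jstep):
    """Yield ((i, j), patch) for every window that fits fully inside img."""
    Ni, Nj = patch_size
    # Stop early enough that the patch never runs past the image border.
    for i in range(0, img.shape[0] - Ni + 1, istep):
        for j in range(0, img.shape[1] - Nj + 1, jstep):
            yield (i, j), img[i:i + Ni, j:j + Nj]


# Toy 6x5 "image" with 3x2 patches and a step of 2 in both directions.
toy = np.arange(30).reshape(6, 5)
windows = list(iter_windows(toy, patch_size=(3, 2), istep=2, jstep=2))
print(len(windows))   # number of extracted patches
print(windows[-1][0])  # upper-left corner of the last patch
```

Smaller steps give more overlapping patches and finer localization, at the cost of more HOG extractions and classifier calls.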
Task 3: Classifying the test image

In the following task you have to:

  1. Extract the HOG features for all patches from the test image (test_img) by making use of the sliding_window and hog functions. Store the features in an array called hog_patches and the corresponding indices in an array called indices.

  2. Use the model to classify the extracted HOG patches and store the results in an array called labels.

  3. Use the existing code to visualize the detections.

# Apply sliding window function to test_img

# Write your code here


# Visualize the detections
fig, ax = plt.subplots()
ax.imshow(test_img, cmap='gray')
ax.axis('off')

Ni, Nj = face_images[0].shape
indices = np.array(indices)

for i, j in indices[labels == 1]:
    ax.add_patch(plt.Rectangle((j, i), Nj, Ni, edgecolor='red', alpha=0.3, lw=2, facecolor='none'))

Reflections

Task 4: Reflections
  • Summarize the overall process of face detection implemented in this exercise.
  • There are multiple overlapping boxes of the same detected face in the test image. How could you ensure that a face is only detected once?
  • How could the face detection pipeline be improved? Consider the following steps: training data construction, model fitting, test images.
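
For the question about overlapping boxes, one commonly used technique is non-maximum suppression (NMS): keep the highest-scoring box and discard boxes that overlap it heavily. The sketch below is a minimal pure-Python version under stated assumptions. The boxes and scores are made up, the (i, j, height, width) box format and the threshold of 0.3 are illustrative choices, and in this exercise the scores could plausibly come from model.decision_function. It is not the only way to answer the question.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (i, j, height, width)."""
    top, left = max(a[0], b[0]), max(a[1], b[1])
    bottom = min(a[0] + a[2], b[0] + b[2])
    right = min(a[1] + a[3], b[1] + b[3])
    inter = max(0, bottom - top) * max(0, right - left)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union


def non_max_suppression(boxes, scores, iou_threshold=0.3):
    """Return indices of boxes kept, visiting boxes from best to worst score."""
    order = sorted(range(len(boxes)), key=lambda k: scores[k], reverse=True)
    keep = []
    for k in order:
        # Keep a box only if it does not overlap much with any kept box.
        if all(iou(boxes[k], boxes[m]) <= iou_threshold for m in keep):
            keep.append(k)
    return keep


# Three heavily overlapping detections and one detection further away.
boxes = [(10, 20, 62, 47), (12, 22, 62, 47), (14, 20, 62, 47), (50, 50, 62, 47)]
scores = [0.9, 1.4, 0.7, 0.2]
print(non_max_suppression(boxes, scores))
```

The three overlapping boxes collapse to the single best-scoring one, while the distant box survives because its overlap with the kept box is below the threshold.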