Task 1: Loading and inspecting the data
- Run the code cell
- Change the code to display sequences 4, 5 and 7, and visually observe how these sequences vary.
In this in-class exercise you will be guided through the steps necessary for implementing a PCA on a sequence of human poses. You will work with the poses data, which was used for the exercises in week 6. The dataset has a shape of $(1403, 100, 25*2)$. This means that there are 1403 pose sequences. Each sequence is a 100-frame time series capturing human poses. Each pose consists of 25 skeletal joints, where each joint is given by an x and a y coordinate ($25*2$). For this exercise, you will use a single pose sequence of 100 frames and apply dimension reduction to the selected sequence.
The following cells load the necessary libraries and the dataset, and provide functions for plotting the poses:
import numpy as np
import matplotlib.pyplot as plt
import warnings
import seaborn as sns
# Suppress all warnings
warnings.filterwarnings("ignore")
def limb_number_plot(s_pose_x,s_pose_y,n1,n2,c="red",label=None):
    if label is not None:
        if (s_pose_x[n1]>0) and (s_pose_x[n2]>0) and (s_pose_y[n1]>0) and (s_pose_y[n2]>0):
            plt.plot([s_pose_x[n1],s_pose_x[n2]], [s_pose_y[n1], s_pose_y[n2]],color = c, linestyle="-",label=label)
    else:
        if (s_pose_x[n1]>0) and (s_pose_y[n1]>0):
            plt.plot(s_pose_x[n1], s_pose_y[n1],'*',color = c,label=label)
        if (s_pose_x[n2]>0) and (s_pose_y[n2]>0):
            plt.plot(s_pose_x[n2], s_pose_y[n2],'*',color = c,label=label)
        if (s_pose_x[n1]>0) and (s_pose_x[n2]>0) and (s_pose_y[n1]>0) and (s_pose_y[n2]>0):
            plt.plot([s_pose_x[n1],s_pose_x[n2]], [s_pose_y[n1], s_pose_y[n2]],color = c, linestyle="-")
def plot_single_pose(s_pose,c = "darkgreen",label=None,ds='body_25',c_head = 'red',head = True):
    s_pose_x=s_pose[::2]
    s_pose_y=s_pose[1::2]
    #torso/body
    limb_number_plot(s_pose_x,s_pose_y,2,5,c)
    if label is not None:
        limb_number_plot(s_pose_x,s_pose_y,9,12,c,label)
    else:
        limb_number_plot(s_pose_x,s_pose_y,9,12,c)
    limb_number_plot(s_pose_x,s_pose_y,2,9,c)
    limb_number_plot(s_pose_x,s_pose_y,5,12,c)
    #left arm (person facing away)
    limb_number_plot(s_pose_x,s_pose_y,2,3,c)
    limb_number_plot(s_pose_x,s_pose_y,3,4,c)
    #right arm
    limb_number_plot(s_pose_x,s_pose_y,5,6,c)
    limb_number_plot(s_pose_x,s_pose_y,6,7,c)
    #left leg / foot
    limb_number_plot(s_pose_x,s_pose_y,9,10,c)
    limb_number_plot(s_pose_x,s_pose_y,10,11,c)
    limb_number_plot(s_pose_x,s_pose_y,11,22,c)
    #right leg / foot
    limb_number_plot(s_pose_x,s_pose_y,12,13,c)
    limb_number_plot(s_pose_x,s_pose_y,13,14,c)
    limb_number_plot(s_pose_x,s_pose_y,14,19,c)
    # head
    if head:
        limb_number_plot(s_pose_x,s_pose_y,0,15,c)
        limb_number_plot(s_pose_x,s_pose_y,0,16,c)
        limb_number_plot(s_pose_x,s_pose_y,15,17,c)
        limb_number_plot(s_pose_x,s_pose_y,16,18,c)
    return True
def plot_single_sequence(poses, pose_name='Poses',color='blue'):
    """
    Plots a single sequence of skeleton joints.
    Parameters:
    poses (array-like): Skeleton sequence data, shape (T,D).
    pose_name (string, optional): subtitle of each skeleton body in the sequence.
    color (string, optional): color of skeleton bodies.
    """
    # Note: on Matplotlib >= 3.6 this style is named 'seaborn-v0_8'
    plt.style.use('seaborn')
    plt.figure(figsize=(25,5))
    plt.title('Ground truth')
    # The 5 x 10 subplot grid holds at most 50 poses
    for i in range(len(poses)):
        plt.subplot(5, 10, i + 1)
        plot_single_pose(poses[i], c=color, head=True)
        plt.ylim(1, 0)
        plt.xlim(-1, 1)
        plt.title(pose_name + str(i))
        plt.axis('off')
    plt.show()
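The cell below loads the data, flattens the joint coordinates of each frame into a single vector, selects pose sequence 19 and plots its first 50 frames: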
data = np.load('poses_norm.npy')
print(data.shape)
N,T,D,C = data.shape
reshaped_data = data.reshape(N,T,D*C)
dataset = reshaped_data[19]
# Define the new shape: the first 50 frames, with 50 coordinates each
new_shape = (50, 50)
# Reshape the array to the new shape
reshaped_data2 = np.empty(new_shape) # Create an empty array with the new shape
reshaped_data2[:] = dataset[:new_shape[0], :]
plot_single_sequence(reshaped_data2,pose_name='Pose',color='blue')
(1403, 100, 25, 2)
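The reshape from `(N, T, 25, 2)` to `(N, T, 50)` determines how the columns of each frame are laid out. The sketch below illustrates this on a small synthetic array; `toy_data` and its size of 3 sequences are assumptions made for illustration and stand in for `poses_norm.npy`.

```python
import numpy as np

# Synthetic stand-in for the pose data (assumption: 3 sequences instead of 1403)
rng = np.random.default_rng(0)
toy_data = rng.random((3, 100, 25, 2))

N, T, D, C = toy_data.shape
flat = toy_data.reshape(N, T, D * C)   # shape (3, 100, 50)
sequence = flat[1]                      # one sequence, shape (100, 50)

# The row-major reshape interleaves the coordinates:
# column 2*j holds x of joint j, column 2*j + 1 holds y of joint j
assert np.array_equal(sequence[:, 0::2], toy_data[1, :, :, 0])
assert np.array_equal(sequence[:, 1::2], toy_data[1, :, :, 1])
print(flat.shape, sequence.shape)
```

This is why the plotting functions above read the x coordinates with `s_pose[::2]` and the y coordinates with `s_pose[1::2]`.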
The following tasks construct and inspect the covariance matrix for the chosen pose sequence.
# Calculate the covariance matrix for the entire dataset
cov_matrix = np.cov(dataset, rowvar=False)
# Plotting
sns.heatmap(cov_matrix, cmap='coolwarm')
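Currently, each row of the dataset is one frame, and within a frame the x and y coordinates of the joints are interleaved (x of joint 0, y of joint 0, x of joint 1, ...). The cell below rearranges the columns so that all x coordinates come first, followed by all y coordinates.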
# Get the number of rows and columns in the dataset
num_rows, num_columns = dataset.shape
# Separate even and odd columns
even_indexes = np.arange(0, num_columns, 2) # Even indexes (0, 2, 4, ...)
odd_indexes = np.arange(1, num_columns, 2) # Odd indexes (1, 3, 5, ...)
# Rearrange the dataset
rearranged_dataset = dataset[:, np.concatenate((even_indexes, odd_indexes))]
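To see what this column reordering does, here is a minimal sketch on a toy array with 2 frames and 3 joints. The array `toy` is made up for illustration, with x values in 0..5 and y values in 10..15 so that the two groups are easy to tell apart.

```python
import numpy as np

# Toy frames laid out as x0, y0, x1, y1, x2, y2 (values are illustrative only)
toy = np.array([[0, 10, 1, 11, 2, 12],
                [3, 13, 4, 14, 5, 15]])

even = np.arange(0, toy.shape[1], 2)   # positions of the x coordinates
odd = np.arange(1, toy.shape[1], 2)    # positions of the y coordinates
grouped = toy[:, np.concatenate((even, odd))]

print(grouped)
# [[ 0  1  2 10 11 12]
#  [ 3  4  5 13 14 15]]
```

After the reordering, the covariance matrix splits into four blocks (x with x, x with y, y with x, y with y), which tends to make the heatmap easier to read.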
Use the rearranged_dataset to:
where $\mathbf{x}_i$ represents the $i$-th coordinate in the dataset and $\boldsymbol{\bar{x}}$ is the mean vector obtained by averaging the coordinates for each joint $\boldsymbol{\bar{x}} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{x}_i$
To center the data first calculate the mean vector, then subtract it from each data point of the pose sequence.
# write your solution here
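One possible way to check a hand-written centering and covariance computation is to compare it against `np.cov`. The sketch below does this on synthetic data; the array `X` is a stand-in for the pose sequence, and the division by $N-1$ is an assumption chosen to match the default of `np.cov`, which may differ from the normalisation intended in the exercise.

```python
import numpy as np

# Synthetic stand-in for one pose sequence: 100 frames, 50 coordinates
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))

# Mean vector: one mean per coordinate, averaged over the frames
mean_vec = X.mean(axis=0)

# Centering: subtract the mean vector from every frame (broadcast over rows)
X_centered = X - mean_vec

# Covariance from the centered data (assumption: N - 1 normalisation)
cov_manual = X_centered.T @ X_centered / (X.shape[0] - 1)

# Should agree with NumPy's estimator up to floating-point error
print(np.allclose(cov_manual, np.cov(X, rowvar=False)))
```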
Through the following steps you will implement the eigendecomposition and inspect crucial properties of the covariance matrix.
eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)
# write your solution here
Notice that the values may be slightly imprecise due to the finite precision of numerical representations. You can use np.isclose to check whether two values are close to each other or not.
The sum of all eigenvalues should equal (approximately) the total variance in the original data.
# Write your solution here
- Are all eigenvalues greater than or equal to 0: True
- Orthogonal eigenvectors: True
- Sum of eigenvalues: 0.56
- Total variance: 0.56
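The three properties listed above can be checked numerically. The following is a sketch on a small synthetic covariance matrix, not the exercise solution; the data `X`, the tolerance `1e-12` and the variable names are assumptions made for illustration.

```python
import numpy as np

# Synthetic data: 100 samples, 6 variables (illustrative stand-in)
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 6))
cov = np.cov(X, rowvar=False)

eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 1. A covariance matrix is positive semi-definite, so no eigenvalue
#    should be negative (a small tolerance absorbs rounding error)
nonneg = bool(np.all(eigenvalues >= -1e-12))

# 2. Eigenvectors of a symmetric matrix are orthonormal: V^T V = I
orthonormal = bool(np.allclose(eigenvectors.T @ eigenvectors, np.eye(cov.shape[0])))

# 3. The eigenvalues sum to the trace, i.e. the total variance
total_variance = np.trace(cov)
sum_match = bool(np.isclose(eigenvalues.sum(), total_variance))

print(nonneg, orthonormal, sum_match)
```

Using `np.allclose` and `np.isclose` rather than `==` avoids false negatives caused by floating-point rounding.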
# Write your solution here
The cell below calculates the cumulative explained variance ratio.
By using this cutoff point we want to retain 95% of the variation in the original data. Remember that the sum of the selected eigenvalues can be used as a measure of how much variance is retained.
cumulative_variance_ratio = np.cumsum(sorted_eigenvalues) / np.sum(sorted_eigenvalues)
# Write your solution here
9 Components keep 95% of the variance
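As a small worked example of the cutoff, the sketch below uses five made-up eigenvalues. Note that `np.linalg.eigh` returns eigenvalues in ascending order, so they have to be sorted in descending order before taking the cumulative sum. The values and the name `k` are illustrative assumptions.

```python
import numpy as np

# Made-up eigenvalues in arbitrary order (total variance = 4.0)
eigenvalues = np.array([0.1, 2.0, 0.4, 0.5, 1.0])

# Sort in descending order so the largest variance comes first
order = np.argsort(eigenvalues)[::-1]
sorted_eigenvalues = eigenvalues[order]          # 2.0, 1.0, 0.5, 0.4, 0.1

# Fraction of total variance kept by the first 1, 2, ... components
cumulative_variance_ratio = np.cumsum(sorted_eigenvalues) / np.sum(sorted_eigenvalues)

# Smallest number of components whose cumulative ratio reaches 95%
k = int(np.argmax(cumulative_variance_ratio >= 0.95)) + 1
print(k)  # 4, since (2.0 + 1.0 + 0.5 + 0.4) / 4.0 = 0.975
```

`np.argmax` on a boolean array returns the index of the first `True`, hence the `+ 1` to turn an index into a count.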
The following section describes how much each variable contributes to the selected principal components:
# Write your solution here
print(mixing_params.shape)
(50, 9)
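The printed shape `(50, 9)` suggests one column per selected component and one row per original variable. Assuming that `mixing_params` refers to the eigenvectors belonging to the largest eigenvalues, a sketch on synthetic data could look as follows; `toy_mixing`, `k = 2` and the data are hypothetical.

```python
import numpy as np

# Synthetic data: 100 samples, 6 variables (illustrative stand-in)
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 6))
cov = np.cov(X, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Indices of the eigenvalues from largest to smallest
order = np.argsort(eigenvalues)[::-1]

# Keep the eigenvectors (columns) of the k largest eigenvalues
k = 2
toy_mixing = eigenvectors[:, order[:k]]
print(toy_mixing.shape)  # (6, 2)

# Entry [j, i] is the weight of original variable j in component i;
# every column has unit length
print(np.linalg.norm(toy_mixing, axis=0))
```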
# Write your solution here
We can project the normalized data onto the selected principal components. This is done by taking the dot product of the data matrix with the eigenvector matrix, where each column represents a principal component. The following steps will implement this process.
Run the cell below to make sure that your data is centered. Use the centered data to:
Dot product.
# Calculate the mean vector
mean_vector = np.mean(dataset, axis=0)
# Subtract the mean from each data point
centered_data = dataset - mean_vector
# Write your solution here
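The projection can be sketched on synthetic data as a single matrix product. The layout below, with components along the rows, is an assumption chosen to match the `(100, 9)` shape printed for `projected_data.T` in this notebook; `X`, `W` and `k = 2` are illustrative.

```python
import numpy as np

# Synthetic, correlated data: 100 samples, 6 variables (illustrative stand-in)
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 6)) @ rng.normal(size=(6, 6))

X_centered = X - X.mean(axis=0)
cov = np.cov(X, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]

k = 2
W = eigenvectors[:, order[:k]]        # (6, k), one principal direction per column

# Project: each column of the result holds the k scores of one sample
projected = W.T @ X_centered.T        # (k, 100)
print(projected.T.shape)              # (100, 2)

# The scores are uncorrelated and their variances are the top-k eigenvalues
print(np.allclose(np.cov(projected), np.diag(eigenvalues[order[:k]])))
```

The last check explains why off-diagonal scatter plots of different components show no linear trend.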
# Create a scatter plot for each pair of components for 9 components
plt.figure(figsize=(15, 15))
for i in range(9):
    for j in range(9):
        plt.subplot(9, 9, (i * 9) + j + 1)
        plt.scatter(projected_data.T[:, i], projected_data.T[:, j], marker=".")
        plt.xlabel(f'PC {i + 1}')
        plt.ylabel(f'PC {j + 1}')
        plt.title(f'PC {i + 1} vs. PC {j + 1}')
        plt.xlim([-1.5, 1.5])
        plt.ylim([1.5, -1.5])
plt.tight_layout()
plt.show()
print(projected_data.T.shape)
(100, 9)
Dot product. Remember to add the mean!
# Write your solution here
print(reconstructed_data.shape)
(100, 50)
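Reconstruction maps the low-dimensional scores back to the original space and adds the mean again. The sketch below, on synthetic data with illustrative names, also checks a useful property: the squared reconstruction error, divided by $N-1$, equals the sum of the discarded eigenvalues.

```python
import numpy as np

# Synthetic, correlated data with a non-zero mean (illustrative stand-in)
rng = np.random.default_rng(4)
X = rng.normal(size=(100, 6)) @ rng.normal(size=(6, 6)) + 5.0

mean_vec = X.mean(axis=0)
X_centered = X - mean_vec
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(X, rowvar=False))
order = np.argsort(eigenvalues)[::-1]

k = 2
W = eigenvectors[:, order[:k]]                   # (6, k)
projected = W.T @ X_centered.T                   # (k, 100)

# Map back to the original space and add the mean that was removed
reconstructed = (W @ projected).T + mean_vec     # (100, 6)
print(reconstructed.shape)

# Variance lost by dropping components = sum of the discarded eigenvalues
lost = np.sum((X - reconstructed) ** 2) / (X.shape[0] - 1)
discarded = eigenvalues[order[k:]].sum()
print(np.isclose(lost, discarded))
```

Forgetting to add `mean_vec` would shift every reconstructed pose by the mean pose, which is the reason for the reminder above.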
# Write your solution here