NumPy and Images

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

In this discussion we'll talk about the operations we can do on 2D arrays in NumPy. We'll be working with the cat image from lecture (mostly because it's familiar, but also because it's cute):

In [2]:
img=mpimg.imread('kitty-cat.jpg')
plt.imshow(img)
plt.show()

In lecture we learned how to plot an image based on how green it is. It looked something like this:

In [3]:
plt.imshow(img[:,:,1], cmap='gray')
plt.show()

What if we instead wanted to plot the image in green (so we get the same cat image, but only in shades of green)? We can do that like this:

In [4]:
new_img = img.copy()
new_img[:,:,0] = 0
new_img[:,:,2] = 0
plt.imshow(new_img)
plt.show()

or by doing this:

In [5]:
plt.imshow(img*[0, 1, 0])
plt.show()

Question: Why do these work?

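Answer: Broadcasting. Both of these set the red and blue values to 0. In the first, the scalar 0 has no rows or columns of its own, so it is automatically broadcast to every row and column of the selected channel. In the second, the length-3 list [0, 1, 0] lines up with the last (color) axis of img and is broadcast across every row and column, so each pixel's red and blue values are multiplied by 0 and its green value by 1.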
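To see the broadcasting at a size we can read by eye, here is a minimal sketch on a made-up 2x2 image (the array `tiny` and its values are an assumption for illustration, not the cat image). It checks that the two approaches give the same pixel values.

```python
import numpy as np

# A tiny stand-in image: 2 rows, 2 columns, 3 color channels (uint8, like a JPEG).
tiny = np.array([[[10, 20, 30], [40, 50, 60]],
                 [[70, 80, 90], [100, 110, 120]]], dtype=np.uint8)

# Approach 1: assign a scalar to a whole channel; the scalar is broadcast to shape (2, 2).
zeroed = tiny.copy()
zeroed[:, :, 0] = 0
zeroed[:, :, 2] = 0

# Approach 2: multiply by a length-3 vector; it is broadcast across every row and column.
scaled = tiny * np.array([0, 1, 0])

print(scaled.shape)                    # (2, 2, 3)
print(np.array_equal(zeroed, scaled))  # True
```

Note that the second approach returns a new array and leaves the original untouched, so no copy is needed there.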

Let's work through some exercises again. Each of these exercises changes the final picture by manipulating the NumPy array img. For each problem, ask yourself:

  1. "What do I need to do to the array?"
  2. "How do I do that to the array?"

Remember to copy the array first if you do any operations that would change it.

Exercise 1: Flip the image horizontally (mirror the image).

.

.

.

.

.

.

.

.

Solution: We need to flip the array horizontally, which means putting the columns in reverse order. We can do this with img[:,-1::-1,:] (or, equivalently, img[:,::-1,:]).

In [6]:
plt.imshow(img[:,-1::-1,:])
plt.show()
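As a quick sanity check of the column reversal, the sketch below uses a small made-up array (`tiny` is an assumption for illustration) and compares the slice against `np.fliplr`, which reverses the second axis.

```python
import numpy as np

# 2 rows, 3 columns, 3 channels, filled with distinct values so columns are easy to tell apart.
tiny = np.arange(2 * 3 * 3).reshape(2, 3, 3)

flipped = tiny[:, ::-1, :]  # shorthand for tiny[:, -1::-1, :]

print(np.array_equal(flipped, tiny[:, -1::-1, :]))  # True
print(np.array_equal(flipped, np.fliplr(tiny)))     # True

# The first column of the result is the last column of the original.
print(np.array_equal(flipped[:, 0, :], tiny[:, -1, :]))  # True
```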

Question: What does img[-1::-1,:,:] do? What does img[:,:,-1::-1] do?

In [7]:
plt.imshow(img[-1::-1,:,:])
plt.show()

In [8]:
plt.imshow(img[:,:,-1::-1])
plt.show()
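The sketch below, on a made-up array (`tiny` is an assumption for illustration), checks what reversing each axis does and how writing a stop of 0 differs from leaving the stop empty: a stop of 0 excludes index 0, so one row (or one color channel) is dropped.

```python
import numpy as np

tiny = np.arange(2 * 3 * 3).reshape(2, 3, 3)  # 2 rows, 3 columns, 3 channels

upside_down = tiny[-1::-1, :, :]        # reverses the rows: a vertical flip
channels_reversed = tiny[:, :, -1::-1]  # reverses the channels: RGB becomes BGR

print(np.array_equal(upside_down, np.flipud(tiny)))  # True
print(channels_reversed[0, 0])                       # [2 1 0]

# With a stop of 0, index 0 is excluded, so the result is one element shorter.
print(tiny[-1:0:-1, :, :].shape)  # (1, 3, 3)
print(tiny[:, :, -1:0:-1].shape)  # (2, 3, 2)
```

An array with only two color channels is not a valid RGB image, so the last slice could not be displayed as one.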

Exercise 2: Invert the colors of the bottom half of the image.

.

.

.

.

.

.

.

.

Solution: To invert the colors, we subtract each color value from 255. The bottom half of the image is the half of the rows with the highest indices (row 0 is at the top, as the vertical flip above shows). So we select these rows with a slice and subtract their values from 255.

In [9]:
new_img = img.copy()
num_rows = img.shape[0]
new_img[num_rows//2:,:,:] = 255-new_img[num_rows//2:,:,:]
plt.imshow(new_img)
plt.show()
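Here is the same inversion on a made-up 4-row, 1-column image (`tiny` and its values are an assumption for illustration), small enough to check the arithmetic by hand.

```python
import numpy as np

# 4 rows, 1 column, 3 channels.
tiny = np.array([[[0, 128, 255]],
                 [[10, 20, 30]],
                 [[200, 100, 50]],
                 [[255, 255, 255]]], dtype=np.uint8)

inverted = tiny.copy()
half = tiny.shape[0] // 2
# Only rows 2 and 3 (the bottom half) are replaced by 255 minus their value.
inverted[half:, :, :] = 255 - inverted[half:, :, :]

print(inverted[:half, 0].tolist())  # [[0, 128, 255], [10, 20, 30]]  (unchanged)
print(inverted[half:, 0].tolist())  # [[55, 155, 205], [0, 0, 0]]
```

Because every value is at most 255, the subtraction never goes negative, so it is safe to do directly on the uint8 data.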

Question: How can we invert the left half? What if we wanted one of the diagonal halves?

In [10]:
new_img = img.copy()
num_cols = img.shape[1]
new_img[:,:num_cols//2,:] = 255-new_img[:,:num_cols//2,:]
plt.imshow(new_img)
plt.show()

In [11]:
new_img = img.copy()
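# x is a column of row indices and y is a row of column indices; together they broadcast to a full grid
x,y = np.ogrid[0:img.shape[0], 0:img.shape[1]]
# True for pixels on the bottom-right side of the diagonal running from bottom-left to top-right
to_invert = img.shape[1]*x + img.shape[0]*y > img.shape[0]*img.shape[1]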
new_img[to_invert] = 255-new_img[to_invert]
plt.imshow(new_img)
plt.show()
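To see what the diagonal mask looks like, the sketch below builds it for a made-up 4x4 grid (the sizes and the placeholder array `picture` are assumptions for illustration) and prints it as 0s and 1s.

```python
import numpy as np

rows, cols = 4, 4
x, y = np.ogrid[0:rows, 0:cols]  # x has shape (4, 1), y has shape (1, 4)

# Comparing the broadcast sum against rows*cols gives a full (4, 4) boolean mask.
mask = cols * x + rows * y > rows * cols
print(mask.astype(int))
# [[0 0 0 0]
#  [0 0 0 0]
#  [0 0 0 1]
#  [0 0 1 1]]

# Indexing a (rows, cols, 3) array with a (rows, cols) mask selects whole pixels.
picture = np.zeros((rows, cols, 3), dtype=np.uint8)
print(picture[mask].shape)  # (3, 3): three selected pixels, three channels each
```

On a grid this small the selected corner is well short of half the pixels, because the dividing line runs just outside the last row and column; on a full-size image the difference is negligible.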

Exercise 3: Delete all the pixels where the green value is more than the red value and the blue value. (Delete means to set it to white or black, your choice.)

.

.

.

.

.

.

.

.

Solution: We want a mask of the pixels where the green value is greater than both the red value and the blue value. We can compute the two comparisons separately and then combine them with np.logical_and(). Then we use this mask to set the values of these pixels to 255 (or 0).

In [12]:
new_img = img.copy()
to_delete = new_img[:,:,1] > new_img[:,:,0]
to_delete = np.logical_and(to_delete,new_img[:,:,1] > new_img[:,:,2])
new_img[to_delete] = 255
plt.imshow(new_img)
plt.show()
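The sketch below applies the same masking idea to four made-up pixels (`pixels` and its values are an assumption for illustration). It also shows that the `&` operator on boolean arrays gives the same mask as `np.logical_and()`.

```python
import numpy as np

pixels = np.array([[[10, 200, 30], [200, 100, 50]],
                   [[90, 95, 99], [0, 1, 0]]], dtype=np.uint8)
r, g, b = pixels[:, :, 0], pixels[:, :, 1], pixels[:, :, 2]

mask_fn = np.logical_and(g > r, g > b)
mask_op = (g > r) & (g > b)  # parentheses are needed: & binds tighter than >

print(mask_fn.tolist())                  # [[True, False], [False, True]]
print(np.array_equal(mask_fn, mask_op))  # True

result = pixels.copy()
result[mask_fn] = 255  # sets all three channels of each selected pixel to white
print(result[0, 0].tolist(), result[0, 1].tolist())  # [255, 255, 255] [200, 100, 50]
```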

Question: Suppose we wanted to delete only the grass. This is a pretty hard task, but how can we modify this solution to get close to it?

In [13]:
new_img = img.copy()
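# img is uint8, so adding 10 to a value above 245 would wrap around; compare as signed integers instead
channels = new_img.astype(int)
to_delete = channels[:,:,1] > channels[:,:,0] + 10
to_delete = np.logical_and(to_delete,channels[:,:,1] > channels[:,:,2] + 10)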
new_img[to_delete] = 255
plt.imshow(new_img)
plt.show()
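Since the margin of 10 is added to 8-bit color values, it is worth checking what NumPy does when a sum goes above 255. The sketch below uses made-up red and green values (an assumption for illustration) to show that uint8 arithmetic wraps around silently, and that converting to a wider integer type first avoids it.

```python
import numpy as np

red = np.array([250, 100], dtype=np.uint8)
green = np.array([200, 120], dtype=np.uint8)

wrapped = red + 10           # stays uint8: 250 + 10 wraps around to 4
safe = red.astype(int) + 10  # wider integer type: 250 + 10 is 260

print(wrapped.tolist())  # [4, 110]
print(safe.tolist())     # [260, 110]

# The first pixel is much more red than green, but the wrapped comparison says otherwise.
print((green > wrapped).tolist())  # [True, True]
print((green > safe).tolist())     # [False, True]
```

In an image, this mostly affects very bright pixels, where the red or blue value is above 245.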

Exercise 4: Have the picture fade to black at the bottom.

(Hint: black is the RGB value [0, 0, 0])

.

.

.

.

.

.

.

.

Solution: We want the pixel values to drop to 0 as we move down each column. We can do that by multiplying each column by a vector that decreases from 1 to 0. Using broadcasting, we can do this for every column and color at once.

In [14]:
column_size = img.shape[0]
fade = np.linspace(1,0,column_size)
fade = fade.reshape(column_size,1,1)
# Here we divide by 255 because plt.imshow() expects floats to lie between 0 and 1 (with 1
# being the max), while integers are expected to lie between 0 and 255. If you didn't know
# this or you forget, PyPlot will warn you if you use a float above 1.
plt.imshow(img*fade/255)
plt.show()
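The reshape to (column_size, 1, 1) is what lets the fade vector line up with the rows. A minimal sketch on a made-up all-white image with 3 rows (`tiny` is an assumption for illustration) shows the shapes involved and the resulting values.

```python
import numpy as np

tiny = np.full((3, 2, 3), 255, dtype=np.uint8)  # 3 rows, 2 columns, all white

# One weight per row, shaped (3, 1, 1) so it broadcasts across columns and colors.
fade = np.linspace(1, 0, tiny.shape[0]).reshape(-1, 1, 1)
faded = tiny * fade / 255  # float array with values between 0 and 1

print(fade.shape, faded.shape)  # (3, 1, 1) (3, 2, 3)
print(faded[:, 0, 0].tolist())  # [1.0, 0.5, 0.0]: full brightness at the top, black at the bottom
```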

Question: How could we fade to another color?

In [15]:
column_size = img.shape[0]
fade = np.linspace(1,0,column_size)
fade = fade.reshape(column_size,1,1)

reverse_fade = np.linspace(0,1,column_size)
reverse_fade = reverse_fade.reshape(column_size,1,1)

# Adding reverse_fade (times white, which is 1 in every channel) fades to white. The sum can
# exceed 1 by a rounding error, so we clip it to the valid float range for imshow.
plt.imshow(np.clip(img*fade/255 + reverse_fade, 0, 1))
plt.show()
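The same idea extends to any target color: weight the image by the fade and the color by one minus the fade. The helper below is a hypothetical sketch (`fade_to_color` is not from lecture or from any library), tested on a made-up all-black image.

```python
import numpy as np

def fade_to_color(image, color):
    """Hypothetical helper: blend a uint8 RGB image toward `color` from top to bottom.

    `color` is an RGB triple of floats between 0 and 1. Returns floats between 0 and 1.
    """
    height = image.shape[0]
    weight = np.linspace(1, 0, height).reshape(height, 1, 1)  # 1 at the top, 0 at the bottom
    blended = image / 255 * weight + np.asarray(color, dtype=float) * (1 - weight)
    return np.clip(blended, 0, 1)  # guard against rounding slightly outside [0, 1]

tiny = np.zeros((3, 2, 3), dtype=np.uint8)  # 3 rows, 2 columns, all black
result = fade_to_color(tiny, [1.0, 0.0, 0.0])  # fade to red

print(result[0, 0].tolist())   # [0.0, 0.0, 0.0]: top row is the original image
print(result[1, 0].tolist())   # [0.5, 0.0, 0.0]: halfway
print(result[-1, 0].tolist())  # [1.0, 0.0, 0.0]: bottom row is pure red
```

The result could be passed to plt.imshow() directly, since it is already a float array in the range imshow expects.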