Here are some hints to make your life a little more pleasant when dealing with images.

The function we've been using to load images is matplotlib.pyplot.imread. If you installed Spyder the way that I think most students did, you probably also have PIL, the Python Imaging Library, installed. If that's true, then imread will load most image formats (JPG, PNG, etc.). If not, you might only be able to read in PNG files.
One source of annoyance is the return type from imread. If you execute im = plt.imread('img.png') (and if the image exists, etc.), then you'll get one of these three things:

- im.shape == (N, M, 3) (where N is the height and M is the width)
- im.shape == (N, M, 4) (where again N is the height and M is the width; the new fourth "color" channel is actually an alpha channel, where alpha=0 means transparent and alpha=1 means opaque)
- im.shape == (N, M) (as usual, N is the height and M is the width; now each pixel is a single number, with black=0 and white=1)
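If you want code that handles all three cases, a small dispatch on the array's shape does the trick. This is just an illustrative sketch (the name describe_image is made up for this example):

```python
import numpy as np

def describe_image(im):
    # Hypothetical helper: classify an imread result by its shape.
    if im.ndim == 2:
        return "grayscale"           # shape (N, M)
    elif im.ndim == 3 and im.shape[2] == 3:
        return "RGB"                 # shape (N, M, 3)
    elif im.ndim == 3 and im.shape[2] == 4:
        return "RGBA"                # shape (N, M, 4)
    raise ValueError("Unexpected shape: %s" % (im.shape,))
```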
Well, actually that's kind of a lie. Sometimes imread will return the pixel data using uint8 (i.e., "unsigned integer, 8-bit") data instead of float32 (i.e., "floating point, 32-bit"). When it uses uint8, the pixels will be in the range 0 to 255 instead of 0 to 1.
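Rescaling uint8 data back to the 0-to-1 range is just a cast and a division; for example:

```python
import numpy as np

# uint8 pixels run from 0 to 255; dividing by 255.0 rescales them to 0..1.
px = np.array([0, 128, 255], dtype=np.uint8)
scaled = px.astype(np.float32) / 255.0  # -> [0.0, ~0.502, 1.0]
```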
For consistency, you might want to have a helper function or three to deal with all of this nonsense by converting to one of two forms: RGB or grayscale, always with floating-point data in the range 0 to 1. Here's what that might look like (or, download as a Python source file):
# Helper functions for image processing, for PIC 16, Spring 2017, UCLA
# Author: Andrew Krieger <akrieger@math.ucla.edu>
# Version: 0.1
# License: CC0 (intended for release into the public domain)
import numpy as np
def normalizeImage(im):
    """Renormalize an image read in by matplotlib.pyplot.imread.

    The returned image will have pixel data in the range [0,1], even if the input
    image used the range [0,255] instead.

    NOT FULLY TESTED; USE AT YOUR OWN RISK!
    """
    if im.dtype == np.uint8:
        # 8-bit integer pixel data; convert to float and rescale
        return im.astype(np.float32) / 255.0
    elif im.dtype == np.float32 or im.dtype == np.float64:
        # Floating-point data; make a copy for consistency
        return im.copy()
    else:
        raise ValueError("Unrecognized data type: %s" % im.dtype)
def standardizeImageRGB(im):
    """Standardize an image that was read in by matplotlib.pyplot.imread.

    Input: a numpy array, as returned by imread.
    Output: a copy of im, returned as an RGB image with data in the range 0 to 1.

    NOT FULLY TESTED; USE AT YOUR OWN RISK!
    """
    im = normalizeImage(im)
    if len(im.shape) == 2:
        # Grayscale image. Convert to RGB in an incorrect-but-simple way.
        # For a better way, see:
        # https://en.wikipedia.org/wiki/Grayscale#Colorimetric_.28luminance-preserving.29_conversion_to_grayscale
        # This code re-interprets the original im array (of shape (N,M))
        # as an array of shape (N,M,1), then multiplies by an array of shape
        # (1,1,3) where all the elements are equal to 1.
        # Afterwards, new_im[y, x, c] == old_im[y, x] * 1, for c=0,1,2
        N, M = im.shape
        return im.reshape((N, M, 1)) * np.ones((1, 1, 3))
    elif len(im.shape) == 3 and (im.shape[2] == 3 or im.shape[2] == 4):
        # Strip off the alpha channel, if there is one
        return im[:, :, :3]
    else:
        raise ValueError("Unrecognized array format: shape == %s" % (im.shape,))
def standardizeImageGrayscale(im):
    """Standardize an image that was read in by matplotlib.pyplot.imread.

    Input: a numpy array, as returned by imread.
    Output: a copy of im, returned as a grayscale image with data in the range 0 to 1.

    NOT FULLY TESTED; USE AT YOUR OWN RISK!
    """
    im = normalizeImage(im)
    if len(im.shape) == 2:
        return im
    elif len(im.shape) == 3 and (im.shape[2] == 3 or im.shape[2] == 4):
        # See the link in the body of standardizeImageRGB to see why this may
        # not be a good way to convert to grayscale.
        # The slice below omits the alpha channel (if there is one).
        # The sum method sums along axis 2 (so it returns a two-dimensional
        # array, where sum_result[y, x] == sum(im[y, x, c] for c in range(3))).
        # Dividing by three gives a simple average of the RGB channels.
        return im[:, :, :3].sum(2) / 3.0
    else:
        raise ValueError("Unrecognized array format: shape == %s" % (im.shape,))
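To see what the helpers above actually do, here's the same pipeline applied inline to a tiny synthetic RGBA image, so you can check each intermediate step by hand:

```python
import numpy as np

# A 1x2 RGBA image with uint8 data, like imread might return for a PNG:
# one opaque red pixel and one fully transparent white pixel.
im = np.array([[[255, 0, 0, 255], [255, 255, 255, 0]]], dtype=np.uint8)

imf = im.astype(np.float32) / 255.0   # like normalizeImage: rescale to 0..1
rgb = imf[:, :, :3]                   # like standardizeImageRGB: drop alpha
gray = rgb.sum(2) / 3.0               # like standardizeImageGrayscale: average channels

# rgb.shape == (1, 2, 3); gray.shape == (1, 2)
# gray[0, 0] == 1/3 (average of 1, 0, 0); gray[0, 1] == 1.0
```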
Well, to answer my own question, it's probably because you're trying to display a true grayscale image (i.e., a numpy array of shape (N,M) rather than (N,M,3) or (N,M,4)) using matplotlib.pyplot.imshow. By default (at least when the input array is two-dimensional), imshow uses a color map, which takes numerical values and maps them to a nice, multi-colored pattern (see the matplotlib docs for more details).
There are two ways to fix this: you could convert your image data from an (N,M) array to an (N,M,3) array (see the code sample above for one way of doing this), or you could convince imshow to just show your image using grayscale, thank-you-very-much! The trick for this second option is to fill in some optional arguments to imshow. First, we should tell it to use the "gray" color map, which (as you might guess) uses colors ranging from black to white instead of yellow to purple or whatever.
But, that's not enough. See, imshow is meant for visualizing two-dimensional data sets, not necessarily images, so it doesn't use the grayscale values in your array directly. Instead, it adds a normalizer. The default normalizer reads through the entire image, picks out the biggest value (which will be 1.0 if, and only if, you have at least one pure-white pixel in your image) and the smallest value (ditto, except with 0.0 and black). Then, it scales things linearly so that these extremes get mapped to 1.0 and 0.0.
If you're following along, that means that if your image has pixel data in the usual range of 0 to 1, and if it has at least one white pixel and one black pixel, there's no problem—white stays as white, black stays as black, and everything in the middle stays the same too.
But, let's think about what happens if your image ends up only using a few shades of light gray, say between 0.5 and 0.75. There's nothing wrong with this; maybe it's a (grayscale) picture of a sunny landscape or something. But imshow will re-scale the colors so that 0.5 becomes black and 0.75 becomes white. That will, of course, distort the appearance of your image.
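Numerically, the default normalization does something like this (a sketch of the rescaling, not matplotlib's actual code):

```python
import numpy as np

data = np.array([0.5, 0.625, 0.75])   # a few shades of light gray
lo, hi = data.min(), data.max()
rescaled = (data - lo) / (hi - lo)    # -> [0.0, 0.5, 1.0]
# 0.5 has become black, and 0.75 has become white.
```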
The fix is to tell imshow that your image uses grayscale values between 0.0 and 1.0 (even if you don't actually use the literal value 0.0 or 1.0 in the image). Here's a call to imshow with both of the optional arguments:
import matplotlib.pyplot as plt
plt.imshow(im, cmap="gray", norm=plt.Normalize(vmin=0.0, vmax=1.0))
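If I remember right, imshow also accepts vmin and vmax arguments directly, which should have the same effect; check the matplotlib docs for your version if this doesn't work:

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen, so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

im = np.full((4, 4), 0.6)  # a flat light-gray test image
artist = plt.imshow(im, cmap="gray", vmin=0.0, vmax=1.0)
```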
These links are mostly repeating what you have in the lecture notes, and maybe adding more information on top. It's probably best to read the lecture notes first (they're usually a bit friendlier, since Wikipedia et al. tend to go into more depth than we need).
See G(x,y) in the "Mathematics" section. There's a sample array lower on the page, if you want another test case.
What would be h(3) in Matlab we would write as h[2] in Python, since Python starts counting at 0 whereas Matlab starts from 1.
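As a quick sanity check on the off-by-one (an illustrative list, not anything from the assignment):

```python
h = [10, 20, 30, 40]

# Matlab's h(3) is the third element, since Matlab counts from 1.
# In Python, the third element is h[2], since Python counts from 0.
third = h[2]  # -> 30
```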