PIC 16 (Python with Applications)

Image processing hints

Here are some hints, to make your life a little bit more pleasant when dealing with images.

Loading images with matplotlib

The function we've been using to load images is matplotlib.pyplot.imread. If you installed Spyder the way that I think most students did, you probably also have PIL, the Python Imaging Library, installed. If that's true, then imread will load most image formats (JPG, PNG, etc.). If not, you might only be able to read in PNG files.

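One source of annoyance is the return type from imread. If you execute im = plt.imread('img.png') (and if the image exists, etc.), then you'll get one of these three things: a float32 array of shape (N,M) for a grayscale image, of shape (N,M,3) for an RGB image, or of shape (N,M,4) for an RGBA image (RGB plus an alpha channel), with pixel values in the range 0 to 1.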

Well, actually that's kind of a lie. Sometimes imread will return the pixel data using uint8 (i.e., "unsigned integer, 8-bit") data instead of float32 (i.e., "floating point, 32-bit"). When it uses uint8, the pixels will be in the range 0 to 255 instead of 0 to 1.
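
To find out which case you've got, it helps to look at the array's shape and dtype right after loading. Here's a minimal sketch; describeImage is a hypothetical helper name (it is not part of matplotlib), and the arrays below are synthetic stand-ins for what imread might return.

```python
import numpy as np

def describeImage(im):
  """Return a short description of an image array (hypothetical helper)."""
  # Two dimensions means grayscale; three means color, with 3 or 4 channels
  if im.ndim == 2:
    kind = "grayscale"
  elif im.ndim == 3 and im.shape[2] == 3:
    kind = "RGB"
  elif im.ndim == 3 and im.shape[2] == 4:
    kind = "RGBA"
  else:
    kind = "unknown"

  # uint8 pixels run from 0 to 255; floating-point pixels run from 0 to 1
  if im.dtype == np.uint8:
    pixelRange = "0 to 255"
  elif im.dtype.kind == "f":
    pixelRange = "0 to 1"
  else:
    pixelRange = "unknown"

  return "%s, dtype %s, range %s" % (kind, im.dtype, pixelRange)

# Synthetic stand-ins for arrays that imread might return
print(describeImage(np.zeros((4, 6), dtype=np.float32)))
print(describeImage(np.zeros((4, 6, 3), dtype=np.uint8)))
```

With a real image, you would pass in the result of plt.imread instead of np.zeros.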

For consistency, you might want to have a helper function or three to deal with all of this nonsense by converting to one of two forms: RGB or grayscale, always with floating-point data in the range 0 to 1. Here's what that might look like (or, download as a Python source file):

# Helper functions for image processing, for PIC 16, Spring 2017, UCLA
# Author: Andrew Krieger <akrieger@math.ucla.edu>
# Version: 0.1
# License: CC0 (intended for release into the public domain)

import numpy as np

def normalizeImage(im):
  """Renormalize an image read in by matplotlib.pyplot.imread.

  The returned image will have pixel data in the range [0,1], even if the input
  image used the range [0,255] instead.

  NOT FULLY TESTED; USE AT YOUR OWN RISK!
  """

  if im.dtype == np.uint8:
    # 8-bit integer pixel data; convert to float and rescale
    return im.astype(np.float32) / 255.0
  elif im.dtype == np.float32 or im.dtype == np.float64:
    # Floating-point data; make a copy for consistency
    return im.copy()
  else:
    raise ValueError("Unrecognized data type: %s" % im.dtype)


def standardizeImageRGB(im):
  """Standardize an image that was read in by matplotlib.pyplot.imread.

  Input: a numpy array, as returned by imread.
  Output: a copy of im, returned as an RGB image (shape (N,M,3)) with data in
  the range 0 to 1.

  NOT FULLY TESTED; USE AT YOUR OWN RISK!
  """

  im = normalizeImage(im)

  if len(im.shape) == 2:
    # Grayscale image. Convert to RGB by copying the gray value into each of
    # the three color channels. (Going the other way, from RGB to grayscale,
    # is subtler; see:
    #   https://en.wikipedia.org/wiki/Grayscale#Colorimetric_.28luminance-preserving.29_conversion_to_grayscale )

    # This code re-interprets the original im array (of shape (N,M))
    # as an array of shape (N,M,1), then multiplies by an array of shape
    # (1,1,3) where all the elements are equal to 1.

    # Afterwards, new_im[y, x, c] == old_im[y, x] * 1, for c=0,1,2
    N,M = im.shape
    return im.reshape((N, M, 1)) * np.ones((1,1,3), dtype=im.dtype)
  elif len(im.shape) == 3 and (im.shape[2] == 3 or im.shape[2] == 4):
    # Strip off the alpha channel, if there is one, by keeping only the
    # first three (RGB) channels.
    return im[:, :, :3]
  else:
    # Wrap the shape in a 1-tuple; otherwise % would try to unpack it.
    raise ValueError("Unrecognized array format: shape == %s" % (im.shape,))

def standardizeImageGrayscale(im):
  """Standardize an image that was read in by matplotlib.pyplot.imread.

  Input: a numpy array, as returned by imread.
  Output: a copy of im, returned as a grayscale image (shape (N,M)) with data
  in the range 0 to 1.

  NOT FULLY TESTED; USE AT YOUR OWN RISK!
  """

  im = normalizeImage(im)

  if len(im.shape) == 2:
    return im
  elif len(im.shape) == 3 and (im.shape[2] == 3 or im.shape[2] == 4):
    # See the link in the body of standardizeImageRGB to see why this may not be
    # a good way to convert to grayscale.

    # The slice below omits the alpha channel (if there is one).
    # The sum method sums up along axis 2, so it returns a two-dimensional
    # array, where sum_result[y,x] == sum(im[y, x, c] for c in range(3)).
    # Dividing by three gives us a simple average of the RGB channels.
    return im[:, :, :3].sum(axis=2) / 3.0
  else:
    # Wrap the shape in a 1-tuple; otherwise % would try to unpack it.
    raise ValueError("Unrecognized array format: shape == %s" % (im.shape,))
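
The simple average treats red, green, and blue as equally bright, which isn't how our eyes see them. As a sketch of the weighted approach described at the Wikipedia link, the function below uses the Rec. 709 luminance weights (0.2126, 0.7152, 0.0722). The name toGrayscaleLuminance is hypothetical, and the code assumes the input is already a floating-point RGB or RGBA array in the range 0 to 1. Strictly speaking, these weights are meant for linear (gamma-decoded) RGB values, a step that this sketch skips.

```python
import numpy as np

# Rec. 709 luminance weights for the red, green, and blue channels
REC709_WEIGHTS = np.array([0.2126, 0.7152, 0.0722])

def toGrayscaleLuminance(im):
  """Convert an (N,M,3) or (N,M,4) float image to an (N,M) grayscale image.

  Hypothetical helper; assumes pixel data in the range 0 to 1.
  """
  # Keep only the RGB channels (drops alpha, if there is one)
  rgb = im[:, :, :3]
  # dot sums over the last axis: result[y, x] = sum(rgb[y, x, c] * w[c])
  return rgb.dot(REC709_WEIGHTS)

# A 1x2 test image: one pure-white pixel, one pure-green pixel
testImage = np.array([[[1.0, 1.0, 1.0], [0.0, 1.0, 0.0]]])
print(toGrayscaleLuminance(testImage))
```

Because the weights add up to 1, white stays white and black stays black; only the colors in between shift, with green counting much more than blue.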

Why are my grayscale images green and purple?

Well, to answer my own question, it's probably because you're trying to display a true grayscale image (i.e., a numpy array of shape (N,M) rather than (N,M,3) or (N,M,4)) using matplotlib.pyplot.imshow. By default (at least when the input array is two-dimensional), imshow uses a color map, which takes numerical values and maps them to a nice, multi-colored pattern (see the matplotlib docs for more details).

There are two ways to fix this: you could convert your image data from an (N,M) array to an (N,M,3) array (see the code sample above for one way of doing this), or you could convince imshow to just show my image using grayscale, thank-you-very-much!

The trick for this second option is to fill in some optional arguments to imshow. First, we should tell it to use the "gray" color map, which (as you might guess) uses colors ranging from black to white instead of yellow to purple or whatever.

But, that's not enough. See, imshow is meant for visualizing two-dimensional data sets, not necessarily images, so it doesn't use the grayscale values in your array directly. Instead, it adds a normalizer.

The default normalizer reads through the entire image, picks out the biggest value (which will be 1.0 if, and only if, you have at least one pure-white pixel in your image) and the smallest value (ditto, except with 0.0 and black). Then, it scales things linearly so that these extremes get mapped to 1.0 and 0.0.

If you're following along, that means that if your image has pixel data in the usual range of 0 to 1, and if it has at least one white pixel and one black pixel, there's no problem—white stays as white, black stays as black, and everything in the middle stays the same too.

But, let's think about what happens if your image ends up only using a few shades of light gray, say between 0.5 and 0.75. There's nothing wrong with this; maybe it's a (grayscale) picture of a sunny landscape or something. But, imshow will re-scale the colors so that 0.5 becomes black, and 0.75 becomes white. That will, of course, distort the appearance of your image.
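
To make the re-scaling concrete, here's a small numpy sketch of the arithmetic a linear min/max normalizer performs. The function name rescaleLinear is hypothetical; it mimics the behavior described above rather than calling matplotlib itself, and it assumes the image is not a single constant value (otherwise it would divide by zero).

```python
import numpy as np

def rescaleLinear(im, vmin=None, vmax=None):
  """Linearly map vmin to 0.0 and vmax to 1.0 (hypothetical helper).

  If vmin or vmax is omitted, use the smallest or largest value in im,
  which is the behavior described for the default normalizer.
  """
  if vmin is None:
    vmin = im.min()
  if vmax is None:
    vmax = im.max()
  return (im - vmin) / (vmax - vmin)

# A tiny "image" that only uses light grays between 0.5 and 0.75
lightGrays = np.array([[0.5, 0.6], [0.7, 0.75]])

# Limits taken from the data: 0.5 is pushed to black, 0.75 to white
print(rescaleLinear(lightGrays))

# Limits fixed at 0 and 1: the values come out unchanged
print(rescaleLinear(lightGrays, vmin=0.0, vmax=1.0))
```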

The fix is to tell imshow that your image uses grayscale values between 0.0 and 1.0 (even if you don't actually use the literal value 0.0 or 1.0 in the image). Here's a call to imshow with both of the optional arguments:

import matplotlib.pyplot as plt

plt.imshow(im, cmap="gray", norm=plt.Normalize(vmin=0.0, vmax=1.0))
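
A slightly shorter spelling passes vmin and vmax straight to imshow, which should behave the same as the explicit Normalize object. The sketch below uses a made-up 2x2 image and the non-interactive Agg backend (an assumption, so that it runs without opening a window), and prints the color limits that each call ends up with.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

# A made-up image that only uses light grays between 0.5 and 0.75
lightGrays = np.array([[0.5, 0.6], [0.7, 0.75]])

fig, (axDefault, axFixed) = plt.subplots(1, 2)

# Default normalizer: the limits are taken from the data
shownDefault = axDefault.imshow(lightGrays, cmap="gray")

# Fixed limits: 0.0 is black and 1.0 is white, whatever the data contains
shownFixed = axFixed.imshow(lightGrays, cmap="gray", vmin=0.0, vmax=1.0)

# get_clim returns the (vmin, vmax) pair that the image is using
print(shownDefault.get_clim())
print(shownFixed.get_clim())
```

Use one spelling or the other, not both: the norm argument and the vmin/vmax arguments describe the same limits.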