Image Processing I

*Author: Alana White*

## Icebreaker

Make a cool thing with [DALLE mini](https://huggingface.co/spaces/dalle-mini/dalle-mini). The TAs will judge!

## Objective

By the end of this lesson, you will:

- Understand the basics of how images are stored as data on computers.
- Know how to manipulate individual pixels in an image to perform basic editing tasks.
- Understand the potential applications of image editing and its significance in various contexts.

### The problem: how do we extract information from images?

Fun Scenario: We are detectives tasked with a crucial but thankless mission: figuring out who is stealing lunches from the community fridge. The only thing we know is that the suspect was last seen sneaking scallion pancakes into their car. If we can identify the license plate, we'll be one step closer to solving the mystery. That is our ultimate goal, but first we need to learn a bit about image processing.

![](https://i.imgur.com/YIZ80rb.png)

# Part 1 - Images as data and basic editing

**Question:** Have you ever wondered how images are stored in a computer?

Today, we will learn how images are stored on a computer. The explanation here is borrowed from this [Medium article](https://alekya3.medium.com/how-images-are-stored-in-a-computer-f364d11b4e93).

**Activity**: Download the mystery car image onto your computer and open it. Now zoom in as far as you can and look closely. What do you see?

<details>
<summary>Think, then click!</summary>

We see a bunch of small squares.

</details>

### PIXELS

To store an image on a computer, the image is first broken down into these small squares, which are called pixels (short for picture elements). Pixels are the basic building blocks of any image. Here is how our mystery image looks after we pixelate it using this [tool](https://pinetools.com/pixelate-effect-image):

![](https://i.imgur.com/xHj1Qri.png)

Now you might be wondering how many pixels a picture contains.
Think of an image as a two-dimensional grid of pixels.

![](https://i.imgur.com/yn6lKBY.png)

So the number of pixels in an image is its height multiplied by its width. The resolution of an image refers to the number of pixels it contains. Higher resolution images have more pixels and can display finer details, while lower resolution images have fewer pixels and might appear blurry.

**Discussion:** Every image, whether it is grayscale (black and white) or full color, has colors. The question is how to *represent* a color in a way that computers can work with.

Computers store images as numbers. Each pixel in the image is represented by a numerical value called a **pixel value**, which determines the color and intensity of the pixel. Here is an image of Abraham Lincoln showing the pixel value of each individual pixel:

<img src="https://i.imgur.com/11yV1pL.png" alt="Image Alt Text" width="400">

### Color Depth

The number of bits used to store each pixel value is called the **bit depth** or **color depth**. It sets the range of possible pixel values, and therefore the maximum number of colors an image can have. Here are the common color depths:

1. **8 bits per pixel (bpp)**: This allows for 256 different colors or shades of gray in an image. It's often used for simple images like grayscale images and icons. For grayscale images, the pixel values range from 0 to 255. A value of 0 represents black, while 255 corresponds to white. Values in between correspond to shades of gray, with darker shades closer to 0 and lighter shades closer to 255.

   ![Grayscale Color Palette](https://i.imgur.com/lHLw2AL.png)

   *Image Source: [Processing](https://processing.org/tutorials/color)*

2. **24 bpp**: This is a common color depth for standard color images. It can represent about 16.7 million colors.

<details>
<summary>More about color depth (optional)</summary>

In general, for a color depth of N bits, the number of possible colors is 2^N.

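The 2^N rule can be checked with a few lines of Python. This is only a minimal sketch: the helper name `num_colors` is made up for illustration and is not part of any library.

```python
def num_colors(bits_per_pixel):
    """Return how many distinct values a pixel with this many bits can hold."""
    return 2 ** bits_per_pixel

# Print the number of colors for some common bit depths.
for bits in [1, 2, 4, 8, 16, 24]:
    print(bits, "bits per pixel ->", num_colors(bits), "colors")
```

The printed values should match the table that follows.
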

| Bits Per Pixel | Number of Colors Available | Calculation |
|----------------|----------------------------|-----------------|
| 1 | 2 | 2^1 = 2 |
| 2 | 4 | 2^2 = 4 |
| 4 | 16 | 2^4 = 16 |
| 8 | 256 | 2^8 = 256 |
| 16 | 65536 | 2^16 = 65536 |
| 24 | 16777216 | 2^24 = 16777216 |

Higher bit depth gives us images with better quality. However, there is a tradeoff: images with more bits per pixel take up more space, and the larger file size can slow down how fast the image can be sent or processed.

</details>

So if our image is made up of pixels, that raises the question: is this enough information to catch our culprit?

**Discuss:** How much information can we get out of an image on a computer compared to what we perceive with our eyes? Is this enough information?

### Let the detective work begin!

Now that we've learned a little bit about images, let's start our detective work!

To start, let's load the image into our detective software ([Google Colab!](https://colab.research.google.com/drive/1pywjPqpSgkhalhVErlvHJFxvGTyZEhgG#scrollTo=B8_3cYlt6K63)). As always, remember to make a copy!

Download the car image from [here](https://drive.google.com/file/d/1p6kpA4-lkfkkT6ugoCwlIHRcjiEeYoIW/view?usp=drive_link), save it to your computer as "mystery_car.png", and then upload it to your Colab.

To load the image, we have to use a library. In this case, we'll be using skimage (a.k.a. scikit-image), which you might need to install if you're not using Google Colab.

```python=
# io reads and saves images; util converts between pixel value types
from skimage import io, util
```

Finally, we can load the image! To keep things simple for now, we'll just look at black and white images (we'll take a look at colored images later). Adjust the path below if your image lives somewhere else.

```python=
# as_gray=True converts a color image to grayscale
img = io.imread('/content/gdrive/MyDrive/mystery_car.png', as_gray=True)
# Convert to 8-bit integers so pixel values range from 0 to 255
img = util.img_as_ubyte(img)
```

We can display the image by calling

```python=
io.imshow(img)
```

Looking at the displayed image, we notice there are axes along the sides.
![](https://i.imgur.com/Ln0PsGh.png)

What do those mean?

<details>
<summary> Think, then click! </summary>

These are the dimensions of the image (width and height) in pixels.

</details>

Let's see what else we can gather about the image. Let's check the shape of the image:

```
print(img.shape)
```

Result:

```
(1001, 1608)
```

The height and width again! Note the order: height (number of rows) comes first, then width (number of columns).

And let's check the type of the image:

```
print(type(img))
```

Result:

```
<class 'numpy.ndarray'>
```

What does this mean? Well, numpy is another library, built for working with arrays of numbers. Numpy arrays are very similar to lists.

We can access the value of a specific pixel in the image using this syntax, where `y` is the row (counted down from the top) and `x` is the column (counted from the left):

```
img[y,x]
```

Exercise - Let's index into the image and figure out what the numbers mean (done on Colab, with students suggesting values to try).

**Discuss:** What does this number mean? How big or small does it get? What do larger and smaller numbers represent? Remember, this is a black and white image.

<details>
<summary> Think, then click! </summary>

- What does this value mean: the color (shade of gray) of a pixel
- Minimum value: 0
- Maximum value: 255
- Smaller numbers: darker colors
- Larger numbers: brighter colors

</details>

Now that we know what these numbers mean, we have a rough idea of how to get information from images.

**Discuss:** How can we use these numbers to gather information from an image?

## Editing an image

We can also use this indexing to change the value of pixels in an image. We'll make a copy of the image first, though, so that we don't change our original image. We can do that by copying our numpy array (we'll have to import numpy first).

```python=
import numpy as np
# np.copy makes an independent array, so editing it leaves img untouched
modified_img = np.copy(img)
```

Now, if we want to change the pixel located at height 500, width 1200 to have the value 255,

```python=
# Row (y) 500, column (x) 1200; 255 is white
modified_img[500, 1200] = 255
```

We'll see that the pixel on the license plate has turned white.
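One way to check an edit like this without looking at the picture is to compare pixel values in the original and the copy. The sketch below uses a tiny made-up array as a stand-in for the car image so it runs without the file. The names `fake_img` and `fake_modified` are illustrative only.

```python
import numpy as np

# A tiny 4x6 stand-in for a grayscale image: every pixel starts at 100 (gray).
fake_img = np.full((4, 6), 100, dtype=np.uint8)

# Copy first so the original stays untouched, as in the lesson.
fake_modified = np.copy(fake_img)

# Set the pixel at row (y) 2, column (x) 5 to white.
fake_modified[2, 5] = 255

print(fake_img[2, 5])       # 100: the original pixel is unchanged
print(fake_modified[2, 5])  # 255: the edited pixel is now white

# Count how many pixels differ between the two arrays.
changed = np.count_nonzero(fake_img != fake_modified)
print(changed)              # 1: only one pixel was edited
```

The same comparison should work on `img` and `modified_img`, since both are numpy arrays of the same shape.
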
If we just display `modified_img`, we won't be able to spot the change, because the image display function in scikit-image is a little low-res. Instead, we can save a copy to our Google Drive, like so:

```
io.imsave('/content/drive/MyDrive/modified.png', modified_img)
```

Then, if we open the saved file and zoom in, we can see that the pixel in the bottom right corner of the license plate has turned white.

![](https://i.imgur.com/WWskjup.png)

**Discuss:** Is this useful for solving our problem? How is editing images useful in general?

**Discussion:** How can image editing be used for harmful purposes?