In this assignment we are going to go through all the setup and background we need for this course! We've divided it into 3 sections as follows:
This homework is due by Wednesday, January 29th, 2025 at 10:00 PM EST
Please click here to get the stencil code. Reference this guide for more information about GitHub and GitHub Classroom.
In order to complete (programming) assignments for this course, you will need a way to code, run, and debug your own Python code. While you are always free to use department machines for this (they have a pre-installed version of the course environment that every assignment has been tested against), you are also free to work on your own machines.
Below we give you some information that is helpful for either of these situations.
In order to set up your virtual environment for this course, we highly recommend (and only formally support) the use of Anaconda to create and manage your Python environment.
Once you have cloned the GitHub Classroom repository for this assignment, you can do the following to set up your virtual environment:
Download the Anaconda installer from here, and install it on your computer. We recommend using the Graphical Installer for the correct system (Windows / (Intel) Mac / Mac M1).
Note: If you have an existing Anaconda or Miniconda installation (such as from CS200), then you don't need to reinstall, and can just use that!
You can tell if you have an existing install if the command conda --version is recognized.
Windows: When installing using the graphical installer, be sure to check the box which adds conda to your PATH.
Open a new terminal window (such as the one in VSCode), navigate to the root of the cloned assignment using cd and ls, and run ./env_setup/Other/conda_create.sh. This should set up a virtual environment named csci1470 on your computer. If you have an Apple M1, this script will be different. (See below.)
You may need to restart your terminal after installing Anaconda in order for this to work.
Note: This might be slightly different depending on your platform:
Apple M1: For those who have Apple silicon, we provide a slightly different script which you can call from "./env_setup/Apple M1/conda_create_m1.sh".
Windows and Others: If you are using Windows Powershell, then you can just run ./env_setup/Other/conda_create.sh (forward slashes), but if you are using Command Prompt, then you need to run .\env_setup\Other\conda_create.sh (backslashes).
Some users have also experienced problems running the *.sh files entirely. If you are getting a permissions issue, try running chmod a+x on the script from inside your repo (for example, chmod a+x ./env_setup/Other/conda_create.sh). If that does not work, you can just open the conda_create.sh script in a text editor (such as VSCode), and run each line individually in your terminal.
Run conda activate csci1470. You will need to do this in every shell where you want to use the virtual environment.
If the above procedure doesn't work for you (although we highly recommend trying to troubleshoot that first), here is another method that does not rely on conda commands:
Install Python 3.11 (or Python 3.9, we have not found any differences in functionality between them for our projects).
Install the following packages in either a virtual environment or your main python environment
ipython==8.8.0
matplotlib==3.5.3
numpy==1.23.5
Pillow==9.4.0
scipy==1.9.3
tensorflow==2.11.0
tqdm==4.64.1
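To create a virtual environment, run python -m venv cs1470. This will create a new virtual environment called cs1470.
To activate it, run ./cs1470/Scripts/activate on Windows machines or source cs1470/bin/activate on Mac and Linux machines. Do this before starting any homework assignment.
To leave the virtual environment, run deactivate.
Install the packages using pip commands (i.e. pip install ipython==8.8.0) or, if a requirements.txt file is available, with pip install -r requirements.txt.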
Once this is complete, you should have a local environment to use for the course!
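As a quick sanity check after either setup route, a short script along these lines can compare the installed package versions against the pinned list above. This is only a sketch and not part of the stencil: the helper name check_versions is hypothetical, and it relies solely on the standard-library importlib.metadata module.

```python
from importlib import metadata

# Versions pinned in the package list above.
PINNED = {
    "ipython": "8.8.0",
    "matplotlib": "3.5.3",
    "numpy": "1.23.5",
    "Pillow": "9.4.0",
    "scipy": "1.9.3",
    "tensorflow": "2.11.0",
    "tqdm": "4.64.1",
}


def check_versions(pinned):
    """Return {package: (wanted, found)}; found is None if not installed."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        report[name] = (wanted, found)
    return report


if __name__ == "__main__":
    for name, (wanted, found) in check_versions(PINNED).items():
        status = "ok" if found == wanted else "MISMATCH"
        print(f"{name}: wanted {wanted}, found {found} [{status}]")
```

Run it with the environment activated; a mismatch is not necessarily fatal, but it is a good first thing to look at when something behaves differently from the department machines.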
Note: Sometimes even if you set up your local environment correctly, you may experience unexpected bugs and errors that are unique to your local setup. To prevent this from hindering your ability to complete assignments, we highly recommend that you familiarize yourself with the department machines, even if you expect to usually be working locally.
Department machines serve as a common, uniform way to work on and debug assignments. There are a variety of ways in which you can use department machines:
When using the department machines, you can activate the course virtual environment (which we have already installed) using:
source /course/cs1470/cs1470_env/bin/activate
From here, you should be able to clone the repository (see a GitHub guide here for more information on using Git via the command line) and work on your assignment.
Note: Python files using tensorflow may require a little more time on startup to run on department machines (likely because it is pulling files from the department filesystem), but they should all run nonetheless.
Python packages, or libraries, are external sets of code written by other industry members which might prove really helpful! (Imagine coding how to draw a graph in Python every single time)
However, different classes, tasks, and even projects, might require different sets of Python packages. We can manage these as different virtual environments which have different sets of packages installed.
If you are using conda, you might notice the (base) prefix in your terminal. This signifies that you're in the default (hence (base)) environment.
To access CSCI1470's virtual environment, you can use conda activate csci1470. You should now see the (csci1470) prefix in your terminal!
To return to the base environment, you can use conda deactivate.
Given two column vectors \(a \in \mathbb{R}^{m \times 1}, \; b \in \mathbb{R}^{n \times 1}\) the outer product is \[\mathbf{a} \times \mathbf{b} = \begin{bmatrix}a_0 \\ \vdots \\ a_{m-1}\end{bmatrix} \times \begin{bmatrix}b_0 \\ \vdots \\ b_{n-1}\end{bmatrix} = \begin{bmatrix} a_0 b^T\\ \vdots \\ a_{m-1} b^T\\ \end{bmatrix} = \begin{bmatrix} a_0 b_0 & \cdots & a_0 b_{n-1}\\ \vdots & \ddots & \vdots \\ a_{m-1} b_0 & \cdots & a_{m-1} b_{n-1}\\ \end{bmatrix} \in \mathbb{R}^{m\times n} \]
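To make the outer product concrete, here is a small NumPy sketch. The vectors are made-up values chosen for illustration; np.outer and the column-times-row matrix product are two ways of arriving at the same m-by-n matrix.

```python
import numpy as np

a = np.array([1, 2, 3])   # m = 3
b = np.array([10, 20])    # n = 2

# Entry (i, j) of the outer product is a[i] * b[j].
outer = np.outer(a, b)

# Same result from a (3, 1) column times a (1, 2) row.
outer_via_matmul = a.reshape(-1, 1) @ b.reshape(1, -1)

print(outer.shape)  # (3, 2)
print(outer)
```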
Given two column vectors \(\mathbf{a}\) and \(\mathbf{b}\) both in \(\mathbb{R}^{r\times 1}\), the inner product (or the dot product) is defined as: \[ \mathbf{a} \cdot \mathbf{b} = \mathbf{a}^T\mathbf{b} = \begin{bmatrix} a_0\ \cdots\ a_{r-1} \end{bmatrix} \begin{bmatrix}b_0 \\ \vdots \\ b_{r-1}\end{bmatrix} = \sum_{i=0}^{r-1} a_i b_i \]
where \(\mathbf{a}^T\) is the transpose of a vector, which converts between column and row vector alignment. The same idea extends to matrices as well.
Given a matrix \(\mathbf{M} \in \mathbb{R}^{r\times c}\), and a vector \(x\in \mathbb{R}^c\), let \(M_i\) be the \(i\)th row of \(M\). The matrix product is defined as: \[\mathbf{Mx} \ =\ \mathbf{M}\begin{bmatrix} x_0\\ \vdots \\ x_{c-1}\\ \end{bmatrix} \ =\ \begin{bmatrix} \mathbf{M_0}\\ \vdots \\ \mathbf{M_{r-1}}\\ \end{bmatrix}\mathbf{x} \ =\ \begin{bmatrix} \ \mathbf{M_0 \cdot x}\ \\ \vdots \\ \ \mathbf{M_{r-1} \cdot x}\ \\ \end{bmatrix} \] Further, given a matrix \(N \in \mathbb{R}^{c\times m}\), let \(N^T_j\) be the \(j\)th row of \(N^T\) (that is, the \(j\)th column of \(N\)). We define \[ MN = \begin{bmatrix} \mathbf{M_0\cdot N^T_0} & \cdots & \mathbf{M_0\cdot N^T_{m-1}} \\ \vdots & \ddots & \vdots \\ \mathbf{M_{r-1}\cdot N^T_0} & \cdots & \mathbf{M_{r-1}\cdot N^T_{m-1}}\end{bmatrix} \] and we have \(MN \in \mathbb{R}^{r\times m}\).
\(\mathbf{M} \in \mathbb{R}^{r\times c}\) implies that the function \(f(x) = \mathbf{Mx}\) can map \(\mathbb{R}^{c\times 1} \to \mathbb{R}^{r\times 1}\).
\(\mathbf{M_1} \in \mathbb{R}^{d\times c}\) and \(\mathbf{M_2} \in \mathbb{R}^{r\times d}\) implies \(f(x) = \mathbf{M_2M_1x}\) can map \(\mathbb{R}^c \to \mathbb{R}^r\).
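The shape bookkeeping above can be checked numerically. The following is a sketch with arbitrary sizes (c = 4, d = 3, r = 2) and random entries; it is a sanity check of the shapes and of the row-by-row view of the product, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
c, d, r = 4, 3, 2

M1 = rng.random((d, c))   # maps R^c -> R^d
M2 = rng.random((r, d))   # maps R^d -> R^r
x = rng.random(c)

# Each entry of M1 @ x is the inner product of one row of M1 with x.
rowwise = np.array([M1[i] @ x for i in range(d)])

y = M2 @ (M1 @ x)         # apply M1, then M2
y_alt = (M2 @ M1) @ x     # the same map written as a single (r, c) matrix

print((M1 @ x).shape, y.shape, (M2 @ M1).shape)  # (3,) (2,) (2, 4)
```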
Given this and your own knowledge, try solving these:
Prove that \((2) + (3)\) implies \((4)\). In other words, use your understanding of the inner and matrix-vector products to explain why \((4)\) has to be true.
Prove that \((4)\) implies \((5)\)
Recall that differentiation is finding the rate of change of one variable relative to another variable. Some nice reminders: \[\begin{align} \frac{df(x)}{dx} & \text{ is how $f(x)$ changes with respect to $x$}.\\ \frac{\partial f(x,y)}{\partial x} & \text{ is how $f(x,y)$ changes with respect to $x$ (and ignoring other factors)}.\\ \frac{dz}{dx} &= \frac{dy}{dx} \cdot \frac{dz}{dy} \text{ via chain rule if these factors are easier to compute}. \end{align}\] Some common derivative patterns include: \[\frac{d}{dx}(2x^3 + 4x + 5) = 6x^2 + 4 \]\[\frac{\partial}{\partial y}(x^2y^3 + xy + 5x^2) = 3x^2y^2 + x \]\[\frac{d}{dx}(x^3 + 5)^3 = 3(x^3 + 5)^2 \times (3x^2) \]\[\frac{d}{dx}\ln(x) = \frac{1}{x} \]
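One way to build confidence in a hand-computed derivative is to compare it against a finite-difference estimate. The sketch below does this for three of the patterns above; the helper central_diff, the evaluation point, and the choice to hold x fixed at 2 for the partial derivative are all illustrative assumptions.

```python
def central_diff(f, x, h=1e-5):
    """Approximate f'(x) with a symmetric finite difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 1.5

# d/dx (2x^3 + 4x + 5) = 6x^2 + 4
f1 = lambda x: 2 * x**3 + 4 * x + 5
df1 = lambda x: 6 * x**2 + 4

# Chain rule: d/dx (x^3 + 5)^3 = 3(x^3 + 5)^2 * (3x^2)
f2 = lambda x: (x**3 + 5) ** 3
df2 = lambda x: 3 * (x**3 + 5) ** 2 * (3 * x**2)

# Partial w.r.t. y of x^2 y^3 + x y + 5 x^2, holding x fixed at 2.
x_fixed = 2.0
f3 = lambda y: x_fixed**2 * y**3 + x_fixed * y + 5 * x_fixed**2
df3 = lambda y: 3 * x_fixed**2 * y**2 + x_fixed

for f, df in [(f1, df1), (f2, df2), (f3, df3)]:
    print(central_diff(f, x0), df(x0))  # the two numbers should nearly agree
```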
Given this and your own knowledge:
Use (and internalize) the log properties to solve the following: \[\frac{\partial}{\partial y}\ln(x^5/y^2)\] The properties are as follows:
\[\log(x^p) = p\log(x)\] \[\log(xy) = \log(x) + \log(y)\] \[\log(x/y) = \log(x) - \log(y)\]
Solve the following partial for a valid \(j\) and all valid \(i\): \[\frac{\partial}{\partial x_j}\ln\bigg[\sum_i x_iy_i\bigg]\] Consider using the chain rule. Let \(g_1(x) = \sum_i x_iy_i\)…
There exist events that are independent of each other, meaning that the probability of each event stays the same regardless of the outcome of other events.
For example, consider picking a particular 3-digit number at random: \[P(x = 123) = P(x_0 = 1)P(x_1 = 2)P(x_2 = 3) = (1/10)^3 = 1/1000\]
Alternatively, some events are dependent on other events. For example, consider 3 draws without replacement from a set of 1 red, 1 green, and 1 blue ball. \[P(b_0 = R) = 1/3\] \[P(b_1 = G\ |\ b_0 = R) = 1/2\] \[P(b_2 = B\ |\ (b_0 = R) \cap (b_1 = G)) = 1/1\]
This starts off the notion of conditional probability, where some components are realized conditional to other components. An important formula for conditional probability is Bayes' Theorem:
\[P(A|B) = \frac{P(B|A)P(A)}{P(B)}\]
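Because the three-ball example is so small, the conditional probabilities and Bayes' Theorem can be verified by listing every outcome. The sketch below assumes draws without replacement with all orderings equally likely; the helper prob and the event names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# All equally likely orders of drawing the three balls without replacement.
outcomes = list(permutations("RGB"))


def prob(event, given=lambda o: True):
    """P(event | given), computed by counting equally likely outcomes."""
    conditioned = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in conditioned if event(o)), len(conditioned))


first_red = lambda o: o[0] == "R"       # event B
second_green = lambda o: o[1] == "G"    # event A

p_a = prob(second_green)                            # P(A)   = 1/3
p_b = prob(first_red)                               # P(B)   = 1/3
p_a_given_b = prob(second_green, given=first_red)   # P(A|B) = 1/2
p_b_given_a = prob(first_red, given=second_green)   # P(B|A) = 1/2

# Bayes' Theorem: P(A|B) = P(B|A) P(A) / P(B)
print(p_a_given_b == p_b_given_a * p_a / p_b)  # True
```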
Whenever events happen at random, they happen with some probability. This is governed by some probability distribution. For example, \(X \sim P(x)\) is a realization (or variate, or random variable) of the \(P(x)\) distribution. Of note:
These distributions are equipped with expectation functions \(\mathbb{E}\) and \(\mathbb{V}\) that reveal their expected behavior (mean and variance, respectively). These also usually suggest the long-term equilibrium behavior, or the distribution of realizations after many realizations are drawn and accumulated.
Discrete Probability Distribution governs discrete events \(\{e_0, e_1, ...\}\).
Continuous Probability Distribution governs continuous values. For example, the unit normal distribution mentioned before.
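For a discrete distribution, the mean and variance are just probability-weighted sums. As an illustration, here is a sketch using a fair six-sided die, which is an example chosen here and not one from the assignment; the helper expectation is hypothetical.

```python
from fractions import Fraction

# Hypothetical example distribution: a fair six-sided die.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete distribution given as {value: probability}."""
    return sum(p * g(x) for x, p in pmf.items())


mean = expectation(pmf)                                  # E[X] = 7/2
variance = expectation(pmf, lambda x: (x - mean) ** 2)   # V[X] = 35/12

print(mean, variance)
```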
Given this and your own knowledge:
You're trying to train a cat/dog classifier which outputs a prediction between 0 and 1. Given that the input is in fact an image of a cat or dog, the truth is always one of those two. As such, the output is a probability distribution \(Y\) with unknown \(P(Y = y)\) for all possible \(y\) in the domain of \(Y\). Your friend knows that their dataset \(\mathbb{D} = (\mathbb{X}, \mathbb{Y})\) is balanced between cats and dogs, and so argues that \(P(Y=y)\) is equal for all plausible \(y\).
For an overview of python syntax and common python uses we recommend checking out this python tutorial for a refresher.
Write up a Python class for a Square whose constructor (the __init__ method) takes in a string name and numeric length field.
You can use this code to verify functionality:
square1 = Square("square1", 5)
square1.name == "square1"
square1.length == 5
We can give a class some special interaction patterns with other Dunder methods.
If we give a class, say a Multiplier, a __call__ method, then we can specify what happens when we call an instance of the class! For example:
multer = Multiplier()
multer(5, 10)
This is effectively the same as calling:
multer.__call__(5, 10)
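To see the mechanism on a class other than the one you are asked to write, here is a minimal sketch. The Adder class is a hypothetical example and not part of the stencil.

```python
class Adder:
    """Hypothetical example class: instances add a fixed offset when called."""

    def __init__(self, offset):
        self.offset = offset

    def __call__(self, value):
        return value + self.offset


add_three = Adder(3)
print(add_three(10))            # 13
print(add_three.__call__(10))   # 13, the same call spelled out
print(callable(add_three))      # True
```

Because the instance keeps its own state (offset here), a callable object can behave like a function that remembers configuration, which is how layers in Deep Learning libraries are typically used.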
Write up a Python class Multiplier which, when called on two integers, returns the product of the two integers.
Use this code to check:
multer = Multiplier()
multer(5, 10) == 50
Throughout this course, we will be working with Python extensively.
Though you won't need to be an OOP expert, we do expect some basics which help Deep Learning libraries work in organized and efficient ways.
OOP (Object Oriented Programming) strongly focuses on objects, which encompass any value, variable, etc. It's really any "tangible" thing.
thing1 = "i am an object!"
thing2 = 1234567
...
However, we might find it useful to organize these things into classes of things, with properties like instance variables or methods shared across all things that are members of the same class. You might be familiar with Python's str, int, and float classes, for example.
When working with Python, you'll almost always be working with objects which are instances of classes.
You can check the type of a variable with the built-in type function!
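For instance, a quick sketch using the two objects from earlier; isinstance is shown alongside type because it is the usual way to ask whether an object belongs to a class.

```python
thing1 = "i am an object!"
thing2 = 1234567

print(type(thing1))             # <class 'str'>
print(type(thing2))             # <class 'int'>
print(type(3.5) is float)       # True
print(isinstance(thing2, int))  # True
```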
If classes are sets and objects are set elements, then we also need subsets and supersets!
In Python, we can make a "child" class inherit all the methods from a "parent" class like so:
class ChildClass(ParentClass1):
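A complete, runnable version of that pattern might look like the sketch below. Shape and Triangle are hypothetical names used only for illustration.

```python
class Shape:
    """Hypothetical parent class."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is a shape"


class Triangle(Shape):
    """Hypothetical child class: inherits __init__ and describe from Shape."""

    def num_sides(self):
        return 3


tri = Triangle("tri1")
print(tri.describe())               # tri1 is a shape (inherited method)
print(tri.num_sides())              # 3 (defined on the child)
print(isinstance(tri, Shape))       # True
```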
Consider the instance of our Square
earlier.
square1 = Square("square1", 5)
square1.name == "square1"
square1.length == 5
Notice that the name and length variables are instance variables!
In contrast with instance variables, class level variables are shared across all instances of the class.
Say in the declaration of Square, we included this:
class Square:
    shape = "square"
    def __init__(...):
        ...
Then, if we checked square1.shape or square2.shape, we would see that they both return the string "square". This also applies to checking the class directly: Square.shape will also return "square"!
shape is a class level variable because it is shared across the whole class!
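Class variables can be reassigned directly through the class (e.g. Square.shape = "rectangle"), which changes the value seen by every instance. Assigning through an instance (e.g. square1.shape = "rectangle") does something different: it creates a new instance variable on square1 that hides the class variable for that instance only. For this reason, we strongly recommend doing it through the class directly.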
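A short sketch of why going through the class is the safer habit. The Square constructor here is a plain assumption matching the earlier description (a name and a length).

```python
class Square:
    shape = "square"  # class-level variable

    def __init__(self, name, length):
        self.name = name
        self.length = length


square1 = Square("square1", 5)
square2 = Square("square2", 7)

# Assigning through an instance creates an attribute on square1 only.
square1.shape = "rectangle"
print(square1.shape, square2.shape, Square.shape)
# rectangle square square

# Assigning through the class changes the shared value.
Square.shape = "quadrilateral"
print(square1.shape, square2.shape, Square.shape)
# rectangle quadrilateral quadrilateral
```

After the first assignment, square1 carries its own shape attribute, so later changes to Square.shape no longer reach it.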
Task 1. Make a parent class named Logger. Above the constructor, include this line:
logging_tape: LoggingTape | None = None
This will be our log of things that happen! The : LoggingTape | None indicates that the variable logging_tape will either be of type LoggingTape (which we'll make in just a second), or None.
Context managers in Python are a great tool for temporarily defined things.
For instance, you have probably seen
with open('file.txt', 'r') as f:
    # do things with the file f
    ...
# f is now closed, do other things!
You'll notice that f is only properly defined as being the file opened with read permissions while in the "context" of the with statement (within the with statement's indent block).
This is a context manager! Context managers derive their functionality from special Dunder Methods __enter__ and __exit__.
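To see the order in which these two methods run, here is a minimal sketch of a custom context manager. The Section class is hypothetical and exists only to record when its block is entered and exited.

```python
class Section:
    """Hypothetical context manager that records when its block starts and ends."""

    def __init__(self, label):
        self.label = label
        self.events = []

    def __enter__(self):
        self.events.append(f"enter {self.label}")
        return self  # this is what `as section` binds to

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append(f"exit {self.label}")
        return False  # do not swallow exceptions raised inside the block


with Section("demo") as section:
    section.events.append("inside the block")

print(section.events)
# ['enter demo', 'inside the block', 'exit demo']
```

Note that __exit__ runs even if the block raises, which is what makes the pattern suitable for temporary state that must always be cleaned up.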
Let's work through an example! Say we want to set the class variable Logger.logging_tape to be a new LoggingTape, but only temporarily. Sounds perfect for a with statement, huh? Here's a starter:
class LoggingTape:
    def __init__(self):
        ...
    def __enter__(self):
        ...
    def __exit__(self, *args):
        ...
    def add_to_log(self, new_log):
        ...
    def print_logs(self):
        for log in self.logs: print(log)
We might see some code using the LoggingTape like:
with LoggingTape() as tape:
    ...
...
On line 1, LoggingTape's __enter__ method is called to enter the with statement. Then, after the indent block (so after line 2 but before line 3), LoggingTape's __exit__ method is called to exit from the with statement.
In LoggingTape's constructor, make an empty list called logs. We'll store strings as messages of logs of whatever happened.
Then, in __enter__, set Logger.logging_tape = self and return self. We're setting a class level variable!
Next, in __exit__, set Logger.logging_tape = None.
In add_to_log, append new_log to the end of logs.
Now, check out this code block:
with LoggingTape() as tape:  # runs LoggingTape's __enter__()
    # Logger.logging_tape is now defined as tape (from line 1)!
    tape.add_to_log("Hi!")
    # runs LoggingTape's __exit__()
# Now Logger.logging_tape is defined as None
This might seem a little trivial now, but what this enables us to do is have any Logger class record to tape while inside the with statement (lines 2-3 in the example)!
Say we have a car class:
class Car(Logger):
    def travel(self, distance):
        self.logging_tape.add_to_log(f"Traveled Distance {distance}")

car = Car()
with LoggingTape() as tape:
    car.travel(5)
    tape.print_logs()  # note the parentheses: print_logs must be called
The output will be "Traveled Distance 5". The LoggingTape kept track of the logged item automatically for us. I wonder if this will be useful in Homework 2…
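Putting the steps above together, an end-to-end sketch could look like the following. It follows the instructions as written, with two assumptions that are not in the original: Optional[...] is used in place of the | annotation so the snippet also runs on Python 3.9, and Car.travel skips logging when no tape is active, where the version above would raise an AttributeError on None.

```python
from typing import Optional


class LoggingTape:
    def __init__(self):
        self.logs = []

    def __enter__(self):
        Logger.logging_tape = self   # set the class-level variable
        return self

    def __exit__(self, *args):
        Logger.logging_tape = None   # reset it when the block ends

    def add_to_log(self, new_log):
        self.logs.append(new_log)

    def print_logs(self):
        for log in self.logs:
            print(log)


class Logger:
    logging_tape: Optional[LoggingTape] = None


class Car(Logger):
    def travel(self, distance):
        # Assumption: silently skip logging when no tape is active.
        if self.logging_tape is not None:
            self.logging_tape.add_to_log(f"Traveled Distance {distance}")


car = Car()
car.travel(1)                # no active tape, so nothing is recorded
with LoggingTape() as tape:
    car.travel(5)
tape.print_logs()            # Traveled Distance 5
```

The tape object still exists after the with block ends, so its logs can be inspected afterwards; only Logger.logging_tape is reset.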
import numpy as np
#1. Make a NumPy array of zeros with shape (5,10).
zeros = ...
assert(zeros.shape == (5,10))
assert(np.max(zeros) == np.min(zeros) == 0)
# 2. Do it again, but make it full of ones!
ones = ...
assert(ones.shape == (5,10))
assert(np.max(ones) == np.min(ones) == 1)
# 3. Slice the array to get the first row of ones!
first_row = ...
assert(first_row.shape == (10,))
# 4. Slice the array to get the first *column* of ones!
# (Hint: Try passing in `:` as one of the slice indices!)
first_col = ...
assert(first_col.shape == (5,))
# 5. Create a new dimension on the ones array.
# You should end up with shape $(5,10,1)$. (Check out `np.expand_dims`)
expanded = ...
assert(expanded.shape == (5,10,1))
# 6. Cast a list `[1,2,3]` into a NumPy array.
# Then, change the first element to `4`.
arr = ...
...
np.testing.assert_array_equal(arr, np.array([4,2,3]))
# 7. Make a NumPy array of integers $0$ to $9$,
# inclusive, in shape $(2,5)$ using `np.arange` and `np.reshape`.
incr = ...
...
assert(incr.shape == (2,5))
np.testing.assert_array_equal(incr, np.array([[0,1,2,3,4],[5,6,7,8,9]]))
# 8. With incr from 7., use `np.vstack` to add
# a new row `[10, 11, 12, 13, 14]`.
vstacked = ...
v_target = [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14]]
np.testing.assert_array_equal(vstacked, np.array(v_target))
# 9. With incr from 7., use `np.hstack` to add a new column of 0's.
# *Hint: think about the dimensionality of the original matrix.
# What dimensions do you need to represent a new column?*
hstacked = ...
h_target = [[0, 1, 2, 3, 4, 0], [5, 6, 7, 8, 9, 0]]
np.testing.assert_array_equal(hstacked, np.array(h_target))
import numpy as np
# 1. Add two NumPy arrays of ones with shape $(5,10)$.
ones_1 = ...
ones_2 = ...
sum_ones = ...
assert(sum_ones.shape == (5,10))
assert(np.max(sum_ones) == np.min(sum_ones) == 2)
# 2. Subtract a NumPy array of ones with shape $(5,10)$ from another one.
# Reuse ones_1 and ones_2
diff_ones = ...
assert(diff_ones.shape == (5,10))
assert(np.max(diff_ones) == np.min(diff_ones) == 0)
# 3. Multiply a NumPy array of ones with shape $(5,10)$ by the scalar two.
scaled = ...
assert(scaled.shape == (5,10))
assert(np.max(scaled) == np.min(scaled) == 2)
#1. Use NumPy matrix multiplication (`np.matmul` or using the `@` symbol)
# to calculate the inner product of vectors v1, v2
v1 = np.array([1,2,3])
v2 = np.array([3,2,1])
inner_prod = ...
assert(inner_prod == 10)
# 2. Use NumPy matrix multiplication (`np.matmul` or using the `@` symbol)
# to calculate the matrix product of matrices m1 and m2.
m1 = np.array([[1, 2, 3],\
[0, 1, 0]])
m2 = np.array([[4, 6],\
[2, 1],\
[0, 5]])
mat_prod = ...
m_target = np.array([[8, 23],\
[2, 1]])
np.testing.assert_array_equal(mat_prod, m_target)
# 3. Use NumPy element-wise matrix multiplication (using the `*` symbol)
# to calculate the element-wise product of matrices m1 (above) and m3.
m3 = np.array([[4, 6, 2],\
[1, 0, 5]])
elem_prod = ...
e_target = np.array([[4, 12, 6],\
[0, 0, 0]])
np.testing.assert_array_equal(elem_prod, e_target)
# 4. Use NumPy element-wise matrix division (using the `/` symbol)
# to calculate the element-wise quotient of matrices m1 and m4.
m4 = np.array([[4, 6, 2],\
[1, 1, 5]])
quot = ...
q_target = np.array([[0.25, 0.33333333, 1.5],\
[0., 1., 0.]])
np.testing.assert_allclose(quot, q_target)
# 5. Use NumPy functions to find the average of the entries in matrix m5.
# Do it again, but get the average per row
# Then, do it per column
m5 = np.array([[1,2],\
[0,1]])
avg = ...
row_avg = ...
col_avg = ...
assert(avg == 1)
np.testing.assert_allclose(row_avg, [1.5, 0.5])
np.testing.assert_allclose(col_avg, [0.5, 1.5])
(np.where, argmax)
# 1. Use a masking operation on matrix m1.
# We want masked to be a matrix whose entries are `False` where
# m1's entries are less than $6$, and `True` otherwise.
m1 = np.array([[1, 9, 5],\
[8, 0, 2]])
masked = ...
masked_target = np.array([[False, True, False],\
[True, False, False]])
np.testing.assert_array_equal(masked, masked_target)
# 2. Use `np.where` on matrix m1 to
# keep entries greater than or equal to 6
# and replace any entries less than 6 with 0.
replaced = ...
replaced_target = np.array([[0, 9, 0],\
[8, 0, 0]])
np.testing.assert_array_equal(replaced, replaced_target)
# 3. Use `np.argmax` on matrix m1 to find, per row,
# the index of the greatest element.
max_inds = ...
target_inds = [1,0]
np.testing.assert_array_equal(max_inds, target_inds)
Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when performing arithmetic operations. Frequently we have a smaller array and a larger array, and we want to use the smaller array multiple times to perform some operation on the larger array.
Take a glance through some of these simple examples!
import numpy as np
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = x + v # Add v to each row of x using broadcasting
print(y) # Prints "[[ 2 2 4]
# [ 5 5 7]
# [ 8 8 10]
# [11 11 13]]"
import numpy as np
# Let's say we have the following matrix A.
A = np.random.random((50, 100, 20))
# We can imagine A as 50 instances of (100,20) matrices.
# We have the following matrix B
B = np.random.random((20,40))
# We want to multiply each (100,20) instance of A by B,
# we can do this because the dimensions match up: 20 = 20
print((A @ B).shape)
# output should be of shape (50, 100, 40)
# each of the 50 (100,20) is multiplied by the (20,40) matrix to yield (100, 40)
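Broadcasting lines shapes up from the trailing dimension: two dimensions are compatible when they are equal or when one of them is 1. The sketch below, using made-up arrays, shows a row vector, a column vector, and a shape that fails to broadcast.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)            # shape (4, 3)
row = np.array([1, 0, 1])                  # shape (3,):   added to every row
col = np.array([[10], [20], [30], [40]])   # shape (4, 1): added to every column

print((x + row).shape)  # (4, 3)
print((x + col).shape)  # (4, 3)

# A length-4 vector does not line up with the trailing dimension of size 3.
try:
    x + np.array([10, 20, 30, 40])
except ValueError as err:
    print("broadcast failed:", err)
```

Reshaping the length-4 vector to (4, 1), as col does, is the usual way to make it broadcast along columns.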
Consider the following arrays:
A = np.array([[0,1,2],[3,4,5]]) # shape (2,3)
B = np.array([[1,1,1]]) # shape (1,3)
C = np.array([[-1,-1,-1],[1,1,1]]) # shape (2,3)
Define D as A - B using broadcasting.
Define E with shape (3,2) by reshaping C.
Define F with shape (2,2) by matrix multiplying D by E.
You can use the following to confirm your results look as they should!
assert(np.all(D == [[-1,0,1],[2,3,4]]))
assert(np.all(E == [[-1,-1],[-1,1],[1,1]]))
assert(np.all(F == [[2,2],[-1,5]]))
Let's try some of the important examples again, but with Tensorflow.
You'll find that for a lot of things, you can just replace np with tf. However, in some cases, the method might be named something else. Again, you should get used to searching the documentation for methods you'd like to use.
Hint: If you know the Numpy method you'd like to use, you can usually get away with googling <numpy method name> in tensorflow.
import tensorflow as tf
#1. Make a tf Tensor of zeros with shape (5,10).
zeros = ...
assert(zeros.shape == (5,10))
assert(tf.reduce_max(zeros) == tf.reduce_min(zeros) == 0)
# 2. Make a tensor of ones with shape (5,10), then slice it to get the first *column* of ones!
# (Hint: Try passing in `:` as one of the slice indices!)
first_col = ...
assert(first_col.shape == (5,))
# 3. Create a new dimension on the ones array.
# You should end up with shape $(5,10,1)$. (Check out `tf.expand_dims`)
expanded = ...
assert(expanded.shape == (5,10,1))
# 4. Cast a list `[1,2,3]` into a tensor. (Check out tf.convert_to_tensor)
arr = ...
...
assert(tf.reduce_all(arr == [1,2,3]))
# 5. Make a tensor of integers $0$ to $9$,
# inclusive, in shape $(2,5)$ using `tf.range` and `tf.reshape`.
incr = ...
...
assert(incr.shape == (2,5))
assert(tf.reduce_all(incr == [[0,1,2,3,4],[5,6,7,8,9]]))
import tensorflow as tf
# 1. Add two tensors of ones with shape $(5,10)$.
ones_1 = ...
ones_2 = ...
sum_ones = ...
assert(sum_ones.shape == (5,10))
assert(tf.reduce_max(sum_ones) == tf.reduce_min(sum_ones) == 2)
# 2. Multiply a tensor of ones with shape $(5,10)$ by the scalar two.
scaled = ...
assert(scaled.shape == (5,10))
assert(tf.reduce_max(scaled) == tf.reduce_min(scaled) == 2)
# 1. Use Tensorflow matrix multiplication (`tf.matmul` or using the `@` symbol)
# to calculate the matrix product of matrices m1 and m2.
m1 = tf.convert_to_tensor([[1, 2, 3],\
[0, 1, 0]])
m2 = tf.convert_to_tensor([[4, 6],\
[2, 1],\
[0, 5]])
mat_prod = ...
m_target = tf.convert_to_tensor([[8, 23],\
[2, 1]])
tf.debugging.assert_equal(mat_prod, m_target)
# 2. Use Tensorflow element-wise matrix multiplication (using the `*` symbol)
# to calculate the element-wise product of matrices m1 (above) and m3.
m3 = tf.convert_to_tensor([[4, 6, 2],\
                           [1, 0, 5]])
elem_prod = ...
e_target = tf.convert_to_tensor([[4, 12, 6],\
                                 [0, 0, 0]])
tf.debugging.assert_equal(elem_prod, e_target)
# 3. Use Tensorflow functions to find the average of the entries in matrix m5.
# Do it again, but get the average per row
# Then, do it per column
m5 = tf.convert_to_tensor([[1,2],\
[0,1]], dtype=tf.float32)
avg = ...
row_avg = ...
col_avg = ...
assert(avg == 1)
assert(tf.reduce_all(row_avg == [1.5, 0.5]))
assert(tf.reduce_all(col_avg == [0.5, 1.5]))
# 1. Use a masking operation on matrix m1.
# We want masked to be a matrix whose entries are `False` where
# m1's entries are less than $6$, and `True` otherwise.
m1 = tf.convert_to_tensor([[1, 9, 5],\
[8, 0, 2]])
masked = ...
masked_target = tf.convert_to_tensor([[False, True, False],\
[True, False, False]])
tf.debugging.assert_equal(masked, masked_target)
# 2. Use `tf.argmax` on matrix m1 to find, per row,
# the index of the greatest element.
max_inds = ...
target_inds = [1,0]
assert(tf.reduce_all(max_inds == target_inds))
Consider the following arrays:
A = tf.convert_to_tensor([[0,1,2],[3,4,5]]) # shape (2,3)
B = tf.convert_to_tensor([[1,1,1]]) # shape (1,3)
C = tf.convert_to_tensor([[-1,-1,-1],[1,1,1]]) # shape (2,3)
Define D as A - B using broadcasting.
Define E with shape (3,2) by reshaping C.
Define F with shape (2,2) by matrix multiplying D by E.
You can use the following to confirm your results look as they should!
assert(tf.reduce_all(D == [[-1,0,1],[2,3,4]]))
assert(tf.reduce_all(E == [[-1,-1],[-1,1],[1,1]]))
assert(tf.reduce_all(F == [[2,2],[-1,5]]))
Woohoo! You just completed your first assignment of CSCI1470! Submit your solutions to Gradescope for the autograder to test your NumPy/TensorFlow sections.