# Lab 10: Understanding and Testing Mutation
In this lab, we will continue with two ideas that we introduced in lecture: understanding how data mutation works and testing functions that mutate data.
## An Example of Sharing Data: Collaborative Editing
When a program is maintaining data over time, sometimes we want that data to be shared across different parts of the program and sometimes we do not. Decisions about what should or should not be shared affect how we set up our data and write programs.
### Dataclasses, Examples, and Memory Diagrams
Consider a tool (like Google Docs) that allows people to create documents and share them with other people. Here's a basic dataclass for Documents.
```python
from dataclasses import dataclass

@dataclass
class Document:
    title: str
    contents: str

hwk1 = Document("Homework 1",
                "Write a program")
hwk2 = Document("Homework 2",
                "Run a program")
```
When we call something like `Document("Homework 1", ...)`, we are using the `Document` *constructor* to create a new piece of `Document` data.
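To see that each constructor call creates a new piece of data, here is a small sketch. The variable names `draft_a`, `draft_b`, and `alias` are made up for illustration. `==` compares field values, while `is` asks whether two names refer to the same piece of data in memory.

```python
from dataclasses import dataclass

@dataclass
class Document:
    title: str
    contents: str

# Two separate constructor calls create two separate pieces of data,
# even when the field values match.
draft_a = Document("Notes", "First draft")
draft_b = Document("Notes", "First draft")

# A plain assignment does not call the constructor, so no new data is created.
alias = draft_a

print(draft_a == draft_b)  # True: same field values
print(draft_a is draft_b)  # False: two different pieces of data
print(alias is draft_a)    # True: two names for the same data
```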
Another basic dataclass captures people who are allowed to edit documents:
```python
@dataclass
class Editor:
    name: str
    docs: list  # a list of Documents

amaris = Editor("amaris", [hwk1])
```
Imagine that the CS111 homeworks team is preparing assignments that are in different stages of readiness. Some documents are ready for sharing with others, but some stay private until the creators are ready to share them.
**Task 1:** Copy the above code fragments to a Python file. Add variables and definitions to set up the following set of editors and documents with their corresponding sharing settings:
- Michael as an editor with no documents
- Nya as an editor who is also working on homework 1 (you should use the `hwk1` variable from the examples above as part of constructing the Nya `Editor`, just like the `amaris` example above)
- Alex as an editor who is working on homework 2 (which is not yet shared with anyone)
**Task 2:** Draw the memory diagram for the current collection of Editors and Documents. Use this [spreadsheet template](https://docs.google.com/spreadsheets/d/10AP2QPY24iN0OFdT-t6VeVY4EM7My4-y8Z2nPQMLDtU/edit?usp=sharing) for your memory diagram. Make a copy of the template to edit it.
---
:::spoiler **What's a memory diagram???**
:::info
Consider the following program:
```python
from dataclasses import dataclass

@dataclass
class Environment:
    name: str
    size: int

@dataclass
class Webkinz:
    name: str
    species: str
    home: Environment

prairie = Environment("Prairie", 8)
forest = Environment("Forest", 10)

sparkle = Webkinz("Sparkle", "Unicorn", forest)
brownie = Webkinz("Brownie", "Horse", prairie)
```
We could draw our memory diagram like this, where each box represents our data in memory. Each box resides in some location in memory (written as locXXXX). When a variable or field references another piece of data in memory, we use the location to identify the piece of data.
If we want, we can highlight the references between variables/fields and data by also drawing arrows that connect a reference to a location with the actual piece of data. The arrows, however, mask a key piece of what is happening: that data live in spots in memory with unique numeric identifiers.
![Screenshot 2024-11-13 at 5.31.45 PM](https://hackmd.io/_uploads/ryp1YozMJg.png)
An alternative way of drawing memory diagrams is to use the [spreadsheet template](https://docs.google.com/spreadsheets/d/10AP2QPY24iN0OFdT-t6VeVY4EM7My4-y8Z2nPQMLDtU/edit?usp=sharing) we saw in class. The same diagram as above would be represented like this using the spreadsheet:
![Screenshot 2024-11-13 at 5.32.41 PM](https://hackmd.io/_uploads/SyeQKjMGyg.png)
**It is critically important that names from the directory do not appear in the heap area/inside the data**. Using the names suggests that later updates to the variable values (like ```prairie = ...```) would affect the data in the heap, but this is NOT how programming languages work.
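
A short sketch of why names stay out of the heap, reusing the classes from this example (with `size` written as an `int`; the name `old_home` is made up for illustration). Reassigning `prairie` changes only the directory entry, so the data that `brownie.home` refers to is untouched.

```python
from dataclasses import dataclass

@dataclass
class Environment:
    name: str
    size: int

@dataclass
class Webkinz:
    name: str
    species: str
    home: Environment

prairie = Environment("Prairie", 8)
brownie = Webkinz("Brownie", "Horse", prairie)

old_home = brownie.home             # remember which data brownie.home refers to
prairie = Environment("Tundra", 3)  # update the directory entry for prairie

print(brownie.home is old_home)  # True
print(brownie.home.name)         # Prairie
print(prairie.name)              # Tundra
```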
:::
___
### *CHECKPOINT:* Call over a TA once you reach this point. Note that the TA can ask either partner to answer, so please make sure both partners understand the diagram.
___
**Task 3:** Extend your memory diagram to show Amaris being the sole editor of a *copy* of Alex's current homework 2 document (as well as the `hwk1` document from the previous tasks).
**Task 4:** Add some code to your file that would create the same memory contents as what you just drew to give Amaris a copy of homework 2.
Amaris and Nya have been working on homework 1, but their idea isn’t working out. It’s such a mess that they have decided to just throw away their old attempt and start over.
**Task 5**: Here are two ways to “start over” on homework 1. What effect will each have on the memory diagram? After taking each approach (separately), what contents will Python produce if we ask to see Nya’s version of hwk1? Duplicate your Task 3 memory diagram and highlight the changes for each approach.
```python
# Approach #1
hwk1 = Document("Homework 1", "This is much better")

# Approach #2
hwk1.contents = "This is much better"
```
One of the most subtle concepts to internalize when starting to work with updating data is the difference between updating **variables** and updating **contents of data structures**. The previous task highlights the differences.
**Task 6:** To help you internalize these details, please talk through the following questions with your partner.
1. If we update what `hwk1` refers to using approach 1, which editors will see the changes?
- [ ] Anyone that already had access to the old hwk1
- [ ] Anyone that gets access to hwk1 in the future
- [ ] Both past and future editors of hwk1
2. If we update what `hwk1` refers to using approach 1, will there be any way to access the original `hwk1` contents?
3. If we update the contents of hwk1 using approach 2, which editors will see the changes?
- [ ] Anyone that already had access to the old hwk1
- [ ] Anyone that gets access to hwk1 in the future
- [ ] Both past and future editors of hwk1
4. If we update the contents of `hwk1` using approach 2, will there be any way to access the original `hwk1` contents?
---
### *CHECKPOINT:* Call over a TA once you reach this point.
---
### Functions for Key Document Tasks
Creating, sharing, editing, and copying documents happen so often that it would make sense to have functions corresponding to those tasks.
**Task 7:** Write a function `create` that takes an `Editor` and a document title. The function creates a new document (with empty contents), stores it in the editor's list of docs, then returns the new document.
**Task 8:** Write a function `edit` that takes a `Document` and a string (for the new content) and replaces the document contents with the new text.
**Task 9:** Write a function `share` that takes a `Document` and an `Editor` and adds the document to the editor's list of docs.
**Task 10:** Write a function `copy` that takes a `Document` and returns a copy of the document. The copy has the same title and contents as the original.
**Task 11:** Look back at your four functions: which ones return something and which ones don't? Focus on the ones that don't return anything. Do they "accomplish" anything? Can you articulate what a function that doesn't return anything has to "do" in order to be of use? Write down your answer. (*Hint: think about our mutation examples from lecture.*)
___
### *CHECKPOINT:* Call over a TA once you reach this point.
___
### Testing
Now, we have to figure out whether the functions you've written provide the sharing conditions that we had in mind. In particular, think about the `edit`, `share`, and `copy` functions.
Your goal is to fill in this template of a testing function:
```python
def test_docs():
    "SETUP"
    # your scenario setup goes here

    "PERFORM MODIFICATIONS"
    # calls to your functions go here

    "CHECK EFFECTS"
    # assertions go here
```
**Task 12:** Identify an initial scenario of editors and documents against which you will run your tests. Create the corresponding data in the "SETUP" area of `test_docs`.
**Task 13:** Write down (in prose) a list of effects that these operations **should** have. Put these in a multi-line string comment above your `test_docs` function in your file:
```python
"""
Expected effects:
- If we create a doc titled t for an editor, the editor will have a document
  with title t in their docs list
- YOU ADD MORE ITEMS HERE
"""
```
**Task 14:** Write down a list of effects that these operations should **not** have. For example, "if one document is edited, then the contents of another document should not change". Add these in a separate comment like the one above in your file.
**Task 15:** Make a collection of calls to the `edit`, `share`, and `copy` functions in the "PERFORM MODIFICATIONS" section. These calls should be sufficient to perform the checks that you outlined in your lists of effects.
**Task 16:** Turn your expected and unexpected effects into pytest assertions, putting those in the "CHECK EFFECTS" section of the testing function.
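
As a reminder of how the three sections fit together, here is a minimal sketch of a test for a mutating function in an unrelated setting. The `Playlist` class and `add_song` function are hypothetical and not part of the lab. Note that the test checks both an effect that should happen and one that should not.

```python
from dataclasses import dataclass

@dataclass
class Playlist:  # hypothetical class, not part of the lab
    name: str
    songs: list

def add_song(p: Playlist, song: str):
    "Hypothetical mutating function: adds song to p and returns nothing"
    p.songs.append(song)

def test_playlists():
    # SETUP
    road_trip = Playlist("Road Trip", ["Song A"])
    study = Playlist("Study", ["Song B"])
    shared = road_trip  # a second name for the same playlist

    # PERFORM MODIFICATIONS
    add_song(road_trip, "Song C")

    # CHECK EFFECTS
    assert road_trip.songs == ["Song A", "Song C"]  # expected effect
    assert shared.songs == ["Song A", "Song C"]     # visible through the other name
    assert study.songs == ["Song B"]                # effect that should NOT happen
```

Because `add_song` returns nothing, the assertions inspect the data after the call instead of checking a return value.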
___
### *CHECKPOINT:* Call over a TA once you reach this point.
___
### SRC
Through this lab, you and your partner have written functions that support collaborative editing. Collaborative editing is especially important in court cases, where legal teams work together to draft statements, arguments, and strategies. Now, let's learn about a recent addition to many legal teams: artificial intelligence (AI).
**Task 17:** With your lab partner, read this [article](https://pro.bloomberglaw.com/insights/technology/ai-in-legal-practice-explained/#legal) from Bloomberg Law explaining how AI is used in legal practice.
**NOTE:** *The linked article heavily advertises Bloomberg Law's legal software. We don't endorse the products advertised and are only using the article to give background on how AI is used in the legal sphere.*
**Task 18:** Take a moment to discuss the ethical considerations when using AI to answer questions about an open case, help draft court statements/arguments, or analyze evidence. Do you think artificial intelligence should be used in legal cases? If you had a lawyer, would you feel comfortable with their use of AI? Why or why not?
___
### *CHECKPOINT:* Call over a TA once you reach this point.
___
### Global variables
In a large course staff like we have for 111, having a separate variable for each TA/Prof makes our code a bit hard to manage. It therefore makes sense to make a list to hold all of the editors on our documents, as follows:
```python
all_staff = [Editor("milda", [hwk1, hwk2]),
             Editor("rachel", [hwk1]),
             Editor("sofia", [hwk2])]
```
To help us keep track of all the documents we are working on, we want to make two additions to our current code: we want to be able to ask which staff have access to a particular document, and we want to maintain a count of how many documents we have.
Let's start with the latter. We'll set up a new variable to maintain the count of documents that we have created. We'll also modify our `copy` function to update this count.
**Task 19:** Create a global variable called `num_docs`. Then write a function called `copy2`, a modified version of your `copy` function, that also increments `num_docs` whenever a document is copied.
:::info
**A note on testing global data**
We're not asking you to write a test for `copy2`. This is because writing
```python
import pytest
from [python code file] import *
```
pulls in all of the data from the Python code file by running that file and updating program memory (directory and heap) so that the testing file can access it. Unfortunately, scoping makes it difficult to test the effects of mutating functions on data that was declared in a different file. Just as a function has a "local context" that keeps track of the temporary directory entries that exist while the function runs, the code file and the testing file each have their own context. When a function in the code file assigns a new value to a global variable, only the code file's directory entry changes, so the testing file does not see the update.
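
One way to see the issue without creating two files is to simulate it. The sketch below is an illustration only and uses features beyond this course: it builds a stand-in module in memory (the name `code_file` and the function `bump` are made up), then copies the names into the current file the way `from code_file import *` would.

```python
import types

# Build a stand-in for the code file (normally a separate .py file).
code_file = types.ModuleType("code_file")
source = """
num_docs = 0

def bump():
    global num_docs
    num_docs = num_docs + 1
"""
exec(source, code_file.__dict__)

# `from code_file import *` copies each name's current value into this file.
num_docs = code_file.num_docs
bump = code_file.bump

bump()  # assigns a new value to num_docs inside code_file only

print(num_docs)            # 0: this file's entry still refers to the old value
print(code_file.num_docs)  # 1: the code file's variable was updated
```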
If you take CS200, you'll learn a new style of programming (object-oriented programming) and a new style of testing to go along with it, that is a bit more strict with the scoping of data.
:::
**Task 20:** The following code computes a list of all staff who can access a specific document.
```python
def who_has(doc) -> list:
    "Produce list of editors with access to given doc"
    can_access = []
    for editor in all_staff:
        if doc in editor.docs:
            can_access.append(editor)
            # return alternative 1
        # return alternative 2
    return can_access
```
The `who_has` function aligns its `return` with the `for` loop. We could instead have indented the `return` to the position of either comment. Discuss with your partner what each alternative position would produce compared to the original `return` position.
___
### *CHECKPOINT:* Call over a TA once you reach this point.
___
> Brown University CSCI 0111 (Fall 2024)
> Feedback form: tell us about your lab experience today [here](https://forms.gle/52Fi9HdMFRgW1vD67)!