# Data Structures and Setup Help

## Table of Contents
0. [Logistics](#logistics)
1. [Data Structures](#ds)
2. [Wondering](#wondering)
3. [VSCode and Common Python Issues](#vscode)
4. [Testing your Python Programs](#testing)

## Logistical Comments <a name="logistics"></a>

Today's lecture is split into two parts. First, we'll talk about some different data structures you'll be using in 112. Then we'll do a demo installation of VSCode and spend some time helping everyone get set up and answering questions.

The first drill is being released after class today. It contains some questions about how to use the data structures from today's lecture. <B>This first drill will be due Friday at noon</B> and subsequent drills will be due at noon on Mondays and Fridays. Drills are due at noon because Tim will use the answers from drills to adjust lecture content as needed. (The site had erroneously reported last year's drill due dates!)

## A Few Good Data Structures <a name="ds"></a>

### Lists

Lists are used to represent <i>sequences</i> of items in a particular order. We can build and add to lists like this:

```
> story = ["It", "was", "a", "dark", "and", "stormy"]
> story.append("night")
```

We can access and modify particular elements of the list by <i>index</i>. The first element of the list is at index ```0```, so if we want to change the words above from past to present tense, we would use index ```1```:

```
> story[1]
"was"
> story[1] = "is"
> story
["It", "is", "a", "dark", "and", "stormy", "night"]
```

We can even add lists together (sometimes you'll hear people calling this "concatenation"):

```
> ["NARRATOR:"] + story
["NARRATOR:", "It", "is", "a", "dark", "and", "stormy", "night"]
```

And we can loop over the elements of a list:

```
> for word in story:
    print(word.upper())
IT
IS
A
DARK
AND
STORMY
NIGHT
```

### Dictionaries

Dictionaries (often called "Hashtables" or "Maps" in other contexts) are used to represent mappings between keys and values.

```
> status = {"brightness": "dark", "weather": "stormy"}
> status["time"] = "night"
```

We access elements by key rather than (as in lists) by index:

```
> status["weather"]
"stormy"
> status["weather"] = "pleasant"
> status
{"brightness": "dark", "weather": "pleasant", "time": "night"}
```

We can check whether the dictionary contains a key:

```
> "weather" in status
True
```

We can loop over the keys:

```
> for attribute in status:
    print(status[attribute].upper())
DARK
PLEASANT
NIGHT
```

#### A common issue

It's very common to get the following error when you're working with dictionaries in Python:

```
TypeError: unhashable type: 'list'
```

You might see something else there besides `list`. What's going on? The problem is that dictionary keys can only be specific types of data. Lists cannot be keys in a dictionary. We'll learn why later; for now, just know what this error means: the type you want to use as a key can't be used that way.

#### Something I wonder about dictionaries

Would we ever want a dictionary where the keys are numbers? (Would that be any different from a list?)

### Sets

Sets store <i>unordered</i> collections of elements:

```
> night = {"dark", "stormy"}
```

We can add elements. Adding an element that is already present leaves the set unchanged:

```
> night.add("frightening")
> len(night)
3
> night.add("stormy")
> len(night)
3
```

We can test whether elements are present:

```
> "frightening" in night
True
> "inauspicious" in night
False
```

Like with lists and dictionaries, we can loop over the elements:

```
> for quality in night:
    print(quality.upper())
FRIGHTENING
DARK
STORMY
```

We can combine sets:

```
> night | {"inauspicious"}
{"dark", "stormy", "frightening", "inauspicious"}
```

We can convert a list to a set, and vice versa:

```
> monster = ["very", "very", "scary"]
> set(monster)
{"very", "scary"}
> list(set(monster))
["very", "scary"]
```

Sets are very useful when we care about which elements are present, but not about their order.
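
To make "present, but not ordered" concrete, here is a small sketch comparing two lists with the sets built from them. The names `first` and `second` and their contents are made up for this illustration.

```python
# Illustrative sketch: the lists and names here are made up for this example.
first = ["dark", "stormy", "dark"]
second = ["stormy", "dark"]

# As lists, these differ: order and duplicates both matter.
print(first == second)            # prints False

# As sets, they are equal: only which elements are present matters.
print(set(first) == set(second))  # prints True

# A set keeps one copy of each element, so its length counts distinct items.
print(len(set(first)))            # prints 2
```

This is the same behavior we saw when adding `"stormy"` a second time: the set's length did not change.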

#### Something I wonder about sets

If I can loop over the elements of a set, but sets are unordered, in what order will the loop visit the elements?

## Some other things I wonder <a name="wondering"></a>

When should we use lists, hashtables, and sets? Let’s say we’re looking at the text of Frankenstein again and want to answer a few questions. Which data structure would we use in order to compute each of the following?

* the number of unique non-capitalized words in Frankenstein;
* all of the characters in Frankenstein, ordered by when they appear; and
* the longest word in Frankenstein.

## VSCode and Common Python Issues <a name="vscode"></a>

We did a demo of a fresh installation of Python and VSCode. See the lecture capture for more information, but I've put some common issues and fixes in the notes below.

### Running your programs at the terminal

You can run your programs from outside VSCode via the terminal (which you may also hear me call the "command line"). VSCode gives you a terminal window under your code file, but you can also get a terminal through various operating-system specific means. On MacOS, I can find it under ```Applications``` and then ```Utilities``` in the Finder.

Every terminal window will have a "current directory", which is the folder it's currently browsing. Right now, I have a Python file called ```files_prep.py``` (from preparing this lecture!) in my ```teaching/112/lectures/sep13``` folder. But if my terminal isn't browsing that folder, it won't be able to see the Python file:

```
% python3 files_prep.py
/usr/local/bin/python3: can't open file '/Users/tim/repos/teaching/112/learning/files_prep.py': [Errno 2] No such file or directory
```

This is common when, for instance, VSCode thinks the directory you want to be working in is different. Here, I've previously told VSCode's explorer that I wanted to work in a separate, `learning` folder!
I can fix the problem by just changing directory:

```
% cd ../lectures/sep13
% python3 files_prep.py frankenstein.txt
the
```

The most common word in Frankenstein is "the".

### Letting your programs take arguments from the terminal

The notes from last week had this block of code at the end, saying that we added it so we could "use the program as a script":

```
if __name__ == '__main__':
    import sys
    print(most_common(count_words(open(sys.argv[1], 'r').read())))
```

We'll talk about why the `if __name__ == '__main__':` is there a little later. For now, just know that it's standard Python practice to put your main, top-level code inside this `if` statement.

Importing `sys` loads some helper functionality in from Python's library. Most helpfully, it lets us access the <i>arguments</i> that the program was called with via the `sys.argv` list. So, when I ran `python3 files_prep.py frankenstein.txt` at the terminal, `sys.argv[1]` captured the filename `frankenstein.txt`. (`sys.argv[0]` holds the name of the script itself.) This is incredibly useful if you want to write a program you can reuse on different inputs without having to edit it.

### Python 2 versus Python 3

It turns out that there are two major versions of Python currently in use: version 2 and version 3. These versions are different enough that you need to be careful and run the right version: some systems have both installed! For example, here's the state of Tim's laptop:

```
% python --version
Python 2.7.16
% python3 --version
Python 3.7.3
```

Make sure you run the right version; <B>this class uses Python 3</B>, which is why, in lecture, Tim is usually careful to run `python3` and, when he isn't, hilarity ensues.

## Testing your Python Programs <a name="testing"></a>

This lecture has been pretty full, so we'll talk about testing next time. For now, if you see anything in the class referring to "PyTest" or "pytest", just know that this is about a testing tool we'll cover soon. Until then, you can use the approach from the Python review.
The slides and recording are [here](https://edstem.org/us/courses/13110/discussion/604590). Don't worry about trying to `import pytest`; you can just make tester functions using `assert` like this:

```
def test_most_common():
    import sys
    assert 'the' == most_common(count_words(open(sys.argv[1], 'r').read()))
```

which, when called, will complain if "the" suddenly stops being the most common word in Frankenstein.
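
The tester above needs a filename passed at the terminal. As a sketch of the same `assert` style that runs on its own, here is a version that uses a tiny hand-written input instead. The `count_words` and `most_common` definitions below are simplified stand-ins written only for this example; the versions from last week's notes may differ.

```python
# Simplified stand-ins for last week's functions, written only for this
# sketch; the versions in the course notes may differ.
def count_words(text):
    """Return a dictionary mapping each word in text to how often it appears."""
    counts = {}
    for word in text.split():
        # .get(word, 0) gives the current count, or 0 if the word is new.
        counts[word] = counts.get(word, 0) + 1
    return counts

def most_common(counts):
    """Return the key with the largest count."""
    return max(counts, key=counts.get)

def test_most_common_small():
    # A tiny hand-written input, so the test does not need a file.
    counts = count_words("the night was dark and the storm was loud and the end")
    assert counts["the"] == 3
    assert 'the' == most_common(counts)

if __name__ == '__main__':
    test_most_common_small()
    print("all tests passed")
```

If an `assert` fails, Python stops with an `AssertionError` and the final message is never printed.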