# Homework 2022-01-11 ## Algorithms and Performance We faced the following question: suppose we have a dictionary from security ids to account ids. Here is an example. For clarity, we will use letters to represent account ids and numbers to represent securities: ```python org = { 1: ['a'], 2: ['a'], 3: ['a', 'b'], 4: ['a', 'b'], 5: ['b', 'c'], 6: ['c'] } ``` We would like to hit the pricing-data endpoint, which will return analytics for any number of securities in a given account on a given date. 1. Review: Invert this dictionary, such that it is a map from accounts to securities they hold. Call the new dictionary `inv` ```python= inv = {} for key, value in org.items(): for v in value: if v not in inv: inv[v] = [] inv[v].append(key) ``` 2. Review: Write a program which will return the account with the most securities in it. For Tab and Nick, I recommend attempting to independently produce (ie without looking it up) something like Garrett's first answer. 3. Here is one possible implementation of problem 2: ```python= def go(k): return len(inv[k]) max_account = max(inv,key=go) ``` Is this implementation faster than your implementation? Why or why not? 4. We go and hit pricing-data for that account. Suppose all their securities have the information we need for every security. Which should be the next account we hit? 5. Write a program which hits pricing-data for the minimal number of accounts. You can do this with a mocked function if you'd like: ```python def get_pricing_data(account, securities): return {s:True for s in securities} ``` 7. In reality it might be that we need to hit multiple accounts to cover the same security. Write a program which does this minimally. You can do this with a mocked function if you'd like: ```python def get_pricing_data(account, securities): return {s:not (ord(account) + s) % 2 for s in securities} ``` NB: even after running through all account/security combos you may not have the data you want for all securities 8. How well does your algorithm survive contact with `semaphore`? Suppose now that we want the fastest script by the clock. You can practice with this mocked function if you'd like (and probably should use a larger list of securities/accounts, like the one we get from running the script; if so, remove the `ord` function below): ```python import time import random def get_pricing_data(account, securities): time.sleep(random.random() * len(securities)/25) return {s:not (ord(account) + s) % 2 for s in securities} ``` ## Syntactic Sugar NB: This section is somewhat academic and is primarily aimed at letting you "peek under the covers" of what the language is doing. You will use these tools a lot, but you can use them for very long without knowing the internals, and so you shouldn't worry too much about studying this section This week we observed that `set([1,2,3]) - set([3,4,5]) = set([1,2])`. This is a different use of `-` than we normally see, where `4-3=1`. 1. What are other examples of things you can use `-` on? ```python= import pandas as pd dat = [{"a":1},{"a":4}] df = pd.DataFrame(dat) dat2 = [{"a":4},{"a":-1}] df2 = pd.DataFrame(dat2) df-df2 ``` 3. This process is called "Operator Overloading". In python, these are implemented on "classes", which you will not have to worry about for years. In particular, `-` is defined by an object's `__sub__` method, while ie `+` is defined by the `__add__` method. It is important to understand that the `+` you see in python is very different from the `+` you see in normal math. 1. One obvious example is that in python, `+` is not _commutative_ - that is, it is not generally true that `a+b=b+a`. We will look at some extreme examples below, but first, try to come up with example values for `a` and `b` such that `not (a+b==b+a)`. 2. In what way is `datetime.date(2021,2,1) - datetime.date(2021,1,1)` not like normal subtraction? 3. What is the result of `[1,2] + {3,4}`? 4. What is the result of the following program: ```python= x = [1,2] x += {3,4} x ``` 5. Why?!?! 6. "Operator Overloading" is an example of a wider phenomena called "Syntactic Sugar". "Syntactic Sugar" is a phrase used to describe ways that programming languages translate human readable input into (usually) more verbose but equivalent programs. For instance, writing `"this"+"kiss"` is sytactic sugar for `"this".__add__("kiss")`. It is often useful to think about expressions and what they "desugar" into. For instance, while (as we saw above) the following is not quite true, to a first approximation you can read `x+=1` as `x=x+1`. 1. Suppose that `xss=[[1,2,3],[4,5],[6]]`. Write an expression that returns `[1,2,3,4,5,6]` using nested for loops. 2. Write an expression that returns `[1,2,3,4,5,6]` using list comprehension. 3. Write an expression that returns `[1,2,3,4,5,6]` using `sum`. 4. What are other examples of Sytantic Sugar in Python?