# Password cracker problem proto-handout

This document contains notes on how to get started with and run a test problem that Nick is planning to incorporate into project 1. There's no handout yet, but the hope is that this should be sufficient to get started.

## Initial setup (TA testing only)

The stencil version of this project uses Python. To set it up:

1. Clone [this repository](https://github.com/brown-csci1660/cryptography)
2. Go to the `pwfind` directory. For this test, do everything from this directory
3. Do the environment setup according to [these instructions](https://hackmd.io/@cs1660/dropbox-setup-guide#Cloning-the-stencil-project). Stop at the "working with the stencil code" section and then come back here

## Premise

The goal of this problem is to get students thinking about mechanisms to store passwords and about different ways to search the input space of possible passwords.

To do this, we have a stencil that creates a simple password storage "database" and a login program (`login`) to test the act of users logging into the system. At the start, the database and login program store the passwords in plaintext. Students will implement two password storage mechanisms:

- Storing a hashed version of the password
- Storing a salted hash

At each stage, students will implement both the login and generation mechanism, then build a program to crack the password database.

:::warning
To make this attack feasible within reasonable time, Blue University has the following password policy:
- All passwords are 4 characters long
- Passwords are composed of lowercase ASCII letters and numbers
:::

## Getting started

First, let's get a sense of what you'll be attacking:

1. Generate a password database that stores passwords using the `plain` method. This method stores the passwords **in cleartext** (which is bad!).
   We'll use it as a demonstration of how everything works:

   ```
   $ ./generate_database --method plain --users 10 --secrets secrets-part1.txt part1.db.json
   ```

   You'll use this command a bunch, so let's break down what's happening here:

   - `--method`: The password storage method for the database. We'll use more methods shortly.
   - `--users 10`: The number of users to generate
   - `part1.db.json`: The password "database", similar to how a real system might store passwords. This is the file you'll be "attacking"!
   - `--secrets secrets-part1.txt`: Writes the cleartext passwords to a separate file. You won't use this for the attack, but you can use it to check your work

2. Open up the password database file `part1.db.json`. You should see a file like this (note: your usernames/passwords will be different!):

   ```javascript
   {
     "method": "plain",
     "users": {
       "aenrc": {
         "password": "txfa"
       },
       "bpstx": {
         "password": "ipn2"
       },
       "grnpa": {
         "password": "mhfe"
       },
       "hdrym": {
         "password": "xxbg"
       },
       . . .
   ```

   Each block in the `users` section gives the information stored about that user. Here, it's just the password, and it's unencrypted! (Yes, the passwords are only four characters long, more on that in a minute...)

3. Now that we have a password database, we can try to log in. To simulate this process, we've provided a program `login` that takes in a database, username, and password, and simulates a login. Try to run it using one of the usernames/passwords **from your database**:

   ```
   # Replace <username>, <password> with credentials from your database
   $ ./login part1.db.json <username> <password>
   Success!
   ```

The real goal, though, is to find the passwords without knowing them ahead of time. For the `plain` method, this is trivial, but it won't be after we implement a more secure password storage method. We'll do this by *cracking* the database to discover the passwords.
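To make the `plain` method concrete, here is a minimal sketch of what a login check and a "crack" amount to when passwords are stored in cleartext. It is not the stencil's code: the function names `check_plain_login` and `crack_plain` are hypothetical, and the only thing assumed about the database is the JSON shape shown in the excerpt above.

```python
import json


def check_plain_login(db, username, password):
    """Return True if `password` equals the cleartext password stored for `username`."""
    if db.get("method") != "plain":
        raise ValueError("this sketch only handles the 'plain' method")
    entry = db["users"].get(username)
    return entry is not None and entry["password"] == password


def crack_plain(db):
    """'Cracking' a plain database is just reading the stored passwords back out."""
    return {user: entry["password"] for user, entry in db["users"].items()}


# A tiny database in the same shape as the excerpt above.
example_db = json.loads("""
{
  "method": "plain",
  "users": {
    "aenrc": {"password": "txfa"},
    "bpstx": {"password": "ipn2"}
  }
}
""")

print(check_plain_login(example_db, "aenrc", "txfa"))  # True
print(check_plain_login(example_db, "aenrc", "nope"))  # False
print(crack_plain(example_db))
```

Because nothing protects the stored values, the attacker does no searching at all. That changes once the database stores hashes instead.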
### Cracking passwords

We've provided a starter password cracking file called `pwfind`. We've implemented the part to "crack" `plain` passwords; you'll do the more interesting parts. You can run `pwfind` like this:

```shell
# ./pwfind <database file> <output file>
$ ./pwfind part1.db.json found-part1.txt
```

This creates the file `found-part1.txt` with the recovered passwords. Open up this file and take a look at how it works.

## Your task

You'll implement two more methods of password storage (described further below):

1. Method `sha1-nosalt`: stores the SHA1 hash of the password
2. Method `sha1-salt4`: stores the SHA1 hash of the password + a 4-byte random salt

To demonstrate what we mean, let's generate a database of 100 hashed passwords:

```
$ ./generate_database --method sha1-nosalt --users 100 --secrets secrets-part2.txt part2.db.json
```

Take a look at the file `part2.db.json`: these aren't the passwords, but a hashed representation! Your job is to recover the original passwords from the database. (You can check your work by looking at `secrets-part2.txt`.)

To do this, you should do the following:

1. Extend the `login` program to successfully authenticate a user when using a database with this storage method. This is to make sure you understand how this method of password storage works
2. Extend your cracking program `pwfind` to crack this type of password database. See the next section for constraints on how this works, how many passwords you should crack, etc.

### Steps

1. Extend `login` to support a database stored with the `sha1-nosalt` method
2. Extend `pwfind` to crack a database of **100** passwords stored with the `sha1-nosalt` method

   **Performance requirement for step 2**: the time to crack passwords MUST NOT scale with the number of passwords in the database
3. Extend `login` to support the `sha1-salt4` method
4. Extend `pwfind` to crack a database of **5** passwords stored with the `sha1-salt4` method

   **Design question for step 4**: For this part, can you retain the same performance requirement as in step 2? Why or why not?
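As a starting point for thinking about the search space, here is a rough sketch using only Python's standard library. Under the password policy there are 36^4 = 1,679,616 candidate passwords. The helper names (`candidates`, `sha1_hex`, `build_table`) are hypothetical, and the encoding choices are assumptions: the sketch hashes the UTF-8 password bytes, hex-encodes the digest, and prepends the salt, while the stencil's actual database format may differ. The demo runs with 2-character passwords to stay fast.

```python
import hashlib
import itertools
import string

# Policy alphabet: lowercase ASCII letters and digits (36 symbols).
ALPHABET = string.ascii_lowercase + string.digits


def candidates(length=4):
    """Yield every password of the given length over the policy alphabet."""
    for chars in itertools.product(ALPHABET, repeat=length):
        yield "".join(chars)


def sha1_hex(password, salt=b""):
    """Hex SHA-1 of salt + password.

    Assumption: the salt is prepended to the UTF-8 password bytes.
    The stencil may combine them differently.
    """
    return hashlib.sha1(salt + password.encode("utf-8")).hexdigest()


def build_table(length=4):
    """Map each unsalted hash back to the password that produced it."""
    return {sha1_hex(pw): pw for pw in candidates(length)}


# Demo with length 2 (36**2 = 1296 candidates); the policy length is 4.
table = build_table(length=2)
print(len(table))             # 1296
print(table[sha1_hex("a7")])  # a7

# The same password hashed with a salt gives a different digest.
print(sha1_hex("a7") == sha1_hex("a7", salt=b"\x01\x02\x03\x04"))  # False
```

With a table like this, recovering one unsalted password is a single dictionary lookup, so it is worth comparing the cost of building the table once against the cost of hashing every candidate separately for each user.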