Try   HackMD

Lab 6: Priority Queues and Comparators

Setup

You can grab the stencil code from this GitHub Classroom link.

Managing Different Orderings within Lists

We've looked at LinkedList vs ArrayList, where the differences are about the runtime of key operations. When you are developing programs, you choose between these based on the frequency with which you'll perform key list operations.

Task: The hashmap implementation from lecture and homework uses an array of linked lists to organize the KVPairs. Talk through the following questions about this configuration:

  1. Why do we use an array for the outer organization of lists-of-KVpair? Could we replace that with a LinkedList instead? What would be the positive and negative tradeoffs for doing so?
  2. What do we use LinkedLists instead of ArrayLists within the array? What would be the positive and negative tradeoffs for making that change?

Note that this is a good review question for the midterm, as it checks your understanding of each of linked lists, array lists, and hashmaps.

Check your answers (after working on the task)
  1. Hashmaps provide constant-time access to values. Achieving constant-time requires that we get from a key to the corresponding list of KVPair in constant time. If we replaced the array with a LinkedList, we would not have a constant-time way to get from an index (for a key) to the list of pairs that contains that key.
  2. New KVPairs get added to and removed from the lists within the array slots every time a user adds or removes items from the hashmap. These lists are not of fixed size, and we never get items from those lists by position (the two benefits of ArrayLists). Thus, ArrayLists offer no benefits, whereas LinkedLists provide fast support for the operations that we do care about.

In short, neither proposed alternative offers any benefits.

Orderings within lists

In general, lists are a flexible and valuable data structure for maintaining linear order among items. In practice, however, there are different orderings themselves that are useful in different situations. Here are two examples:

  • A fast-food restaurant could use a list to manage the order in which to prepare customers' meals.
  • A web browser could use a list to track the order in which you visited pages.

Let's look at the differences in how lists are used in these settings.

Task: Assume the restaurant has two meals waiting to be prepared:

toPrepare = [Meal("Kathi", "egg salad"),
             Meal("Milda", "turkey sandwich")]

Discuss with your partner:

  1. Kenta orders soup. Where should that meal be placed within the list?
  2. The cook is ready to prepare the next meal. Which one should they work on next? Where is it in the list?

Task: Assume you are browsing the CS200 website. You started on the homepage, then went to the lectures page, then to the notes for lecture 14. When you are done with lecture 14, you hit the back button on your browser then visit lecture 15. The browser uses a list to maintain which pages you came from so that it can retrieve previous pages when you hit the back button. For example:

   cameFromPages = [lec14, lectures, home]
   // hit the back button
   cameFromPages = ___________
   // click on lecture 15
   cameFromPages = ___________   
  1. Fill in the blank lines with the content of the cameFromPages list after each user action.
  2. State a more general rule about where to place items in the cameFrompages list: where are new pages inserted? where does pressing the back button take pages from?

Task: Fill in the following table to summarize what we have observed about these two uses of lists:

Restaurant Back Button
Where are new items added (front or end)?
Where is the next item to process (front or end)?
Where are items deleted from (front or end)?

The restaurant data structure, in which the first item added is the first item processed, is called a Queue. The back button data structure, in which the last (most recent) item added is the first item processed, is called a Stack.

While stacks and queues are both forms of lists, their usage patterns are sufficiently common that computer scientists and programmers give them their own names, including names of the core operations on them. Java, like many programming languages, provides versions of these built in (we're giving you the links here for reference you don't need them right now).

Java's Queue interface

Java's Stack class

Side Note: Why is Queue and interface and Stack a class? Because we've given you an introductory view of Queues as based on lists here, whereas they are actually richer in practice. We'll return to this later in the course.

Priority Queues

While regular queues are used to process requests/issues/etc in the order that they arrive (add to the end, take from the front), in many settings we want a variation that considers items in order of priority. Think of a hopsital, for example, patients are treated in order of urgency, with arrival time a secondary criterion. This situation needs what we call a Priority Queue.

In this part of lab, you'll learn to work with Java's built-in PriorityQueue class, which implements the Queue interface. Later in the course, we'll write a different implementation of priority queues.

Defining "Priority"

What determines the priority among objects? If our objects are just numbers, we can use the natural ordering on numbers. Here's the definition of a basic priority queue on numbers:

import java.util.PriorityQueue;

PriorityQueue<Integer> numsLowToHigh = new PriorityQueue<Integer>();

More interesting cases are when the objects either do not inherently have an ordering or we want to use a non-default ordering. In this case, we define a Java Comparator, which is an object that defines how to compare or order other objects.

The essence of a comparator is a function that takes two objects of the type to be compared and returns an integer indicating the relationship between the arguments:

  • A positive return value indicates that the first object is larger,
  • a negative indicates that the second object is larger, and
  • zero indicates that the objects should be considered equal.

Recent versions of Java have added a notation for defining functions, akin to lambdas as you've seen in other languages. This notation can be used to define comparators. Here's an example of a comparator on integers that puts larger numbers first:

import java.util.Comparator; // order numbers decreasing Comparator<Integer> numHigher = (int1, int2) -> { return Integer.compare(int1, int2) * -1; };

Here, int1 and int2 are the parameters of a lambda function. The body of the lambda is in the curly braces. There is a built-in compare method on Integers that returns a negative number when the first number is smaller than the second, a positive number when the first is larger, and 0 if they are equal. To prioritize the second number, this comparison function multiplies that result by -1.

Here is a priority queue that uses the above comparator to prioritize larger numbers:

PriorityQueue<Integer> numsHightoLow = new PriorityQueue<Integer>(numHigher);

As another example, here is a priority queue that gives highest priorities to shorter strings.

// order strings by length Comparator<String> stringShorter = (str1, str2) -> { return Integer.compare(str1.length(), str2.length(); }; PriorityQueue<String> stringsByLength = new PriorityQueue<String>(stringShorter);

Task: The stencil code has a class called PQ.java. Use the PriorityQueue method add to add items to each of the queues defined above, then print out the queue to make sure you are getting the expected order in each one.

Tip: Use the printPQ method to print out your PriorityQueues throughout the lab. Not doing this will print the PriorityQueue in a representation you do not need to understand yet.

Checkpoint: Call over a TA to review your work.

Example: CIT Lab Preparation

The CIT has decided to implement a system in order to prepare lab spaces for CS Lab. They've already implemented a class MeetingTime that represents a meeting time given a DayOfWeek enum and a starting time (using int to represent 24 hour time), and a class CSLab that holds a MeetingTime and the course code of the CS class. All of these classes are in the stencil.

Reminder: Using Enums

An enum is a type in Java used to enumerate constants. For example, if you needed constants for directions, you could define and use an enum as follows:

enum Direction { NORTH, SOUTH, EAST, WEST } public Direction opposite(Direction ofDirection) { if (ofDirection == Direction.NORTH) { return Direction.SOUTH; } ... }

Task: Write a Comparator called labsByCode and orders CSLabs by course code so they can prepare labs for lower-level courses before upper-level ones. Check your comparator by using it to create a PriorityQueue, adding items and printing out the queue to check the ordering.

Finding the Next Lab

Task: Implement a comparator labsByTime that orders CSLabs by proximity to the current MeetingTime. For example, if you defined the following comparator:

MeetingTime now = new MeetingTime(DayOfWeek.WEDNESDAY, 16); Comparator<CSLab> labsByTime = (lab1, lab2) -> { //Fill in the rest here! };

labsByTime would have the something like the following order: a lab at 5 p.m. on Wednesday, a lab at 7 p.m. on Wednesday, all labs on Thursday, and a lab at 11 a.m. on Wednesday (representing labs the next week).

Hint: You might find it helpful to write hoursUntil, a method that returns the number of hours between one MeetingTime to another. You may find the ordinal() method helpful here! This will return an Integer that represents the order of the enum value in its initial declaration. For example, DayOfWeek.MONDAY.ordinal() returns 1 and DayOfWeek.THURSDAY.ordinal() returns 4.

Checkpoint: Call over a TA to review your work.

Reflecting on Comparators

Enter your responses to the tasks in this section on this Google Form. We will synthesize your ideas and discuss them in a future class.

Imagine that we were using comparators to determine the order of search results from a website. Each website is represented by a class that includes a number of metadata: the number of times the search term occurs on the site, the site’s visits per minute, the number of links on the web that direct to the site, the site’s number of social media shares in the past week, and the average amount of minutes a user spends on the site.

public class Website { private int searchTermFrequency; private int visitsPerMinute; private int linksToSite; private int socialMediaShares; private float avgMinSpentOnSite; }

Here are some same data that might have been returned from query:

Sample website search results

You could imagine different web browsers writing different comparators to determine the order in which websites are presented to users.

Task: Discuss the societal or organizational benefits and limitations of using the linksToSite field to determine priority in produced search results. Is this value a good way to prioritize results? Why or why not?

Task: Imagine that the topic being searched for is a "hot issue" for which some sites are more reliable or accurate than others. The search-engine company wants to return the most accurate results. They propose adjusting their comparators to handle accuracy by adding an "accuracy" field to the Website class. Does this seem like a good idea? What are some of the downsides or challenges to taking this approach?

Task: Imagine now that the objects being ordered are patients on the waiting list for an organ transplant. The patient information is as follows:

public class TransplantPatient { private int age; private int daysAwaitingTransplant; private GenderEnum gender; private double weight; private boolean smokes; }

Propose two different comparators that embody two different sets of (human) values for deciding who should get the next available organ. What would make you comfortable with having a comparator expressed in code determine who should receive the next organ?

Task: What if instead of the five fields shown above, the comparator used a decision tree like the one you wrote for the project? Does that make you more comfortable with a code-based prioritization?

Reflection: These questions are designed to highlight that even seemingly technical details like writing comparators end up encoding human values. Do we value length or uniqueness in word search? Do we value time waiting or likelihood of survival in transplant patients? When you use a comparator on data, it is worth asking what values that comparator expresses.

Remaining Time Homework, Midterm Review Your Choice

Use your remaining lab time as makes most sense for you on course activities. You can work on the homework or do midterm practice problems.


Please let us know if you find any mistakes, inconsistencies, or confusing language in this or any other CSCI0200 document by filling out the anonymous feedback form.