ADA Final Challenge 2020

Teams

Background

The NTU CSIE NASA Workstation Team is thinking of developing an algorithm to schedule tasks on the GPU servers.

(Disclaimer: The scenario is purely hypothetical and does not reflect real-world decisions.)

For simplicity, we assume that students submit jobs to run before a given point of time (say, every midnight).

In other words, the scheduling can be done offline.

We also assume that the students can predict to a certain extent how long their jobs will run.

(For example, we can kill tasks that are over the time limit, while using the "bubbles" caused by tasks finishing before expectation for non-scheduled jobs.)

The goal is to minimize

The makespan, i.e., the total time it takes to finish the jobs
The weighted total flow time, i.e., the weighted sum of the finishing times

(Since this is not an OS course, we hand-wavily assume that we can avoid starvation by dynamically ramping up the weights of overdue jobs :) )

Definitions

Slice: Can be thought of units of computational power (e.g., VRAM.).

Job: Contains some operations. The operations in a job may depend on each other in some order.
Operation: A basic task that requires some slices of computing resources and a duration to run. Cannot be preempted.

Duration: The time required to execute an operation.
Makespan: The total time required to finish the jobs. (Wikipedia)

Finishing time: The time when every operation in a job is finished.
Weighted total flow time: The weighted sum of the finishing time for each job, i.e.,
$\sum_{i} w_{i} C_{i}$ , where
$w_{i}$ and
$C_{i}$ are the weight and finishing time of the
$i$ -th job respectively

Metric to Optimize

The sum of the makespan and the weighted total flow time

max_{i} C_{i} + \sum_{i} w_{i} C_{i}

Example

        
    
    
      Image Not Showing
      Possible Reasons
      The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported
      Learn More →

There are 3 slices of computational power
All weights are set to 1
Op. 1B and Op. 1C depends on Op. 1A
Op. 2 requires two slices
This scheduling has a makespan of 17 days and a weighted total flow time of 39 days
Hint: This plan is not optimal

Test Data

Students are required to generate test cases for the other teams.

Public test cases: Link is in the Resources section
"Private" test cases generated by other teams: Will be released three days before the deadline

Input Format

The first line contains an integer

ℓ

, indicating the number of available slices.

The second line contains an integer

n

, denoting the number of jobs. Then

n

descriptions of jobs follow.

The first line of a job description contains an integer

m_{i}

, denoting the number of operations in the job. The second line contains a real number

w_{i}

, denoting the weight of the job. Then

m_{i}

lines follow, the

j

-th of which describes the

j

-th operation and starts with three integers

s_{i, j}

d_{i, j}

, and

p_{i, j}

, indicating the number of required computing slices, the duration required, and the number of operations it depends on. In the same line, there are then

p_{i, j}

more integers

a_{i, j, k}

, denoting which operations in the same job the operation is dependent on.

The indices are 1-based
It is guaranteed that for each job, the dependency graph is acyclic
For the public test cases, the constraints are directly observable and thus omitted here
Constraints for the private testcases will be given below
- A validation program that checks if the test cases you provide satisfy the constraints is also provided in the Resources section

Output Format

For each job, print

m_{i}

lines separated by newlines, with the

j

-th describing the scheduling of the

j

-th operation in the

i

-th job. Specifically, the line starts with an integer, denoting the starting time of the operation. Then

s_{i, j}

space-separated integers follow, denoting the slices that the operation uses.

The schedule must not violate the operation dependencies nor use more than

ℓ

slices at a time.

Notice that time starts at zero.

Output Example

        
    
    
      Image Not Showing
      Possible Reasons
      The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported
      Learn More →

Submission

"Output-only"
- Run your program offline and upload the results
- Online judge & scoreboard available for the public data set
- Judge: https://ada-judge.csie.ntu.edu.tw/
Test cases (at most two)

Code & instructions are still required for reproducibility
- Feel free to use any publicly available languages and packages
- The TAs may need to run your code, so if you use commercial software, at least make sure there is a trial version!
- There is a time limit of 12 hours for the public tests combined
- The time limit for the private tests combined is 24 hours
- Using multiple CPU cores (within reason) is allowed
- You may assume that the execution time is measured on the linux{1..4} workstations

Test Case Submission Format

Please submit a zip archive containing at most two testcases with the file extension *.in, as shown below. Notably, to avoid issues related to directory structures, any file with the extension will be considered no matter which folder it is placed in.

We will, before the deadline, create problems on the judge for submitting testcases. Note that in the case of multiple submissions, we will take the latest one that passes validation.

https://ada-judge.csie.ntu.edu.tw/#!/problem/30

Private Testcase Constraints

$1 \leq ℓ \leq 8$
$1 \leq n \leq 30$
$m_{i} \geq 1$
$\sum m_{i} \leq 100$
$0 \leq w_{i} \leq 64$ ; should be given with at most
$6$ digits after the decimal point. Note that scientific notation cannot be used.
$1 \leq s_{i, j} \leq ℓ$
$1 \leq d_{i, j} \leq 96$
$0 \leq p_{i, j} \leq m_{i}$
$1 \leq a_{i, j, k} \leq m_{i}$

Code Submission Format

Please upload to NTU COOL an archive with the following contents:

teamXX/ # Replace with your team number
├── README.md # Instructions on how to build and run your code
└── OTHER_CONTENTS

Note that only one group member needs to upload the file. Also, please make sure to document any dependencies your code requires. (Kudos to those who create a Docker/Vagrant container :) )

Private Tests Output Submission Format

Please upload to NTU COOL an archive with the following contents, with each file corresponding to the test case of the same name.

teamXX/ # Replace with your team number
├── 01ef1bedf5397f5e541b2fa144004ea3cf9ce506e6c7a53361203248b11f6d91a97a1ee78b2cde8373045d1dee5a9bbed7d8692be9adabfce821d2d9a0d6ab38.out
├── 080c61c8f3a7b53ee2f09ccb058f7bfced37e9287c551c38f1cad93baa0daa4580a943efdc51930e9461c7fd8b88919d87962e1ef47018ec4cc34f9046e8bc33.out
├── 094246fed97e640dc3210fd8046c3dbc75f8713fc01d2e40f1a9ac9e92cc7d8bb32fde1a783229502a6a3f590dc925f98ab4c6f80942da2cd5cf4b5330e8545a.out
├── 10e23905c7c1cecd95dec8f988cc4019313946f3faff9cf8910445cc5477cc71c60b0911a40545f1affd94ef204179e3ffa1dcbe20d05638f86f3d6ecf5afaa8.out
└── ...

Again, note that only one group member needs to upload the file.

Validator

A program that checks both your program output and the testcases you generate is provided together with the public testcases. Note that for the public testcases, you may have to use the flag --public to loosen some restrictions.

# Build the validator
make
# Verifies if the testcase 00.in satisfies the private contraints
./checker 00.in
# Verifies if the testcase 00.in satisfies the private contraints
# and calculate the metric for the output 00.out
./checker 00.in 00.out
# Verifies if the testcase 00.in satisfies the public contraints
# and calculate the metric for the output 00.out
./checker --public 00.in 00.out

If you have issues building the validator, you may want to try the following:

Install make (aside from a C++ compiler, of course)
Upgrade your compiler to a version no older than GCC 7, Clang 5, ICC 19.0.1, or MSVC 19.14 for full support of C++17
- See https://en.cppreference.com/w/cpp/17 for details
Try to run the validator on the CSIE workstations instead

Scoring

Metric rankings (Public cases) (50%)
- Exceed Baseline I (10%)
- Exceed Baseline II (20%)
- Overall ranking (20%)
Metric rankings (Private cases) (20%)
- Overall ranking (20%)
Strongness of test cases (30%)
- The standard deviation among the other teams (15%)
- Your performance compared to theirs in terms of percentile (15%)
- In each track, only the testcase that performs the best (out of the ones you submitted) will be considered
Bonuses to encourage uploads (10%)
- Every midnight, the ~~top 20% on the scoreboard~~ top 20% teams (i.e., top
  $⌈ 0.2 \times 42 ⌉ = 9$ teams without including the baselines) gets 1% in this category
- Note that this will only come into play a week after the judge goes public
- Also note that this part caps out at 10%

Note that except from the baseline-related tracks and the bonus, the score for each track will be determined by an affine function of your ranking in that category.

For the public tests, the scores will be summed across all test cases, resulting in a single scoreboard/ranking.

For private tests, each testcase will be ranked separately.

Schedule

Judge available: 12/17
Start of scoreboard bonuses: 12/24
Deadline for testcases: 01/18 24:00
Release of "private" testcases: Before 01/19 24:00
Deadline for code & final results: 01/22 24:00

Questions

This is the first year of the event. Feel free to bring up any "bugs" you find
~~希望能盡量不滾動規則，保持這門課的公信力~~
ada-ta@csie.ntu.edu.tw
TA hour: Wed 1630-1730 @ R217

Resources

https://www.youtube.com/watch?v=dQw4w9WgXcQ
Public test cases & validator: https://www.csie.ntu.edu.tw/~wp/ada-final-public.tar.xz
- Last Update: 01/18
Private test cases: https://www.csie.ntu.edu.tw/~wp/ada-private-cases.zip
Some keywords:
- Constraint Programming
- Mixed Integer Programming
- Metaheuristics
- Knapsack Problem FPTAS
- Algorithms / approaches you have learnt in this course
- Operations Research / Job Shop Problem

Hints

For scoring purposes, sometimes the durations in a testcase are scaled by a constant, meaning that they may all be a multiple of some number.

Changelog

2020/12/04: Initial release
2020/12/13: Major revamp for public release
2020/12/16: Add tips on how to compile the validator and fix some typos
2020/12/16: Clarify scoring rules regarding rankings and update the validator to resolve crashes when handling malformed data
2020/12/17: Remove accidental binary in the provided archive and clarify scoring rules regarding bonuses
2020/12/17: Change testcase submission method
2020/12/22: Clarify bonus rules re. the number of teams and update validator for more comprehensive testdata checks
2020/12/30: Fix output format re. the slices used
2020/01/17: Update validator to the version used for testdata verification on the judge
2020/01/18: Update validator to remove extra #defines and add link to the problem for submitting testdata to the judge
2020/01/18: Add FAQ entries re. the uploading of testdata
2020/01/19: Add code & private output submission format and link to private test cases

FAQ

Discussions: https://cool.ntu.edu.tw/courses/3364/discussion_topics/26191
Testdata-related
- If you encounter issues when uploading your testdata on Windows, you may want to ~~uninstall that proprietary c**p~~ use something like Notepad++ to convert CR-LF newlines to LF. (Link.)
- Similar to your program output, please make sure that the data you provide ends with a newline.

Work in Progress

Guest Gross2020/12/03 08:55:55

If you encounter issues when uploading your testdata on Windows, you may want to ~~uninstall that proprietary c**p~~ use something like Notepad++ to convert CR-LF newlines to LF.

Test question. Feel free to leave anonymous questions here :) (Edited)