CS3301
# Mini Project 3 Questions

## LAZY Fork

Q1. The report asks us to count the frequency of page faults during the execution of COW fork. Does this mean we have to count only COW-triggered page faults, or capture all types of page faults?

Q2. While running usertests after implementing COW, all tests other than textwrite pass, although textwrite passed before implementing it. For MP2, though, we were asked to remove textwrite from usertests since it was faulty. Do we do the same here?

Q3. When does a page fault occur: during page allocation, or during page access, i.e. the first time a page is accessed by a program?

Q4. All usertests pass, but I'm getting the following: `FAILED -- lost some free pages 32063 (out of 32462)`. Is this fine?

Q5. Do we have to record page faults for all processes together, or record the page faults that have occurred for each process separately?
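Related to Q1 and Q5: a minimal sketch of one way to count COW faults per process in xv6-riscv, by bumping a counter in the trap handler when a store page fault (`scause == 15`) lands on a COW page. `PTE_COW`, `cowhandler()`, and the `cow_fault_count` field in `struct proc` do not exist in stock xv6; they are illustrative assumptions standing in for whatever names your implementation uses.

```c
// kernel/trap.c -- sketch only, not stock xv6 code.
// Assumes you have added: a PTE_COW flag, a cowhandler() that copies
// the faulting page, and a cow_fault_count field in struct proc.
void
usertrap(void)
{
  // ... existing xv6 trap setup ...
  uint64 scause = r_scause();

  if(scause == 15){                      // RISC-V store/AMO page fault
    struct proc *p = myproc();
    uint64 va = r_stval();               // faulting virtual address
    if(cowhandler(p->pagetable, va) == 0)
      p->cow_fault_count++;              // per-process COW fault counter
    else
      p->killed = 1;                     // not a COW page: genuine fault
                                         // (setkilled(p) in newer xv6)
  }
  // ... rest of usertrap ...
}
```

Whether only these COW-triggered faults should be reported (Q1), and whether per-process or system-wide totals are expected (Q5), is exactly what the questions above ask the TAs to confirm.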
## LAZY Read-Write

Q1. Will the entire input be given at the start of the program, or can input be given after some processing?

Q2. Is the input to be taken from a file?

Q3. Can we have two requests with the same user ID?

Q4. Suppose we have a concurrency limit of `5`, and at time `t = 10` there are `3` readers on file `f1`. At `t = 11` there is a delete request and at `t = 12` there is another reader. The delete request is blocked until all the readers leave, but should we allow the reader at `t = 12` to read the file alongside the other readers (since we are below the concurrency limit), or should the delete go first?

Q5. If two events happen at `t` seconds, does the order in which they are printed matter? For example, if LAZY takes up a WRITE request at `t=2` and another user makes a request at `t=2`, can they be printed in either order?

Q6. Is the given example output correct? User 3 made a request to delete file 2 at 2 seconds and User 5 made a request to read file 2 at 4 seconds, so shouldn't User 3's request be taken up first, instead of User 5's, once User 2 completes writing to file 2?

Q7. Can two users write to different files at the same time?

Q8. What does it mean that READing is allowed while simultaneously WRITing to a file? What will the user read if the file being read is modified by another user writing to it?

Q9. What would be the expected output for this input: `2 4 6 3 2 5 1 1 READ 0 2 1 WRITE 1 3 2 DELETE 2 STOP`?

Q10. In the given example, LAZY can take up the request of user 1 at 1 second and user 2 can send a write request at 1 second; should the order of these two events be the same as in the example, or can it vary?

Q11. Do we have to check every second while the thread is waiting to see whether the operation is possible, and print that the user cancelled at `t_k` where `t_k = T + arrival`? Or is it fine if we wait for much more than `T` seconds and, only when we acquire the required locks, realise that the time exceeds `T` and print cancelled at `t_k` seconds where `t_k = T + arrival + extra`?

Q12. The task requirement states: "Users cancel their requests if LAZY takes more than T seconds (from the time at which users send their request) to start processing." How should we handle the case where a request starts being processed by LAZY at exactly time `T`? According to the example, user 3 sends a request at 2 seconds; user 3 cancels the request at 7 seconds, but the request could start at exactly 7 seconds (`T = 5`).

Q13. In the given test case, file 2 is not being read and no one is writing to it at t = 7, so can't we accept the request of user 3 at t = 7 seconds?

Q14. Can we get more test cases, please?

Q15. Can we limit the number of requests, e.g. take at most 100?

Q16. Can we limit the number of files (can I take the maximum number of files to be 100 or so)?

Q17. Can a user perform two tasks simultaneously? For example, can user 1 read file 1 and write to file 1 at the same time?

Q18. ![image](https://hackmd.io/_uploads/S1Uyfcjlkg.png) What do the colours mean here?

Q19. Does the order in which the output is printed matter? My output is correct, but the results are not in order.

Q20. If the concurrency limit is reached, should the user wait until it gets a chance (when the number of active users decreases), or should LAZY cancel the request? If it should wait, would the maximum wait time be `T + T_arr`?
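Related to Q4 and Q20: a minimal sketch of modelling the per-file concurrency limit with a counting semaphore, where each active reader or writer holds one slot and a delete must drain every slot before proceeding. `NFILES`, `MAX_CONCURRENT`, and the function names are illustrative assumptions, not a prescribed design.

```c
// Sketch: per-file concurrency limit with a counting semaphore.
// NFILES, MAX_CONCURRENT and the function names are illustrative
// assumptions; they are not part of the assignment spec.
#include <semaphore.h>
#include <unistd.h>

#define NFILES 16
#define MAX_CONCURRENT 5          // the "c" limit from the problem

static sem_t file_slot[NFILES];   // counts free slots per file

void init_files(void) {
    for (int i = 0; i < NFILES; i++)
        sem_init(&file_slot[i], 0, MAX_CONCURRENT);
}

// A reader/writer occupies one slot for the duration of the operation.
void access_file(int fid, int op_seconds) {
    sem_wait(&file_slot[fid]);    // blocks if c users are already active
    sleep(op_seconds);            // simulate the READ/WRITE work
    sem_post(&file_slot[fid]);
}

// A delete needs the file to be completely idle: grab every slot.
// (Assumes at most one delete per file is in flight at a time.)
void delete_file(int fid, int op_seconds) {
    for (int i = 0; i < MAX_CONCURRENT; i++)
        sem_wait(&file_slot[fid]);
    sleep(op_seconds);            // simulate the DELETE work
    // slots are never released: the file no longer exists
}
```

Under this scheme the reader arriving at `t = 12` in Q4 can slip in ahead of the blocked delete unless extra queueing logic enforces arrival order, which is exactly the policy question being asked.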
Q21. What would be the expected output for this input: `2 4 6 3 2 2 1 1 READ 0 2 2 WRITE 1 3 2 DELETE 2 STOP`?

Q22. If an operation, for example a WRITE, completes at 6s, and another WRITE on the same file arrived at 4s, should the next WRITE start at 6s or 7s?

Q23. If an operation completes at t seconds, can another operation start at t seconds on the same file?

Q24. Can two requests arrive at the same time? If yes, what is the output of the following test case if only 2 concurrent users are allowed: `1 1 READ 0 2 1 READ 1 3 1 READ 1`? Since requests 2 and 3 came at the same time, do we need to consider both?

Q25. Referring to Q16, I don't see why we cannot limit the files. One request can access only one file, so we can limit the number of files to the number of requests.

Q26. If user 1 requests to delete file 1 at t seconds and LAZY takes up that request at t+1 seconds, and user 2 requests to read or write file 1 at t+1 seconds, should user 2's request be declined by LAZY at t+2 seconds? What should be done?

Q27. Let the write operation take 4s. User 1 requests to write to file 1 at t seconds and LAZY takes the request at t+1 seconds. If user 2 requests to write to the same file, file 1, at t+1 seconds, should LAZY decline the request?

Q28. Referring to Q3, if there are multiple requests with the same user ID, how do we differentiate between the requests?

Q29. For `1 1 READ 0 2 1 READ 1`, what should the output be? Should LAZY start processing both of them at 1 second, assuming the maximum concurrency limit is greater than two?

Q30. If the number of files is 2 but a user tries to access file 100, what should happen?

Q31. ![image](https://hackmd.io/_uploads/ryoBLzVWkx.png) Can I make the following assumptions: 1. the requests are given in increasing order of t_i; 2. t_i is 1, 2, 3, ..., i.e. a request arrives every second?

Q32. What happens if o_1 is erroneous (i.e. instead of "READ" or "WRITE" the user types "HELLO")? What should the output be?

Q33. If a single user makes two different read requests on the same file (such that the second read request would execute before the first read request is completed), how is that supposed to be treated? Is the second request supposed to be delayed, declined, etc.? If it is supposed to be taken up immediately without delay, does that count as 2 different users accessing the file in terms of "c" (the maximum number of users that can access a file at a given time)?

Q34. Just a further clarification on Q26: every time there is an invalid file access (whether because the file was deleted or because it doesn't exist), should it be reported at t seconds (assuming the request was made at t seconds), or can it occur at t+1 seconds when the request is supposed to be taken up?

Q35. Do we need to use threads for this part, or can we do it without threads?

Q36. If a delete operation is requested at `t=2` and is taken up at `t=3`, another request arrives at `t=4`, and the delete operation takes 6s, will the second request wait for the completion of the delete and display invalid at `t=9`, or at `t=4` (i.e. as soon as the request can be taken up)?

Q37. (With reference to Q9) isn't the output incorrect? The delete request shouldn't be taken up until all the read/write requests are completed, right?

Q38. Say a write request completes at t seconds, and a delete request and a read/write request are both ready to be executed at that time; is it okay to assume that whichever request arrived earlier is given preference? Also, what should be done if, in the same scenario, both a read and a delete request arrived at (t-1)s: which should be executed at t seconds?

Q39. Say there are two requests which came at the same time but neither can be executed at the moment because of the concurrency limit (the maximum number of users accessing the file); when they can actually execute, can we assume the order of execution is random, since we can't determine which request acquired the lock?

Q40. Can we use the `sem_timedwait` function to wait on a semaphore for only a given amount of time and return if it exceeds T?

Q41. Is it fine if we imitate the delete, read or write request by sleeping for the required amount of time? There should be no harm in doing this, right?

Q42. Is it necessary to actually sleep, or can we just ensure that the output is correct (e.g. with sleep the run might require 20 seconds; is it fine if, without sleep, the output comes out immediately)?

Q43. In relation to Q40, can we use `pthread_mutex_timedlock` (which serves the same purpose but for mutexes)? If not, can we use `pthread_mutex_trylock()` to test every second whether the lock can be acquired? We shouldn't wait longer than T for any lock, so there should be some function to let that happen, right? (See the timed-wait sketch at the end of this section.)

Q44. For this case: `2 4 6 3 2 1 1 1 READ 0 2 2 WRITE 1 STOP`, should both users cancel their requests, or will both requests be taken up?

Q45. Referring to Q39, we can't determine which request acquired the lock, right? (It's random, right?) How can we execute the earlier one first?

Q46. Let's say a delete arrives at time x and can't be done now; after that a write arrives and it also can't be done now; after some time the file becomes free. Should I process the delete since it came earlier, or go for the write since it was in the *queue*?

Q47. Similar to Q46, what happens if a delete and a read request are both sent at t seconds? The read request will be prioritized (in line with Q38). Let's say the time for every operation is 5 seconds and another read request is sent at t+1 seconds. Assume the concurrency limit has not been reached. LAZY should take up the write at t+2, finish the write, and then take up the delete at t+7, right?
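Regarding Q40 and Q43: a minimal sketch of bounding a wait to T seconds with POSIX `sem_timedwait` (an analogous pattern works with `pthread_mutex_timedlock` or `pthread_cond_timedwait`). The function name and the way the deadline is computed are illustrative assumptions, not a required design.

```c
// Sketch: bound the wait on a file slot to T seconds using
// sem_timedwait. sem_timedwait takes an ABSOLUTE CLOCK_REALTIME
// deadline, so the deadline could also be computed once from the
// request's arrival time instead of from "now".
#include <semaphore.h>
#include <time.h>
#include <errno.h>

// Returns 1 if the slot was acquired within T seconds, 0 if the
// request should be reported as cancelled.
int acquire_with_deadline(sem_t *slot, int T) {
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += T;

    while (sem_timedwait(slot, &deadline) == -1) {
        if (errno == ETIMEDOUT)
            return 0;                   // waited more than T: cancel
        if (errno == EINTR)
            continue;                   // interrupted: retry until deadline
        return 0;                       // other error: treat as failure
    }
    return 1;
}
```

Because the deadline is absolute, there is no need to poll every second; the call returns as soon as the slot is free or the deadline passes, whichever comes first.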
## LAZY Sort

Q1. Can the ID be a string, or is it just an integer?

Q2. Should we use a constant like `max_threads` and hence implement a basic task queuing system (add all tasks to a queue, and at any given time spawn only `max_threads` threads)? Or can we assume that we can spawn as many threads as we like (just spawn all tasks at once for each level of the merge)?

Q3. Can we use one extra thread for managing everything? Or do you expect us to use locks etc. and implement it like a recursive function?

Q4. Please tell me if I understand the question wrong, but can't the problem be solved without using concurrency concepts (especially without threads and locks)? Is that acceptable?

Q5. Can we assume that IDs are unique?

Q6. Can we use another sorting algorithm for sorting strings instead of count sort? Or is it mandatory to use count sort for strings (name and timestamp)? Count sort is very efficient for numbers but becomes complex for strings.

Q7. a) What does a distributed implementation mean here? Does it mean distributed over different networks, or file systems, or lists of files/folders split across multiple distinct files? How do we ensure that we are testing the distributed nature of things? If we actually have to simulate files spread across multiple machines or network locations, we would have to implement the networking for this. How can we go about doing this?

b) If we are not supposed to implement networking between different machines, do we partition files into chunks and consider each chunk as a different node? Does the input file look something like this?

    5
    node1 fileA.txt 205 2023-10-02T08:00:00
    node2 fileB.txt 207 2023-09-30T10:10:00
    node2 fileC.txt 203 2023-10-01T15:20:00
    node3 fileD.txt 201 2023-09-29T17:15:00
    node3 fileE.txt 204 2023-10-01T12:00:00
    ID

Or do we use multiple files, one for each node? Something like this?

    distributed_system/
    ├── Node_A.txt
    ├── Node_B.txt
    ├── Node_C.txt
    └── main_data_file.txt

main_data_file.txt:

    50
    Node_A fileA.txt 205 2023-10-02T08:00:00
    Node_A fileB.txt 207 2023-09-30T10:10:00
    Node_B fileD.txt 201 2023-09-29T17:15:00
    Node_C fileC.txt 107 2023-10-01T09:15:00
    ...
    ID

Content for the individual nodes:

Node_A.txt:

    fileA.txt 205 2023-10-02T08:00:00
    fileB.txt 207 2023-09-30T10:10:00

Node_B.txt:

    fileD.txt 201 2023-09-29T17:15:00

Node_C.txt:

    fileC.txt 107 2023-10-01T09:15:00

c) Files belonging to different nodes may not have unique names or IDs. What can we assume to be unique across all files? Or are the files only differentiated on the basis of which node they belong to?

Q8. Count sort for strings? How does that work?

Q9. (Referring to the answer to Q6) If we generate a unique number for each string and do count sort on the numbers, we can't get the sorted list of strings, because for that we must give smaller numbers to lexicographically smaller strings, and for that we need to know the lexicographic order of the strings, which is exactly what we have to find out. From my understanding, strings can't be count-sorted, but please correct me if I am wrong.

Q10. Is there a maximum limit on the range (max - min) of IDs/timestamps in a test case (e.g. 1e5 or similar), so that we can allocate an array of allowable size?

Q11. Can we declare a fixed size for filename/timestamp and call count sort once per character position up to the maximum length of the given strings? The complexity changes to `O(MAXLEN*N)`. Is this valid, or should I use the procedure specified in Q6?

Q12. Do we need to handle cases where two name strings map to the same number under our hash function? For example, string1 and string2 may both map to the same number x with the hash function used. Do we have to handle such cases separately, or can we assume such collisions won't occur?

Q13. The clarification for Q6 stated that we ultimately have to use count sort to sort Names and Timestamps (sorting criteria that are strings) when the threshold is less than 42. However, can we use multiple passes, one per character position, emulating LSD (Least Significant Digit) radix sort, where each character position is treated as a "digit" in the sorting process? Or should we just stick to mapping each string to a single integer that only requires a single count sort pass? My implementation of the latter approach restricts the string length to 8, as my hash value is large enough that there is a risk of overflow, while the former approach (emulating radix sort) lets me work with much longer strings. Could you please clarify which approach we can use?
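Related to Q8, Q11, and Q13: a minimal sketch of the LSD-style approach, where a stable counting sort is applied once per character position, from last to first, over zero-padded fixed-width keys. `MAXW`, the byte-valued radix, and the padding convention are illustrative assumptions; whether this variant is acceptable as "count sort" is precisely what Q13 asks the TAs to decide.

```c
// Sketch: stable counting sort on one character position, applied
// right-to-left over zero-padded fixed-width keys (LSD radix style).
// Each keys[i] must be padded with '\0' out to MAXW bytes so that
// shorter strings sort before longer ones sharing the same prefix.
#include <stdlib.h>
#include <string.h>

#define MAXW   128   // assumed upper bound on name/timestamp length
#define RADIX  256   // one pass per byte value

// Stable counting sort of n keys by the byte at position pos.
static void counting_pass(char (*keys)[MAXW + 1], int n, int pos) {
    int count[RADIX + 1] = {0};
    char (*tmp)[MAXW + 1] = malloc((size_t)n * sizeof *tmp);

    for (int i = 0; i < n; i++)
        count[(unsigned char)keys[i][pos] + 1]++;
    for (int r = 0; r < RADIX; r++)
        count[r + 1] += count[r];          // prefix sums -> output slots
    for (int i = 0; i < n; i++)            // stable placement
        memcpy(tmp[count[(unsigned char)keys[i][pos]]++], keys[i], MAXW + 1);

    memcpy(keys, tmp, (size_t)n * sizeof *tmp);
    free(tmp);
}

// Lexicographic sort: one counting pass per position, last to first.
// In practice you would start at the longest actual string length.
void lsd_string_sort(char (*keys)[MAXW + 1], int n) {
    for (int pos = MAXW - 1; pos >= 0; pos--)
        counting_pass(keys, n, pos);
}
```

The cost is O(MAXW · (N + 256)) with no order-preserving hash at all, which sidesteps the overflow concern raised in Q13 and Q16.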
Q14. So essentially we are not supposed to use MPI or anything like it to implement the distributed system, right?

Q15. Do we have to account for name hash collisions in count sort?

Q16. In reference to Q9 and Q6: we would need to make a hash function for 128-character strings that also preserves ordering, i.e. hash(fileA.txt) < hash(fileB.txt). Also consider that a reasonable maximum array size is around 10,000 (with, say, a bucket of size 10 inside each slot if you want). When you try to generate these hashes, there will inevitably be overflows, and handling those overflows is not trivial. If you simply take a modulus, place entries from the start, and also use bucketing, then you have to drastically change the count sort algorithm; it becomes extremely complex and is no longer count sort. The true count sort implementation would be to iterate over the entire search space, i.e. 0 to max(hash(filename)), WITHOUT applying any modulus to the hash; the maximum would theoretically be 2^(8\*128) if all characters are allowed, or (26+4)^128 if only alphabets plus a few extra characters. Any submitted implementation will be remote from count sort. Either allow that, or let us know if you have a specific logic/implementation in mind ***in detail***.

Q17. For the line graph, does execution time refer to CPU time or wall-clock time?

Q18. Can we assume that a filename is at most 8 characters long?

Q19. The hashing technique generates a very large number (if we are to maintain lexicographic order), so can we use something like radix sort or some other sorting algorithm instead of generating the hash number?

Q20. Is there any restriction on which libraries we are allowed to use? More specifically, is the `MPI.h` library allowed?

Q21. Are we allowed to use a trie for sorting on Name and Timestamp for count sort?

Q22. Can we assume that only `a-z` and `.` are used in filenames, to increase the number of characters usable per filename in count sort?

Q23. Can we use the `_Atomic` keyword in C?

Q24. In response to the addition in Q16 by [PS]:

> Thus there are multiple possible hash function mappings that would lead to a normal count sort implementation.

The maximum value of an unsigned 64-bit integer is 18446744073709551615, and 26^8 = 208827064576. There is a clear issue here: iterating over 208827064576 slots (about 208 × 10^9) will take a significant amount of time, at least around 500 seconds, and it has to be done in EACH thread and once more at the end. Space is also a concern, although there are methods to make the space work. 26^6 is much more reasonable.

Q25. Is it okay if we take all the count sort arrays of the children together and then merge them in the main thread? It is not what the Stack Overflow source says (that approach is efficient when the number of elements is much larger than the maximum array size), but this is easier to implement and more efficient for our case.

Q26. Can we assume that valid timestamps are only after 1970-01-01?

Q27. Can we assume that the difference between the hash of the file with the minimum hash value and the hash of the file with the maximum hash value is at most 10^6 or 10^7?

Q28. Adding onto Q18, is an assumption of length 6 acceptable to the TAs?

Q29. Is using `qsort` allowed for sorting within the individual threads?
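Related to Q2, Q25, and Q29: a minimal sketch of the "sort each chunk in a worker thread, merge in the main thread" structure. `qsort` is used inside each worker purely as a stand-in for whatever per-chunk sort the assignment mandates, and the record layout, comparator, chunking, and sequential fold-merge are illustrative assumptions rather than a required design.

```c
// Sketch: each worker thread sorts one contiguous chunk of records,
// then the main thread folds the sorted chunks together one by one.
// Record layout, comparator and chunk count are illustrative only.
// Build with: cc -pthread ...
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef struct { char name[129]; long id; char ts[20]; } Record;
typedef struct { Record *base; size_t n; } Chunk;

static int by_id(const void *a, const void *b) {
    long x = ((const Record *)a)->id, y = ((const Record *)b)->id;
    return (x > y) - (x < y);
}

static void *sort_chunk(void *arg) {
    Chunk *c = arg;
    qsort(c->base, c->n, sizeof(Record), by_id);   // per-chunk sort
    return NULL;
}

// Merge two sorted runs a[0..na) and b[0..nb) into out.
static void merge(Record *a, size_t na, Record *b, size_t nb, Record *out) {
    size_t i = 0, j = 0, k = 0;
    while (i < na && j < nb)
        out[k++] = (by_id(&a[i], &b[j]) <= 0) ? a[i++] : b[j++];
    while (i < na) out[k++] = a[i++];
    while (j < nb) out[k++] = b[j++];
}

// Sort n records with nthreads workers (nthreads >= 1), then merge.
void threaded_sort(Record *recs, size_t n, int nthreads) {
    pthread_t tid[nthreads];
    Chunk chunks[nthreads];
    size_t per = (n + nthreads - 1) / nthreads;

    for (int t = 0; t < nthreads; t++) {
        size_t lo = (size_t)t * per;
        chunks[t].base = recs + (lo < n ? lo : n);
        chunks[t].n = lo < n ? (per < n - lo ? per : n - lo) : 0;
        pthread_create(&tid[t], NULL, sort_chunk, &chunks[t]);
    }
    for (int t = 0; t < nthreads; t++)
        pthread_join(tid[t], NULL);

    // Fold the sorted chunks into recs, keeping the running result in buf.
    Record *buf = malloc(n * sizeof *buf);
    size_t merged = chunks[0].n;
    memcpy(buf, recs, merged * sizeof *recs);
    for (int t = 1; t < nthreads; t++) {
        merge(buf, merged, chunks[t].base, chunks[t].n, recs);
        merged += chunks[t].n;
        memcpy(buf, recs, merged * sizeof *recs);
    }
    memcpy(recs, buf, n * sizeof *recs);
    free(buf);
}
```

Merging the already-sorted chunks sequentially in the main thread, as Q25 describes, keeps the synchronization down to one `pthread_join` per worker; whether that merge strategy and `qsort` inside the workers are acceptable for grading is what Q25 and Q29 ask.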
