
# Hands-On 4

###### tags: `homework/lab`

## TA Announcements

:::info
5/23 Check e3 for new information and resources.

5/22 See [detailed homework instructions](https://docs.google.com/document/d/1PVn-icznCCtMJ99ej-xvt2TG9MCET7mOrYk8Xg0OCk8/edit#heading=h.lkttikbk6pru).
:::

## Tips & Tricks

### Credit Card

:::warning
If you do not wish to enter your credit card details in AWS, try the following:
1. Create an AWS account. When you are asked to enter the credit card details, leave the website and proceed to the next step.
2. Create an MTurk Requester account (if you haven't done so already).
3. Link the Requester account with the AWS account.
4. The AWS account should now be accessible.
:::

### HIT Expiration Time

:::warning
During development of the upload script, set the HIT expiration time to a short duration (e.g., 1 hour). This keeps test HITs from piling up on the Sandbox, since they expire quickly. Set the HIT expiration to the required duration (8 weeks) only for your final submission.
:::

### HIT Deletion (Updated)

:::warning
Note: after the HITs expire and can no longer be found on the Worker Sandbox (workersandbox.mturk.com), the requester can still fetch them to get the assignment results. This can create a bit of a mess later on when you download the results, especially if you uploaded HITs for the same tweets multiple times during testing. You might end up with more HITs than tweets...

I am sharing a script that you might find useful for this case: it permanently deletes all HITs and assignments, leaving a "clean environment" with no HITs. If you choose to run this script, you will need to re-upload your HITs just before the experiment begins.

Use with care! The script first shows you how many HITs you have in the system, and only then asks whether you want to delete them. So you can also use the script to count the HITs in your system.
Because there are 377 tweets, there should be a total of 377 HITs.

[Delete All HITs](https://colab.research.google.com/drive/1kukDYGT-RuCzv79rJNsSj5i_ZfEjD2cQ?usp=sharing)

After you run the script, run it again to verify that you now have 0 HITs in MTurk. If not, run the script again. You might need to run it a few times.

If you have any questions about this script, feel free to post a message in the forum below.
:::

## Discussion

:::success
This forum is for asking questions and sharing information. Students are encouraged to share their knowledge or solutions with other students. In addition, the forum will be checked by the TAs every 12~24 hours.
:::

> Hello Boaz, I have a problem creating `credentials.csv` when I create an IAM user.
> ![](https://i.imgur.com/QXMXZCe.png)
> I forgot the next step you showed in class. Could you upload the full video recording to the new e3?
>> Check slides 39~45.[name=Boaz][color=#DB7093]

> When I try to create an AWS account, it always asks me for credit card information. What should I do?
>> The trick is to go to the MTurk Requester website and link the Requester account with the AWS account. After you do this, you should be able to access the AWS account. Kindly let me know if it worked for you :)[name=Boaz][color=#C71585]

> How can I make sure that I'm in the MTurk sandbox?
>> * Use the Sandbox endpoint when calling the API: `https://mturk-requester-sandbox.us-east-1.amazonaws.com`
>> * To see your HITs, go to the Worker Sandbox website (https://workersandbox.mturk.com/)
>>
>> [name=Boaz][color=#FF69B4]

> How can I make credentials for a user with read-only permissions for MTurk?
>> When you create the user on the AWS console, search for "Turk" and you'll see a few options; choose the one with Read Only. See slide 42 for more information.[name=Boaz][color=#DB7093]

> In the Homework assignment, it says that "Each HIT is one tweet." and "Each HIT will have 3 assignments."
> Does the 3 assignments refer to valence, arousal, and dominance?
>> I think that 3 assignments means that a HIT can be performed by three different workers, once per worker. Valence, arousal, and dominance are the scores the workers give after they read the tweet.
>> [name=classmate]
>>> Yes, exactly, thank you `classmate`! Each HIT will ask the worker to rate a tweet on all VAD dimensions (3 scores). In total we need to create num_tweets=378 HITs. When we create each HIT, we specify `MaxAssignments`, so that the HIT will be available to `MaxAssignments` different workers. After a HIT is completed by `MaxAssignments` workers, we will have `MaxAssignments` different VAD ratings for the tweet. In our case, `MaxAssignments=3`. [name=Boaz][color=#FFB6C1]

> So we need to write the HTML ourselves? How do we connect the HTML and get the random line in our csv?
>> Yes, you need to create the HTML file (or files) and host it on your external server.
>> Check [ExternalQuestion](https://docs.aws.amazon.com/AWSMechTurk/latest/AWSMturkAPI/ApiReference_ExternalQuestionArticle.html) for more information, or the sample Colab code.
>> I am not sure what you mean by "random line", maybe you can clarify?[name=Boaz][color=#DB7093]

> I have a similar question. Do we need to randomly pick a tweet and show it to the turker each time a turker accepts our HIT? Or can we have a fixed set of tweets and collect information from turkers by showing them only these fixed tweets?
>> There is no need to do anything at random :) You create one task (HIT) per tweet, for a total of 378 HITs. Amazon takes care of the rest. It will assign the HITs to workers. It will assign each HIT to three workers, since `MaxAssignments=3`. (The allocation of HITs to workers is done automatically by Amazon, and it may be in a random order, but you don't need to worry about that!)[name=Boaz][color=#DB7093]
>>> So do we need to create an HTML file for each tweet, 378 HTML files in total?
>>>> I think so, that's how our group did it. [name=classmate]
>>>>> I think there may be an easier way if you pass a parameter in your URL, such as `your.url:port/get_text?idx=1` [name=another classmate]

> I have a problem with the requester's name. I changed my name [here](https://requester.mturk.com/account), and the change was saved correctly. But when I create a HIT, I still see my original requester name when searching all HITs on the Worker Sandbox website. How can I actually change my requester's name? Thanks.
>> We found that it updates automatically after a few hours, thanks.
>>> I have waited for about 12 hours and the username still won't change... I have changed every single username I can see in every account setting. I've even registered another account but accidentally named it wrong. hmmm
>>>> Interesting. I'll try to change my test account name. Let's see how long it takes until it shows the new name :) [name=Boaz][color=#DB7093]
>>>> Update: this morning my HITs are updated with the new name. So perhaps they update it once or twice a day. [name=Boaz][color=#DB7093]

> I want to confirm what to submit to the new e3 platform.
> In [Lab 4 Assignment](https://docs.google.com/document/d/1PVn-icznCCtMJ99ej-xvt2TG9MCET7mOrYk8Xg0OCk8/edit#), it says **Submit on e3: *lab4<TeamName>.ipynb* (for example, lab4_Team_Pudding.ipynb)**. But on the e3 platform, it says **Upload the Python script (*lab4-<TeamName>.py*, (for example, lab4_Team_Pudding.ipynb)**. Is either of them (**.py** or **.ipynb**) acceptable?
>> Thank you for spotting the typos! They will be fixed soon. As usual, the Colab Notebook ends with .ipynb, and the corresponding Python file ends with .py. Use the usual SOP when submitting the homework.[name=Boaz][color=#FFB6C1]

> I often get this error when I'm running APIs such as `client.create_hit_with_hit_type()`, `client.list_assignments_for_hit()`, ... Is this a problem with the APIs?
```
ClientError: An error occurred (ThrottlingException) when calling the ListAssignmentsForHIT operation (reached max retries: 4): Rate exceeded
```

> So we really need to create 378 HTML files, 378 question variables, and 378 response variables??
>> As I mentioned above, you can create a server and pass a parameter in your URL. This way, you can avoid creating 378 HTML pages. [name=classmate]

> Do we need to upload our external files, such as the .js, .html, or .py files for the server, to e3?
>> Please follow the instructions. The instructions do not mention these files.[name=Boaz][color=#DB7093]

> Do we need to include the code for producing `results.csv` in the Python script for phase 1? Or can we add this part when we hand in the phase 3 script?
>> Phase 1 submission is the script for uploading the HITs. Phase 3 submission is the script for downloading the assignments and calculating the results. In addition, the `results.csv` and `credentials.csv` files should be included in phase 3.[name=Boaz][color=#FFB6C1]

> Does anyone know how to get the qualification type ID if I did not record it? I cannot find it anywhere, so I cannot delete the qualification I created for testing.

> Does the same group share the same AWS account?
>> There is no need to share the AWS account username/password. The AWS account holder can create a credentials file that can be used by all team members.[name=Boaz][color=#DB7093]

> We have a problem with our requester name. We changed our name two days ago, but we still can't find our HITs using the NCTU keyword. We need to use the original user name to search for our HITs, as shown in the figure below.
> ![](https://i.imgur.com/XosLGGe.png)
> Should we create a new account and upload again, or keep waiting?
>> No need to create a new account or re-upload any HITs.
>> Your HITs can also be found by searching for "emotion", so you should be OK.[name=Boaz][color=#FFB6C1]

> I'm sure that more than 150 assignments have been completed, but when I try to fetch the results, this error shows up at about the 60th HIT: `ClientError: An error occurred (ThrottlingException) when calling the ListAssignmentsForHIT operation (reached max retries: 4): Rate exceeded`. I tried again and the error showed up at about the 100th HIT. Did anyone run into the same problem?
>> I think it is because the results are being fetched too fast. I added `sleep(a short period of time, in seconds)` before calling ListAssignmentsForHIT and have not encountered this error since. The trade-off is that it takes longer to get all the assignment responses.[name=classmate]
>>> Thank you! I'll try this right now.

> If a HIT has not been completed by anyone (the number of completed assignments equals zero), does it also need to be shown in `results.csv`? If so, what should the values of `avg_valence`, `avg_arousal`, `avg_dominance`, and `avg_time` be?
>> Do what you think is more sensible.[name=Boaz][color=#DB7093]
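
Regarding the `ThrottlingException` thread above: a fixed `sleep` before every call works, but it slows down even the calls that would have succeeded. A possible alternative is to retry only when throttling actually happens, waiting a little longer each time. The sketch below is an illustration, not part of the course material. The helper names `is_throttling_error` and `call_with_backoff` are made up for this example, and it assumes the exception carries a `response` dictionary with `Error` / `Code` keys, which is how boto3's `ClientError` is structured.

```python
import random
import time


def is_throttling_error(exc):
    """Return True if exc looks like a throttling error.

    Assumption: the exception has a `response` dict shaped like the one on
    botocore's ClientError, i.e. {"Error": {"Code": "ThrottlingException"}}.
    """
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    return response.get("Error", {}).get("Code") == "ThrottlingException"


def call_with_backoff(func, *args, max_attempts=6, base_delay=0.5,
                      sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying on throttling errors.

    The wait doubles after every throttled attempt (exponential backoff),
    plus a small random jitter. Any other error, or a throttling error on
    the last allowed attempt, is re-raised unchanged.
    """
    for attempt in range(max_attempts):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if not is_throttling_error(exc) or last_attempt:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

With a boto3 MTurk client, usage might look like `call_with_backoff(client.list_assignments_for_hit, HITId=hit_id)` inside the loop over your HITs. The values of `max_attempts` and `base_delay` are arbitrary starting points, so tune them to how often you hit the rate limit.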