# Annotation of Queries ###### tags: `study_2_annotation` https://towardsdatascience.com/inter-annotator-agreement-2f46c6d37bf3 | Annotator | Annotator | Interrater reliability | | ------------- | ------------- | ---------------------- | | Annotator \#1 | Annotator \#2 | 0.879 | | Reinier | Annotator \#2 | 0.879 | | Reinier | Annotator \#1 | 0.880 | ## Objective participant We ask each participant to verify the answer given and provide a reason for why the answer is either right or wrong. | Correct | Partially correct | Incorrect | | --------------------------------- | --------------------------- | ----------------------------------- | | Answer is good, Reasoning is good | Answer or Reasoning is good | Both Answer and Reasoning incorrect | ### When is answer good? - if "Expected" answers match given answer ### When is reasoning bad? - it indicates that the user does not know how to use the application - For example the user only sees 5 or 10 records, this is something the user can adapt themselves, however do not always notice. So then they only find 5 or 10 records. <!-- - example 1: "no there are only ten and not fifteen" --> - it indicates that the user does not understand the question - For example the user expects other types of outcomes than is given. <!-- - example 1: "No, because it appears the information provided is not relevant" - example 2: "No because it only lists one number and does not specify treatment" --> - the reason is in the form of "it is correct because it is correct" <!-- - example 1: "Yes, i think the number is correct having looked at the dat" - example 2: "No I don't think the answer is correct. It seems too complicated for the algorithm." --> - the reasoning is irrelevant for the answer - For example the user mentions something unrelated or not relevant for the answer. <!-- - example 1: "no it doesn't seem right because he list is unclear it must be clearer " --> - the reasoning does not reflect the error that was introduced (if such an error was introduced; every odd question) ### When is reasoning solid (& relevant)? - if one of the following applies: - for expected answer "yes": - it indicates that the user knows how to use the different tables - like mentions of: there are 15 records, we need 8, 7 we do not need. - it indicates that the user calculated the answer and gives examples. (not just says it is just correct) - approximations are acceptable if the query is performs column-based filtering (like select, without the where), since those operations do not require row-based filtering. - for expected answer "no": - it indicates that the user caught the error - and one of the following from "for expected answer 'yes'" ### When is a participant rejected? - the reasoning indicates technical difficulties - For example the user is unable to see any data, or mentions having had issues during the process <!-- - example 1: "No - I think it's 6 but I might be wrong - some kind of glitch happened, sorry - survey seemed to freeze but back now (fingers crossed)" - example 2: "No - question outcome is not working so I'm unable to correctly answer this" --> ![](https://i.imgur.com/iBDpRpX.png) ***\* Annotations might change depending on first evaluating "Correct?" for every query.*** #### Correct 0. Correct (100%) 1. Partially correct (Answer or reasoning not correct) 2. Incorrect (Both answer and reason wrong) #### Annotations 1. **Nothing**: nothing noteworthy observed 2. **Wrong reason**: Does not understand the task/tables/data, provides a miscalculation (that cannot be attributed to pagination issues) or just plain wrong reason. 3. **Not a reason**: participant provides a reason that is not a reason or just stating the answer without reasoning why the answer is (in)correct. 7. **Pagination issues**: Do not know that there are more than 5 or 10 records. (Each table contains around 15 records) This limitation is because they do not know how to use the table pagination. 8. **Bias**: trust the program to give the right answer, think it is easy for algorithm to calculate, specifity bias (5.06666666667). 9. **Technical error**: program hiccup 8. **Anomaly**: provides an unique reasoning which seems to have no merit at all and cannot be attributed to miscalculations. ### Intricacies of the dataset - Each table contains 15 rows, however, that does not mean each row is "unique" - some dogs have had 0 treaments, others had multiple. - there are 7 veterenarians, 8 employees; these are all professionals - there are 3 different treatment types: physical examination, vaccination and take for a walk; these are all counted as treaments - Some owners have multiple dogs ### Intricacies of the application - Users can select to show 5, 10 or 15 records. this means some users do not know that they can do that and answer incorrectly. - Users can sidescroll. However, not all of them are aware of this. ------------- ------------- ### Question 1: "What is the average age of all the dogs?" Expected answer: "no" Expected error reason: "the result shows a not relevant extra column 'dog identification number'." Query: ```sql= SELECT Avg(Dogs.age), Dogs.dog_id FROM Dogs ``` Query outcome: | Avg(Dogs.age) | dog_id | | ---------------- | ------ | | 5.06666666666667 | 1 | ------------- ### Question 2: "What is the average age of all the dogs?" Expected answer: "yes" Query: ```sql= SELECT Avg(Dogs.age) FROM Dogs ``` Query outcome: | Avg(Dogs.age) | | ---------------- | | 5.06666666666667 | ------------- ### Question 3: "How much does the most expensive charge type cost?" Expected answer: "no" Expected error reason: "the result shows name of the most expensive charge type, not the actual cost." Query: ```sql= SELECT Charges.charge_type FROM Charges ORDER BY Charges.charge_type Desc LIMIT 1 ``` Query outcome: | charge_type | | ------------ | | Health Check | ------------- ### Question 4: "How much does the most expensive charge type cost?" Expected answer: "yes" Query: ```sql= SELECT Charges.charge_amount FROM Charges ORDER BY Charges.charge_amount Desc LIMIT 1 ``` Query outcome: | charge_amount | | ------------- | | 640 | ------------- ### Question 5: "What are each owner's first name and their dogs's name?" Expected answer: "no" Expected error reason: "the result shows only the owners name. We also want the name of the dog." Query: ```sql= SELECT Owners.first_name FROM Owners JOIN Dogs ON Owners.owner_id = Dogs.owner_id ``` Query outcome: | first_name | | ---------- | | Jaclyn | | Gay | | Nora | | Rachelle | | Emelie | | Johann | | Jaclyn | | Rachelle | | Melisa | | Kade | | indy | | Orlando | | Rolando | | Rachelle | | Lorenz | ------------- ### Question 6: "What are each owner's first name and their dogs's name?" Expected answer: "yes" Query: ```sql= SELECT Owners.first_name, Dogs.name FROM Owners JOIN Dogs ON Owners.owner_id = Dogs.owner_id ``` Query outcome: | first_name | name | | ---------- | ---------- | | Jaclyn | Kacey | | Gay | Hipolito | | Nora | Mavis | | Rachelle | Houston | | Emelie | Jeffrey | | Johann | Merritt | | Jaclyn | Narciso | | Rachelle | George | | Melisa | Bessie | | Kade | Troy | | indy | Betty | | Orlando | Holden | | Rolando | Jesus | | Rachelle | Lyric | | Lorenz | Evangeline | ------------- ### Question 7: "What are the email, cell phone and home phone of each professional?" Expected answer: "no" Expected error reason: "the result shows no cell phone number of the professional." Query: ```sql= SELECT Professionals.email_address, Professionals.home_phone FROM Professionals ``` Query outcome: | email_address | home_phone | | ------------------------------- | -------------------- | | deanna.schuster@example.com | +71(6)2898266914 | | lucile.shanahan@example.org | +02(1)0259033559 | | uboehm@example.org | 325-155-0801x7005 | | lourdes.lowe@example.net | 312.216.3352 | | mekhi.little@example.org | 1-609-566-2752x25197 | | jacynthe.mclaughlin@example.net | +43(5)1132733868 | | lambert62@example.org | 022.529.0550x1319 | | goyette.roosevelt@example.net | 891.475.2256 | | schneider.kathryne@example.org | 320-508-6023 | | jerrod.bahringer@example.org | (230)338-3342x585 | | west.eula@example.net | (920)304-4499x59146 | | marquardt.furman@example.org | 246-951-0080x76716 | | delphine29@example.com | 346.594.3739 | | cole.margarita@example.org | 971.048.3763x9404 | | jeichmann@example.com | 1-138-287-3775 | ------------- ### Question 8: "What are the email, cell phone and home phone of each professional?" Expected answer: "yes" Query: ```sql= SELECT Professionals.email_address, Professionals.cell_number, Professionals.home_phone FROM Professionals ``` Query outcome: | cell_number | email_address | home_phone | | ------------------- | ------------------------------- | -------------------- | | (275)939-2435x80863 | deanna.schuster@example.com | +71(6)2898266914 | | 889-940-2676 | lucile.shanahan@example.org | +02(1)0259033559 | | (369)908-7311x065 | uboehm@example.org | 325-155-0801x7005 | | 00230569697 | lourdes.lowe@example.net | 312.216.3352 | | 011.193.9081x3186 | mekhi.little@example.org | 1-609-566-2752x25197 | | 139-321-7313 | jacynthe.mclaughlin@example.net | +43(5)1132733868 | | 499-434-0215x1628 | lambert62@example.org | 022.529.0550x1319 | | 328.842.3792 | goyette.roosevelt@example.net | 891.475.2256 | | 962-983-8109x3509 | schneider.kathryne@example.org | 320-508-6023 | | 461-801-2600 | jerrod.bahringer@example.org | (230)338-3342x585 | | 609-405-2990 | west.eula@example.net | (920)304-4499x59146 | | 1-181-670-9466 | marquardt.furman@example.org | 246-951-0080x76716 | | 880-659-7577x736 | delphine29@example.com | 346.594.3739 | | 1-185-137-1945x409 | cole.margarita@example.org | 971.048.3763x9404 | | 1-258-285-4707x8020 | jeichmann@example.com | 1-138-287-3775 | ------------- ### Question 9: "How many dogs have an age below the average?" Expected answer: "no" Expected error reason: "the result shows the total number of dogs, not number of dogs below average." Query: ```sql= SELECT Count(*) FROM Dogs ``` Query outcome: | Count(\*) | | --------- | | 15 | ------------- ### Question 10: "How many dogs have an age below the average?" Expected answer: "yes" Query: ```sql= SELECT Count(*) FROM Dogs WHERE Dogs.age < (SELECT Avg(Dogs.age) FROM Dogs) ``` Query outcome: | Count(\*) | | --------- | | 9 | ------------- ### Question 11: "Which states have both owners and professionals living there?" Expected answer: "no" Expected error reason: "the result shows the identification numbers that overlap for owners and professionals, not the states that overlap." / "we were expecting a list of states not a list of numbers" Query: ```sql= SELECT Owners.owner_id FROM Owners INTERSECT SELECT Professionals.professional_id FROM Professionals ``` Query outcome: | owner_id | | -------- | | 1 | | 2 | | 3 | | 4 | | 5 | | 6 | | 7 | | 8 | | 9 | | 10 | | 11 | | 12 | | 13 | | 14 | | 15 | ------------- ### Question 12: "Which states have both owners and professionals living there?" Expected answer: "yes" Query: ```sql= SELECT Owners.state FROM Owners INTERSECT SELECT Professionals.state FROM Professionals ``` Query outcome: | state | | ----------- | | Indiana | | Mississippi | | Wisconsin | ------------- ### Question 13: "How many dogs have not gone through any treatment?" Expected answer: "no" Expected error reason: "the result shows only the total number of dogs (or treatments), instead of only counting dogs that didn't go through any treatment."" Query: ```sql= SELECT Count(*) FROM Dogs ``` Query outcome: | Count(\*) | | --------- | | 15 | ------------- ### Question 14: "How many dogs have not gone through any treatment?" Expected answer: "yes" Query: ```sql= SELECT Count(*) FROM Dogs WHERE Dogs.dog_id NOT IN (SELECT Treatments.dog_id FROM Treatments) ``` Query outcome: | Count(\*) | | --------- | | 6 | ------------- ### Question 15: "How many professionals did not operate any treatment on dogs?" Expected answer: "no" Expected error reason: "the result shows the total number of professionals, instead of only counting professionals that did not operate any treatment on dogs." Query: ```sql= SELECT Count(*) FROM Professionals ``` Query outcome: | Count(\*) | | --------- | | 15 | ------------- ### Question 16: "How many professionals did not operate any treatment on dogs?" Expected answer: "yes" Query: ```sql= SELECT Count(*) FROM Professionals WHERE Professionals.professional_id NOT IN (SELECT Treatments.professional_id FROM Treatments) ``` Query outcome: | Count(\*) | | --------- | | 7 |