# Quiz Module 3.2

## Descriptive Statistics

### Central tendency

1. **A study group is looking to determine what proportion of US adults aged 75+ work out on a regular basis. To do so, they take a sample of 80 US adults aged 75+. Identify the population.**
    - [ ] All US adults
    - [ ] 80 US adults aged 75+
    - [x] All US adults aged 75+
    - [ ] All US citizens
2. **In a city with large wage disparities, what is the appropriate measure of center?**
    - [ ] mean
    - [x] median
    - [ ] mode
3. In a class of 38 female students and 24 male students, the overall mean on the recent midterm exam was 87. If the male students had a mean of 92, what was the approximate mean for the female students?
    - [ ] 82.6
    - [x] 83.8
    - [ ] 89.5
    - [ ] 90.4

### Dispersion

1. **If quartiles Q1 = 20, Q3 = 30, which of the following must be true?**
    - [ ] The median is 25
    - [ ] The mean is between 25 and 30
    - [ ] Both are correct
    - [x] None are correct
2. Given the variance of a dataset, how do you calculate the standard deviation?
    - [ ] Divide the variance by the number of points in the dataset.
    - [ ] Square the difference between each data point and the mean.
    - [ ] Multiply the variance by the mean.
    - [x] Take the square root of the variance.
3. **Which point from the options below is the least common?**
    - [ ] A datapoint that is 1 standard deviation below the mean.
    - [x] A datapoint that is 3.5 standard deviations below the mean.
    - [ ] A datapoint that is equal to the mean.
    - [ ] A datapoint that is 3 standard deviations above the mean.
4. Given a dataset of 100 points that all have the value 20, what is the variance of the dataset?
    - [ ] Infinity
    - [x] 0
    - [ ] 100
    - [ ] 20
5. The mean of 5 observations is 3 and the variance is 2. If three of the five observations are 1, 3, 5, find the other two.
    - [ ] 2, 6
    - [ ] 3, 3
    - [ ] 1, 5
    - [x] 2, 4

## Data Cleaning

### Remove Unwanted Data

1. Which method is used to delete a row or column in a DataFrame?
    - [ ] delete()
    - [ ] del()
    - [x] drop()
    - [ ] None of the above
2. **The following statement will ...**

    ```python
    df = df.drop(columns=['Name', 'Class', 'Rollno'])
    ```

    - [x] delete three columns having labels 'Name', 'Class' and 'Rollno'
    - [ ] delete three rows having labels 'Name', 'Class' and 'Rollno'
    - [ ] delete any three columns
    - [ ] return error
3. Which method is used to change the labels of rows and columns in a DataFrame?
    - [ ] change()
    - [x] rename()
    - [ ] replace()
    - [ ] None of the above
4. You have a dataframe 'df' in which the 5th column is incorrectly labelled as 'Bld Grp' instead of 'Blood Group'. Which statement corrects the label?
    - [ ] df = df.rename(cols = {'Bld Grp':'Blood Group'})
    - [x] df = df.rename(columns = {'Bld Grp':'Blood Group'})
    - [ ] df.columns.values = 'Blood Group'
    - [ ] df.columns.values[5] = 'Blood Group'

### Handle Duplicated Data

1. What is a correct method to discover if a row is a duplicate?
    - [ ] df.duplicate()
    - [x] df.duplicated()
    - [ ] df.dup()
2. What is a correct method to remove duplicates from a Pandas DataFrame?
    - [ ] df.remove_duplicates()
    - [ ] df.duplicates()
    - [x] df.drop_duplicates()
    - [ ] df.delete_duplicates()

### Handle Missing Values

1. What is a correct Pandas method for removing rows that contain empty cells?
    - [x] dropna()
    - [ ] remove_null()
    - [ ] delete_null()
2. True or false: by default, the Pandas dropna() method returns a new DataFrame, and will not change the original.
    - [x] True
    - [ ] False
3. What is a correct method to fill empty cells with a new value?
    - [ ] replacena()
    - [ ] value_null()
    - [ ] insertna()
    - [x] fillna()
4. When using the Pandas dropna() method, what argument allows you to change the original DataFrame instead of returning a new one?
    - [x] dropna(inplace = True)
    - [ ] dropna(original = True)
    - [ ] dropna(keep = True)

### Adjust Data Types and Corrupted Data

1. What is the correct method to check the data type?
    - [ ] df.data_type()
    - [x] df.info()
    - [ ] df.dtype()
    - [ ] df.info_data()

### Identify Outliers

1. In a study investigating the sugar consumption in teenagers' diets, summary statistics are noted below. Which of the following is a true statement?

    ![](https://i.imgur.com/qTAlL6N.png)

    - [ ] None of the values are outliers.
    - [ ] The value 60 is an outlier, and there can be no others.
    - [ ] Both 10 and 60 are outliers, and there can be no others.
    - [x] The value 60 is an outlier, and there may be others at the high end of the data set.
2. What is the correct way to find the third quartile of the column 'Countries' in a dataframe 'df'?
    - [ ] df['Countries'].quartile(0.75)
    - [ ] df['Countries'].quartile(3)
    - [x] df['Countries'].quantile(0.75)
    - [ ] df['Countries'].quantile(75)

## Working with Text

## Long and Wide Table

### What is Long and Wide form?

1. What type of table is this DataFrame?

    ![](https://i.imgur.com/xEeFgdk.png)

    - [x] Wide
    - [ ] Long
2. What type of table is this DataFrame?

    ![Uploading file..._boy77q4yr]()

    - [ ] Wide
    - [x] Long

### Pivot Table & Get Dummies

1. A technique which, when performed on a dataframe, rearranges the data from rows and columns into a report form is called _____.
    - [ ] summarising
    - [ ] reporting
    - [ ] grouping
    - [x] pivoting
2. One Hot Encoding (OHE) is a process to
    - [x] convert non-numeric categorical values in a column into numeric values
    - [ ] convert numeric values in a column into non-numeric categorical values
    - [ ] convert integer values in a column into decimals
3. Among the following functions, which one can be used to combine dataframes when they have a similar structure?
    - [ ] combine_first()
    - [x] concat()
    - [ ] merge()
4. For concat(), if axis=1, it will join the dataframes _____.
    - [ ] vertically
    - [x] horizontally
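
The answers in this last section can be checked with a short pandas session. The sketch below is illustrative only: the `long_df` table and its `student`, `subject` and `score` columns are made up for this example and are not part of the quiz material. It pivots a long table into a wide one, one-hot encodes a categorical column, and compares `concat()` with `axis=0` and `axis=1`.

```python
import pandas as pd

# Hypothetical long-form data, invented for illustration only.
long_df = pd.DataFrame({
    "student": ["Ann", "Ann", "Bob", "Bob"],
    "subject": ["math", "art", "math", "art"],
    "score": [90, 85, 70, 95],
})

# pivot(): long -> wide, one row per student and one column per subject.
wide_df = long_df.pivot(index="student", columns="subject", values="score")

# get_dummies(): one-hot encode the categorical 'subject' column.
dummies = pd.get_dummies(long_df["subject"])

# concat() with axis=0 stacks the frames vertically (more rows).
stacked = pd.concat([long_df, long_df], axis=0, ignore_index=True)

# concat() with axis=1 joins the frames horizontally (more columns).
side_by_side = pd.concat([long_df, dummies], axis=1)

print(wide_df.shape)       # (2, 2)
print(stacked.shape)       # (8, 3)
print(side_by_side.shape)  # (4, 5)
```

The shapes show the difference between the two axes: stacking doubles the row count from 4 to 8, while the horizontal join keeps 4 rows and adds the two dummy columns.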