# Quiz Module 3.1

## Numpy

### Intro to Numpy

1. How do you import Numpy?
- [ ] require numpy
- [x] import numpy as np
- [ ] from numpy import np
- [ ] import np

2. **Select the unstructured data types**
- [ ] Excel sheet
- [ ] SQL tables
- [ ] CSV files
- [x] Texts

3. What is an appropriate array for this dataframe?
![](https://i.imgur.com/8VZVWKU.png)
- [ ] array([420, 380, 390])
- [ ] array([390, 40])
- [ ] array([2, 390])
- [x] array([390, 45])

4. **What is NOT a characteristic of Numpy?**
- [ ] Numpy vectorizes data
- [ ] Numpy arrays process data faster than Python lists
- [x] A Numpy array can store different data types
- [ ] Numpy is a Python library

### Numpy syntax (p1)

1. **What is the value of c in the following array operation?**
```python=
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = a + b
```
- [ ] array([1, 2, 3], [4, 5, 6])
- [x] array([5, 7, 9])
- [ ] array([10, 11, 12])
- [ ] array([1, 2, 3, 4, 5, 6])

2. What is this data structure called? `[21, 22, 23, 24, 25]`
- [ ] scalar
- [x] array
- [ ] matrix

3. **What is the shape of this matrix?**
```python=
array([[2, 3, 5],
       [4, 7, 9],
       [11, 13, 16],
       [22, 3, 10]])
```
- [ ] (3, 4)
- [x] (4, 3)
- [ ] (4, 12)
- [ ] (2, 10)

### Numpy syntax (p2)

1. **What is the value of `height` after this program runs?**
```python=
import numpy as np

height = np.array([160, 167, 173, 159])
height[2] = 166
height[-1] = 153
```
- [ ] array([160, 167, 173, 159])
- [ ] array([160, 166, 173, 153])
- [x] array([160, 167, 166, 153])
- [ ] array([160, 167, 166, 159])

2. Consider the following array ```b = np.array([[20, 22, 24], [26, 28, 30], [32, 34, 36]])```. What would the output be for ```print(b[1:2, 0:2])```?
- [ ] array([[26, 28, 30]])
- [ ] array([[26, 28, 30], [32, 34, 36]])
- [x] array([[26, 28]])
- [ ] array([[20, 22, 24], [26, 28, 30]])

3. **Consider the array ```a = np.array([10, 4, 6, 9, 18, 22, 11, 13, 3, 2, 15])```. Which logical operation could be performed on the array to return ```array([ 4, 18, 22, 3, 2])```?**
- [x] a[(a>15) | (a<5)]
- [ ] a[(a<15) | (a>5)]
- [ ] a[(a>15) & (a<5)]
- [ ] a[(a<15) & (a>5)]

## Getting to Know Pandas

### Getting to know Pandas (p1)

1. **What is NOT an appropriate hosting service?**
- [ ] SQL
- [ ] BigQuery
- [ ] GitHub
- [x] Computer Hardware

2. **What is NOT an appropriate Python library that supports EDA?**
- [x] Beautiful Soup
- [ ] Pandas
- [ ] Matplotlib
- [ ] Seaborn

### Getting to know Pandas (p2)

1. **What is the 'Income Group' called?**
- [ ] DataFrame
- [ ] Series
- [ ] Observation
- [x] Feature

2. A Numpy series can store both Strings and Integers
- [ ] True
- [x] False

3. **We cannot access a Pandas dataframe using Index name**
- [ ] True
- [x] False

### DataFrame - Load and Overview

1. **Which of the following commands will correctly load a CSV file from a link into the DataFrame `content`?**
- [ ] content = pd.from_csv('https://dropbox.com/s...')
- [ ] content = pd.from_csv(https://dropbox.com/s...)
- [x] content = pd.read_csv('https://dropbox.com/s...')
- [ ] content = pd.read_csv(https://dropbox.com/s...)

2. **Which of the following commands will print the first 5 rows (not including the header) of the DataFrame df?**
- [ ] print(df[1:5])
- [x] print(df.head(5))
- [ ] print(df.info(5))

3. How do you import Pandas?
- [ ] require pandas
- [x] import pandas as pd
- [ ] from python import pd
- [ ] import pd

### DataFrame - Index

1. **What is the correct syntax to get all column names?**
- [ ] df.get_name()
- [x] df.columns
- [ ] df.name[:]
- [ ] df[columns]

2. **With ```df.set_index('Country Name', inplace=True)```, Country Name will be saved as a copy.**
- [ ] True
- [x] False

### DataFrame - Selection with loc

1. **If you only want to select the ages from the DataFrame, which of the following lines of code would you use?**
![](https://i.imgur.com/bYtKD7M.png)
- [ ] customers.loc['age']
- [x] customers.age
- [ ] customers[column == 'age']
- [ ] customers.loc[:, 'age']

2. **What is the correct line of code to select the Birth rate of Angola, Albania, and United Arab Emirates?**
![](https://i.imgur.com/QpjJEJ2.png)
- [ ] df.['Aruba':'Albania', 'Birth rate']
- [ ] df.loc[:-3, 'Birth rate']
- [x] df.loc['Angola':'United Arab Emirates', ['Birth rate']]
- [ ] df.['Angola', 'United Arab Emirates']

3. To access the values of a DataFrame using a column label, we can use:
![](https://i.imgur.com/dPdqfas.png)
- [ ] loc
- [ ] ```<dataframe object>.<column label>```
- [x] Both
- [ ] None

4. Which of the following is not correct?
- [ ] print(class.loc[:, 'girls':'subject'])
- [ ] print(class.loc[['class1':'class2'], ['girls':'subject']]
- [ ] print(class.loc['class1', 'class2', 'girls', 'subject']
- [x] print(class.loc['class', 'girls']

5. What will be the output of print(df.loc[:])?
- [ ] Display 'Error'
- [ ] Display all rows
- [ ] Display all columns
- [x] Display all rows and columns

### DataFrame - Selection with iloc

1. What is the correct way to display the last two rows?
- [ ] print(df[-2:-1])
- [ ] print(df.iloc[-2:-1])
- [x] print(df.tail(2))
- [ ] All of the above

2. **To extract the first 3 rows and 3 columns of a dataframe 'exp', which of the following is correct?**
- [ ] exp.iloc[0:2, 0:2]
- [x] exp.iloc[0:3, 0:3]
- [ ] exp.iloc[1:4, 1:4]
- [ ] exp.iloc[1:3, 1:3]

3. Now suppose you want to select the 3rd column, named 'Ratings', from the dataframe 'movies', but you want to select it as a dataframe instead of a series. Which of the following will achieve this?
- [ ] df[['Ratings']]
- [ ] df.iloc[:, [2]]
- [ ] df.Ratings
- [x] df.loc[:, ['Ratings']]

### Working with Series

1. **When selecting a single column or row, what will the data type of the result be?**
- [ ] Dataframe
- [ ] Array
- [ ] Python list
- [x] Series

2. For the dataframe below, what would be the result of df['Grade_letter'].nunique()?
![](https://i.imgur.com/v27fY00.png)
- [ ] ['A', 'F', 'C', 'B']
- [ ] ['A':2, 'F':1, 'C':1, 'B':1]
- [ ] 2, 1, 1, 1
- [x] 4

### Filter and Sort

1. **What is the correct way to print the 'Salary' column for the rows whose salary is larger than 90000?**
![](https://i.imgur.com/2Z9q4LA.png)
- [ ] df[df['Salary'] > 90000]
- [ ] df[df['Salary'] > 90000, 'Salary']
- [x] df.loc[df['Salary'] > 90000, 'Salary']

2. **What does ```df.sort_index()``` do?**
- [ ] Sort the values in all columns in ascending order
- [x] Sort the rows by their row names (index labels)
- [ ] Sort the sums of the values

3. Consider the DataFrame, sports_store, shown below, that gives the prices of various sports equipment sold at a local retail store. What command would you run if you wanted to find the average price of the items sold by this store?
![](https://i.imgur.com/nevKV0i.png)
- [ ] sports_store['price'].median()
- [ ] sports_store['price'].average()
- [ ] sports_store['price'].std()
- [x] sports_store['price'].mean()

4. Which of the following commands will you use to retain the teams that have points greater than 75 and have won at least 23 matches?
![](https://i.imgur.com/EYQXoqg.png)
- [ ] final.loc[(final['Points'] > 75) & (final['Won'] > 23)]
- [ ] final.loc[(final['Points'] >= 75) & (final['Won'] > 23)]
- [ ] final.loc[(final['Points'] >= 75) & (final['Won'] >= 23)]
- [x] final.loc[(final['Points'] > 75) & (final['Won'] >= 23)]

## Basic Pandas Functions

### Groupby

1. **What is the order of steps in the groupby() function?**
- [ ] Find the group -> Split the data -> Compile the data
- [ ] Apply the function -> Split the data -> Compile the data
- [ ] Clean the data -> Split the data -> Compile the data
- [x] Split the data -> Apply the function -> Compile the data

2. A movie review website employs several different critics. They store these critics’ movie ratings in a DataFrame called movie_ratings, which has three columns: critic, movie, and rating. What command would give the average rating for each movie?
- [ ] movie_ratings('movie').groupby['rating'].mean()
- [ ] movie_ratings['rating']groupby('movie').mean()
- [x] movie_ratings.groupby('movie')['rating'].mean()
- [ ] movie_ratings.groupby('movie', 'critic')['rating'].mean()

3. **The City Library has several branches throughout the area. They collect all of their book checkout data in a DataFrame called checkouts. The DataFrame contains the columns ‘location’, ‘date’, and ‘book_title’. If we want to compare the total number of books checked out at each branch, what code could we use?**
- [ ] checkouts.groupby('book_title')['location'].count()
- [ ] checkouts.groupby('location', 'book_title').count()
- [ ] checkouts.groupby('location').count('book_title')
- [x] checkouts.groupby('location')['book_title'].count()

4. A movie review website employs several different critics. They store these critics’ movie ratings in a DataFrame called movie_ratings, which has three columns: critic, movie, and rating. The following code gives the max rating of each critic. What type of object is the output of this code?
```python=
movie_ratings.groupby('critic')['rating'].max().reset_index()
```
- [x] Dataframe
- [ ] Series
- [ ] String
- [ ] Float

5. **What does the .agg() function do?**
- [x] Perform multiple calculations
- [ ] Optimize the calculations
- [ ] Select specific columns
- [ ] Rank the values

### Pivot Table

1. **Which of the following commands was used to achieve this pivot?**
![](https://i.imgur.com/VSrUKfR.png)
- [ ] df.pivot_table(data=df, index='movie', columns='critic', values='rating')
- [x] pd.pivot_table(data=df, index='critic', columns='movie', values='rating')
- [ ] pd.pivot_table(data=df, index='rating', columns='movie', values='critic')
- [ ] df.pivot_table(data=df, index='critic', columns='movie', values='rating')

2. Look at the dataframe named ‘final’, given below, and answer the question that follows: Suppose you want to find out the sum of all the points made by the teams in the two different leagues. Which pivot_table command will achieve this?
![](https://i.imgur.com/ILV6jMf.png)
- [x] final.pivot_table(values = 'Points', index = 'League', aggfunc = 'sum')
- [ ] final.pivot_table(values = 'League', index = 'Points', aggfunc = 'sum')
- [ ] final.groupby('League')['Points'].sum()
- [ ] final.groupby('Points')['League'].sum()
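
For reference, the pivot in the last question can be tried out with a minimal sketch like the one below. The real contents of `final` appear only in the image, so the teams, leagues, and points here are invented placeholder values, not the quiz data.

```python
import pandas as pd

# Hypothetical stand-in for the quiz's `final` DataFrame (all values are made up).
final = pd.DataFrame({
    "Team": ["A", "B", "C", "D"],
    "League": ["X", "X", "Y", "Y"],
    "Points": [80, 70, 90, 65],
})

# One row per league, holding the sum of that league's points.
points_by_league = final.pivot_table(values="Points", index="League", aggfunc="sum")
print(points_by_league)
```

The result is a DataFrame indexed by `League` with a single `Points` column, which is why `League` goes in `index` and `Points` in `values`, not the other way round.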