# Quiz Module 3.1
## Numpy
### Intro to Numpy
1. How do you import Numpy?
- [ ] require numpy
- [x] import numpy as np
- [ ] from numpy import np
- [ ] import np
2. **Select the unstructured data types**
- [ ] Excel sheet
- [ ] SQL tables
- [ ] csv files
- [x] Texts
3. What is an appropriate array for this dataframe?

- [ ] array([420, 380, 390])
- [ ] array([390, 40])
- [ ] array([2, 390])
- [x] array([390, 45])
4. **What is NOT a characteristic of Numpy?**
- [ ] Numpy vectorizes data
- [ ] A Numpy array processes data faster than a Python list
- [x] Numpy array can store different data types
- [ ] Numpy is a Python library
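
As a small illustration of the characteristics listed above, the sketch below shows how a Numpy array converts its elements to one common data type and how an operation applies to every element at once. The variable names and values are made up for this example.

```python
import numpy as np

# Hypothetical values, chosen only for illustration.
mixed = np.array([1, 2.5, 3])        # the integers are upcast to floats
as_text = np.array([1, "two", 3])    # every element is converted to a string

# Vectorized operation: applied to all elements without a Python loop.
doubled = mixed * 2

print(mixed.dtype)         # one dtype for the whole array
print(as_text.dtype.kind)  # 'U' means Unicode string
print(doubled)
```

This is why an array is described as holding a single data type: mixed inputs are coerced rather than stored as they are.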
### Numpy syntax (p1)
1. **What is the value of c in the following array operation?**
```python=
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = a+b
```
- [ ] array([1, 2, 3], [4, 5, 6])
- [x] array([5, 7, 9])
- [ ] array([10, 11, 12])
- [ ] array([1, 2, 3, 4, 5, 6])
2. What is this data structure called?
[21, 22, 23, 24, 25]
- [ ] scalar
- [x] array
- [ ] matrix
3. **What is the shape of this matrix?**
```python=
array([[2, 3, 5],
[4, 7, 9],
[11, 13, 16],
[22, 3, 10]])
```
- [ ] (3, 4)
- [x] (4, 3)
- [ ] (4, 12)
- [ ] (2, 10)
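
The questions above can be checked with a short sketch like the following. It reuses the arrays from the questions; the name `m` for the matrix is an assumption made here.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = a + b  # element-wise addition, not concatenation

# `m` is a name picked for this example; the quiz does not name the matrix.
m = np.array([[2, 3, 5],
              [4, 7, 9],
              [11, 13, 16],
              [22, 3, 10]])

print(c)        # [5 7 9]
print(m.shape)  # (4, 3): rows first, then columns
```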
### Numpy syntax (p2)
1. **What is the value of ```height``` after this program runs?**
```python=
height = np.array([160, 167, 173, 159])
height[2] = 166
height[-1] = 153
```
- [ ] array([160, 167, 173, 159])
- [ ] array([160, 166, 173, 153])
- [x] array([160, 167, 166, 153])
- [ ] array([160, 167, 166, 159])
2. Consider the array ```b = np.array([[20, 22, 24], [26, 28, 30], [32, 34, 36]])```. What does ```b[1:2, 0:2]``` return?
- [ ] array([[26, 28, 30]])
- [ ] array([[26, 28, 30],
[32, 34, 36]])
- [x] array([[26, 28]])
- [ ] array([[20, 22, 24],
[26, 28, 30]])
3. **Consider the array ```a = np.array([10, 4, 6, 9, 18, 22, 11, 13, 3, 2, 15])```. Which logical operation could be applied to the array to return ```array([ 4, 18, 22, 3, 2])```?**
- [x] a[(a>15) | (a<5)]
- [ ] a[(a<15) | (a>5)]
- [ ] a[(a>15) & (a<5)]
- [ ] a[(a<15) & (a>5)]
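
A minimal sketch that runs the three operations from this section, using the arrays given in the questions. The names `sub` and `picked` are introduced here for illustration.

```python
import numpy as np

# Index assignment: position 2 and the last position are overwritten.
height = np.array([160, 167, 173, 159])
height[2] = 166
height[-1] = 153

# 2-D slicing: the stop index is excluded, so 1:2 keeps row 1 only
# and 0:2 keeps columns 0 and 1.
b = np.array([[20, 22, 24], [26, 28, 30], [32, 34, 36]])
sub = b[1:2, 0:2]

# Boolean mask: `|` means "or"; each condition needs its own parentheses.
a = np.array([10, 4, 6, 9, 18, 22, 11, 13, 3, 2, 15])
picked = a[(a > 15) | (a < 5)]

print(height)  # [160 167 166 153]
print(sub)     # [[26 28]]
print(picked)  # [ 4 18 22  3  2]
```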
## Getting to Know Pandas
### Getting to know Pandas (p1)
1. **What is NOT an appropriate hosting service?**
- [ ] SQL
- [ ] Big Query
- [ ] GitHub
- [x] Computer Hardware
2. **What is NOT an appropriate Python library that supports EDA?**
- [x] Beautiful Soup
- [ ] Pandas
- [ ] Matplotlib
- [ ] Seaborn
### Getting to know Pandas (p2)
1. **What is the 'Income Group' called?**
- [ ] DataFrame
- [ ] Series
- [ ] Observation
- [x] Feature
2. A Numpy series can store both Strings and Integers
- [ ] True
- [x] False
3. **We cannot access a Pandas dataframe using Index name**
- [ ] True
- [x] False
### DataFrame - Load and Overview
1. **Which of the following commands will correctly load a csv file from a link into the DataFrame content?**
- [ ] content = pd.from_csv('https://dropbox.com/s...')
- [ ] content = pd.from_csv(https://dropbox.com/s...)
- [x] content = pd.read_csv('https://dropbox.com/s...')
- [ ] content = pd.read_csv(https://dropbox.com/s...)
2. **Which of the following commands will print the first 5 rows (not including the header) of the DataFrame df?**
- [ ] print(df[1:5])
- [x] print(df.head(5))
- [ ] print(df.info(5))
3. How do you import Pandas?
- [ ] require pandas
- [x] import pandas as pd
- [ ] from python import pd
- [ ] import pd
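
The loading and preview steps above can be sketched as follows. Because the link in the question is truncated, this example reads from an in-memory string instead; the CSV text and its column names are invented for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for a file path or URL.
csv_text = """name,age
Ann,31
Bob,25
Cy,40
Di,22
Ed,35
Flo,29
"""

# read_csv accepts a path, a URL string, or a file-like object.
content = pd.read_csv(io.StringIO(csv_text))

# head(5) returns the first 5 data rows; the header row is not counted.
first_five = content.head(5)
print(first_five)
```

When reading from a real link, the URL is passed as a quoted string in place of the `io.StringIO` object.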
### DataFrame - Index
1. **What is the correct syntax to get all column names?**
- [ ] df.get_name()
- [x] df.columns
- [ ] df.name[:]
- [ ] df[columns]
2. **With ```df.set_index('Country Name', inplace=True)```, Country Name will be saved as a copy.**
- [ ] True
- [x] False
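
A brief sketch of the two ideas in this section, assuming a tiny DataFrame with made-up numbers. With `inplace=True`, `set_index` changes `df` itself and returns `None` rather than a new copy.

```python
import pandas as pd

# Made-up values; only the column names follow the quiz.
df = pd.DataFrame({
    "Country Name": ["Aruba", "Angola"],
    "Birth rate": [10.2, 45.9],
})

column_names = list(df.columns)  # all column names

# inplace=True modifies df directly; nothing is returned.
result = df.set_index("Country Name", inplace=True)

print(column_names)
print(result)         # None
print(df.index.name)  # Country Name
```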
### DataFrame - Selection with loc
1. **If you only want to select the ages from the DataFrame, which of the following lines of code would you use?**

- [ ] customers.loc['age']
- [x] customers.age
- [ ] customers[column == 'age']
- [ ] customers.loc[:, 'age']
2. **What is the correct line of code to select the Birth rate of Angola, Albania, and United Arab Emirates?**

- [ ] df.['Aruba':'Albania', 'Birth rate']
- [ ] df.loc[:-3, 'Birth rate']
- [x] df.loc['Angola':'United Arab Emirates', ['Birth rate']]
- [ ] df.['Angola', 'United Arab Emirates']
3. To access the values of a DataFrame using a column label, we can use:

- [ ] loc
- [ ] ```<dataframe object>.<column label>```
- [x] Both
- [ ] None
4. Which of the following is not correct?
- [ ] print(class.loc[:, 'girls':'subject'])
- [ ] print(class.loc[['class1':'class2'], ['girls':'subject']])
- [ ] print(class.loc['class1', 'class2', 'girls', 'subject'])
- [x] print(class.loc['class', 'girls'])
5. What will be the output of print(df.loc[:])?
- [ ] Display ‘Error’
- [ ] Display all rows
- [ ] Display all columns
- [x] Display all rows and columns
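
The sketch below illustrates label-based selection with `loc`, assuming a small DataFrame indexed by country name. All numbers and the second column name are placeholders, not real data.

```python
import pandas as pd

# Placeholder numbers; the index labels follow the quiz question.
df = pd.DataFrame(
    {"Birth rate": [10.2, 45.9, 12.9, 11.0],
     "Internet users": [78.9, 19.1, 57.2, 88.0]},
    index=["Aruba", "Angola", "Albania", "United Arab Emirates"],
)

# Label slices include BOTH endpoints, unlike positional slices.
# A list of column labels keeps the result a DataFrame.
subset = df.loc["Angola":"United Arab Emirates", ["Birth rate"]]

# A single column label gives a Series; these two lines are equivalent.
as_series = df.loc[:, "Birth rate"]
same_series = df["Birth rate"]

# A bare colon selects every row and every column.
everything = df.loc[:]

print(subset)
```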
### DataFrame - Selection with iloc
1. What is the correct way to display the last two rows?
- [ ] print(df[-2:-1])
- [ ] print(df.iloc[-2:-1])
- [x] print(df.tail(2))
- [ ] All of the above
2. **To extract the first 3 rows and 3 columns of a dataframe 'exp', which of the following is correct?**
- [ ] exp.iloc[0:2, 0:2]
- [x] exp.iloc[0:3, 0:3]
- [ ] exp.iloc[1:4, 1:4]
- [ ] exp.iloc[1:3, 1:3]
3. Suppose you want to select the 3rd column, named 'Ratings', from the dataframe 'movies' (referred to as df in the options below), as a dataframe instead of a series. Which of the following achieves this?
- [ ] df[['Ratings']]
- [ ] df.iloc[:, [2]]
- [ ] df.Ratings
- [x] df.loc[:, ['Ratings']]
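
A small sketch of position-based selection, assuming a hypothetical `movies` DataFrame whose third column is 'Ratings'. The titles and numbers are invented.

```python
import pandas as pd

# Invented data; 'Ratings' is placed third to match the question.
movies = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Year": [2001, 2005, 2010, 2015],
    "Ratings": [7.1, 6.4, 8.0, 5.9],
    "Votes": [100, 250, 80, 40],
})

# iloc slices exclude the stop position, so 0:3 covers positions 0, 1, 2.
top_left = movies.iloc[0:3, 0:3]

# tail(2) returns the last two rows; -2:-1 stops before the last row.
last_two = movies.tail(2)
one_row = movies.iloc[-2:-1]

# Attribute access gives a Series; a list of labels gives a DataFrame.
ratings_series = movies.Ratings
ratings_frame = movies.loc[:, ["Ratings"]]

print(top_left.shape)       # (3, 3)
print(type(ratings_frame))
```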
### Working with Series
1. **When selecting a single column or row, what will the data type of the result be?**
- [ ] Dataframe
- [ ] Array
- [ ] Python list
- [x] Series
2. For the dataframe above, what would be the result of df['Grade_letter'].nunique()?

- [ ] ['A', 'F', 'C', 'B']
- [ ] ['A':2, 'F':1, 'C':1, 'B':1]
- [ ] 2, 1, 1, 1
- [x] 4
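
To show how `nunique()` differs from related Series methods, here is a sketch on a hypothetical 'Grade_letter' column. The five grades are assumed for illustration, since the dataframe referred to in the question is not reproduced here.

```python
import pandas as pd

# Assumed data: five students, with 'A' appearing twice.
df = pd.DataFrame({"Grade_letter": ["A", "F", "C", "B", "A"]})

col = df["Grade_letter"]     # selecting one column gives a Series

n_distinct = col.nunique()   # number of distinct values
distinct = col.unique()      # the distinct values themselves
counts = col.value_counts()  # how often each value occurs

print(n_distinct)  # 4
print(counts)
```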
### Filter and Sort
1. **What is the correct way to print only the 'Salary' column for the rows where its value is larger than 90000?**

- [ ] df[df['Salary'] > 90000]
- [ ] df[df['Salary'] > 90000, 'Salary']
- [x] df.loc[df['Salary'] > 90000, 'Salary']
2. **What does ```df.sort_index()``` do?**
- [ ] Sort the values in all columns in ascending order
- [x] Sort the rows by their index labels (row names)
- [ ] Sort by the sum of the values
3. Consider the DataFrame, sports_store, shown below, that gives the prices of various sports equipment sold at a local retail store. What command would you run if you wanted to find the average price of the items sold by this store?

- [ ] sports_store['price'].median()
- [ ] sports_store['price'].average()
- [ ] sports_store['price'].std()
- [x] sports_store['price'].mean()
4. Which of the following commands will you use to retain the teams that have points greater than 75 and have won at least 23 matches?

- [ ] final.loc[(final['Points'] > 75) & (final['Won'] > 23)]
- [ ] final.loc[(final['Points'] >= 75) & (final['Won'] > 23)]
- [ ] final.loc[(final['Points'] >= 75) & (final['Won'] >= 23)]
- [x] final.loc[(final['Points'] > 75) & (final['Won'] >= 23)]
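
The filtering, sorting, and averaging steps above can be sketched together. The DataFrame below is hypothetical: the team names and numbers are invented, and only the column names 'Points' and 'Won' come from the question.

```python
import pandas as pd

# Invented teams and scores.
final = pd.DataFrame(
    {"Points": [80, 75, 90, 76], "Won": [23, 25, 22, 30]},
    index=["Delta", "Alpha", "Charlie", "Bravo"],
)

# "Greater than 75" is >, "at least 23" is >=; & combines the two masks.
kept = final.loc[(final["Points"] > 75) & (final["Won"] >= 23)]

# Row mask plus a column label returns only that column for matching rows.
high_points = final.loc[final["Points"] > 75, "Points"]

# sort_index orders the rows by their index labels, not by column values.
by_label = final.sort_index()

avg_points = final["Points"].mean()

print(kept)
print(avg_points)  # 80.25
```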
## Basic Pandas Functions
### Groupby
1. **What is the order of operations in the groupby() function?**
- [ ] Find the group -> Split the data -> Compile the data
- [ ] Apply the function -> Split the data -> Compile the data
- [ ] Clean the data -> Split the data -> Compile the data
- [x] Split the data -> Apply the function -> Compile the data
2. A movie review website employs several different critics. They store these critics’ movie ratings in a DataFrame called movie_ratings, which has three columns: critic, movie, and rating. What command would give the average rating for each movie?
- [ ] movie_ratings('movie').groupby['rating'].mean()
- [ ] movie_ratings['rating']groupby('movie').mean()
- [x] movie_ratings.groupby('movie')['rating'].mean()
- [ ] movie_ratings.groupby('movie', 'critic')['rating'].mean()
3. **The City Library has several branches throughout the area. They collect all of their book checkout data in a DataFrame called checkouts. The DataFrame contains the columns ‘location’, ‘date’, and ‘book_title’. If we want to compare the total number of books checked out at each branch, what code could we use?**
- [ ] checkouts.groupby('book_title')['location'].count()
- [ ] checkouts.groupby('location', 'book_title').count()
- [ ] checkouts.groupby('location').count('book_title')
- [x] checkouts.groupby('location')['book_title'].count()
4. A movie review website employs several different critics. They store these critics’ movie ratings in a DataFrame called movie_ratings, which has three columns: critic, movie, and rating. The following code gives the max rating of each critic. What type of object is the output of this code?
```python=
movie_ratings.groupby('critic')['rating'].max().reset_index()
```
- [x] Dataframe
- [ ] Series
- [ ] String
- [ ] Float
5. **What does the .agg() function do?**
- [x] Perform multiple calculations
- [ ] Optimize the calculations
- [ ] Select specific columns
- [ ] Rank the values
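
A minimal sketch of the split, apply, combine pattern, assuming a small `movie_ratings` DataFrame with the three columns named in the questions. The critics, movies, and ratings are invented.

```python
import pandas as pd

# Invented ratings.
movie_ratings = pd.DataFrame({
    "critic": ["ann", "ann", "bob", "bob"],
    "movie": ["Up", "Heat", "Up", "Heat"],
    "rating": [8, 6, 6, 9],
})

# Split by movie, apply mean to 'rating', combine into a Series.
mean_by_movie = movie_ratings.groupby("movie")["rating"].mean()

# reset_index turns the grouped Series back into a DataFrame.
max_by_critic = movie_ratings.groupby("critic")["rating"].max().reset_index()

# agg runs several calculations in one call.
summary = movie_ratings.groupby("movie")["rating"].agg(["min", "max", "mean"])

print(mean_by_movie)
print(max_by_critic)
print(summary)
```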
### Pivot Table
1. **Which of the following commands was used to achieve this pivot?**

- [ ] df.pivot_table(data=df,
index='movie',
columns='critic',
values='rating')
- [x] pd.pivot_table(data=df,
index='critic',
columns='movie',
values='rating')
- [ ] pd.pivot_table(data=df,
index='rating',
columns='movie',
values='critic')
- [ ] df.pivot_table(data=df,
index='critic',
columns='movie',
values='rating')
2. Look at the dataframe named ‘final’, given below, and answer the question that follows:
Suppose you want to find out the sum of all the points made by the teams in the two different leagues. Which command will achieve this?

- [x] final.pivot_table(values = 'Points', index = 'League', aggfunc = 'sum')
- [ ] final.pivot_table(values = 'League', index = 'Points', aggfunc = 'sum')
- [ ] final.groupby('League')['Points'].sum()
- [ ] final.groupby('Points')['League'].sum()
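
As a rough sketch of how the pivot calls in this section behave, the example below builds two small hypothetical DataFrames. All values are invented; only the column names follow the questions.

```python
import pandas as pd

# Invented ratings: one row per (critic, movie) pair.
df = pd.DataFrame({
    "critic": ["ann", "ann", "bob", "bob"],
    "movie": ["Up", "Heat", "Up", "Heat"],
    "rating": [8, 6, 6, 9],
})

# Critics become rows, movies become columns, ratings fill the cells.
pivot = pd.pivot_table(data=df, index="critic", columns="movie", values="rating")

# Invented league table.
final = pd.DataFrame({
    "League": ["East", "East", "West"],
    "Points": [80, 76, 90],
})

# aggfunc='sum' adds up Points within each League.
points_by_league = final.pivot_table(values="Points", index="League", aggfunc="sum")

print(pivot)
print(points_by_league)
```

The default `aggfunc` is the mean, so cells in `pivot` are averages when several rows share the same critic and movie.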