# STAT1013: Practical Assignment Part 2: Exploring, Analyzing and A/B Testing Data
###### Marks: 65 Points
Part 2 of your project consists of five main parts, each explained below. Your final write-up must cover the following aspects, and your grade will depend on the inclusion of these parts:
> If you are not satisfied with your data from Part A, you may switch to a different dataset for Part B; your grade for Part A will not change.
## **Introduction (10 points)**
In the introduction, outline how your idea originated, the hypotheses you formulated for comparing the two samples, and the rationale behind them. Also explain the data collection process. This section of the practical assignment will be evaluated on the following criteria:
1. Articulate the inspiration behind your idea **(2 points)**.
2. Present your hypotheses using descriptive language (1 point) and appropriate statistical symbols (1 point), encompassing both null and alternative hypotheses **(2 points)**.
3. Justify your hypotheses by explaining why you anticipate differences between the samples **(3 points)**.
4. Describe the methodology employed for data collection **(2 points)**.
> In short, restate and expand on the data description from your Part A.
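
As an illustration of criterion 2, a two-sided pair of hypotheses could be written in symbols as follows. The symbols are assumptions chosen for this sketch: $\mu_1$ and $\mu_2$ stand for the population means of two independent samples, and $\mu_d$ for the population mean of the paired differences. Use whichever pair matches your design, and switch to a one-sided alternative if your rationale calls for one.

$$
H_0: \mu_1 - \mu_2 = 0 \qquad \text{versus} \qquad H_a: \mu_1 - \mu_2 \neq 0
$$

$$
H_0: \mu_d = 0 \qquad \text{versus} \qquad H_a: \mu_d \neq 0
$$
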
## **Verifying Necessary Data Conditions (5 points)**
In this section, identify the data conditions that must hold for a two-sample t-test (difference of means) or a paired t-test (mean of differences), and verify whether each one is met. For paired samples, compute and plot the differences variable, because one of the conditions is checked on it. For full credit, address every necessary condition, state whether any of them is violated, and include output from your chosen software package to support the verification. Violated conditions are acceptable for this project, but you must explain why they are not met and discuss how the violation may affect the conclusions drawn from the analysis.
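
One possible way to produce supporting output in Python is sketched below. It assumes NumPy and SciPy are available; the arrays `group_a` and `group_b` are made-up numbers used only for illustration, and the choice of the Shapiro-Wilk and Levene tests is an assumption of this sketch, not a requirement of the assignment. Only the checks that match your design (independent or paired) apply to your data.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements (made-up values, for illustration only).
group_a = np.array([12.1, 11.4, 13.0, 12.7, 11.9, 12.3, 13.4, 12.0])
group_b = np.array([11.6, 11.5, 12.2, 12.1, 11.8, 11.7, 12.9, 11.4])

# Independent samples: check approximate normality within each group.
# A small p-value suggests a departure from normality.
shapiro_a = stats.shapiro(group_a)
shapiro_b = stats.shapiro(group_b)

# Independent samples: Levene's test compares the two group variances.
levene = stats.levene(group_a, group_b)

# Paired samples: the normality condition applies to the differences,
# so compute the differences variable first and test that instead.
differences = group_a - group_b
shapiro_diff = stats.shapiro(differences)

print(f"Shapiro-Wilk, group A:     p = {shapiro_a.pvalue:.3f}")
print(f"Shapiro-Wilk, group B:     p = {shapiro_b.pvalue:.3f}")
print(f"Levene, equal variances:   p = {levene.pvalue:.3f}")
print(f"Shapiro-Wilk, differences: p = {shapiro_diff.pvalue:.3f}")
```

Formal tests have little power with small samples, so a histogram or Q-Q plot of each group (or of `differences` for paired data) is a useful complement to these numbers. Independence of observations cannot be tested this way and has to be argued from how the data were collected.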
## **Conducting a Hypothesis Test (21 points)**
Perform the appropriate hypothesis test, such as a two-sample t-test or a paired t-test, and analyze its outcome. Your work will be evaluated according to the following criteria:
1. Conduct the suitable hypothesis test based on the nature of your data. **(4 points)**
2. Include the Python command and the output of the hypothesis test. **(7 points)**
3. Interpret the results of your hypothesis test. What is the p-value? Explain the meaning of the p-value in the context of your data, rather than just stating whether to reject or fail to reject the null hypothesis. **(4 points)**
4. Decide whether to reject or fail to reject the null hypothesis based on the test results. Justify your decision. Are the results statistically significant? Are they practically significant, indicating meaningful differences? Clarify. **(4 points)**
5. Discuss which type of error (Type I or Type II) could have occurred, given the result of your hypothesis test. Explain your reasoning. **(2 points)**
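
The criteria above could be approached along the following lines in Python. This is a minimal sketch assuming SciPy is available; `group_a` and `group_b` are made-up numbers, the significance level `alpha = 0.05` is an assumed convention, and Cohen's d is only one possible way to gauge practical significance. Run the test that matches your design, not both.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements (made-up values, for illustration only).
group_a = np.array([12.1, 11.4, 13.0, 12.7, 11.9, 12.3, 13.4, 12.0])
group_b = np.array([11.6, 11.5, 12.2, 12.1, 11.8, 11.7, 12.9, 11.4])
alpha = 0.05  # assumed significance level

# Two independent samples: Welch's t-test (equal variances not assumed).
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

# Matched pairs: paired t-test on the same observations.
# Valid only if element i of group_a is paired with element i of group_b.
paired = stats.ttest_rel(group_a, group_b)

# Effect size (Cohen's d with pooled standard deviation) as a rough
# indicator of practical significance for the independent-samples case.
n_a, n_b = len(group_a), len(group_b)
pooled_sd = np.sqrt(
    ((n_a - 1) * group_a.var(ddof=1) + (n_b - 1) * group_b.var(ddof=1))
    / (n_a + n_b - 2)
)
cohens_d = (group_a.mean() - group_b.mean()) / pooled_sd

print(f"Welch t-test:  t = {welch.statistic:.3f}, p = {welch.pvalue:.4f}")
print(f"Paired t-test: t = {paired.statistic:.3f}, p = {paired.pvalue:.4f}")
print(f"Cohen's d: {cohens_d:.3f}")
print("Reject H0 (Welch)" if welch.pvalue < alpha else "Fail to reject H0 (Welch)")
```

Both `ttest_ind` and `ttest_rel` default to a two-sided alternative; pass `alternative="less"` or `alternative="greater"` if your hypotheses are one-sided. If you reject the null hypothesis, the only error you could have made is a Type I error; if you fail to reject it, the only possible error is a Type II error.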
<!-- 1. A confidence interval is automatically generated when you conduct a t-test (or can be easily requested using “two.sided” option.). Please indicate what this interval is and how it should be interpreted. What exactly does the interval tell us? Does it give us additional information beyond what we get when we conduct a t-test? **(4 points)** -->
## **Conclusion and Summary (13 points)**
This section requires a comprehensive summary of your project: how the idea was developed, how the data were collected, and the key findings from data exploration and analysis. Reflect on any limitations in the data gathering process, identify surprising discoveries made during the analysis, and consider how a larger sample size might have influenced the results. Finally, outline how you would approach the project differently if given another opportunity. This component will be evaluated based on the following criteria:
1. Succinctly summarize the project, detailing idea generation, data collection processes, and findings from analysis. **(5 points)**
2. Critically evaluate the limitations of the data collection methods employed and provide reasoning for these shortcomings. **(5 points)**
3. Describe potential modifications to the project if conducted again. **(3 points)**
## Submission
Please submit your **write-up Jupyter notebook** on **BlackBoard**.