---
title: "Jam 04 - Exercise 4"
tags:
- 3 🧪 in testing
- 4 🥳 done
- jam04
- arrays
- statistics
- enums
---
<!-- markdownlint-disable line-length single-h1 no-inline-html -->
<!-- markdownlint-configure-file { "ul-indent": { "indent": 4 }, "link-fragments": {"ignore_case": true} } -->
{%hackmd dJZ5TulxSDKme-3fSY4Lbw %}
# Exercise 4 - Statistical Analysis with Skewness
## Overview - Exercise 4
In this exercise, you'll explore statistical concepts through programming, focusing on analyzing data distributions using measures like mean, median, and skewness. This practical application combines statistical analysis with Java programming, demonstrating how software can help us understand data patterns.
:::info
**Key Concepts**
- Statistical measures (mean, median) for data analysis
- Data distribution shapes and skewness
- Working with enumerated types in Java
- Input validation and error handling
- Using provided methods effectively
:::
## Understanding Data Distributions
Suppose you had a large collection of numbers, and you wanted to understand a bit about the *distribution* of your data. Statisticians will teach you to identify the *center*, the *shape*, and the *spread* of your data. To identify the center of your data, you usually compute the **mean** (or average) of the numbers in your data. This is a good estimate of the central tendency of your data.
However, shape can be a bit more challenging. One approach is to compare your mean to the **median** of your data. Recall that the median is the number in the middle of the sorted list, i.e. the number that statisticians identify as the 50th percentile. It separates the upper half of the data from the lower half. The median is especially interesting when you compare it to the mean: together, these two numbers tell us a bit about the shape of the distribution with respect to **skew**.
Let's look at an example:

[Figure from statisticshowto.com/pearson-mode-skewness/](https://www.statisticshowto.com/pearson-mode-skewness/)
A mean that is significantly smaller than the median tells us that there are some small numbers that are pulling our mean down, and thus we say that the data are **skewed left** (a.k.a. negative skew). In contrast, when the mean is significantly larger than the median, we say that the data are **skewed right** (a.k.a. positive skew) because some large numbers are pulling the mean in a positive direction.
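To make the comparison concrete, here is a minimal sketch that computes both measures for a small, hand-made array. The class name `MeanMedianDemo` and its method names are illustrative assumptions and are not part of the starter code; treat this as one possible approach rather than the expected solution.

```java
import java.util.Arrays;

/** Illustrative sketch only; the class and method names are hypothetical. */
class MeanMedianDemo {

    /** Returns the arithmetic mean of a non-empty array. */
    public static double mean(double[] data) {
        double sum = 0.0;
        for (double value : data) {
            sum += value;
        }
        return sum / data.length;
    }

    /** Returns the median of a non-empty array, leaving the caller's array untouched. */
    public static double median(double[] data) {
        double[] sorted = Arrays.copyOf(data, data.length);
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 == 0) {
            // Even length: average the two middle values
            return (sorted[mid - 1] + sorted[mid]) / 2.0;
        }
        return sorted[mid];
    }

    public static void main(String[] args) {
        // One unusually small value (1) drags the mean below the median
        double[] sample = {1, 8, 9, 10, 10};
        System.out.println("mean:   " + mean(sample));   // mean is 7.6
        System.out.println("median: " + median(sample)); // median is 9.0
    }
}
```

In this sample the mean (7.6) sits well below the median (9.0), which is the left-skew pattern described above.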
## The Problem - Exercise 4
**Problem**: Using these statistical concepts, you'll create a program that generates and analyzes data distributions to determine their skewness.
First, let's get the starter code:
```bash
# If working on your laptop (replace userid with your Bucknell username):
scp "userid@linuxremote.bucknell.edu:/home/csci205/2025-spring/student/jam04/Skewness.java" "src/main/java/jam04/"
# If working on linuxremote:
cp "/home/csci205/2025-spring/student/jam04/Skewness.java" "src/main/java/jam04/"
```
:::warning
🚨 **Important**:
- Make sure you're in your project root directory when running these commands
- The file must be placed in the correct package directory (`src/main/java/jam04/`)
- Don't forget to add your banner at the top of the file
- Verify the file was copied correctly before proceeding
:::
The `Skewness.java` file contains:
- An enumerated type `SkewType` for different skew types (enumerated types, or enums, are explained in detail below)
- A method `generateSkewedData` for creating test data
- Constants and utility methods for data generation
## Required Steps - Exercise 4
1. Create a program that:
    - Asks for quantity of numbers to generate
    - Lets user choose skew type (left, right, or none)
    - Generates data using provided method
    - Calculates and compares mean and median
    - Determines skewness based on the comparison
    - Reports results to user
2. Implementation requirements:
    - Use the provided `generateSkewedData` method
    - Calculate mean and median yourself
    - Consider data skewed if mean differs from median by > 1% of mean
    - Handle invalid input gracefully
    - Follow good method decomposition practices
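Note: Checkstyle will report the error "Top-level class SkewType has to reside in its own source file." This is expected, because the starter file declares the `SkewType` enum in the same source file, `Skewness.java`.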
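As one way to approach the input-handling steps, the sketch below turns raw text into a validated quantity and a skew choice. The menu numbering (0 = none, 1 = left, 2 = right) comes from the example output in this exercise; the class name `MenuChoiceDemo`, the method names, and the use of `Optional` are assumptions made for illustration. The nested enum is only a stand-in so the sketch compiles on its own; in your project you would use the `SkewType` from the starter code.

```java
import java.util.Optional;

/** Illustrative sketch only; class and method names are hypothetical. */
class MenuChoiceDemo {

    /** Stand-in copy of the starter code's enum so this sketch is self-contained. */
    enum SkewType { SKEW_LEFT, SKEW_RIGHT, SKEW_NONE }

    /** Parses a strictly positive quantity; returns an empty Optional for bad input. */
    public static Optional<Integer> parseQuantity(String input) {
        if (input == null) {
            return Optional.empty();
        }
        try {
            int quantity = Integer.parseInt(input.trim());
            return quantity > 0 ? Optional.of(quantity) : Optional.empty();
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    /** Maps menu input (0 = none, 1 = left, 2 = right) to a SkewType, if valid. */
    public static Optional<SkewType> parseSkewChoice(String input) {
        if (input == null) {
            return Optional.empty();
        }
        switch (input.trim()) {
            case "0": return Optional.of(SkewType.SKEW_NONE);
            case "1": return Optional.of(SkewType.SKEW_LEFT);
            case "2": return Optional.of(SkewType.SKEW_RIGHT);
            default:  return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseQuantity("100000").isPresent()); // valid quantity
        System.out.println(parseQuantity("-5").isPresent());     // rejected: not positive
        System.out.println(parseSkewChoice("7").isPresent());    // rejected: not 0, 1, or 2
    }
}
```

Returning an empty `Optional` keeps the parsing methods free of printing and prompting, so the caller can decide whether to re-prompt or exit. A loop that keeps asking until the input is valid would be an equally reasonable design.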
:::warning
🚨 **Important Requirements**
1. Input Validation:
    - Quantity must be positive
    - Skew type must be 0, 1, or 2
    - Handle invalid input gracefully
2. Skewness Analysis:
    - Data is skewed if mean differs from median by > 1% of mean
    - Use `Arrays.sort()` for finding median
    - For even-length arrays, median is average of middle two values
3. Code Organization:
    - Break down into small, focused methods
    - Use clear, descriptive names
    - Add proper JavaDoc comments
:::
:::
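The 1% rule can be sketched as a single comparison, shown below under a few assumptions: the class name `SkewRuleDemo`, the constant `SKEW_THRESHOLD`, and the method `classify` are hypothetical, and the nested enum is a stand-in for the starter code's `SkewType`. Taking the absolute value of the mean is also an assumption, added so the threshold stays positive if the mean is negative.

```java
/** Illustrative sketch only; class, constant, and method names are hypothetical. */
class SkewRuleDemo {

    /** Stand-in copy of the starter code's enum so this sketch is self-contained. */
    enum SkewType { SKEW_LEFT, SKEW_RIGHT, SKEW_NONE }

    /** The mean-median gap must exceed this fraction of the mean to count as skew. */
    static final double SKEW_THRESHOLD = 0.01;

    /** Classifies skew by comparing the mean-median gap with 1% of the mean. */
    public static SkewType classify(double mean, double median) {
        double gap = mean - median;
        if (Math.abs(gap) <= SKEW_THRESHOLD * Math.abs(mean)) {
            return SkewType.SKEW_NONE;
        }
        // Mean below median: small values pull it down (left skew); otherwise right skew
        return gap < 0 ? SkewType.SKEW_LEFT : SkewType.SKEW_RIGHT;
    }

    public static void main(String[] args) {
        // Values taken from the example output in this exercise
        System.out.println(classify(9.630, 9.882));   // SKEW_LEFT
        System.out.println(classify(10.008, 10.018)); // SKEW_NONE
    }
}
```

For the first pair the gap is 0.252, which exceeds 1% of the mean (about 0.096), so the data counts as skewed left. For the second pair the gap is 0.010, well under 1% of the mean (about 0.100), so no skew is reported.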
## Understanding Enumerated Types
Before we dive into the solution, let's talk about **enums** (short for enumerated types). They might seem simple at first - just named constants, right? But in Java, they're so much more! Java's enums are actually special classes that give you a whole new type to work with. And those constants you define inside? They're full-fledged objects! Pretty cool, right? While enums can do lots of fancy things with methods and fields, we'll keep it simple for this jam.
:::info
**Key Concepts - Enums**
- They're more than just constants - they're their own type!
- Each enum value is actually an object (surprise!)
- Java makes sure you can't use invalid values
- Way better than using "magic numbers" in your code
:::
Let's look at the enum in your starter code:
```java
enum SkewType {
    SKEW_LEFT,  // Data skewed towards lower values
    SKEW_RIGHT, // Data skewed towards higher values
    SKEW_NONE   // No intentional skew
}
```
See how clean that looks? You've just created a new type called `SkewType` with three ready-to-use objects: `SKEW_LEFT`, `SKEW_RIGHT`, and `SKEW_NONE`. No need to worry about setting them up - Java handles all that for you!
You can see in the `generateSkewedData` method that `skewType` is a required parameter:
```java
public static double[] generateSkewedData(int sizeOfArray, SkewType skewType)
```
You can also see how it's used inside the loop. The snippet below shows that a variable of an enum type can be used just like any other variable, except that it can *only* take on values of the enum itself:
```java
// Let's create some intentional skew in our data if requested
for (int i = 0; i < sizeOfArray / 20; i++) {
    switch (skewType) {
        case SKEW_LEFT:
            // skew data to the left
            break;
        case SKEW_RIGHT:
            // skew data to the right
            break;
    }
}
```
See? They really are simple! An enum is nothing more than a new type in your program. You can create variables of enum types (like we did with the parameter `skewType`). Because Java is a strongly typed language, the huge benefit of enums is that an enum variable can only hold one of the objects defined by that enum. That is very restrictive, and that is a good thing! Why?
- Enums make your code more readable. Named constants beat manually defining multiple `static final int` values
- Enums spare you the burden of assigning unique integer constants like 0, 1, and 2, which we would otherwise need for our three skew types
- Enums keep you from introducing magic numbers and setting your code up for failure!
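Here is a small sketch of those benefits in action. The class name `EnumBasicsDemo` and the `describe` method are hypothetical, and the nested enum is a stand-in copy of the starter code's `SkewType`; `values()` is the built-in method every Java enum provides for listing its constants.

```java
/** Illustrative sketch only; the class and the describe method are hypothetical. */
class EnumBasicsDemo {

    /** Stand-in copy of the starter code's enum so this sketch is self-contained. */
    enum SkewType { SKEW_LEFT, SKEW_RIGHT, SKEW_NONE }

    /** Returns a short, human-readable description of a skew type. */
    public static String describe(SkewType skewType) {
        switch (skewType) {
            case SKEW_LEFT:
                return "mean pulled below the median";
            case SKEW_RIGHT:
                return "mean pulled above the median";
            default:
                return "no intentional skew";
        }
    }

    public static void main(String[] args) {
        SkewType choice = SkewType.SKEW_LEFT;
        // choice = 1;  // would not compile: an int is not a SkewType
        System.out.println("You chose " + choice);

        // values() lists every constant in declaration order
        for (SkewType type : SkewType.values()) {
            System.out.println(type + ": " + describe(type));
        }
    }
}
```

Notice that the names print as readable text (`SKEW_LEFT`, not `0`), and the commented-out assignment shows the kind of mistake the compiler rejects for you.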
We'll do more with enums in a future jam.
:::success
**Why You'll Love Enums**
- Your code becomes super readable - no more guessing what numbers mean!
- Java keeps you safe by preventing invalid values
- No more headaches managing integer constants
- Your IDE can help you with auto-completion
:::
We'll explore more cool features of enums in future jams, but this is all you need to know to rock Exercise 4!
## Example Output - Skewness.java
```text
Welcome to the skewed data simulator
Please enter the quantity of numbers to generate: 100000
Please choose one of the following:
0 - NO skew
1 - LEFT skew
2 - RIGHT skew
1
mean: 9.630
median: 9.882
Observed skew: LEFT
```
```text
Welcome to the skewed data simulator
Please enter the quantity of numbers to generate: 100000
Please choose one of the following:
0 - NO skew
1 - LEFT skew
2 - RIGHT skew
0
mean: 10.008
median: 10.018
Observed skew: NONE
```
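The sample runs show each statistic with three digits after the decimal point. One way to produce that, sketched under the assumption of a hypothetical helper named `formatStat` in a class named `ReportFormatDemo`, is the standard `%.3f` format specifier. Passing `Locale.ROOT` is an extra precaution (not something the exercise requires) so the decimal separator is a period on every machine.

```java
import java.util.Locale;

/** Illustrative sketch only; the class and method names are hypothetical. */
class ReportFormatDemo {

    /** Formats a labeled statistic with exactly three digits after the decimal point. */
    public static String formatStat(String label, double value) {
        return String.format(Locale.ROOT, "%s: %.3f", label, value);
    }

    public static void main(String[] args) {
        System.out.println(formatStat("mean", 9.63));   // mean: 9.630
        System.out.println(formatStat("median", 10.0)); // median: 10.000
    }
}
```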
:::warning
🚨 **A Word About Method Decomposition**
We've noticed some of you are trying to cram everything into your `main` method - resist that temptation! Not only does it make your code harder to grade, but it's also a habit that could hurt your career. Here's why:
- Huge, monolithic methods are bug magnets
- They're nearly impossible to test properly
- They make code reviews a nightmare
- In the real world, this could lead to poor performance reviews or worse!
Remember, your grade depends on:
- Breaking down your code into small, focused methods
- Using clear, descriptive names for everything
- Avoiding magic numbers (that's where enums really shine!)
- Adding proper comments for methods and classes
- Including your banner at the top of each file
Think of methods like LEGO blocks - each one should do one thing well, and you combine them to build something awesome. Your code will be easier to write, easier to debug, and way easier to maintain. Plus, you'll make your grader happy (and trust us, that's a good thing!).
:::
> **Checkpoint**
>
> Before proceeding, verify:
>
> - Your program handles all input validation
> - Mean and median calculations are correct
> - Skewness detection follows the 1% rule
> - Output matches the expected format
> - Code is well-documented with JavaDoc
## Save Your Work - Exercise 4
Check which files have uncommitted changes:
```bash
git status
```
Stage your changes:
```bash
git add src/main/java/jam04/Skewness.java
```
Commit your work:
```bash
git commit -m "jam04: Implement skewness analysis"
```
:::success
**Key Takeaways**
- Statistical measures help understand data distributions
- Enums provide type-safe constants in Java
- Breaking down problems into methods improves code quality
- Input validation is crucial for robust programs
- Version control helps track your progress
:::