<!-- <link rel="stylesheet" href="css/theme/default.css" id="theme"> -->
<style type="text/css">
/* @import url('https://fonts.googleapis.com/css2?family=Lato'); */
body {
-webkit-animation-delay: 0.1s;
-webkit-animation-name: fontfix;
-webkit-animation-duration: 0.1s;
-webkit-animation-iteration-count: 1;
-webkit-animation-timing-function: linear;
}
@-webkit-keyframes fontfix {
from { opacity: 1; }
to { opacity: 1; }
}
html, .reveal {
background-color: #000000;
/* font-family: 'Lato'; */
color: #CFC7BF;
}
.reveal .slide-background-content {
background-color: #000;
}
.reveal .slides section h1,
.reveal .slides section h2,
.reveal .slides section h3,
.reveal .slides section h4,
.reveal .slides section h5,
.reveal .slides section h6{
/* font-family: "Lato"; */
color: #ffefef;
text-shadow:
-3px -3px 0 rgba(212, 27, 44, 0.5),
3px -3px 0 rgba(212, 27, 44, 0.5),
-3px 3px 0 rgba(212, 27, 44, 0.5),
3px 3px 0 rgba(212, 27, 44, 0.5);
}
</style>
---
<section data-background-image="https://i.imgur.com/1TQ37c4.png" data-background-size="contain" data-background-opacity="0.2">
<h2 style="text-align: center;line-height: 700px; -webkit-text-stroke: 2px black;" class="fragment fade-left"> Northeastern University </h2>
</section>
---
<h3 style="color:white">Data Analytics</h3>
<h3 style="color:white" class="fragment fade-left"><i>Career Accelerator</i></h3>
---
# Introduction to Data
---
## Objectives
1. Understand the types of data work
2. Understand data workflow
3. Be able to do some:
- Data cleaning
- Data exploration
- Data visualization
---
## Data jobs
Data work is generally grouped into:
1. Data Engineering
2. Data Analytics
3. Data Science
---
## Data engineers
Data engineers build information storage systems and custom data pipelines. They design databases to collect and store data while also making it easy to process.
---
## Data engineers' toolbox
Data engineers are proficient in database languages and frameworks, e.g., the various dialects of **SQL**. They use programming languages like **Java**, **Scala**, or **Python** to process data and **shell** scripting on the command line to automate workflows.
---
## Data analysts
Data analysts describe and explore data. They focus on data preparation, visualization, and summary reports.
---
## Data analysts' toolbox
Data analysts use **SQL** to retrieve and aggregate data from existing databases. They may use **spreadsheets** to perform simple analyses on smaller datasets. Analysts also use **Business Intelligence (BI)** tools, such as **Tableau** or **Power BI**, to create dashboards. **Python** and **R** are increasingly common tools for cleaning and analyzing data.
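
As a rough illustration of "retrieve and aggregate", the sketch below runs a SQL `GROUP BY` query from Python using the standard-library `sqlite3` module. The `orders` table, its columns, and its values are made up for this example; a real project would connect to an existing database instead of an in-memory one.

```python
import sqlite3

# Hypothetical in-memory database standing in for an existing one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [
        ("East", 120.0),
        ("West", 80.0),
        ("East", 30.0),
        ("West", 20.0),
        ("North", 50.0),
    ],
)

# Retrieve and aggregate: total and average order amount per region,
# largest total first.
query = """
    SELECT region, SUM(amount) AS total, AVG(amount) AS average
    FROM orders
    GROUP BY region
    ORDER BY total DESC
"""
totals_by_region = conn.execute(query).fetchall()
conn.close()

for region, total, average in totals_by_region:
    print(f"{region}: total={total:.2f}, average={average:.2f}")
```

The same query text would work largely unchanged against most SQL databases; only the connection step differs.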
---
## Data scientists
Data scientists use machine learning and deep learning tools for classification, prediction, and forecasting. They do advanced data exploration, visualization, experimentation, and prediction.
---
## Data scientists' toolbox
Data scientists must be proficient in **SQL** and **Python** or **R**. They use popular data science libraries, such as **pandas** (Python) or **tidyverse** (R). Traditional machine learning relies on libraries such as **scikit-learn**. Deep learning experts might use **TensorFlow**, **Keras**, or **PyTorch** to build and train neural networks.
---
## Data workflow
To sum up, many data projects have four stages:
1. Data collection
2. Data preparation
3. Data exploration
4. Experimentation and prediction
---
## Data collection
may involve conducting surveys, gathering data from social media (Big Data), or acquiring transactional data.
---
## Data preparation
often involves "cleaning data": removing duplicate values, finding missing ones, and converting pieces of data into a consistent, easy-to-process format.
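
A minimal sketch of those cleaning steps in plain Python, assuming a small list of hypothetical records (the field names `id`, `city`, and `sales` are invented for illustration). In practice a library such as pandas is typically used for this, but the logic is the same.

```python
# Hypothetical raw records: one exact duplicate, one missing value,
# and inconsistent formatting in the "city" field.
raw_rows = [
    {"id": 1, "city": " boston ", "sales": "100"},
    {"id": 2, "city": "BOSTON", "sales": ""},
    {"id": 1, "city": " boston ", "sales": "100"},
    {"id": 3, "city": "Portland", "sales": "250"},
]


def clean(rows):
    """Return (cleaned_rows, ids_with_missing_sales)."""
    seen = set()
    cleaned = []
    missing = []
    for row in rows:
        # Remove exact duplicates by remembering rows already seen.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)

        # Convert to a consistent format: trimmed, title-cased city
        # names and numeric sales (None when the value is missing).
        city = row["city"].strip().title()
        sales = float(row["sales"]) if row["sales"] else None

        # Find missing values so they can be reviewed later.
        if sales is None:
            missing.append(row["id"])

        cleaned.append({"id": row["id"], "city": city, "sales": sales})
    return cleaned, missing


cleaned_rows, missing_ids = clean(raw_rows)
print(cleaned_rows)
print("Rows with missing sales:", missing_ids)
```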
---
## Data exploration
may involve visualization and obtaining descriptive statistics such as the mean and median.
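
For instance, the mean and median can be computed with Python's standard-library `statistics` module. The `daily_sales` figures below are hypothetical and include one unusually large value to show how the two statistics can differ.

```python
import statistics

# Hypothetical daily sales figures; the last value is an outlier.
daily_sales = [120, 95, 130, 110, 400]

mean_sales = statistics.mean(daily_sales)      # sum divided by count
median_sales = statistics.median(daily_sales)  # middle value when sorted

print("Mean:", mean_sales)
print("Median:", median_sales)
```

Here the outlier pulls the mean well above the median, which is one reason to look at both during exploration.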
---
## Experimentation and prediction
means running experiments and making predictions from the data, e.g., forecasting sales.
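
As a toy illustration of forecasting, the sketch below fits a straight-line trend to hypothetical monthly sales using ordinary least squares and extends it one month ahead. Both the data and the choice of a linear trend are assumptions made for this example; real forecasting usually calls for more careful models and validation.

```python
from statistics import mean

# Hypothetical monthly sales with a steady upward trend.
months = [1, 2, 3, 4, 5]
sales = [100.0, 110.0, 120.0, 130.0, 140.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = fit_line(months, sales)

# Extend the fitted trend to the next, unseen month.
forecast_month_6 = slope * 6 + intercept
print(f"Forecast for month 6: {forecast_month_6:.1f}")
```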
---
Data analysis focuses on the two middle stages: data preparation and data exploration. That is not to say that our course is limited to those two. We will look into database design and predictive data science too. However, first things first.