---
tags: BigData
title: Stage IV - Presentation
---
# Stage IV - Presentation & Delivery
**Course:** Big Data - IU S23
**Author:** Firas Jolha
# Dataset
- [Some emps and depts](http://www.cems.uwe.ac.uk/~pchatter/resources/html/emp_dept_data+schema.html)
# Agenda
[toc]
# Prerequisites
- Hortonworks Data Platform (HDP) is installed
- Python 2.7 is installed
- Pip 20.3.4 is installed
# Objectives
- Build a web dashboard in Streamlit
# Install Streamlit
You can install `streamlit` (this tutorial uses version `0.55.2`) via `pip` by running the command:
```sh!
pip install streamlit --ignore-installed
```
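The option `--ignore-installed` tells `pip` to install the packages even if they are already present, instead of trying to uninstall the existing copies first, which avoids issues during installation.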
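To confirm which version was actually installed, a small check like the one below may help. It is only a sketch: the helper name `installed_version` is hypothetical, and it assumes the package exposes a `__version__` attribute.

```python
import importlib


def installed_version(package_name):
    """Return the package's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(package_name)
    except ImportError:
        return None
    # Not every package defines __version__, so fall back to "unknown".
    return getattr(module, "__version__", "unknown")
```

Calling `installed_version("streamlit")` should then return the installed version string, or `None` if the installation failed.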
:::warning
**Note:** If `pip` fails with a network error while installing a package in Python 2.7 (for example, it cannot resolve the hostname of the package index), add the line `nameserver 8.8.8.8` to the file `/etc/resolv.conf`.
:::
# Build Streamlit app
For the project, your dashboard must display at least the data characteristics and the results of the exploratory data analysis (EDA) and the predictive data analytics (PDA). Beyond that, try to build an attractive dashboard for your project. We can import the package `streamlit` as follows:
```python!
import streamlit as st
```
The analysis results are stored as CSV files, which we can read as Pandas or Spark DataFrames. Here we read them with Pandas as follows:
```python!
import pandas as pd
emps = pd.read_csv("data/emps.csv")
depts = pd.read_csv("data/depts.csv")
q1 = pd.read_csv("output/q1.csv")
q2 = pd.read_csv("output/q2.csv")
q3 = pd.read_csv("output/q3.csv")
q4 = pd.read_csv("output/q4.csv")
q5 = pd.read_csv("output/q5.csv")
q6 = pd.read_csv("output/q6.csv")
```
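The repeated `read_csv` calls above could also be written as a loop. The sketch below assumes the files follow the `output/q<N>.csv` naming used above; the helper names `result_paths` and `load_results` are hypothetical.

```python
def result_paths(count=6, folder="output"):
    """Return (name, path) pairs such as ("q1", "output/q1.csv")."""
    return [("q%d" % i, "%s/q%d.csv" % (folder, i)) for i in range(1, count + 1)]


def load_results(count=6, folder="output"):
    """Read each result file into a dict of DataFrames keyed by query name."""
    import pandas as pd  # imported lazily so result_paths() has no dependencies

    return dict((name, pd.read_csv(path)) for name, path in result_paths(count, folder))
```

With this, `results = load_results()` gives access to each table as `results["q1"]`, `results["q2"]`, and so on.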
## `st.write`
`st.write` displays information in your Streamlit app. It does different things depending on what you pass to it. Unlike other Streamlit commands, `st.write` has some unique properties:
- You can pass in multiple arguments, all of which will be written.
- Its behavior depends on the input types, as the examples below show.
- It returns None, so its "slot" in the App cannot be reused.
We can print some text on the dashboard.
```python!
st.write("# Big Data Project \n _Employee Salary_$^{Prediction}$ :sunglasses: \n", "*Year*: **2023**")
```

<!-- st.write("**Data Characteristics** \n`emps` data \n features: ", emps.shape[0]," \n instances
:", emps.shape[1], " \n `depts` data: \n features: ", depts.shape[0], " \n instances:", depts.shape[1]) -->
:::info
- Using this function, you can print formatted Markdown strings and emoji shortcodes.
- You can also display a dataframe, a Matplotlib figure, an Altair chart, etc.
- To force a line break inside Markdown text, put two spaces before `\n`.
:::
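The two-spaces rule is easy to forget when building longer strings. As a minimal sketch, a helper (the name `markdown_lines` is hypothetical) can add the hard line breaks for you:

```python
def markdown_lines(*lines):
    """Join lines with Markdown hard line breaks (two spaces before the newline)."""
    return "  \n".join(lines)


text = markdown_lines("**Data Characteristics**", "`emps` table", "`depts` table")
print(repr(text))
```

The resulting string can be passed directly to `st.write(text)` or `st.markdown(text)`.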
We can display a dataframe as follows:
```python!
# Display the descriptive information of the dataframe
emps_description = emps.describe()
st.write(emps_description)
```

We can display Altair charts as follows:
```python!
import altair as alt
c = alt.Chart(emps).mark_circle().encode(
    x='ename', y='deptno', size='sal', color='sal', tooltip=['ename', 'deptno', 'sal'])
st.write(c)
```

## Text elements
Streamlit provides specific functions for different text elements but `st.write` can be used to perform similar jobs.
#### `st.markdown`
Display string formatted as Markdown.
```python!
st.markdown("We can add equations such as $sin^2(x)+cos^2(x) = 1$")
```
:::info
`st.divider` is not supported in Streamlit v0.55.2 but we can use `st.markdown("---")` for adding dividers.
:::
:::info
The function `st.markdown(body, unsafe_allow_html = False)` has an argument `unsafe_allow_html` which can be used to add html tags to the dashboard. By default, any HTML tags found in the body will be escaped and therefore treated as pure text. This behavior may be turned off by setting this argument to `True`.
That said, the package authors strongly advise against it. It is hard to write secure HTML, so by using this argument you may be compromising your users' security. Only for this project, it is fine to use it.
:::
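If you do enable `unsafe_allow_html`, one way to reduce the risk is to escape any text that does not come from you before placing it inside your HTML. The sketch below is an illustration, not an official Streamlit recipe; the helper name `centered_caption` is hypothetical.

```python
try:
    from html import escape  # Python 3
except ImportError:
    from cgi import escape  # Python 2.7


def centered_caption(text):
    """Wrap text in a centered grey paragraph, escaping any HTML it contains."""
    safe_text = escape(text)
    return "<p style='text-align: center; color: grey;'>{}</p>".format(safe_text)
```

It could be used as `st.markdown(centered_caption("Employees and Departments"), unsafe_allow_html=True)`.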
#### `st.title`
Display text in title formatting. Each document should have a single `st.title()`, although this is not enforced.
```python!
# st.title takes a single text argument, so the text is passed as one string
st.title("# Big Data Project \n _Employee Salary_$^{Prediction}$ :sunglasses: \n *Year*: **2023**")
```

:::warning
Note that we cannot use Markdown formatting in the title text. This function also does not change the title of the dashboard page itself (the one shown in the browser tab).
:::
#### `st.header`
Display text in header formatting.
```python!
st.header("Data Characteristics")
```
#### `st.subheader`
Display text in subheader formatting.
```python!
st.subheader("Emps table")
```

<!-- #### `st.caption`
Display text in small font. This should be used for captions, asides, footnotes, sidenotes, and other explanatory text.
```python!
st.write(emps.describe())
st.caption("Descriptive information for Emps table")
``` -->
#### `st.code`
Display a code block with optional syntax highlighting.
```python!
st.code("SELECT * FROM employees WHERE deptno = 10;", language = 'sql')
```

#### `st.text`
Write fixed-width and preformatted text.
```python!
st.text("This is a text!")
```
#### `st.latex`
Display mathematical expressions formatted as LaTeX. Supported LaTeX functions are listed at [Katex.org](https://katex.org/docs/supported.html).
```python!
st.latex("sin^2(x)+cos^2(x)=1")
```

## Data display elements
When you're working with data, it is extremely valuable to visualize that data quickly, interactively, and from multiple different angles. That's what Streamlit is actually built and optimized for.
There are two main functions for displaying the dataframes. `st.dataframe` displays the dataframe as an interactive table whereas `st.table` displays the dataframe as a static table.
```python!
st.dataframe(q1)
st.table(q1)
```
<!-- :::info
You can add styles to the dataframes via Pandas styling. For more info, follow [this link](https://pandas.pydata.org/docs/user_guide/style.html).
::: -->
## Chart elements
It is recommended to build charts with the Matplotlib or Altair packages and then display them via `st.pyplot` or `st.altair_chart` respectively, since Streamlit's built-in chart functions provide only limited settings. You can also add CSS styles to your dashboard as follows:
```python!
st.markdown("<style>{}</style>".format(<YOUR_STYLE>), unsafe_allow_html = True)
```
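One way to fill in the style placeholder is to generate the CSS text from Python data. This is a minimal sketch; the helper name `build_style` is hypothetical, and the selectors are simply the ones used elsewhere in this tutorial.

```python
def build_style(rules):
    """Turn [(selector, {property: value}), ...] into a <style> element string."""
    blocks = []
    for selector, properties in rules:
        # Sort the properties so the output is the same on every run.
        declarations = "; ".join(
            "{}: {}".format(name, value) for name, value in sorted(properties.items())
        )
        blocks.append("{} {{ {} }}".format(selector, declarations))
    return "<style>{}</style>".format(" ".join(blocks))


style = build_style([
    ("body", {"background-color": "#eee"}),
    (".fullScreenFrame > div", {"display": "flex", "justify-content": "center"}),
])
```

The string in `style` can then be passed to `st.markdown(style, unsafe_allow_html=True)`.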
## Media elements
You can add images to the dashboard via `st.image` function.
```python!
# To center the image
st.markdown("""<style>body {
background-color: #eee;
}
.fullScreenFrame > div {
display: flex;
justify-content: center;
}
</style>""", unsafe_allow_html=True)
# set the image and the caption
st.image("https://i2.wp.com/hr-gazette.com/wp-content/uploads/2018/10/bigstock-Recruitment-Concept-Idea-Of-C-250362193.jpg", caption = "Employees and Departments", width=400)
```
## Status elements
Streamlit provides a few methods that allow you to add animation to your apps. These animations include progress bars, status messages (like warnings), and celebratory balloons.
```python!
import time
with st.spinner('Wait for it...'):
    time.sleep(5)
st.balloons()
st.success('Done!')
st.error('This is an error')
st.warning('This is a warning')
st.info('This is a purely informational message')
progress_text = "Operation in progress. Please wait."
st.text(progress_text)
my_bar = st.progress(0)
for percent_complete in range(100):
    time.sleep(0.1)
    my_bar.progress(percent_complete + 1)
st.success("Done!")
```
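In a real dashboard the progress bar usually tracks actual work rather than a fixed loop of 100 steps. The sketch below shows one possible way to do that; the function `run_with_progress` and its `report` callback are hypothetical names, not part of Streamlit.

```python
def run_with_progress(items, work, report):
    """Apply work() to each item and report the completed percentage (0-100)."""
    total = len(items)
    results = []
    for index, item in enumerate(items, 1):
        results.append(work(item))
        # Integer percentage of items processed so far.
        report(100 * index // total)
    return results
```

Assuming `my_bar = st.progress(0)` as above, you could call `run_with_progress(my_items, my_function, my_bar.progress)`, where `my_items` and `my_function` stand for your own data and processing step.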
## Input widgets
With widgets, Streamlit allows you to bake interactivity directly into your apps with buttons, sliders, text inputs, and more.
### `st.button`
Display a button widget.
```python!
def clicked():
    st.write('Hello there!')
def unclicked():
    st.write('Goodbye')
if st.button('Say hello'):
    clicked()
else:
    unclicked()
```
### `st.checkbox`
Display a checkbox widget.
```python!
def clicked():
    st.write('Great!')
def unclicked():
    st.write('It is fine!')
if st.checkbox('Do you agree?'):
    clicked()
else:
    unclicked()
```
### `st.radio` and `st.selectbox`
`st.radio` displays a radio button widget. `st.selectbox` displays a select (drop-down) widget.
```python!
genre = st.radio(
    "What's your favorite movie genre",
    ('Comedy', 'Drama', 'Documentary'))
if genre == 'Comedy':
    st.write('You selected comedy.')
else:
    st.write("You didn't select comedy.")
option = st.selectbox(
    'How would you like to be contacted?',
    ('Email', 'Home phone', 'Mobile phone'))
st.write('You selected:', option)
```
### `st.text_input` and `st.number_input`
`st.text_input` displays a single-line text input widget. `st.number_input` displays a numeric input widget.
```python!
number = st.number_input('Insert a number')
st.write('The current number is ', number)
title = st.text_input('Movie title', 'Life of Brian')
st.write('The current movie title is', title)
```
### `st.date_input` and `st.time_input`
`st.date_input` displays a date input widget. `st.time_input` displays a time input widget.
```python!
import datetime
d = st.date_input(
    "When's your birthday",
    datetime.date(2019, 7, 6))
st.write('Your birthday is:', d)
t = st.time_input('Set an alarm for', datetime.time(8, 45))
st.write('Alarm is set for', t)
```
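The value returned by `st.date_input` is a regular `datetime.date`, so it can be used in ordinary date arithmetic. As a small sketch (the helper name `age_on` is hypothetical), the birthday from the example above could be turned into an age:

```python
import datetime


def age_on(birthday, today):
    """Return the age in whole years on the date `today`."""
    years = today.year - birthday.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (birthday.month, birthday.day):
        years -= 1
    return years
```

For example, `st.write('Your age is:', age_on(d, datetime.date.today()))` would display the age next to the birthday.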
# Dashboard Example
```python!
st.markdown('---')
st.title("Big Data Project **2023**")
st.markdown("""<style>body {
background-color: #eee;
}
.fullScreenFrame > div {
display: flex;
justify-content: center;
}
</style>""", unsafe_allow_html=True)
st.image("https://i2.wp.com/hr-gazette.com/wp-content/uploads/2018/10/bigstock-Recruitment-Concept-Idea-Of-C-250362193.jpg", caption = "Employees and Departments", width=400)
#st.markdown("<p style='text-align: center; color: grey;'>Employees and Departments</p>", unsafe_allow_html=True)
st.markdown('---')
st.header('Descriptive Data Analysis')
st.subheader('Data Characteristics')
# DataFrame.shape is (number of rows, number of columns), i.e. (instances, features)
emps_dda = pd.DataFrame(data = [["Employees", emps.shape[0]-1, emps.shape[1]], ["Departments", depts.shape[0], depts.shape[1]]], columns = ["Tables", "Instances", "Features"])
st.write(emps_dda)
st.markdown('`emps` table')
st.write(emps.describe())
st.markdown('`depts` table')
st.write(depts.describe())
st.subheader('Some samples from the data')
st.markdown('`emps` table')
st.write(emps.head(5))
st.markdown("`depts` table")
st.write(depts.head(5))
st.markdown('---')
st.header("Exploratory Data Analysis")
st.subheader('Q1')
st.text('The distribution of employees in departments')
st.bar_chart(q1)
st.subheader('Q2')
st.text('The average salary in departments')
st.table(q2)
st.line_chart(q2['sal_avg'], width=400)
st.markdown('---')
st.header('Predictive Data Analytics')
st.subheader('ML Model')
st.markdown('1. Linear Regression Model')
st.markdown('Settings of the model')
st.table(pd.DataFrame([['setting1', 1.0], ['setting2', 0.01], ['....','....']], columns = ['setting', 'value']))
st.markdown('2. SVC Regressor')
st.markdown('Settings of the model')
st.table(pd.DataFrame([['setting1', 1.0], ['setting2', 'linear'], ['....','....']], columns = ['setting', 'value']))
st.subheader('Results')
st.text('Here you can display metrics you are using and values you got')
st.table(pd.DataFrame([]))
st.markdown('<center>Results table</center>', unsafe_allow_html = True)
st.subheader('Training vs. Error chart')
st.write("matplotlib or altair chart")
st.subheader('Prediction')
st.text('Given a sample, predict its value and display results in a table.')
st.text('Here you can use input elements but it is not mandatory')
```
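When building the data characteristics table in the example above, keep in mind that `DataFrame.shape` is a `(rows, columns)` tuple, so the first value is the number of instances and the second is the number of features. A minimal sketch of a helper that keeps the two in the right order follows; the name `characteristics_rows` is hypothetical, and the shapes in the example call are placeholders rather than values measured from the dataset.

```python
def characteristics_rows(shapes):
    """Build [table, instances, features] rows from (name, (n_rows, n_cols)) pairs."""
    return [[name, shape[0], shape[1]] for name, shape in shapes]


COLUMNS = ["Tables", "Instances", "Features"]

# Placeholder shapes for illustration only; in the app, pass emps.shape and depts.shape.
rows = characteristics_rows([("Employees", (14, 8)), ("Departments", (4, 3))])
```

In the dashboard this could be displayed with `st.write(pd.DataFrame(rows, columns=COLUMNS))`.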
# Run Streamlit
HDP comes with a list of custom ports; you can check them by looking at the ports forwarded in VirtualBox or Docker.

We will use the first port, `60000`, for the Streamlit server. By default, Streamlit uses port `8501`, but you can run it on a custom port by specifying the server port as follows:
`streamlit run <streamlit_app.py> --server.port 60000`
This runs the Streamlit app in `<streamlit_app.py>` on port 60000. Open `localhost:60000` in a browser tab on your local machine to view the app.
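If you prefer to start the server from a Python script, the same command can be assembled as an argument list. This is only a sketch: the helper name `streamlit_command` and the file name `streamlit_app.py` are hypothetical.

```python
def streamlit_command(script_path, port=60000):
    """Return the argument list for running a Streamlit app on a given port."""
    return ["streamlit", "run", script_path, "--server.port", str(port)]


command = streamlit_command("streamlit_app.py")
print(" ".join(command))
```

The list in `command` can be passed to `subprocess.call(command)` to launch the app.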
# References
- [Altair Chart examples](https://altair-viz.github.io/gallery/index.html)
- [Streamlit docs](https://docs.streamlit.io/)
<!--
install streamlit
pip install streamlit==0.55.0
-->