Pandas Task #1

For the pandas level of the Data Science course, we are asking you to complete two courses on the Kaggle platform so that everyone starts from the same level when working with data!

Courses
1. Intro to Programming
2. Python
  • In the first course, pay special attention to functions and how they help you avoid duplicate code and structure your programs nicely.
  • In the second course, pay attention to lists and dictionaries; you will use those a lot when writing code.
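
To give a feel for why these topics matter, here is a minimal sketch that combines all three. The learner names, the scores, and the helper name average are made up for illustration; they do not come from either Kaggle course.

```python
# Hypothetical example: names and scores are invented for illustration only.

def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

# A dictionary maps each learner (key) to a list of scores (value).
scores = {
    "alice": [8, 9, 10],
    "bob": [6, 7, 8],
}

# Calling one function per learner replaces a copy-pasted formula.
averages = {name: average(marks) for name, marks in scores.items()}
print(averages)  # {'alice': 9.0, 'bob': 7.0}
```

Without the function, the sum-and-divide formula would have to be repeated for every learner, and a fix to it would have to be made in every copy.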

Please do this before 22 March, so we all move at the same pace!

Kaggle

If you are done with the task and interested in learning more about Data Science on your own, Kaggle is a great place to check out!

The Kaggle community is centered around Machine Learning contests with real money prizes. If you are just starting out, I would encourage you to ignore the contests in favour of other resources like:

  • courses
  • datasets
  • notebooks

Notebooks

The notebooks you can find under the Code tab deserve a special mention: they allow you to write and execute Python code on remote machines without having to worry about configuring your local computer!

We will get to writing notebooks soon, but let's start by reading code written by others!

For extra credit:

  1. search the Code tab of Kaggle to find an interesting notebook
  2. share it with other learners in the RegenLearnings/Pandas chat

A tutorial on data visualization or exploratory data analysis (EDA) can be a great starting point, but feel free to pick whatever seems interesting to you!

Notes:

  • If you don't enjoy the Kaggle flavour of notebooks, Google Colab offers a good alternative with similar features.
  • We call these Jupyter Notebooks because the first programming languages supported by the project were Julia, Python, and R.
  • Jupyter is a serious tool used in real-world Data Science!

Extra Resources