# Kaggle

:::info
[TOC]
:::

![Screenshot-2023-10-20-at-17.43.28-1024x666-1](https://hackmd.io/_uploads/Syz3VrIjeg.png)

<br/>

## Introduction

**Kaggle** is an online data science community and platform owned by Google. It provides:
- 📊 **Datasets**: Public and private datasets.
- 🏆 **Competitions**: Machine learning challenges with leaderboards.
- 🛠 **Notebooks**: Free cloud-based Jupyter-like environment.
- 👥 **Community**: Forums and discussions.

> Suitable for beginners who want to practice ML and for experts who want to compete.

<br/>

## Create Account and API Key

1. Go to [Kaggle](https://www.kaggle.com/) and create an account.
2. After logging in, click your profile icon → **Account** (newer versions of the site label this **Settings**).
3. Scroll to the **API** section → click **Create New API Token**. The browser downloads a `kaggle.json` file.
    - `kaggle.json`:
        ```json
        {
          "username": "your_username",
          "key": "your_api_key"
        }
        ```

:::success
#### • Set Up API Key on Windows
Place the `kaggle.json` file at `C:\Users\<Username>\.kaggle\kaggle.json`.

```python
# Alternatively, set the credentials as environment variables in Python.
# Do this before importing or calling the kaggle package.
import os

os.environ['KAGGLE_USERNAME'] = 'your_username'
os.environ['KAGGLE_KEY'] = 'your_api_key'
```
:::

<br/>

## Install API & Download Datasets

Run in a terminal (Anaconda Prompt / CMD):
```bash
pip install kaggle
kaggle --version
```

Common download commands:
```bash
# Search datasets
kaggle datasets list -s "titanic"

# Download a dataset
kaggle datasets download -d <dataset-owner>/<dataset-name>

# Example: download Titanic competition data
kaggle competitions download -c titanic
```

> Extract the `.zip` file after downloading. Competition downloads only work after you have joined the competition and accepted its rules on the website.

<br/>

## Kaggle Resources

1. **Kaggle Competitions**
    - Browse competitions: ++[Kaggle Competitions](https://www.kaggle.com/competitions)++
    - Use Kaggle Notebooks or your local environment to build a model.
    - Submit predictions as a `.csv` file → check your leaderboard ranking.
2. **Kaggle Notebooks**
    Kaggle provides free cloud notebooks with:
    - Jupyter-like interface
    - Free GPU/TPU support
    - Preinstalled ML libraries (TensorFlow, PyTorch, Scikit-learn, etc.)
    - Direct access to Kaggle datasets

    > Useful for users without local GPU resources.

<br/>

- 💡 **Recommended Starter Competitions**
    1. ++[Titanic: Machine Learning from Disaster](https://www.kaggle.com/c/titanic)++
        - Predict passenger survival.
        - Good for learning data cleaning, feature engineering, and classification.
    2. ++[House Prices - Advanced Regression Techniques](https://www.kaggle.com/c/house-prices-advanced-regression-techniques)++
        - Regression problem to predict house prices.
    3. ++[Digit Recognizer](https://www.kaggle.com/c/digit-recognizer)++
        - MNIST handwritten digit classification (intro to deep learning).

<br/>

## Summary

- Kaggle is a practical platform for starting to learn data science.
- Set up the API token (`kaggle.json`) before downloading datasets from the command line.
- Begin with the Titanic competition for hands-on practice.
- Learn from Kaggle Notebooks shared by the community.

<br/>

:::spoiler Relevant Resources
[Kaggle Datasets](https://www.kaggle.com/datasets)
[Kaggle Competitions](https://www.kaggle.com/competitions)
[Kaggle Notebooks](https://www.kaggle.com/code)
[Kaggle Learn Courses](https://www.kaggle.com/learn)
:::
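
<br/>

The download section asks you to extract the `.zip` file after downloading. That step can also be done from Python with the standard library, which is handy inside a notebook. The following is only a minimal sketch: the function name `extract_archive` and the example paths `titanic.zip` and `data/titanic` are assumptions chosen for illustration and are not part of the Kaggle API.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir):
    """Unpack a downloaded .zip archive and return the extracted file names."""
    zip_path = Path(zip_path)
    target_dir = Path(target_dir)
    if not zip_path.is_file():
        raise FileNotFoundError(f"Archive not found: {zip_path}")
    # Create the destination folder if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        # Skip directory entries so that only files are reported.
        return sorted(n for n in archive.namelist() if not n.endswith("/"))


# Example usage (hypothetical paths, adjust to your download location):
# extract_archive("titanic.zip", "data/titanic")
```

The function returns the sorted list of extracted file names, so you can quickly check that the expected files (for example the training and test CSVs) are present before loading them.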