---
tags: BigData-MS-2019, BigData-BS-2019
title: Labs. Prerequisites.
---
# Labs: Prerequisites
Welcome to the Big Data course. This document lists the software and hardware you will need. Let's get started!
## Software used in the course:
- [VirtualBox](https://www.virtualbox.org/wiki/Downloads)
- [Vagrant](https://www.vagrantup.com/intro/getting-started/install.html)
- [Docker](https://docs.docker.com/get-docker/)
- [Hadoop 3.3.0](https://archive.apache.org/dist/hadoop/common/hadoop-3.3.0/)
- [Spark 3.0.0](https://spark.apache.org/downloads.html)
## Hardware Requirements
**Processor**: Hardware virtualization must be enabled ([Check Virtualization on Windows](https://www.thewindowsclub.com/check-intel-amd-processor-supports-hyper-v); on Linux, run `lscpu` and look for the `Virtualization` field). It is sometimes disabled in the BIOS.
**Memory**: We recommend at least 8 GB of RAM.
**Storage**: To complete the first half of the course, you will need roughly 20 GB of free disk space.
**OS**: macOS or Linux (latest Ubuntu LTS) is fine. Hadoop and Spark also work on Windows, but expect some additional issues. If you want to use Linux, install it directly on hardware rather than inside a virtual machine, so that you avoid nested virtualization.
:::info
If you cannot meet these hardware requirements, please consult with your TA.
:::
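On Linux, the processor check can also be scripted. The sketch below is one possible approach, not an official course tool: it looks for the `vmx` (Intel VT-x) or `svm` (AMD-V) CPU flag in `/proc/cpuinfo`. The function name `has_virtualization` is made up for this example. Note that the flag shows what the CPU supports; virtualization may still be switched off in the BIOS.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given cpuinfo file lists the
# vmx (Intel VT-x) or svm (AMD-V) flag. Defaults to /proc/cpuinfo.
has_virtualization() {
  local cpuinfo="${1:-/proc/cpuinfo}"
  # -w matches whole words only; errors (e.g. file missing on macOS) are hidden
  grep -E -q -w 'vmx|svm' "$cpuinfo" 2>/dev/null
}

if has_virtualization; then
  echo "Hardware virtualization: supported by the CPU"
else
  echo "Hardware virtualization: not detected (check BIOS settings)"
fi
```

If the flag is not detected on a machine that should support it, the BIOS setting is the first thing to check.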
## Recommended Reading
Refresh your knowledge of [bash](https://learnxinyminutes.com/docs/bash/).
## First Lab
For the first lab, install VirtualBox, Vagrant, and Docker.
Download the Vagrant box image (Ubuntu 14.04 "Trusty") from the [Ubuntu cloud images repository](https://cloud-images.ubuntu.com/vagrant/trusty/current/trusty-server-cloudimg-amd64-vagrant-disk1.box).
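After installing, you may want to confirm that the three tools are reachable from your terminal. The following is a minimal sketch, assuming the default command names `VBoxManage`, `vagrant`, and `docker` are on your `PATH`; the helper `check_tool` is a name invented for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reports whether a command is available on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: NOT found"
    return 1
  fi
}

# Check each tool needed for the first lab; keep going even if one is missing.
for tool in VBoxManage vagrant docker; do
  check_tool "$tool" || true
done
```

Once the box file is downloaded, it can be registered locally with `vagrant box add --name trusty64 <path-to-box-file>`, where `trusty64` is only an illustrative name.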
## Useful Software
- Terminal Emulators (highly recommended for Windows): [Terminus](https://eugeny.github.io/terminus/), [Hyper](https://hyper.is/), [Cmder (Windows Only)](https://cmder.net/), [ConEmu (Windows Only)](https://conemu.github.io/)
- Modern Editors: [Atom](https://atom.io/), [Visual Studio Code](https://code.visualstudio.com/)
- Bash tools on Windows (highly recommended): [GNU on Windows](https://github.com/bmatzelle/gow/wiki), [Git Bash](https://www.atlassian.com/git/tutorials/git-bash), [Windows Subsystem for Linux](https://docs.microsoft.com/en-us/windows/wsl/install-win10)
- IDE (highly recommended): [IntelliJ IDEA](https://www.jetbrains.com/idea/download/)
## Books:
- [Designing Data-Intensive Applications](https://dataintensive.net/)
- [Programming in Scala](https://booksites.artima.com/programming_in_scala_3ed)
- *Extra: [Hadoop: The Definitive Guide](https://www.oreilly.com/library/view/hadoop-the-definitive/9781491901687/)*
- *Extra: [Spark: The Definitive Guide](http://shop.oreilly.com/product/0636920034957.do)*