---
tags: IntroToBigData
title: Installing Python interpreter on Zeppelin
---

Installing Python interpreter on Zeppelin
===

**Course:** Intro to Big Data - IU S23
**Author:** Firas Jolha

# Agenda

[TOC]

## Installing pandas package

The cluster node in the HDP Sandbox does not have the pandas package installed, so you need to install it with pip. The Python version in the sandbox is 2.7, so pip 20.3.4 (the last pip release that supports Python 2.7) needs to be installed first, using the `easy_install` script or any other method. After that, install pandas version 0.24.2.

:::info
Search for a tutorial on your preferred search engine :slightly_smiling_face:
:::

:::warning
Do not update the Python version in the sandbox, otherwise some services will not work.
:::

:::spoiler simple steps to install pip and pandas without using yum
```sh
# Download the pip bootstrap script for Python 2.7
wget https://bootstrap.pypa.io/pip/2.7/get-pip.py
# Install pip with the sandbox's Python 2.7
python get-pip.py
# Install the pinned pandas version
pip install pandas==0.24.2
```
:::

## Installing python interpreter on Zeppelin

[Apache Zeppelin](https://zeppelin.apache.org) is a web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala, Python, R and more.

You can check this [tutorial](https://zeppelin.apache.org/docs/0.8.0/usage/interpreter/installation.html#3rd-party-interpreters) to see how to add an interpreter to Zeppelin. You need to run the script `/usr/hdp/2.6.5.0-292/zeppelin/bin/install-interpreter.sh` and pass the name of the interpreter as the argument to the `--name` option.
For Python, you need to pass `python` as follows:

```sh
/usr/hdp/2.6.5.0-292/zeppelin/bin/install-interpreter.sh --name python
```

:::danger
**Note:** Before you run the script, add the following line at the beginning of the file `/usr/hdp/2.6.5.0-292/zeppelin/conf/zeppelin-env.sh`:

`export ZEPPELIN_INTERPRETER_DEP_MVNREPO="https://repo1.maven.org/maven2"`

![](https://i.imgur.com/LlqtYG9.png)
:::

Then run the script, passing `python` as the interpreter name.

![](https://i.imgur.com/Q6Gd0ER.png)

After that, restart Zeppelin from the Ambari dashboard. You can access the Zeppelin notebook at http://localhost:9995 on your local machine.

![](https://i.imgur.com/NUkzqJM.png)

You can add interpreters to Zeppelin from the interpreters window. Press `create`, select `python`, and name the interpreter instance `python2`, as shown below.

![](https://i.imgur.com/0cfhKDX.png)

---

![](https://i.imgur.com/Vepj5TY.png)

Create a new note and select `python2` as shown below.

![](https://i.imgur.com/RGiT4sE.png)

This creates a new notebook whose default interpreter is `python2`. You can still run cells with other interpreters by specifying the interpreter at the beginning of the cell, such as `%sh` to run shell commands.
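
The note earlier in this section asks for the `ZEPPELIN_INTERPRETER_DEP_MVNREPO` line to be placed at the top of `zeppelin-env.sh` before the install script runs. The following is a minimal sketch of doing both steps from the sandbox shell, assuming it is run by a user who can write to that file. The helper name `prepend_line_once` is made up for illustration, and the file-existence check is an added safeguard that is not part of the original steps.

```sh
#!/bin/sh
# Sketch only: paths and the export line come from the text above;
# the function name and the existence check are assumptions.

ZEPPELIN_HOME="/usr/hdp/2.6.5.0-292/zeppelin"
ENV_FILE="$ZEPPELIN_HOME/conf/zeppelin-env.sh"
MVNREPO_LINE='export ZEPPELIN_INTERPRETER_DEP_MVNREPO="https://repo1.maven.org/maven2"'

# Prepend line $2 to file $1 unless the file already contains that exact line.
prepend_line_once() {
    file="$1"
    line="$2"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    if grep -qxF -e "$line" "$file"; then
        return 0
    fi
    tmp="$(mktemp)" || return 1
    # Build the new content in a temp file, then write it back in place
    # so the original file keeps its owner and permissions.
    { printf '%s\n' "$line"; cat "$file"; } > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Only touch the real file when it exists, that is, inside the sandbox.
if [ -f "$ENV_FILE" ]; then
    prepend_line_once "$ENV_FILE" "$MVNREPO_LINE" &&
        "$ZEPPELIN_HOME/bin/install-interpreter.sh" --name python
fi
```

Running the script a second time leaves `zeppelin-env.sh` unchanged, because the helper skips the edit when the line is already present. Zeppelin still needs to be restarted from Ambari afterwards, as described above.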