Apache Zeppelin is a web-based, open source notebook and collaborative application for interactive data ingestion, discovery, analytics, and visualization. Zeppelin supports more than 20 languages and data-processing backends, including Apache Spark, SQL, R, Elasticsearch, and many more. It lets you build attractive data-driven documents and view the results of your analytics.
Table of Contents
Step 1. First let’s start by ensuring your system is up-to-date.
Step 2. Installing Java.
Step 3. Installing Zeppelin.
Step 4. Configure Systemd service for Apache Zeppelin.
Step 5. Configure Reverse Proxy Nginx.
Step 6. Accessing Apache Zeppelin.
Prerequisites
This article assumes you have at least basic knowledge of Linux, know how to use the shell, and, most importantly, run your own VPS. The installation is quite simple and assumes you are logged in as root; if not, you may need to prefix the commands with ‘sudo’ to get root privileges. I will walk you through the step-by-step installation of Apache Zeppelin on a CentOS 7 server.
Install Apache Zeppelin on CentOS 7
Step 1. First let’s start by ensuring your system is up-to-date.
yum clean all
yum -y update
Step 2. Installing Java.
At the time of writing this tutorial, the latest Java JDK version was JDK 8u45. First, download the latest Java SE Development Kit 8 release from its official download page, or use the following commands to download it from the shell:
cd /opt/
wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/8u45-b14/jdk-8u45-linux-x64.tar.gz"
tar xzf jdk-8u45-linux-x64.tar.gz
After extracting the archive, use the alternatives command to install it. The alternatives command is provided by the chkconfig package:
cd /opt/jdk1.8.0_45/
alternatives --install /usr/bin/java java /opt/jdk1.8.0_45/bin/java 2
alternatives --config java

There are 3 programs which provide 'java'.

  Selection    Command
-----------------------------------------------
*  1           /opt/jdk1.7.0_71/bin/java
 + 2           /opt/jdk1.8.0_25/bin/java
   3           /opt/jdk1.8.0_45/bin/java

Enter to keep the current selection[+], or type selection number: 3
At this point, Java 8 (JDK 8u45) has been successfully installed on your system. We also recommend setting up the javac and jar command paths using alternatives:
alternatives --install /usr/bin/jar jar /opt/jdk1.8.0_45/bin/jar 2
alternatives --install /usr/bin/javac javac /opt/jdk1.8.0_45/bin/javac 2
alternatives --set jar /opt/jdk1.8.0_45/bin/jar
alternatives --set javac /opt/jdk1.8.0_45/bin/javac
Check the installed Java version:
# java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
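If you script this check, the major version can be pulled out of the first line of the java -version output. The following is a minimal sketch; the helper name java_major is hypothetical and not part of any tool used in this tutorial:

```shell
# java_major LINE: print the major Java version found in a line such as
#   java version "1.8.0_45"
# (java_major is an illustrative helper name, not a standard command.)
java_major() {
    ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$ver" in
        1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;  # legacy scheme: 1.8.0_45 is Java 8
        *)   printf '%s\n' "$ver" | cut -d. -f1 ;;  # newer scheme: 11.0.2 is Java 11
    esac
}

# Example (java -version writes to stderr, hence the 2>&1):
#   java_major "$(java -version 2>&1 | head -n 1)"
```

A script could compare the printed number against 8 and stop early if an older JDK is still selected in alternatives.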
We can easily set the environment variables using the export command as shown below:
Setup JAVA_HOME Variable:
export JAVA_HOME=/opt/jdk1.8.0_45
Setup JRE_HOME Variable:
export JRE_HOME=/opt/jdk1.8.0_45/jre
Setup PATH Variable:
export PATH=$PATH:/opt/jdk1.8.0_45/bin:/opt/jdk1.8.0_45/jre/bin
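Variables set with export only last for the current shell session. One possible way to keep them across logins is to write the same lines to a script under /etc/profile.d/. The sketch below assumes that approach; the function name write_java_profile and the file name jdk8.sh are hypothetical:

```shell
# write_java_profile FILE JDK_DIR: write the three export lines for JDK_DIR into FILE.
# (write_java_profile is an illustrative helper, not part of the JDK.)
write_java_profile() {
    file=$1
    jdk=$2
    cat > "$file" <<EOF
export JAVA_HOME=$jdk
export JRE_HOME=$jdk/jre
export PATH=\$PATH:$jdk/bin:$jdk/jre/bin
EOF
}

# Example, run as root (jdk8.sh is an assumed file name):
#   write_java_profile /etc/profile.d/jdk8.sh /opt/jdk1.8.0_45
```

The escaped \$PATH keeps the variable literal in the generated file, so it is expanded at login time rather than when the file is written.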
Step 3. Installing Zeppelin.
First, download the Zeppelin binary package to your system. You can always find the latest version of the application on the Zeppelin download page:
wget http://www-us.apache.org/dist/zeppelin/zeppelin-0.7.3/zeppelin-0.7.3-bin-all.tgz
tar xf zeppelin-*-bin-all.tgz -C /opt
Rename the directory for the sake of convenience:
mv /opt/zeppelin-*-bin-all /opt/zeppelin
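Before moving on, it may be worth confirming that the renamed directory contains the daemon script that the later steps rely on. This is a small sketch; the function name check_zeppelin_home is hypothetical:

```shell
# check_zeppelin_home DIR: succeed only if DIR/bin/zeppelin-daemon.sh exists
# and is executable. (Illustrative helper, not shipped with Zeppelin.)
check_zeppelin_home() {
    dir=$1
    if [ -x "$dir/bin/zeppelin-daemon.sh" ]; then
        echo "ok: $dir/bin/zeppelin-daemon.sh found"
    else
        echo "missing or not executable: $dir/bin/zeppelin-daemon.sh" >&2
        return 1
    fi
}

# Example:
#   check_zeppelin_home /opt/zeppelin
```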
Step 4. Configure Systemd service for Apache Zeppelin.
We will run Zeppelin as a systemd service. Before creating the unit file, add an unprivileged user for the Zeppelin application:
adduser -d /opt/zeppelin -s /sbin/nologin zeppelin
Give the newly created zeppelin user ownership of the files:
chown -R zeppelin:zeppelin /opt/zeppelin
Next, create a new systemd service unit file:
nano /etc/systemd/system/zeppelin.service

[Unit]
Description=Zeppelin service
After=syslog.target network.target

[Service]
Type=forking
ExecStart=/opt/zeppelin/bin/zeppelin-daemon.sh start
ExecStop=/opt/zeppelin/bin/zeppelin-daemon.sh stop
ExecReload=/opt/zeppelin/bin/zeppelin-daemon.sh reload
User=zeppelin
Group=zeppelin
Restart=always

[Install]
WantedBy=multi-user.target
Then start the application and enable it to start at boot time:
systemctl start zeppelin
systemctl enable zeppelin
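Zeppelin can take a few seconds to begin accepting connections after the service starts. If you want to wait for it in a script, a generic retry loop is one option. The sketch below is an assumption about how you might do that; the retry function is hypothetical, and the curl call in the example assumes curl is installed:

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND...: run COMMAND until it succeeds,
# at most ATTEMPTS times, sleeping DELAY_SECONDS between tries.
# (retry is an illustrative helper, not a standard command.)
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Only sleep if another attempt is still to come.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example: poll the default Zeppelin port for up to about a minute.
#   retry 30 2 curl -fsS -o /dev/null http://127.0.0.1:8080/
```

If the loop gives up, systemctl status zeppelin is a reasonable next place to look.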
Step 5. Configure Reverse Proxy Nginx.
By default, the Zeppelin server listens to localhost on port 8080. In this tutorial, we will use Nginx as a reverse proxy so that the application can be accessed via standard HTTP and HTTPS ports:
yum install certbot
yum install nginx
Start Nginx and enable it to automatically start at boot time:
sudo systemctl start nginx
sudo systemctl enable nginx
Next, generate the SSL certificates:
certbot certonly --webroot -w /usr/share/nginx/html -d zeppelin.wpcademy.com
The generated certificates are stored in /etc/letsencrypt/live/zeppelin.wpcademy.com/. The SSL certificate is saved as fullchain.pem and the private key as privkey.pem.
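Set up automatic renewal of the Let’s Encrypt certificates using a cron job. Open the root crontab and add the entry shown on the second line, which runs the renewal check every day at 05:30:
sudo crontab -e
30 5 * * * /usr/bin/certbot renew --quiet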
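If you prefer to add the entry without opening an editor, one approach is to filter the existing crontab and append the line only when it is missing, so repeated runs do not create duplicates. This is a sketch under that assumption; the function name ensure_line is hypothetical:

```shell
# ensure_line LINE: copy stdin to stdout, then append LINE unless an
# identical line is already present. (Illustrative helper.)
ensure_line() {
    line=$1
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -x matches whole lines, -F treats the entry (including its * fields) literally.
    printf '%s\n' "$existing" | grep -qxF -- "$line" || printf '%s\n' "$line"
}

# Example, run as root:
#   (crontab -l 2>/dev/null || true) \
#       | ensure_line '30 5 * * * /usr/bin/certbot renew --quiet' \
#       | crontab -
```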
Next, create a new server block file for the Zeppelin site:
nano /etc/nginx/conf.d/zeppelin.wpcademy.com.conf
upstream zeppelin {
    server 127.0.0.1:8080;
}

server {
    listen 80;
    server_name zeppelin.wpcademy.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name zeppelin.wpcademy.com;

    ssl_certificate /etc/letsencrypt/live/zeppelin.wpcademy.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/zeppelin.wpcademy.com/privkey.pem;

    ssl_session_cache builtin:1000 shared:SSL:10m;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers HIGH:!aNULL:!eNULL:!EXPORT:!CAMELLIA:!DES:!MD5:!PSK:!RC4;
    ssl_prefer_server_ciphers on;

    access_log /var/log/nginx/zeppelin.access.log;

    location / {
        proxy_pass http://zeppelin;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header X-NginX-Proxy true;
        proxy_redirect off;
    }

    location /ws {
        proxy_pass http://zeppelin/ws;
        proxy_http_version 1.1;
        proxy_set_header Upgrade websocket;
        proxy_set_header Connection upgrade;
        proxy_read_timeout 86400;
    }
}
Restart Nginx so that the changes can take effect:
systemctl restart nginx
Step 6. Accessing Apache Zeppelin.
With the reverse proxy in place, Apache Zeppelin is served over HTTPS on port 443, and plain HTTP requests on port 80 are redirected to it. Open your favorite browser and navigate to https://zeppelin.wpcademy.com to reach the Zeppelin web interface. If you are using a firewall, open ports 80 and 443 to enable access.
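To confirm from the command line that plain HTTP requests are redirected to HTTPS, you can inspect the Location header of the response. The sketch below assumes curl is installed; the function name redirect_target is hypothetical:

```shell
# redirect_target: read HTTP response headers on stdin and print the value
# of the first Location header, if any. (Illustrative helper.)
redirect_target() {
    # Strip carriage returns, then match the header name case-insensitively.
    tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Example:
#   curl -sI http://zeppelin.wpcademy.com/ | redirect_target
```

With the server block from Step 5, the printed URL would be expected to start with https://.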
Congratulations! You have successfully installed Apache Zeppelin on CentOS 7. Thanks for using this tutorial to install Apache Zeppelin on a CentOS 7 system. For additional help or useful information, we recommend checking the official Apache Zeppelin web site.