Static Sites and Apps On Your Own Dokku Server
https://ericmjl.github.io/essays-on-data-science/miscellaneous/static-sites-on-dokku/
Summary: In this essay, I’m going to share with you how you can deploy your own static sites and apps on a Dokku server.
Introduction
You’ve worked on this awesome Streamlit app, or a Panel dashboard, or a Plotly Dash web frontend for your data science work, and now you’ve decided to share the work. Or you’ve built documentation for the project, and you need to serve it up. Except, if your company doesn’t have a dedicated platform for apps, you’re stuck! That’s because you’ve now got to share it from your laptop/workstation and point your colleagues there (right… go ahead and email them a link to http://10.16.213.24:8501 right now…), and keep your computer running perpetually so that they can interact with the app.
Or, you copy/paste the docs into a separate hosted solution, and now the docs are divorced from your code, leading to documentation rot in the long run because it’s too troublesome to maintain two things simultaneously. That’s much too fragile. What you need is another hosting machine that is isolated from your development machine, so that you can develop in peace while the hosting machine reliably serves up the working, stable version of your work.
The specific thing that you really need is a “Platform as a Service”. There are plenty of commercial offerings, but if you’re “cheap” and don’t mind learning some new concepts to help you get around the web, then this essay is for you. Here, I’m going to show you how to configure and deploy Dokku as a personal PaaS solution that you can use at work and for hobby projects. I’m then going to show you how to deploy a static site (which can be useful for hosting blogs and documentation), and finally I’ll show you how to deploy a Streamlit app, which you can use to show off a front-end to your fancy modelling work. Along the way, I hope to also point out the “generalizable” ideas behind the steps listed here, and give you a framework (at least, one useful one) for building things on the web.
But, but why?
If you’re part of a company…
Your organisation might not be equipped with modern PaaS tools that will enable you, a data scientist, to move quickly from local prototype to share-able prototype. If, however, you have access to bare metal cloud resources (i.e. just gimme a Linux box!), then as a hacker-type, you might be able to stand up your own PaaS and help demonstrate to your infrastructure team the value of having one.
If you’re doing this for your hobby projects…
You might be as cheap as I am, but need a bit more power than the restrictions on Heroku allow (512MB RAM, 30 minute timeouts, and a limited number of records in databases), and you don’t want to pay $7/month for each project.
Additionally, you might want a bit more control over your hosting options, but you don’t feel completely ready to go fiddling with containers and networking without a software stack to help out just a bit.
More generally…
You might like the idea of containers, but find it kind of troublesome to learn yet another thing that’s not trivial to configure, execute and debug (i.e. Docker). Dokku can be a bridge to get you there, as it automates much of the workflow surrounding Docker containers. It also comes with an API that closely matches Heroku’s (which is famous for being very developer-friendly) and helps you handle proxy port mapping and domains easily.
Are you ready? Let’s go!
Pre-requisites
I’m assuming you know how to generate and use SSH keys to remotely access another machine. This is an incredibly useful thing to know how to do, and so I’d recommend that you pick this skill up. (As usual, DigitalOcean’s community pages have a great tutorial.)
I am also assuming that you have access to a Linux box of some kind with an IP address that you can expose to the “internet”. The “internet” in this case can mean the world wide web if you’re working on a personal project, or your organisation’s “intranet” if you’re planning on only letting those in your organisation access the sites and apps that you will build.
I’m also assuming familiarity with git, which I consider to be an indispensable tool in a data scientist’s toolkit.
This last point is not an assumption, but an exhortation: you should be building your data app prototype in accordance with 12-factor app principles. It’ll make (most of) your Engineering colleagues delight in working with you. (Yes, there are some esoteric types that don’t subscribe to 12-factor app principles…) If you’ve never heard of them, go check them out at https://12factor.net. It’s a wonderful read that will change how you build your data app prototypes for the better. It will also make handing over the app to Engineering much easier in the long run, improving your relationship with the Engineering team that will take care of your app!
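To make one of those principles concrete, here is a minimal sketch of the “store config in the environment” factor. The variable names (PORT, DEBUG, DATA_URL) and their defaults are hypothetical choices for illustration, not anything that Dokku or Streamlit mandates.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    port: int
    debug: bool
    data_url: str


def load_config(env=os.environ) -> AppConfig:
    """Build the app's configuration from environment variables only.

    PORT, DEBUG and DATA_URL are hypothetical names used for illustration.
    """
    return AppConfig(
        # Fall back to development-friendly defaults when a variable is unset.
        port=int(env.get("PORT", "8501")),
        debug=env.get("DEBUG", "false").lower() in ("1", "true", "yes"),
        data_url=env.get("DATA_URL", "sqlite:///local.db"),
    )
```

Because nothing is hard-coded, the same code can run unchanged on your laptop and on the hosting machine; only the environment differs.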
Set up a Box in the Cloud (optional if you have one)
If you don’t already have another computer that you have access to, or if you’re curious about how to get set up in the cloud, then follow along with these instructions.
To make things a bit more concrete, I’m going to rent a server on DigitalOcean (DO). If you don’t have a DigitalOcean account already, feel free to use my referral link to sign up (and get $100 in free credits that you can spend over 60 days). (Disclaimer: I benefit too, but your support helps me make more of this kind of content!) Once you’ve signed up and logged into the DigitalOcean cloud dashboard, you can set up a new Droplet.
To do so, click on the big green “Create” button and set up a Droplet with the following settings:
- Ubuntu operating system
- “Standard” plan at $5/mo or $10/mo [1]
- Additional options: “monitoring”
- Authentication: you should use SSH keys (this is a pre-requisite for this essay).
- Hostname: Give it something memorable.
- Backups: Highly recommended. Give yourself peace of mind that you can roll back anytime.
Once you’re done with that, hit the big green “Create Droplet” button right at the bottom of the screen!
Once your droplet is set up, you can go ahead and click on the “Manage > Droplets” left menu item, and that will bring you to a place where you can see all of your rented computers.
Set up Dokku on your Shiny Rented Machine
Let’s now go ahead and set up Dokku on your shiny new droplet.
Run the Dokku installation commands
Dokku installation on Ubuntu is quite easy; the following instructions are taken from the Dokku docs. SSH into your machine, then type the following:
    wget https://raw.githubusercontent.com/dokku/dokku/v0.20.4/bootstrap.sh;
    # CHECK FOR LATEST DOKKU VERSION!
    # http://dokku.viewdocs.io/dokku/getting-started/installation/
    sudo DOKKU_TAG=v0.20.4 bash bootstrap.sh
Wrap up Dokku installation
If you’re on an Ubuntu installation, you’ll want to navigate to your Linux box’s IP address in your browser and finish the installation, which involves letting your Dokku installation know about your SSH public keys. These keys are important, as they are literally the keys to letting you push your app to your Dokku box!
For those of you who have CentOS or other flavours of Linux, you will need to follow analogous instructions on the Dokku website. I have had experience following the CentOS instructions at work, and had to modify the installation commands a little bit to work with our internal proxies.
Test that Dokku is working
To test that your Dokku installation is working, type the following command:
    dokku help
That should show the Dokku help menu, and you’ll know the installation has completed successfully!
Optionally set proxies for Docker
Now, because Dokku builds upon Docker, if you’re behind a corporate proxy, you might need to configure your Docker daemon proxies as well. If so, follow the instructions in the Docker daemon documentation.
Those steps will generally be the same as what’s in the docs, though the specifics will change (e.g. your proxy server address).
Configure domain names
The allure of Heroku is that it gives your app a name: myapp.herokuapp.com, or myshinyapp.herokuapp.com quite automagically. With Dokku and a bit of extra legwork, we can replicate this facility.
We’re going to set up your Dokku box such that its main domain will be mydomain.com, and apps will get a subdomain myapp.mydomain.com.
Register a domain name
Firstly, you’ll need a domain name from a domain name registrar. Cloudflare seems to be doing all of the right things at the moment, so their domain name registration service is something I’m not hesitant to recommend (at least as of 2020). For historical reasons, I’m currently using Google Domains. At work, we have an internal service that lets us register an internal domain name. What matters most is that you have the ability to assign a domain name that points to your Dokku machine’s IP address.
Go ahead and register a domain name that reflects who “you” are on the web. For myself, I have a personal domain name that I use. At work, I registered a name that reflects the research group that I work in. Make sure that the name “points”/“forwards” to the IP address of your Dokku box.
Enable subdomains!
To enable the ability to use subdomains like myapp.mydomain.com for each app, you’ll want to also configure the DNS settings. On your domain registrar, look for the place where you can customise “resource records”. On Google Domains, it’s under “DNS > Custom resource records”.
There, you’ll want to add an “A” record (as opposed to other options that you might see, like “CNAME”, “AAAA”, and others). The “name” should be *, literally just an asterisk. The IPv4 address should point to your Dokku machine. This is all that is needed.
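If the wildcard feels a bit abstract, here is a toy model of how a lookup against your records behaves. It is a deliberate simplification of real DNS resolution (for instance, it uses “@” to stand for the bare domain, a convention many registrars use), and the IP address comes from a range reserved for documentation.

```python
from typing import Dict, Optional


def resolve(hostname: str, a_records: Dict[str, str], zone: str) -> Optional[str]:
    """Toy lookup: try an exact record first, then fall back to the wildcard."""
    if hostname != zone and not hostname.endswith("." + zone):
        return None  # this hostname does not belong to our domain at all
    # "@" stands for the bare domain; otherwise strip the ".mydomain.com" suffix.
    name = "@" if hostname == zone else hostname[: -(len(zone) + 1)]
    if name in a_records:
        return a_records[name]
    # The wildcard covers subdomains only, never the bare domain itself.
    if name != "@" and "*" in a_records:
        return a_records["*"]
    return None


# Hypothetical records: one for the bare domain, one wildcard for every subdomain.
records = {"@": "203.0.113.10", "*": "203.0.113.10"}
print(resolve("myapp.mydomain.com", records, "mydomain.com"))
print(resolve("anything-else.mydomain.com", records, "mydomain.com"))
```

Both lookups land on the same machine, which is exactly what we want: every subdomain reaches the Dokku box, and the box decides which app should answer.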
To test whether your domain name is set up correctly, head over to the domain in your favourite browser. At this point, you should see the default NGINX landing page, as you have no apps deployed and no domains configured.
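If you prefer checking from a script rather than a browser, a small sketch along these lines can confirm that names resolve to your box. The function name check_dns is my own, and the hostnames and IP address you pass in are placeholders for your own values; the lookup function is injectable so that the logic can be exercised without touching the network.

```python
import socket
from typing import Callable, Dict, Iterable


def check_dns(
    hostnames: Iterable[str],
    expected_ip: str,
    lookup: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, bool]:
    """Report, per hostname, whether it resolves to the expected IPv4 address."""
    results = {}
    for host in hostnames:
        try:
            results[host] = lookup(host) == expected_ip
        except OSError:
            # socket.gaierror (a subclass of OSError) means the name did not resolve.
            results[host] = False
    return results
```

Calling check_dns(["mydomain.com", "some-random-name.mydomain.com"], "your.box.ip.address") should report True for both names once the records have propagated, which can take a while after you save them.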
Deploy a test app
Heroku provides a “Python getting started” repository that we will use to check that the installation is working correctly. This one deploys reliably with all of the vanilla commands entered. Building on this, I will also show you how to use your * A record to set up nice subdomains!
Clone the test app
Firstly, git clone Heroku’s python-getting-started repository to your laptop/local machine (i.e. not your Dokku box).
Next, cd into the repository:
    cd python-getting-started
After that, add your Dokku box as a git remote to the repository:
    git remote add dokku dokku@your-domain-name:python-getting-started
Be sure to replace your-domain-name with your newfangled domain that you registered.
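The remote address packs three pieces of information into one string, and the last piece matters: whatever follows the colon is the name Dokku gives your app. A tiny sketch that pulls the pieces apart (parse_remote and DokkuRemote are illustrative helpers of my own, not part of Dokku):

```python
from typing import NamedTuple


class DokkuRemote(NamedTuple):
    user: str  # the SSH user; for Dokku this is "dokku"
    host: str  # your domain name or IP address
    app: str   # the name the app gets on the Dokku box


def parse_remote(remote: str) -> DokkuRemote:
    """Split a 'user@host:app' git remote into its three parts."""
    user_host, _, app = remote.partition(":")
    user, _, host = user_host.partition("@")
    if not (user and host and app):
        raise ValueError(f"expected user@host:app, got {remote!r}")
    return DokkuRemote(user, host, app)


print(parse_remote("dokku@mydomain.com:python-getting-started"))
```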
Now, push the app to your Dokku box!
    git push dokku master
Unlike your usual pushes to GitHub, GitLab or Bitbucket, you’ll see a series of remote outputs being beamed back to your terminal. What’s happening here is the build of the app! In particular, a Docker build is happening behind-the-scenes, so your app is completely self-contained and containerised on the Dokku box!
If everything went well, the last output beamed back to you should look like:
    =====> Application deployed: http://mydomain.com:10161
Wonderful! Now let’s go back to Dokku and configure your app.
Configure the app domain and ports
We’re now going to configure Dokku to recognise which subdomains should point to which apps.
Firstly, get familiar with the Dokku domains command:
    # On your Dokku box
    dokku domains:help
That should list out all of the Dokku domains sub-commands.
    Usage: dokku domains[:COMMAND]

    Manage domains used by the proxy

    Additional commands:
        domains:add
Next, check which domains are currently configured for the app:
    # On your Dokku box
    dokku domains:report python-getting-started
The output should look something like this:
    $ dokku domains:report python-getting-started
    =====> python-getting-started domains information
           Domains app enabled:           false
           Domains app vhosts:
           Domains global enabled:        false
           Domains global vhosts:
This tells us that python-getting-started has no domains configured for it. We can now set it:
    # On your Dokku box
    dokku domains:set python-getting-started python-getting-started.mydomain.com
The output will look like this:
    -----> Added python-getting-started.mydomain.com to python-getting-started
    -----> Configuring python-getting-started.mydomain.com...(using built-in template)
    -----> Creating http nginx.conf
           Reloading nginx
Now, you should be able to go to http://python-getting-started.mydomain.com, and the page that gets loaded should be the “Getting Started with Python on Heroku” landing page!
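One way to picture what just happened: every subdomain resolves to the same machine, and NGINX on that machine looks at the hostname the browser asked for to decide which app should answer. The sketch below mimics that lookup with a plain dictionary. It is a mental model only, not how NGINX is implemented, and the upstream address is made up for illustration.

```python
from typing import Dict, Optional

# Hypothetical routing table: hostname -> where the app's container listens.
VHOSTS: Dict[str, str] = {
    "python-getting-started.mydomain.com": "http://172.17.0.2:5000",
}


def route(host_header: str, vhosts: Dict[str, str]) -> Optional[str]:
    """Pick the upstream for a request based on its Host header."""
    # Hostnames are case-insensitive, and the header may carry a port suffix.
    hostname = host_header.split(":", 1)[0].lower()
    return vhosts.get(hostname)


print(route("python-getting-started.mydomain.com", VHOSTS))
print(route("not-deployed-yet.mydomain.com", VHOSTS))
```

The second lookup finds nothing, which in this sketch is None; a real NGINX would hand such a request to its default server instead.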
If everything deployed correctly up till this point, you’re good to go with deploying a data app on your Dokku machine!
Deploy your data app
Deploying the python-getting-started app should have given you:
- the confidence that your Dokku installation is working correctly,
- firsthand experience configuring Dokku,
- a taste of the workflow for deploying an app.
Now, we’re going to apply that to a Streamlit app. I’ve chosen Streamlit because it’s got the easiest programming model amongst all of the Dashboard/app development frameworks that I’ve seen; in fact, I was able to stand up an explainer on the Beta distribution in under 3 hours, the bulk of which was spent on writing prose, not figuring out how to program with Streamlit.
Build a Streamlit app (skip if you already have an app)
If you don’t have a Streamlit app, here’s one that you can use as a starter, which simply displays some text and a button:
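    # app.py
    import streamlit as st

    # A bare string literal like this one is rendered as Markdown by Streamlit.
    """
    # First Streamlit App!

    This is a dummy streamlit app.
    """

    # st.button returns True on the run triggered by a click.
    finished = st.button("Click me!")
    if finished:
        st.balloons()
Save it as app.py in your project directory.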
Now, you can run the app with Streamlit:
    # On your local machine
    streamlit run app.py
You should see the following show up in your terminal:
    You can now view your Streamlit app in your browser.

    Local URL: http://localhost:8501
    Network URL: http://
Add project-specific configuration files for Dokku
Now, we need to add a few configuration files that Dokku will recognise.
requirements.txt
Firstly, make sure you have a requirements.txt file in the project root directory, in which you specify all of the requirements for your app to run.
    # requirements.txt
    streamlit==0.57.3  # pinning version numbers is good for apps.
    # put more below as necessary, e.g.:
    numpy==1.18.1
With a requirements.txt file, Dokku (and Heroku) will automagically recognise that you have a Python app. Dokku will then create a Docker container equipped with Python, and install all of the dependencies in there. Declarative configuration FTW!
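Since pinning is the whole point here, a quick check before you push can catch requirements that slipped through without an exact version. This is a rough sketch that only understands simple “name==version” lines; the helper name unpinned is my own, and anything fancier (such as “-r other.txt” includes or URLs) would simply be flagged.

```python
from typing import List


def unpinned(requirements_text: str) -> List[str]:
    """Return requirement lines that are not pinned with '=='."""
    loose = []
    for raw in requirements_text.splitlines():
        # Drop comments (whole-line or trailing) and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if "==" not in line:
            loose.append(line)
    return loose


example = """
# requirements.txt
streamlit==0.57.3  # pinned, good
numpy
pandas>=1.0
"""
print(unpinned(example))
```

An empty list means every dependency is pinned, so the container built today should get the same versions as the one built next month.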
Procfile
Next, you need a Procfile in the project root directory:
    # Procfile
    web: streamlit run app.py
The general syntax in the Procfile is: