Who should install Transcrobes on microk8s?

microk8s is a Kubernetes-in-a-snap install, intended to make setting up a Kubernetes cluster on a single physical machine or VM easy and relatively painless for development or non-critical CI tasks. It is not intended to be used for production, and makes many assumptions about the environment it is running on.

That said, if you don’t have access to a Kubernetes cluster and are aware of the limitations of running multi-node cluster software on a single machine, it is a perfect way to get started with Transcrobes. The maintainer is using Transcrobes on microk8s for his own language learning, so at least it will always be well tested!

Prerequisites

  • A recent Linux distribution on a virtual or physical machine

  • A (free) Azure Cloud account with the Translator Text API enabled. Transcrobes (currently) uses the Azure (Bing) translation API for machine translation. You do need a credit card for this but your credit card will NOT be charged unless you EXPLICITLY activate charging. You don’t need to do this to benefit from the 2 million characters Azure allows you to look up/translate/transliterate per month for free. It’s a great service and worry-free - it will stop translating rather than start charging your card unless charging is activated. When did Microsoft go and get so awesome?!? :-) Set up a Translator Text API token and write it down somewhere.

  • Familiarity with running commands on the Linux command line. Only very basic knowledge of Kubernetes is assumed or required.

  • (Optional, sort of) A public IP pointing to the machine with two open ports, one of which is port 80 (but see below), and two publicly resolvable (sub-)domains pointing to this IP (CNAME or A records).

In order to simulate a full Kubernetes environment, microk8s does various things with the networking stack automatically. As such, installation on a machine that is used for other purposes can be problematic. If you decide to run microk8s on a machine that does other things, you should be aware that other software can interfere with it and can cause hard-to-diagnose issues.

If you don’t have a public IP pointing to your machine, you will need to manually install SSL certificates into the browsers you will be using for testing. Transcrobes is designed to be used everywhere, so for it to actually be useful you’ll need a public IP. You can test without one but functionality will be limited and setup will be a pain.

Installing microk8s for Transcrobes

Transcrobes has been tested extensively on microk8s 1.13.2+. Some earlier versions may work but no effort has been made to ensure it works properly on them.

microk8s has its own docker and kubectl, and this document assumes you do not have separate ones installed. At the very least you should disable docker while you are using microk8s to avoid interaction issues. If you don’t know how both of these applications work under the hood, it is easiest to simply uninstall the non-microk8s versions, at least while you are testing Transcrobes. If you have a web server listening on port 80, you will also have extra steps to perform, and you will have to disable it when setting up Transcrobes.

You need to have snapd installed on the machine. Full instructions for various distributions can be found on the Snapcraft site. On recent Ubuntu installations that is as simple as:

sudo apt install snapd

Once installed, you can install microk8s:

sudo snap install microk8s --classic

Until this issue is fixed, you will need to modify a configuration file to ensure performance is satisfactory. Remove the following line

--proxy-mode=userspace

From

/var/snap/microk8s/current/args/kube-proxy

This option was added to microk8s to allow for use in certain cloud environments (Azure, more information in the GitHub PR), and makes the network 4x slower for Transcrobes. Once the line is removed, restart microk8s.

sudo microk8s.stop
sudo microk8s.start
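
The manual edit above can also be scripted. The following is a minimal sketch, assuming the flag sits on a line of its own in the arguments file, as shown above. The function name remove_proxy_mode is purely illustrative and not part of microk8s. A copy of the original file is kept with a .bak suffix.

```shell
# Delete the "--proxy-mode=userspace" line from a kube-proxy args file.
# The original file is preserved next to it as <file>.bak.
remove_proxy_mode() {
  sed -i.bak '/^--proxy-mode=userspace[[:space:]]*$/d' "$1"
}

# Usage (in a root shell, followed by the microk8s restart shown above):
#   remove_proxy_mode /var/snap/microk8s/current/args/kube-proxy
```

If the line is not present, the file is left unchanged, so the function is safe to run more than once.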

In order to allow for easier copy/pasting, you should add aliases for kubectl and docker

sudo snap alias microk8s.kubectl kubectl
sudo snap alias microk8s.docker docker

Now enable the two microk8s-bundled services we’ll be using

sudo microk8s.enable storage dns

Install Helm

Transcrobes is packaged as a Kubernetes Helm chart, so you need to install that into microk8s first. First install the command line client

sudo snap install helm --classic

Now install the Helm service user into the cluster

kubectl apply -f https://raw.githubusercontent.com/istio/istio/master/install/kubernetes/helm/helm-service-account.yaml

And then install helm into your cluster

helm init --service-account tiller

(Optional, sort of) Install cert-manager

Transcrobes takes the pain out of provisioning and updating SSL certificates by leveraging the Kubernetes application cert-manager. If you really don’t have a public IP or open ports then you will have several manual steps to set up a certificate. We’ll update with instructions on that here soon. Otherwise

helm install --name cert-manager --namespace kube-system stable/cert-manager

Install Transcrobes

Sanity check

Before installing Transcrobes, you should ensure all the prerequisite services are installed and running properly

kubectl get pods -n kube-system

Should give you something very similar to

NAME                                    READY   STATUS    RESTARTS   AGE
cert-manager-6464494858-dsnq5           1/1     Running   6          10d
hostpath-provisioner-599db8d5fb-57mbr   1/1     Running   6          10d
kube-dns-6ccd496668-cgq2w               3/3     Running   18         10d
tiller-deploy-8485766469-tj9k7          1/1     Running   8          10d

The random suffixes at the end of the application (pod) names will be different but you should definitely see Running in the STATUS column and 1/1 or 3/3 in the READY column for all 4 lines. You will not have a cert-manager line if you are not going to install cert-manager and let Transcrobes automatically manage SSL certificates.

Install the chart repository

Now you have the required services installed, we can finally install Transcrobes. First add the Transcrobes chart repository

helm repo add transcrobes https://charts.transcrob.es

Install Transcrobes

Now actually install Transcrobes. Helm calls an installation into a cluster a “release”, and that term is used below to mean “your installation of Transcrobes in microk8s” - you are “releasing” a version of the application.

helm install transcrobes/transcrobes --name releaseName

Note: You can call your release anything - releaseName is used in the examples here.

Helm will then take care of downloading and installing the various components into microk8s/Kubernetes. It could take a while, depending on your internet connection, as there are some large (docker) image files that need to be downloaded.

Check the status with

kubectl get pods

You should eventually end up with something very similar (except for the random suffixes) to the following, so wait until you do. If you don’t, then jump on one of the lists and ask for help:

NAME                                                        READY   STATUS        RESTARTS   AGE
releaseName-nginxingress-controller-759d5f8967-7rtxt        1/1     Running       0          19h
releaseName-nginxingress-default-backend-755c69dd4d-5kq2d   1/1     Running       0          19h
releaseName-transcrobes-corenlp-5cddd78cbd-4hfxn            1/1     Running       3          19h
releaseName-transcrobes-transcrobes-85c4569cbf-qd27b        1/1     Running       0          18h
tcr-postgres-0                                              1/1     Running       0          19h

Again, the important columns are READY and STATUS. STATUS should be Running and, for each pod, the READY column should show the same number to the left of the slash (healthy containers) as to the right (total number of containers).
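
The READY/STATUS rule can be checked mechanically instead of by eye. The following is a minimal sketch, assuming the default column layout of kubectl get pods shown above (NAME, READY, STATUS in the first three columns). The function name pods_not_ready is illustrative only.

```shell
# Read "kubectl get pods" output on stdin and print the name of every pod
# that is not Running or whose READY counts differ (for example 0/1).
# Exit status is 1 if any such pod was found, 0 otherwise.
pods_not_ready() {
  awk 'NR > 1 && NF >= 3 {
         split($2, ready, "/")
         if ($3 != "Running" || ready[1] != ready[2]) { print $1; bad = 1 }
       }
       END { exit (bad ? 1 : 0) }'
}

# Usage:
#   kubectl get pods | pods_not_ready && echo "all pods ready"
#   kubectl get pods -n kube-system | pods_not_ready
```

The same function works for the kube-system sanity check earlier, since both listings share the same columns.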

Notes

If something goes wrong, then by far the easiest way to move forward is to completely purge the Helm release and start again.

helm del releaseName --purge

You can now try the release again with the install command. The (useful) data for Transcrobes is all in the database, and even a “complete” purge will NOT destroy the database permanent storage - you need to do that manually. The database data partition is almost certain to NOT be the issue, but if you really want to you can remove that with

kubectl delete pvc data-tcr-postgres-0

kubectl is an awesome tool for managing Kubernetes clusters. It allows you to do pretty much everything. But you need to have some serious JSON-parsing-fu in order to do many things, and the commands can be very long and difficult to grasp. You can add aliases for pretty much everything though, and the validation and activation sections below assume the following aliases.

alias tpn.tcr='kubectl get pod -l component=transcrobes -o jsonpath="{.items[0].metadata.name}"'
alias tpn.nlp='kubectl get pod -l component=corenlp -o jsonpath="{.items[0].metadata.name}"'
alias tpn.pgs='kubectl get pod -l app=postgresql -o jsonpath="{.items[0].metadata.name}"'

alias tip.tcr='kubectl get svc -l component=transcrobes -o jsonpath="{.items[0].spec.clusterIP}"'
alias tip.nlp='kubectl get svc -l component=corenlp -o jsonpath="{.items[0].spec.clusterIP}"'
alias tip.pgs='kubectl get svc -l app=postgresql -o jsonpath="{.items[0].spec.clusterIP}"'

These aliases consist of two parts - a prefix tpn for Transcrobes Pod Name or tip for Transcrobes (service) IP, and a suffix for each application - tcr for Transcrobes Core, ank for Ankrobes Server, nlp for Stanford’s CoreNLP and pgs for PostgreSQL.

Put this in a file and source it manually

source whereYouPutIt.sh

Or simply add these to your .bashrc, .bash_aliases, or whatever is sourced when you log on to the machine you are installing on. Don’t forget to manually source that file before proceeding!
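
Bash only expands aliases in interactive shells by default, so the aliases above will not work inside a script. If you want to script the later steps, shell functions are one alternative. The following is a minimal sketch: the names tcr_selector, tpn and tip (taking the application suffix as an argument) are illustrative and not part of Transcrobes, while the label selectors are the ones used in the aliases above.

```shell
# Map an application suffix to the label selector used in the aliases above.
tcr_selector() {
  case "$1" in
    tcr) echo "component=transcrobes" ;;
    nlp) echo "component=corenlp" ;;
    pgs) echo "app=postgresql" ;;
    *)   echo "unknown application: $1" >&2; return 1 ;;
  esac
}

# Print the name of the first pod matching the application suffix.
tpn() {
  tcr_sel=$(tcr_selector "$1") || return 1
  kubectl get pod -l "$tcr_sel" -o jsonpath='{.items[0].metadata.name}'
}

# Print the cluster IP of the first service matching the application suffix.
tip() {
  tcr_sel=$(tcr_selector "$1") || return 1
  kubectl get svc -l "$tcr_sel" -o jsonpath='{.items[0].spec.clusterIP}'
}
```

With these, a command written as $(tpn.tcr) in this document would become "$(tpn tcr)" in a script.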

Setting up the database and adding a user

Transcrobes Core is a Django application, which you need to initialise

kubectl exec $(tpn.tcr) -- ./runmanage.sh migrate

This should give you output similar to the following. There may be additional lines corresponding to updates if you execute this later but as long as it reports success, the process has finished correctly.

Operations to perform:
  Apply all migrations: admin, auth, contenttypes, enrich, sessions
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying admin.0002_logentry_remove_auto_add... OK
  Applying admin.0003_logentry_add_action_flag_choices... OK
  Applying contenttypes.0002_remove_content_type_name... OK
  Applying auth.0002_alter_permission_name_max_length... OK
  Applying auth.0003_alter_user_email_max_length... OK
  Applying auth.0004_alter_user_username_opts... OK
  Applying auth.0005_alter_user_last_login_null... OK
  Applying auth.0006_require_contenttypes_0002... OK
  Applying auth.0007_alter_validators_add_error_messages... OK
  Applying auth.0008_alter_user_username_max_length... OK
  Applying auth.0009_alter_user_last_name_max_length... OK
  Applying enrich.0001_initial... OK
  Applying enrich.0002_auto_20180910_0724... OK
  Applying enrich.0003_bingapitransliteration... OK
  Applying enrich.0004_auto_20181008_0214... OK
  Applying sessions.0001_initial... OK

Each actual end-user of the application needs a user. A helper script for creating the user and initialising various database structures is available in the Transcrobes Core pod. Simply connect with the following

kubectl exec -it $(tpn.tcr) -- sh

And then

./adduser.sh your-username

You can add the user password directly on the command line via the -p or --password switches, or you will be prompted for a password if none is provided.

Activating the release

The Transcrobes Helm chart has default values but they are obviously not usable out of the box, given you have an API token and two (sub-)domains that you need to configure to actually use Transcrobes.

Helm has two ways of overriding the default values, either via command line parameters or in an overrides file. In order to keep the instructions simple, we will only use the overrides file here. Check out the Helm docs for more information.

Download the sample file and modify it according to your actual values

curl https://gitlab.com/transcrobes/charts/raw/master/sample-overrides.yaml > overrides.yaml

The downloaded overrides.yaml looks like this

## Don't uncomment the ingress lines until AFTER you have validated
## you can get certificates from letsencrypt-staging!
transcrobes:
  bingSubscriptionKey: yourAPIToken
  hosts: ['transcrobes.example.com']
  #ingress:
  #  tls:
  #    issuer: letsencrypt-prod

## Only uncomment if you need to modify the incoming port
#nginxingress:
#  controller:
#    service:
#      nodePorts:
#        https: yourPortNumber

Do NOT uncomment the commented lines yet!

This allows you to run the following, which you should do after replacing the values with your own

helm upgrade releaseName transcrobes/transcrobes --install -f overrides.yaml
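
Since the sample file ships with placeholder values, a quick check before running the upgrade can catch a forgotten edit. The following is a minimal sketch, assuming the placeholders are exactly the strings shown in the sample file (yourAPIToken, example.com and yourPortNumber). The function name check_overrides is hypothetical. Lines that are still commented out are ignored.

```shell
# Fail if an uncommented line of the overrides file still contains one of
# the sample placeholders. Offending lines are printed with line numbers.
check_overrides() {
  if grep -nE '^[^#]*(yourAPIToken|example\.com|yourPortNumber)' "$1"; then
    echo "placeholder values still present in $1" >&2
    return 1
  fi
  return 0
}

# Usage:
#   check_overrides overrides.yaml && \
#     helm upgrade releaseName transcrobes/transcrobes --install -f overrides.yaml
```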

NOTE: The following section contains over-simplifications so drastic they aren’t strictly accurate - if you know better then you’ll know what is intended!

By default, the Transcrobes chart makes the applications available on port 32443 of the public (or NATed) IP of the machine because, by default, Kubernetes/microk8s only lets you use ports 30000-32767. It’s possible to make the applications available on the standard ports (80 / 443) but it’s a little dangerous, particularly if you are running other stuff on the machine. If you don’t have control over which ports you get then this can be changed by uncommenting in the example file, after updating with the port you need

#nginxingress:
#  controller:
#    service:
#      nodePorts:
#        https: yourPortNumber

Your admin may be able to map the incoming port on the external firewall to a different port on your machine - just ask them to map to 32443, if they can, to avoid further work. Note that if the port is not in the 30000-32767 range, you will need to add the following

--service-node-port-range=yourPortsStart-yourPortsFinish

- where yourPortsStart and yourPortsFinish correspond to the range of ports you have pointing to your machine - to the end of the following file

/var/snap/microk8s/current/args/kube-apiserver

And restart microk8s

sudo microk8s.stop
sudo microk8s.start
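
The rule above boils down to a range check followed by appending one flag. The following is a minimal sketch, assuming the default port range of 30000-32767 mentioned earlier. The function names are illustrative, and the flag is only appended when the file has no --service-node-port-range line yet.

```shell
# Succeed if the port lies inside the default NodePort range.
port_in_default_range() {
  [ "$1" -ge 30000 ] && [ "$1" -le 32767 ]
}

# Append the port-range flag to a kube-apiserver args file unless one exists.
# $1 = args file, $2 = first port of the range, $3 = last port of the range
ensure_node_port_range() {
  if grep -q -e '^--service-node-port-range=' "$1"; then
    return 0
  fi
  printf '%s\n' "--service-node-port-range=$2-$3" >> "$1"
}

# Usage (in a root shell, followed by the microk8s restart shown above);
# 8443 and 8000-9000 are example values only:
#   port_in_default_range 8443 || \
#     ensure_node_port_range /var/snap/microk8s/current/args/kube-apiserver 8000 9000
```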

You can update the overrides file as often as you wish. Simply run

helm upgrade releaseName transcrobes/transcrobes --install -f overrides.yaml

To apply the changes. Helm is intelligent enough to only update what has changed so, for example, you won’t be restarting any of the applications if you only make a change to one of the proxies.

Validating the release

You should now be able to access the application on the configured port. Execute:

curl -k https://transcrobes.example.com:32443/hello

Which should return

Hello, World!

To validate Transcrobes Core.
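
If you want this check in a script, the response body can be compared directly. The following is a minimal sketch, assuming the endpoint returns exactly the text shown above. The function name check_hello is illustrative.

```shell
# Succeed if the URL answers with exactly "Hello, World!".
# -k skips certificate verification, -s silences the progress meter.
check_hello() {
  [ "$(curl -ks "$1")" = "Hello, World!" ]
}

# Usage:
#   check_hello https://transcrobes.example.com:32443/hello && echo "Transcrobes Core is up"
```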

If you are not going to set up proper SSL certificates then you are done. Follow up in the installing clients section to start using Transcrobes on actual devices.

Setting up Let’s Encrypt certificates

The Transcrobes helm chart provides automated Let’s Encrypt (free) certificate setup and renewal via cert-manager. Somewhat unfortunately for us, Let’s Encrypt will only do automated HTTP validation on port 80, or 443 if you have an existing certificate (which we obviously don’t).

If any of the steps here seem too complicated, you also have the option of manually doing Let’s Encrypt DNS validation and manually provisioning the certificate. You are also not obliged to use a Let’s Encrypt certificate - any valid SSL certificate is OK. We will update these instructions soon with help on setting up a certificate manually.

cert-manager works by checking the validity of the certificates it manages every now and then and when it’s time to renew, it will attempt to update.

If your public IP can have port 80 open and pointing to your machine, then a script is available to temporarily redirect port 80 to the port on which cert-manager is expecting the validation request to arrive. If you have a web server listening on port 80, you will need to deactivate that, or set up cert-manager to listen on specific ports and redirect requests to your Transcrobes domains to that port via the web server. The details of this are out of scope for this document but ask on one of the lists if you need help.

Assuming your machine can be accessed on its public IP on port 80, download the script

curl https://gitlab.com/transcrobes/charts/raw/master/microk8s-port80.sh > microk8s-port80.sh

And set it up to run every few (1-3 is probably a good value) minutes as a root cron job/systemd timer. The script requires socat and jq, which can be installed on Ubuntu with

sudo apt install socat jq

The script assumes that the public (or NATed) IP address is on the eth0 device but you can override this by providing the device as a parameter to the script, for example

bash microk8s-port80.sh enp0s3

The script runs and queries Kubernetes to see whether cert-manager is waiting for a validation request. If it is, it will redirect port 80 to the port that cert-manager is listening on, and then stop redirecting when cert-manager has finished the provisioning or update. If cert-manager is not expecting a validation request, the script will exit (with success) immediately.
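
One way to schedule the script is a root crontab entry. The following is a minimal sketch: the helper port80_cron_line is hypothetical, as is the /root/microk8s-port80.sh location in the usage line, so adjust both to wherever you saved the script.

```shell
# Print a crontab line that runs the redirect script every N minutes.
# $1 = interval in minutes, $2 = absolute path to the script,
# $3 = network device (defaults to eth0, as the script itself does)
port80_cron_line() {
  printf '*/%s * * * * /bin/bash %s %s\n' "$1" "$2" "${3:-eth0}"
}

# Usage (as root), appending to the existing crontab:
#   ( crontab -l 2>/dev/null; port80_cron_line 2 /root/microk8s-port80.sh enp0s3 ) | crontab -
```

Because the script exits immediately when cert-manager is not waiting for a validation request, running it this often should be cheap.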

Once the script is set up (or you are manually running it in a terminal in a loop), you can go ahead and activate the letsencrypt issuers

kubectl apply -f https://gitlab.com/transcrobes/charts/raw/master/transcrobes/utils/letsencrypt-staging-cluster-issuer.yaml
kubectl apply -f https://gitlab.com/transcrobes/charts/raw/master/transcrobes/utils/letsencrypt-prod-cluster-issuer.yaml

After you execute this, at some point over the next few minutes cert-manager will wake up, realise there is a domain it needs to provision/update, and start working. If the port redirect script is running, then port 80 will get redirected until cert-manager has finished. You can check whether a cert-manager issuer is running by executing

kubectl get pods | grep issuer

If that command returns something, then an issuer is working. You know that cert-manager has finished when the above command returns nothing. If it can’t succeed for any reason, it will stay working until it does. Make sure it has actually appeared and then disappeared before assuming it’s finished! The script has pretty detailed logging so check the output of the script to be sure.
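
The "appeared and then disappeared" rule can be turned into a small polling loop. The following is a minimal sketch that uses only the kubectl get pods | grep issuer check from above. The function name wait_for_issuer and the default interval of 10 seconds are illustrative. Note that it has no timeout, so it keeps looping if the issuer never shows up or never finishes.

```shell
# Block until an issuer pod has been seen at least once and is gone again.
# $1 = seconds to sleep between checks (defaults to 10)
wait_for_issuer() {
  wfi_interval=${1:-10}
  wfi_seen=0
  while true; do
    if kubectl get pods | grep -q issuer; then
      wfi_seen=1                      # an issuer is currently working
    elif [ "$wfi_seen" -eq 1 ]; then
      return 0                        # it was there before and has now gone
    fi
    sleep "$wfi_interval"
  done
}

# Usage:
#   wait_for_issuer 15 && echo "issuer finished, now check the script log"
```

This only reports that the issuer pod has gone away, so the output of the port redirect script is still the place to confirm that provisioning actually succeeded.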

Assuming cert-manager finishes successfully, you can proceed to actually get a valid, production certificate. Let’s Encrypt has rather strict limits on its production server, so make sure you have no working issuers for the staging server before proceeding. In order to get production certificates, simply uncomment the ingress section in the overrides.yaml file, like the following

## Don't uncomment the ingress lines until AFTER you have validated you can get certificates from letsencrypt-staging
transcrobes:
  bingSubscriptionKey: yourAPIToken
  hosts: ['transcrobes.example.com']
  ingress:
    tls:
      issuer: letsencrypt-prod

## Only uncomment if you need to modify the incoming port
#nginxingress:
#  controller:
#    service:
#      nodePorts:
#        https: yourPortNumber

And run as before

helm upgrade releaseName transcrobes/transcrobes --install -f overrides.yaml

When the issuer has finished you should check the certificates. Just open a browser and put in (with your domain and port) - https://ankrobes.example.com:32443. If you get Anki Sync Server, your Ankrobes Server is working. Transcrobes can only be accessed with an authenticated user, so check https://your-username:your-password@transcrobes.example.com:32443/hello. It should return Hello, World! with no certificate error. If both those work, congratulations, you have finished!

Now you too can follow up with the installing clients section to start using Transcrobes on actual devices. You will now be able to use the Transcrobes client applications on any internet-connected device.

Problems

Let us know on the lists if you encounter any issues, or you think these instructions aren’t clear or have an error.