Developing with Squonk2 (locally)

Instructions for deploying the Squonk2 application suite, consisting of the Account Server, the Data Manager and its Operators, to a local Kubernetes cluster (Minikube, Docker Desktop or Rancher Desktop). Instructions for installing a local Kubernetes cluster can be found in the Data Manager Wiki’s Development Policies section.

A working knowledge of the following will prove extremely beneficial: -

  • Kubernetes
  • Ansible, specifically its role and playbook structure

Background

Squonk2 consists of: -

  • An Account Server and its API (the AS) to manage
    • Organisations and Units
    • Product Subscriptions and associated Charges
    • Assets
    • Event Streams (via a separate Event Streaming Service deployment)
  • The Data Manager Service and its API (the DM)
    • A Job operator to run DM jobs
    • A Jupyter operator to run Jupyter notebook servers
  • The Graphical User Interface (the UI)

Squonk2 also relies on an infrastructure deployment to provide a database server (PostgreSQL), a Keycloak authentication service and a messaging service (RabbitMQ), all of which are shared between the AS and DM.

Here we discuss the deployment of a local infrastructure (one that relies on Keycloak in our AWS cluster), and the deployment of Squonk2.

Configuration of the application components is achieved using Ansible playbooks. Some deployments, due to their complexity, have a large number of playbook variables, but a basic set is provided in each of the playbook repositories in a parameters-local.yaml file. You are encouraged to review each application’s variables, which are accompanied by in-line documentation, in the defaults/main.yaml and vars/main.yaml files in each repository’s roles directory. We put common user-definable variables in defaults and less common variables in vars.
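
For example, rather than editing a role’s files you can override an individual variable on the command line when running a playbook (a sketch; dm_image_tag is a hypothetical variable name, so check the role’s defaults/main.yaml for the real ones): -

# override a single role variable in addition to the local parameter file
ansible-playbook site.yaml -e @parameters-local.yaml -e dm_image_tag=4.2.0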

This guide covers the installation of a Squonk2 environment, along with the Kubernetes namespaces, application components and storage volumes that result.

Preparation

You will need…

  1. A Linux/macOS development machine (desktop/laptop) with at least 16GiB of RAM and around 10-20GiB of disk.
  2. GitHub and GitLab accounts
  3. The git client
  4. Python 3 (ideally 3.12)
  5. Poetry v1.8
  6. A kubectl compatible with your cluster
  7. A Kubernetes cluster (like the one described in our Docker Desktop or minikube notes). It must have: -
    • An nginx ingress controller
    • Labelled nodes

Note: Not all our repositories are public, so you may need to be given access to them. Those on GitLab are typically private.

The container images we deploy here are built for both ARM and Intel architectures and should therefore run natively on Intel or ARM (including Apple silicon) machines.

Clone all the playbook repositories to a suitable directory on your development machine: -

git clone https://github.com/informaticsmatters/ansible-infrastructure.git
git clone https://gitlab.com/informaticsmatters/squonk2-account-server-ansible.git
git clone https://gitlab.com/informaticsmatters/squonk2-data-manager-ansible.git
git clone https://github.com/informaticsmatters/squonk2-data-manager-jupyter-operator.git
git clone https://github.com/informaticsmatters/squonk2-data-manager-job-operator.git
git clone https://github.com/informaticsmatters/squonk2-data-manager-ui-ansible.git
git clone https://github.com/informaticsmatters/squonk2-fastapi-ws-event-stream.git

Check out the corresponding branch for each repository: -

  • ansible-infrastructure: master
  • squonk2-account-server-ansible: master
  • squonk2-data-manager-ansible: master
  • squonk2-data-manager-jupyter-operator: main
  • squonk2-data-manager-job-operator: main
  • squonk2-data-manager-ui-ansible: master
  • squonk2-fastapi-ws-event-stream: main
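
If the clones sit side-by-side, a small shell loop can do the checkouts for you (a sketch, assuming you are in the directory containing the clones): -

# repositories whose branch is master
for repo in ansible-infrastructure squonk2-account-server-ansible squonk2-data-manager-ansible squonk2-data-manager-ui-ansible; do
  git -C "$repo" checkout master
done
# repositories whose branch is main
for repo in squonk2-data-manager-jupyter-operator squonk2-data-manager-job-operator squonk2-fastapi-ws-event-stream; do
  git -C "$repo" checkout main
done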

And, finally, clone the DM repository (which contains a number of markdown documents): -

git clone https://gitlab.com/informaticsmatters/squonk2-data-manager.git

1. Define KUBECONFIG and hostnames

Set the KUBECONFIG environment variable to point to the Kubernetes config file for your cluster. This is often created automatically by the cluster application and is typically written to ~/.kube/config.

You might want to keep a separate file for each cluster, starting with a copy of the default ~/.kube/config.

To simplify ingress routing we’ll rely on some hostnames that you can set in your /etc/hosts file. A Docker Desktop cluster can be found on 127.0.0.1, but minikube provides its own IP address, which you can find by running minikube ip. Whatever your cluster’s IP address is, set appropriate mappings in your /etc/hosts file…

127.0.0.1 rabbitmq
127.0.0.1 squonk2
127.0.0.1 account-server-ess
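
If you’re using minikube, use its IP address instead of 127.0.0.1. A sketch that appends a suitable entry (it edits /etc/hosts, so review it before running): -

# map the cluster's IP address to the hostnames used by the ingress rules
echo "$(minikube ip) rabbitmq squonk2 account-server-ess" | sudo tee -a /etc/hosts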

Finally, ensure that you have set up the KUBECONFIG environment variable, i.e. that you have set something like: -

export KUBECONFIG=~/.kube/config

2. Enter the playbook environment (ansible-infrastructure)

You can use one environment to run all of the deployment playbooks, and the one in the ansible-infrastructure repository is the best to start with. It’s managed by Poetry. From the root of your ansible-infrastructure clone, enter the environment using Poetry’s shell and install the dependencies: -

poetry shell
poetry install --sync

You will run all the playbooks from this environment.

3. Deploy the Infrastructure (a Database and RabbitMQ)

From the root of the ansible-infrastructure repository run the following: -

ansible-playbook site.yaml -e @parameters-local.yaml

This will install the core infrastructure components (a database and RabbitMQ) into the im-infra Namespace. Wait, then check that RabbitMQ is available at http://rabbitmq. The connection will be insecure, but click through the warnings in your browser to see the RabbitMQ login page.

If you actually want to log in, the RabbitMQ credentials are stored in the im-rabbitmq Secret in the im-infra Namespace, where you will find a user (admin) and a password.
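
You can inspect the Secret with kubectl and base64-decode its values (a sketch; the key names here are an assumption, so list them first): -

# list the Secret's keys (the values are base64-encoded)
kubectl -n im-infra get secret im-rabbitmq -o yaml
# decode a value, e.g. the password (the key name may differ)
kubectl -n im-infra get secret im-rabbitmq -o jsonpath='{.data.password}' | base64 -d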

IMPORTANTLY, if you CANNOT access the RabbitMQ console at http://rabbitmq then you should stop here and understand why. If you cannot access RabbitMQ it is extremely unlikely that you will be able to access any of the other services.

4. Deploy the Account Server (AS)

Move to the Account Server (AS) Ansible project…

cd ../squonk2-account-server-ansible/

Here you will need to ensure that you have one or two environment variables set up. You might want to put these in your ~/.bash_profile or similar file to avoid having to set them every time you want to run the playbooks. You’ll find the token under Notes in KeePass, at GitLab -> account-server -> GitLab Registry Deploy Token - (squonk2): -

export IM_DEV_AS_PULL_SECRET=??????

The parameters-local.yaml file should define a default/sensible AS image tag. If you want to change it then simply define your own value, e.g. export IM_DEV_AS_IMAGE_TAG=4.3.0

Then run the playbook (using the same command as earlier)…

ansible-playbook site.yaml -e @parameters-local.yaml

Check that the AS pods are running and that you can access the AS Swagger web-page at https://squonk2/account-server-api/api/

If you’re logging in, the client ID for the AS is "account-server-api-local"
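
A quick pod check might look like this (a sketch, assuming the AS is deployed to the im-account-server Namespace, the one consulted again in step 9): -

kubectl -n im-account-server get pods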

5. Deploy the Data Manager (DM)

Move to the Data Manager (DM) Ansible project…

cd ../squonk2-data-manager-ansible/

You’ll now need the Data Manager GitLab deployment token. It’ll be in KeePass, at GitLab -> data-manager -> GitLab Registry Deploy Token - (squonk2): -

export IM_DEV_DM_PULL_SECRET=??????

The parameters-local.yaml file should define a default/sensible DM image tag. If you want to change it then simply define your own value, e.g. export IM_DEV_DM_IMAGE_TAG=4.2.0

Now, just as before, run the playbook…

ansible-playbook site.yaml -e @parameters-local.yaml

Check that the DM pods are running and that you can access the DM Swagger web-page at https://squonk2/data-manager-api/api/

If you’re logging in, the client ID for the DM is "data-manager-api-local"
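
As with the AS, a quick pod check can confirm things are running (a sketch; the DM’s Namespace name comes from the playbook defaults, so we simply filter across all namespaces): -

kubectl get pods --all-namespaces | grep data-manager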

This local installation of the DM includes the im-test collection of Jobs, which are automatically installed as the DM starts.

6. Deploy the DM Job Operator

Move to the Job Operator project…

cd ../squonk2-data-manager-job-operator/

The parameters-local.yaml file should define a default/sensible Job Operator image tag. If you want to change it then simply define your own value, e.g. export IM_DEV_JOBO_IMAGE_TAG=31.1.0

Run the playbook…

ansible-playbook site.yaml -e @parameters-local.yaml
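
The operator runs quietly in the background, so to confirm it has started you can look for its pods (a sketch; the Namespace and pod names depend on the playbook defaults): -

kubectl get pods --all-namespaces | grep job-operator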

7. Deploy the DM Jupyter Operator

Move to the Jupyter Operator project…

cd ../squonk2-data-manager-jupyter-operator/

The parameters-local.yaml file should define a default/sensible Jupyter Operator image tag. If you want to change it then simply define your own value, e.g. export IM_DEV_JUPO_IMAGE_TAG=31.1.0

Run the playbook…

ansible-playbook site.yaml -e @parameters-local.yaml

8. Deploy the Data Manager UI

Move to the root of your clone of the UI repository: -

cd ../squonk2-data-manager-ui-ansible/

You will need the UI’s Keycloak client secret, which you should use as the value for the following environment variable: -

export IM_DEV_UI_CLIENT_SECRET=?????

The parameters-local.yaml file should define a default/sensible UI image tag. If you want to change it then simply define your own value, e.g. export IM_DEV_UI_IMAGE_TAG=5.4.0

Then, like earlier, run: -

ansible-playbook site.yaml -e @parameters-local.yaml

Check the UI pods and the web-page at https://squonk2/data-manager-ui/

9. Deploying the AS Event Streaming Service (FastAPI/WS)

Move to the root of your clone of the Event Streaming FastAPI WebSocket repository: -

cd ../squonk2-fastapi-ws-event-stream/

Before deploying the Event Streaming Service you will need the RabbitMQ Event Stream user credentials. If you are running an AS that supports the event-stream service, a RabbitMQ user will have been created. Consult the rabbitmq Secret in the im-account-server Namespace and look for the vhost_es_user_password property; the username will be eventstream.
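
You can extract and decode the password with kubectl (the Secret and key names are those given above): -

# decode the Event Stream user's password (use base64 -D on older macOS)
kubectl -n im-account-server get secret rabbitmq -o jsonpath='{.data.vhost_es_user_password}' | base64 -d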

When you have these, set the corresponding variables before running the playbook: -

export IM_DEV_ESS_STREAM_USER=eventstream
export IM_DEV_ESS_STREAM_PASSWORD=password123

The parameters-local.yaml file should define a default/sensible ESS image tag. If you want to change it then simply define your own value, e.g. export IM_DEV_ESS_IMAGE_TAG=2.0.0

The playbooks are in the ansible directory of the streaming service repo: -

cd ansible

Then, like earlier, run: -

ansible-playbook site.yaml -e @parameters-local.yaml

10. Loading additional Jobs and workflows into the DM

The DM is installed with the im-test collection of Jobs, but you can also load more Job definitions if you have a URL for their manifests. A list of supported Job manifests and their URLs can be found in the DM Wiki’s Day 1 Jobs article.

With an admin account you can load your chosen selection of Jobs using the DM Swagger /admin/job-manifest PUT endpoint.

Similarly, you can load workflows into the DM using the DM Swagger /workflow POST endpoint; the DM repository contains some example workflows that you can load.

  • Give the workflow a “name” if you don’t like the default "My Workflow"
  • In the “definition_file” field click Choose File and select isolated-linear-im-test.yaml from your clone of the squonk2-data-manager repository’s workflow-definitions directory. This workflow uses the built-in set of Jobs, so there’s no need to load any more in order to use it.
  • Leave the “scope” set to "GLOBAL"
  • Click "Execute"

In response you should get something like this: -

{
  "id": "workflow-9d277044-c46d-4ff1-8a0e-9bff6f35a98a",
  "validated": true
}

The “id” will be unique for you. Importantly, the validated property should be true.

Once you have loaded the workflow you should be able to run it.


Resetting the kubernetes cluster

You can reset your local cluster using Rancher Desktop: -

  • From Rancher Desktop select Troubleshooting -> Reset Kubernetes