# GKE-ENKI-GitLab-agent

GKE-ENKI-GitLab-agent is a project to manage and configure GitLab Kubernetes Agent for the ENKI Google Cloud cluster and to configure all required cluster tools for production, monitoring, and backup.

Published Jan 14, 2026



## Contents
▪️ [Future roadmap](#future-roadmap)
▪️ [Installed cluster components (dependent order)](#installed-cluster-components-dependent-order)
▪️ [CI/CD for automated deployment and maintenance](#cicd-for-automated-deployment-and-maintenance)
▪️ [Some useful kubectl commands](#some-useful-kubectl-commands)
▪️ [Kubernetes cluster configuration](#kubernetes-cluster-configuration)
▪️ [GitLab Kubernetes agent installation](#gitlab-kubernetes-agent-installation)
▪️ [Tearing down and reinstalling the agent](#tearing-down-and-reinstalling-the-agent)

## Future roadmap
- Fully integrate GitLab Kubernetes Agent for GitOps as an alternative to using GitLab Runner and Helm.
  > The agent has to mature to handle sequenced YAML deploys, and it must operate with cluster-wide admin privileges to make this integration possible.
- Consider adding the following:
  - User billing and tracking (using Kubecost)
  - Runbooks for JupyterLab (notebook-based) GitOps using Rubix/Nurtch
  - CloudWatch integration
  - Elastic Container Service
- Investigate the Google Cloud Run serverless platform.
  > Port the Knative Geobarometer and MELTS web services to remove any dependence on the Kubernetes cluster.

## Installed cluster components (dependent order)

1. **GitLab Kubernetes Agent**
   > Entity that attaches a GKE cluster to this repository (configuration notes below).
1. **GitLab Runner**
   > GitLab Runner allows CI jobs to run on the cluster in privileged mode, which lets us execute *kubectl* and *helm* commands to perform GitOps tasks using YAML files stored in this repository. In effect, the runner gives us, through GitLab CI, the functionality of Google Cloud Shell or a desktop gcloud/kubectl connection.
1. **Kubernetes NGINX Ingress Controller**
   > The ingress controller exposes service endpoints on external ports. There are multiple ingress controllers operating on the cluster. This one is used to expose the Grafana and Kasten K10 endpoints. Another is built into JupyterHub to expose that endpoint.
1. **Cert Manager**
   > Used by the ingress controller to acquire and attach TLS certificates to external ingress endpoints so that ports can support *https* and encrypted traffic.
1. **Prometheus** and **Grafana** (exposed at https://cluster.enki-portal.org/)
   > The Kube Prometheus stack (with Grafana) monitors the cluster and exposes metrics at an external endpoint so that cluster performance can be assessed.
1. **Google Cloud Storage**
   > Storage independent of the Kubernetes cluster that is used for backups of cluster resources. The backup service (Kasten K10) is capable of restoring and migrating the cluster using this independent storage.
1. **Kasten K10** (exposed at https://k10.enki-portal.org/k10/)
   > Backup, restoration, and migration tool for Kubernetes.
1. **JupyterHub**
   > Service that hosts the ENKI server. JupyterHub exposes single-user pods that host the ThermoEngine Docker container image with a JupyterLab user interface. It also allocates and maintains access to user-based persistent storage.
   1. Testing installation
      > This installation is for testing options and configuring possible upgrades to the production server. For cost reasons, it is normally not running.
   1. Production installation
      > This installation is the production server exposed at https://server.enki-portal.org/.
1. **Knative** web services
   > Service to expose stateless, scalable web services. These services should probably be moved outside the cluster and exposed using the Google Cloud Run serverless platform. See *Future roadmap* above.
1. **MySQL** (exposed at http://mysql.enki-portal.org:3306/)
   > Database server that currently holds the LEPR/TraceDs as well as some smaller databases (Stixrude, Berman, Inforex, etc.) that are used by cluster apps.

## CI/CD for automated deployment and maintenance
The *.gitlab-ci.yml* YAML file performs a number of functions:
- Deploys manifests using GitLab Kubernetes Agent to perform GitOps tasks
- Runs *helm* and *kubectl* jobs on the cluster to perform GitOps tasks
- Functions as the downstream pipeline for related projects that generate content related to the cluster (See the GitLab project https://gitlab.com/ENKI-portal/jupyterhub_custom)

## Some useful kubectl commands
- Commands for managing namespaces and their resources:
```
kubectl create ns gitlab-runner
kubectl delete all --all -n {namespace}
```
- Get GitLab usernames associated with persistent storage volumes:
```
kubectl --namespace jhub describe persistentvolumeclaims | grep "hub.jupyter.org/username"
```
- Restart the hub on the cluster from Google Cloud Shell so that updates to ENKI-portal/jupyterhub_custom (such as an amended login page) take effect:
```
helm upgrade --cleanup-on-fail jhub jupyterhub/jupyterhub --version=1.1.3 --namespace jhub --reuse-values
```
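The username lookup above prints one annotation line per claim. As a minimal sketch, the helper below reduces that output to a sorted, de-duplicated list of usernames. The function name `extract_usernames` is hypothetical, and the sketch assumes each annotation appears as `hub.jupyter.org/username: <name>` in the `describe` output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl describe persistentvolumeclaims` output
# on stdin and prints each distinct username once, sorted.
# Assumes annotation lines look like "hub.jupyter.org/username: <name>".
extract_usernames() {
  grep 'hub.jupyter.org/username' \
    | sed -E 's/.*hub\.jupyter\.org\/username[:=][[:space:]]*//' \
    | sort -u
}

# Usage against the cluster (not run here):
#   kubectl --namespace jhub describe persistentvolumeclaims | extract_usernames
```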

## Kubernetes cluster configuration
The following Google Cloud setup instructions are from the **Zero to JupyterHub** document https://zero-to-jupyterhub.readthedocs.io/en/latest/kubernetes/google/step-zero-gcp.html, as found in October 2021.
1. Using Google Cloud Shell, install **kubectl** and **helm** using **gcloud** after enabling the Kubernetes Engine API.
1. Create a managed Kubernetes cluster with a default node pool:
```
gcloud container clusters create \
  --machine-type n1-standard-2 \
  --enable-autoscaling \
  --max-nodes=6 \
  --min-nodes=2 \
  --zone <compute zone from the list linked below> \
  --cluster-version latest \
  <CLUSTERNAME>
```
- *\<CLUSTERNAME\>* is **enkiserver**
- *\<compute zone from the list linked below\>* is **us-west1-a**
1. Elevate the user Google Cloud account for administrative functions:
```
kubectl create clusterrolebinding cluster-admin-binding \
  --clusterrole=cluster-admin \
  --user=<GOOGLE-EMAIL-ACCOUNT>
```
- *\<GOOGLE-EMAIL-ACCOUNT\>* is the *email address* of the Google Cloud account owner
1. Create a node pool for users:
```
# --zone must name the zone of the cluster created above.
gcloud beta container node-pools create user-pool \
  --machine-type n1-standard-2 \
  --num-nodes 0 \
  --enable-autoscaling \
  --min-nodes 0 \
  --max-nodes 6 \
  --node-labels hub.jupyter.org/node-purpose=user \
  --node-taints hub.jupyter.org_dedicated=user:NoSchedule \
  --zone <compute zone from the list linked below> \
  --cluster <CLUSTERNAME>
```
After you complete these steps, two node pools are up and running. The default node pool is used to run cluster-wide apps, while the tainted user node pool is used to launch nodes for single-user Jupyter pods. Six nodes in the user pool should be able to accommodate about 100 users doing small-scale ENKI-related modeling.
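The sizing estimate above can be sanity-checked with a little arithmetic. This sketch assumes the published n1-standard-2 shape of 2 vCPUs and 7.5 GB of memory per node (taken here as 7680 MiB) and ignores what the system pods consume, so the real per-user share is somewhat smaller. The function name `per_user_share` is hypothetical.

```shell
#!/usr/bin/env bash
# Rough per-user share when USERS are spread evenly over NODES.
# Assumption: each n1-standard-2 node offers 2 vCPUs (2000 millicores) and
# 7.5 GB (7680 MiB) of memory; overhead from system pods is ignored.
per_user_share() {
  local users=$1 nodes=$2
  local node_mcpu=2000 node_mib=7680
  # Ceiling division: number of users on the fullest node.
  local per_node=$(( (users + nodes - 1) / nodes ))
  echo "users_per_node=${per_node}"
  echo "mcpu_per_user=$(( node_mcpu / per_node ))"
  echo "mib_per_user=$(( node_mib / per_node ))"
}

per_user_share 100 6
# prints:
#   users_per_node=17
#   mcpu_per_user=117
#   mib_per_user=451
```

At 100 users on six nodes, each user gets on the order of a tenth of a vCPU and under half a gigabyte of memory, which is consistent with the "small-scale" qualifier above.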

## GitLab Kubernetes Agent installation
The following instructions are from the GitLab document https://docs.gitlab.com/ee/user/clusters/agent/#set-up-the-kubernetes-agent-server, as found in October 2021.
1. Create a config.yaml file in the repository at *.gitlab/agents/primary-agent* with the contents:
```
gitops:
  manifest_projects:
  - id: "enki-portal/gke-enki-gitlab-agent"
    paths:
    - glob: 'generated-manifests/**/*.{yaml,yml,json}'
    inventory_policy: adopt_if_no_inventory
```
- The *ID* is the repository name that contains the manifest files (this repository).
- The *glob* is altered from the default suggestion to look only at manifest files (YAML and JSON) in the *generated-manifests* folder and its subfolders.
- The *inventory_policy* is changed from the default suggestion to allow the agent to inherit the management of applications that are already running on the cluster when their YAML manifests are added to the *generated-manifests* file hierarchy.

Multiple manifest projects can be defined; future plans will allow these to be *private* repositories.

Currently the agent repository must be public; future plans will allow the agent to be associated with a *group*.
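
To make the effect of the narrowed glob concrete, here is a small sketch that approximates which repository paths it selects. It uses a shell `case` pattern, in which `*` also matches `/`, as a stand-in for `**`; the agent's own glob matcher is not used, so treat this as an illustration only. The function name `is_manifest_path` and the sample paths are hypothetical.

```shell
#!/usr/bin/env bash
# Approximates the glob 'generated-manifests/**/*.{yaml,yml,json}'.
# In a case pattern, * matches any characters including '/', so one pattern
# per extension covers files at any depth below generated-manifests/.
is_manifest_path() {
  case "$1" in
    generated-manifests/*.yaml|generated-manifests/*.yml|generated-manifests/*.json)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Hypothetical sample paths, two inside the folder and two that should be skipped.
for path in \
  generated-manifests/grafana/ingress.yaml \
  generated-manifests/k10.json \
  charts/values.yaml \
  generated-manifests/README.md
do
  if is_manifest_path "$path"; then
    echo "matched: $path"
  else
    echo "ignored: $path"
  fi
done
```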
1. Create the agent in GitLab (*Infrastructure > Kubernetes clusters*) and generate a *secret token*. Assign this token to a pipeline environment variable (*Settings* > *CI/CD* > *Variables*) with the name *GITLAB_AGENT_TOKEN*. Make sure that the value is *protected* and *masked* in order to keep it hidden in pipeline logs.
1. In Google Cloud Shell, execute the following to create a namespace for the agent:
```
kubectl create ns gitlab-kubernetes-agent
```
Then install the agent, with the appropriate token value substituted for *$(GITLAB_AGENT_TOKEN)*:
```
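# Replace $(GITLAB_AGENT_TOKEN) with the actual token value before running;
# left as written, the shell treats $(...) as command substitution.
docker run --pull=always --rm \
    registry.gitlab.com/gitlab-org/cluster-integration/gitlab-agent/cli:stable generate \
    --agent-token=$(GITLAB_AGENT_TOKEN) \
    --kas-address=wss://kas.gitlab.com \
    --agent-version stable \
    --namespace gitlab-kubernetes-agent | kubectl apply -f -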
```

1. Upgrade the GitLab agent service account to the cluster-admin role (so that it can create *secrets*, *pods*, *config maps*, etc. in arbitrary cluster *namespaces*). First, list the agent's existing bindings in Google Cloud Shell:
```
kubectl get rolebindings,clusterrolebindings --all-namespaces \
    -o custom-columns='KIND:kind,NAMESPACE:metadata.namespace,NAME:metadata.name,SERVICE_ACCOUNTS:subjects[?(@.kind=="ServiceAccount")].name' | grep gitlab-agent
```
Note that this critical step is missing from the GitLab documentation. The command gives the output:
```
ClusterRoleBinding   <none>   cilium-alert-read               gitlab-agent
ClusterRoleBinding   <none>   gitlab-agent-gitops-read-all    gitlab-agent
ClusterRoleBinding   <none>   gitlab-agent-gitops-write-all   gitlab-agent
ClusterRoleBinding   <none>   gitlab-agent-read-binding       gitlab-agent
ClusterRoleBinding   <none>   gitlab-agent-write-binding      gitlab-agent
```
1. Apply the binding with the command:
```
kubectl create clusterrolebinding gitlab-agent-cluster-admin-binding --clusterrole=cluster-admin --serviceaccount=default:gitlab-agent
kubectl get clusterrolebinding | grep gitlab-agent
```
The second command gives output such as the following:
```
gitlab-agent-cluster-admin-binding   ClusterRole/cluster-admin                   12s
gitlab-agent-gitops-read-all         ClusterRole/gitlab-agent-gitops-read-all    162d
gitlab-agent-gitops-write-all        ClusterRole/gitlab-agent-gitops-write-all   162d
gitlab-agent-read-binding            ClusterRole/gitlab-agent-read               162d
gitlab-agent-write-binding           ClusterRole/gitlab-agent-write              162d
```
The agent is now installed.
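
As a quick check on the last step, the sketch below scans `kubectl get clusterrolebinding` output for the binding created above and confirms that it grants `ClusterRole/cluster-admin`. The function name `has_admin_binding` is hypothetical, and the sketch assumes the column layout (name, role, age) shown in the sample output.

```shell
#!/usr/bin/env bash
# Reads `kubectl get clusterrolebinding` output on stdin and succeeds only if
# gitlab-agent-cluster-admin-binding is bound to ClusterRole/cluster-admin.
# Assumes the first two columns are the binding name and the role.
has_admin_binding() {
  awk '$1 == "gitlab-agent-cluster-admin-binding" && $2 == "ClusterRole/cluster-admin" { found = 1 }
       END { exit !found }'
}

# Usage against the cluster (not run here):
#   kubectl get clusterrolebinding | has_admin_binding && echo "binding present"
```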

## Tearing down and reinstalling the agent
This process is tricky and not automated by GitLab. Occasionally, reinstalling the agent is necessary, as the agent does not tolerate errors in YAML manifests very well and can enter a condition in which it is unresponsive.

Follow this procedure in Google Cloud Shell:
1. Delete all resources associated with the agent in its namespace:
```
kubectl delete all --all -n gitlab-kubernetes-agent
```
1. Delete the namespace:
```
kubectl delete ns gitlab-kubernetes-agent
```
1. Delete the inventory file in the default namespace that the agent uses to track managed installations (this resource is not automatically removed with the agent's namespace resources):
   1. Go to the Google Cloud Platform console, and choose *Kubernetes Engine* > *Configuration* from the upper left menu.
   1. In the *default* namespace, delete the *Config Map* named *inventory-nnn*, where *nnn* is a string of numbers and dashes.
   1. In the *default* namespace, delete the *secret* named *gitlab-agent-token-nnn*, where *nnn* is some arbitrary hexadecimal number.
1. Reinstall the agent following the above instructions, using the same authorization token.
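
The console steps for the leftover inventory and token objects could also be carried out with kubectl. The sketch below only selects the candidate names from `kubectl get configmap,secret -n default -o name` output so that they can be reviewed before anything is deleted. It assumes the `inventory-` and `gitlab-agent-token-` name prefixes described above, and the function name `agent_leftovers` is hypothetical.

```shell
#!/usr/bin/env bash
# Reads `kubectl get configmap,secret -n default -o name` output on stdin
# (one "kind/name" entry per line) and prints only the agent leftovers.
agent_leftovers() {
  grep -E '^(configmap/inventory-|secret/gitlab-agent-token-)'
}

# Usage against the cluster (not run here). Review the list first, then delete:
#   kubectl get configmap,secret -n default -o name | agent_leftovers
#   kubectl get configmap,secret -n default -o name | agent_leftovers \
#     | xargs kubectl delete -n default
```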
