Deploying PostgreSQL in Kubernetes with ArgoCD using the Apps of Apps Pattern

k8s-postgres

PostgreSQL, also known as Postgres, is a widely popular open-source relational database system known for its robustness, advanced features, and strong community support. This article will guide you through deploying PostgreSQL in a Kubernetes cluster using ArgoCD with the Apps of Apps pattern.

Why PostgreSQL is Popular

PostgreSQL's popularity stems from several key features:

  1. Open Source: Free and open-source, allowing usage, modification, and distribution without licensing costs.
  2. Advanced Features: Supports advanced data types, full ACID compliance, complex queries, JSON support, full-text search, and custom data types.
  3. Performance: Optimized for high performance with large datasets, supporting concurrent transactions efficiently.
  4. Extensibility: Allows users to define custom functions and operators, and supports a wide range of extensions.
  5. Community Support: A large, active community provides extensive documentation, plugins, and third-party tools.

Prerequisites

Before deploying PostgreSQL with ArgoCD, ensure you have the following:

  • A Kubernetes cluster running (local or cloud-based).
  • kubectl command-line tool configured to interact with your cluster.
  • ArgoCD installed and configured on your Kubernetes cluster.
  • A Git repository to store your Kubernetes manifests.

Step 1: Set Up ArgoCD and the Git Repository

Ensure that ArgoCD is installed and set up correctly on your Kubernetes cluster. You should also have a Git repository where you will store your Kubernetes manifests.

Step 2: Create the Root Application

The Apps of Apps pattern in ArgoCD involves having a root application that manages other applications. Let's create a root application manifest.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: registry
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: 'https://github.com/mrpbennett/home-ops.git'
    path: kubernetes/registry
    targetRevision: HEAD
    directory:
      recurse: true
  destination:
    server: 'https://kubernetes.default.svc'
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - Validate=true
      - CreateNamespace=false
    retry:
      limit: 5
      backoff:
        duration: 5s
        maxDuration: 5m0s
        factor: 2

Now that we have the root application, we can create our Postgres application. The root application watches a directory in the repository, where we can place further Application manifests like the Postgres one below.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: &app postgres-db
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: 'https://github.com/mrpbennett/home-ops.git'
    path: kubernetes/apps/postgres-db
    targetRevision: HEAD
    directory:
      recurse: true
  destination:
    namespace: *app
    server: 'https://kubernetes.default.svc'
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        maxDuration: 5m0s
        factor: 2
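Taking the two path values above together, the Git repository layout the root application expects would look roughly like this (the individual file names are my assumption):

```text
kubernetes/
├── registry/                  # watched by the root "registry" Application
│   └── postgres-db.yaml       # the postgres-db Application manifest above
└── apps/
    └── postgres-db/           # watched by the postgres-db Application
        └── ...                # the PostgreSQL manifests created below
```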

Creating the manifest files for Postgres

Storage

PersistentVolume (PV) and PersistentVolumeClaim (PVC) are Kubernetes resources that provide and claim persistent storage in a cluster. A PersistentVolume provides storage resources in the cluster, while a PersistentVolumeClaim allows pods to request specific storage resources.

Persistent Volume

apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-volume
  labels:
    type: local
    app: postgres
spec:
  storageClassName: manual
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: /mnt/storage/postgresql

Here I have set the accessMode to ReadWriteMany, allowing multiple Pods to read and write to the volume simultaneously, since we will be running more than one replica. Note that with a hostPath volume this only really works when all Pods land on the same node; hostPath storage is not shared across nodes.

Persistent Volume Claim

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-volume-claim
  labels:
    app: postgres
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Gi

Configuration

Config Map

In Kubernetes, a ConfigMap is an API object that stores non-confidential configuration data as key-value pairs for Pods and containers to consume. ConfigMaps are instrumental in separating your PostgreSQL configuration from your application code, which improves maintainability and flexibility.

Let's create a basic ConfigMap manifest to store PostgreSQL configuration.

apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-configmap
  labels:
    app: postgres
data:
  postgresql.conf: |
    listen_addresses = '*'
    max_connections = 100
    shared_buffers = 128MB

  pg_hba.conf: |
    ...

Secret

Kubernetes Secrets are essential for managing sensitive data such as passwords and should be used to secure your PostgreSQL deployment.

For PostgreSQL, this usually involves the database user password. Encode your password using Base64. For example: echo -n 'yourpassword' | base64
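For instance, encoding the password used in the Secret below (and decoding it again to verify) looks like this:

```shell
# Encode the plain-text password for the Secret's data field
echo -n 'password' | base64
# prints: cGFzc3dvcmQ=

# Decode it again to double-check
echo -n 'cGFzc3dvcmQ=' | base64 -d
# prints: password
```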

Here I have commented out POSTGRES_USER and POSTGRES_DB as I will use the defaults.

apiVersion: v1
kind: Secret
metadata:
  name: postgres-secret
  namespace: postgres-db
type: Opaque
data:
  # POSTGRES_USER: xxx
  POSTGRES_PASSWORD: cGFzc3dvcmQ=
  # POSTGRES_DB: xxx

Deployment

Creating a PostgreSQL deployment in Kubernetes involves defining a Deployment manifest to orchestrate the PostgreSQL pods.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: "postgres:16"
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 5432
          env:
            - name: PGDATA
              value: /etc/postgresql/main

            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: POSTGRES_PASSWORD

          volumeMounts:
            - mountPath: /etc/postgresql
              name: postgresdata
      volumes:
        - name: postgresdata
          persistentVolumeClaim:
            claimName: postgres-volume-claim

        - name: postgres-config-vol
          configMap:
            name: postgres-configmap
            items:
              - key: postgresql.conf
                path: postgresql.conf
              - key: pg_hba.conf
                path: pg_hba.conf

Here you can see we have hooked up our ConfigMap and Secret in the Deployment. Note that the ConfigMap is only declared as a volume; for PostgreSQL to actually read postgresql.conf from it, you would also need a matching volumeMount and to point the server at that file.

env:
  ...
  - name: POSTGRES_PASSWORD
    valueFrom:
      secretKeyRef:
        name: postgres-secret
        key: POSTGRES_PASSWORD
  ...
volumes:
  ...

  - name: postgres-config-vol
    configMap:
      name: postgres-configmap
      items:
        - key: postgresql.conf
          path: postgresql.conf

IMPORTANT: I kept getting an error in my Postgres pod:

initdb: error: directory "/var/lib/postgresql/data" exists but is not empty
initdb: hint: If you want to create a new database system, either remove or empty the directory "/var/lib/postgresql/data" or run initdb with an argument other than "/var/lib/postgresql/data".

This was because I had set the mount point path and the PGDATA path to the same directory. So I amended the deployment with the following:

env:
  - name: PGDATA
    value: /var/lib/postgresql/data/pgdata

This resolved the issue and deployed without a hitch.

The Deployment references the PersistentVolumeClaim named "postgres-volume-claim" which we created earlier. This claim provides persistent storage to the PostgreSQL container so that data is retained across Pod restarts or rescheduling.

Service

As I am running 3 replicas and kube-vip, I have chosen type LoadBalancer for my Service, which gives me an external IP from the cluster; the IP is one of the internal IPs on my network. A Service defines a logical set of Pods, letting other Pods in the cluster communicate with them without needing to know their individual IP addresses.

apiVersion: v1
kind: Service
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  type: LoadBalancer
  ports:
    - port: 5432
  selector:
    app: postgres

Once the Service is created, other applications or services within the Kubernetes cluster can communicate with the PostgreSQL database using the postgres Service name and port 5432 as the entry point. You can see how the LoadBalancer has provided an internal IP on my network that allows me to connect to it from my local machine.
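For example, another application running in the cluster could reach the database through the Service's DNS name. The connection string below is just an illustrative sketch; the user, password, and database name are assumptions:

```yaml
env:
  # <service>.<namespace>.svc.cluster.local resolves to the Service inside the cluster
  - name: DATABASE_URL
    value: postgresql://postgres:password@postgres.postgres-db.svc.cluster.local:5432/postgres
```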

All the above manifests should be within their own directory like so:

📁
├── config-map.yaml
├── deployment.yaml
├── ingress.yaml
├── namespace.yaml
├── persistent-vol-claim.yaml
├── persistent-vol.yaml
├── secret.yaml
└── service.yaml

Now all that is left to do is commit the changes to your repo and let ArgoCD take care of the rest. Once ArgoCD has synced and brought everything up, it should look like this:

argo-apps-of-apps-postgres

Things will look a little different here, as I only have 2 replicas and I am still playing around with PGAdmin.

That's it

On a basic level, this is how I have set up Postgres. If this has helped then please do let me know, I will be exploring how to back up my database as well as implementing things like PGAdmin.

UPDATE: With the tutorials below, I was running into issues with the deployment; I have amended the deployment manifest in this article thanks to postgres-mount-volume-error-in-k8s

Inspo: