170 changes: 170 additions & 0 deletions docs/GKE_PostgreSQL_Quickstart_generic.MD
@@ -0,0 +1,170 @@
# PostgreSQL on GKE - Benchmark Quickstart Guide

## Overview

This guide covers the PerfKitBenchmarker (PKB) benchmark module for automated
PostgreSQL performance testing on Google Kubernetes Engine (GKE):

- **`kubernetes_postgres_sysbench`** — Standalone PostgreSQL benchmark: deploys a
single PostgreSQL instance as a Kubernetes StatefulSet and runs Sysbench OLTP
workloads from a client pod within the same cluster.

This benchmark:
- Creates and tears down GKE infrastructure automatically via PKB.
- Measures TPS (Transactions Per Second), QPS (Queries Per Second), and Latency.
- Supports multiple optimization profiles for tuning PostgreSQL and GKE node configuration.

## Architecture Overview

1. **GKE Cluster**: Created with two node pools:
    * `postgres`: For the PostgreSQL server (StatefulSet).
    * `clients`: For the Sysbench client (Pod).
2. **Private Networking**:
    * PostgreSQL runs as a StatefulSet with a Persistent Volume.
    * Sysbench connects via the **private Pod IP** of the server.
    * No public IPs are used for database traffic.
3. **Storage**:
    * Uses `hyperdisk-balanced` for the C4 and N4 machine families and `pd-ssd` for older N-series types such as N2.

## New Developer Setup (First Time Only)

If you're a new developer cloning this repository for the first time:

```bash
# 1. Clone the repository
git clone <repository-url>
cd PerfKitBenchmarker

# 2. Create Python virtual environment (first time only)
python3 -m venv venv_postgres

# 3. Activate virtual environment
source venv_postgres/bin/activate

# 4. Install Python dependencies
pip install "setuptools<70.0.0"
pip install pytz
pip install -r requirements.txt
# This may take 2-3 minutes

# 5. Authenticate with GCP
gcloud auth login
gcloud auth application-default login

# 6. Set your GCP project
export PROJECT_ID="your-project-id"
gcloud config set project $PROJECT_ID
```

**Note**: The `venv_postgres/` directory is not tracked by git. Each developer creates their own.

## Prerequisites (For Each Session)

```bash
# 1. Activate virtual environment
source venv_postgres/bin/activate

# 2. Create temp directory (for logs and results)
mkdir -p pkb_temp

# 3. Set GCP project variable
export PROJECT_ID="your-project-id"
```

**Note**: The `pkb_temp/` directory stores benchmark logs and results. It is excluded from git via `.gitignore`.


## Baseline Tests

Runs the benchmark with standard PostgreSQL settings (no special tuning).

### Baseline Run (C4 Standard)

```bash
python3 pkb.py \
--benchmarks=kubernetes_postgres_sysbench \
--cloud=GCP \
--vm_platform=Kubernetes \
--zone=us-central1-a \
--project=$PROJECT_ID \
--postgres_gke_server_machine_type=c4-standard-16 \
--postgres_gke_client_machine_type=c4-standard-16 \
--postgres_gke_disk_type=hyperdisk-balanced \
--postgres_gke_disk_size=500 \
--postgres_gke_optimization_profile=baseline \
--sysbench_tables=10 \
--sysbench_table_size=4000000 \
--sysbench_run_threads=512 \
--sysbench_run_seconds=300 \
--sysbench_testname=oltp_read_write \
--metadata=cloud:GCP \
--metadata=geo:us-central1 \
--metadata=scenario:postgres_baseline \
--temp_dir=./pkb_temp \
--run_stage_iterations=1 \
--owner=$(whoami | tr '.' '-') \
--log_level=error \
--accept_licenses
```

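## Optimized Tests (Example)

Runs the benchmark with a specific optimization profile. Compared with the baseline command, only the `--postgres_gke_optimization_profile` value and the `scenario` / `optimization_profile` metadata differ.

### 1. Profile: Postgres Tuned
Applies aggressive PostgreSQL configuration tuning (larger shared buffers, more worker processes, etc.).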

```bash
python3 pkb.py \
--benchmarks=kubernetes_postgres_sysbench \
--cloud=GCP \
--vm_platform=Kubernetes \
--zone=us-central1-a \
--project=$PROJECT_ID \
--postgres_gke_server_machine_type=c4-standard-16 \
--postgres_gke_client_machine_type=c4-standard-16 \
--postgres_gke_disk_type=hyperdisk-balanced \
--postgres_gke_disk_size=500 \
--postgres_gke_optimization_profile=postgres-tuned \
--sysbench_tables=10 \
--sysbench_table_size=4000000 \
--sysbench_run_threads=512 \
--sysbench_run_seconds=300 \
--sysbench_testname=oltp_read_write \
--metadata=cloud:GCP \
--metadata=geo:us-central1 \
--metadata=scenario:postgres_optimized \
--metadata=optimization_profile:postgres-tuned \
--temp_dir=./pkb_temp \
--run_stage_iterations=1 \
--owner=$(whoami | tr '.' '-') \
--log_level=error \
--accept_licenses
```





## Understanding the Workload

* **Workload**: Sysbench OLTP Read/Write (`oltp_read_write`).
* **Tables**: 10 tables.
* **Table Size**: 4,000,000 rows per table.
* **Threads**: 512 concurrent threads.
* **Duration**: 300 seconds (5 minutes) per run.
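
The workload above maps one-to-one onto the `--sysbench_*` flags in the commands earlier in this guide. The following is a minimal sketch of that mapping; the `WORKLOAD` dict and the helper names are illustrative only and are not part of PKB.

```python
import shlex

# Values copied from this guide; the dict layout itself is illustrative.
WORKLOAD = {
    "sysbench_testname": "oltp_read_write",
    "sysbench_tables": 10,
    "sysbench_table_size": 4_000_000,
    "sysbench_run_threads": 512,
    "sysbench_run_seconds": 300,
}


def workload_flags(workload):
    """Render the workload as PKB command-line flags, in dict order."""
    return [f"--{name}={value}" for name, value in workload.items()]


def total_rows(workload):
    """Total rows loaded across all tables before the run starts."""
    return workload["sysbench_tables"] * workload["sysbench_table_size"]


print(shlex.join(workload_flags(WORKLOAD)))
print(f"rows loaded: {total_rows(WORKLOAD):,}")  # 10 tables x 4,000,000 rows
```

With the values in this guide, the data set holds 40,000,000 rows in total.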

## Results Location

Results are saved to:
```
./pkb_temp/runs/<run_uri>/perfkitbenchmarker_results.json
```

View results:
```bash
jq . ./pkb_temp/runs/<run_uri>/perfkitbenchmarker_results.json
```
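
For scripted post-processing, the results file can also be read from Python. The sketch below assumes the file is newline-delimited JSON with one sample object per line carrying `metric`, `value`, and `unit` fields; check that layout against your own output. The metric names and values in `EXAMPLE_LINES` are placeholders, not real benchmark results.

```python
import json


def load_samples(lines):
    """Parse newline-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def summarize(samples, wanted):
    """Return {metric: (value, unit)} for the metric names in `wanted`.

    If a metric appears more than once, the last sample wins.
    """
    return {
        s["metric"]: (s["value"], s.get("unit", ""))
        for s in samples
        if s.get("metric") in wanted
    }


# Placeholder lines that only illustrate the assumed shape of the file.
EXAMPLE_LINES = [
    '{"metric": "tps", "value": 1.0, "unit": "tps"}',
    "",
    '{"metric": "qps", "value": 2.0, "unit": "qps"}',
    '{"metric": "latency", "value": 3.0, "unit": "ms"}',
]

samples = load_samples(EXAMPLE_LINES)
print(summarize(samples, {"tps", "qps"}))
```

To read a real run, pass an open file handle for `perfkitbenchmarker_results.json` to `load_samples` instead of `EXAMPLE_LINES`.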
110 changes: 110 additions & 0 deletions docs/Technical_Architecture_PostgreSQL_PKB.md
@@ -0,0 +1,110 @@
# Technical Architecture: PostgreSQL Benchmarking on GKE with PerfKitBenchmarker

This document provides a technical deep dive into the architecture and implementation of the PostgreSQL benchmarking suite used for evaluating performance on Google Kubernetes Engine (GKE). It covers the implementation details of both the Baseline and Optimized benchmarks, explaining how PerfKitBenchmarker (PKB) is leveraged to simulate real-world workloads using Sysbench.

## Overview

The benchmarking suite is designed to compare the performance of standard PostgreSQL deployments ("Baseline") against GKE-optimized PostgreSQL configurations ("Optimized"). The benchmarks use `sysbench` (OLTP Read/Write) as the load generator and are orchestrated by PKB.

## Baseline Benchmark Implementation

The baseline benchmark is executed using the `kubernetes_postgres_sysbench` benchmark configuration. This configuration represents a standard, unoptimized PostgreSQL deployment on Kubernetes.

### Execution Command


```bash
python3 pkb.py \
--benchmarks=kubernetes_postgres_sysbench \
--postgres_gke_optimization_profile=baseline \
...
```

### Architecture & Logic
1. **Kubernetes-Native Architecture**: PKB deploys both sides of the benchmark as Kubernetes workloads:
    * **Server**: A StatefulSet with 1 replica (`postgres-standalone-0`) running PostgreSQL 16.
    * **Client**: A separate Pod (`postgres-client`) running `sysbench`.
2. **StatefulSet & Storage**: The PostgreSQL server uses a StatefulSet for a stable identity and persistent storage. It requests its volume through a PersistentVolumeClaim (PVC) backed by `hyperdisk-balanced` (C4 and N4 families) or `pd-ssd` (older N-series types such as N2).
3. **Private Connectivity**: To keep communication private and low-latency, the client pod connects to the server using the **Pod IP** (`.status.podIP`) of the server pod. This keeps traffic internal to the cluster and avoids any public load balancer path.
4. **Secure Authentication**: The benchmark generates a password (or uses the `POSTGRES_PASSWORD` environment variable, if set) and passes it to the server via a Kubernetes Secret and to the client via the `PGPASSWORD` environment variable.

## Optimized Benchmark Implementation

The optimized benchmark uses the same `kubernetes_postgres_sysbench` benchmark class but applies specific "Optimization Profiles" to tune the infrastructure and database configuration.

### Execution Command

```bash
python3 pkb.py \
--benchmarks=kubernetes_postgres_sysbench \
--postgres_gke_optimization_profile=infra+postgres+hugepages \
...
```

### Optimization Profiles
The benchmark supports granular optimization profiles that can be combined:

* **infra-tuned**: Uses Container-Optimized OS (COS) for nodes and Ubuntu 24.04 for the client.
* **fast-startup**: Uses Ubuntu node image and removes the init container for faster startup (at the cost of less robust permission handling).
* **kernel-tuned**: Applies sysctl tuning (`vm.swappiness=1`, `vm.dirty_ratio=10`, etc.) to the node.
* **hugepages**: Enables HugePages (2MB) on the node and configures PostgreSQL (`huge_pages=on`) to use them. This reduces TLB misses and improves memory management efficiency.
* **postgres-tuned**: Applies aggressive PostgreSQL configuration tuning.
* **infra+postgres**: Combines Infrastructure and Postgres Tuning profiles.
* **infra+postgres+hugepages**: Combines Infrastructure, Postgres Tuning, and HugePages for maximum performance.
* **infra+postgres+hugepages+hostnetwork**: Extends `infra+postgres+hugepages` by enabling host networking (`hostNetwork: true`) for the PostgreSQL pods. This bypasses the pod network (CNI) data path, so the database uses the node's network interface directly, which can raise throughput and reduce latency.
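
One way to reason about the profile names is as sets of independent features. The sketch below is a hypothetical illustration of that idea, not the benchmark's actual parsing code: the feature labels and the `profile_features` helper are assumptions, and only the profile names come from the list above.

```python
# Single-purpose profiles and the (hypothetical) feature label each enables.
SINGLE_PROFILES = {
    "baseline": set(),
    "infra-tuned": {"infra"},
    "fast-startup": {"fast-startup"},
    "kernel-tuned": {"kernel"},
    "hugepages": {"hugepages"},
    "postgres-tuned": {"postgres"},
}

# Combined profiles listed in this document; components are joined by "+".
COMBINED_PROFILES = {
    "infra+postgres",
    "infra+postgres+hugepages",
    "infra+postgres+hugepages+hostnetwork",
}


def profile_features(profile):
    """Return the set of feature labels a profile name enables."""
    if profile in SINGLE_PROFILES:
        return set(SINGLE_PROFILES[profile])
    if profile in COMBINED_PROFILES:
        return set(profile.split("+"))
    raise ValueError(f"unknown optimization profile: {profile!r}")


print(sorted(profile_features("infra+postgres+hugepages+hostnetwork")))
```

Rejecting unknown names up front avoids silently running a baseline when a profile name is mistyped.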

## Control Parameters Comparison

The following tables summarize the key control parameters used in the Baseline and Optimized runs.

### Sysbench Parameters (Load Generator)

| Parameter | Baseline | Optimized |
| :--- | :--- | :--- |
| `tables` | 10 | 10 |
| `table_size` | 4,000,000 | 4,000,000 |
| `threads` | 512 | 512 |
| `testname` | oltp_read_write | oltp_read_write |
| `duration` | 300s | 300s |
| `report_interval` | 10s | 10s |

### PostgreSQL Server Parameters

Memory settings such as `shared_buffers` and `effective_cache_size` are computed dynamically by a rule-based sizing engine. It detects the server machine type (`--postgres_gke_server_machine_type`), sizes the Kubernetes pod resources to ~85% of total node RAM, and derives the PostgreSQL memory settings as fixed fractions of the pod's RAM so that the database stays within its limits and avoids out-of-memory (OOM) kills.

| Parameter | Baseline | Optimized (postgres-tuned / infra+postgres+hugepages) |
| :--- | :--- | :--- |
| **Shared Buffers** | 25% of Pod RAM | 40% of Pod RAM |
| **Effective Cache Size** | 50% of Pod RAM | 75% of Pod RAM |
| **Work Mem** | 64MB | 256MB |
| **Effective IO Concurrency** | 100 | 200 |
| **Huge Pages** | Off | On (hugepages) |
| **WAL Buffers** | 64MB | 512MB |
| **Max Worker Processes** | 20 | 32 |
| **Host Network** | False | Optional (infra+postgres+hugepages+hostnetwork) |
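
The percentages in the table can be turned into concrete numbers with a little arithmetic. The following sketch applies the ~85% pod sizing and the per-profile fractions from the table; it is not the sizing engine itself, and the 60 GB node RAM figure for `c4-standard-16` is an assumption to verify against the machine type you actually use.

```python
POD_RAM_PCT = 85  # pod resources are sized to ~85% of node RAM (from the text)

# Fractions of pod RAM per profile, taken from the table above.
MEMORY_PCT = {
    "baseline": {"shared_buffers_mb": 25, "effective_cache_size_mb": 50},
    "postgres-tuned": {"shared_buffers_mb": 40, "effective_cache_size_mb": 75},
}


def size_memory(node_ram_mb, profile):
    """Return pod RAM and PostgreSQL memory settings in MB (rounded down)."""
    pod_mb = node_ram_mb * POD_RAM_PCT // 100
    sizes = {name: pod_mb * pct // 100 for name, pct in MEMORY_PCT[profile].items()}
    return {"pod_ram_mb": pod_mb, **sizes}


# Assumption: a c4-standard-16 node has 60 GB of RAM.
NODE_RAM_MB = 60 * 1024

for profile in MEMORY_PCT:
    print(profile, size_memory(NODE_RAM_MB, profile))
```

Integer arithmetic keeps the results reproducible; a real sizing engine may round differently or reserve extra headroom.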

## Implementation Details

### 1. Private IP Implementation
To enforce private networking:
* The benchmark explicitly retrieves the Pod IP: `kubectl get pod postgres-standalone-0 -o jsonpath='{.status.podIP}'`.
* This IP is passed to `sysbench` via the `--pgsql-host` flag.
* The client runs as a Kubernetes pod in the same namespace as the server, so benchmark traffic stays inside the cluster, mirroring a typical in-cluster application layout.
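
The Pod IP lookup described above can be sketched in Python as a thin wrapper around `kubectl`. This is an illustration rather than the benchmark's own code: the helper names, the explicit `namespace` argument, and the injectable `run` parameter (used to make the function testable without a cluster) are assumptions.

```python
import subprocess


def pod_ip_command(pod, namespace):
    """Build the kubectl argv that prints a pod's IP address."""
    return [
        "kubectl", "get", "pod", pod,
        "-n", namespace,
        "-o", "jsonpath={.status.podIP}",
    ]


def get_pod_ip(pod, namespace, run=subprocess.run):
    """Return the pod IP, raising if kubectl fails or prints nothing."""
    result = run(
        pod_ip_command(pod, namespace),
        check=True,
        capture_output=True,
        text=True,
    )
    ip = result.stdout.strip()
    if not ip:
        # An empty result usually means the pod has not been scheduled yet.
        raise RuntimeError(f"pod {pod!r} has no IP assigned yet")
    return ip
```

Passing the command as an argument list avoids shell quoting issues with the braces in the JSONPath expression. The returned address would then be supplied to `sysbench` through `--pgsql-host`.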

### 2. Disk Type Selection
The benchmark automatically maps machine types to disk types:
* **C4 / C4A / C4D / N4 / N4A / N4D**: `hyperdisk-balanced`
* **N2**: `pd-ssd`
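
A mapping like this can be expressed as a lookup on the machine family prefix. The sketch below is a hypothetical helper, not the benchmark's implementation; in particular, falling back to `pd-ssd` for every family outside the listed set is an assumption.

```python
# Families this document lists as using Hyperdisk Balanced.
HYPERDISK_FAMILIES = {"c4", "c4a", "c4d", "n4", "n4a", "n4d"}


def disk_type_for(machine_type):
    """Pick a disk type from the family prefix of a machine type name."""
    family = machine_type.split("-", 1)[0].lower()
    if family in HYPERDISK_FAMILIES:
        return "hyperdisk-balanced"
    # Assumption: every other family falls back to pd-ssd.
    return "pd-ssd"


for machine_type in ("c4-standard-16", "n4-standard-8", "n2-standard-16"):
    print(machine_type, "->", disk_type_for(machine_type))
```

An explicit `--postgres_gke_disk_type` flag, as used in the quickstart commands, makes the choice visible in the run configuration.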

### 3. Sysbench Execution
* The benchmark installs `sysbench` in the client pod via `apt-get`.
* It executes the `oltp_read_write.lua` script located at `/usr/share/sysbench/`.
* The execution command includes a timeout buffer (`duration + 120s`) to prevent premature termination.
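
The steps above can be sketched as a command builder. The `sysbench` options shown are standard sysbench 1.0 options; wrapping the call in the coreutils `timeout` command is an assumption about how the buffer is applied, and the default user and database names are taken from the client pod manifest in this change.

```python
import shlex

TIMEOUT_BUFFER_S = 120  # extra time beyond the run duration (from the text)


def sysbench_run_command(host, duration_s=300, threads=512, tables=10,
                         table_size=4_000_000, report_interval_s=10,
                         port=5432, user="benchmark", db="benchmark"):
    """Build the argv for an oltp_read_write run against `host`.

    No password is placed on the command line; per the document, the
    client pod provides it through the PGPASSWORD environment variable.
    """
    sysbench = [
        "sysbench", "/usr/share/sysbench/oltp_read_write.lua",
        "--db-driver=pgsql",
        f"--pgsql-host={host}",
        f"--pgsql-port={port}",
        f"--pgsql-user={user}",
        f"--pgsql-db={db}",
        f"--tables={tables}",
        f"--table-size={table_size}",
        f"--threads={threads}",
        f"--time={duration_s}",
        f"--report-interval={report_interval_s}",
        "run",
    ]
    # Assumption: the buffer is enforced with the coreutils `timeout` command.
    return ["timeout", str(duration_s + TIMEOUT_BUFFER_S)] + sysbench


print(shlex.join(sysbench_run_command("10.0.0.5")))
```

With the default 300-second run, the outer timeout is 420 seconds, so a healthy run finishes well before it is interrupted.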

### 4. Password Handling & Security
* **Dynamic Password Generation**: A unique password is generated per benchmark run based on the Run URI, ensuring isolation between runs. The plaintext password is never hardcoded or stored in source control. PostgreSQL handles password hashing internally on the server side.
* **Secret Management**:
    * **Standalone**: The password is injected into the PostgreSQL pod via the StatefulSet manifest. The Sysbench client receives it through the `PGPASSWORD` environment variable, which the client pod manifest populates from the `sysbench-passwords` Secret (key `benchmark-password`), so it does not appear in process listings or command-line logs.



* **Disk Automation**: Selects `hyperdisk-balanced` (C4) or `pd-ssd` (N2) automatically.
@@ -0,0 +1,44 @@
---
# Client pod for running Sysbench benchmarks
apiVersion: v1
kind: Pod
metadata:
  name: postgres-client
  namespace: {{ namespace }}
  labels:
    app: postgres-client
spec:
  # PKB will handle node selection through nodepool configuration
  tolerations:
    - key: "kubernetes.io/arch"
      operator: "Equal"
      value: "arm64"
      effect: "NoSchedule"
  containers:
    - name: postgres-client
      image: {{ client_image }}
      imagePullPolicy: IfNotPresent
      command:
        - sleep
        - infinity
      resources:
        requests:
          cpu: "{{ client_cpu_request }}"
          memory: "{{ client_memory_request }}"
        limits:
          cpu: "{{ client_cpu_limit }}"
          memory: "{{ client_memory_limit }}"
      env:
        - name: PGHOST
          value: postgres-standalone
        - name: PGPORT
          value: "5432"
        - name: PGUSER
          value: benchmark
        - name: PGPASSWORD
          valueFrom:
            secretKeyRef:
              name: sysbench-passwords
              key: benchmark-password
        - name: PGDATABASE
          value: benchmark