Commit 91073d0: Benchmarks 2.0
1 parent 2fe4d8f

42 files changed
Lines changed: 1645 additions & 635 deletions

.gitignore

Lines changed: 4 additions & 0 deletions
```diff
@@ -1,6 +1,10 @@
 # Maven/Java
 target/
 out/
+# Python/Scripts
+venv/
+.venv/
+dist/
 # Intellij
 .idea/
 *.iml
```

README.md

Lines changed: 12 additions & 0 deletions
```diff
@@ -2,6 +2,14 @@
 
 Events over SQL.
 
+Simple, Reliable, Fast.
+
+Able to publish and consume thousands of events per second on a single PostgreSQL instance.
+
+With sharding, it can easily support tens of thousands of events per second for virtually endless scalability.
+
+For scalability details, see [benchmarks](/benchmarks/README.md).
+
 ## How it works
 
 We just need to have three tables:
@@ -80,3 +88,7 @@ WHERE topic = :topic AND name = :c_name AND partition = 0;
 Limitation being that if a consumer is partitioned, it must have the exact same number of partitions as in the topic
 definition.
 It's a rather acceptable tradeoff and easy to enforce at the library level.
+
+## How to use it
+
+TODO: for now, check out benchmarks/app, which is an example app.
```
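The README's "How it works" section describes events stored in plain SQL tables, with consumers tracked by topic, name, and partition. As a toy illustration of that events-over-SQL polling idea, here is a minimal sketch using Python's stdlib `sqlite3` instead of PostgreSQL; the table and column names below are illustrative assumptions, not EventSQL's actual schema:

```python
import sqlite3

# Illustrative schema only - NOT EventSQL's actual tables:
# one event table per topic plus a consumer-offset table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account_created_event (id INTEGER PRIMARY KEY, value TEXT)")
db.execute("CREATE TABLE consumer (topic TEXT, name TEXT, next_id INTEGER)")
db.execute("INSERT INTO consumer VALUES ('account_created', 'emailer', 1)")

def publish(value):
    db.execute("INSERT INTO account_created_event (value) VALUES (?)", (value,))

def poll(consumer_name, batch=10):
    # read this consumer's position, fetch newer events, advance the offset
    (next_id,) = db.execute(
        "SELECT next_id FROM consumer WHERE topic = 'account_created' AND name = ?",
        (consumer_name,)).fetchone()
    events = db.execute(
        "SELECT id, value FROM account_created_event WHERE id >= ? ORDER BY id LIMIT ?",
        (next_id, batch)).fetchall()
    if events:
        db.execute(
            "UPDATE consumer SET next_id = ? WHERE topic = 'account_created' AND name = ?",
            (events[-1][0] + 1, consumer_name))
    return events

publish('{"email": "a@example.com"}')
publish('{"email": "b@example.com"}')
first = poll("emailer")   # returns both events
second = poll("emailer")  # nothing new; the offset has advanced
```

The real library handles partitions, multiple topics, and concurrent consumers; this only shows the core publish/poll-with-offset mechanic.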

TODO.md

Lines changed: 1 addition & 2 deletions
```diff
@@ -1,10 +1,9 @@
-* performance benchmarks on infra & scripts to reproduce them
 * usage examples
 * just pub/sub
 * giving access to event tables as a means of a simple export - since they are all there
 * expiring events/TTL?
 * compact topics - unique key
 * join, aka streams
-* more elaborate definitions change support
+* more elaborate definitions change support - especially around partitions growth & shrinkage
 * JavaDocs
 * Support schemas init in registry - why require schemas from users, if it is always the same?
```

benchmarks/README.md

Lines changed: 161 additions & 7 deletions
````diff
@@ -1,13 +1,167 @@
 # Benchmarks
 
-Various benchmarks to show performance of EventSQL.
+If you care only about numbers, you can find them in the results dir.
 
-## Queries
+Some background and details:
+* all benchmarks were run on [DigitalOcean](https://www.digitalocean.com/) infrastructure
+* benchmarks were run with both a single Postgres instance serving as the events backend and with multiple instances (sharding)
+* there are the following components:
+  * `app` - a simple Spring Boot app that uses EventSQL to consume events
+  * `runner` - a script that uses EventSQL to publish events at a set per-second rate and amount and then waits for consumers to finish consumption, gathering relevant stats (it runs the benchmarks)
+  * `events-db` - Postgres serving as a backend for EventSQL; events are published to and consumed from it.
+    Depending on the benchmark, we run it in one or a few (3) instances
+* most of the setup to run benchmarks is automated and described below, so it's fairly easy to reproduce
````
````diff
+
+## Infrastructure
+
+Defined in `prepare_infra.py`; sometimes resources are limited by `docker run`, but essentially:
+* the app (consumer) runs on a 2 GB, 2 CPUs (AMD) machine
+* each events-db runs on an 8 GB, 4 CPUs (AMD) machine
+* each benchmarks-runner runs alongside an events-db, but is throttled to 2 GB of memory and 2 CPUs
+* there is a basic firewall and virtual private network (vpc) setup, so that nobody is bothering us during tests
+
````
````diff
+## Requirements
+
+* DigitalOcean account - you might also use a different infrastructure provider, but you will need to adjust the `prepare_infra.py` script accordingly or write your own
+* Python 3 & Bash for scripts
+* Java 21 + a compatible Maven version to build apps
+* Docker to dockerize them and run various commands (scripts assume non-root, current-user access)
+
````
````diff
+## Preparation
+
+### Infra
+
+From the scripts dir, set up the Python env:
+```
+bash init_python_env.bash
+source venv/bin/activate
+```
+
+Prepare infra; this can take a while, since we are creating a few machines - one for the consumer app and three for multiple Postgres instances.
+```
+export DO_API_TOKEN=<your DigitalOcean API key>
+export SSH_KEY_FINGERPRINT=<fingerprint of your ssh key, uploaded to DigitalOcean, giving you ssh access to machines>
+
+python prepare_infra.py
+```
+
+We now have 4 machines connected to each other by the vpc.
+We have access to each of them, using ssh public key authentication, as the `eventsql` user.
+Infrastructure is ready, let's prepare the apps.
+
````
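The contents of `prepare_infra.py` are not shown in this commit view. As a rough, hypothetical sketch of what creating these droplets involves, one can build request payloads for DigitalOcean's API v2 (`POST https://api.digitalocean.com/v2/droplets`, authenticated with a Bearer token); `s-2vcpu-2gb` and `s-4vcpu-8gb` are real DigitalOcean size slugs matching the machine specs above, while the names, region, and image here are illustrative assumptions:

```python
def droplet_request(name: str, size_slug: str, ssh_key_fingerprint: str,
                    region: str = "fra1") -> dict:
    # Payload shape per DigitalOcean API v2 droplet creation;
    # name/region/image values are illustrative, not taken from prepare_infra.py.
    return {
        "name": name,
        "region": region,
        "size": size_slug,  # e.g. s-2vcpu-2gb for the app, s-4vcpu-8gb for events-db
        "image": "ubuntu-24-04-x64",
        "ssh_keys": [ssh_key_fingerprint],
    }

# one consumer-app machine, three events-db machines
app_droplet = droplet_request("app", "s-2vcpu-2gb", "aa:bb:cc")
db_droplets = [droplet_request(f"events-db-{i}", "s-4vcpu-8gb", "aa:bb:cc")
               for i in range(3)]
```

The real script additionally sets up the vpc, firewall, and the `eventsql` user mentioned above.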
````diff
+### Build apps
+
+Let's build `events-db` (from the scripts dir again):
+```
+export APP=events-db
+bash build_and_package.bash
+```
+
+Let's build `app`:
+```
+export APP=app
+export DB0_HOST="<db0 private ip>"
+export DB1_HOST="<db1 private ip>"
+export DB2_HOST="<db2 private ip>"
+bash build_and_package.bash
+```
+
+Private ips can be taken from the DigitalOcean UI - only they will work; public ips will not, since we have set up a firewall blocking traffic of this kind.
+
+Finally, let's build `runner`:
+```
+export APP=runner
+bash build_and_package.bash
+```
+
````
````diff
+### Deploy apps
+
+As all apps are now ready, let's deploy them!
+
+We deploy by copying gzipped Docker images + load and run scripts to the target machines.
+
+Three events-dbs:
 ```
-select id, convert_from(value, 'UTF8')::json from account_created_event limit 10;
-create index account_created_event_email
-on account_created_event ((encode(value, 'escape')::json->>'email'));
+export EVENTS_DB0_HOST=<ip of events-db-0 machine>
+export EVENTS_DB1_HOST=<ip of events-db-1 machine>
+export EVENTS_DB2_HOST=<ip of events-db-2 machine>
+bash deploy_events_dbs.bash
 ```
 
-## TODO
-* sharding version tests -> endpoint to see when it's ready
+App:
+```
+export APP_HOST=<ip of consumer app machine>
+bash deploy_app.bash
+```
+
+All dbs and the app are running now.
+With runners it is slightly different - we will copy them to the target machines, but not run them just yet.
+They will run on the same machines the dbs are hosted on; each db has a corresponding benchmarks runner:
+```
+export EVENTS_DB0_HOST=<ip of events-db-0 machine>
+export EVENTS_DB1_HOST=<ip of events-db-1 machine>
+export EVENTS_DB2_HOST=<ip of events-db-2 machine>
+bash deploy_runners.bash
+```
+
+Everything is now ready to run various benchmarks.
+
````
````diff
+## Running benchmarks
+
+### Single db
+
+Let's start with single db cases.
+
+First, copy the `collect_docker_stats.bash` script to one of the events-db machines and start collecting stats:
+```
+scp collect_docker_stats.bash eventsql@<events-db-ip>:/home/eventsql
+ssh eventsql@<events-db-ip>
+bash collect_docker_stats.bash
+```
+
+You might do the same for the consumer machine to have those stats as well.
+
````
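`collect_docker_stats.bash` presumably wraps `docker stats`, whose `--format` flag can emit lines like `docker stats --no-stream --format "{{.Name}},{{.CPUPerc}},{{.MemUsage}}"` (`.Name`, `.CPUPerc`, and `.MemUsage` are real format placeholders; the exact format the script uses is an assumption). A small parser for such lines, useful when post-processing the collected stats:

```python
def parse_stats_line(line: str) -> tuple[str, float, str]:
    # "events-db,45.31%,1.21GiB / 8GiB" -> ("events-db", 45.31, "1.21GiB")
    name, cpu, mem = line.strip().split(",")
    cpu_percent = float(cpu.rstrip("%"))   # strip the trailing % sign
    mem_used = mem.split(" / ")[0]         # keep only the "used" part of "used / limit"
    return name, cpu_percent, mem_used
```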
````diff
+Finally, run various benchmarks:
+```
+export RUNNER_HOST=<events-db-ip>
+export EVENTS_RATE=1000
+# EVENTS_RATE * 60 for benchmark to last approximately 1 minute
+export EVENTS_TO_PUBLISH=60000
+bash run_single_db_benchmark.bash
+
+export EVENTS_RATE=5000
+export EVENTS_TO_PUBLISH=300000
+bash run_single_db_benchmark.bash
+
+export EVENTS_RATE=10000
+export EVENTS_TO_PUBLISH=600000
+bash run_single_db_benchmark.bash
+```
+
````
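The `EVENTS_TO_PUBLISH` values above just follow the rule in the comment: rate times 60 seconds gives a roughly one-minute run. As a trivial helper (`events_to_publish` is a name made up here, not a project function):

```python
def events_to_publish(events_rate: int, duration_seconds: int = 60) -> int:
    # EVENTS_RATE * 60 -> a benchmark lasting ~1 minute at a steady rate
    return events_rate * duration_seconds
```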
````diff
+### Multiple dbs
+
+It's almost the same, the difference being that we need to repeat the steps on all the machines, more or less simultaneously.
+
+To simplify it, I've prepared a script that does it.
+So, all we have to do is:
+```
+export RUNNER0_HOST=<events-db-0-ip>
+export RUNNER1_HOST=<events-db-1-ip>
+export RUNNER2_HOST=<events-db-2-ip>
+
+export EVENTS_RATE=5000
+# EVENTS_RATE * 60 for benchmark to last approximately 1 minute
+export EVENTS_TO_PUBLISH=300000
+bash run_multiple_dbs_benchmark.bash
+
+export EVENTS_RATE=10000
+export EVENTS_TO_PUBLISH=600000
+bash run_multiple_dbs_benchmark.bash
+```
+
+We have 3 dbs (shards), so the real rates are:
+```
+3 * 5000 = 15 000 per second
+3 * 10000 = 30 000 per second
+```
+...which is quite a lot!
````
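Since each of the 3 dbs (shards) has its own runner publishing at `EVENTS_RATE`, the cluster-wide rate is simply the per-shard rate times the shard count (`cluster_rate` is an illustrative name, not a project function):

```python
def cluster_rate(per_shard_rate: int, shards: int = 3) -> int:
    # each shard's runner publishes independently at per_shard_rate,
    # so the whole cluster handles shards * per_shard_rate events per second
    return per_shard_rate * shards
```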
Lines changed: 56 additions & 0 deletions
```bash
#!/bin/bash
set -euo pipefail

app="benchmarks-app"
app_dir="app"
tag="${TAG:-latest}"
tagged_image="${app}:${tag}"

echo "Creating package in dist directory for $tagged_image image..."
echo "Preparing dist dir..."

rm -r -f dist
mkdir dist

echo "Building jar..."

mvn clean install

echo "Building image..."

docker build . -t "$tagged_image"

gzipped_image_path="dist/$app.tar.gz"

echo "Image built, exporting it to $gzipped_image_path, this can take a while..."

docker save "$tagged_image" | gzip > ${gzipped_image_path}

echo "Image exported, preparing scripts..."

export DB0_HOST=${DB0_HOST:-localhost}
export DB0_URL="jdbc:postgresql://$DB0_HOST:5432/events"
export DB0_ENABLED=${DB0_ENABLED:-true}

export DB1_HOST=${DB1_HOST:-localhost}
export DB1_URL="jdbc:postgresql://$DB1_HOST:5432/events"
export DB1_ENABLED=${DB1_ENABLED:-true}

export DB2_HOST=${DB2_HOST:-localhost}
export DB2_URL="jdbc:postgresql://$DB2_HOST:5432/events"
export DB2_ENABLED=${DB2_ENABLED:-true}

export app=$app
export tag=$tag
export run_cmd="docker run -d \\
  -e DB0_URL=\"$DB0_URL\" -e DB0_ENABLED=\"$DB0_ENABLED\" \\
  -e DB1_URL=\"$DB1_URL\" -e DB1_ENABLED=\"$DB1_ENABLED\" \\
  -e DB2_URL=\"$DB2_URL\" -e DB2_ENABLED=\"$DB2_ENABLED\" \\
  --network host --restart unless-stopped \\
  --name $app $tagged_image"

cd ..
envsubst '${app} ${tag}' < scripts/template_load_and_run_app.bash > $app_dir/dist/load_and_run_app.bash
envsubst '${app} ${run_cmd}' < scripts/template_run_app.bash > $app_dir/dist/run_app.bash

echo "Package prepared."
```
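The `envsubst '${app} ${tag}'` invocations above substitute only the whitelisted variables and leave every other `${...}` reference in the template (such as `${run_cmd}` in the first call) untouched for a later pass. Python's `string.Template.safe_substitute` behaves analogously; the template string below is a made-up example, not the real template file:

```python
from string import Template

# safe_substitute leaves ${run_cmd} alone, like envsubst with an '${app} ${tag}' whitelist
template = 'docker load < ${app}.tar.gz && docker run ${app}:${tag} # later: ${run_cmd}'
rendered = Template(template).safe_substitute(app="benchmarks-app", tag="latest")
```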

benchmarks/app/build_and_run.bash

Lines changed: 0 additions & 12 deletions
This file was deleted.

benchmarks/app/pom.xml

Lines changed: 16 additions & 28 deletions
```diff
@@ -67,34 +67,22 @@
             <artifactId>spring-boot-starter-test</artifactId>
             <scope>test</scope>
         </dependency>
-
-        <dependency>
-            <groupId>org.testcontainers</groupId>
-            <artifactId>postgresql</artifactId>
-            <version>${testcontainers.version}</version>
-            <scope>test</scope>
-        </dependency>
     </dependencies>
 
-    <profiles>
-        <profile>
-            <id>executable</id>
-            <build>
-                <plugins>
-                    <plugin>
-                        <groupId>org.springframework.boot</groupId>
-                        <artifactId>spring-boot-maven-plugin</artifactId>
-                        <executions>
-                            <execution>
-                                <goals>
-                                    <goal>build-info</goal>
-                                    <goal>repackage</goal>
-                                </goals>
-                            </execution>
-                        </executions>
-                    </plugin>
-                </plugins>
-            </build>
-        </profile>
-    </profiles>
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.springframework.boot</groupId>
+                <artifactId>spring-boot-maven-plugin</artifactId>
+                <executions>
+                    <execution>
+                        <goals>
+                            <goal>build-info</goal>
+                            <goal>repackage</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+        </plugins>
+    </build>
 </project>
```

benchmarks/app/src/main/java/com/binaryigor/eventsql/benchmarks/AccountCreatedHandler.java

Lines changed: 0 additions & 38 deletions
This file was deleted.
