# Benchmarks

If you care only about numbers, you can find them in the results dir.

Some background and details:
* all benchmarks were run on [DigitalOcean](https://www.digitalocean.com/) infrastructure
* benchmarks were run both with a single Postgres instance serving as the events backend and with multiple instances (sharding)
* we have the following components:
  * `app` - a simple Spring Boot app that uses *EventSQL* to consume events
  * `runner` - a script that uses *EventSQL* to publish events at a set per-second rate and amount, waits for consumers to finish consumption, and gathers relevant stats (it is what runs the benchmarks)
  * `events-db` - Postgres serving as a backend for *EventSQL* - events are published to and consumed from it;
    depending on the benchmark, it runs as one or a few (3) instances
* most of the setup to run benchmarks is automated and described below; it's fairly straightforward to reproduce

## Infrastructure

Defined in the `prepare_infra.py` script; some resource limits are additionally applied via the `docker run` command, but essentially:
* *benchmarks-app (consumer)* runs on a 2 GB, 2 CPU (AMD) machine
* each *events-db* runs on an 8 GB, 4 CPU (AMD) machine
* each *benchmarks-runner* runs alongside an *events-db*, but is throttled to 2 GB of memory and 2 CPUs
* there is a basic firewall and virtual private cloud (VPC) setup (`prepare_infra.py`), so that nobody is bothering us during benchmarks
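
For illustration, throttling like this is typically expressed through `docker run` resource flags; this is a sketch with a hypothetical image name, not the exact command the deploy scripts use:

```
# Illustrative only: cap a container at 2 GB of memory and 2 CPUs.
docker run -d \
  --name benchmarks-runner \
  --memory 2g \
  --cpus 2 \
  benchmarks-runner:latest
```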

## Requirements

* a DigitalOcean account - you might use a different infrastructure provider instead, but you will then need to adjust the `prepare_infra.py` script accordingly or write your own setup from scratch
* Python 3 & Bash for the scripts
* Java 21 + a compatible Maven version to build the apps
* Docker to dockerize them and run various commands (the scripts assume non-root, current-user access to Docker)

## Preparation

### Infra

From the scripts dir, Python env setup:
```
bash init_python_env.bash
source venv/bin/activate
```

The following might take a while, since we are creating a few machines - one for the consumer app and three for multiple Postgres instances:
```
export DO_API_TOKEN=<your DigitalOcean API key>
export SSH_KEY_FINGERPRINT=<fingerprint of your ssh key, uploaded to DigitalOcean, giving you ssh access to created machines>

python prepare_infra.py
```

After it finishes, on the DigitalOcean UI we should see something like this:

We now have four machines connected to each other by the VPC.
We have access to each of them, using ssh public key authentication, as the `eventsql` user.
The infrastructure is now ready; let's prepare the apps.

### Apps

Let's build `events-db` (from the scripts dir again):
```
export APP=events-db
bash build_and_package.bash
```

Let's build `app` (the consumer):
```
export APP=app
export DB0_HOST="<db0 private ip>"
export DB1_HOST="<db1 private ip>"
export DB2_HOST="<db2 private ip>"
bash build_and_package.bash
```

Private IPs can be taken from the DigitalOcean UI - only they will work; public IPs will not, since we have set up a firewall blocking traffic of this kind.
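
Alternatively, if you have `doctl` (DigitalOcean's official CLI) installed and authenticated, the private IPs can be listed from the terminal:

```
# List droplet names alongside their private IPv4 addresses.
doctl compute droplet list --format Name,PrivateIPv4
```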

Finally, let's build `runner`:
```
export APP=runner
bash build_and_package.bash
```

### Deployment

As all apps are now packaged and ready, let's deploy them!

We deploy by copying gzipped Docker images, along with load and run scripts, to the target machines.

Three `events-dbs`:
```
export EVENTS_DB0_HOST=<ip of events-db-0 machine>
export EVENTS_DB1_HOST=<ip of events-db-1 machine>
export EVENTS_DB2_HOST=<ip of events-db-2 machine>
bash deploy_events_dbs.bash
```

`app`:
```
export APP_HOST=<ip of consumer app machine>
bash deploy_app.bash
```

All dbs and the app are running now.
With the `benchmark-runners` it is slightly different - we will copy them to the target machines but not run them just yet.
They will run on the same machines the dbs are hosted on; each db has a corresponding benchmarks-runner:
```
export EVENTS_DB0_HOST=<ip of events-db-0 machine>
export EVENTS_DB1_HOST=<ip of events-db-1 machine>
export EVENTS_DB2_HOST=<ip of events-db-2 machine>
bash deploy_runners.bash
```

Everything is now ready to run various benchmarks.

## Running benchmarks

### Single db

Let's start with the single db cases.

First, copy the `collect_docker_stats.bash` script to one of the events-db machines and run it to start collecting stats:
```
scp collect_docker_stats.bash eventsql@<events-db-ip>:/home/eventsql
ssh eventsql@<events-db-ip>
bash collect_docker_stats.bash

Removing previous stats file, if exists...

Collecting docker stats to /tmp/docker_stats.txt...
Stats collected, sleeping for 10 s...
...
Collecting docker stats to /tmp/docker_stats.txt...
Stats collected, sleeping for 10 s...
...
```

You might do the same for the consumer machine to collect its stats as well.
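
Judging by the output, `collect_docker_stats.bash` is presumably a simple polling loop of roughly this shape (a sketch, not the actual script):

```
# Append a snapshot of container CPU/memory usage every 10 seconds.
while true; do
  echo "Collecting docker stats to /tmp/docker_stats.txt..."
  docker stats --no-stream >> /tmp/docker_stats.txt
  echo "Stats collected, sleeping for 10 s..."
  sleep 10
done
```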

Finally, let's run various benchmarks:
```
export RUNNER_HOST=<events-db-ip>
export EVENTS_RATE=1000
# EVENTS_RATE * 60 for benchmark to last approximately 1 minute
export EVENTS_TO_PUBLISH=60000
bash run_single_db_benchmark.bash

export EVENTS_RATE=5000
export EVENTS_TO_PUBLISH=300000
bash run_single_db_benchmark.bash

export EVENTS_RATE=10000
export EVENTS_TO_PUBLISH=600000
bash run_single_db_benchmark.bash
```
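
Since `EVENTS_TO_PUBLISH` is always `EVENTS_RATE * 60` here, you can also derive it instead of hardcoding it - a small optional convenience, not part of the existing scripts:

```
# Derive EVENTS_TO_PUBLISH from EVENTS_RATE so each benchmark lasts ~1 minute.
export EVENTS_RATE=2000
export EVENTS_TO_PUBLISH=$(( EVENTS_RATE * 60 ))
echo "$EVENTS_TO_PUBLISH"
```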

### Multiple dbs

It's almost the same, the difference being that we need to repeat the steps on all machines, more or less simultaneously.

For simplicity, I've prepared a script that does it.
So, all we have to do is:
```
export RUNNER0_HOST=<events-db-0-ip>
export RUNNER1_HOST=<events-db-1-ip>
export RUNNER2_HOST=<events-db-2-ip>

export EVENTS_RATE=5000
# EVENTS_RATE * 60 for benchmark to last approximately 1 minute
export EVENTS_TO_PUBLISH=300000
bash run_multiple_dbs_benchmark.bash

export EVENTS_RATE=10000
export EVENTS_TO_PUBLISH=600000
bash run_multiple_dbs_benchmark.bash
```

We have 3 dbs (shards), so the real rates are:
```
3 * 5000 = 15 000 per second
3 * 10000 = 30 000 per second
```
...which is quite a lot!
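
The arithmetic above can be double-checked in the shell:

```
# With 3 shards, each runner publishes at EVENTS_RATE, so the real rate triples.
shards=3
for rate in 5000 10000; do
  echo "$rate per runner -> $(( shards * rate )) per second in total"
done
```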

As you can see in the results, we got pretty close to these rates:
```
...

Publishing 300000 events with 5000 per second rate took: PT1M3.436S, which means 4729 per second rate
3 runner instances were running in parallel, so the real rate was 14187 per second for 900000 events

...

Waiting for consumption....

...

Consumer of 2 partition is at the event 2588597, but latest event is 2590000; waiting for 1s...
Consumer of 3 partition is at the event 2589971, but latest event is 2589996; waiting for 1s...
Consumer of 3 partition is at the event 2589971, but latest event is 2589996; waiting for 1s...

...

Consuming 300000 events with 5000 per second rate took: PT1M6.875S, which means 4486 per second rate
3 runner instances were running in parallel, so the real rate was 13458 per second for 900000 events

...

...

Publishing 600000 events with 10000 per second rate took: PT1M11.099S, which means 8438 per second rate
3 runner instances were running in parallel, so the real rate was 25314 per second for 1800000 events

...

Waiting for consumption....

...

Consumer of 0 partition is at the event 3187024, but latest event is 3189999; waiting for 1s...

...

Consuming 600000 events with 10000 per second rate took: PT1M12.561S, which means 8268 per second rate
3 runner instances were running in parallel, so the real rate was 24804 per second for 1800000 events

...
```
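
Out of curiosity, the per-second rates in these logs can be recomputed from the ISO-8601 durations the runner prints; a small sketch using the numbers from the sample output above:

```
# Recompute the runner's reported rate from its ISO-8601 duration.
events=300000
duration="PT1M3.436S"   # from the first "Publishing" log line above

# Split PT<minutes>M<seconds>S (assumes durations under one hour).
read -r mins secs <<< "$(echo "$duration" | sed -E 's/PT([0-9]+)M([0-9.]+)S/\1 \2/')"
rate=$(awk -v e="$events" -v m="$mins" -v s="$secs" 'BEGIN { printf "%d", e / (m * 60 + s) }')
echo "$rate per second"   # 4729, matching the log line
```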