For scalability details, see [benchmarks](/benchmarks/README.md).

## How it works

We just need to have three tables (postgres syntax):
```sql
CREATE TABLE topic (
    name TEXT PRIMARY KEY,
    ...
);

CREATE TABLE event (
    ...
    key TEXT,
    value BYTEA NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT NOW(),
    metadata JSON NOT NULL,
    PRIMARY KEY (topic, id)
) PARTITION BY LIST (topic);

CREATE TABLE consumer (
    ...
);
```
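Publishing then boils down to inserting rows into the `event` table; a hedged sketch, assuming the elided columns above include an auto-generated `id` and a `partition`:

```sql
-- hedged sketch: id is assumed to be auto-generated; the partition value
-- is assigned by the publisher (-1 for a not partitioned topic)
INSERT INTO event (topic, partition, key, value, metadata)
VALUES (:topic, :partition, :key, :value, :metadata);
```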
To consume messages, we just need to periodically (every one to a few seconds) do:
```sql
BEGIN;

SELECT * FROM consumer
WHERE topic = :topic AND name = :c_name
FOR UPDATE SKIP LOCKED;

SELECT * FROM event
WHERE topic = :topic AND (:last_event_id IS NULL OR id > :last_event_id)
ORDER BY id LIMIT :limit;

(process events)

UPDATE consumer
SET last_event_id = :id,
    last_consumption_at = :now
WHERE topic = :topic AND name = :c_name;

COMMIT;
```
Optionally, to increase throughput & concurrency, we might have a partitioned topic and consumers (-1 partition standing for a not partitioned topic/consumer/event).

Distribution of partitioned events is the sole responsibility of the publisher.

Consumption of such events per partition (partition 0 in the example below) might look like this:
```sql
BEGIN;

SELECT * FROM consumer
WHERE topic = :topic AND name = :c_name AND partition = 0
FOR UPDATE SKIP LOCKED;

SELECT * FROM event
WHERE topic = :topic AND partition = 0 AND (:last_event_id IS NULL OR id > :last_event_id)
ORDER BY id LIMIT :limit;

(process events)

UPDATE consumer
SET last_event_id = :id,
    last_consumption_at = :now
WHERE topic = :topic AND name = :c_name AND partition = 0;

COMMIT;
```
The limitation is that if a consumer is partitioned, it must have exactly the same number of partitions as the topic definition. It's a rather acceptable tradeoff and one that is easy to enforce at the library level.
## How to use it
`EventSQL` is the entrypoint to the whole library. It requires a standard Java `javax.sql.DataSource` or a list of them:

```java
import com.binaryigor.eventsql.EventSQL;

import javax.sql.DataSource;
// dialect of your events backend - POSTGRES, MYSQL, MARIADB and so on;
// as of now, only POSTGRES has fully tested support;
// others should also work, but some things - event table partition management, for example - work only with Postgres; for other dialects, it must be managed manually
import org.jooq.SQLDialect;

var eventSQL = new EventSQL(dataSource, SQLDialect.POSTGRES);
var shardedEventSQL = new EventSQL(dataSources, SQLDialect.POSTGRES);
```

The sharded version works in the same vein - it just assumes that topics and consumers are hosted on multiple dbs.

### Topics and Consumers

Having an `EventSQL` instance, we can register topics and their consumers:
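(The exact registration snippet is elided in this diff; the sketch below is hedged - `registry()`, `TopicDefinition` and `ConsumerDefinition` are assumed names, not confirmed API.)

```java
// hedged sketch - registry(), TopicDefinition and ConsumerDefinition are assumed names;
// -1 follows the library's convention for a not partitioned topic
var registry = eventSQL.registry();
registry.registerTopic(new TopicDefinition("txt_topic", -1));
registry.registerTopic(new TopicDefinition("account_events", 10));
// a consumer of the partitioned topic, itself partitioned (10 partitions, same as the topic)
registry.registerConsumer(new ConsumerDefinition("account_events", "account_consumer", true));
```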
Topics and consumers can be both partitioned and not partitioned.

**Partitioned topics allow partitioned consumers, increasing parallelism.**

Parallelism of partitioned consumers is as high as the consumed topic's number of partitions - events have an ordering guarantee within a partition. As a consequence, for a given consumer, each partition can be processed only by a single thread at a time.

For a consumer to be partitioned, its topic must be partitioned as well - the consumer will have the same number of partitions. The opposite does not have to be true - a consumer might not be partitioned even if its topic is; this has performance implications though, since, as described above, consumer parallelism is capped at its number of partitions.

**With sharding, partitions are multiplied by the number of shards (db instances).**

For example, if we have *3 shards (3 dbs) and a topic with 10 partitions, each shard (db) will host 10 partitions, giving 30 partitions in total*. The same goes for consumers of a sharded topic - they will all be multiplied by the number of shards.

For events, it works differently - in the example above, *each shard will host ~33% (1/3) of the topic's event data* (assuming even partition distribution). To get all events, we must read them from all shards.

There will be *30 consumer instances* in this particular case - `3 shards * 10 partitions` - each consuming from one partition hosted on a given shard. Each event will be published to one partition of a single shard - as a consequence, events are globally unique, across all shards.
### Publishing

We can publish single events and batches of arbitrary data and type:
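The single-event variant is elided in this diff; a hedged sketch, assuming a `publisher()` accessor and a `publish` counterpart to the `publishAll` shown below:

```java
// hedged sketch - publisher() and publish(...) are assumed names,
// inferred from the publishAll call below
var publisher = eventSQL.publisher();
publisher.publish(new EventPublication("txt_topic", "txt event".getBytes(StandardCharsets.UTF_8)));
```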
```java
// events can also be published in batches, for improved throughput
publisher.publishAll(List.of(
        new EventPublication("txt_topic", "txt event 1".getBytes(StandardCharsets.UTF_8)),
        new EventPublication("txt_topic", "txt event 2".getBytes(StandardCharsets.UTF_8)),
        new EventPublication("txt_topic", "txt event 3".getBytes(StandardCharsets.UTF_8))));
```
### Partitioner

The event partition is determined by `EventSQLPublisher.Partitioner`. By default, the following implementation is used:
```java
public class DefaultPartitioner implements EventSQLPublisher.Partitioner {

    private static final Random RANDOM = new Random();

    @Override
    public int partition(EventPublication publication, int topicPartitions) {
        // -1 stands for a not partitioned topic - there is no partition to assign
        if (topicPartitions == -1) {
            return -1;
        }
        // no key - spread events randomly across partitions
        if (publication.key() == null) {
            return RANDOM.nextInt(topicPartitions);
        }
        // key present - hash it, so the same key always lands in the same partition
        return keyHash(publication.key())
                .mod(BigInteger.valueOf(topicPartitions))
                .intValue();
    }

    ...
```
For a not partitioned topic, no partition is assigned.

If the topic is partitioned and the event has a null key, the partition is random. If a key is defined, the partition is assigned based on the key hash. Thanks to this, we have a guarantee that events associated with the same key always land in the same partition.

If you want to change this behavior, you can provide your own implementation and configure it by calling the `EventSQLPublisher.configurePartitioner` method.
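A custom implementation is just another `Partitioner`; a hedged sketch, assuming a `publisher()` accessor and that `Partitioner` can be written as a lambda:

```java
// hedged sketch - publisher() is an assumed accessor and the lambda shape assumes
// Partitioner is a functional interface; the routing rule is purely illustrative
eventSQL.publisher().configurePartitioner((publication, topicPartitions) -> {
    if (topicPartitions == -1) {
        return -1; // keep the not partitioned convention
    }
    var key = publication.key();
    // route keyless events to partition 0, others by key length
    return key == null ? 0 : key.length() % topicPartitions;
});
```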
### Consuming

We can have both single event and batch consumers:
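The actual snippet is elided in this diff; a hedged sketch, assuming a `consumers()` accessor and a `startConsumer` counterpart to the `startBatchConsumer` mentioned below:

```java
// hedged sketch - consumers() and startConsumer(...) are assumed names, inferred from
// the startBatchConsumer reference later in this README; event.value() is assumed
// to expose the payload bytes
var consumers = eventSQL.consumers();
consumers.startConsumer("txt_topic", "txt_consumer", event ->
        System.out.println("Consumed: " + new String(event.value(), StandardCharsets.UTF_8)));
```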
A failed event can get special treatment under certain circumstances.

When a consumer throws `EventSQLConsumptionException`, `DefaultDLTEventFactory` takes over and publishes the failed event to the associated DLT (dead letter topic), if it can find one:
```java
...

@Override
public Optional<EventPublication> create(EventSQLConsumptionException exception, String consumer) {
    var event = exception.event();

    // by convention, the DLT is the failed event's topic name with a "_dlt" suffix
    var dltTopic = event.topic() + "_dlt";
    var dltTopicDefinitionOpt = topicDefinitionsCache.getLoadingIf(dltTopic, true);
    if (dltTopicDefinitionOpt.isEmpty()) {
        return Optional.empty();
    }

    ...

    // creates dlt event
```
This factory can be customized by calling the `EventSQLConsumers.configureDLTEventFactory` method.
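For example, a hedged sketch of a customization that routes every failure to one shared, pre-registered topic (the `consumers()` accessor, the lambda-friendly factory shape and the `value()` accessor are assumptions):

```java
// hedged sketch - consumers(), the lambda shape and event().value() are assumptions;
// "all_events_dlt" is a hypothetical, already registered topic
eventSQL.consumers().configureDLTEventFactory((exception, consumer) ->
        Optional.of(new EventPublication("all_events_dlt", exception.event().value())));
```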
What is also worth noting is that any exception thrown by a single event consumer is wrapped into `EventSQLConsumptionException` automatically - see the `ConsumerWrapper` class.

When you use `EventSQLConsumers.startBatchConsumer`, you have to do the wrapping yourself.