FOR UPDATE SKIP LOCKED;
SELECT * FROM event
WHERE topic = :topic AND (:last_event_id IS NULL OR id > :last_event_id)
ORDER BY id LIMIT :limit;

(process events)

UPDATE consumer
SET last_event_id = :id,
    last_consumption_at = :now
WHERE topic = :topic AND name = :c_name;
COMMIT;
```
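The offset-tracking logic above - fetch events newer than `last_event_id`, process them, advance the offset - can be sketched in plain Java, with an in-memory list standing in for the `event` table. This is purely illustrative (class and method names here are not part of the library; a real implementation runs these steps inside a single SQL transaction):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative in-memory model of the consumption loop described above
class OffsetTrackingSketch {

    record Event(long id, String payload) {}

    // consumer.last_event_id; -1 plays the role of NULL (nothing consumed yet)
    static long lastEventId = -1;

    // ~ "SELECT * FROM event WHERE id > :last_event_id ORDER BY id LIMIT :limit",
    // followed by "UPDATE consumer SET last_event_id = :id"
    static List<Event> poll(List<Event> table, int limit) {
        var batch = new ArrayList<Event>();
        for (var e : table) {
            if (e.id() > lastEventId && batch.size() < limit) {
                batch.add(e);
            }
        }
        if (!batch.isEmpty()) {
            lastEventId = batch.get(batch.size() - 1).id();
        }
        return batch;
    }
}
```

Repeated `poll` calls only ever see events past the stored offset, which is exactly what the `id > :last_event_id` predicate guarantees in SQL.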
Optionally, to increase throughput & concurrency, we might have a partitioned topic and consumers (-1 partition standing
for not partitioned topic/consumer/event).

Distribution of partitioned events is the sole responsibility of the publisher.
Consumption of such events per partition (0 in an example) might look like this:
```sql
FOR UPDATE SKIP LOCKED;
SELECT * FROM event
WHERE topic = :topic AND partition = 0 AND (:last_event_id IS NULL OR id > :last_event_id)
ORDER BY id LIMIT :limit;

(process events)

UPDATE consumer
SET last_event_id = :id,
    last_consumption_at = :now
WHERE topic = :topic AND name = :c_name AND partition = 0;
COMMIT;
```

The limitation is that if a consumer is partitioned, it must have exactly the same number of partitions as its topic definition. It's a rather acceptable tradeoff and easy to enforce at the library level.
## How to use it

them:
```java
import com.binaryigor.eventsql.EventSQL;
import javax.sql.DataSource;
// dialect of your events backend - POSTGRES, MYSQL, MARIADB and so on;
// as of now, only POSTGRES has fully tested support;
// should also work with others, but some things - event table partition management, for example - work only with Postgres; for others, it must be managed manually
import org.jooq.SQLDialect;

var eventSQL = new EventSQL(dataSource, SQLDialect.POSTGRES);
```

Topics and consumers can be both partitioned and not partitioned.
**Partitioned topics allow for partitioned consumers, increasing parallelism.**
Parallelism of partitioned consumers is as high as the consumed topic's number of partitions - events have an ordering guarantee within a partition.
As a consequence, for a given consumer, each partition can be processed only by a single thread at a time.

For a consumer to be partitioned, its topic must be partitioned as well - it will have the same number of partitions.
The opposite does not have to be true - a consumer might not be partitioned while its related topic is; this has performance implications though, since as described above, consumer parallelism is capped at its number of partitions.

**With sharding, partitions are multiplied by the number of shards (db instances).**

For example, if we have *3 shards (3 dbs) and a topic with 10 partitions - each shard (db) will host 10 partitions, giving 30 partitions in total*.
Same with consumers of a sharded topic - they will all be multiplied by the number of shards.

For events, it works differently - in the example above, *each shard will host ~ 33% (1/3) of the topic events data* (assuming even partition distribution).
To get all events, we must read them from all shards.

There will be *30 consumer instances* in this particular case - `3 shards * 10 partitions`; each consuming from one partition hosted on a given shard.
Each event will be published to one partition of a single shard - as a consequence, events are unique globally, across all shards.
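The sharding arithmetic above can be made explicit with a couple of helper functions (a sketch for clarity, not library code):

```java
// Sketch of the sharding arithmetic described above (not part of the library)
class ShardingMath {

    // each shard hosts all of the topic's partitions
    static int totalPartitions(int shards, int topicPartitions) {
        return shards * topicPartitions;
    }

    // one consumer instance per (shard, partition) pair
    static int consumerInstances(int shards, int topicPartitions) {
        return shards * topicPartitions;
    }
}
```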
### Publishing
We can publish single events and batches of arbitrary data and type:
```java
var publisher = eventSQL.publisher();

// events can also be published in batches, for improved throughput
publisher.publishAll(List.of(
    new EventPublication("txt_topic", "txt event 1".getBytes(StandardCharsets.UTF_8)),
    new EventPublication("txt_topic", "txt event 2".getBytes(StandardCharsets.UTF_8)),
    new EventPublication("txt_topic", "txt event 3".getBytes(StandardCharsets.UTF_8))));
```
### Partitioner
Event partition is determined by `EventSQLPublisher.Partitioner`. By default, the following implementation is used:
```java
public class DefaultPartitioner implements EventSQLPublisher.Partitioner {

    private static final Random RANDOM = new Random();

    @Override
    public int partition(EventPublication publication, int topicPartitions) {
        if (topicPartitions == -1) {
            return -1;
        }
        if (publication.key() == null) {
            return RANDOM.nextInt(topicPartitions);
        }

        return keyHash(publication.key())
                .mod(BigInteger.valueOf(topicPartitions))
                .intValue();
    }

    ...
```
For a not partitioned topic, no partition is assigned.

If the topic is partitioned and the event has a null key - the partition is random.
If a key is defined, the partition is assigned based on the key hash. Thanks to this, we have a guarantee that events associated with the same key always land in the same partition.
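The same-key-same-partition guarantee can be illustrated with a standalone sketch. Note that `keyHash` is not shown above, so this example assumes a SHA-256 based hash purely for illustration - the library's actual hash may differ:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustration only: assumes a SHA-256 based key hash, which may differ
// from the library's actual keyHash implementation
class KeyPartitionSketch {

    static int partition(String key, int topicPartitions) {
        try {
            var digest = MessageDigest.getInstance("SHA-256")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            // interpret the hash as a non-negative integer and take it modulo the partition count
            return new BigInteger(1, digest)
                    .mod(BigInteger.valueOf(topicPartitions))
                    .intValue();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the hash is deterministic, the same key always maps to the same partition, and the modulo keeps the result within the topic's partition range.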

If you want to change this behavior, you can provide your own implementation and configure it by calling the `EventSQLPublisher.configurePartitioner` method.
### Consuming
We can have both single event and batch consumers:
Under certain circumstances, it will get special treatment.

When a consumer throws `EventSQLConsumptionException`, `DefaultDLTEventFactory` takes over and publishes the failed event to the associated DLT, if it can find one:
```java
...
public Optional<EventPublication> create(EventSQLConsumptionException exception,
// creates dlt event
```

This factory can be customized by calling the `EventSQLConsumers.configureDLTEventFactory` method.

What is also worth noting is that any exception thrown by a single event consumer is wrapped into `EventSQLConsumptionException` automatically - see the `ConsumerWrapper` class.
When you use `EventSQLConsumers.startBatchConsumer`, you have to do the wrapping yourself.