
Commit 3c84a41

Better docs
1 parent 0ab31a8 commit 3c84a41

3 files changed

Lines changed: 164 additions & 9 deletions


README.md

Lines changed: 159 additions & 9 deletions
@@ -12,7 +12,8 @@ For scalability details, see [benchmarks](/benchmarks/README.md).

## How it works

-We just need to have three tables:
+We just need to have three tables (postgres syntax):

```sql
CREATE TABLE topic (
    name TEXT PRIMARY KEY,
@@ -43,15 +44,16 @@ CREATE TABLE consumer (
```

To consume messages, we just need to periodically (every one to a few seconds) do:

```sql
BEGIN;

SELECT * FROM consumer
-WHERE topic = : topic AND name = :c_name
+WHERE topic = :topic AND name = :c_name
FOR UPDATE SKIP LOCKED;

SELECT * FROM event
-WHERE (:last_event_id IS NULL) OR id > last_event_id
+WHERE topic = :topic AND (:last_event_id IS NULL OR id > :last_event_id)
ORDER BY id LIMIT N;

(process events)
@@ -62,10 +64,13 @@ SET last_event_id = :id,
WHERE topic = :topic AND name = :c_name;
```

-Optionally, to increase throughput & concurrency, we might have partitioned topic and consumers (-1 partition standing for not partitioned topic/consumer).
+Optionally, to increase throughput & concurrency, we might have a partitioned topic and consumers (-1 partition standing
+for not partitioned topic/consumer).

-Distribution of partitioned events is a sole responsibility of publisher - the library provides sensible default (random distribution).
-Consumption of such events per partition (0 in example) might look like this:
+Distribution of partitioned events is a sole responsibility of publisher - the library provides sensible default (random
+distribution).
+Consumption of such events per partition (0 in an example) might look like this:

```sql
BEGIN;

@@ -74,7 +79,7 @@ WHERE topic = :topic AND name = :c_name AND partition = 0
FOR UPDATE SKIP LOCKED;

SELECT * FROM event
-WHERE (:last_event_id IS NULL) OR id > last_event_id AND partition = 0
+WHERE topic = :topic AND partition = 0 AND (:last_event_id IS NULL OR id > :last_event_id)
ORDER BY id LIMIT N;

(process events)
@@ -91,9 +96,154 @@ It's a rather acceptable tradeoff and easy to enforce at the library level.

## How to use it

-TODO: for now, check out benchmarks/app being an example app.
`EventSQL` is the entrypoint to the whole library. It requires a standard Java `javax.sql.DataSource` or a list of them:

```java
import com.binaryigor.eventsql.EventSQL;
import javax.sql.DataSource;
// dialect of your events backend - POSTGRES, MYSQL, MARIADB and so on;
// as of now, only POSTGRES has fully tested support
import org.jooq.SQLDialect;

var eventSQL = new EventSQL(dataSource, SQLDialect.POSTGRES);
var shardedEventSQL = new EventSQL(dataSources, SQLDialect.POSTGRES);
```

The sharded version works in the same vein - it just assumes that topics and consumers are hosted on multiple dbs.
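
The `dataSources` variable used above is just a list of `javax.sql.DataSource` instances, one per shard database. A minimal sketch of building such a list, assuming HikariCP as the connection pool (any `DataSource` implementation works; the URLs and credentials are made up):

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import javax.sql.DataSource;
import java.util.List;
import java.util.function.Function;

// one DataSource per shard database; HikariCP is only an example pool
Function<String, DataSource> shardDataSource = jdbcUrl -> {
    var config = new HikariConfig();
    config.setJdbcUrl(jdbcUrl);
    config.setUsername("events");          // made-up credentials
    config.setPassword("events-password");
    return new HikariDataSource(config);
};

List<DataSource> dataSources = List.of(
    shardDataSource.apply("jdbc:postgresql://shard-0:5432/events_db"),
    shardDataSource.apply("jdbc:postgresql://shard-1:5432/events_db"),
    shardDataSource.apply("jdbc:postgresql://shard-2:5432/events_db"));
```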

### Topics and Consumers

Having an `EventSQL` instance, we can register topics and their consumers:

```java
// all operations are idempotent
eventSQL.registry()
    // -1 stands for a not partitioned topic
    .registerTopic(new TopicDefinition("account_created", -1))
    .registerTopic(new TopicDefinition("invoice_issued", 5))
    // third argument (true/false) determines whether the consumer is partitioned or not
    .registerConsumer(new ConsumerDefinition("account_created", "consumer-1", false))
    .registerConsumer(new ConsumerDefinition("invoice_issued", "consumer-2", true));
```

Both topics and consumers can be either partitioned or not partitioned (-1 stands for not partitioned).
**Partitioned topics allow partitioned consumers, increasing parallelism.**
Parallelism of partitioned consumers matches the consumed topic's number of partitions - events have an ordering guarantee within a partition.
As a consequence, for a given consumer, each partition can be processed only by a single thread at a time.

For a consumer to be partitioned (third argument in the example), its topic must be partitioned as well - the consumer will have the same number of partitions as the topic.
The opposite does not have to be true - a consumer might not be partitioned even though its topic is;
this has performance implications though, since, as described above, consumer parallelism is capped at its number of partitions.

**For sharding, partitions are multiplied by the number of shards.**

For example, if we have *3 shards and a topic with 10 partitions - each shard will host 10 partitions, giving 30 partitions in total*.
Same with consumers of a sharded topic - they will all be multiplied by the number of shards.

For events, it works differently - in the example above, *each shard will host ~33% (1/3) of the topic's event data*.

There will be *30 consumer instances* in this particular case - `3 shards * 10 partitions`; each consuming from one partition hosted on a given shard.
Each event will be published to one partition of a single shard - events are unique globally, across all shards.

### Publishing

We can publish single events and batches of arbitrary data and type:
```java
var publisher = eventSQL.publisher();

// with -1 argument, if the topic is partitioned, the event is published to a random partition
publisher.publish(new EventPublication("txt_topic", -1, "txt event".getBytes(StandardCharsets.UTF_8)));
publisher.publish(new EventPublication("raw_topic", 1, new byte[]{1, 2, 3}));
publisher.publish(new EventPublication("json_topic", 2,
        """
        {
          "id": 2,
          "name": "some-user"
        }
        """.getBytes(StandardCharsets.UTF_8)));

// events can have keys and metadata as well
publisher.publish(new EventPublication("txt_topic", -1,
    "event-key",
    "txt event".getBytes(StandardCharsets.UTF_8),
    Map.of("some-tag", "some-meta-info")));

// events can be published in batches, for improved throughput
publisher.publishAll(List.of(
    new EventPublication("txt_topic", -1, "txt event 1".getBytes(StandardCharsets.UTF_8)),
    new EventPublication("txt_topic", -1, "txt event 2".getBytes(StandardCharsets.UTF_8)),
    new EventPublication("txt_topic", -1, "txt event 3".getBytes(StandardCharsets.UTF_8))));
```
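
This commit also adds a `(topic, partition, key, value)` convenience constructor to `EventPublication` (see the `EventPublication.java` diff below), so a keyed event without metadata can be published as:

```java
// key, but no metadata - equivalent to passing Map.of() as the metadata argument
publisher.publish(new EventPublication("txt_topic", -1,
    "event-key",
    "txt event".getBytes(StandardCharsets.UTF_8)));
```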

### Consuming

We can have both single event and batch consumers:
```java
var consumers = eventSQL.consumers();

consumers.startConsumer("txt_topic", "single-consumer", (Event e) -> {
    // handle single event
});
// with more frequent polling - by default it is 1 second
consumers.startConsumer("txt_topic", "single-consumer-customized", (Event e) -> {
    // handle single event
}, Duration.ofMillis(100));

consumers.startBatchConsumer("txt_topic", "batch-consumer", (List<Event> events) -> {
    // handle events batch
}, // customize batch behavior:
   // minEvents, maxEvents,
   // pollingDelay and maxPollingDelay - how long to wait for minEvents
   EventSQLConsumers.ConsumptionConfig.of(5, 100,
       Duration.ofSeconds(1), Duration.ofSeconds(10)));
```

### Dead Letter Topics (DLT)

If we register a topic together with its DLT as follows:
```java
eventSQL.registry()
    .registerTopic(new TopicDefinition("account_created", -1))
    .registerTopic(new TopicDefinition("account_created_dlt", -1));
```
then, under certain circumstances, it will get special treatment.

When a consumer throws `EventSQLConsumptionException`, `DefaultDLTEventFactory` takes over and publishes the failed event to the associated DLT, if it can find one:
```java
...

@Override
public Optional<EventPublication> create(EventSQLConsumptionException exception, String consumer) {
    var event = exception.event();

    var dltTopic = event.topic() + "_dlt";
    var dltTopicDefinitionOpt = topicDefinitionsCache.getLoadingIf(dltTopic, true);
    if (dltTopicDefinitionOpt.isEmpty()) {
        return Optional.empty();
    }

    ...

    // creates dlt event
```

This factory can be customized by using another `EventSQL` constructor or by calling the `EventSQLConsumers.configureDLTEventFactory` method.
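
As an illustration, a sketch of such a customization - it assumes the configured factory is a functional interface matching the `create` signature shown above, and that `Event` exposes `key()`/`value()` accessors mirroring `EventPublication`; check the actual types for the exact signatures:

```java
// route every failed event to a single, global DLT instead of the per-topic "_dlt" convention;
// the lambda shape and the Event accessors are assumptions - verify against the real interfaces
eventSQL.consumers().configureDLTEventFactory((exception, consumer) -> {
    var failedEvent = exception.event();
    return Optional.of(new EventPublication("global_dlt", -1,
        failedEvent.key(),
        failedEvent.value(),
        Map.of("failed-consumer", consumer)));
});
```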

What is also worth noting: any exception thrown by a single event consumer is wrapped into `EventSQLConsumptionException` automatically - see *ConsumerWrapper.class*.

When you use `consumers.startBatchConsumer`, you have to do the wrapping yourself.
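
A sketch of what that wrapping might look like - the exact `EventSQLConsumptionException` constructor is an assumption (only the fact that it carries the failing event is confirmed above), and `handleSingleEvent` is a hypothetical processing method:

```java
consumers.startBatchConsumer("txt_topic", "batch-consumer-with-dlt", (List<Event> events) -> {
    for (var event : events) {
        try {
            handleSingleEvent(event); // hypothetical processing logic
        } catch (Exception e) {
            // wrap manually, so the DLT event factory can pick up the failed event
            throw new EventSQLConsumptionException(e, event); // constructor shape is an assumption
        }
    }
}, EventSQLConsumers.ConsumptionConfig.of(5, 100, Duration.ofSeconds(1), Duration.ofSeconds(10)));
```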

## How to get it

-TODO
Maven:
```
TODO: publish it
```
Gradle:
```
TODO: publish it
```

TODO.md

Lines changed: 1 addition & 0 deletions
@@ -7,3 +7,4 @@
* more elaborate definitions change support - especially around partitions growth & shrinkage
* JavaDocs
* Support schemas init in registry - why require schemas from users, if it is always the same?
+* using JOOQ tradeoffs?

src/main/java/com/binaryigor/eventsql/EventPublication.java

Lines changed: 4 additions & 0 deletions
@@ -8,6 +8,10 @@ public record EventPublication(String topic,
                               byte[] value,
                               Map<String, String> metadata) {

+    public EventPublication(String topic, int partition, String key, byte[] value) {
+        this(topic, partition, key, value, Map.of());
+    }
+
    public EventPublication(String topic, int partition, byte[] value, Map<String, String> metadata) {
        this(topic, partition, null, value, metadata);
    }

0 commit comments
