Commit 8d2a262

Update RFC.md
1 parent 37fc240 commit 8d2a262

1 file changed

Lines changed: 30 additions & 7 deletions

RFC.md

@@ -60,7 +60,10 @@ from kloppy import secondspectrum
 dataset = secondspectrum.load(...normal arguments...)
 
 # Add ball speed
-container.add_metric("ball_speed", expression="sqrt((ball.x - ball.x.shift(1))**2 + ...)", engine="polars")
+def ball_speed(ball, **kwargs):
+    return pl.sqrt((ball.x - ball.x.shift(1))**2 + ...)
+
+container.add_column("ball_speed", ball_speed)
 
 # Frame materialization
 frame = container.materialize_frame(timestamp=534.2)
@@ -80,10 +83,15 @@ class TrackingDataContainer:
     def arrow(self) -> pa.Table
     def lazy(self) -> pl.LazyFrame
     def selector: ColumnSelector # e.g. selector.player("home", 7).x
-    def add_metric(name: str, expression: str | pl.Expr, engine: str)
-    def add_metrics(list_of_exprs: list[pl.Expr])
     def get_column(...)
     def materialize_frame(timestamp: float)
+
+    def add_column(name: str, fn: Callable)
+    def add_column_per_player(name: str, fn: Callable)
+    def add_column_per_position(name: str, fn: Callable)
+    def remove_column(name: str)
+
+    def validate(self) -> boolean
 ```
 
 ### Column Naming Convention
@@ -115,6 +123,7 @@ By adhering to CDF, the `TrackingDataContainer` will be interoperable with other
 - Includes teams, players, field size, orientation, coordinate system
 - Tracks added metrics, including provenance
 - Defines optional logical "layers" (e.g. tracking, skeleton, predictions)
+- Use kloppy Metadata objects
 
 ### Storage & Computation Model
 
@@ -161,6 +170,7 @@ This design ensures the container can be scaled from local processing to lakehou
 - Packages must use `TrackingDataContainer` for I/O
 - Read-only tools may optionally operate on Arrow, if they follow the schema
 - Kloppy will provide `load_as_container(...)` as the default loader
+- TDC should be able to go from/to other DataFrame formats like pandas to ensure packages can adopt easily.
 
 ---
 
@@ -175,15 +185,15 @@ This design ensures the container can be scaled from local processing to lakehou
 
 ## Open Questions
 
-- Should metrics be namespaced (`package/metric`)?
+- Should metrics be namespaced (`package/metric`)? => No, TDC doesn't know about metrics, only about columns (maybe we introduce a column type some day?)
 - Should struct-based layouts be supported now or later?
-- Do we support event data natively or as a separate container?
-- Does `TrackingDataContainer` need to extend from kloppy `Dataset`? It probably should
+- Do we support event data natively or as a separate container? => It would be good to sync timestamps somehow, but Arrow table isn't a great fit for event data.
+- Does `TrackingDataContainer` need to extend from kloppy `Dataset`? => No, but we do use the kloppy Metadata. Q: can we use kloppy `to_df`?
 - Add a FrameBuilder? (using PyArrow `ArrayBuilder` - not supported in Python yet - see https://github.com/apache/arrow/issues/20529 )
 - Related tickets:
 - [Refactor serializer options into own component](https://github.com/PySport/kloppy/issues/10) -> Describes a FrameBuilder approach
 - [Refactor tracking data model](https://github.com/PySport/kloppy/pull/377)
-
+- Should TDC support transformations? Probably yes, so a package can ensure all data is in the correct orientation and dimensions.
 ---
 
 ## Conclusion
@@ -192,4 +202,17 @@ The `TrackingDataContainer` provides a shared, fast, and extensible foundation f
 
 We propose making this the default container for all tools in the ecosystem.
 
+## Support
+
+We brought together contributors from different open source projects to discuss how we can align our work and improve interoperability in football analytics.
+
+Participants included:
+- 🇧🇪 Pieter Robberechts (soccerdata, socceraction, Kloppy)
+- 🇳🇱 Alexander Oonk (databallpy)
+- 🇯🇵 Keisuke Fujii, Calvin Yeung (OpenSTARLab)
+- 🇩🇪 Manuel Bassek (Floodlight)
+- 🇳🇱 Joris Bekkers, Koen de Raad & Koen Vossen (PySport/Kloppy)
+- 🇧🇷 Thiago Costa Porto, Ricardo Furbino (UFMG)
+- 🇰🇷 Hyunsung Kim (ballradar, soccercpd)
+
 **Feedback welcome!**
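
The change from `add_metric(..., expression=..., engine=...)` to `add_column(name, fn)` means a column is now defined by a plain callable that receives named inputs such as `ball` and ignores the rest through `**kwargs`. The following is a rough, standard-library-only sketch of how such a call could be dispatched. It is not kloppy code: `ToyContainer`, the `<entity>_<field>` column naming, and the choice to complete the elided expression as a 2D per-frame displacement are all assumptions made for illustration.

```python
import math
from types import SimpleNamespace
from typing import Callable


class ToyContainer:
    """Hypothetical stand-in for TrackingDataContainer; columns are plain lists."""

    def __init__(self, columns: dict):
        self.columns = dict(columns)

    def _entities(self) -> dict:
        # Assumed naming: "<entity>_<field>", so "ball_x" is exposed as ball.x.
        grouped: dict = {}
        for name, values in self.columns.items():
            entity, _, field = name.partition("_")
            if field:
                grouped.setdefault(entity, {})[field] = values
        return {entity: SimpleNamespace(**fields) for entity, fields in grouped.items()}

    def add_column(self, name: str, fn: Callable) -> None:
        if name in self.columns:
            raise ValueError(f"column {name!r} already exists")
        # Every entity is passed by keyword, so fn can name the ones it needs
        # (e.g. `ball`) and swallow the others with **kwargs.
        self.columns[name] = list(fn(**self._entities()))

    def remove_column(self, name: str) -> None:
        del self.columns[name]


def ball_speed(ball, **kwargs):
    # Assumed 2D displacement between consecutive frames (not divided by the
    # frame interval); the first frame has no predecessor, hence None.
    steps = zip(zip(ball.x, ball.y), zip(ball.x[1:], ball.y[1:]))
    return [None] + [math.hypot(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in steps]


container = ToyContainer({"ball_x": [0.0, 3.0, 3.0], "ball_y": [0.0, 4.0, 4.0]})
container.add_column("ball_speed", ball_speed)
print(container.columns["ball_speed"])
```

Passing inputs by keyword keeps the container unaware of what a "metric" is, which matches the answer given under Open Questions: it only stores the resulting column under the given name.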
