Merge with main and demo fixes by andyorange · Pull Request #36 · tswast/leanframe

andyorange · 2026-05-27T05:25:46Z

Hi Tim,
I merged the current main branch into ab_indexing and fixed the demos by re-adding the filter_by method to the DataFrameHandler. Indexing and nesting now work again!
Would love to see the branch merged into main. Also, Vodafone would love to see me mentioned as a collaborator for the project...
The remaining issues are from the linter and refer to files you created. I do not want to interfere there, so I will leave those changes to you.

P.S.: I created the demos with ChatGPT. As that is also Google, I hope there are no rights conflicts.

tswast

Thanks!

tswast · 2026-05-28T16:30:44Z

+
+    # Set index on timestamp
+    print("\n📍 Setting index on 'timestamp' (ascending)...")
+    df_indexed = df.set_index('timestamp', ascending=True)


The ascending=True is a departure from pandas.DataFrame.set_index, but I think this is okay, because:

Most of the time folks would expect some kind of ordering to be available when they set an index, so that iloc can be used.

The ascending=True naming is consistent with how ordering is defined in pandas.
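To make the first point concrete, here is a minimal sketch in plain Python. It is not leanframe's implementation: `ToyFrame`, its `set_index`, and its `iloc` method are hypothetical names used only to show why positional access becomes well-defined once the index carries a sort direction.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToyFrame:
    """Hypothetical stand-in for an unordered, SQL-backed table."""

    rows: tuple[dict[str, Any], ...]
    index_col: str | None = None
    ascending: bool = True

    def set_index(self, column: str, ascending: bool = True) -> ToyFrame:
        # Record the index column together with its sort direction.
        return ToyFrame(self.rows, index_col=column, ascending=ascending)

    def iloc(self, position: int) -> dict[str, Any]:
        # Without an index there is no defined row order to count positions in.
        if self.index_col is None:
            raise ValueError("iloc needs an ordering; call set_index first")
        ordered = sorted(
            self.rows,
            key=lambda row: row[self.index_col],
            reverse=not self.ascending,
        )
        return ordered[position]


frame = ToyFrame(
    rows=(
        {"id": 2, "value": 20},
        {"id": 3, "value": 30},
        {"id": 1, "value": 10},
    )
)
first_asc = frame.set_index("id", ascending=True).iloc(0)
first_desc = frame.set_index("id", ascending=False).iloc(0)
print(first_asc, first_desc)
```

In this toy model, calling `iloc` before `set_index` raises, which mirrors the argument that an index without an ordering would leave positional lookups undefined.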

tswast · 2026-05-28T16:47:42Z

+df_ordered = DataFrame(ordered)
+
+# Now use indexing
+df_ordered._index = Index('priority', ascending=False)


I'm a bit confused about the intention of this line. I guess it's so that we don't override the sorting introduced in the ibis expression?

tswast · 2026-05-28T16:58:49Z


    def __init__(self, data: ibis_types.Table):
        self._data = data
+        self._index: Index | None = None  # Explicit ordering specification


I'm curious if we'd want to separate the ordering from the index definition for a bit better compatibility with pandas, like how bigframes does it? I suspect that adds an unnecessary level of complexity?
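For discussion, one way such a separation could look is sketched below. This is an assumption-laden illustration, not how bigframes or leanframe actually model it: `IndexSpec`, `OrderingSpec`, and `FrameState` are hypothetical names, and only the bookkeeping is shown, not any query generation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class OrderingSpec:
    """Hypothetical: which columns define row order, and in which direction."""

    columns: tuple[str, ...]
    ascending: bool = True


@dataclass(frozen=True)
class IndexSpec:
    """Hypothetical: which columns act as row labels, with no ordering implied."""

    columns: tuple[str, ...]


@dataclass(frozen=True)
class FrameState:
    """Tracks labels and ordering as two independent pieces of state."""

    index: IndexSpec | None = None
    ordering: OrderingSpec | None = None

    def set_index(self, *columns: str) -> FrameState:
        # pandas-like: changing the labels leaves the existing row order alone.
        return FrameState(index=IndexSpec(columns), ordering=self.ordering)

    def sort_values(self, *columns: str, ascending: bool = True) -> FrameState:
        # Sorting changes the row order but keeps the current labels.
        return FrameState(
            index=self.index,
            ordering=OrderingSpec(columns, ascending),
        )


state = FrameState().sort_values("priority", ascending=False).set_index("timestamp")
print(state)
```

The cost of this design is visible even in the sketch: every operation has to decide what happens to two pieces of state instead of one, which is the extra complexity the comment above is wary of.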

tswast · 2026-05-28T17:01:47Z

            path: info["extracted_name"] for path, info in self.nested_fields.items()
        }

+    def filter_by(self, **kwargs) -> "DataFrameHandler":


This also is a departure from pandas, but I do quite like it. Happy to keep it. In regular pandas, basically every filtering method I can think of requires an implicit join on the index column.
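As a rough illustration of the keyword-based style, the sketch below filters a list of dict rows. It assumes that each keyword means column equality and that several keywords are combined with AND; the PR does not spell out these semantics, and the free function `filter_by` here is a hypothetical stand-in, not the `DataFrameHandler.filter_by` method itself.

```python
from typing import Any

Row = dict[str, Any]


def filter_by(rows: list[Row], **kwargs: Any) -> list[Row]:
    """Keep rows whose columns equal every keyword value (assumed AND semantics)."""
    return [
        row
        for row in rows
        # row[column] raises KeyError on a misspelled column name.
        if all(row[column] == expected for column, expected in kwargs.items())
    ]


rows = [
    {"priority": 1, "region": "EU"},
    {"priority": 2, "region": "EU"},
    {"priority": 1, "region": "US"},
]
matches = filter_by(rows, priority=1, region="EU")
print(matches)
```

Because the predicate is built directly from column names, no separate boolean mask has to be lined up with the rows, which is the index-alignment step that the pandas comparison above refers to.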

andyorange · 2026-05-29T06:08:37Z

Hi Tim,
let us discuss those questions during the next meeting! Thanks a lot for looking into it!
Andy

Sent: Thursday, 28 May 2026 at 19:41
From: "Tim Sweña (Swast)" ***@***.***>
To: tswast/leanframe ***@***.***>
CC: andyorange ***@***.***>, Author ***@***.***>
Subject: Re: [tswast/leanframe] Merge with main and demo fixes (PR #36)

@tswast approved this pull request.

Thanks!

In demos/demo_indexing_with_nested.py:

+    data = {
+        'id': [1, 2, 3, 4, 5],
+        'value': [10, 20, 30, 40, 50],
+        'timestamp': pd.date_range('2024-01-01', periods=5)
+    }
+
+    # Create leanframe DataFrame
+    ibis_table = ibis.memtable(data)
+    df = DataFrame(ibis_table)
+
+    print(f"\nOriginal DataFrame shape: {len(df.columns)} columns")
+    print(f"Columns: {df.columns.tolist()}")
+
+    # Set index on timestamp
+    print("\n📍 Setting index on 'timestamp' (ascending)...")
+    df_indexed = df.set_index('timestamp', ascending=True)

The ascending=True is a departure from pandas.DataFrame.set_index, but I think this is okay, because:

Most of the time folks would expect some kind of ordering to be available when they set an index, so that iloc can be used.

The ascending=True naming is consistent with how ordering is defined in pandas.

In docs/indexing_guide.md:

+## Advanced: Custom Ordering Logic
+
+For complex ordering (multiple columns, null handling), directly use Ibis:
+
+```python
+# Complex ordering with Ibis
+ordered = df._data.order_by([
+    ibis.desc(df._data.priority),
+    df._data.timestamp
+])
+
+from leanframe.core.frame import DataFrame
+df_ordered = DataFrame(ordered)
+
+# Now use indexing
+df_ordered._index = Index('priority', ascending=False)

I'm a bit confused what the intention of this line is here? I guess it's so that we don't override the sorting introduced in the ibis expression?

In leanframe/core/frame.py:

@@ -32,11 +40,72 @@ class DataFrame:

    def __init__(self, data: ibis_types.Table):
        self._data = data
+        self._index: Index | None = None  # Explicit ordering specification

I'm curious if we'd want to separate the ordering from the index definition for a bit better compatibility with pandas, like how bigframes does it? I suspect that adds an unnecessary level of complexity?

In leanframe/core/frame.py:

@@ -375,6 +517,30 @@ def extracted_fields(self) -> dict[str, str]:

            path: info["extracted_name"] for path, info in self.nested_fields.items()
        }

+    def filter_by(self, **kwargs) -> "DataFrameHandler":

This also is a departure from pandas, but I do quite like it. Happy to keep it. In regular pandas, basically every filtering method I can think of requires an implicit join on the index column.

isleofdeath-afk added 4 commits January 26, 2026 15:42

indexing, first approach

62b3c68

added multi column indexing

f025e3d

Merge branch 'main' into ab_indexing

281ac90

first fixes after merge

b6f0b57

andyorange requested a review from tswast May 27, 2026 05:25

isleofdeath-afk and others added 4 commits May 27, 2026 07:41

fixed

12b811b

changed filter_by to multiple cols via kwargs dict

58ea0e2

imports added

8e4330a

ran: uv run ruff format && uv run ruff check --fix

c23b28f

tswast approved these changes May 28, 2026

tswast merged commit d82a4c1 into main May 28, 2026
5 checks passed

tswast merged 8 commits into main from ab_indexing


3 participants
