The current implementation appears to be bound to the Lance Python package / pylance API rather than the LanceDB Python client, but several code names and docs still use lancedb / LanceDB. This can mislead readers into thinking the integration depends on or targets the LanceDB client API.
Examples found in the repo:
- LanceDBScanOperator
- _lancedb_table_factory_function
- _lancedb_count_result_function
- tests/io/lancedb/...
- docstrings that refer to "LanceDB table" or say the functions require LanceDB
At runtime, the code imports and uses lance, e.g. lance.dataset(...), lance.LanceDataset, and lance.LanceOperation. The project dependencies include pylance and lance-namespace*, but not lancedb.
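To gauge how far the naming has spread, a small script along the following lines could list the lancedb-style names still present in the source tree. This is only a sketch: the helper name `find_lancedb_names` and the choice to scan `*.py` files are assumptions made for illustration, not part of the project.

```python
import re
from pathlib import Path

# Matches whole identifiers that contain "lancedb" in any casing, e.g.
# LanceDBScanOperator or _lancedb_table_factory_function.
LANCEDB_NAME = re.compile(r"\b\w*lancedb\w*\b", re.IGNORECASE)


def find_lancedb_names(root: str) -> dict[str, set[str]]:
    """Map each Python file under `root` to the lancedb-style names it contains.

    Hypothetical helper for illustration; files without matches are omitted.
    """
    hits: dict[str, set[str]] = {}
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        names = set(LANCEDB_NAME.findall(text))
        if names:
            hits[str(path)] = names
    return hits


if __name__ == "__main__":
    # Scan the current directory and print one line per affected file.
    for file_name, names in find_lancedb_names(".").items():
        print(f"{file_name}: {', '.join(sorted(names))}")
```

Because the pattern also matches plain-text mentions, docstrings that say "LanceDB table" would show up alongside the symbol names, which is useful for the docstring cleanup as well.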
Proposal:
- Rename internal symbols from lancedb / LanceDB to lance / Lance where they describe Lance dataset behavior.
- Update docstrings and examples to say "Lance dataset/table" instead of "LanceDB table" unless a true LanceDB client integration is intended.
- Consider moving tests/io/lancedb to a Lance-specific path, or at least update test names over time to reduce confusion.
- Keep backward compatibility in mind for any public or semi-public names, but the current package API appears to expose read_lance, merge_columns, merge_columns_df, create_scalar_index, and compact_files, not LanceDBScanOperator.
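If any of the old names do turn out to be imported by downstream code, the rename could keep a deprecated alias for a release or two. The following is a minimal sketch of that pattern, not the project's actual code: the new name `LanceScanOperator` and the single `uri` constructor argument are hypothetical stand-ins for whatever the real operator takes.

```python
import warnings


class LanceScanOperator:
    """Hypothetical new name; a stand-in for the real scan operator."""

    def __init__(self, uri: str) -> None:
        # The real operator's constructor arguments are not known here;
        # `uri` is an assumption for the sake of the example.
        self.uri = uri


class LanceDBScanOperator(LanceScanOperator):
    """Deprecated alias kept so that existing imports keep working."""

    def __init__(self, *args, **kwargs) -> None:
        # Warn at the caller's location, then behave exactly like the new class.
        warnings.warn(
            "LanceDBScanOperator is deprecated; use LanceScanOperator instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)
```

Making the alias a subclass means `isinstance` checks against the new name still pass for objects created through the old one, and the alias can be deleted later without touching the new class.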
This is mostly a clarity issue, but it matters because Lance and LanceDB are distinct layers and the current naming makes the integration boundary harder to understand.