Eliminate DataFrame copies and double data loading in datashader path

timtreis · claude · timtreis · commit 1b195f03858e · 2026-03-27T14:47:17.000+01:00
Two additional performance fixes on top of the datashader speedups:

1. Replace .assign() + .rename() with direct column assignment when
   attaching the color column to the transformed element. Avoids two
   full DataFrame copies (~320 MB saved for 10M points).

2. Add preloaded_color_data parameter to _set_color_source_vec so
   _render_points can pass already-loaded color data from get_values()
   instead of triggering a redundant second load from the table.
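
A minimal sketch of the pattern behind fix 2, using hypothetical stand-ins
rather than the real spatialdata-plot code: resolve_color_source and
counting_loader are invented for illustration, and only the idea of an
optional preloaded argument comes from this commit.

```python
from __future__ import annotations

from collections.abc import Callable, Sequence

# Hypothetical stand-ins, not the spatialdata-plot API: `load_values` plays
# the role of an expensive table read, and `resolve_color_source` the role of
# a helper that needs the color values.


def resolve_color_source(
    key: str,
    load_values: Callable[[str], Sequence[float]],
    preloaded: Sequence[float] | None = None,
) -> list[float]:
    """Return color values for `key`, loading them only if none were passed."""
    # Compare against None explicitly: a truthiness check would raise for a
    # pandas Series and would wrongly reload for an empty sequence.
    values = preloaded if preloaded is not None else load_values(key)
    return list(values)


load_calls: list[str] = []


def counting_loader(key: str) -> list[float]:
    """Pretend to read from a table and record that a load happened."""
    load_calls.append(key)
    return [0.1, 0.5, 0.9]


# The caller has already loaded the values once for its own purposes ...
already_loaded = counting_loader("gene_a")
# ... and passes them on, so the helper does not load them a second time.
colors = resolve_color_source("gene_a", counting_loader, preloaded=already_loaded)
```

Defaulting the extra argument to None keeps existing callers unchanged: they
still go through the loader, and only callers that already hold the data skip
the second read.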

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
diff --git a/src/spatialdata_plot/pl/render.py b/src/spatialdata_plot/pl/render.py
@@ -796,8 +796,7 @@ def _render_points(
     # from the registered points (see above) avoids duplicate-origin ambiguities.
     color_table_name = table_name
 
-    # When color was already loaded from a table (line 690), pass it directly
-    # to avoid a redundant get_values() call inside _set_color_source_vec.
+    # Reuse color data already loaded from the table to avoid a redundant get_values() call.
     _preloaded = points_pd_with_color[col_for_color] if added_color_from_table and col_for_color is not None else None
 
     color_source_vector, color_vector, _ = _set_color_source_vec(