Integer columns mapped to fill/color are dropped by bar stat transform

## Summary

When an integer column is mapped to `fill` (or `color`) in a bar chart, ggsql's stat transform drops the column from the result, causing a validation error:

```
Validation error: Column 'fill' referenced in aesthetic 'fill' (layer 1 (global data)) does not exist.
Available columns: __ggsql_aes_pos1__, __ggsql_aes_pos2__, __ggsql_aes_pos2end__
```

## Reproducible example

### Rust (integration test style)

```rust
use ggsql::reader::{DuckDBReader, Reader};
use ggsql::writer::VegaLiteWriter;

#[test]
fn bar_keeps_integer_fill_column() {
    let reader = DuckDBReader::from_connection_string("duckdb://memory").unwrap();

    // Integer column (survived: 0/1) mapped to fill
    let spec = reader.execute(
        "SELECT *
         FROM (VALUES
           ('Male', 0), ('Male', 1), ('Female', 0), ('Female', 1),
           ('Male', 0), ('Male', 0), ('Female', 1), ('Female', 1)
         ) AS t(sex, survived)
         VISUALISE sex AS x, survived AS fill
         DRAW bar",
    );

    // Currently fails with: Column 'fill' referenced in aesthetic 'fill' ... does not exist
    assert!(spec.is_ok(), "Should handle integer fill: {:?}", spec.err());
}
```

### Python

```python
import ggsql
import polars as pl

reader = ggsql.DuckDBReader("duckdb://memory")
df = pl.DataFrame({
    "sex": ["Male", "Male", "Female", "Female", "Male", "Male", "Female", "Female"],
    "survived": [0, 1, 0, 1, 0, 0, 1, 1],
})
reader.register("titanic", df)

# Fails with validation error
spec = reader.execute("""
    SELECT * FROM titanic
    VISUALISE sex AS x, survived AS fill
    DRAW bar
""")
```

Note: adding `SCALE DISCRETE fill` or `SCALE fill RENAMING 0 => 'No', 1 => 'Yes'` doesn't help. RENAMING doesn't set a `scale_type`, so the discreteness check still falls through to the schema-based inference.

## Root cause

In `src/execute/schema.rs:171-172`, discreteness is determined purely by data type:

```rust
let is_discrete =
    matches!(dtype, DataType::String | DataType::Boolean) || dtype.is_categorical();
```

Integers are never considered discrete. The downstream effect:

1. `add_discrete_columns_to_partition_by` (`src/execute/mod.rs:677`) checks if a mapped column is discrete
2. Integer `survived` → not discrete → **not added** to `partition_by`
3. The bar stat transform (`src/plot/layer/geom/bar.rs:87`) builds `GROUP BY` from `partition_by` + x column
4. Since `survived` isn't in `partition_by`, it is left out of `GROUP BY` and dropped from the aggregation SQL
5. The resulting DataFrame only has the positional columns `__ggsql_aes_pos1__`, `__ggsql_aes_pos2__`, and `__ggsql_aes_pos2end__`
6. Writer validation fails because `fill` references a column that no longer exists

Note that `SCALE fill RENAMING ...` doesn't help: RENAMING doesn't set `scale.scale_type`, so `add_discrete_columns_to_partition_by` falls through to the schema check (lines 740-741), which still says "integer = not discrete."
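
The chain above can be modelled in a few lines. This is a simplified, self-contained sketch and not ggsql's actual code: `DataType`, `Mapping`, `is_discrete`, `partition_by`, and `group_by_clause` are hypothetical stand-ins that only mirror the logic described in the steps.

```rust
/// Hypothetical stand-in for the column types the schema check sees.
#[derive(Clone, Copy, PartialEq, Debug)]
enum DataType {
    String,
    Boolean,
    Categorical,
    Int64,
}

/// One aesthetic mapping, e.g. `survived AS fill`.
struct Mapping {
    aesthetic: &'static str,
    column: &'static str,
    dtype: DataType,
}

/// Mirrors the type-only rule: integers are never discrete.
fn is_discrete(dtype: DataType) -> bool {
    matches!(
        dtype,
        DataType::String | DataType::Boolean | DataType::Categorical
    )
}

/// Only discrete, non-positional mappings become partition columns.
fn partition_by(mappings: &[Mapping]) -> Vec<&'static str> {
    mappings
        .iter()
        .filter(|m| m.aesthetic != "x" && is_discrete(m.dtype))
        .map(|m| m.column)
        .collect()
}

/// The bar stat groups by the partition columns plus the x column.
fn group_by_clause(x: &str, partition: &[&str]) -> String {
    let mut cols: Vec<&str> = partition.to_vec();
    cols.push(x);
    format!("GROUP BY {}", cols.join(", "))
}
```

With `sex` (string) on `x` and `survived` (integer) on `fill`, `partition_by` returns an empty list and the clause becomes `GROUP BY sex`, so `survived` never reaches the aggregated result.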

## Inconsistency with ggplot2

In ggplot2, this works because **all mapped aesthetics contribute to grouping**, regardless of column type:

```r
library(ggplot2)
df <- data.frame(sex = c("Male", "Female", "Male", "Female"),
                 survived = c(0L, 1L, 0L, 1L))
# Works fine — survived (integer) is used for grouping in stat_count
ggplot(df, aes(x = sex, fill = survived)) + geom_bar()
```

ggplot2 treats the integer as continuous for _color scale_ purposes (producing a gradient), but still uses it for grouping in the stat transform. The grouping and the scale type are independent concerns.

## Possible approaches

### A) Aesthetic-based grouping

Certain aesthetics (`fill`, `color`, `shape`, `linetype`, `stroke`) inherently imply grouping. Any column mapped to these should be added to `partition_by` regardless of data type.

**Pros**: Targeted fix, only changes behavior for aesthetics where grouping is clearly intended.

**Cons**: Doesn't cover edge cases like mapping a numeric column to `opacity` in a bar chart. Requires maintaining a list of "grouping aesthetics."
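
A minimal sketch of what approach A could look like, assuming the aesthetic list from the proposal. `GROUPING_AESTHETICS`, `implies_grouping`, and `partition_columns` are hypothetical names, and the input is assumed to contain only non-positional mappings as `(aesthetic, column, is_discrete)` triples.

```rust
/// Hypothetical list of aesthetics that imply grouping (taken from the proposal).
const GROUPING_AESTHETICS: [&str; 5] = ["fill", "color", "shape", "linetype", "stroke"];

/// Returns true when a mapping to `aesthetic` should always partition the data.
fn implies_grouping(aesthetic: &str) -> bool {
    GROUPING_AESTHETICS.contains(&aesthetic)
}

/// Builds partition columns from (aesthetic, column, is_discrete) triples.
/// A column is kept if its type is discrete OR its aesthetic implies grouping.
fn partition_columns<'a>(mappings: &[(&str, &'a str, bool)]) -> Vec<&'a str> {
    let mut cols = Vec::new();
    for &(aesthetic, column, is_discrete) in mappings {
        if (is_discrete || implies_grouping(aesthetic)) && !cols.contains(&column) {
            cols.push(column);
        }
    }
    cols
}
```

Under this rule an integer `survived` mapped to `fill` is partitioned on, while a numeric column mapped to `opacity` is still dropped, which is the edge case named in the cons.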

### B) All non-positional mapped columns survive stat transforms

Every non-positional, non-stat-consumed aesthetic column gets added to `GROUP BY` for stat transforms, regardless of data type or aesthetic name.

**Pros**: Simpler logic, matches ggplot2's behavior most closely (where `group` is the interaction of all mapped discrete variables, but stat transforms preserve all mappings). No need to maintain a special list.

**Cons**: Broader change. It could affect behavior for intentionally continuous aesthetics, such as `opacity` mapped to a numeric column in a stat geom. In practice, though, including a continuous column in `GROUP BY` just means "don't aggregate it away," which is usually correct.
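
A comparable sketch for approach B, under the same caveats. `POSITIONAL`, `surviving_columns`, and the `stat_consumed` parameter are hypothetical; which aesthetics a given stat actually consumes is an assumption left to the caller.

```rust
/// Hypothetical positional aesthetics; their columns are handled by the stat itself.
const POSITIONAL: [&str; 2] = ["x", "y"];

/// Every non-positional mapped column that the stat does not consume
/// is kept in GROUP BY, regardless of data type or aesthetic name.
fn surviving_columns<'a>(
    mappings: &[(&str, &'a str)],
    stat_consumed: &[&str],
) -> Vec<&'a str> {
    let mut cols = Vec::new();
    for &(aesthetic, column) in mappings {
        let keep = !POSITIONAL.contains(&aesthetic) && !stat_consumed.contains(&aesthetic);
        if keep && !cols.contains(&column) {
            cols.push(column);
        }
    }
    cols
}
```

No data type is consulted here at all, which is what makes the rule simpler than approach A and also what makes it broader.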

### Additional consideration: RENAMING should imply discrete

Independently of the above, `SCALE fill RENAMING ...` should probably set or imply a discrete scale type. If you're providing explicit label mappings for specific values, discrete semantics are almost certainly intended.
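
One way this could be expressed, as a sketch only: `ScaleType`, `Scale`, and this `is_discrete` helper are hypothetical simplifications, and the real `scale.scale_type` field may be shaped differently. The idea is to resolve discreteness from the explicit type first, then from the presence of RENAMING entries, and only then from the schema.

```rust
/// Hypothetical, simplified scale description.
#[derive(Clone, Copy, PartialEq, Debug)]
enum ScaleType {
    Discrete,
    Continuous,
}

struct Scale {
    /// Explicit type from the query, if any.
    scale_type: Option<ScaleType>,
    /// (value, label) pairs from a RENAMING clause.
    renaming: Vec<(String, String)>,
}

/// Resolves discreteness: explicit type first, then RENAMING, then the schema.
fn is_discrete(scale: Option<&Scale>, schema_says_discrete: bool) -> bool {
    match scale {
        Some(s) => match s.scale_type {
            Some(t) => t == ScaleType::Discrete,
            // Explicit labels for specific values imply discrete semantics.
            None if !s.renaming.is_empty() => true,
            None => schema_says_discrete,
        },
        None => schema_says_discrete,
    }
}
```

With this ordering, `SCALE fill RENAMING 0 => 'No', 1 => 'Yes'` on an integer column would resolve to discrete even though the schema check alone says otherwise, while an explicit continuous scale still wins.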
