small features: add option to save cache in parquet, save judge input…#35

Merged
geoalgo merged 1 commit into main from small_features on Apr 13, 2026

Conversation

Collaborator

@geoalgo geoalgo commented Apr 8, 2026

…, improve error handling of openrouter, remove compute_cohen_kappa

Reasons:

  • saving in Parquet is more storage-efficient
  • saving the judge input keeps all the context used to make the judge call and makes it possible to estimate its cost
  • the error handling in OpenRouter becomes a bit less hacky
  • Cohen's kappa is available in scikit-learn; we should use that implementation instead (it produces the same values)
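To illustrate the last point, replacing a local compute_cohen_kappa with scikit-learn's implementation could look like the sketch below. The label values are made up for demonstration; only cohen_kappa_score itself is the real scikit-learn API.

```python
# Hypothetical preference labels from two judges; only
# sklearn.metrics.cohen_kappa_score is real scikit-learn API.
from sklearn.metrics import cohen_kappa_score

judge_a = ["A", "B", "A", "tie", "B"]
judge_b = ["A", "B", "B", "tie", "B"]

# Chance-corrected inter-rater agreement between the two judges.
kappa = cohen_kappa_score(judge_a, judge_b)
print(round(kappa, 3))  # 0.688
```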

@geoalgo geoalgo requested a review from ErlisLushtaku April 8, 2026 09:50
completion_A: str # completion of the first model
completion_B: str # completion of the second model
judge_completion: str # output of the judge
judge_input: str | None = None # input that was passed to the judge
Collaborator

Should this be added to the estimate_elo_ratings.py workflow as well?

Collaborator Author

It uses judge_and_parse_prefs from this file, so it should be updated as well, if I am not mistaken.

Collaborator

@ErlisLushtaku ErlisLushtaku commented Apr 11, 2026

Yes, but we are dropping it here because we are constructing the DataFrame manually.

Collaborator Author

Good catch, I will change the ELO-rating workflow to also store the input.

return x

for col in df.select_dtypes(include="object").columns:
    df[col] = df[col].apply(_to_python).astype(str)
Collaborator

I think we should be careful here if the DataFrame can contain missing values. Calling .astype(str) on missing values (None or np.nan) converts them into the strings "None" and "nan". When the Parquet file is read back, they would be treated as strings instead of missing values.
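A minimal demonstration of the issue (the column name is illustrative):

```python
# .astype(str) turns a missing value into the literal string "None",
# so missingness cannot be recovered after a Parquet round trip.
import pandas as pd

df = pd.DataFrame({"judge_completion": ["ok", None]})
converted = df["judge_completion"].astype(str)

print(converted.tolist())      # ['ok', 'None']
print(converted.isna().any())  # False -- the missing value is gone
```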

Collaborator Author

Yes, I agree, but I don't see another way to serialize to Parquet. This conversion loses the missingness information, but I think all downstream code should probably exclude empty strings too when computing annotations.

Collaborator

I searched a bit and I think we have two safe solutions:

Option 1: Parquet + JSON Sidecar
We selectively stringify only the complex objects (like dicts/lists) and leave None/NaN untouched by filtering, something like this:

df_cache[col] = df_cache[col].apply(
    lambda x: x if x is None or (isinstance(x, float) and np.isnan(x)) else repr(_to_python(x))
)

and save a small .meta.json tracking which columns were altered. Then, when reading, we call ast.literal_eval only on cells that are not missing and belong to a serialized column.

Option 2: Compressed Pickle (.pkl.gz)
We could switch to df.to_pickle(cache_file, compression="gzip"). It would be simpler, but probably less storage-efficient.

Otherwise we could leave it like this and note it for later, if you think this is not a big issue at the moment.
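A sketch of what Option 1's round trip could look like (the column name and the `_is_missing` helper are illustrative, not from the actual codebase):

```python
# Write side stringifies only non-missing complex objects with repr();
# read side applies ast.literal_eval only to those stringified cells,
# so None survives the round trip unchanged.
import ast
import numpy as np
import pandas as pd

def _is_missing(x):
    # Hypothetical helper covering the None/NaN cases discussed above.
    return x is None or (isinstance(x, float) and np.isnan(x))

df = pd.DataFrame({"judge_input": [{"prompt": "hi"}, None]})

serialized = df["judge_input"].apply(
    lambda x: x if _is_missing(x) else repr(x)
)
restored = serialized.apply(
    lambda x: x if _is_missing(x) else ast.literal_eval(x)
)

print(restored.tolist())  # [{'prompt': 'hi'}, None]
```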

Collaborator Author

Option 1 seems like overkill given that the option is for a utility which is deactivated by default.
Option 2 is a strong no, as we would be trading this issue for a security issue (and we would also lose Parquet's benefit of columnar storage compression).

I would be keen to merge as is, possibly adding a comment.
Alternatively, I am also fine with dropping the feature and keeping it in my branch.

Collaborator

Then we could merge it as is so we can use it too 👍

@geoalgo geoalgo merged commit 31dc7a3 into main Apr 13, 2026
1 check passed
@geoalgo geoalgo deleted the small_features branch April 13, 2026 12:32