Update fieldset ingestion to use convert modules by VeckoTheGecko · Pull Request #40 · Parcels-code/parcels-benchmarks

VeckoTheGecko · 2026-03-18T15:36:28Z

OK - I've gone ahead and updated the ingestion code here so that it's in line with Parcels-code/Parcels#2549. We are closer to having a working benchmark suite, but unfortunately we're not there yet. I propose that we merge this anyway, as it gets us closer to the end goal.

MOI error

Currently ingestion works, but we get an error during the execution itself (note this PR now closes #33, as ingestion works and we now have a different error).

pixi run setup-data
pixi run asv run --bench moi_curvilinear.MOICurvilinear.time_pset_execute_3d

· Creating environments
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For parcels commit be625b01 <update-convert>:
[ 0.00%] ·· Benchmarking rattler-py3.12-intake-xarray
[50.00%] ··· Running (moi_curvilinear.MOICurvilinear.time_pset_execute_3d--).
[100.00%] ··· ...vilinear.MOICurvilinear.time_pset_execute_3d             failed
[100.00%] ··· ============== =============
              --             chunk / npart
              -------------- -------------
               interpolator   256 / 10000 
              ============== =============
                 XLinear         failed   
              ============== =============
              For parameters: 'XLinear', 256, 10000
              /Users/Hodgs004/coding/repos/parcels-benchmarks/benchmarks/moi_curvilinear.py:2: UserWarning: This is an alpha version of Parcels v4. The API is not stable and may change without deprecation warnings.
                import parcels
              /Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/parcels/convert.py:126: UserWarning: No depth dimension found in your dataset. Assuming no depth (i.e., surface data).
                warnings.warn("No depth dimension found in your dataset. Assuming no depth (i.e., surface data).", stacklevel=1)
              Traceback (most recent call last):
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.pixi/envs/default/lib/python3.12/site-packages/asv/benchmark.py", line 99, in <module>
                  main()
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.pixi/envs/default/lib/python3.12/site-packages/asv/benchmark.py", line 91, in main
                  commands[mode](args)
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/asv_runner/run.py", line 72, in _run
                  result = benchmark.do_run()
                           ^^^^^^^^^^^^^^^^^^
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/asv_runner/benchmarks/_base.py", line 661, in do_run
                  return self.run(*self._current_params)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/asv_runner/benchmarks/time.py", line 165, in run
                  samples, number = self.benchmark_timing(
                                    ^^^^^^^^^^^^^^^^^^^^^^
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/asv_runner/benchmarks/time.py", line 258, in benchmark_timing
                  timing = timer.timeit(number)
                           ^^^^^^^^^^^^^^^^^^^^
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/timeit.py", line 180, in timeit
                  timing = self.inner(it, self.timer)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "<timeit-src>", line 6, in inner
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/asv_runner/benchmarks/time.py", line 90, in func
                  self.func(*param)
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/benchmarks/moi_curvilinear.py", line 76, in time_pset_execute_3d
                  self.pset_execute_3d(interpolator, chunk, npart)
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/benchmarks/moi_curvilinear.py", line 71, in pset_execute_3d
                  pset.execute(
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/parcels/_core/particleset.py", line 435, in execute
                  self._kernel.execute(self, endtime=next_time, dt=dt)
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/parcels/_core/kernel.py", line 245, in execute
                  error_func(pset[inds].z, pset[inds].lat, pset[inds].lon)
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/parcels/_core/statuscodes.py", line 44, in _raise_field_interpolation_error
                  raise FieldInterpolationError(f"Field interpolation returned NaN at (z={z}, lat={y}, lon={x})")
              parcels._core.statuscodes.FieldInterpolationError: Field interpolation returned NaN at (z=array([0., 0., 0., ..., 0., 0., 0.], shape=(10000,), dtype=float32), lat=array([-30.   , -29.999, -29.998, ..., -20.002, -20.001, -20.   ],
                    shape=(10000,), dtype=float32), lon=array([-10.      ,  -9.998   ,  -9.995999, ...,   9.995999,   9.998   ,
                      10.      ], shape=(10000,), dtype=float32))

FESOM error

Here we get an error on the selection of the interpolator. This is a bug upstream in Parcels: this dataset has dims ('time', 'nz1', 'elem', 'nod2', 'nz'), but _select_uxinterpolator doesn't expect these dimension names and so isn't able to determine the right interpolators. AFAICT this problem was always here for this dataset. Let me know what you think @fluidnumerics-joe.

pixi run setup-data
pixi run asv run --bench 'fesom2.*'

· Creating environments
· Discovering benchmarks
· Running 3 total benchmarks (1 commits * 1 environments * 3 benchmarks)
[ 0.00%] · For parcels commit be625b01 <update-convert>:
[ 0.00%] ·· Benchmarking rattler-py3.12-intake-xarray
[33.33%] ··· Running (fesom2.FESOM2.time_load_data--)..
[66.67%] ··· fesom2.FESOM2.peakmem_pset_execute                          failed
[66.67%] ··· ======= ============================
             --               integrator         
             ------- ----------------------------
              npart   <function AdvectionRK2_3D> 
             ======= ============================
              10000             failed           
             ======= ============================
             For parameters: 10000, <function AdvectionRK2_3D>
             /Users/Hodgs004/coding/repos/parcels-benchmarks/benchmarks/fesom2.py:4: UserWarning: This is an alpha version of Parcels v4. The API is not stable and may change without deprecation warnings.
               from parcels import (
             INFO: Using known vertical dimension mapping: 'nz' (interfaces) and 'nz1' (centers).
             INFO: Renaming vertical dimensions: {'nz': 'zf', 'nz1': 'zc'}
             INFO: cf_xarray found variable 'w' with CF standard name 'w' in dataset, renamed it to 'W' for Parcels simulation.
             INFO: cf_xarray found variable 'unod' with CF standard name 'unod' in dataset, renamed it to 'U' for Parcels simulation.
             INFO: cf_xarray found variable 'vnod' with CF standard name 'vnod' in dataset, renamed it to 'V' for Parcels simulation.
             Traceback (most recent call last):
               File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.pixi/envs/default/lib/python3.12/site-packages/asv/benchmark.py", line 99, in <module>
                 main()
               File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.pixi/envs/default/lib/python3.12/site-packages/asv/benchmark.py", line 91, in main
                 commands[mode](args)
               File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/asv_runner/run.py", line 72, in _run
                 result = benchmark.do_run()
                          ^^^^^^^^^^^^^^^^^^
               File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/asv_runner/benchmarks/_base.py", line 661, in do_run
                 return self.run(*self._current_params)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/asv_runner/benchmarks/peakmem.py", line 66, in run
                 self.func(*param)
               File "/Users/Hodgs004/coding/repos/parcels-benchmarks/benchmarks/fesom2.py", line 57, in peakmem_pset_execute
                 self.pset_execute(npart, integrator)
               File "/Users/Hodgs004/coding/repos/parcels-benchmarks/benchmarks/fesom2.py", line 45, in pset_execute
                 fieldset = FieldSet.from_ugrid_conventions(ds)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/parcels/_core/fieldset.py", line 215, in from_ugrid_conventions
                 fields["U"] = Field("U", ds["U"], grid, _select_uxinterpolator(ds["U"]))
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/parcels/_core/field.py", line 132, in __init__
                 assert_same_function_signature(interp_method, ref=ZeroInterpolator, context="Interpolation")
               File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/parcels/_python.py", line 26, in assert_same_function_signature
                 sig = inspect.signature(f)
                       ^^^^^^^^^^^^^^^^^^^^
               File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/inspect.py", line 3348, in signature
                 return Signature.from_callable(obj, follow_wrapped=follow_wrapped,
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/inspect.py", line 3085, in from_callable
                 return _signature_from_callable(obj, sigcls=cls,
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/inspect.py", line 2522, in _signature_from_callable
                 raise TypeError('{!r} is not a callable object'.format(obj))
             TypeError: None is not a callable object

[83.33%] ··· fesom2.FESOM2.time_load_data                                    ok
[83.33%] ··· ======= ============================
             --               integrator         
             ------- ----------------------------
              npart   <function AdvectionRK2_3D> 
             ======= ============================
              10000           96.6±0.9ms         
             ======= ============================

[100.00%] ··· fesom2.FESOM2.time_pset_execute                             failed
[100.00%] ··· ======= ============================
              --               integrator         
              ------- ----------------------------
               npart   <function AdvectionRK2_3D> 
              ======= ============================
               10000             failed           
              ======= ============================
              For parameters: 10000, <function AdvectionRK2_3D>
              /Users/Hodgs004/coding/repos/parcels-benchmarks/benchmarks/fesom2.py:4: UserWarning: This is an alpha version of Parcels v4. The API is not stable and may change without deprecation warnings.
                from parcels import (
              INFO: Using known vertical dimension mapping: 'nz' (interfaces) and 'nz1' (centers).
              INFO: Renaming vertical dimensions: {'nz': 'zf', 'nz1': 'zc'}
              INFO: cf_xarray found variable 'w' with CF standard name 'w' in dataset, renamed it to 'W' for Parcels simulation.
              INFO: cf_xarray found variable 'unod' with CF standard name 'unod' in dataset, renamed it to 'U' for Parcels simulation.
              INFO: cf_xarray found variable 'vnod' with CF standard name 'vnod' in dataset, renamed it to 'V' for Parcels simulation.
              Traceback (most recent call last):
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.pixi/envs/default/lib/python3.12/site-packages/asv/benchmark.py", line 99, in <module>
                  main()
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.pixi/envs/default/lib/python3.12/site-packages/asv/benchmark.py", line 91, in main
                  commands[mode](args)
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/asv_runner/run.py", line 72, in _run
                  result = benchmark.do_run()
                           ^^^^^^^^^^^^^^^^^^
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/asv_runner/benchmarks/_base.py", line 661, in do_run
                  return self.run(*self._current_params)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/asv_runner/benchmarks/time.py", line 165, in run
                  samples, number = self.benchmark_timing(
                                    ^^^^^^^^^^^^^^^^^^^^^^
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/asv_runner/benchmarks/time.py", line 258, in benchmark_timing
                  timing = timer.timeit(number)
                           ^^^^^^^^^^^^^^^^^^^^
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/timeit.py", line 180, in timeit
                  timing = self.inner(it, self.timer)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "<timeit-src>", line 6, in inner
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/asv_runner/benchmarks/time.py", line 90, in func
                  self.func(*param)
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/benchmarks/fesom2.py", line 54, in time_pset_execute
                  self.pset_execute(npart, integrator)
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/benchmarks/fesom2.py", line 45, in pset_execute
                  fieldset = FieldSet.from_ugrid_conventions(ds)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/parcels/_core/fieldset.py", line 215, in from_ugrid_conventions
                  fields["U"] = Field("U", ds["U"], grid, _select_uxinterpolator(ds["U"]))
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/parcels/_core/field.py", line 132, in __init__
                  assert_same_function_signature(interp_method, ref=ZeroInterpolator, context="Interpolation")
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/site-packages/parcels/_python.py", line 26, in assert_same_function_signature
                  sig = inspect.signature(f)
                        ^^^^^^^^^^^^^^^^^^^^
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/inspect.py", line 3348, in signature
                  return Signature.from_callable(obj, follow_wrapped=follow_wrapped,
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/inspect.py", line 3085, in from_callable
                  return _signature_from_callable(obj, sigcls=cls,
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/Users/Hodgs004/coding/repos/parcels-benchmarks/.asv/env/44a3d831dbbb05d5c17212900f2e92b0/lib/python3.12/inspect.py", line 2522, in _signature_from_callable
                  raise TypeError('{!r} is not a callable object'.format(obj))
              TypeError: None is not a callable object
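
For anyone reading the traceback above: the TypeError is what inspect.signature raises when it is handed None, which is consistent with the selector returning None for a dimension layout it doesn't recognise. Below is a minimal sketch of that failure mode and of a stricter variant that fails with a clearer message. Everything here is hypothetical and for illustration only: linear_interp, KNOWN_LAYOUTS, select_interpolator, the dimension names in the table, and the per-variable dims are invented, and none of it is the actual Parcels implementation.

```python
import inspect


def linear_interp(field, position):
    """Placeholder interpolator; name and signature are invented for this sketch."""
    return 0.0


# Hypothetical lookup table keyed on the dimension names a selector recognises.
KNOWN_LAYOUTS = {
    frozenset({"time", "zc", "n_face"}): linear_interp,
    frozenset({"time", "zf", "n_node"}): linear_interp,
}


def select_interpolator(dims):
    """Return an interpolator for `dims`, or None when no layout matches."""
    return KNOWN_LAYOUTS.get(frozenset(dims))


def select_interpolator_strict(dims):
    """Same lookup, but raise a descriptive error instead of returning None."""
    interp = select_interpolator(dims)
    if interp is None:
        known = sorted(sorted(layout) for layout in KNOWN_LAYOUTS)
        raise ValueError(
            f"No interpolator known for dims {tuple(dims)}; known layouts: {known}"
        )
    return interp


# Assumed per-variable dims using names from the dataset (not verified).
fesom_dims = ("time", "nz1", "elem")

# The permissive selector silently returns None, and the failure only shows up
# later, when something tries to inspect the (missing) callable.
try:
    inspect.signature(select_interpolator(fesom_dims))
except TypeError as err:
    silent_failure = str(err)
    print("late, confusing failure:", silent_failure)

# The strict selector fails at the point of selection and names the dims.
try:
    select_interpolator_strict(fesom_dims)
except ValueError as err:
    print("early, descriptive failure:", err)
```

Raising at selection time would not fix the underlying naming mismatch, but it would point at the unrecognised dims directly rather than at inspect.py.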

Future work

Better testing

I'm finding it quite difficult to debug all of this since it relies on heavy datasets, and the iteration loop using asv run is very frustrating: run the benchmark, find an error (which I can't easily open using pdb, since that has poor integration with asv), recreate the error using a normal Python script, realise the bug is in Parcels, etc.
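
As a stopgap for the "recreate the error using a normal Python script" step, here is a rough sketch of a helper that drives an asv-style benchmark class directly, so a failure can be inspected with pdb. It only relies on the asv conventions of a params attribute, an optional setup/teardown, and a benchmark method that takes the parameter values. run_benchmark_directly is a hypothetical name, and the sketch ignores asv features such as setup_cache, timeouts and repeat counts.

```python
import itertools
import pdb


def run_benchmark_directly(bench_cls, method_name, debug=False):
    """Call one benchmark method for every parameter combination, outside asv.

    Returns a list of (params, exception_or_None) tuples.
    """
    bench = bench_cls()
    params = list(getattr(bench, "params", []))
    # asv accepts a flat list for a single parameter, or a list of lists
    # for several parameters; normalise to the list-of-lists form.
    if params and not isinstance(params[0], (list, tuple)):
        params = [params]

    results = []
    for combo in itertools.product(*params):
        setup = getattr(bench, "setup", None)
        teardown = getattr(bench, "teardown", None)
        if setup is not None:
            setup(*combo)
        try:
            getattr(bench, method_name)(*combo)
            results.append((combo, None))
        except Exception as exc:
            if debug:
                # Drop into the debugger at the frame that raised.
                pdb.post_mortem(exc.__traceback__)
            results.append((combo, exc))
        finally:
            if teardown is not None:
                teardown(*combo)
    return results


# Possible usage from the repo root, assuming the benchmark module is importable:
#     from benchmarks.moi_curvilinear import MOICurvilinear
#     run_benchmark_directly(MOICurvilinear, "time_pset_execute_3d", debug=True)
```

This doesn't replace asv for timing, it just shortens the loop between seeing a failure and getting a debugger on it.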

At the moment we have the following (which can be thought of as a pyramid - from the top to the most foundational):

This benchmarks repo
The datasets generated and used in Parcels

There is, however, the possibility of a layer in between:

This benchmarks repo
Datasets generated from real coordinates and metadata, but with fake array data.
- Almost akin to generating datasets from CDL (i.e., ncdump output). We need to build a small amount of custom tooling around this since xarray doesn't provide it (see "Adding CDL Parser/open_cdl?", pydata/xarray#6269; ds.to_dict(data=False) is close to what we need, but excludes coordinates).
The datasets generated and used in Parcels
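
To make the intermediate layer a bit more concrete, here is a rough sketch of the kind of tooling I have in mind. It takes a metadata-only description shaped like the output of ds.to_dict(data=False) (dims, attrs, dtype and shape per variable, as far as I understand that format), fills the data variables with a constant, and attaches real coordinate values supplied separately. fill_fake_data and zeros are hypothetical helper names, and the example metadata below is made up rather than taken from a real dataset.

```python
import copy


def zeros(shape, fill=0.0):
    """Build a nested list of the given shape filled with `fill`."""
    if not shape:
        return fill
    return [zeros(shape[1:], fill) for _ in range(shape[0])]


def fill_fake_data(meta, coord_values, fill=0.0):
    """Return a copy of `meta` with a "data" entry added to every variable.

    Coordinates get the real values from `coord_values`; data variables get
    constant fake data with the recorded shape.
    """
    out = copy.deepcopy(meta)
    for name, var in out.get("coords", {}).items():
        if name not in coord_values:
            raise KeyError(f"no coordinate values supplied for {name!r}")
        var["data"] = coord_values[name]
    for var in out.get("data_vars", {}).values():
        var["data"] = zeros(tuple(var["shape"]), fill)
    return out


# Made-up metadata in the assumed to_dict(data=False) layout.
meta = {
    "dims": {"time": 1, "lat": 2},
    "attrs": {"title": "example"},
    "coords": {
        "lat": {"dims": ("lat",), "attrs": {"units": "degrees_north"},
                "dtype": "float32", "shape": (2,)},
    },
    "data_vars": {
        "U": {"dims": ("time", "lat"), "attrs": {"units": "m s-1"},
              "dtype": "float32", "shape": (1, 2)},
    },
}

fake = fill_fake_data(meta, coord_values={"lat": [-30.0, -20.0]})
print(fake["data_vars"]["U"]["data"])
```

The resulting dict has the "dims" and "data" entries that xarray.Dataset.from_dict expects, so the small metadata files plus coordinate values could be turned back into lightweight datasets inside the test suite.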

This intermediate layer is hinted at in our "Participating in the issue tracker: 'Parcels doesn't work with my data'" doc page section, but I think it can be formalised and also extended to the coordinates (as those are also quite important). Regarding implementation, we can create a separate repo to host these small files (similar to https://github.com/Parcels-code/parcels-data ).

This intermediate layer has the following benefits:

Improved testing of the "convert" module using realistic metadata
Lightweight (can be integrated into our main test suite)
Easy to debug

Keen to hear your thoughts @erikvansebille .

Better cataloguing

From https://discourse.pangeo.io/t/data-pipelining-and-cataloging-best-practices-using-intake-xarray-to-transform-and-combine-data-metadata/5550/6 , I think we can streamline how we ingest data (by using Intake 2 in combination with the convert module or in combination with uxarray). Honestly, this is a low priority - I'm happy with what we have at the moment.

The important thing from my POV is the "Better testing" above, as that will flag any errors with our convert module.


VeckoTheGecko · 2026-04-07T09:53:19Z

Future work: Better testing

I'm going to get started setting the groundwork on this - keen to discuss if either of you have ideas so this can be further refined :)

VeckoTheGecko · 2026-04-07T13:29:59Z

Here we get an error on the selection of the interpolator. This is a bug upstream in Parcels: this dataset has dims ('time', 'nz1', 'elem', 'nod2', 'nz'), but _select_uxinterpolator doesn't expect these dimension names and so isn't able to determine the right interpolators.

Joe mentions that we have a convert function for this

VeckoTheGecko added 2 commits March 18, 2026 10:39

Update code

2e86ec5

Update fieldset ingestion

d678e03

VeckoTheGecko mentioned this pull request Mar 18, 2026

Update NEMO ingestion code Parcels-code/Parcels#2549

Merged

pre-commit-ci bot and others added 6 commits March 18, 2026 15:37

[pre-commit.ci] auto fixes from pre-commit.com hooks

6e84d4a

for more information, see https://pre-commit.ci

Comment out list_datasets, update readme

1d07d48

Add typer and ty

351c470

remove typer

a2c10a2

Switch to local Parcels install

3372708

Update README

bf540bc

VeckoTheGecko mentioned this pull request Mar 27, 2026

Streamline parcels-benchmarks #42

Merged

VeckoTheGecko added 3 commits April 2, 2026 15:48

Merge branch 'main' into vecko-update

a5e2ef1

Complete the catalogs

c6699b6

Update MOI loading

b50f901

VeckoTheGecko wants to merge 11 commits into main from vecko-update
