Skip to content

Modifications to ForecastTrajectoryGapFiller#2369

Open
gavinevans wants to merge 6 commits into
metoppv:masterfrom
gavinevans:mobt_783_temporal_interpolation
Open

Modifications to ForecastTrajectoryGapFiller#2369
gavinevans wants to merge 6 commits into
metoppv:masterfrom
gavinevans:mobt_783_temporal_interpolation

Conversation

@gavinevans

Copy link
Copy Markdown
Contributor

Related to https://github.com/metoppv/mo-blue-team/issues/783

Description
This PR makes some edits to the ForecastTrajectoryGapFiller, mainly for the aim of being able to interpolate accumulation fields, like instantaneous fields e.g. interpolating a T+3 1 hour precipitation accumulation field, from a fields at T+2 and T+4, without an assumption that the interpolation is actually an attempt to disaggregate an accumulation field e.g. disaggregating a 3 hour precipitation accumulation field into 1 hour precipitation accumulation field.

This PR therefore provides a pathway through the ForecastTrajectoryGapFiller (and TemporalInterpolation) plugins for accumulation field interpolation, modifies bounds handling of the accumulation fields, and converts quite a lot of time-based processing in the ForecastTrajectoryGapFiller to use seconds to consistency with the coordinates being handled.

Testing:

  • Ran tests and they passed OK
  • Added new tests for the new feature(s)

@gavinevans gavinevans force-pushed the mobt_783_temporal_interpolation branch from d4a7391 to 8b2f876 Compare April 28, 2026 14:48
…_interpolation

* upstream/master: (24 commits)
  houndCI removal (metoppv#2393)
  Add the Ensemble Copula Coupling "transformation" approach (metoppv#2303)
  Ensure blend_time is consistent with forecast_reference_time when resetting (metoppv#2392)
  Refine realization coordinate handling when clustering (metoppv#2361)
  Mobt1205 better accum temporal interp (metoppv#2390)
  Refactor pollen forecast period setting (metoppv#2385)
  Changes to how we do distance_to calculations (metoppv#2379)
  Mobt 1138 ignore git hash cli (metoppv#2388)
  Ensure that the SubperiodSelector plugin and CLI returns a cube with the model_id_attr, if requested. (metoppv#2386)
  Add option to update a scalar coordinate attribute as part of the StandardiseMetadata plugin (metoppv#2378)
  Allow cube-extraction to use a formatted time string (metoppv#2380)
  Removing a name from CONTRIBUTING.md (metoppv#2384)
  Support more complex forecast period filtering when training QRF (metoppv#2383)
  Mobt1161 precip phase decision tree (metoppv#2376)
  MOBT-1148 Add subperiod-selector tool (metoppv#2373)
  Revert "change ApplyDecisionTree categorical cube dtype from int32 to int16 (…" (metoppv#2382)
  change ApplyDecisionTree categorical cube dtype from int32 to int16 (metoppv#2371)
  Changed implementation from clipping to masking + added Unit Tests (metoppv#2366)
  Adding kwarg as CLI argument & updating acceptance tests. (metoppv#2377)
  Refactor Pollen index for daily and hourly to single plugin (metoppv#2372)
  ...
@gavinevans gavinevans marked this pull request as ready for review June 10, 2026 08:59

@maxwhitemet maxwhitemet left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I've added 3 comments on clarity and 1 on simplifying a code block.

Comment on lines +115 to +116
combined with accumulation, max, or min period-type specifiers.
Default is False.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused and may have missed something: the min, max, and accumulation parameters are not exposed here. Comments:

  • Should these parameters be exposed here?
  • It may be good to add backticks around 'max', 'min', and 'accumulation' (max, min, accumulation) in-text so it's clear they are parameters.
  • Alternatively to point 2, perhaps merge these 3 parameters into one parameter with each of the former being an optional argument (e.g. period_type: Literal['accumulation','min','max'] = None. This feels clearer and may simplify some of the error raising.

Comment on lines 1022 to 1023
interpolation_window_in_minutes:
Time window (in minutes) as +/- range around forecast source transitions.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused by the interpolation_window_in_minutes parameter, having not read the code for a while, Could some clarification be added here – perhaps an example – on the difference between interval_in_minutes and interpolation_window_in_minutes.

I understand the former as 'supplying 60 means I expect 60min between all provided cubes, so if I provide a cube valid at 12PM and another valid at 2PM, an intermediate cube should be added at 1PM'. An example such as this, with the interpolation_window_in_minutes on top could be helpful.

Comment on lines 1603 to 1604
missing_periods = self._identify_gaps(sorted_cubelist)
periods_to_regenerate = self._identify_periods_to_regenerate(sorted_cubelist)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may have missed this: is there handling for when there are no gaps? E.g. skipping if missing_periods is empty. I think this happens further down but this would be a good place for it.

Comment on lines 1133 to +1140
min_period = existing_periods[0]
max_period = existing_periods[-1]
missing_periods = []
current = min_period + self.interval_in_minutes
current = min_period + self.interval_in_seconds
while current < max_period:
if current not in existing_periods:
missing_periods.append(current)
current += self.interval_in_minutes
current += self.interval_in_seconds

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the output of `_get_forecast_periods()' is a 'Sorted list of unique forecast periods in seconds, then this chunk could be simplified by finding the set difference. E.g.

existing_periods = set(self._get_forecast_periods(cubelist))
min_period = min(existing_periods)
max_period = max(existing_periods)

possible_periods_given_interval = set(range(min_period, max_period + self.interval_in_seconds, self.interval_in_seconds))
missing_periods = sorted(possible_periods_given_interval - existing_periods)

@maxwhitemet maxwhitemet left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I've added 3 comments on clarity and 1 on simplifying a code block.

@mo-jbeaver mo-jbeaver left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments, but overall happy with the changes. All the tests passed successfully.

treat_period_as_instantaneous:
If True, period diagnostics (inputs with time bounds) are treated
as instantaneous values for interpolation. No period-specific
renormalisation or max/min constraints are applied.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe needs to be rephrased to clarify that it will error if one of accumulation, min, or max inputs are also provided.

Comment on lines 1123 to 1127
ValueError: If interval_in_minutes is not set.
"""
if self.interval_in_minutes is None:
if self.interval_in_seconds is None:
raise ValueError(
"interval_in_minutes must be set to identify gaps in forecast period."

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a mismatch of interval_in_seconds and interval_in_minutes here

"""Create interpolation tasks for periods to regenerate at regular intervals.

Instead of regenerating only at the transition point, generates forecasts
at regular intervals (specified by interval_in_minutes) across the entire

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the initial variable is interval_in_minutes, but the check and code only uses interval_in_seconds .

Comment on lines +289 to +295
{
"interval_in_minutes": 60,
"accumulation": True,
"treat_period_as_instantaneous": True,
},
"treat_period_as_instantaneous cannot be combined",
), # Conflicting period handling modes

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be repeated for max and min too?

Comment on lines +830 to +831
np.testing.assert_almost_equal(result[0].data, 4.0)
np.testing.assert_almost_equal(result[1].data, 7.0)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A short explanation may be need for how these numbers are determined

Comment on lines +514 to +520
Expected behavior:
- Regeneration fills only interior 1-hour intervals in window: 4, 5, 6, 7, 8
(3 and 9 are boundary inputs and are not regenerated)
- Gap filling fills remaining missing intervals: 10, 11
- Original cubes included: 3, 9, 12

Result: [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good explanation for the difference between gap and regeneration fills. Maybe consider putting this in the plugin to.

@mo-jbeaver mo-jbeaver removed their assignment Jun 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants