
Check input file workflow and refactor replace #28

@atteggiani


Overview

Currently (version 1.0.1), the hres_ic.py and hres_eccb.py scripts expect input files ($ICFILE and $ECCBFILE, respectively) whose names end with the .tmp suffix.

However, this suffix is trimmed in the internal scripts replace_landsurface_with_ERA5land_IC.py and replace_landsurface_with_BARRA2R_IC.py.

Specifically, the path used as input (ff_in) has .tmp removed, while the output path (ff_out) keeps it.
Example from the replace_landsurface_with_ERA5land_IC.py script:

ff_in = ic_file_fullpath.as_posix().replace('.tmp', '')
# Path to output file
ff_out = ic_file_fullpath.as_posix()

This means the actual filepath used for the input file is the one without .tmp suffix.

Additionally, the script doesn't preserve the "original" file. It instead overwrites it in 2 steps:

  1. First it writes a file with the same path as the input but with .tmp appended.
  2. Then, it moves the file by trimming the appended .tmp (effectively replacing the original input file).
    These 2 steps run back to back: the write is the last line of one function, and the move is the first line of the function called immediately afterwards.

Solution

I suggest that the hres_eccb.py and hres_ic.py scripts take the original initial/external condition file paths (without .tmp) as their input.
If the original file needs to be preserved (which is not the case at the moment), I suggest renaming it before the new output is written.

This simplifies the code overall.
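
A minimal sketch of the proposed workflow, under stated assumptions: `run_replacement`, the `keep_original` flag and the `.orig` backup name are all hypothetical, and the byte upper-casing is only a stand-in for the real land-surface replacement.

```python
import tempfile
from pathlib import Path


def run_replacement(ic_file: Path, keep_original: bool = False) -> None:
    """Hypothetical sketch: take the real file path as input and update it in place."""
    source = ic_file
    if keep_original:
        # Rename the original before any new output is written (assumed '.orig' name).
        source = ic_file.with_name(ic_file.name + ".orig")
        ic_file.rename(source)

    # Write to a temporary sibling first, then move it over the final path.
    tmp_out = ic_file.with_name(ic_file.name + ".tmp")
    # Placeholder for the actual replacement logic.
    tmp_out.write_bytes(source.read_bytes().upper())
    tmp_out.replace(ic_file)


# Small self-contained demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmpdir:
    ic_file = Path(tmpdir) / "astart"
    ic_file.write_bytes(b"original")
    run_replacement(ic_file, keep_original=True)
    new_content = ic_file.read_bytes()
    backup_content = (Path(tmpdir) / "astart.orig").read_bytes()
    leftover_tmp = (Path(tmpdir) / "astart.tmp").exists()
```

With this shape, callers never need to know about the .tmp name: it stays an internal detail of the function.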

Other note

On top of that, I suggest replacing the statement filepath.replace('.tmp', '') (or similar) with different logic.
This is because the str.replace(oldvalue, newvalue) method in Python replaces all occurrences of oldvalue with newvalue, not just the last one.
This would cause problems for an input whose name contains the string .tmp in the middle (for example, a file called input.tmp_astart).
What we want instead is to trim only the suffix we create (in this case .tmp) from the end of the path.
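
To make the pitfall concrete, here is a small demonstration. The path is hypothetical: it is the example name above with the .tmp suffix appended.

```python
# Hypothetical path: '.tmp' appears mid-name and again as the appended suffix.
filepath = "input.tmp_astart.tmp"

# str.replace removes every occurrence, so the original name is mangled too.
replaced = filepath.replace(".tmp", "")

# What we actually want: only the trailing '.tmp' dropped.
wanted = "input.tmp_astart"

print(replaced)            # input_astart
print(replaced == wanted)  # False
```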

I suggest the new logic could be something like:

new_filepath = filepath[:-4] if filepath.endswith('.tmp') else filepath
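
The one-liner above could be wrapped in a small helper so that the suffix length is not hard-coded. This is only a sketch; `strip_tmp_suffix` is a hypothetical name, not an existing function in the package.

```python
def strip_tmp_suffix(filepath: str, suffix: str = ".tmp") -> str:
    """Return filepath without a trailing suffix; other occurrences are left alone."""
    if suffix and filepath.endswith(suffix):
        return filepath[: -len(suffix)]
    return filepath


# Only the trailing '.tmp' is removed.
trimmed = strip_tmp_suffix("input.tmp_astart.tmp")
# A path that does not end in '.tmp' is returned unchanged.
untouched = strip_tmp_suffix("input.tmp_astart")

print(trimmed)    # input.tmp_astart
print(untouched)  # input.tmp_astart
```

On Python 3.9 or later, the built-in `str.removesuffix('.tmp')` gives the same result without a helper.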
