This release introduces a dedicated ogcapi module that consolidates all
OGC API-based web service clients under a single, extensible framework.
The module exposes a new OGCAPIBase base class that handles pagination,
CQL filtering, geometry queries, feature ID lookups, bounding-box queries,
User-Agent identification, optional API-key injection, and automatic
cache eviction for error responses. The existing GeoConnex and
FabricData classes have been refactored to inherit from OGCAPIBase,
and a new NWIS class has been added to access the USGS Water Data OGC
API (https://api.waterdata.usgs.gov/ogcapi/v0) as USGS migrates its
services to the OGC API standard.
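
As a rough illustration of the pagination handling mentioned above: OGC API Features responses advertise further pages through a link whose relation is "next". The sketch below follows those links using a caller-supplied fetch function. The names `iter_features` and `fetch` are hypothetical and not part of pynhd, and the in-memory pages stand in for real HTTP responses.

```python
from collections.abc import Callable, Iterator
from typing import Any


def iter_features(
    url: str, fetch: Callable[[str], dict[str, Any]]
) -> Iterator[dict[str, Any]]:
    """Yield features from every page, following rel="next" links."""
    next_url = url
    while next_url is not None:
        page = fetch(next_url)
        yield from page.get("features", [])
        # Pick the first link whose relation is "next", if there is one.
        next_url = next(
            (lk["href"] for lk in page.get("links", []) if lk.get("rel") == "next"),
            None,
        )


# Two fake pages standing in for HTTP responses (hypothetical URLs).
PAGES = {
    "page1": {
        "features": [{"id": "a"}, {"id": "b"}],
        "links": [{"rel": "next", "href": "page2"}],
    },
    "page2": {"features": [{"id": "c"}], "links": []},
}
feature_ids = [f["id"] for f in iter_features("page1", PAGES.__getitem__)]
```

Passing the fetch function in keeps the paging logic independent of the HTTP client, which also makes it easy to exercise without network access.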
- Add NWIS class for accessing the new USGS Water Data OGC API (https://api.waterdata.usgs.gov/ogcapi/v0). USGS is migrating its web services to the OGC API standard; this class provides access to monitoring locations, daily/continuous observations, field/channel measurements, and reference code tables. The class reads the USGS_API_KEY environment variable automatically for higher rate limits.
- Add NLDI.get_characteristics_byid method for retrieving local, total, or divergence-routed catchment characteristics for a specific feature directly from the NLDI API.
- Add trim_tolerance parameter to NLDI.navigate_byid and NLDI.navigate_bybox for controlling how aggressively the first flowline is trimmed when trim_start=True.
- Add StreamCat.changelog, StreamCat.data_dictionary, and StreamCat.all_metrics_bycomid methods to expose additional StreamCat API endpoints.
- Move GeoConnex and FabricData from core to the new dedicated ogcapi module alongside the new NWIS class.
- Introduce OGCAPIBase as a reusable base class for all OGC API services. GeoConnex, FabricData, and NWIS now inherit from it, making it straightforward to add further OGC API-based services.
- OGCAPIBase now sends a User-Agent header (pynhd/<version>) on every request so API providers can identify traffic from HyRiver.
- FabricData and NWIS read the USGS_API_KEY environment variable when no explicit api_key is passed.
- Error responses (e.g., rate-limit or server errors) are automatically evicted from the HTTP cache so they do not persist across retries.
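
The header and API-key behavior described above could look roughly like the following sketch. It is not the OGCAPIBase implementation: `build_headers` is a hypothetical helper, and sending the key in an "X-Api-Key" header is an assumption made only for illustration.

```python
import os
from collections.abc import Mapping
from typing import Optional


def build_headers(
    version: str,
    api_key: Optional[str] = None,
    env: Mapping = os.environ,
) -> dict:
    """Build request headers with a User-Agent and an optional API key."""
    headers = {"User-Agent": f"pynhd/{version}"}
    # Fall back to the environment variable only when no key is passed.
    key = api_key if api_key is not None else env.get("USGS_API_KEY")
    if key:
        # Assumption: the key travels in an "X-Api-Key" header.
        headers["X-Api-Key"] = key
    return headers


fake_env = {"USGS_API_KEY": "key-from-env"}
from_env = build_headers("0.0.0", env=fake_env)
explicit = build_headers("0.0.0", api_key="explicit-key", env=fake_env)
anonymous = build_headers("0.0.0", env={})
```

An explicitly passed key takes precedence over the environment, which matches the "when no explicit api_key is passed" wording.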
- Fix type annotations across core, network_tools, nhdplus_derived, ogcapi, pygeoapi, and pynhd modules to pass strict Pyright type-checking.
- Add the development version of the NLDI web service to the NLDI class. This version is not stable and is intended for testing purposes only. The default version is still the production version.
- Fix the issue with normalizing variables in the streamcat function caused by changes in the StreamCat web service.
- Use orjson instead of ujson since ujson is no longer maintained. The developer of ujson raised concerns about security vulnerabilities and recommended using orjson instead.
- Create two new modules called nldi and pygeoapi for better organization of the codebase.
- Remove NLDI.getcharacteristic_byid method since its endpoint will be removed from the NLDI service. The characteristics can still be accessed via the NLDI.get_characteristics method, which only requires specifying the characteristic names and optionally the NHDPlus ComIDs. This method calls the pynhd.nhdplus_attrs_s3 function internally.
- Switch to using the new StreamCat web service link. While the public API of the pynhd.streamcat function did not change, the web service itself might return different results, and metric names have been changed. Thus, this change might affect the results of the function and is considered a breaking change.
- Add a new optional argument to pynhd.nhdplus_attrs_s3 to pass a PyArrow Expression for filtering the query.
- Switch to using the new NLDI web service link. Note that NLDI web service now has a rate limit of 3600 requests per hour per IP.
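
A limit of 3600 requests per hour averages out to one request per second. A minimal client-side throttle under that reading might look like the sketch below. The `Throttle` class is hypothetical and not part of pynhd, and spacing calls evenly is stricter than an hourly quota strictly requires.

```python
import time


class Throttle:
    """Space out calls so that at most `max_per_hour` happen per hour."""

    def __init__(self, max_per_hour: int = 3600) -> None:
        self.min_interval = 3600.0 / max_per_hour  # seconds between calls
        self._last = None  # time of the previous call, if any

    def wait_time(self, now: float) -> float:
        """Return how long to sleep before a call made at time `now`."""
        if self._last is None:
            return 0.0
        return max(0.0, self.min_interval - (now - self._last))

    def acquire(self) -> None:
        """Sleep if needed, then record the call time."""
        delay = self.wait_time(time.monotonic())
        if delay > 0:
            time.sleep(delay)
        self._last = time.monotonic()


throttle = Throttle()
```

Calling `throttle.acquire()` before each request keeps a long-running batch job under the limit, at the cost of not allowing short bursts.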
- In GeoConnex, set request_type='POST' for all CQL queries. While non-spatial CQL queries are working, spatial CQL queries are still not working due to an issue with the GeoConnex service. For now, it's recommended to use the byfilter method for most of the queries, including spatial queries. For simple spatial queries, you can use the bybox method then filter the results based on the actual geometry.
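
The "bounding box first, exact geometry second" workaround can be sketched with plain coordinates as follows. This is only an illustration of the two-step pattern; the helper names are hypothetical, the data are made up, and in practice the second step would more likely be done with a geometry library than with a hand-written point-in-polygon test.

```python
def bbox_of(polygon):
    """Return (xmin, ymin, xmax, ymax) of a list of (x, y) vertices."""
    xs, ys = zip(*polygon)
    return min(xs), min(ys), max(xs), max(ys)


def in_bbox(pt, bbox):
    x, y = pt
    xmin, ymin, xmax, ymax = bbox
    return xmin <= x <= xmax and ymin <= y <= ymax


def in_polygon(pt, polygon):
    """Ray-casting test; points exactly on the boundary are not handled specially."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
points = {"a": (1.0, 1.0), "b": (3.0, 3.0), "c": (5.0, 1.0)}

# Step 1: coarse filter, the kind of result a bounding-box query returns.
bbox = bbox_of(triangle)
candidates = {k: p for k, p in points.items() if in_bbox(p, bbox)}

# Step 2: keep only candidates that fall inside the actual geometry.
matches = sorted(k for k, p in candidates.items() if in_polygon(p, triangle))
```

Here point "b" passes the bounding-box step but is dropped by the exact test, which is the case the second step exists to catch.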
- Replace the links to NLDI and PyGeoAPI web services with their new URLs.
- Add two new methods to GeoConnex class for querying: bybox and byfilter. Note that CQL query is still not working due to an issue with the GeoConnex service. For now, it's recommended to use the byfilter method for most of the queries, including spatial queries. For simple spatial queries, you can use the bybox method then filter the results based on the actual geometry.
- Drop support for Python 3.8 since its end-of-life date is October 2024.
- Remove all exceptions from the main module and raise them from the exceptions module. This is to declutter the public API and make it easier to maintain.
- The function pynhd.streamcat can now be called without any arguments to get a dataframe of all available metrics and their descriptions.
- Add the exceptions module to the high-level API to declutter the main module. In the future, all exceptions will be raised from this module and not from the main module. For now, the exceptions are raised from both modules for backward compatibility.
- Switch to using the src layout instead of the flat layout for the package structure. This is to make the package more maintainable and to avoid any potential conflicts with other packages.
- Add artifact attestations to the release workflow.
- Add support for LakeCat dataset in the streamcat function. A new argument called lakes_only is added to the function. If set to True, only metrics for lakes and their associated catchments will be returned. The default is False to retain backward compatibility.
- Modify the HP3D class based on the latest changes to the 3D Hydrography Program service. The hydrolocation layer now has three sub-layers:
  - hydrolocation_waterbody for Sink, Spring, Waterbody Outlet
  - hydrolocation_flowline for Headwater, Terminus, Divergence, Confluence, Catchment Outlet
  - hydrolocation_reach for Reach Code, External Connection
- EPA's HMS no longer supports the StreamCat dataset, since there is a dedicated service for it. Thus, the epa_nhd_catchments function no longer accepts "streamcat" as an input for the feature argument. For all StreamCat queries, use the streamcat function instead. Now, the epa_nhd_catchments function is essentially useful for getting Curve Number data.
- In NLDI.get_basins, the indices used to be station IDs, but in the previous release they were reset by mistake. This version restores the correct indices.
- In the nhdplus_l48 function, when the layer is NHDFlowline_Network or NHDFlowline_NonNetwork, merge all MultiLineString geometries to LineString.
- Fix an issue in network_xsection and flowline_xsection related to the changes in shapely 2 API. Now, these functions should return the correct cross-sections.
- Add access to USGS 3D Hydrography Program (3DHP) service. The new class is called HP3D. It can be queried by IDs, geometry, or SQL where clause.
- Add support for the new PyGeoAPI endpoint called xsatpathpts. This new endpoint is useful for getting the elevation profile along a shapely.LineString. You can use the pygeoapi function with service="elevation_profile" (or the PyGeoAPI class) to access this new endpoint. Previously, the elevation_profile endpoint was used for getting the elevation profile along a path from two endpoints, and the input GeoDataFrame had to be a MultiPoint with two coordinates. Now, the input must contain LineString geometries.
- Switch to using the new smoothing algorithm from pygeoutils for resampling the flowlines and getting their cross-sections. This new algorithm is more robust, accurate, and faster. It has a new argument called smoothing for controlling the number of knots of the spline. Higher values result in smoother curves. The default value is None which uses all the points from the input flowline.
- Update GeoConnex based on the latest changes in the web service.
- Fix HyRiver libraries requirements by specifying a range instead of an exact version so conda-forge can resolve the dependencies.
From release 0.15 onward, all minor versions of HyRiver packages
will be pinned. This ensures that previous minor versions of HyRiver
packages cannot be installed with later minor releases. For example,
if you have py3dep==0.14.x installed, you cannot install
pydaymet==0.15.x. This is to ensure that the API is
consistent across all minor versions.
- Add a new function, called nhdplus_h12pp, for retrieving HUC12 pour points across CONUS.
- Add use_arrow=True to pynhd.nhdplus_l48 when reading the NHDPlus dataset. This speeds up the process since pyarrow is installed.
- In nhdplus_l48, make layer optional so the sql parameter of pyogrio.read_dataframe can also be used. This is necessary since pyogrio.read_dataframe does not support passing both layer and sql parameters.
- Update the mainstems dataset link to version 2.0 in mainstem_huc12_nx.
- Expose NHDTools class to the public API.
- For now, retain compatibility with shapely<2 while supporting shapely>=2.
- Remove unnecessary conversion of id_col and toid_col to Int64 in nhdflw2nx and vector_accumulation. This ensures that the input data types are preserved.
- Fix an issue in nhdplus_l48, where if the input data_dir is not absolute, py7zr fails to extract the file.
- Rewrite the GeoConnex class to provide access to new capabilities of the web service. Support for spatial queries has been added via CQL queries. For more information, check out the updated GeoConnex example notebook.
- Add a new property to StreamCat, called metrics_df, that gets a dataframe of metric names and their descriptions.
- Create a new private StreamCatValidator class to avoid polluting the public StreamCat class with private attributes and methods. Moreover, add a new alternative metric names attribute to StreamCat called alt_names for handling those metric names that do not follow the METRIC+YYYY convention. This attribute is a dictionary that maps the alternative names to the actual metric names, so users can use the METRIC_NAME column of metrics_df and add a year suffix from the valid_years attribute of StreamCat to get the actual metric name.
- In navigate_by* functions of NLDI, add stop_comid, which is another criterion for stopping the navigation in addition to distance.
- Improve UserWarning messages of NLDI and WaterData.
- Remove the pynhd.geoconnex function, since enough functionality has been added to the GeoConnex service that this function no longer makes sense. All queries should be done via the pynhd.GeoConnex class.
- Rewrite NLDI to improve code readability and significantly improve performance. Now, its methods do not return tuples if there are failed requests; instead, the failures are shown as a UserWarning.
- Bump the minimum required version of shapely to 2.0, and use its new API.
- Sync all minor versions of HyRiver packages to 0.14.0.
- Update the link to version 2.0 of the ENHD dataset in enhd_attrs.
- Improve column data types in enhd_attrs and nhdplus_vaa by using int32 instead of Int64, where applicable.
- Sync all patch versions of HyRiver packages to x.x.12.
- The prepare_nhdplus function now supports NHDPlus HR in addition to NHDPlus MR. It automatically detects the NHDPlus version based on the ID column name: nhdplusid for HR and comid for MR.
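
The detection rule described above can be sketched as a small helper. The function name is hypothetical, and matching column names case-insensitively is an assumption of this sketch rather than documented behavior.

```python
def detect_nhdplus_version(columns):
    """Return "HR" or "MR" based on the ID column name."""
    # Assumption: column names are compared case-insensitively.
    names = {str(c).lower() for c in columns}
    if "nhdplusid" in names:
        return "HR"
    if "comid" in names:
        return "MR"
    raise ValueError("Expected an 'nhdplusid' (HR) or 'comid' (MR) column.")


hr = detect_nhdplus_version(["nhdplusid", "geometry"])
mr = detect_nhdplus_version(["COMID", "geometry"])
```

With a dataframe, the same check would be applied to its column labels, for example `detect_nhdplus_version(df.columns)`.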
- Fully migrate setup.cfg and setup.py to pyproject.toml.
- Convert relative imports to absolute with absolufy-imports.
- Improve performance of prepare_nhdplus by using pandas.merge instead of applying a function to each row of the dataframe.
- Add support for EPA's new StreamCat RESTful API with around 600 NHDPlus catchment-level metrics. One class, called StreamCat, is added for getting the service properties such as valid metrics. You can use the streamcat function to get the metrics as a pandas.DataFrame.
- Refactor the show_versions function to improve performance and print the output in a nicer table-like format.
- Skip version 0.13.9 so that the minor versions of all HyRiver packages are the same.
- Modify the codebase based on the latest changes in geopandas related to empty dataframes.
- Add a new function, called nhdplus_attrs_s3, for accessing the recently released NHDPlus derived attributes on a USGS S3 bucket. The attributes are provided in parquet files, so getting them is faster than nhdplus_attrs. Also, you can request multiple attributes at once, whereas in nhdplus_attrs you had to request each attribute one at a time. This function will replace nhdplus_attrs in a future release, as soon as all data that are available on the ScienceBase version are also accessible from the S3 bucket.
- Add two new functions called mainstem_huc12_nx and enhd_flowlines_nx. These functions generate a networkx directed graph object of NHD HUC12 water boundaries and flowlines, respectively. They also return a dictionary mapping of COMID and HUC12 to the corresponding networkx node. Additionally, a topologically sorted list of COMIDs/HUC12s is returned. The generated data are useful for doing US-scale network analysis and flow accumulation on the NHD network. The NHD graph has about 2.7 million edges and the mainstem HUC12 graph has about 80K edges.
- Add a new function for getting the entire NHDPlus dataset for CONUS (Lower 48), called nhdplus_l48. The entire NHDPlus dataset is downloaded from here. This 7.3 GB file will take a while to download, depending on your internet connection. The first time you run this function, the file will be downloaded and stored in the ./cache directory. Subsequent calls will use the cached file. Moreover, there are two additional dependencies for using this function: pyogrio and py7zr. These dependencies can be installed using pip install pyogrio py7zr or conda install -c conda-forge pyogrio py7zr.
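
To show why a topologically sorted list of IDs is useful for flow accumulation, here is a toy sketch. It uses the standard-library graphlib module rather than networkx, and the five-node network and its values are made up for illustration.

```python
from graphlib import TopologicalSorter

# Toy network: each key flows into its value (None marks the outlet).
downstream = {1: 3, 2: 3, 3: 5, 4: 5, 5: None}
local_flow = {1: 1.0, 2: 2.0, 3: 0.5, 4: 1.5, 5: 1.0}

# TopologicalSorter expects node -> predecessors, i.e. upstream neighbors.
upstream = {node: set() for node in downstream}
for node, to_node in downstream.items():
    if to_node is not None:
        upstream[to_node].add(node)

order = list(TopologicalSorter(upstream).static_order())

# Visiting nodes upstream-first guarantees that a node's total is final
# before it is passed on to its downstream neighbor.
accumulated = dict(local_flow)
for node in order:
    to_node = downstream[node]
    if to_node is not None:
        accumulated[to_node] += accumulated[node]
```

Because the ordering is computed once, the accumulation itself is a single pass over the nodes, which is what makes this approach practical on graphs with millions of edges.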
- Refactor vector_accumulation for significant performance improvements.
- Modify the codebase based on Refurb suggestions.
- Add a new function called epa_nhd_catchments to access one of the EPA's HMS endpoints called WSCatchment. You can use this function to access 414 catchment-scale characteristics for all the NHDPlus catchments including 16-day average curve number. More information on the curve number dataset can be found at its project page here.
- Fix a bug in NHDTools where, due to the recent changes in pandas exception handling, NHDTools fails to convert columns with NaN values to integer type. Now, pandas throws IntCastingNaNError instead of TypeError when using the astype method on a column.
- Use pyupgrade package to update the type hinting annotations to Python 3.10 style.
- Add the missing PyPI classifiers for the supported Python versions.
- Append "Error" to all exception classes for conforming to PEP-8 naming conventions.
- Bump the minimum versions of pygeoogc and pygeoutils to 0.13.5 and that of async-retriever to 0.3.5.
- Fix an issue in nhdplus_vaa and enhd_attrs functions where, if the cache folder did not exist, it was not created, resulting in an error.
- Use the new async_retriever.stream_write function to download files in nhdplus_vaa and enhd_attrs functions. This is more memory efficient.
- Convert the type of list of not found items in NLDI.comid_byloc and NLDI.feature_byloc to list of tuples of coordinates from list of strings. This matches the type of returned not found coordinates to that of the inputs.
- Fix an issue with NLDI that was caused by the recent changes in the NLDI web service's error handling. The NLDI web service now returns more descriptive error messages in a json format instead of returning the usual status errors.
- Slice the ENHD dataframe in NHDTools.clean_flowlines before updating the flowline dataframe to reduce the required memory for the update operation.
- Set the minimum supported version of Python to 3.8 since many of the dependencies such as xarray, pandas, rioxarray have dropped support for Python 3.7.
- Use micromamba for running tests and use nox for linting in CI.
- Add support for all the GeoConnex web service endpoints. There are two ways to use it. For a single query, you can use the geoconnex function and for multiple queries, it's more efficient to use the GeoConnex class.
- Add support for passing any of the supported NLDI feature sources to the get_basins method of the NLDI class. The default is nwissite to retain backward compatibility.
- Set the type of "ReachCode" column to str instead of int in pygeoapi and nhdplus_vaa functions.
- Add two new functions called flowline_resample and network_resample for resampling a flowline or network of flowlines based on a given spacing. This is useful for smoothing jagged flowlines similar to those in the NHDPlus database.
- Add support for the new NLDI endpoint called "hydrolocation". The NLDI class now has two methods for getting features by coordinates: feature_byloc and comid_byloc. The feature_byloc method returns the flowline that is associated with the closest NHDPlus feature to the given coordinates. The comid_byloc method returns a point on the closest downstream flowline to the given coordinates.
- Add a new function called pygeoapi for calling the API in batch mode. This function accepts the input coordinates as a geopandas.GeoDataFrame. It is more performant than calling its counterpart PyGeoAPI multiple times. It's recommended to switch to using this new batch function instead of the PyGeoAPI class. Users just need to prepare an input data frame that has all the required service parameters as columns.
- Add a new step to prepare_nhdplus to convert MultiLineString to LineString.
- Add support for the simplified flag of NLDI's get_basins function. The default value is True to retain the old behavior.
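
The idea of resampling a flowline at a fixed spacing can be illustrated with plain linear interpolation along a polyline. This is a simplified sketch, not the algorithm used by flowline_resample; the `resample` helper and the coordinates are hypothetical.

```python
import math


def resample(coords, spacing):
    """Return points every `spacing` units along a polyline, plus its last vertex."""
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    out = [coords[0]]
    dist_to_next = spacing  # distance left until the next sample
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        seg_len = math.hypot(x2 - x1, y2 - y1)
        pos = 0.0  # distance already covered on this segment
        while seg_len - pos >= dist_to_next:
            pos += dist_to_next
            t = pos / seg_len
            out.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
            dist_to_next = spacing
        # Carry the unused remainder of this segment into the next one.
        dist_to_next -= seg_len - pos
    if out[-1] != coords[-1]:
        out.append(coords[-1])
    return out


line = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0)]
samples = resample(line, 3.0)
```

Distances are measured along the line, so the second sample sits 3 units past the first even though the path turns a corner in between.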
- Remove caching-related arguments from all functions since now they can be set globally via three environmental variables:

  - HYRIVER_CACHE_NAME: Path to the caching SQLite database.
  - HYRIVER_CACHE_EXPIRE: Expiration time for cached requests in seconds.
  - HYRIVER_CACHE_DISABLE: Disable reading/writing from/to the cache file.

  You can do this like so:

      import os

      os.environ["HYRIVER_CACHE_NAME"] = "path/to/file.sqlite"
      os.environ["HYRIVER_CACHE_EXPIRE"] = "3600"
      os.environ["HYRIVER_CACHE_DISABLE"] = "true"

- Add a new class called NHD for accessing the latest National Hydrography Dataset. More info regarding this data can be found here.
- Add two new functions for getting cross-sections along a single flowline via flowline_xsection or throughout a network of flowlines via network_xsection. You can specify spacing and width parameters to control their location. For more information and examples please consult the documentation.
- Add a new property to AGRBase called service_info to include some useful info about the service including feature_types which can be handy for converting numeric values of types to their string equivalent.
- Use the new PyGeoAPI API.
- Refactor prepare_nhdplus for improving the performance and robustness of determining tocomid within a network of NHD flowlines.
- Add empty geometries that NLDI.getbasins returns to the list of not found IDs. This is because the NLDI service does not include non-network flowlines and instead returns an empty geometry for these flowlines. (#48)
- Use the three new ar.retrieve_* functions instead of the old ar.retrieve function to improve type hinting and to make the API more consistent.
- Revert to the original PyGeoAPI base URL.
- Rewrite ScienceBase to make it applicable for working with other ScienceBase items. A new function, called stage_nhdplus_attrs, has been added for staging the Additional NHDPlus attributes items.
- Refactor AGRBase to remove unnecessary functions and make them more general.
- Update PyGeoAPI class to conform to the new pygeoapi API. This web service is undergoing some changes at the time of this release, so the API is not stable and might not work as expected. As soon as the web service is stable, a new version will be released.
- In WaterData.byid, show a warning if there are any missing feature IDs that are requested but are not available in the dataset.
- For all by* methods of WaterData, throw a ZeroMatched exception if no features are found.
- Add expire_after and disable_caching arguments to all functions that use async_retriever. Set the default request caching expiration time to never expire. You can use disable_caching if you don't want to use the cached responses. Please refer to documentation of the functions for more details.
- Refactor prepare_nhdplus to reduce code complexity by grouping all the NHDPlus tools as a private class.
- Modify AGRBase to reflect the latest API changes in pygeoogc.ArcGISRESTfull class.
- Refactor prepare_nhdplus by creating a private class that includes all the previously used private functions. This will make the code more readable and easier to maintain.
- Add all the missing types so mypy --strict passes.
- Add a new argument to NLDI.get_basins called split_catchment that, if set to True, will split the basin geometry at the watershed outlet.
- Catch service errors in PyGeoAPI and show useful error messages.
- Use importlib-metadata for getting the version instead of pkg_resources to decrease import time as discussed in this issue.
- More robust handling of inputs and outputs of NLDI's methods.
- Use an alternative download link for NHDPlus VAA file on Hydroshare.
- Restructure the codebase to reduce the complexity of the pynhd.py file by dividing it into three files: pynhd, with all classes that provide access to the supported web services; core, which includes base classes; and nhdplus_derived, which has functions for getting databases that provide additional attributes for the NHDPlus database.
- Add support for PyGeoAPI. It offers four functionalities: flow_trace, split_catchment, elevation_profile, and cross_section.
- Add a function for getting all NHD FCodes as a data frame, called nhd_fcode.
- Improve prepare_nhdplus function by removing all coastlines and better detection of the terminal point in a network.
- Migrate to using AsyncRetriever for handling communications with web services.
- Catch the ConnectionError separately in NLDI and raise a ServiceError instead, so the user knows that data cannot be returned because the server is out of service, not because of ZeroMatched.
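
The distinction drawn above between a service outage and an empty result can be sketched as follows. The two exception classes are local stand-ins that only borrow the names used in this changelog, and `get_features` and `fetch` are hypothetical.

```python
class ServiceError(Exception):
    """Stand-in: the web service could not be reached."""


class ZeroMatched(Exception):
    """Stand-in: the service responded but nothing matched."""


def get_features(fetch, query):
    """Call `fetch(query)` and translate failures into distinct exceptions."""
    try:
        features = fetch(query)
    except ConnectionError as ex:
        # A connection problem is an outage, not an empty result.
        raise ServiceError(f"Service unavailable for {query!r}") from ex
    if not features:
        raise ZeroMatched(f"No features matched {query!r}")
    return features


def offline(query):
    """Simulate a server that is out of service."""
    raise ConnectionError("connection refused")


found = get_features(lambda q: [q.upper()], "abc")
```

Chaining with `from ex` keeps the original connection error available on the new exception for debugging.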
- Add nhdplus_vaa to access NHDPlus Value Added Attributes for all its flowlines.
- To see a list of available layers in NHDPlus HR, you can instantiate its class without passing any argument like so: NHDPlusHR().
- Drop support for Python 3.6 since many of the dependencies such as xarray and pandas have done so.
- Use persistent caching for all requests which can help speed up network responses significantly.
- Improve documentation and testing.
- Add an announcement regarding the new name for the software stack, HyRiver.
- Improve pip installation and release workflow.
- The first release after renaming hydrodata to PyGeoHydro.
- Make mypy checks more strict and fix all the errors and prevent possible bugs.
- Speed up CI testing by using mamba and caching.
- Bump version to the same version as PyGeoHydro.
- Add a new function for getting basin geometries for a list of USGS station IDs. The function is a method of the NLDI class called get_basins. So, now the NLDI.getfeature_byid function does not have a basin flag. This change makes getting geometries easier and faster.
- Remove characteristics_dataframe method from NLDI and make a standalone function called nhdplus_attrs for accessing NHDPlus attributes directly from ScienceBase.
- Add support for using hydro or edits web services for getting NHDPlus High-Resolution using the NHDPlusHR function. The new arguments are service, which accepts hydro or edits, and the autos_switch flag for automatically switching to the other service if the one passed by service fails.
- Add a new argument to topoogical_sort called edge_attr that allows adding attribute(s) to the returned Networkx Graph. By default, it is None.
- A new base class, AGRBase, for connecting to ArcGISRESTful-based services such as National Map and EPA's WaterGEOS.
- Add support for setting the buffer distance for the input geometries to AGRBase.bygeom.
- Add comid_byloc to NLDI class for getting ComIDs of the closest flowlines from a list of lon/lat coordinates.
- Add bydistance to WaterData for getting features within a given radius of a point.
- Rewrote the NLDI function to use API v3 of the NLDI service.
- The crs argument of WaterData is now the target CRS of the output dataframe. The service CRS is now EPSG:4269 for all the layers.
- Remove the url_only argument of NLDI since it's not applicable anymore.
- Added support for NHDPlus High Resolution for getting features by geometry, IDs, or SQL where clause.
- The following functions are added to NLDI:
  - getcharacteristic_byid: Getting characteristics of NHDPlus catchments.
  - navigate_byloc: Getting the nearest ComID to a coordinate and performing navigation.
  - characteristics_dataframe: Getting all the available catchment-scale characteristics as a data frame.
  - get_validchars: Getting a list of available characteristic IDs for a specified characteristic type.
- The following functions are added to WaterData:
  - byfilter: Getting data based on any valid CQL filter.
  - bygeom: Getting data within a geometry (polygon and multipolygon).
- Add support for Python 3.9 and tests for Windows.
- Refactored WaterData to fix the CRS inconsistencies (#1).
- Replaced simplejson with orjson to speed up JSON operations.
- Add show_versions function for showing versions of the installed deps.
- Improve documentation
- Improved documentation
- Refactored WaterData to improve readability.
- First release on PyPI.