Commit 92a715f

Merge pull request #166 from vlahm/master
mostly EML updates
2 parents 453a858 + 5a0dab5 commit 92a715f

87 files changed

Lines changed: 2778 additions & 104 deletions

.RData

-1.33 MB
Binary file not shown.

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -27,4 +27,7 @@ data/general/nhd_hr
 output/
 old_dataset_dirs.tar.gz
 old_logs
+eml/data_links
+eml/eml_out
 vault/*
+.Rdata

eml/additional_info.txt

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+Glossary of Terms (*as used in this documentation)
+
+watershed
+All land area contributing runoff to a point of interest along a stream.
+Does not necessarily account for inputs from subsurface flow or human-constructed diversions.
+We avoid the terms "catchment" and "basin," though they are sometimes used in this way.
+site*
+An individual gauging station or stream sampling location and its watershed.
+domain*
+One or more sites under common management.
+network*
+One or more domains under common funding/leadership.
+provider*
+Also sometimes referred to as a "source" -- a primary source of data assimilated into MacroSheds.
+May be a network, domain, or third party.
+product*
+A collection of data, possibly including multiple datasets/tables. Providers may separate products by
+temporal extent/interval, scientific category, detection method, and/or sampling location.
+prodname
+One of the 7 core product categories included in the MacroSheds dataset. See below.
+prodcode
+An alphanumeric string associated with a product. Providers have their own. MacroSheds uses its own scheme internally.
+This won't be relevant for most users, but see detailed documentation included with core data downloads for more information.
+site-product, site-year, etc.
+Terms like these are used to designate various subdivisions of the overall MacroSheds dataset.
+A site-product, for example, is the collection of all data for a single MacroSheds product, available at a single site.
+
+MacroSheds data are organized into the following products:
+
+discharge
+Streamflow; water volume over time; reported in L/s.
+stream chemistry
+Concentration of chemical constituents in stream water; reported in mg/L or mEq/L.
+stream flux
+Mass of chemical constituents in stream water, per watershed area, over time; reported in kg/ha/d.
+(not currently included with this dataset, but can be generated via the macrosheds package for R)
+precipitation
+Rainfall, snowfall, or both combined; reported per watershed in mm.
+precipitation chemistry
+Concentration of chemical constituents in precipitation; reported in mg/L or mEq/L; averaged across watershed area.
+precipitation flux
+Mass of chemical constituents in precipitation, per watershed area, over time; reported in kg/ha/d.
+(not currently included with this dataset, but can be generated via the macrosheds package for R)
+watershed attributes
+Areal watershed summary statistics; the available variables are common to all MacroSheds sites.
+
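The flux products defined above combine discharge (L/s) with concentration (mg/L) and normalize by watershed area. The unit arithmetic can be sketched roughly as follows. This is only an illustration: the function name `daily_flux_kg_ha_d` is hypothetical, it assumes one daily mean discharge and one daily mean concentration stand in for the whole day, and it is not the method used by the macrosheds package for R.

```python
SECONDS_PER_DAY = 86_400
MG_PER_KG = 1_000_000


def daily_flux_kg_ha_d(discharge_l_s, conc_mg_l, area_ha):
    """Convert daily mean discharge and concentration to flux in kg/ha/d.

    Hypothetical helper: assumes the daily means represent the full day.
    """
    if area_ha <= 0:
        raise ValueError("area_ha must be positive")
    litres_per_day = discharge_l_s * SECONDS_PER_DAY       # L/s -> L/d
    mass_kg_per_day = litres_per_day * conc_mg_l / MG_PER_KG  # mg/d -> kg/d
    return mass_kg_per_day / area_ha                        # kg/d -> kg/ha/d


if __name__ == "__main__":
    # 10 L/s at 2 mg/L from a 100 ha watershed
    print(daily_flux_kg_ha_d(10, 2, 100))
```

The same arithmetic applies to precipitation flux if precipitation depth in mm is first converted to a water volume over the watershed area.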

eml/eml_templates/abstract.txt

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+There exist hundreds of long-term watershed ecosystem monitoring efforts from which solute fluxes can be calculated. While details of instrumentation and sampling methods vary between studies, the types of data collected and the questions that motivate their analysis are often remarkably similar. Nevertheless, little effort toward the compilation of these datasets has previously been made, and comparative watershed analyses have remained limited in scale. The MacroSheds project has developed a flexible, future-friendly system for continually harmonizing daily time series of streamflow, precipitation, and solute chemistry from 168+ watershed studies throughout the U.S. and beyond, and supplementing each with a comprehensive set of predictive watershed attributes. The MacroSheds dataset is an unprecedented resource for watershed ecosystem science, and for hydrology, as a small-watershed supplement to existing collections of streamflow predictors, like CAMELS and GAGES-II. MacroSheds is accompanied by a web dashboard for visualization (macrosheds.org) and an R package (https://github.com/MacroSHEDS/macrosheds) for local analysis.
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+-----
+Glossary of Terms (*as used in this documentation)
+-----
+
+watershed - All land area contributing runoff to a point of interest along a stream. Does not necessarily account for inputs from subsurface flow or human-constructed diversions. We avoid the terms "catchment" and "basin," though they are sometimes used in this way.
+
+site* - An individual gauging station or stream sampling location and its watershed.
+
+domain* - One or more sites under common management.
+
+network* - One or more domains under common funding/leadership.
+
+provider* - Also sometimes referred to as a "source" -- a primary source of data assimilated into MacroSheds. May be a network, domain, or third party.
+
+product* - A collection of data, possibly including multiple datasets/tables. Providers may separate products by temporal extent/interval, scientific category, detection method, and/or sampling location.
+
+prodname - One of the 7 core product categories included in the MacroSheds dataset. See below.
+
+prodcode - An alphanumeric string associated with a product. Providers have their own. MacroSheds uses its own scheme internally. This won't be relevant for most users, but see detailed documentation included with core data downloads for more information.
+
+site-product, site-year, etc. - Terms like these are used to designate various subdivisions of the overall MacroSheds dataset. A site-product, for example, is the collection of all data for a single MacroSheds product, available at a single site.
+
+-----
+MacroSheds data are organized into the following products:
+-----
+
+discharge - Streamflow; water volume over time; reported in L/s.
+
+stream chemistry - Concentration of chemical constituents in stream water; reported in mg/L or mEq/L.
+
+stream flux - Mass of chemical constituents in stream water, per watershed area, over time; reported in kg/ha/d. (not currently included with this dataset, but can be generated via the macrosheds package for R)
+
+precipitation - Rainfall, snowfall, or both combined; reported per watershed in mm.
+
+precipitation chemistry - Concentration of chemical constituents in precipitation; reported in mg/L or mEq/L; averaged across watershed area.
+
+precipitation flux - Mass of chemical constituents in precipitation, per watershed area, over time; reported in kg/ha/d. (not currently included with this dataset, but can be generated via the macrosheds package for R)
+
+watershed attributes - Areal watershed summary statistics; the available variables are common to all MacroSheds sites.
+
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+"attributeName" "attributeDefinition" "class" "unit" "dateTimeFormatString" "missingValueCode" "missingValueCodeExplanation"
+"date" "date as UTC timestamp with 00:00:00 time" "Date" "Y-M-DTh:m:sZ"
+"site_code" "Short name for MacroSheds site. See site_metadata.csv" "character"
+"dayl(s)" "Watershed average seconds of daylight" "numeric" "second" "NA" "missing value"
+"prcp(mm/day)" "Watershed average precipitation" "numeric" "millimetersPerDay" "NA" "missing value"
+"srad(W/m2)" "Watershed average solar radiation" "numeric" "wattPerMeterSquared" "NA" "missing value"
+"swe(mm)" "Watershed average snow-water equivalent" "numeric" "millimeter" "NA" "missing value"
+"tmax(C)" "Watershed average maximum air temperature" "numeric" "celsius" "NA" "missing value"
+"tmin(C)" "Watershed average minimum air temperature" "numeric" "celsius" "NA" "missing value"
+"vp(Pa)" "Watershed average vapor pressure" "numeric" "pascal" "NA" "missing value"
+"pet(mm)" "Watershed average potential evapotranspiration" "numeric" "millimeter" "NA" "missing value"
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+"attributeName" "attributeDefinition" "class" "unit" "dateTimeFormatString" "missingValueCode" "missingValueCodeExplanation"
+"site_code" "Short name for MacroSheds site. See site_metadata.csv" "character"
+"p_mean" "mean daily precipitation 1989-10-01 to 2009-09-30 (Daymet)" "numeric" "millimeterPerDay" "NA" "missing value"
+"pet_mean" "mean daily PET (estimated using Priestley-Taylor formulation with gridded alpha product from Aschonitis et al. 2017) (Daymet)" "numeric" "millimeterPerDay" "NA" "missing value"
+"aridity" "aridity (ratio of pet_mean to p_mean) (Daymet)" "numeric" "dimensionless" "NA" "missing value"
+"p_seasonality" "seasonality and timing of precipitation (estimated using sine curves to represent the annual temperature and precipitation cycles, positive [negative] values indicate that precipitation peaks in summer [winter], values close to 0 indicate uniform precipitation throughout the year) (Daymet)" "numeric" "dimensionless" "NA" "missing value"
+"frac_snow" "fraction of precipitation falling as snow (i.e., on days colder than 0°C) (Daymet)" "numeric" "dimensionless" "NA" "missing value"
+"high_prec_freq" "frequency of high precipitation days ( >= 5 times mean daily precipitation) (Daymet)" "numeric" "daysPerYear" "NA" "missing value"
+"high_prec_dur" "average duration of high precipitation events (number of consecutive days >= 5 times mean daily precipitation) (Daymet)" "numeric" "day" "NA" "missing value"
+"high_prec_timing" "season during which most high precipitation days ( >= 5 times mean daily precip.) occur (Daymet)" "categorical" "NA" "missing value"
+"low_prec_freq" "frequency of dry days ( <1 mm/day) (Daymet)" "numeric" "daysPerYear" "NA" "missing value"
+"low_prec_dur" "average duration of dry periods (number of consecutive days <1 mm/day) (Daymet)" "numeric" "day" "NA" "missing value"
+"low_prec_timing" "season during which most dry days ( <1 mm/day) occur (Daymet)" "categorical" "NA" "missing value"
+"geol_1st_class" "most common geologic class in the catchment (GLiM)" "categorical" "NA" "missing value"
+"glim_1st_class_frac" "fraction of the catchment area associated with its most common geologic class (GLiM)" "numeric" "dimensionless" "NA" "missing value"
+"geol_2nd_class" "2nd most common geologic class in the catchment (GLiM)" "categorical" "NA" "missing value"
+"glim_2nd_class_frac" "fraction of the catchment area associated with its 2nd most common geologic class (GLiM)" "numeric" "dimensionless" "NA" "missing value"
+"carbonate_rocks_frac" "fraction of the catchment area characterized as ""Carbonate sedimentary rocks"" (GLiM)" "numeric" "dimensionless" "NA" "missing value"
+"geol_porosity" "subsurface porosity (GLHYMPS)" "numeric" "dimensionless" "NA" "missing value"
+"geol_permeability" "subsurface permeability (log10) (GLHYMPS)" "numeric" "squareMeter" "NA" "missing value"
+"sand_frac" "sand fraction (of the soil material smaller than 2 mm, layers marked as organic material, water, bedrock and ""other"" were excluded) (gSSURGO)" "numeric" "percent" "NA" "missing value"
+"silt_frac" "silt fraction (of the soil material smaller than 2 mm, layers marked as organic material, water, bedrock and ""other"" were excluded) (gSSURGO)" "numeric" "percent" "NA" "missing value"
+"clay_frac" "clay fraction (of the soil material smaller than 2 mm, layers marked as organic material, water, bedrock and ""other"" were excluded) (gSSURGO)" "numeric" "percent" "NA" "missing value"
+"organic_frac" "fraction of soil_depth_statsgo marked as organic material (class 13) (gSSURGO)" "numeric" "percent" "NA" "missing value"
+"gauge_lat" "gauge latitude " "numeric" "degreesNorth" "NA" "missing value"
+"gauge_lon" "gauge longitude" "numeric" "degreesEast" "NA" "missing value"
+"area" "watershed area" "numeric" "squareKilometers" "NA" "missing value"
+"elev_mean" "watershed mean elevation" "numeric" "meter" "NA" "missing value"
+"slope_mean" "watershed mean slope" "numeric" "metersPerKilometer" "NA" "missing value"
+"frac_forest" "forest fraction" "numeric" "!Add units here!" "NA" "missing value"
+"dom_land_cover_frac" "fraction of the catchment area associated with the dominant land cover" "numeric" "dimensionless" "NA" "missing value"
+"dom_land_cover" "dominant land cover type (Noah-modified 20-category IGBP-MODIS land cover)" "categorical" "NA" "missing value"
+"root_depth_50" "root depth (percentile 50% extracted from a root depth distribution based on IGBP land cover) (MODIS)" "numeric" "meter" "NA" "missing value"
+"root_depth_99" "root depth (percentile 99% extracted from a root depth distribution based on IGBP land cover) (MODIS)" "numeric" "meter" "NA" "missing value"
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+"attributeName" "attributeDefinition" "class" "unit" "dateTimeFormatString" "missingValueCode" "missingValueCodeExplanation"
+"network" "Short name for the data network associated with a MacroSheds site. A network includes one or more domains under common funding/leadership." "character"
+"domain" "Short name for the data domain associated with a MacroSheds site. A domain includes one or more sites under common management." "character"
+"site_name" "May be a site_code or a stream_name. Cross-reference with sites.csv." "character"
+"start_date_of_concern" "The first date on which the irregularity is applicable. May be “whole_record”" "character"
+"end_date_of_concern" "The last date on which the irregularity is applicable. May be “whole_record”" "character"
+"macrosheds_product_affected" "The MacroSheds data product to which this irregularity pertains" "character"
+"macrosheds_variables_affected" "The MacroSheds variable(s) to which this irregularity pertains" "character"
+"concern" "Details of the irregularity" "character"
+"included_in_current_dataset" "If FALSE, this irregularity is not yet present in the published MacroSheds dataset." "character"
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+"attributeName" "attributeDefinition" "class" "unit" "dateTimeFormatString" "missingValueCode" "missingValueCodeExplanation"
+"domain" "Short name for the data domain associated with a MacroSheds site. A domain includes one or more sites under common management." "character"
+"prodcode" "One or more MacroSheds product codes, separated by |. These codes are mostly for internal use, but they may be relevant for distinguishing primary data sources here, e.g. if a detection limit differs between stream and precip chemistry within the same domain." "character"
+"variable_converted" "The MacroSheds variable code after it has been converted (only relevant for molecule-to-atomic-constituent conversions, e.g. NO3 to NO3-N)" "character"
+"variable_original" "The MacroSheds variable code before any molecule-to-atomic-constituent conversions have taken place (e.g. NO3 to NO3-N)" "character"
+"detection_limit_converted" "The detection limit after conversion to MacroSheds standard units (see variables_time_series.csv)" "numeric" "variable_unit"
+"detection_limit_original" "The detection limit before conversion to MacroSheds standard units (see variables_time_series.csv)" "numeric" "variable_unit"
+"unit_converted" "The units of the detection limit after conversion to MacroSheds standard units (see variables_time_series.csv)" "character"
+"unit_original" "The units of the detection limit before conversion to MacroSheds standard units (see variables_time_series.csv)" "character"
+"start_date" "The first known date on which the detection limit is applicable" "Date" "Y-M-D"
+"end_date" "The last known date on which the detection limit is applicable" "Date" "Y-M-D"
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+"attributeName" "attributeDefinition" "class" "unit" "dateTimeFormatString" "missingValueCode" "missingValueCodeExplanation"
+"variable_code" "MacroSheds variable short name" "character"
+"variable_name" "Full name of MacroSheds variable" "character"
+"unit" "The standard unit of the variable within the MacroSheds dataset" "character"
+"range_check_minimum" "The minimum value that is allowed through the range filter" "numeric" "variable_unit"
+"range_check_maximum" "The maximum value that is allowed through the range filter" "numeric" "variable_unit"

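The range_check_minimum and range_check_maximum attributes describe a per-variable range filter. One plausible reading is sketched below. It assumes the bounds are inclusive and that rejected values are replaced with a missing marker (`None` here), neither of which is stated in the metadata; the function name `apply_range_check` is hypothetical.

```python
def apply_range_check(values, minimum, maximum):
    """Keep values within [minimum, maximum]; replace the rest with None.

    Assumes inclusive bounds; values that are already missing stay missing.
    """
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return [
        v if v is not None and minimum <= v <= maximum else None
        for v in values
    ]


if __name__ == "__main__":
    # Hypothetical bounds of 0 to 10 in the variable's standard unit
    print(apply_range_check([-1, 0, 5, 11, None], 0, 10))
```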