Streams and Catchments
GEOGloWS V2 uses a modified subset of the TDX-Hydro streams dataset. TDX-Hydro is a hydrography dataset created by the National Geospatial-Intelligence Agency (NGA) from a 12 m Airbus DEM. It is a public dataset first released in summer 2023. TDX-Hydro was produced from a delineation made with the TauDEM software, followed by extensive modifications. You can download the full dataset and the technical document describing its creation at https://earth-info.nga.mil/ under the "Geosciences" tab. The full TDX-Hydro dataset has about 16 million river segments and covers the full globe. We excluded select regions in the far north and modified the streams, which reduced the total to about 7 million rivers.
Modifications Made to TDX-Hydro
All the modifications we performed on each TDX-Hydro region are recorded in an Excel table. That Excel table can be viewed here. A brief overview of these changes is given below.
The regions that were excluded include those that are farther north and some of the smaller islands, where runoff datasets may not be as accurate and there is less interest. Future GEOGloWS versions may include some of these regions. Additionally, we corrected errors found in the V1 TDX-Hydro dataset, which include:
Streams that have no length and no upstream/downstream segments, i.e. streams where there are only two points and both points are the same location. These, along with any associated catchments, were removed.
Streams that have no length with upstream or downstream segments. These were removed along with any associated catchments, and the attributes of the upstream and/or downstream segments were modified to refer to each other and preserve the stream network's connectivity.
Catchments with a stock identifier of '0' never had an associated stream. These were deleted.
All of the above errors and the methods used to correct them were reported back to the NGA.
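The reconnection step for zero-length streams can be sketched as follows. This is a minimal illustration, not the project's actual processing script: the in-memory dictionary layout and the use of -1 to mean "no downstream segment" are assumptions made for the example, and only the downstream pointers are rewritten (upstream pointers could be rebuilt from them).

```python
# Hypothetical in-memory network: LINKNO -> attributes.
# The -1 sentinel for "no downstream segment" is an assumption of this sketch.
NO_LINK = -1


def remove_zero_length(streams):
    """Return a copy of `streams` without zero-length segments.

    Segments that drained into a removed segment are re-pointed at the
    next surviving segment downstream, so connectivity is preserved.
    """
    kept = {k: dict(v) for k, v in streams.items() if v["Length"] > 0}
    for attrs in kept.values():
        ds = attrs["DSLINKNO"]
        # Skip over any chain of removed segments.
        while ds != NO_LINK and ds not in kept:
            ds = streams[ds]["DSLINKNO"]
        attrs["DSLINKNO"] = ds
    return kept


streams = {
    1: {"Length": 100.0, "DSLINKNO": 2},
    2: {"Length": 0.0, "DSLINKNO": 3},         # zero length, connected
    3: {"Length": 50.0, "DSLINKNO": NO_LINK},
    4: {"Length": 0.0, "DSLINKNO": NO_LINK},   # zero length, isolated
}
cleaned = remove_zero_length(streams)
```

In this example segment 2 is dropped and segment 1 is relinked to segment 3, while the isolated segment 4 is simply removed.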
For most, but not all, of the regions, the headwater streams were dissolved with the downstream segments, up to and including the downstream segment with a Strahler stream order of either 2 or 3. It was decided that regions that were largely coastal (Japan, Caribbean islands, Indonesia) were more sensitive to changes in their stream networks, and so these regions did not have their headwaters modified. Other areas, such as the Sahara Desert, were delineated to the same resolution as the rest of the world, which is often too fine and creates "streams" that do not exist in reality. In these areas, more features could be dissolved without significantly altering the river routing results. Thus, the headwaters and downstream streams were dissolved into one feature along with their associated catchments, and relevant attributes such as length and slope were recomputed. The stream order to which the headwaters would be dissolved was also chosen based on these considerations.
Figure showing how the headwaters are dissolved with the downstream segment into one feature
Small watersheds, up to 200 square kilometers in area, were removed from the TDX-Hydro dataset for all regions, for reasons similar to those behind the region-specific headwater dissolving. The size threshold varies by region. The more coastal regions had watersheds of up to between 25 and 75 square kilometers dropped. Flatter or drier areas, like the Sahara Desert or northern Canada, had watersheds of up to 200 square kilometers dropped. In these flatter regions, the high resolution of the delineation creates little "pools", or small collections of streams, that do not drain to the ocean and do not represent flowing streams. They often collectively have an area of less than 200 square kilometers.
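The small-watershed filter described above can be sketched as follows. This is only an illustration under stated assumptions: the `area_km2` field name and the list-of-dicts layout are hypothetical, and catchments are grouped by the `TerminalNode` attribute so that a whole watershed is kept or dropped together.

```python
from collections import defaultdict


def drop_small_watersheds(catchments, max_dropped_area_km2):
    """Keep only catchments whose whole watershed exceeds the threshold.

    `catchments` is a list of dicts with a 'TerminalNode' key (the outlet
    of the watershed) and a hypothetical 'area_km2' key.
    """
    totals = defaultdict(float)
    for c in catchments:
        totals[c["TerminalNode"]] += c["area_km2"]
    return [c for c in catchments
            if totals[c["TerminalNode"]] > max_dropped_area_km2]


catchments = [
    {"LINKNO": 1, "TerminalNode": 10, "area_km2": 120.0},
    {"LINKNO": 2, "TerminalNode": 10, "area_km2": 150.0},
    {"LINKNO": 3, "TerminalNode": 20, "area_km2": 30.0},
    {"LINKNO": 4, "TerminalNode": 20, "area_km2": 40.0},
]
# A flatter or drier region might use 200; a coastal one something like 50.
kept_dry = drop_small_watersheds(catchments, 200)
kept_coastal = drop_small_watersheds(catchments, 50)
```

With a threshold of 200 square kilometers the 70 square kilometer watershed draining to node 20 is dropped; with a threshold of 50 it is kept.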
Headwater streams that led directly into a stream with a Strahler stream order of two or greater were dissolved with the immediate downstream segment for most, but not all, of the regions. The reasons for pruning these streams are the same as above.
TDX-Hydro Processing Files
The files used to dictate the way in which we modified the TDX-Hydro dataset can be found here. There are three files: processing_options.xlsx (which is the spreadsheet mentioned and explained above), tdx_header_numbers.json, and terminal_node_vpu_list.csv. The TDX header number JSON file maps every TDX-Hydro region number to a unique 2-digit number, with the first digit being the first digit of the region number, and the second digit corresponding to the index of the sorted order of all the regions that share the first digit. The terminal node vpu CSV matches every terminal node (the id associated with the outlet of a watershed) with a VPU number.
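The header-number rule described above can be sketched as follows. This is an illustration only: the region numbers are made-up placeholders, the index is assumed to be zero-based, and the sketch assumes at most ten regions share a first digit, so the values may differ from those in the shipped tdx_header_numbers.json.

```python
from collections import defaultdict


def build_header_numbers(region_numbers):
    """Map each region number to a 2-digit header number.

    First digit: the first digit of the region number.
    Second digit: the region's index (assumed zero-based) among the
    sorted regions that share that first digit.
    """
    groups = defaultdict(list)
    for region in region_numbers:
        groups[str(region)[0]].append(region)

    headers = {}
    for first_digit, regions in groups.items():
        for index, region in enumerate(sorted(regions)):
            headers[region] = int(first_digit) * 10 + index
    return headers


# Placeholder region numbers, not real TDX-Hydro regions.
headers = build_header_numbers([2000000030, 1000000020, 1000000010])
```

Here the two regions starting with 1 receive 10 and 11, and the region starting with 2 receives 20.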
Python scripts have been written to process the TDX-Hydro dataset using these files. Those Python scripts can be found here.
The V2 streams have the following attributes which come from the TauDEM delineation process. For more explanation of these attributes, please check here.
*For modified features this value was not recomputed and should not be trusted
LINKNO - A river ID number unique to the TDXHydro delineation. In TDXHydro v1 this is not globally unique. In future versions this will be the same as geoglowsID.
DSLINKNO - The ID of the river immediately downstream of the segment represented on that row.
USLINKNO* - There will be 1 column per river segment upstream of the river on this row.
DSNODEID - The node identifier for node at downstream end of river.
strmOrder - The Strahler stream order *
Length - Geodesic length in meters of the river segment
Magnitude - The Shreve stream magnitude *
USContArea - The total drainage area upstream of the most upstream point (i.e. the inlet) of this segment *
DSContArea - The total drainage area upstream of the most downstream point (i.e. the outlet) of this segment.
strmDrop - The change in elevation between the inlet and outlet of the river segment. *
Slope - The average stream slope, equal to strmDrop / Length.
StraightL - Distance from start to end of a river in a straight line between the first and last points. *
WSNO - Watershed number.
DOUTEND - Distance to the eventual outlet from the end of the river. *
DOUTSTART - Distance to the eventual outlet from the start of the river. *
DOUTMID - Distance to the eventual outlet from the midpoint of the river. *
V2 streams also have the following additional attributes added by the GEOGloWS modelers.
TDXHydroLinkNO - The globally unique ID for each stream. It is computed as the first two digits of the TDXHydroRegion number times 10 million, plus the original ID number.
LengthGeodesicMeters - The geodesic length of the stream in meters
LEN_GEOM - Intermediate value used in computing the Muskingum parameters.
lat - the latitude of the centroid of the stream
lon - the longitude of the centroid of the stream
z - The elevation for each stream. This is a placeholder value only which we included for compatibility with other software. It will have a value of 0 for all rivers.
TDXHydroRegion - the original TDX regional group number that this stream belongs to
TopologicalOrder - the topological order of a stream, from headwater to outlet
Musk_kfac - A copy of the Musk_k attribute which was used for routing parameter estimation.
Musk_k - The Muskingum k parameter.
Musk_x - The Muskingum x parameter.
TerminalNode - This number is the TDXHydroLinkNo of the eventual outlet of this stream's watershed
VPUCode - a three-digit number representing which VPU region this stream belongs to
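Two of the attributes above lend themselves to a short sketch: the arithmetic behind the globally unique ID, and how a TerminalNode value could be found by following downstream links. This is a minimal illustration, not the modelers' code. The function names, the downstream lookup dictionary, and the -1 "no downstream" sentinel are assumptions, and the two-digit prefix is taken as an input rather than derived here.

```python
ID_MULTIPLIER = 10_000_000  # "times 10 million" from the attribute description


def make_global_id(two_digit_prefix, linkno):
    """Combine a two-digit regional prefix with the original link number."""
    if not 0 <= linkno < ID_MULTIPLIER:
        raise ValueError("linkno must fit below the 10 million multiplier")
    return two_digit_prefix * ID_MULTIPLIER + linkno


def split_global_id(global_id):
    """Recover (prefix, original link number) from a global ID."""
    return divmod(global_id, ID_MULTIPLIER)


def find_terminal_node(downstream, link_id, no_link=-1):
    """Follow downstream links until the outlet of the watershed is reached.

    `downstream` maps each stream ID to the ID of the stream below it.
    """
    while downstream.get(link_id, no_link) != no_link:
        link_id = downstream[link_id]
    return link_id


gid = make_global_id(11, 123)
downstream = {110000001: 110000002, 110000002: 110000003, 110000003: -1}
outlet = find_terminal_node(downstream, 110000001)
```

Because the prefix occupies the digits above 10 million, the original ID can be recovered with integer division as long as every original link number in a region stays below 10 million.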
Stream Centerlines Locations (GIS Dataset)
Derived from the TDX-Hydro dataset.
Not an exact copy of TDX-Hydro.
7 million rivers (out of 16 million available in the TDX-Hydro source)
Divided into 125 computational groups (VPU) (increased from 62 in TDX-Hydro source)
Shapefile is not available because of file format limitations and the number of stream features.
From Box cloud storage: https://byu.box.com/s/8f1zcglvjtk301jataz61b1n4rzhgz2o
AWS S3 Bucket
See the Tutorials page for instructions on accessing S3 data.
Catchment (subbasin) Boundaries (GIS Dataset)
Derived from the TDX-Hydro dataset
7 million rivers
Divided into 125 computational groups (VPU)
Shapefile is not available because of file format limitations
From NGA website: https://earth-info.nga.mil/ (look under the "Geosciences" tab)
Computational Unit Boundaries (GIS Dataset)
125 groups (VPU)
Outlines of groups of watersheds
From Box cloud storage: https://byu.box.com/s/mfmd0alwf5ka4k6m0o4gigl6wiv4src1