Making Better Maps of Premodern Rivers

Archaeologists and others face the challenge of producing good regional maps showing ancient waterways.  The problem is that the readily available data on rivers and streams have two major shortcomings.  First, they reflect modern changes to hydrography, most significantly the construction of dams and artificial lakes.  This means that regional maps of the ancient landscape are often populated with numerous anachronistic lakes (see the first map below).  Second, many collections of waterways contain either too many or too few water channels for a given scale and purpose, and they offer no tools for effectively selecting candidate rivers.

Peruse the four maps below to get a sense of the issues.  The first two are problematic because they contain man-made lakes or because they contain too many rivers for most purposes.  The last two are significantly better regional maps because they limit the number of rivers or visually code the rivers by their size and importance.

modernrivers
Map showing a typical GIS river shapefile.  It includes man-made dams and too many rivers.

50cfs
This map is also problematic because it simply shows too many rivers and streams for the scale.

500cfs
This is a much better map for most purposes because it only shows more important rivers.

gradient_500px
This actually shows more rivers than the map above, but makes it useful by varying the thickness of the lines with the volume of the river.  This is a very effective technique.

The river data in the last three images come from the National Hydrography Dataset Plus (NHDPlus).  These data show the paths of water courses while ignoring most lakes, and they also contain the average flow of water in each section, which makes it easy to depict only rivers of a given size.

The NHDPlus dataset is described in great detail in its documentation, available on the Horizon Systems website.  It comprises several different products, but the most useful for our purposes is nhdflowlines.  These are the paths along which water flows across the US, and they were created by simulating the fall of rain across elevation maps of the US.  Because the flow of water is reconstructed this way, the dataset largely ignores the presence of reservoir systems.  A byproduct of this reconstruction is modeled data on the volume and velocity of water along each segment.

The data are freely available, but perhaps the biggest obstacle to using them for most archaeological applications is the way they are packaged.  The data on the NHDPlus website are separated by drainage basins and subbasins, so they require many file downloads and a lot of merging of shapefiles and data fields.  If you are working in a very restricted area, this may be preferable, but for those looking for an easy way to make large regional maps, it is problematic and time-consuming.

As an aid for everyone, I am offering for download a single shapefile for the eastern US and a single file for the western US that contains almost all of the flowline data as well as the associated data on water volume and velocity.  Further, I am providing instructions on how to use these data in ArcGIS.  But first a few important pieces of information.

Data Ownership

These are NOT my data.  These data are in the public domain.  The real credit for these goes to Horizon Systems for producing them under a contract to the USGS and EPA.  This package is based on version 2 of the data, which was downloaded from the NHDPlus website in August of 2013.  I am merely repackaging elements of these data for others to use.

Downloading the Data

The data can be downloaded from the following links.  The split between the eastern and western US is made along the 100 degree West longitude line.  Note that these are pretty large files.

Eastern US Flowline Shapefile (789 MB)
Western US Flowline Shapefile (590 MB)

In order to use these data, extract all files from the zip file into a single directory and then load into your GIS software as a layer.

Data Fields

I have included just seven fields in these files.  Many more are available through the NHDPlus site for those with an interest.

Field Name: Description
COMID: This is the unique identification number used by NHDPlus for each flowline segment.  If you wish to download and join other data fields, this is the ID to use.
GNIS_ID: This is the Geographic Names Information System ID for the name of this water body.
GNIS_NAME: This is the GNIS text name.
LENGTHKM: This is the length of this segment in kilometers as reported in the NHD.
FTYPE: This is the type of feature.  Consult the documentation for further information.  The possible values in the data for download from this web page are Artificial Path, Canal ditch, Coastline, Connector, Pipeline, and StreamRiver.
Q0001C: This is the average annual volume of waterflow in this segment in cubic feet per second.  As you can read in the documentation, NHDPlus produced several different estimates for each segment (Q0001A, Q0001B, etc.) using different models.  They state that Q0001C is the best estimate of “natural” waterflow that does not take into account artificial channels.  It is therefore the only one I included in this package of data, since it is the best proxy for premodern flows.  As stated in the documentation, a more precise estimate of actual modern water volume is Q0001E, which takes into account actual gage readings.  See the documentation for more details on how all of these were calculated.
V0001C: This is the average annual velocity of the water along this segment in feet per second.  This is the best estimate in the NHD models for “natural” waterflow that does not take into account artificial channels.  A more precise estimate of actual modern water velocity is V0001E, but this is not included in this data set.  See the documentation for more details on how this was calculated.

Making Nice Maps

I currently use ArcGIS 10.1, and I have written these instructions for that software and version.  Users of ArcGIS 8.x, 9.x, or 10.x will find they follow very similar steps, and the same principles will apply to users of any GIS software.

After you have downloaded and extracted the shapefile, open ArcMap and add nhdflowlines as a layer.  By default, all water channels will be displayed.  This includes coastlines, pipelines, actual river channels, and others.  For most users making regional maps, the only step you need to take is to select a subset of these segments that only include flows greater than a certain volume.  The easiest way to do that in ArcGIS is to use a Definition Query.  Right click on nhdflowlines and select Properties.  Then click on the Definition Query tab.  Finally, use the water volume field (Q0001C) to select water courses with more than a certain volume of water.  In ArcGIS this query can be typed directly into the window, like below, or you can use the Query Builder to create and test the query.

query

For regional maps, experiment with the number to find the right representation of rivers for your map.  Here are some examples:

20cfs
"Q0001C" > 20
200cfs
"Q0001C" > 200
750cfs
"Q0001C" > 750
1500cfs
"Q0001C" > 1500
3000cfs
"Q0001C" > 3000
5000cfs
"Q0001C" > 5000
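
When choosing a threshold, it can help to see how many segments survive each cutoff before drawing anything.  The following is a minimal Python sketch of that idea and is not part of the ArcGIS workflow.  The segment records and COMID values are made up for illustration; real values would come from the Q0001C field of the shapefile.

```python
# Hypothetical sample of segments; real values come from the Q0001C field.
SAMPLE_SEGMENTS = [
    {"COMID": 1, "Q0001C": 12.5},
    {"COMID": 2, "Q0001C": 45.0},
    {"COMID": 3, "Q0001C": 260.0},
    {"COMID": 4, "Q0001C": 800.0},
    {"COMID": 5, "Q0001C": 1700.0},
    {"COMID": 6, "Q0001C": 3200.0},
    {"COMID": 7, "Q0001C": 6400.0},
]


def select_by_flow(segments, min_cfs):
    """Keep segments whose flow exceeds min_cfs, like "Q0001C" > min_cfs."""
    return [
        seg for seg in segments
        if seg["Q0001C"] is not None and seg["Q0001C"] > min_cfs
    ]


def counts_by_threshold(segments, thresholds):
    """Count how many segments remain at each candidate threshold."""
    return {cfs: len(select_by_flow(segments, cfs)) for cfs in thresholds}


if __name__ == "__main__":
    # The same thresholds as the example maps.
    for cfs, count in counts_by_threshold(
        SAMPLE_SEGMENTS, [20, 200, 750, 1500, 3000, 5000]
    ).items():
        print(cfs, count)
```

A count that drops sharply between two thresholds suggests trying values in between.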

An even nicer map can be made by using graduated symbols.  After pruning the rivers to an adequate level with a Definition Query, go to Symbology.  Choose Quantities | Graduated Symbols.  Under Template choose an appropriate line color.  For the Value Field choose Q0001C.  When you do this, you will likely be warned that “Maximum Sample Size has been reached.”  To fix this, click on Classify… and then Sampling…  Select a very large number (>10 million) and press OK.  ArcMap will think for a while as it builds a new histogram (see the section on Optimizing below for ideas on speeding this up).  While on the Classify screen, experiment with various classification methods and class numbers.  I prefer Jenks and 32 classes for most maps.  Also experiment with various symbol size ranges.

layerproperties

The resulting map can look like this.  The map below was made with Q0001C>500, 32 classification breaks using Jenks, and a symbol size range of 0.5 to 3.

gradient500
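
To make the idea behind graduated symbols concrete, here is a rough Python sketch of how a flow value could be turned into a line width once class breaks are known.  The break values are hypothetical and are not Jenks breaks computed from the data; in ArcMap the Classify dialog computes the breaks for you.  The function name and its defaults are my own illustration.

```python
from bisect import bisect_left

# Hypothetical upper bounds (cfs) for five classes; NOT real Jenks breaks.
EXAMPLE_BREAKS = [1000, 5000, 20000, 100000, 600000]


def symbol_size(flow_cfs, breaks, min_size=0.5, max_size=3.0):
    """Return a line width for a flow value, stepping evenly across classes."""
    if not breaks:
        raise ValueError("at least one class break is required")
    if len(breaks) == 1:
        return min_size
    # Find the first class whose upper bound is >= the value; values above
    # the last break fall into the last class.
    class_index = min(bisect_left(breaks, flow_cfs), len(breaks) - 1)
    step = (max_size - min_size) / (len(breaks) - 1)
    return min_size + class_index * step


if __name__ == "__main__":
    for flow in (500, 3000, 50000, 700000):
        print(flow, symbol_size(flow, EXAMPLE_BREAKS))
```

With more classes the steps between widths become smaller, which is one reason a large class count such as 32 gives a smoother-looking map.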

Depending on your region and your query, you may also need to exclude certain FTYPE values, for example Coastline, Canal ditch, or Pipeline.  Since these are usually defined with small water volumes, they are typically excluded once you select volumes above a certain level.  However, you could also construct a Definition Query such as this:

"Q0001C" >200 AND "FTYPE" <> 'Pipeline' AND "FTYPE" <> 'Coastline' 

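If you build such queries often, a small helper can assemble the string for you.  This is only a sketch in Python under my own naming (build_query is not an ArcGIS function); it produces text to paste into the Definition Query window and assumes the shapefile-style syntax shown above, with double quotes around field names and single quotes around text values.

```python
def build_query(min_cfs, exclude_ftypes=()):
    """Assemble a Definition Query string from a flow threshold and FTYPEs."""
    clauses = ['"Q0001C" > {}'.format(min_cfs)]
    for ftype in exclude_ftypes:
        # Text values use single quotes; double any quote inside the value.
        safe_value = ftype.replace("'", "''")
        clauses.append("\"FTYPE\" <> '{}'".format(safe_value))
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(build_query(200, ["Pipeline", "Coastline"]))
```
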
Optimizing the Use of the Shapefile

Because of the size of the shapefile, users may find that it performs rather slowly.  Consider the following options to improve the performance: (1) create a smaller shapefile that only covers the region you need, (2) move the data to a file geodatabase or other format with better performance, (3) create a spatial index, and (4) create a field index, most importantly of Q0001C.

In order to do these, here are some suggestions:
1. There are many methods to create a smaller shapefile.  Perhaps the easiest is to adjust the view extent to your region of interest.  Right click on the nhdflowlines layer and select Data | Export Data.  Then choose to export only the data in the current view extent.
2. Use the Create File GDB tool to create a geodatabase.  Then move or save layers into it; it will behave like a folder.
3. The Add Spatial Index tool will create a spatial index and speed up many drawing operations.
4. Use the Add Attribute Index tool to index specific fields.  I recommend creating an index of Q0001C by itself, and perhaps one of FTYPE if you use that in your Definition Query.
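
The idea behind the first suggestion, keeping only what falls in your region, can be sketched in a few lines of Python.  This is an illustration only: it assumes each segment carries a bounding box as (xmin, ymin, xmax, ymax) in longitude and latitude, and that a segment is kept whenever its box overlaps the view extent.  The records and the extent are hypothetical, and the exact behavior of the ArcMap export may differ.

```python
def boxes_overlap(bbox, extent):
    """True if two (xmin, ymin, xmax, ymax) boxes overlap or touch."""
    return not (
        bbox[2] < extent[0]      # segment lies entirely west of the extent
        or bbox[0] > extent[2]   # entirely east
        or bbox[3] < extent[1]   # entirely south
        or bbox[1] > extent[3]   # entirely north
    )


def clip_to_extent(segments, extent):
    """Keep segments whose bounding box overlaps the view extent."""
    return [seg for seg in segments if boxes_overlap(seg["bbox"], extent)]


# Hypothetical view extent and segment records for illustration.
VIEW_EXTENT = (-103.0, 33.5, -94.4, 37.0)
SAMPLE_SEGMENTS = [
    {"COMID": 101, "bbox": (-98.2, 35.1, -97.9, 35.4)},  # inside
    {"COMID": 102, "bbox": (-90.5, 38.0, -90.1, 38.3)},  # outside
    {"COMID": 103, "bbox": (-94.6, 36.8, -94.2, 37.2)},  # straddles the edge
]

if __name__ == "__main__":
    kept = clip_to_extent(SAMPLE_SEGMENTS, VIEW_EXTENT)
    print([seg["COMID"] for seg in kept])
```

A spatial index speeds up this kind of test by letting the software skip most segments without examining them one by one.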

Merging East and West

To make the downloads more manageable, I have split the data between east and west.  If you wish to create a single layer that combines these, make sure to store it in a geodatabase, because the resulting file is too large for a regular shapefile.  In ArcGIS the Merge tool in the toolbox can be used to combine the layers into one.

Limitations of the Data

These are the best data I have yet found to depict prehistoric rivers at a regional scale.  However, this is not, nor was it intended to be, an accurate map of prehistoric river channels.  The flowlines are reconstructed using modern DEMs, which include modern natural and man-made levees, canals, and other features.  These become more apparent the more one zooms in on specific locales.  For example, in modern DEMs, modern lakes look “flat” and the nhdflowlines will show some odd meanders across these modern lakes at the right scales.  A second issue for archaeologists and others interested in the premodern landscape is that the flow velocity and volume are modeled using modern rainfall data.

Data Specifications

The data I provide have been repackaged from NHDPlus v2 data downloaded August 2013.  Specifically, I merged all of the nhdflowlines shapefiles, attached the EROM_MA0001 tables, and then extracted just certain fields.  I also excluded all features where Q0001C was null, since these mostly included modern pipelines.
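
For readers who want to repeat this kind of repackaging with other fields, the logic can be sketched in plain Python.  This is not the procedure I ran in ArcGIS, just an illustration of the join on COMID, the field selection, and the removal of null flows.  Geometry is left out, and the sample records, including the EXTRA field, are invented.

```python
KEEP_FIELDS = [
    "COMID", "GNIS_ID", "GNIS_NAME", "LENGTHKM", "FTYPE", "Q0001C", "V0001C",
]


def repackage(flowlines, erom_rows):
    """Join EROM flow estimates to flowlines by COMID and trim the fields."""
    erom_by_comid = {row["COMID"]: row for row in erom_rows}
    result = []
    for line in flowlines:
        merged = dict(line)
        merged.update(erom_by_comid.get(line["COMID"], {}))
        # Drop features with no flow estimate.
        if merged.get("Q0001C") is None:
            continue
        result.append({name: merged.get(name) for name in KEEP_FIELDS})
    return result


# Invented records for illustration only.
SAMPLE_FLOWLINES = [
    {"COMID": 1, "GNIS_ID": "0001", "GNIS_NAME": "Example River",
     "LENGTHKM": 2.4, "FTYPE": "StreamRiver", "EXTRA": "not kept"},
    {"COMID": 2, "GNIS_ID": None, "GNIS_NAME": None,
     "LENGTHKM": 0.8, "FTYPE": "Pipeline", "EXTRA": "not kept"},
]
SAMPLE_EROM = [
    {"COMID": 1, "Q0001C": 812.0, "V0001C": 1.9, "Q0001E": 790.0},
    {"COMID": 2, "Q0001C": None, "V0001C": None, "Q0001E": None},
]

if __name__ == "__main__":
    for record in repackage(SAMPLE_FLOWLINES, SAMPLE_EROM):
        print(record)
```

To carry along a different field, such as Q0001E, you would add its name to the list of fields to keep.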

Questions and Comments

If you have any questions or comments on this, please feel free to email me at patrickl@ou.edu.