USGS - Water

Introduction

The United States Geological Survey (USGS) has collected water-resources data at approximately 1.5 million sites in all 50 States, the District of Columbia, Puerto Rico, the Virgin Islands, Guam, American Samoa and the Commonwealth of the Northern Mariana Islands. The USGS investigates the occurrence, quantity, quality, distribution, and movement of surface and underground waters and disseminates the data to the public, State and local governments, public and private utilities, and other Federal agencies involved with managing our water resources. Water-quality data are available for both surface water and groundwater. Examples of water-quality data collected are temperature, specific conductance, pH, nutrients, pesticides, and volatile organic compounds.

Source: USGS Water Data

Tags: Climate and Environment, Water, Time-series, Risk, Daily

Modules

Scraping:

Below are the API endpoint and the parameters that must be passed to get the data.

https://waterservices.usgs.gov/nwis/dv/?format=json&stateCd={state}&startDT={start_date}&endDT={end_date}&siteStatus=all
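As a minimal sketch of how the placeholders in this endpoint might be filled in, the helper below builds the request URL with the standard library. The function name `build_dv_url` is hypothetical, and the sketch assumes that `state` is a two-letter state abbreviation and that dates are passed in `YYYY-MM-DD` form.

```python
from datetime import date
from urllib.parse import urlencode

BASE_URL = "https://waterservices.usgs.gov/nwis/dv/"


def build_dv_url(state: str, start_date: date, end_date: date) -> str:
    """Build the daily-values request URL for one state and date range.

    Assumption: `state` is a two-letter abbreviation and the service
    accepts ISO-formatted dates (YYYY-MM-DD).
    """
    params = {
        "format": "json",
        "stateCd": state,
        "startDT": start_date.isoformat(),
        "endDT": end_date.isoformat(),
        "siteStatus": "all",
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_dv_url("ca", date(2022, 1, 1), date(2022, 1, 31)))
# prints https://waterservices.usgs.gov/nwis/dv/?format=json&stateCd=ca&startDT=2022-01-01&endDT=2022-01-31&siteStatus=all
```

The resulting URL can then be passed to any HTTP client; the request itself is left out of the sketch.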

Geocoder:

Coordinates for the country are added to the metadata, along with the region and region code. The Geocoder library is used to obtain coordinates. A separate JSON file of country coordinates is also kept, so that the third-party library does not have to be called for every lookup, which makes geocoding faster.
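The local-file-first lookup described here could look roughly like the sketch below. The names `load_country_coordinates` and `get_coordinates`, the JSON layout (country name mapped to `[latitude, longitude]`), and the sample coordinates are all illustrative assumptions; the fallback is passed in as a plain callable, which in practice might wrap the geocoding library.

```python
import json
from typing import Callable, Dict, Optional, Tuple

Coordinates = Tuple[float, float]


def load_country_coordinates(json_text: str) -> Dict[str, Coordinates]:
    """Parse a JSON object of the assumed form {"Country": [lat, lon]}."""
    raw = json.loads(json_text)
    return {name: (float(lat), float(lon)) for name, (lat, lon) in raw.items()}


def get_coordinates(
    country: str,
    local: Dict[str, Coordinates],
    fallback: Optional[Callable[[str], Optional[Coordinates]]] = None,
) -> Optional[Coordinates]:
    """Return coordinates from the local file, else from the fallback geocoder."""
    if country in local:
        return local[country]  # no third-party call needed
    if fallback is not None:
        return fallback(country)  # only reached for countries missing locally
    return None


# Illustrative placeholder data, not the project's actual JSON file.
local_coordinates = load_country_coordinates('{"United States": [39.8, -98.6]}')
print(get_coordinates("United States", local_coordinates))
```

Keeping the fallback as a parameter means the lookup can be exercised without network access.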

Standardization:

Additional information such as sample frequency, units, source, and description is included in the metadata. The standardization step also contains a function that fetches the ISO country code and appends it. The predefined domain and subdomain are added in this step.
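A rough sketch of this step, under stated assumptions, is shown below. The function `standardize`, its arguments, and the lookup table are hypothetical; the `site_no` value is a placeholder, and the predefined domain and subdomain values are omitted because they are not given here.

```python
from typing import Dict


def standardize(metadata: Dict[str, str],
                defaults: Dict[str, str],
                iso3_by_country: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `metadata` with default fields and an ISO code added."""
    # Values already present in `metadata` take precedence over defaults.
    result = {**defaults, **metadata}
    country = result.get("country")
    if country in iso3_by_country:
        result["country_code"] = iso3_by_country[country]
    return result


defaults = {"sample_frequency": "Daily", "source": "USGS Water Data"}
iso3_by_country = {"United States": "USA"}  # ISO 3166-1 alpha-3

record = {"country": "United States", "site_no": "00000000"}  # placeholder site
standardized = standardize(record, defaults, iso3_by_country)
print(standardized["country_code"])
# prints USA
```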

Cleaning:

Duplicate and unneeded columns are removed from the data. Location names are corrected and country names are formatted consistently.
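The column-removal part of this step might be sketched as follows, using plain lists instead of a dataframe library. The function `clean_table` and the sample column names are hypothetical; correcting location names is not covered by the sketch.

```python
from typing import Iterable, List, Sequence, Tuple


def clean_table(header: Sequence[str],
                rows: Iterable[Sequence[str]],
                drop: Iterable[str] = ()) -> Tuple[List[str], List[List[str]]]:
    """Drop repeated column names (keeping the first) and unwanted columns."""
    unwanted = set(drop)
    seen = set()
    keep = []  # indices of columns that survive
    for index, name in enumerate(header):
        if name in seen or name in unwanted:
            continue
        seen.add(name)
        keep.append(index)
    new_header = [header[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_header, new_rows


header = ["site_no", "value", "site_no", "extra"]
rows = [["01", "1.5", "01", "x"]]
print(clean_table(header, rows, drop={"extra"}))
# prints (['site_no', 'value'], [['01', '1.5']])
```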

Metadata

Metadata Attributes

| Attributes | Descriptions |
| --- | --- |
| timestamp | Standard timestamp used for the time series |
| map_coordinates | Latitude and longitude of the station location (GeoJSON format) |
| country | Country where the site is located |
| country_code | ISO 3-letter country code |
| domain | Predefined domain by Taiyo |
| name | Name of the data |
| region | Region for a country according to World Bank standards |
| region_code | Region code for a region according to World Bank standards |
| sample_frequency | Frequency at which the data is updated at the source |
| sub_domain | Predefined subdomain by Taiyo |
| time_of_sampling | Time of data collection |
| date_of_sampling | Date of data collection |
| timezone | Timezone for the time and date |
| units | Unit of the value stored in the time series |
| measurement_type | Type of measure (min, max, median) |
| url | URL for each of the datasets |
| agency_code | |
| site_no | Unique site number |
| site_name | Name of the site |
| variable_description | Description of the time series value |
| description | Description of the dataset |
| variable_name | Name of the variable for the time series |
| hydrogic_unit_code | |
| county_code | County FIPS code for the United States |
| site_type_code | Site type code |
| sub_division_code | ISO 3166-2 code of the subdivision |
| value | Value of the time series |
| qualifiers | USGS qualification codes: `e` = value has been edited or estimated by USGS personnel and is write protected; `&` = value was computed from affected unit values; `E` = value was computed from estimated unit values; `A` = approved for publication, processing and review completed; `P` = provisional data subject to revision; `<` = value is known to be less than the reported value and is write protected; `>` = value is known to be greater than the reported value and is write protected; `1` = value is write protected without any remark code to be printed; `2` = remark is write protected without any remark code to be printed; blank = no remark |
| income_level | Economic income group to which the country belongs |
| sub_division_name | Name of the subdivision |
| sub_division_level | Subdivision level (e.g. State, Union Territory, Province, Economic Region) |
| county | County/District name |
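The codes listed for the `qualifiers` attribute can be turned into readable text with a simple lookup. This is only a sketch: the function `describe_qualifiers` is hypothetical, and it assumes each record carries its qualifiers as a list of code strings.

```python
from typing import Iterable, List

# Meanings as listed for the `qualifiers` attribute.
QUALIFIER_MEANINGS = {
    "e": "Value has been edited or estimated by USGS personnel and is write protected",
    "&": "Value was computed from affected unit values",
    "E": "Value was computed from estimated unit values",
    "A": "Approved for publication; processing and review completed",
    "P": "Provisional data subject to revision",
    "<": "Value is known to be less than the reported value and is write protected",
    ">": "Value is known to be greater than the reported value and is write protected",
    "1": "Value is write protected without any remark code to be printed",
    "2": "Remark is write protected without any remark code to be printed",
    "": "No remark",
}


def describe_qualifiers(codes: Iterable[str]) -> List[str]:
    """Map each qualifier code to its meaning; codes are case-sensitive."""
    return [
        QUALIFIER_MEANINGS.get(code.strip(), f"Unknown qualifier: {code!r}")
        for code in codes
    ]


print(describe_qualifiers(["A", "P"]))
```

The lookup is case-sensitive because `e` and `E` carry different meanings.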

Data Flow

The data pipeline runs on Argo and is executed periodically.

DAGs:

  • USGS-Water: 1 DAG file in total

Taiyo Data Format

| Entity | USGS-Water |
| --- | --- |
| Frequency | Daily |
| Updated On | 01-06-2022 UTC 12:14:16 PM |
| Coverage | All states in the USA |
| Uncertainties | - |

Scope for Improvement

The following can be improved in the next version of the data product:

  • In the future, the scraper could fetch only the data that we do not already have.
  • https://waterdata.usgs.gov/nwis/rt
  • https://help.waterdata.usgs.gov/codes-and-parameters/daily-value-qualification-code-dv_rmk_cd
  • https://help.waterdata.usgs.gov/faq/about-the-usgs-water-data-for-the-nation-site
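The first improvement listed above, fetching only data that is not already stored, might be approached by computing the next date range from the most recent stored date. The function `next_fetch_window` and its arguments are hypothetical, and how the latest stored date is obtained is left open.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def next_fetch_window(latest_stored: Optional[date],
                      today: date,
                      default_start: date) -> Optional[Tuple[date, date]]:
    """Return the (start, end) dates still missing, or None if up to date."""
    if latest_stored is None:
        start = default_start  # nothing stored yet: fetch the full history
    else:
        start = latest_stored + timedelta(days=1)  # day after the last stored one
    if start > today:
        return None  # already up to date
    return start, today


print(next_fetch_window(date(2022, 5, 30), date(2022, 6, 1), date(2022, 1, 1)))
# prints (datetime.date(2022, 5, 31), datetime.date(2022, 6, 1))
```

The returned dates map directly onto the `startDT` and `endDT` parameters of the endpoint.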