NOAA – Tsunamis

Introduction

NCEI archives and assimilates tsunami, earthquake and volcano data to support research, planning, response and mitigation. Long-term data, including photographs, can be used to establish the history of natural hazard occurrences and help mitigate against future events. The NCEI/WDS Global Historical Tsunami Database contains runup information on locations where tsunami effects were observed. The tsunami runup data is related to the tsunami source data which contains information on the source of the tsunami.

Source: NOAA

Tags: Climate and Environment, Disasters, Time-series, Risk, Daily

Modules

Scraping:

Below are the API endpoint and the parameters that we need to pass to get the data.

https://www.ngdc.noaa.gov/hazel/hazard-service/api/v1/{dataset}?maxYear={end_year}&minYear={start_year}
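As a minimal sketch of how the endpoint template above might be filled in, the snippet below builds the URL from a dataset path and a year range and decodes the JSON response. The function names are hypothetical, and the dataset value `tsunamis/runups` used in the usage note is an assumption about what `{dataset}` expands to; it is not stated in this document.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://www.ngdc.noaa.gov/hazel/hazard-service/api/v1"


def build_url(dataset: str, start_year: int, end_year: int) -> str:
    """Fill the endpoint template with a dataset path and a year range."""
    query = urlencode({"maxYear": end_year, "minYear": start_year})
    return f"{BASE_URL}/{dataset}?{query}"


def fetch(dataset: str, start_year: int, end_year: int, timeout: float = 30.0):
    """Download and decode one JSON response (performs a network request)."""
    url = build_url(dataset, start_year, end_year)
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

For example, `build_url("tsunamis/runups", 2000, 2022)` yields the template with `maxYear=2022&minYear=2000` appended. Whether the response is paginated is not covered here and would need to be checked against the API itself.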

Geocoder:

Coordinates for the country are added to the metadata, along with the region and region code. The Geocoder library is used to obtain coordinates. A separate JSON file of country coordinates is also kept, which avoids calling the third-party library for known countries and makes geocoding faster.
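The local-file-first lookup described above could look roughly like the following. This is a sketch under assumptions: the JSON layout (country name mapped to `[lat, lon]`), the upper-case key normalisation, and the function names are all hypothetical, and the third-party geocoder is represented by a plain `fallback` callable rather than any specific library call.

```python
import json
from typing import Callable, Optional, Tuple

Coords = Tuple[float, float]  # (latitude, longitude)


def load_country_coords(path: str) -> dict:
    """Load the local country -> [lat, lon] mapping from a JSON file."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def geocode_country(
    country: str,
    local_coords: dict,
    fallback: Optional[Callable[[str], Optional[Coords]]] = None,
) -> Optional[Coords]:
    """Return coordinates from the local table, else ask the fallback geocoder."""
    key = country.strip().upper()
    if key in local_coords:
        lat, lon = local_coords[key]
        return (lat, lon)
    if fallback is None:
        return None
    coords = fallback(country)
    if coords is not None:
        # Cache the answer so the external service is asked only once.
        local_coords[key] = list(coords)
    return coords
```

Passing the external geocoder in as a callable keeps the lookup testable without network access and makes it easy to swap the third-party service.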

Standardization:

Additional information such as sample frequency, units, source and description is included in the metadata. The standardization step also contains a function that fetches the ISO country code and appends it. The predefined domain and subdomain are added in this step.
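A rough illustration of this step, assuming the metadata is a plain dictionary: the function below merges dataset-level defaults into a record and appends the ISO 3-letter code. The function name, the shape of the defaults, and the small lookup table in the usage note are hypothetical and stand in for whatever the actual pipeline uses.

```python
def standardize(metadata: dict, defaults: dict, iso3_codes: dict) -> dict:
    """Return a copy of metadata with dataset defaults and an ISO code added."""
    # Values already present in the record win over the dataset defaults.
    result = {**defaults, **metadata}
    country = str(metadata.get("country", "")).strip().upper()
    # None marks a country that is missing from the lookup table.
    result["country_code"] = iso3_codes.get(country)
    return result
```

A call such as `standardize({"country": "Japan"}, {"source": "NOAA"}, {"JAPAN": "JPN"})` returns the record with `source` and `country_code` filled in. In practice the defaults would also carry the sample frequency, units, description, domain and subdomain.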

Cleaning:

Duplicate and additional columns are removed from the data. Location names are rectified and country names are formatted correctly.
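The cleaning rules above can be sketched as a single pass over the records. This is an illustrative version only: it assumes records are dictionaries with scalar values, that "formatting" a name means trimming whitespace and applying title case, and that duplicates are rows identical in every kept column. The function name is hypothetical.

```python
def clean_records(records, keep_columns):
    """Keep only wanted columns, tidy names, and drop duplicate rows."""
    cleaned, seen = [], set()
    for record in records:
        # Dropping unwanted columns: copy only the columns we keep.
        row = {col: record.get(col) for col in keep_columns}
        for col in ("country", "locationName"):
            if isinstance(row.get(col), str):
                # Collapse repeated whitespace and normalise capitalisation.
                row[col] = " ".join(row[col].split()).title()
        fingerprint = tuple(row[col] for col in keep_columns)
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(row)
    return cleaned
```

Title case is a simplification: abbreviations such as "USA" would need an explicit exception list, which is omitted here.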

Metadata

Metadata Attributes

Attributes Descriptions
sourceEqMagnitude The value in this column contains the primary earthquake magnitude
timestamp standard timestamp used for the timeseries (when the tsunami was observed)
id Id used to connect timeseries data to the metadata
map_coordinates Latitude and Longitude of the station location (geojson format)
country The country where the tsunami effects were observed.
country_code ISO 3-letter country code
description description of the dataset
domain Predefined domain by Taiyo.
name name of the data
original_id in this case we create our own original id using {city_measure_indicator}
region region for a country according to World Bank Standards.
region_code region code for a region according to World Bank Standards.
sample_frequency frequency in which data gets updated on the source
sub_domain Predefined subdomain by Taiyo.
time_of_sampling time of data collection
date_of_sampling date of data collection
timezone Timezone for the time and date
units Type of value stored in timeseries
measure type of measure (min, max, median)
url url for each of the datasets
latest_timestamp_id mongoDB id for latest timestamp in the timeseries
income_level income level of a country according to World Bank Standards.
locationName The location (city, state or island) where the tsunami effects were observed.
distFromSource The distance from the tsunami event source to the runup location.
travHours The travel time is the time in hours and minutes that it took the initial tsunami wave to travel from the source to the location of effects.
travMins The travel time is the time in minutes that it took the initial tsunami wave to travel from the source to the location of effects.
publish
area area where tsunami effects were observed
arrDay
arrHour
arrMin
runupHt
firstMotion
runupHoriz
damageAmountOrder For those events not offering an exact number of houses damaged, the following four-level scale was used to classify the damage and was listed in the Houses Destroyed column. If the actual number of houses destroyed was listed, a descriptor was also added for search purposes.
housesDestroyedAmountOrder
deaths Whenever possible, numbers of deaths are listed.
deathsAmountOrder When a description was found in the historical literature instead of an actual number of deaths, this value was coded and listed in the Deaths column. If the actual number of deaths was listed, a descriptor was also added for search purposes. 0 None; 1 Few (~1 to 50 deaths); 2 Some (~51 to 100 deaths); 3 Many (~101 to 1000 deaths); 4 Very many (over 1000 deaths)
injuries Whenever possible, numbers of injuries from the runup are listed.
injuriesAmountOrder When a description was found in the historical literature instead of an actual number of injuries, this value was coded and listed in the Injuries column. If the actual number of injuries was listed, a descriptor was also added for search purposes. 0 None; 1 Few (~1 to 50 injuries); 2 Some (~51 to 100 injuries); 3 Many (~101 to 1000 injuries); 4 Very many (over 1000 injuries)
housesDestroyed Whenever possible, numbers of houses destroyed are listed.
maxWaveArrDay
maxWaveArrHour
maxWaveArrMin
missingAmountOrder
volcanoEventId
housesDamagedAmountOrder Valid values: 0 to 4. For those events not offering an exact number of houses damaged, the following four-level scale was used to classify the damage and was listed in the Houses Destroyed column. If the actual number of houses destroyed was listed, a descriptor was also added for search purposes.
damageMillionsDollars The value in the Damage column should be multiplied by 1,000,000 to obtain the actual dollar amount.
sourceEventValidity Validity of the actual tsunami occurrence is indicated by a numerical rating of the reports of that event: -1 erroneous entry, 0 event that only caused a seiche or disturbance in an inland river, 1 very doubtful tsunami, 2 questionable tsunami, 3 probable tsunami, 4 definite tsunami
sourceCauseCode The source of the tsunami: 0 Unknown; 1 Earthquake; 2 Questionable Earthquake; 3 Earthquake and Landslide; 4 Volcano and Earthquake; 5 Volcano, Earthquake, and Landslide; 6 Volcano; 7 Volcano and Landslide; 8 Landslide; 9 Meteorological; 10 Explosion; 11 Astronomical Tide
tsunamiEventId
earthquakeEventId
doubtful Doubtful values: n Runup entry was not doubtful; y Runup entry was doubtful; m The waves likely had a meteorologic source, and thus were not true tsunami waves
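Several attributes in the table above are stored as codes. The sketch below shows one way a consumer might turn them into readable labels and convert the damage figure to dollars. The code tables are copied from the attribute descriptions; the function names and the `Label` suffix for the added fields are hypothetical.

```python
AMOUNT_ORDER = {0: "None", 1: "Few", 2: "Some", 3: "Many", 4: "Very many"}

EVENT_VALIDITY = {
    -1: "erroneous entry",
    0: "event that only caused a seiche or disturbance in an inland river",
    1: "very doubtful tsunami",
    2: "questionable tsunami",
    3: "probable tsunami",
    4: "definite tsunami",
}

CAUSE_CODE = {
    0: "Unknown", 1: "Earthquake", 2: "Questionable Earthquake",
    3: "Earthquake and Landslide", 4: "Volcano and Earthquake",
    5: "Volcano, Earthquake, and Landslide", 6: "Volcano",
    7: "Volcano and Landslide", 8: "Landslide", 9: "Meteorological",
    10: "Explosion", 11: "Astronomical Tide",
}

CODE_TABLES = {
    "deathsAmountOrder": AMOUNT_ORDER,
    "injuriesAmountOrder": AMOUNT_ORDER,
    "sourceEventValidity": EVENT_VALIDITY,
    "sourceCauseCode": CAUSE_CODE,
}


def damage_in_dollars(damage_millions):
    """Convert damageMillionsDollars to a plain dollar amount."""
    return None if damage_millions is None else damage_millions * 1_000_000


def decode(record: dict) -> dict:
    """Return a copy of record with a '<field>Label' entry per coded field."""
    out = dict(record)
    for field, table in CODE_TABLES.items():
        if record.get(field) is not None:
            out[field + "Label"] = table.get(record[field])
    return out
```

Fields that are absent or empty are left untouched, so the helper can be applied to partially filled records.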

Data Flow

[Figure: data pipeline diagram]

The above data pipeline runs on Argo and is executed periodically.

DAGs:

  • NOAA-TSUNAMIS: Total number of DAG files is 1

Taiyo Data Format

Entity NOAA-Tsunamis
Frequency Event Based
Updated On 09-06-2022 UTC 12:14:16 PM
- -
Coverage covering all the countries with the
Uncertainties -
## Scope for Improvement

Following can be improved in the next version of the data product:

  • Every time the Argo Workflow runs, we overwrite the existing data in the S3 bucket.
  • In the future we might want to improve it to scrape only the data that we don't already have.
  • https://www.ngdc.noaa.gov/hazard/tsu.shtml
  • https://www.ngdc.noaa.gov/hazel/view/hazards/tsunami/runup-search
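The incremental scraping idea listed under Scope for Improvement could be approached along these lines. This is only a sketch of one possible design, not the pipeline's current behaviour: it assumes the years already stored are known, re-requests the latest stored year in case entries were added to it, and merges results by a record key. The function names and the `id` key are hypothetical.

```python
def next_year_range(stored_years, current_year, first_year):
    """Return the (minYear, maxYear) pair that still needs to be requested."""
    if not stored_years:
        return (first_year, current_year)
    # Re-request the latest stored year: new entries may have been added to it.
    return (max(stored_years), current_year)


def merge_by_id(existing, fetched, key="id"):
    """Merge fetched records into existing ones; fetched wins on equal keys."""
    merged = {rec[key]: rec for rec in existing}
    for rec in fetched:
        merged[rec[key]] = rec
    return list(merged.values())
```

The returned pair maps directly onto the `minYear` and `maxYear` query parameters of the endpoint, so only the missing years are requested instead of the full history.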