Overview

Get meteorological data from met stations located all over the world. That’s what you can do with this R package. There are LOTS of stations too (29,729 available in this dataset) and many have data that go pretty far back in time. The data come from the Integrated Surface Dataset (ISD), which is maintained by the National Oceanic and Atmospheric Administration (NOAA).

Discovering Met Stations

At a minimum we need a station’s identifier to obtain its met data. We can start the process of getting an identifier by accessing the entire catalog of station metadata with the get_station_metadata() function. The output tibble has station id values in the first column. Let’s get a subset of stations from that: those stations that are located in Norway.
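A minimal sketch of that subsetting step is shown here. It assumes the metadata tibble has a country column holding two-letter country codes ("NO" for Norway); the stations_norway name is only illustrative.

```r
library(stationaRy)
library(dplyr)

# Assumption: `country` holds two-letter country codes ("NO" = Norway)
stations_norway <-
  get_station_metadata() %>%
  dplyr::filter(country == "NO")

# Print the subset; station ids are in the `id` column
stations_norway
```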

This table can be reduced further to isolate the stations of interest. For example, we could elect to get only high-altitude stations (above 1000 meters) in Norway.
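One way to express the elevation cut, assuming the metadata has an elev column giving station elevation in meters (and the country column with two-letter codes), is sketched below. The norway_high_elev name is hypothetical.

```r
library(stationaRy)
library(dplyr)

# Assumptions: `country` uses two-letter codes; `elev` is elevation in meters
norway_high_elev <-
  get_station_metadata() %>%
  dplyr::filter(country == "NO", elev > 1000)

norway_high_elev
```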

The station IDs from the tibble can be transformed into a vector of station IDs with dplyr::pull().
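As a rough illustration, the pipeline below ends with dplyr::pull() so that the result is a plain character vector rather than a tibble. The country and elev column names are assumptions about the metadata table, and high_elev_ids is a made-up name.

```r
library(stationaRy)
library(dplyr)

# pull() extracts the `id` column as a vector, ready to pass to other functions
high_elev_ids <-
  get_station_metadata() %>%
  dplyr::filter(country == "NO", elev > 1000) %>%  # assumed column names
  dplyr::pull(id)

high_elev_ids
```

A vector like this is convenient when the same operation has to be repeated for each station in turn.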

Suppose you’d like to collect several years of met data from a particular station and fetch only the observations that meet some set of conditions. Here’s an example of obtaining temperatures above 15 degrees Celsius from the high-altitude "JUVVASSHOE" station in Norway and adding a column with temperatures in degrees Fahrenheit.

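library(stationaRy)
library(dplyr) # provides %>% along with filter(), pull(), mutate(), etc.

# Look up the station id by name, then fetch nine years of its met data
station_data <-
  get_station_metadata() %>%
  dplyr::filter(name == "JUVVASSHOE") %>%
  dplyr::pull(id) %>%
  get_met_data(years = 2011:2019)

# Keep readings above 15 degrees C, add a Fahrenheit column, sort warmest first
high_temp_data <-
  station_data %>%
  dplyr::select(id, time, wd, ws, temp) %>%
  dplyr::filter(temp > 15) %>%
  dplyr::mutate(temp_f = ((temp * (9/5)) + 32) %>% round(1)) %>%
  dplyr::arrange(dplyr::desc(temp_f))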

Additional Data Fields

There can be a substantial amount of additional met data beyond wind speed, ambient temperature, etc. However, these additional fields can vary greatly across stations. The nomenclature for the additional categories of data uses ‘two-letter + digit’ identifiers (e.g., AA1, GA1, etc.). Within each category are numerous fields, where the variables are coded as [identifier]_[index]. More information about these additional data fields can be found in this PDF document.

To find out which categories of additional data fields are available for a station, we can use the station_coverage() function. You’ll get a tibble with the available additional categories and their counts over the specified period.
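A sketch of such a call for the "JUVVASSHOE" station follows. The single year used here (2015) is an arbitrary choice for illustration, and the juvvasshoe_coverage name is hypothetical. The call downloads data, so it needs a network connection.

```r
library(stationaRy)
library(dplyr)

# Look up the station id by name, then count additional-field observations
# for one year (2015 is an arbitrary choice)
juvvasshoe_coverage <-
  get_station_metadata() %>%
  dplyr::filter(name == "JUVVASSHOE") %>%
  dplyr::pull(id) %>%
  station_coverage(years = 2015)

juvvasshoe_coverage
```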

We can use purrr’s map_df() function to get additional data field coverage for a subset of stations (those that are near sea level and have data in 2019). With the station_coverage() function set to output tibbles in wide mode (one row per station, field categories as columns, and counts of observations as values), we can ascertain which stations have the particular fields we need.
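The following is a sketch of how that could look, not necessarily the exact code behind the table shown. It assumes the metadata has country, elev, and end_year columns, that "near sea level" means an elevation of 5 meters or less, and that wide mode is switched on with a wide_tbl argument. The stn_ids name is made up.

```r
library(stationaRy)
library(dplyr)
library(purrr)

# Assumed filter: Norwegian stations at or below 5 m with data through 2019
stn_ids <-
  get_station_metadata() %>%
  dplyr::filter(country == "NO", elev <= 5, end_year == 2019) %>%
  dplyr::pull(id)

# One wide row per station; map_df() row-binds the per-station tibbles
coverage_tbl <-
  purrr::map_df(
    stn_ids,
    function(stn_id) {
      station_coverage(stn_id, years = 2019, wide_tbl = TRUE)
    }
  )
```

Each station triggers its own download, so the loop can take a while for larger subsets.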

coverage_tbl
#> # A tibble: 16 x 88
#>    id       AA1   AB1   AC1   AD1   AE1   AG1   AH1   AI1   AJ1   AK1   AL1
#>    <chr>  <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
#>  1 01023…     0     0     0     0     0     0     0     0     0     0     0
#>  2 01046…     0     0     0     0     0     0     0     0     0     0     0
#>  3 01049…  5810     0     0     0     0     0     0     0     0     0     0
#>  4 01107…  1048     0     0     0     0     0     0     0     0     0     0
#>  5 01139…     0     0     0     0     0     0     0     0     0     0     0
#>  6 01146…  6014     0     0     0     0     0     0     0     0     0     0
#>  7 01162…     1     0     0     0     0     0     0     0     0     0     0
#>  8 01167…   376     0     0     0     0     0     0     0   122     0     0
#>  9 01217…     0     0     0     0     0     0     0     0     0     0     0
#> 10 01225…     0     0     0     0     0     0     0     0     0     0     0
#> 11 01234…  1047     0     0     0     0     0     0     0     0     0     0
#> 12 01290…     0     0     0     0     0     0     0     0     0     0     0
#> 13 01332…  6288     0     0     0     0     0     0     0     0     0     0
#> 14 01355…  6125     0     0     0     0     0     0     0     0     0     0
#> 15 01467…     0     0     0     0     0     0     0     0     0     0     0
#> 16 01476…     0     0     0     0     0     0     0     0     0     0     0
#> # … with 76 more variables: AM1 <int>, AN1 <int>, AO1 <int>, AP1 <int>,
#> #   AU1 <int>, AW1 <int>, AX1 <int>, AY1 <int>, AZ1 <int>, CB1 <int>,
#> #   CF1 <int>, CG1 <int>, CH1 <int>, CI1 <int>, CN1 <int>, CN2 <int>,
#> #   CN3 <int>, CN4 <int>, CR1 <int>, CT1 <int>, CU1 <int>, CV1 <int>,
#> #   CW1 <int>, CX1 <int>, CO1 <int>, CO2 <int>, ED1 <int>, GA1 <int>,
#> #   GD1 <int>, GF1 <int>, GG1 <int>, GH1 <int>, GJ1 <int>, GK1 <int>,
#> #   GL1 <int>, GM1 <int>, GN1 <int>, GO1 <int>, GP1 <int>, GQ1 <int>,
#> #   GR1 <int>, HL1 <int>, IA1 <int>, IA2 <int>, IB1 <int>, IB2 <int>,
#> #   IC1 <int>, KA1 <int>, KB1 <int>, KC1 <int>, KD1 <int>, KE1 <int>,
#> #   KF1 <int>, KG1 <int>, MA1 <int>, MD1 <int>, ME1 <int>, MF1 <int>,
#> #   MG1 <int>, MH1 <int>, MK1 <int>, MV1 <int>, MW1 <int>, OA1 <int>,
#> #   OB1 <int>, OC1 <int>, OE1 <int>, RH1 <int>, SA1 <int>, ST1 <int>,
#> #   UA1 <int>, UG1 <int>, UG2 <int>, WA1 <int>, WD1 <int>, WG1 <int>

For the "KAWAIHAE" station in Hawaii, some interesting data fields are available. In particular, its SA1 category provides sea surface temperature data, where the sa1_1 and sa1_2 variables represent the sea surface temperature and its quality code.

Combining the use of get_met_data() with functions from dplyr, we can create a table of the mean ambient and sea-surface temperatures by month. The additional data is included in the met data table by using the add_fields argument and specifying the "SA1" category (multiple categories can be included).
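Here is one way this could be written, offered as a sketch under a few assumptions: the station is looked up by name, the years 2017 and 2018 are an arbitrary choice, a quality code of 1 in sa1_2 is taken to mean the reading passed quality checks, and lubridate is used to split out the year and month. The kawaihae_sst, avg_temp, and avg_sst names are illustrative.

```r
library(stationaRy)
library(dplyr)
library(lubridate)

kawaihae_sst <-
  get_station_metadata() %>%
  dplyr::filter(name == "KAWAIHAE") %>%
  dplyr::pull(id) %>%
  # add_fields brings in the SA1 (sea surface temperature) category
  get_met_data(years = 2017:2018, add_fields = "SA1") %>%
  dplyr::mutate(
    year = lubridate::year(time),
    month = lubridate::month(time)
  ) %>%
  # Assumption: quality code 1 marks readings that passed all checks
  dplyr::filter(sa1_2 == 1) %>%
  dplyr::group_by(year, month) %>%
  dplyr::summarize(
    avg_temp = mean(temp, na.rm = TRUE),
    avg_sst = mean(sa1_1, na.rm = TRUE)
  )

kawaihae_sst
```

To request more than one category, pass a character vector to add_fields, for example c("SA1", "AA1").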

Installation

The stationaRy package can be easily installed from CRAN.

install.packages("stationaRy")

To install the development version of stationaRy, use the following:

install.packages("remotes")
remotes::install_github("rich-iannone/stationaRy")

If you encounter a bug, have usage questions, or want to share ideas to make this package better, feel free to file an issue.

License

MIT © Richard Iannone