The get_shark_data function retrieves data from the SHARK database hosted by SMHI. The function sends a POST request
to the SHARK API with customizable filters, including year, month, taxon name, water category, and more, and returns the
retrieved data as a structured data.frame. To view available filter options, see get_shark_options.
Usage
get_shark_data(
  tableView = "sharkweb_overview",
  headerLang = "internal_key",
  save_data = FALSE,
  file_path = NULL,
  delimiters = "point-tab",
  lineEnd = "win",
  encoding = "utf_8",
  dataTypes = c(),
  bounds = c(),
  fromYear = NULL,
  toYear = NULL,
  months = c(),
  parameters = c(),
  checkStatus = "",
  qualityFlags = c(),
  deliverers = c(),
  orderers = c(),
  projects = c(),
  datasets = c(),
  minSamplingDepth = "",
  maxSamplingDepth = "",
  redListedCategory = c(),
  taxonName = c(),
  stationName = c(),
  vattenDistrikt = c(),
  seaBasins = c(),
  counties = c(),
  municipalities = c(),
  waterCategories = c(),
  typOmraden = c(),
  helcomOspar = c(),
  seaAreas = c(),
  hideEmptyColumns = FALSE,
  row_limit = 10^7,
  prod = TRUE,
  verbose = TRUE
)
Arguments
- tableView
Character. Specifies the columns of the table to retrieve. Options include:
"sharkweb_overview": Overview table
"sharkweb_all": All available columns
"sharkdata_bacterioplankton": Bacterioplankton table
"sharkdata_chlorophyll": Chlorophyll table
"sharkdata_epibenthos": Epibenthos table
"sharkdata_greyseal": Grey seal table
"sharkdata_harbourporpoise": Harbour porpoise table
"sharkdata_harbourseal": Harbour seal table
"sharkdata_jellyfish": Jellyfish table
"sharkdata_physicalchemical": Physical chemical table
"sharkdata_physicalchemical_columns": Physical chemical table (column view)
"sharkdata_phytoplankton": Phytoplankton table
"sharkdata_picoplankton": Picoplankton table
"sharkdata_planktonbarcoding": Plankton barcoding table
"sharkdata_primaryproduction": Primary production table
"sharkdata_ringedseal": Ringed seal table
"sharkdata_sealpathology": Seal pathology table
"sharkdata_sedimentation": Sedimentation table
"sharkdata_zoobenthos": Zoobenthos table
"sharkdata_zooplankton": Zooplankton table
"report_sum_year_param": Report sum per year and parameter
"report_sum_year_param_taxon": Report sum per year, parameter and taxon
"report_sampling_per_station": Report sampling per station
"report_obs_taxon": Report observed taxa
"report_stations": Report stations
"report_taxon": Report taxa
Default is "sharkweb_overview".
- headerLang
Character. Language option for column headers. Possible values:
"sv": Swedish.
"en": English.
"short": Shortened version.
"internal_key": Internal key (default).
- save_data
Logical. If TRUE, the data will be saved to a specified file (see file_path). If FALSE, a temporary file will be created instead. The temporary file will be automatically deleted after it is loaded into memory.
- file_path
Character. The file path where the data should be saved. Required if save_data is TRUE. Ignored if save_data is FALSE.
- delimiters
Character. Specifies the delimiter used to separate values in the file, if save_data is TRUE. Options are "point-tab" (tab-separated) or "point-semi" (semicolon-separated). Default is "point-tab".
- lineEnd
Character. Defines the type of line endings in the file, if save_data is TRUE. Options are "win" (Windows-style, \r\n) or "unix" (Unix-style, \n). Default is "win".
- encoding
Character. Sets the file's text encoding, if save_data is TRUE. Options are "cp1252", "utf_8", "utf_16", or "latin_1". Default is "utf_8".
- dataTypes
Character vector. Specifies data types to filter. Possible values include:
"Bacterioplankton"
"Chlorophyll"
"Epibenthos"
"Grey seal"
"Harbour Porpoise"
"Harbour seal"
"Jellyfish"
"Physical and Chemical"
"Phytoplankton"
"Picoplankton"
"PlanktonBarcoding"
"Primary production"
"Profile"
"Ringed seal"
"Seal pathology"
"Sedimentation"
"Zoobenthos"
"Zooplankton"
- bounds
A numeric vector of length 4 specifying the geographical search boundaries in decimal degrees, formatted as c(lon_min, lat_min, lon_max, lat_max), e.g., c(11, 58, 12, 59). Default is c() to include all data.
- fromYear
Integer (optional). The starting year for data retrieval. If set to NULL (default), the function will use the earliest available year in SHARK.
- toYear
Integer (optional). The ending year for data retrieval. If set to NULL (default), the function will use the latest available year in SHARK.
- months
Integer vector. The months to retrieve data for, e.g., c(4, 5, 6) for April to June.
- parameters
Character vector. Optional parameters to filter the results by, such as "Chlorophyll-a".
- checkStatus
Character string. Optional status check to filter results.
- qualityFlags
Character vector. Specifies the quality flags to filter the data by. By default, all data are returned, including values flagged "B" (Bad).
- deliverers
Character vector. Specifies the data deliverers to filter by.
- orderers
Character vector. Orderers (specific organizations or individuals) to filter by.
- projects
Character vector. Research or monitoring projects to filter by.
- datasets
Character vector. Dataset names to filter by.
- minSamplingDepth
Numeric. Minimum sampling depth (in meters) to filter the data.
- maxSamplingDepth
Numeric. Maximum sampling depth (in meters) to filter the data.
- redListedCategory
Character vector. Red-listed taxa for conservation filtering.
- taxonName
Character vector. Optional vector of taxa names to filter by.
- stationName
Character vector. Station names to filter data by specific stations.
- vattenDistrikt
Character vector. Water district names to filter by Swedish water districts.
- seaBasins
Character vector. Sea basins to filter by.
- counties
Character vector. Counties to filter by specific administrative regions.
- municipalities
Character vector. Municipalities to filter by.
- waterCategories
Character vector. Water categories to filter by.
- typOmraden
Character vector. Type areas to filter by.
- helcomOspar
Character vector. HELCOM or OSPAR areas for regional filtering.
- seaAreas
Character vector. Sea area codes to filter by specific sea areas.
- hideEmptyColumns
Logical. Whether to hide empty columns. Default is FALSE.
- row_limit
Numeric. Specifies the maximum number of rows that can be retrieved in a single request. If the requested data exceeds this limit, the function automatically downloads the data in yearly chunks. The default value is 10 million rows.
- prod
Logical. Whether to query the PROD (production) server or the SMHI internal TEST (testing) server. Default is TRUE (PROD).
- verbose
Logical. Whether to display progress information. Default is TRUE.
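As an illustration of how the filter arguments combine, the following is a minimal sketch that requests chlorophyll data for April to June of 2019 and 2020 within a bounding box. It assumes that get_shark_data is provided by an installed package named SHARK4R (an assumption, since this page does not name the package) and that the SHARK API is reachable. The query_args list is a hypothetical name used only for this example, and the request is sent only in an interactive session.

```r
# Filter arguments taken from the Usage section above; the values are
# illustrative only.
query_args <- list(
  tableView = "sharkweb_overview",
  fromYear  = 2019,
  toYear    = 2020,
  months    = c(4, 5, 6),            # April to June
  dataTypes = c("Chlorophyll"),
  bounds    = c(11, 58, 12, 59),     # lon_min, lat_min, lon_max, lat_max
  verbose   = FALSE
)

# Basic sanity checks on the bounding box before sending the request
stopifnot(
  length(query_args$bounds) == 4,
  query_args$bounds[1] < query_args$bounds[3],   # lon_min < lon_max
  query_args$bounds[2] < query_args$bounds[4]    # lat_min < lat_max
)

# Assumption: the function lives in the SHARK4R package. The call is
# skipped when the package is missing or the session is not interactive.
if (interactive() && requireNamespace("SHARK4R", quietly = TRUE)) {
  shark_data <- do.call(SHARK4R::get_shark_data, query_args)
  str(shark_data)
}
```

Building the arguments as a list first makes it easy to inspect or reuse the same filters across several calls.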
Value
A data.frame containing the retrieved SHARK data, with column names based on the API's response.
Details
This function sends a POST request to the SHARK API with the specified filters. The response is parsed as JSON
and then converted into a data.frame. The function dynamically constructs the query body to filter
the data based on the provided parameters. If the requested data exceeds row_limit, retrieval is
split into yearly chunks to avoid overwhelming the API or running into memory issues. Please note that making very
large requests, such as retrieving the entire database, can be extremely memory-intensive.
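The chunking idea can be sketched as follows. This is not the package's internal implementation, only a minimal illustration of downloading one year at a time and binding the pieces together. The names fetch_in_yearly_chunks and stub_fetch are hypothetical, and stub_fetch is an offline stand-in for a function such as get_shark_data that accepts fromYear and toYear and returns a data.frame.

```r
# Hypothetical helper: calls fetch_fun once per year and combines the results.
# fetch_fun must accept fromYear and toYear and return a data.frame.
fetch_in_yearly_chunks <- function(fetch_fun, from_year, to_year, ...) {
  years <- seq(from_year, to_year)
  chunks <- lapply(years, function(yr) {
    fetch_fun(fromYear = yr, toYear = yr, ...)
  })
  # Drop years that returned no rows, then bind the rest row-wise
  chunks <- Filter(function(x) nrow(x) > 0, chunks)
  if (length(chunks) == 0) {
    return(data.frame())
  }
  do.call(rbind, chunks)
}

# Offline stub used only to demonstrate the helper; it returns two rows
# per requested year instead of contacting the SHARK API.
stub_fetch <- function(fromYear, toYear, ...) {
  data.frame(year = fromYear, value = seq_len(2))
}

combined <- fetch_in_yearly_chunks(stub_fetch, 2019, 2021)
nrow(combined)
```

Requesting one year at a time bounds the size of each response, at the cost of more requests; the combined result still has to fit in memory.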