
Retrieve Taxonomic Data From Dyntaxa
Dyntaxa
Dyntaxa is a taxonomic database of Swedish organisms hosted at SLU Artdatabanken,
providing information on their names and relationships. The database
includes details such as the current classification, recommended names,
and commonly used synonymous or misapplied names. Dyntaxa is
continuously updated with new species for Sweden, new Swedish names,
synonymous scientific names, and new data on relationships. The data in
Dyntaxa serves as the foundation and framework for taxonomic information
in SHARK. It can be accessed via a
web interface or through the API, as demonstrated in
this tutorial using SHARK4R. Please note that the authors
of SHARK4R are not affiliated with Dyntaxa.
Getting Started
Installation
You can install the package from GitHub using the devtools package:
# install.packages("devtools")
devtools::install_github("sharksmhi/SHARK4R",
                         dependencies = TRUE)
Load the SHARK4R, tibble and dplyr packages:
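A minimal sketch of this loading step, assuming the three packages named above are already installed. The availability check is an illustrative addition and is not part of SHARK4R:

```r
# Packages used throughout this tutorial
pkgs <- c("SHARK4R", "tibble", "dplyr")

# Check which packages are installed before attaching them
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

if (all(installed)) {
  library(SHARK4R)
  library(tibble)
  library(dplyr)
} else {
  # Report what is missing instead of failing halfway through
  message("Missing packages: ", paste(pkgs[!installed], collapse = ", "))
}
```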
Retrieve Taxonomy Table from SHARK
Taxon and data tables can be retrieved with the same filtering
options available in SHARK. To see the available filtering options,
please refer to get_shark_options and the Retrieve Data From SHARK tutorial.
# Retrieve taxonomy reports for phytoplankton between 2019 and 2020
shark_taxon <- get_shark_data(tableView = "report_taxon",
                              fromYear = 2019,
                              toYear = 2020,
                              dataTypes = "Phytoplankton",
                              verbose = FALSE)
# Print data
print(shark_taxon)
## # A tibble: 701 × 6
## reported_scientific_name scientific_name dyntaxa_id aphia_id taxon_hierarchy
## <chr> <chr> <dbl> <dbl> <chr>
## 1 Acanthoceras zachariasii Acanthoceras z… 264148 178990 Chromista - Sa…
## 2 Acanthoica quattrospina Acanthoica qua… 236952 235802 Chromista - Ha…
## 3 Acanthostomella Acanthostomella 1010638 NA Chromista - Sa…
## 4 Acanthostomella norvegica Acanthostomell… 238502 183556 Chromista - Sa…
## 5 Achnanthes Achnanthes 1010466 149191 Chromista - Sa…
## 6 Actinastrum hantzschii Actinastrum ha… 238839 160543 Plantae - Viri…
## 7 Actinocyclus Actinocyclus 1010407 148944 Chromista - Sa…
## 8 Actinocyclus normanii Actinocyclus n… 237433 148945 Chromista - Sa…
## 9 Actinocyclus octonarius Actinocyclus o… 237434 149164 Chromista - Sa…
## 10 Actinocyclus octonarius … Actinocyclus o… 248668 162770 Chromista - Sa…
## # ℹ 691 more rows
## # ℹ 1 more variable: counted_rows <dbl>
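The taxon report can be screened for entries that lack an AphiaID, as with Acanthostomella in the output above. The following is a minimal sketch in base R; `taxon_subset` is a toy data frame copied from three of the printed rows so that the example runs without an API call, and the same subsetting should apply to `shark_taxon` itself:

```r
# Toy subset of the taxon report, copied from rows printed above
taxon_subset <- data.frame(
  scientific_name = c("Acanthostomella", "Achnanthes", "Actinocyclus"),
  dyntaxa_id = c(1010638, 1010466, 1010407),
  aphia_id = c(NA, 149191, 148944),
  stringsAsFactors = FALSE
)

# Keep only the taxa that have no AphiaID
missing_aphia <- taxon_subset[is.na(taxon_subset$aphia_id), ]

print(missing_aphia$scientific_name)
```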
# Retrieve all phytoplankton data from July 2015
shark_data <- get_shark_data(tableView = "sharkdata_phytoplankton",
                             fromYear = 2015,
                             toYear = 2015,
                             months = 7,
                             dataTypes = c("Phytoplankton"),
                             verbose = FALSE)
# Print data
print(shark_data)
## # A tibble: 12,519 × 138
## delivery_datatype check_status_sv data_checked_by_sv visit_year visit_month
## <chr> <chr> <chr> <dbl> <dbl>
## 1 Phytoplankton Klar Leverantör 2015 7
## 2 Phytoplankton Klar Leverantör 2015 7
## 3 Phytoplankton Klar Leverantör 2015 7
## 4 Phytoplankton Klar Leverantör 2015 7
## 5 Phytoplankton Klar Leverantör 2015 7
## 6 Phytoplankton Klar Leverantör 2015 7
## 7 Phytoplankton Klar Leverantör 2015 7
## 8 Phytoplankton Klar Leverantör 2015 7
## 9 Phytoplankton Klar Leverantör 2015 7
## 10 Phytoplankton Klar Leverantör 2015 7
## # ℹ 12,509 more rows
## # ℹ 133 more variables: station_name <chr>, reported_station_name <chr>,
## # sample_location_id <dbl>, station_id <dbl>, sample_project_name_sv <chr>,
## # sample_orderer_name_sv <chr>, platform_code <chr>, expedition_id <dbl>,
## # shark_sample_id_md5 <chr>, sample_date <date>, sample_time <time>,
## # sample_latitude_dm <chr>, sample_longitude_dm <chr>,
## # sample_latitude_dd <dbl>, sample_longitude_dd <dbl>, …
Dyntaxa API Key
Dyntaxa requires a subscription key to access its API. To obtain your own key, sign up for the taxonomy product at the SLU Swedish Species Information Centre's Developer Portal. In the example below, the key is retrieved from an environment variable.
# Retrieve the API key
dyntaxa_key <- Sys.getenv("DYNTAXA_KEY")
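`Sys.getenv()` returns an empty string when the variable is not set, so a missing key would only surface later as a failed API request. A minimal sketch of an early check follows; `get_dyntaxa_key()` is a hypothetical helper written for this example and is not a SHARK4R function:

```r
# Hypothetical helper: stop early if the environment variable is unset or empty
get_dyntaxa_key <- function(var = "DYNTAXA_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable '", var, "' is not set. ",
         "Add it to your .Renviron file or call Sys.setenv() first.",
         call. = FALSE)
  }
  key
}

# Usage (uncomment once the variable is defined):
# dyntaxa_key <- get_dyntaxa_key()
```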
Update SHARK Taxonomy Data
If the taxonomic data downloaded from SHARK are outdated, they can be
updated to the latest Dyntaxa information using SHARK4R.
Alternatively, data can be retrieved from WoRMS. For details, see the WoRMS Tutorial.
# Update taxonomy information for the retrieved phytoplankton data
updated_taxonomy <- update_dyntaxa_taxonomy(
  dyntaxa_ids = shark_data$dyntaxa_id,
  subscription_key = dyntaxa_key,
  verbose = FALSE)
# Print the updated taxonomy data
print(updated_taxonomy)
## # A tibble: 12,519 × 10
## dyntaxa_id scientific_name taxon_kingdom taxon_phylum taxon_class taxon_order
## <dbl> <chr> <chr> <chr> <chr> <chr>
## 1 237393 Aulacoseira am… Chromista Gyrista Coscinodis… Aulacoseir…
## 2 237398 Aulacoseira it… Chromista Gyrista Coscinodis… Aulacoseir…
## 3 237398 Aulacoseira it… Chromista Gyrista Coscinodis… Aulacoseir…
## 4 238026 Diatoma tenuis Chromista Gyrista Bacillario… Rhabdonema…
## 5 237039 Dinobryon bava… Chromista Gyrista Chrysophyc… Ochromonad…
## 6 237978 Tabellaria flo… Chromista Gyrista Bacillario… Rhabdonema…
## 7 237978 Tabellaria flo… Chromista Gyrista Bacillario… Rhabdonema…
## 8 237978 Tabellaria flo… Chromista Gyrista Bacillario… Rhabdonema…
## 9 1010525 Cryptomonas Chromista Cryptophyta Cryptophyc… Cryptomona…
## 10 1010525 Cryptomonas Chromista Cryptophyta Cryptophyc… Cryptomona…
## # ℹ 12,509 more rows
## # ℹ 4 more variables: taxon_family <chr>, taxon_genus <chr>,
## # taxon_species <chr>, taxon_hierarchy <chr>
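The output above contains the same Dyntaxa ID on several rows, because each observation carries its own ID. One possible pattern is to look up each distinct ID once and then expand the result back to the original records. This sketch uses base R only; the `lookup` data frame is a stand-in for an API result, and whether update_dyntaxa_taxonomy already deduplicates internally is not something this tutorial states:

```r
# IDs as they appear per record, with repeats (taken from the output above)
ids <- c(237393, 237398, 237398, 238026, 237978, 237978, 237978)

# Each distinct ID needs to be looked up only once
unique_ids <- unique(ids)

# Stand-in for a per-ID lookup result; in practice this would come from the API
lookup <- data.frame(
  dyntaxa_id = unique_ids,
  taxon_kingdom = "Chromista",
  taxon_phylum = "Gyrista",
  stringsAsFactors = FALSE
)

# Expand back to one row per original record, preserving record order
expanded <- lookup[match(ids, lookup$dyntaxa_id), ]
rownames(expanded) <- NULL

print(expanded)
```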
Match Taxon Names
# Randomly select 10 phytoplankton taxa from shark_taxon
taxon_names <- sample(shark_taxon$scientific_name, size = 10)
# Match taxon_names with Dyntaxa API
matches <- match_taxon_name(taxon_names = taxon_names,
                            subscription_key = dyntaxa_key,
                            multiple_options = FALSE,
                            verbose = FALSE)
# Print the result
tibble(matches)
## # A tibble: 10 × 5
## search_pattern taxon_id best_match author valid_name
## <chr> <int> <chr> <chr> <chr>
## 1 Cosmarium 1010708 Cosmarium Corda… Cosmarium
## 2 Monoraphidium dybowskii 238756 Monoraphid… (Wolo… Monoraphi…
## 3 Pavlova 1010307 Pavlova Butch… Pavlova
## 4 Aulacoseira islandica subsp. helvetica 248665 Aulacoseir… (O.F.… Aulacosei…
## 5 Planktosphaeria gelatinosa 238776 Planktosph… G.M. … Planktosp…
## 6 Pyramimonas virginica 238976 Pyramimona… Penni… Pyramimon…
## 7 Oxytoxum gracile 238173 Oxytoxum g… J.Sch… Oxytoxum …
## 8 Desmodesmus armatus 238842 Desmodesmu… (Chod… Desmodesm…
## 9 Dactyliosolen fragilissimus 237461 Dactylioso… (Berg… Dactylios…
## 10 Protoperidinium curtipes 238251 Protoperid… (E. J… Protoperi…
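Because the taxa are drawn with `sample()`, each run selects a different set of names. If a reproducible selection is preferred, the random seed can be fixed first. A minimal sketch, in which the seed value is arbitrary and `taxa` is a short illustrative vector of names from the output above:

```r
# Illustrative vector of names taken from the matches printed above
taxa <- c("Cosmarium", "Pavlova", "Oxytoxum gracile",
          "Desmodesmus armatus", "Pyramimonas virginica")

# Fixing the seed makes the random draw repeatable
set.seed(2025)
first_draw <- sample(taxa, size = 3)

set.seed(2025)
second_draw <- sample(taxa, size = 3)

# Both draws contain the same names in the same order
identical(first_draw, second_draw)
```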
Retrieve Taxonomic Information
Taxonomic records can be retrieved for individual taxa using the get_dyntaxa_records function.
# Get all Dyntaxa IDs
dyntaxa_id <- unique(matches$taxon_id)
# Remove potential NAs
dyntaxa_id <- dyntaxa_id[!is.na(dyntaxa_id)]
# Get Dyntaxa records
dyntaxa_records <- get_dyntaxa_records(taxon_ids = dyntaxa_id,
                                       subscription_key = dyntaxa_key)
# Print records
tibble(dyntaxa_records)
## # A tibble: 10 × 24
## taxonId parentId secondaryParents sortOrder isMicrospecies externalComment
## <int> <int> <list> <int> <lgl> <chr>
## 1 237461 1010415 <list [0]> 72962 FALSE NA
## 2 238173 1010573 <list [0]> 69219 FALSE NA
## 3 238251 1010596 <list [0]> 69311 FALSE NA
## 4 238756 1016310 <list [0]> 111278 FALSE NA
## 5 238776 1010736 <list [0]> 111238 FALSE NA
## 6 238842 1010759 <list [0]> 111099 FALSE NA
## 7 238976 1010807 <list [0]> 111444 FALSE NA
## 8 248665 237397 <list [0]> 72831 FALSE NA
## 9 1010307 2003125 <list [0]> 68529 FALSE NA
## 10 1010708 2003281 <list [0]> 95703 FALSE Original publicat…
## # ℹ 18 more variables: redlistCategory <lgl>, excludeFromReportingSystem <lgl>,
## # nrOfChilds <int>, names <list>, typedRelations.parentRelations <list>,
## # typedRelations.childRelations <list>, status.id <int>, status.value <chr>,
## # status.name <chr>, statusReason.id <int>, statusReason.value <chr>,
## # statusReason.name <chr>, category.id <int>, category.value <chr>,
## # category.name <chr>, type.id <int>, type.value <chr>, type.name <chr>
Retrieve Parent IDs
All parent taxa above a given Dyntaxa ID can be retrieved using the get_dyntaxa_parent_ids function.
# Get all parents
parents_id <- get_dyntaxa_parent_ids(taxon_ids = dyntaxa_id,
                                     subscription_key = dyntaxa_key,
                                     verbose = FALSE)
# List the IDs
print(parents_id)
## [[1]]
## [1] 5000045 6000581 6000543 6000542 4000126 6008733 2003281 1010708
##
## [[2]]
## [1] 5000045 6000581 6000582 5000046 4000128 3000412 6001105 1016310 238756
##
## [[3]]
## [1] 5000055 6011754 5000056 4000154 3000560 2003125 1010307
##
## [[4]]
## [1] 5000055 6011755 6011756 6011758 6322929 5000104 6323134 6323136 4000164
## [10] 3000587 2003165 1010397 237397 248665
##
## [[5]]
## [1] 5000045 6000581 6000582 5000046 4000128 3000412 6007670 1010736 238776
##
## [[6]]
## [1] 5000045 6000581 6000582 5000046 4000181 3000643 2003305 1010807 238976
##
## [[7]]
## [1] 5000055 6011755 6011756 6011759 6011678 5000062 6011725 6011726 4000169
## [10] 3000850 2003944 1010573 238173
##
## [[8]]
## [1] 5000045 6000581 6000582 5000046 4000128 3000412 2003290 1010759 238842
##
## [[9]]
## [1] 5000055 6011755 6011756 6011758 6322929 5000104 6323134 6323136 4000164
## [10] 3000591 2003174 1010415 237461
##
## [[10]]
## [1] 5000055 6011755 6011756 6011759 6011678 5000062 6011725 6011726 4000169
## [10] 3000850 2003235 1010596 238251
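In the printed list, each vector appears to run from the highest-ranking taxon down to the queried taxon, since the last element of each vector is one of the queried IDs. Under that assumption, the lowest taxon shared by two lineages can be found by comparing the vectors position by position. `shared_lineage()` is a hypothetical helper for this sketch, and the two vectors are copied from elements 2 and 5 of the output above:

```r
# Lineages copied from elements [[2]] and [[5]] of the printed list
lineage_a <- c(5000045, 6000581, 6000582, 5000046, 4000128, 3000412,
               6001105, 1016310, 238756)
lineage_b <- c(5000045, 6000581, 6000582, 5000046, 4000128, 3000412,
               6007670, 1010736, 238776)

# Hypothetical helper: return the leading part that both lineages share
shared_lineage <- function(a, b) {
  n <- min(length(a), length(b))
  same <- a[seq_len(n)] == b[seq_len(n)]
  # Count matching positions before the first mismatch
  k <- if (all(same)) n else which(!same)[1] - 1
  a[seq_len(k)]
}

shared <- shared_lineage(lineage_a, lineage_b)

# The last shared ID is the lowest common parent of the two taxa
lowest_common <- shared[length(shared)]
print(lowest_common)
```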
Construct Complete Taxonomic Table
A comprehensive taxonomic table, including related taxa, can be
created with the construct_dyntaxa_table function. Use the add_synonyms
parameter to include synonyms, and the add_parents and add_descendants
parameters to include parent and descendant taxa, respectively. If taxon
IDs are missing from the DwC-A export (e.g. species complexes and
pseudotaxa), they can be matched using the add_missing_taxa argument.
Additionally, complete hierarchy information can be added as a string of
parent taxa separated by “-” using the add_hierarchy argument.
# Retrieve complete taxonomic table (including parents and descendants)
taxonomy_table <- construct_dyntaxa_table(taxon_ids = dyntaxa_id,
                                          subscription_key = dyntaxa_key,
                                          shark_output = FALSE,
                                          add_parents = TRUE,
                                          add_synonyms = TRUE,
                                          add_descendants = TRUE,
                                          add_descendants_rank = "genus",
                                          add_missing_taxa = FALSE,
                                          add_hierarchy = FALSE,
                                          add_hierarchy = FALSE,
                                          verbose = FALSE)
# Print the taxonomy table as a tibble
tibble(taxonomy_table)
## # A tibble: 677 × 16
## taxonId acceptedNameUsageID parentNameUsageID scientificName taxonRank
## <chr> <chr> <chr> <chr> <chr>
## 1 urn:lsid:dynt… urn:lsid:dyntaxa.s… urn:lsid:dyntaxa… Cosmarium genus
## 2 urn:lsid:dynt… urn:lsid:dyntaxa.s… urn:lsid:dyntaxa… Pavlova genus
## 3 urn:lsid:dynt… urn:lsid:dyntaxa.s… urn:lsid:dyntaxa… Pyramimonas v… species
## 4 urn:lsid:dynt… urn:lsid:dyntaxa.s… urn:lsid:dyntaxa… Oxytoxum grac… species
## 5 urn:lsid:dynt… urn:lsid:dyntaxa.s… urn:lsid:dyntaxa… Desmodesmus a… species
## 6 urn:lsid:dynt… urn:lsid:dyntaxa.s… urn:lsid:dyntaxa… Aulacoseira i… subspeci…
## 7 urn:lsid:dynt… urn:lsid:dyntaxa.s… urn:lsid:dyntaxa… Protoperidini… species
## 8 urn:lsid:dynt… urn:lsid:dyntaxa.s… urn:lsid:dyntaxa… Dactyliosolen… species
## 9 urn:lsid:dynt… urn:lsid:dyntaxa.s… urn:lsid:dyntaxa… Planktosphaer… species
## 10 urn:lsid:dynt… urn:lsid:dyntaxa.s… urn:lsid:dyntaxa… Monoraphidium… species
## # ℹ 667 more rows
## # ℹ 11 more variables: scientificNameAuthorship <chr>, taxonomicStatus <chr>,
## # nomenclaturalStatus <chr>, taxonRemarks <chr>, kingdom <chr>, phylum <chr>,
## # class <chr>, order <chr>, family <chr>, genus <chr>, species <chr>
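With parents, synonyms and descendants included, the table grows well beyond the ten queried taxa, so a summary by rank can help in checking what was added. A minimal sketch in base R; `taxonomy_subset` is a toy data frame built from four of the rows printed above, and the same calls should work on `taxonomy_table`:

```r
# Toy subset using the scientificName and taxonRank columns shown above
taxonomy_subset <- data.frame(
  scientificName = c("Cosmarium", "Pavlova",
                     "Pyramimonas virginica", "Oxytoxum gracile"),
  taxonRank = c("genus", "genus", "species", "species"),
  stringsAsFactors = FALSE
)

# Count the number of rows per taxonomic rank
rank_counts <- table(taxonomy_subset$taxonRank)
print(rank_counts)

# Keep only the rows at species rank
species_only <- taxonomy_subset[taxonomy_subset$taxonRank == "species", ]
print(species_only$scientificName)
```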
Citation
## To cite package 'SHARK4R' in publications use:
##
## Markus Lindh, Anders Torstensson (2025). SHARK4R: Retrieving,
## Analyzing, and Validating Marine Data from SHARK and Nordic
## Microalgae. R package version 0.1.7.
## https://doi.org/10.5281/zenodo.14169399
##
## A BibTeX entry for LaTeX users is
##
## @Manual{,
## title = {SHARK4R: Retrieving, Analyzing, and Validating Marine Data from SHARK and Nordic Microalgae},
## author = {Markus Lindh and Anders Torstensson},
## year = {2025},
## note = {R package version 0.1.7},
## url = {https://doi.org/10.5281/zenodo.14169399},
## }