
Retrieve Taxonomic Data From AlgaeBase
Source: vignettes/retrieve_algaebase_data.Rmd
AlgaeBase
AlgaeBase is a comprehensive database containing information on a
wide variety of algal species, including terrestrial, marine, and
freshwater organisms, with an emphasis on marine botany. AlgaeBase is
continually updated and funded by various phycological societies, with
contributions from researchers and institutions worldwide. It can be
accessed via a web interface or through the API, as demonstrated in
this tutorial using SHARK4R. Please note that the authors of SHARK4R
are not affiliated with AlgaeBase.
Getting Started
Installation
You can install the package from GitHub using the devtools package:
# install.packages("devtools")
devtools::install_github("sharksmhi/SHARK4R",
                         dependencies = TRUE)
Load the SHARK4R and tibble libraries:
library(SHARK4R)
library(tibble)
AlgaeBase API Key
AlgaeBase requires a subscription key to access its API. To obtain your own key, please visit the API documentation. In the example below, the key is retrieved from an environment variable.
# Retrieve the API key
algaebase_key <- Sys.getenv("ALGAEBASE_KEY")
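An empty string from `Sys.getenv()` usually means the variable was never set. The check below is a minimal sketch, assuming the key is stored under the name `ALGAEBASE_KEY` used above (for example in a `.Renviron` file). The helper `has_algaebase_key()` is hypothetical and not part of SHARK4R.

```r
# Hypothetical helper (not part of SHARK4R):
# returns TRUE if the environment variable holds a non-empty value.
has_algaebase_key <- function(var = "ALGAEBASE_KEY") {
  nzchar(Sys.getenv(var, unset = ""))
}

# One way to store the key is a line in ~/.Renviron (restart R afterwards):
#   ALGAEBASE_KEY=your-subscription-key

if (!has_algaebase_key()) {
  message("ALGAEBASE_KEY is not set; AlgaeBase requests will likely fail.")
}
```

Keeping the key in an environment variable rather than in the script avoids committing it to version control.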
Match Genus Name
Taxonomic records can be retrieved for individual genus names using
the get_algaebase_genus function.
# Match a genus name with AlgaeBase API
genus_records <- get_algaebase_genus(genus = "Gymnodinium",
                                     apikey = algaebase_key)
# Print the result
print(genus_records)
##     kingdom         phylum       class         order         family    id
## 1 Chromista Dinoflagellata Dinophyceae Gymnodiniales Gymnodiniaceae 43632
##         genus species infrasp                 taxonomic_status
## 1 Gymnodinium      NA      NA currently accepted taxonomically
##   nomenclatural_status currently_accepted accepted_name genus_only  input_name
## 1                   NA                  1            NA          1 Gymnodinium
##   input_match taxon_rank   mod_date                 long_name authorship
## 1           1      genus 2021-03-22 Gymnodinium F.Stein, 1878    F.Stein
Match Species Name
Taxonomic records can be retrieved for individual species names using
the get_algaebase_species function.
# Match a species with AlgaeBase API
species_records <- get_algaebase_species(genus = "Tripos",
                                         species = "muelleri",
                                         apikey = algaebase_key)
# Print the result
print(species_records)
##      id   accepted_name      input_name input_match currently_accepted
## 1 65254 Tripos muelleri Tripos muelleri           1                  1
##   genus_only   kingdom         phylum       class         order      family
## 1          0 Chromista Dinoflagellata Dinophyceae Gonyaulacales Ceratiaceae
##    genus  species infrasp            long_name                 taxonomic_status
## 1 Tripos muelleri      NA Tripos muelleri Bory currently accepted taxonomically
##   nomenclatural_status taxon_rank   mod_date authorship
## 1                 <NA>    species 2024-05-28       Bory
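Requests to a remote API can fail, for instance because of a network problem or an invalid key. The sketch below wraps a lookup in `tryCatch()` so that a script can continue and inspect the failure. It assumes the lookup signals a regular R error on failure, which this tutorial does not confirm for the SHARK4R functions. `safe_lookup()` and `failing_request()` are hypothetical names used for illustration only.

```r
# Hypothetical wrapper (not part of SHARK4R):
# runs a zero-argument function and returns NULL if it raises an error.
safe_lookup <- function(lookup) {
  tryCatch(
    lookup(),
    error = function(e) {
      message("Lookup failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Stand-in for a failing request, so the sketch runs without network access
failing_request <- function() stop("request failed")

result <- safe_lookup(failing_request)
is.null(result)
```

With SHARK4R loaded, the same wrapper could be called as `safe_lookup(function() get_algaebase_species(genus = "Tripos", species = "muelleri", apikey = algaebase_key))`.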
Match Multiple Scientific Names
Multiple names can be matched with the match_algaebase function. The
scientific names need to be parsed into genus and species names before
being passed to the API, which can be achieved with the
parse_scientific_names function.
# Retrieve all phytoplankton data from April 2015
shark_data <- get_shark_data(fromYear = 2015,
                             toYear = 2015,
                             months = 4,
                             dataTypes = c("Phytoplankton"),
                             verbose = FALSE)
# Randomly select 10 rows from the shark_data dataframe
random_rows <- shark_data[sample(nrow(shark_data), 10), ]
# Parse scientific names into genus and species names
parsed_taxa <- parse_scientific_names(random_rows$scientific_name)
# Print the parsed data
print(parsed_taxa)
##              genus    species
## 1      Pyramimonas
## 2           Tripos   muelleri
## 3       Mesodinium     rubrum
## 4          Unicell
## 5       Gyrodinium    spirale
## 6           Tripos      furca
## 7            Ebria tripartita
## 8           Amylax triacantha
## 9  Pseudopedinella  thomsenii
## 10     Flagellates
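To illustrate what the parsing step produces, here is a base-R sketch that splits each name on whitespace and keeps the first two words. It is a simplified stand-in, not the implementation of parse_scientific_names, which may apply additional rules. The function name `split_scientific_name()` is hypothetical.

```r
# Hypothetical base-R stand-in for the parsing step (not the SHARK4R implementation):
# the first word becomes the genus, the second word (if any) the species.
split_scientific_name <- function(names) {
  parts <- strsplit(trimws(names), "\\s+")
  genus <- vapply(parts, function(p) if (length(p) >= 1) p[1] else "", character(1))
  species <- vapply(parts, function(p) if (length(p) >= 2) p[2] else "", character(1))
  data.frame(genus = genus, species = species, stringsAsFactors = FALSE)
}

# Genus-only names get an empty species, as in the output above
split_scientific_name(c("Tripos muelleri", "Pyramimonas", "Mesodinium rubrum"))
```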
# Match the taxa with AlgaeBase
algaebase_match <- match_algaebase(genus = parsed_taxa$genus,
                                   species = parsed_taxa$species,
                                   apikey = algaebase_key,
                                   verbose = FALSE)
# Print the result
as_tibble(algaebase_match)
## # A tibble: 10 × 20
##    genus           species    kingdom   phylum class order family     id infrasp
##    <chr>           <chr>      <chr>     <chr>  <chr> <chr> <chr>   <int> <lgl>
##  1 Amylax          triacantha Chromista Dinof… Dino… Gony… Lingu…  52120 NA
##  2 Ebria           tripartita Chromista Cerco… Thec… Ebri… Ebria…  56482 NA
##  3 Flagellates     NA         NA        NA     NA    NA    NA         NA NA
##  4 Gyrodinium      spirale    Chromista Dinof… Dino… Gymn… Gyrod…  52419 NA
##  5 Mesodinium      rubrum     Chromista Cilio… Lito… Cycl… Mesod…  56539 NA
##  6 Pseudopedinella thomsenii  Chromista Heter… Dict… Pedi… Actin… 124166 NA
##  7 Pyramimonas     NA         Plantae   Chlor… Pyra… Pyra… Pyram…  43516 NA
##  8 Tripos          furca      Chromista Dinof… Dino… Gony… Cerat… 149801 NA
##  9 Tripos          muelleri   Chromista Dinof… Dino… Gony… Cerat…  65254 NA
## 10 Unicell         NA         NA        NA     NA    NA    NA         NA NA
## # ℹ 11 more variables: taxonomic_status <chr>, nomenclatural_status <chr>,
## # currently_accepted <dbl>, accepted_name <chr>, genus_only <dbl>,
## # input_name <chr>, input_match <dbl>, taxon_rank <chr>, mod_date <date>,
## # long_name <chr>, authorship <chr>
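In the result above, inputs that are not taxonomic names ("Flagellates" and "Unicell") come back with NA in the id column. The sketch below separates matched from unmatched rows on that basis. It uses a small mock data frame with three of the columns and values copied from the printed output, so it runs without an API key. Treating a missing id as "no match" is an assumption drawn from that output.

```r
# Mock result with a subset of the columns shown above
# (values copied from the printed output, not fetched from the API).
algaebase_match <- data.frame(
  genus = c("Flagellates", "Pyramimonas", "Tripos"),
  species = c(NA, NA, "muelleri"),
  id = c(NA, 43516L, 65254L),
  stringsAsFactors = FALSE
)

# Assumption: rows without an AlgaeBase id were not matched
unmatched <- algaebase_match[is.na(algaebase_match$id), ]
matched <- algaebase_match[!is.na(algaebase_match$id), ]

# Names that may need manual review
unmatched$genus
```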
Citation
## To cite package 'SHARK4R' in publications use:
##
## Markus Lindh, Anders Torstensson (2025). SHARK4R: Retrieving,
## Analyzing, and Validating Marine Data from SHARK and Nordic
## Microalgae. R package version 0.1.7.
## https://doi.org/10.5281/zenodo.14169399
##
## A BibTeX entry for LaTeX users is
##
## @Manual{,
## title = {SHARK4R: Retrieving, Analyzing, and Validating Marine Data from SHARK and Nordic Microalgae},
## author = {Markus Lindh and Anders Torstensson},
## year = {2025},
## note = {R package version 0.1.7},
## url = {https://doi.org/10.5281/zenodo.14169399},
## }