This function queries the Algaebase API to retrieve taxonomic information for a list of algae names based on genus and (optionally) species. It supports exact matching, genus-only searches, and retrieval of higher taxonomic ranks.

Usage

match_algaebase(
  genus,
  species,
  apikey = NULL,
  genus_only = FALSE,
  higher = TRUE,
  unparsed = FALSE,
  exact_matches_only = TRUE,
  sleep_time = 1,
  newest_only = TRUE,
  verbose = TRUE
)

Arguments

genus

A character vector of genus names.

species

A character vector of species names corresponding to the genus vector. Must be the same length as genus.

apikey

A character string containing the API key for accessing the Algaebase API.

genus_only

Logical. If TRUE, searches are based solely on the genus name, ignoring species. Defaults to FALSE.

higher

Logical. If TRUE, includes higher taxonomy (e.g., kingdom, phylum) in the output. Defaults to TRUE.

unparsed

Logical. If TRUE, returns raw JSON output instead of an R data frame. Defaults to FALSE.

exact_matches_only

Logical. If TRUE, restricts results to exact matches. Defaults to TRUE.

sleep_time

Numeric. The delay (in seconds) between consecutive Algaebase API queries. Defaults to 1. A delay is recommended to avoid overwhelming the API for large queries.

newest_only

Logical. If TRUE, returns only the most recent entry for each name. Defaults to TRUE.

verbose

Logical. If TRUE, displays a progress bar to indicate query status. Defaults to TRUE.

Value

A data frame containing taxonomic information for each input genus-species combination. Columns may include:

  • id: Algaebase ID (if available)

  • kingdom, phylum, class, order, family: Higher taxonomy (if higher = TRUE)

  • genus, species, infrasp: Genus, species, and infraspecies names (if applicable)

  • taxonomic_status: Status of the name (e.g., "accepted", "synonym", "unverified")

  • currently_accepted: Logical indicator for whether the name is currently accepted

  • accepted_name: Currently accepted name if different from the input name

  • input_name: Name supplied by the user

  • input_match: 1 for exact matches, otherwise 0

  • taxon_rank: Taxonomic rank of the accepted name (e.g., "genus", "species")

  • mod_date: Date when the entry was last modified in Algaebase

  • long_name: Full species name with authorship and date

  • authorship: Authors associated with the species name
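As an illustration of how these columns might be used downstream, the sketch below resolves each input name to a preferred name and flags rows for manual review. The data frame is a mock with invented placeholder values (not real Algaebase records), and the `resolved_name` column is a hypothetical addition. It assumes `accepted_name` is NA when the input name is itself accepted.

```r
# Mock result with a subset of the documented columns; all values are
# invented placeholders, not real Algaebase records.
results <- data.frame(
  input_name = c("Aus bus", "Aus cus", "Dus eus"),
  taxonomic_status = c("accepted", "synonym", NA),
  currently_accepted = c(TRUE, FALSE, NA),
  accepted_name = c(NA, "Aus bus", NA),
  input_match = c(1, 1, 0),
  stringsAsFactors = FALSE
)

# Use the accepted name where one is given; otherwise keep the input name
# if it is itself accepted; leave NA when nothing usable was returned.
# (resolved_name is a hypothetical helper column, not package output.)
results$resolved_name <- ifelse(
  !is.na(results$accepted_name),
  results$accepted_name,
  ifelse(results$currently_accepted %in% TRUE, results$input_name, NA_character_)
)

# Rows that need manual review: no exact match or no resolved name
needs_review <- results[results$input_match == 0 | is.na(results$resolved_name), ]

print(results[, c("input_name", "resolved_name")])
print(needs_review$input_name)
```

Checking `currently_accepted %in% TRUE` rather than testing the column directly keeps NA rows from propagating into the condition.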

Details

Scientific names can be parsed using the parse_scientific_names() function before being processed by match_algaebase().
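To show the kind of input preparation this refers to, here is a minimal base R sketch that splits full names into the `genus` and `species` vectors the function expects. It is not the package's parse_scientific_names(), whose behaviour is not described here. It also assumes that a missing species can be represented as NA.

```r
# Base R illustration only; this is NOT the package's parse_scientific_names().
scientific_names <- c("Thalassiosira pseudonana", "Skeletonema costatum", "Tripos")

# Split each name on runs of whitespace
parts <- strsplit(trimws(scientific_names), "\\s+")

# First token is taken as the genus; second token, if present, as the species.
# A name with no second token gets NA as its species (an assumption).
genus_vec <- vapply(parts, function(p) p[1], character(1))
species_vec <- vapply(
  parts,
  function(p) if (length(p) >= 2) p[2] else NA_character_,
  character(1)
)

print(data.frame(genus = genus_vec, species = species_vec))
```

Both vectors have the same length, which matches the requirement that species corresponds element by element to genus.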

Duplicate genus-species combinations are handled efficiently by querying each unique combination only once. Genus-only searches are performed when genus_only = TRUE or when the species name is missing or invalid. Errors during API queries are gracefully handled by returning rows with NA values for missing or unavailable data.
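The pattern described above can be sketched roughly as follows. This is not the package's internal code: `query_one()` is a hypothetical stand-in for a single API request, the `found` column is invented for the illustration, and the names are placeholders. The sketch queries each unique combination once, pauses between queries, turns errors into NA rows, and maps the results back onto the original input.

```r
# Hypothetical stand-in for a single API request; it fails for one genus
# so the error path can be seen. Not part of the package.
query_one <- function(genus, species) {
  if (genus == "Badgenus") stop("simulated API error")
  data.frame(genus = genus, species = species, found = TRUE,
             stringsAsFactors = FALSE)
}

genus <- c("Aus", "Aus", "Badgenus", "Aus")
species <- c("bus", "bus", "cus", "dus")
sleep_time <- 0  # set to 0 for the demo; the documented default is 1 second

# Query each unique genus-species combination only once
input <- data.frame(genus = genus, species = species, stringsAsFactors = FALSE)
unique_input <- unique(input)

unique_results <- lapply(seq_len(nrow(unique_input)), function(i) {
  g <- unique_input$genus[i]
  s <- unique_input$species[i]
  res <- tryCatch(
    query_one(g, s),
    # On failure, return a row with NA instead of stopping the whole run
    error = function(e) data.frame(genus = g, species = s, found = NA,
                                   stringsAsFactors = FALSE)
  )
  Sys.sleep(sleep_time)  # pause between consecutive queries
  res
})
unique_results <- do.call(rbind, unique_results)

# Map the unique results back onto the original, possibly duplicated, input
key <- function(df) paste(df$genus, df$species, sep = "|")
full_results <- unique_results[match(key(input), key(unique_results)), ]
rownames(full_results) <- NULL

print(full_results)
```

Here four input rows lead to three simulated queries, and the failing combination still appears in the output with NA in place of data.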

The function allows for integration with data analysis workflows that require resolving or verifying taxonomic names against Algaebase.

Examples

if (FALSE) { # \dontrun{
# Example with genus and species vectors
genus_vec <- c("Thalassiosira", "Skeletonema", "Tripos")
species_vec <- c("pseudonana", "costatum", "furca")

algaebase_results <- match_algaebase(
  genus = genus_vec,
  species = species_vec,
  apikey = "your_api_key",
  exact_matches_only = TRUE,
  verbose = TRUE
)
head(algaebase_results)
} # }