
Assign Phytoplankton Group to Scientific Names
Source:R/worms_api_functions.R
assign_phytoplankton_group.Rd
This function assigns default phytoplankton groups (Diatoms, Dinoflagellates, Cyanobacteria, or Other)
to a list of scientific names or Aphia IDs by retrieving species information from the
World Register of Marine Species (WoRMS). The function checks both Aphia IDs and scientific names,
handles missing records, and assigns the appropriate plankton group based on taxonomic classification in WoRMS.
Additionally, custom plankton groups can be specified using the custom_groups
parameter,
allowing users to define additional classifications based on specific taxonomic criteria.
Usage
assign_phytoplankton_group(
scientific_names,
aphia_ids = NULL,
diatom_class = c("Bacillariophyceae", "Coscinodiscophyceae", "Mediophyceae",
"Diatomophyceae"),
dinoflagellate_class = "Dinophyceae",
cyanobacteria_class = "Cyanophyceae",
cyanobacteria_phylum = "Cyanobacteria",
match_first_word = TRUE,
marine_only = FALSE,
return_class = FALSE,
custom_groups = list(),
verbose = TRUE
)
Arguments
- scientific_names
A character vector of scientific names of marine species.
- aphia_ids
A numeric vector of Aphia IDs corresponding to the scientific names. If provided, it improves the accuracy and speed of the matching process. The length of
aphia_ids
must match the length ofscientific_names
. Defaults toNULL
, in which case the function will attempt to assign plankton groups based only on the scientific names.- diatom_class
A character string or vector representing the diatom class. Default is "Bacillariophyceae", "Coscinodiscophyceae" and "Mediophyceae".
- dinoflagellate_class
A character string or vector representing the dinoflagellate class. Default is "Dinophyceae".
- cyanobacteria_class
A character string or vector representing the cyanobacteria class. Default is "Cyanophyceae".
- cyanobacteria_phylum
A character string or vector representing the cyanobacteria phylum. Default is "Cyanobacteria".
- match_first_word
A logical value indicating whether to match the first word of the scientific name if the Aphia ID is missing. Default is TRUE.
- marine_only
A logical value indicating whether to restrict the results to marine taxa only. Default is
FALSE
.- return_class
A logical value indicating whether to include class information in the result. Default is
FALSE
.- custom_groups
A named list of additional custom plankton groups (optional). The names of the list correspond to the custom group names (e.g., "Cryptophytes"), and the values should be character vectors specifying one or more of the following taxonomic levels:
phylum
,class
,order
,family
,genus
, orscientific_name
. For example:list("Green Algae" = list(class = c("Chlorophyceae", "Ulvophyceae")))
. This allows users to extend the default classifications (e.g., Cyanobacteria, Diatoms, Dinoflagellates) with their own groups.- verbose
A logical value indicating whether to print progress messages. Default is TRUE.
Value
A data frame with two columns: scientific_name
and plankton_group
, where the plankton group is assigned based on taxonomic classification.
Details
The aphia_ids
parameter is not necessary but, if provided, will improve the certainty of the
matching process. If aphia_ids
are available, they will be used directly to retrieve more accurate
WoRMS records. If missing, the function will attempt to match the scientific names to Aphia IDs by
querying WoRMS using the scientific name(s), with an additional fallback mechanism to match based on the
first word of the scientific name.
To skip one of the default plankton groups, you can set the class or phylum of the respective group to an empty string (""
).
For example, to skip the "Cyanobacteria" group, you can set cyanobacteria_class = ""
or cyanobacteria_phylum = ""
. These
taxa will then be placed in Others
.
Custom groups are processed in the order they appear in the custom_groups
list. If a taxon matches
multiple custom groups, it will be assigned to the group that appears last in the list, as later matches
overwrite earlier ones. For example, if Teleaulax amphioxeia
matches both Cryptophytes
(class-based)
and a specific group Teleaulax
(name-based), it will be assigned to Teleaulax
if Teleaulax
is listed after
Cryptophytes
in the custom_groups
list.
Examples
if (FALSE) { # \dontrun{
# Assign plankton groups to a list of species
result <- assign_phytoplankton_group(
scientific_names = c("Tripos fusus", "Diatoma", "Nodularia spumigena", "Octactis speculum"),
aphia_ids = c(840626, 149013, 160566, NA))
print(result)
# Assign plankton groups using additional custom grouping
custom_groups <- list(
Cryptophytes = list(class = "Cryptophyceae"),
Ciliates = list(phylum = "Ciliophora")
)
# Assign with custom groups
result_custom <- assign_phytoplankton_group(
scientific_names = c("Teleaulax amphioxeia", "Mesodinium rubrum", "Dinophysis acuta"),
aphia_ids = c(106306, 232069, 109604),
custom_groups = custom_groups, # Adding custom groups
verbose = TRUE
)
print(result_custom)
} # }