Libraries of taxonomic publications are one of the main pillars of natural history. These publications disseminate scientific knowledge derived from the observation and analysis of specimens. Specimens cited in taxonomic treatments may be linked to other treatments where they are also cited, or to collections databases where specimens are archived. Groups of specimens cited together in treatments contribute to taxonomic concepts, including defining characteristics and distribution. Thus, digitization of specimens and literature, followed by open access to all resources, allows for a seamless network of links leading from a specimen to its inclusion in publications. Plazi is a Swiss-based international NGO that promotes and provides access to data from taxonomic publications. A total of 375,000 publications, including 275,000 scientific illustrations, and 400,000 material citations are now open, findable, accessible, citable and re-usable. Data on 45,...
The corpus of biodiversity literature represents the scientific output of biodiversity research. Each species has at least one taxonomic treatment, and for many species several hundred treatments exist. These are all interconnected and together compose the entire catalogue of life. Each of these treatments is implicitly (sometimes explicitly) linked to specimens in natural history collections. As early as the advent of taxonomy, Linnaeus kept a reference collection for his seminal works, which he cited by listing its geographic origin in a regularly occurring section of the treatments. Today these material citations range from a summary of specimen data (e.g. "20 paratypes from Columbia") to very detailed, normalized, augmented data including specimen, collection, and gene accession codes as well as collectors, date, collecting methods of host, etc. These material citations represent some of the best curated data about a specimen, full of links between different resources. This is a citati...
TaxPub was created as an XML extension to the general JATS to provide domain-specific markup for prospective publishing in the area of biological systematics. The core idea of the schema is to delimit descriptions of taxa, or treatments, within an article, and to use these individual portions of information for various purposes. TaxPub was developed in close cooperation between the author (Terence Catapano), a community interested in such markup (Plazi), the NLM JATS group and a journal publisher (Pensoft). Since July 2009, TaxPub has been routinely implemented in the everyday publishing practice of Pensoft, to provide: (1) Semantically enhanced, domain-specific XML versions of articles for archiving in PubMedCentral (PMC); (2) Visualization of taxon treatments on PMC; (3) Export of taxon treatments to various aggregators, such as Encyclopedia of Life, Plazi Treatment Repository, and the Wiki Species-ID.net.
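The idea of delimiting treatments so they can be pulled out of an article individually can be illustrated with a minimal sketch. It assumes the element names `tp:taxon-treatment`, `tp:nomenclature`, `tp:taxon-name` and `tp:treatment-sec` and the namespace URI `http://www.plazi.org/taxpub`; these should be checked against the TaxPub schema before use. The sample document, its taxon names, and the helper `extract_treatments` are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET

# Assumed TaxPub namespace URI; verify against the schema in use.
TP_NS = "http://www.plazi.org/taxpub"
NS = {"tp": TP_NS}

# Hypothetical, heavily simplified article with two delimited treatments.
SAMPLE = """<article xmlns:tp="http://www.plazi.org/taxpub">
  <body>
    <sec>
      <title>Taxonomy</title>
      <tp:taxon-treatment>
        <tp:nomenclature>
          <tp:taxon-name>Aus bus</tp:taxon-name>
        </tp:nomenclature>
        <tp:treatment-sec sec-type="description">
          <p>Body length 5 mm.</p>
        </tp:treatment-sec>
      </tp:taxon-treatment>
      <tp:taxon-treatment>
        <tp:nomenclature>
          <tp:taxon-name>Aus cus</tp:taxon-name>
        </tp:nomenclature>
        <tp:treatment-sec sec-type="distribution">
          <p>Known only from the type locality.</p>
        </tp:treatment-sec>
      </tp:taxon-treatment>
    </sec>
  </body>
</article>"""


def extract_treatments(xml_text):
    """Return one dict per treatment: its taxon name and its typed sections."""
    root = ET.fromstring(xml_text)
    treatments = []
    for treatment in root.iter(f"{{{TP_NS}}}taxon-treatment"):
        name_el = treatment.find("tp:nomenclature/tp:taxon-name", NS)
        name = None
        if name_el is not None:
            name = "".join(name_el.itertext()).strip()
        sections = {}
        for sec in treatment.findall("tp:treatment-sec", NS):
            # Collapse whitespace so each section becomes one line of text.
            text = " ".join("".join(sec.itertext()).split())
            sections[sec.get("sec-type", "unknown")] = text
        treatments.append({"name": name, "sections": sections})
    return treatments


if __name__ == "__main__":
    for record in extract_treatments(SAMPLE):
        print(record["name"], record["sections"])
```

Because each treatment is a self-contained element, a consumer such as an aggregator can export or index it without parsing the rest of the article.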
The European Journal of Taxonomy (EJT) is a decade-old journal dedicated to the taxonomy of living and fossil eukaryotes. Launched in 2011, the EJT published exactly 900 articles (31 778 pages) from 2011 to 2021. Plazi has processed the journal in its entirety, liberating the data therein, depositing it in TreatmentBank and the Biodiversity Literature Repository, and disseminating it to partners, including the Global Biodiversity Information Facility (GBIF), using a combination of a highly automated workflow, quality control tools, and human curation. The dissemination of original research, along with the ability to use and reuse data as freely as possible, is the key to innovation, opening the corpus of known published biodiversity knowledge, and furthering advances in science. This paper aims to discuss the advantages and limitations of retro-conversion and to showcase the potential analyses of the data published in EJT and made findable, accessible, interoperable and reusable (FAI...
Here we present a descriptive analysis of the bibliographic production of the world-renowned heteropterist Dr. Jocélia Grazia and comment on her taxonomic reach based on extracted taxonomic treatments. We analyzed a total of 219 published documents, including scientific papers, scientific notes, and book chapters. Additionally, we applied the Plazi workflow to extract taxonomic treatments, images, tables, treatment citations, material citations, and references from 75 different documents in accordance with the FAIR (Findability, Accessibility, Interoperability, and Reuse) principles and made them available on the Biodiversity Literature Repository (BLR), hosted on Zenodo, and on TreatmentBank. We found that Dr. Grazia published 200 new names, including species (183) and genera (17), and 1,444 taxonomic treatments in total. Of these, 104 and 581, respectively, were extracted after applying the Plazi workflow. A total of 544 figures, 50 tables, 2,242 references, 2,107 materials...
This paper describes a set of guidelines for the citation of zoological and botanical specimens in the European Journal of Taxonomy. The guidelines stipulate controlled vocabularies and precise formats for presenting the specimens examined within a taxonomic publication, which allow for the rich data associated with the primary research material to be harvested, distributed and interlinked online via international biodiversity data aggregators. Herein we explain how the EJT editorial standard was defined and how this initiative fits into the journal’s project to semantically enhance its publications using the Plazi TaxPub DTD extension. By establishing a standardised format for the citation of taxonomic specimens, the journal intends to widen the distribution of and improve accessibility to the data it publishes. Authors who conform to these guidelines will benefit from higher visibility and new ways of visualising their work. In a wider context, we hope that other taxonomy journals...
Scholarly publications in taxonomy are the sole communication channel for publicizing descriptions of new species or, more generally, of any kind of taxon; their augmentations, ranging from re-descriptions to small notes such as additional observation records; and deprecations when the name of a taxon changes. This is communicated in a highly standardized way. For nomenclatural issues, the Codes (e.g. International Code of Zoological Nomenclature) require certain elements, and for comparative reasons, highly formalized language, document structure, illustrations and citation systems are used. This corpus, estimated at 500M printed pages, includes all that science knows about biodiversity, as well as what has been processed in terms of specimens in natural history collections. This includes tens of millions of scientific illustrations, taxonomic treatments, i.e. parts of articles that are clearly delimited and headed by a label including a taxonomic name, ma...
The Swiss NGO Plazi (http://plazi.org) has developed an automated workflow for liberating data, including images and text, from new taxonomic publications issued in PDF format. This stepwise process extracts article metadata, illustrations and their captions, bibliographic references, scientific names, named geographic entities such as coordinates and country names, collection codes, and finally, taxonomic treatments. These extracted data are enhanced, published in TreatmentBank (http://plazi.org) and deposited in the Biodiversity Literature Repository (https://biolitrepo.org), in which a Digital Object Identifier (DataCite DOI) is minted for articles as well as for the figures and taxon treatments they contain, all linked to each other in their metadata. This input is complemented by the import of Journal Article Tag Suite/TaxPub XML-based publications from Pensoft Publishers (e.g. ZooKeys, Journal of Hymenoptera Research; https://pensoft.net/browse_journals) that are seman...
The biodiversity domain, and in particular biological taxonomy, is moving in the direction of semantization of its research outputs. The present work introduces OpenBiodiv-O, the ontology that serves as the basis of the OpenBiodiv Knowledge Management System. Our intent is to provide an ontology that fills the gaps between ontologies for biodiversity resources, such as DarwinCore-based ontologies, and semantic publishing ontologies, such as the SPAR Ontologies. We bridge this gap by providing an ontology focusing on biological taxonomy. OpenBiodiv-O introduces classes, properties, and axioms in the domains of scholarly biodiversity publishing and biological taxonomy and aligns them with several important domain ontologies (FaBiO, DoCO, DwC, Darwin-SW, NOMEN, ENVO). By doing so, it bridges the ontological gap across scholarly biodiversity publishing and biological taxonomy and allows for the creation of a Linked Open Dataset (LOD) of biodiversity information (a biodiversity knowledge...
Much biodiversity data is collected worldwide, but it remains challenging to assemble the scattered knowledge for assessing biodiversity status and trends. The concept of Essential Biodiversity Variables (EBVs) was introduced to structure biodiversity monitoring globally, and to harmonize and standardize biodiversity data from disparate sources to capture a minimum set of critical variables required to study, report and manage biodiversity change. Here, we assess the challenges of a 'Big Data' approach to building global EBV data products across taxa and spatiotemporal scales, focusing on species distribution and abundance. The majority of currently available data on species distributions derives from incidentally reported observations or from surveys where presence-only or presence-absence data are sampled repeatedly with standardized protocols. Most abundance data come from opportunistic population counts or from population time series using standardized protocols (e.g. re...
Background. The 7th Framework Programme for Research and Technological Development is helping the European Union to prepare for an integrative system for intelligent management of biodiversity knowledge. The infrastructure that is envisaged and that will be further developed within the Programme "Horizon 2020" aims to provide open and free access to taxonomic information to anyone with a requirement for biodiversity data, without the need for individual consent of other persons or institutions. Open and free access to information will foster the re-use and improve the quality of data, will accelerate research, and will promote new types of research. Progress towards the goal of free and open access to content is hampered by numerous technical, economic, sociological, legal, and other factors. The present article addresses barriers to the open exchange of biodiversity knowledge that arise from European laws, in particular European legislation on copyright and database pro...
Taxonomy is the discipline responsible for charting the world's organismic diversity, understanding ancestor/descendant relationships, and organizing all species according to a unified taxonomic classification system. Taxonomists document the attributes (characters) of organisms, with emphasis on those that can be used to distinguish species from each other. Character information is compiled in the scientific literature as text, tables, and images. The information is presented according to conventions that vary among taxonomic domains; such conventions facilitate comparison among similar species, even when descriptions are published by different authors. There is considerable uncertainty within the taxonomic community as to how to re-use images that were included in taxonomic publications, especially in regard to whether copyright applies. This article deals with the principles and application of copyright law, database protection, and protection against unfair competition, as applie...