Reference

Last updated on 2026-03-24

Glossary


CAS Registry Number
A unique numerical identifier assigned by the Chemical Abstracts Service (CAS) to each chemical substance it registers. The registry is proprietary, which makes CAS numbers a poor fit for FAIR data.
Chemotion
An open-source electronic lab notebook (ELN) and repository ecosystem for chemistry, maintained by NFDI4Chem. The ELN supports structure drawing, reaction schemes, and direct export to the Chemotion Repository.
CIF (Crystallographic Information File)
A standard file format for representing crystallographic data, managed by the International Union of Crystallography.
CML (Chemical Markup Language)
An XML-based format for representing molecular and chemical data.
Creative Commons (CC)
A set of copyright licences that allow creators to grant permissions for reuse. Common variants include CC0, CC-BY, CC-BY-SA, and CC-BY-NC.
CSD (Cambridge Structural Database)
A curated collection of over one million small-molecule crystal structures, managed by the Cambridge Crystallographic Data Centre (CCDC). Most crystallography journals require deposition.
Data Access Statement
A statement included in a publication that describes how the supporting research data can be accessed. Required by UKRI and many publishers.
Data Management Plan (DMP)
A document that describes how data will be collected, organised, stored, shared, and preserved during and after a research project.
DOI (Digital Object Identifier)
A persistent identifier used to uniquely identify a dataset, publication, or other digital object. Makes data citable and discoverable.
ELN (Electronic Lab Notebook)
A digital replacement for a paper lab notebook, offering features such as searchability, automatic timestamping, and data linking.
FAIR Principles
A set of guiding principles for research data management: Findable, Accessible, Interoperable, and Reusable.
InChI (International Chemical Identifier)
A machine-readable, non-proprietary textual identifier for chemical substances, maintained by IUPAC.
InChIKey
A fixed-length (27-character) hashed form of an InChI string, designed for web searching and database lookups.
ioChem-BD
A computational chemistry repository that accepts DFT and molecular dynamics input/output files from codes including Gaussian, ORCA, CP2K, and VASP.
IUPAC FAIRSpec
An IUPAC specification defining metadata standards for FAIR management of spectroscopic data in chemistry.
JCAMP-DX
An open, plain-text standard file format for spectroscopic data (NMR, IR, MS, UV-Vis, Raman) that embeds metadata in a structured header.
Metadata
Data about data: descriptive information that provides context for a dataset, making it findable, interpretable, and reusable.
mzML
An open standard file format for mass spectrometry data.
NFDI4Chem
Germany’s national initiative for chemistry research data infrastructure. Maintains the Chemotion ecosystem and a detailed knowledge base of chemistry RDM best practice at knowledgebase.nfdi4chem.de.
NOMAD
An open repository for computational materials science data, with strong metadata standards for DFT and molecular dynamics output.
ODC-By / ODbL (Open Data Commons)
Licences designed specifically for databases. ODC-By requires attribution; ODbL additionally requires derivative databases to remain open.
PSDI (Physical Sciences Data Infrastructure)
A UK initiative supporting researchers in chemistry and materials science with data management tools, guidance, and infrastructure.
re3data
A global registry of research data repositories, useful for finding domain-specific repositories. Available at re3data.org.
SMILES (Simplified Molecular Input Line Entry System)
A widely used line notation for representing molecular structures as text strings.
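
To make the InChIKey entry above more concrete, the following minimal Python sketch checks whether a string has the layout of an InChIKey: 14 uppercase letters, a hyphen, 10 uppercase letters, a hyphen, and one final letter (27 characters in total). The function name `looks_like_inchikey` is hypothetical and chosen for illustration. The check covers the layout only; it does not confirm that the key corresponds to a real structure.

```python
import re

# Layout of an InChIKey: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
# This is a format check only (an illustrative assumption, not a validator
# that confirms the key was derived from a real InChI).
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(text: str) -> bool:
    """Return True if `text` has the 27-character InChIKey layout."""
    return INCHIKEY_PATTERN.fullmatch(text.strip()) is not None


if __name__ == "__main__":
    candidates = [
        "XLYOFNOQVPJJNP-UHFFFAOYSA-N",  # InChIKey of water
        "InChI=1S/H2O/h1H2",            # an InChI string, not an InChIKey
        "CCO",                          # a SMILES string (ethanol)
    ]
    for candidate in candidates:
        print(candidate, looks_like_inchikey(candidate))
```

A check like this can be useful when cleaning a spreadsheet column that mixes identifier types, since it separates InChIKeys from InChI or SMILES strings before any database lookup.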

Key Resources


Electronic Lab Notebooks