This glossary is not meant to be comprehensive; it offers definitions for selected terms that are used throughout this website.

Term

Definition

data integrity

maintaining and assuring the accuracy, completeness, consistency and validity of the data

data visualisation

representing data in a graphical way in order to communicate the information more effectively

data stewardship

the oversight of data to ensure its high quality; closely linked to data management/curation. Data stewards take care of data and are responsible for the processes linked to data and its documentation.

metadata

information and documentation about a data set that explains what the data are about. Metadata contain technical information about the data as well as information about other data attributes, such as ownership.

reproducibility

methods and analyses should be carried out and documented in ways that make it possible for others to understand the processes used, as well as to repeat and verify the results

open access

used to describe digital objects and materials that are freely available

data archiving

transfer of data to a facility for long-term (permanent) preservation; data should be retrievable from the archive

data reusability

data should be processed and stored in a standardised, cleaned form, so that they can be re-used by others. In order for data to be reusable they have to be well documented (see metadata).

data security

protecting data from unauthorised access, modification and loss/destruction

data repository

a location where data are permanently stored, providing online access 

data sharing

the practice of making data available to others; such data have to be discoverable and reusable

FAIR principles

principles linked to data management/stewardship; there are four main principles: Findability, Accessibility, Interoperability and Reusability. FAIR principles facilitate data discovery.

data management

(curation) the act of collecting, organising, processing, describing, storing, maintaining and protecting data throughout its life-cycle

data processing

operations performed on data in preparation for extraction of information; may involve cleaning/cleansing, validation, aggregation etc. in order to convert data into a more standardised format ready for analysis

data management plan

a document that outlines how data are to be handled during and after the research project

data aggregation

a process of searching for and collecting data from multiple sources

anonymisation

removing the connection between individuals and their records in a database, so that health records do not reveal the identity of individuals and cannot be linked to specific patients

database

a structured digital collection of data that can be stored and accessed electronically

data access

the method of viewing and/or retrieving stored data

data collection

a process through which data is captured/gathered from either physical or electronic documents

data cleansing

checking, reviewing and revising the data to remove errors and duplicates and to deal with missing data

data integration

combining data from multiple sources

data source

any provider of data, for example a database

data structure

a specific way of organising and storing data

de-identification

removing all data that links an individual to a specific piece of information

linked data

common attributes between different pieces of data that allow us to identify connections between different sources of data

data storage

any means of storing data persistently

structured data

data that is organised using a pre-determined structure

unstructured data

data that has no identifiable structure; e.g. text, images

identifier

a permanent (long-lasting) and unique reference to a digital object (e.g. a dataset). Identifiers allow the discovery and reliable citation of digital content.

restricted (access) data

data that are only available once a specific set of conditions has been met

data discovery

the process of finding and collecting data from various databases so that they can be analysed; in order for data to be discoverable they need to be archived and appropriately described through documentation and metadata.
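
To make two of the terms above more concrete, here is a minimal Python sketch of data cleansing and de-identification as the glossary defines them. The records, the field names (patient_id, name, age, source) and the helper functions cleanse and de_identify are all hypothetical and invented for illustration; they are not taken from any system described on this website. The sketch only removes fields that identify a person directly, so the remaining fields might still allow re-identification in a real data set.

```python
# Hypothetical records; field names and values are invented for illustration.
records = [
    {"patient_id": "P001", "name": "Ann Lee", "age": "34", "source": "clinic_a"},
    {"patient_id": "P001", "name": "Ann Lee", "age": "34", "source": "clinic_a"},  # exact duplicate
    {"patient_id": "P002", "name": "Bo Chan", "age": "", "source": "clinic_b"},    # missing age
]


def cleanse(rows):
    """Data cleansing: drop exact duplicates and mark empty values as missing (None)."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(sorted(row.items()))  # order-independent fingerprint of the row
        if key in seen:
            continue  # skip a row we have already kept
        seen.add(key)
        cleaned.append({k: (v if v != "" else None) for k, v in row.items()})
    return cleaned


def de_identify(rows, identifying_fields=("patient_id", "name")):
    """De-identification: remove the fields that link a record to an individual."""
    return [
        {k: v for k, v in row.items() if k not in identifying_fields}
        for row in rows
    ]


clean = cleanse(records)
safe = de_identify(clean)

for row in safe:
    print(row)
```

Which fields count as identifying is a decision that depends on the data set; here it is passed in as a parameter rather than fixed in the code.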