This is the process used to create or acquire data. Careful data collection makes data processing (the collection and manipulation of data items to produce meaningful information), maintenance and use much easier, and it improves the quality of the data.

Creating and using a data collection template helps to ensure that all relevant information is recorded and standardised, which is particularly important when multiple people are collecting data. We need to make sure that each parameter and term is defined, that specific units are used, and that everyone knows how to record missing data.
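As an illustration, a template of this kind could be written down in a few lines of Python. This is only a minimal sketch: the field names, the units and the "NA" marker for missing values are assumptions made for the example, not a prescribed standard.

```python
from dataclasses import dataclass

# Assumed convention for this example: missing values are recorded as "NA",
# never left blank.
MISSING = "NA"


@dataclass(frozen=True)
class Field:
    name: str          # column name used by every collector
    definition: str    # what the parameter means
    unit: str          # unit of measurement, "" when unitless
    required: bool = True


# Hypothetical template for a small sampling campaign.
TEMPLATE = [
    Field("sample_id", "Unique label written on the sample tube", ""),
    Field("collected_on", "Collection date in ISO 8601 form (YYYY-MM-DD)", ""),
    Field("body_mass", "Body mass of the animal", "g"),
    Field("notes", "Free-text observations", "", required=False),
]


def check_record(record: dict) -> list:
    """Return a list of problems found in one collected record."""
    problems = []
    known = {field.name for field in TEMPLATE}
    for field in TEMPLATE:
        value = record.get(field.name, "")
        if value == "":
            # Blank or absent entries are ambiguous, so they are flagged.
            problems.append(f"{field.name}: no entry, record {MISSING} if the value is missing")
        elif field.required and value == MISSING:
            problems.append(f"{field.name}: required field marked as missing")
    for key in record:
        if key not in known:
            problems.append(f"{key}: not defined in the template")
    return problems
```

Calling `check_record` on each new row before it is stored would flag blank entries, required values marked as missing, and columns that the template does not define.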

File names: before you begin gathering data, define a naming standard so that your data are understandable. This applies to primary data (e.g. the labels on the blood samples you collected) as well as secondary data, such as notes in a notebook, results from processed samples, spreadsheets and scripts.
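A naming standard is easier to follow when names are generated and checked by a small helper rather than typed by hand. The sketch below assumes one possible convention, project_date_description_version.extension; the pattern and the function names are hypothetical and should be adapted to whatever standard your group agrees on.

```python
import re
from datetime import date

# Assumed convention: <project>_<YYYY-MM-DD>_<description>_v<NN>.<ext>
# using lowercase letters, digits and hyphens only.
NAME_PATTERN = re.compile(r"[a-z0-9]+_\d{4}-\d{2}-\d{2}_[a-z0-9-]+_v\d{2}\.[a-z0-9]+")


def build_name(project: str, collected: date, description: str, version: int, ext: str) -> str:
    """Build a file name that follows the assumed convention."""
    # Drop everything except letters and digits from the project name.
    project_part = re.sub(r"[^a-z0-9]+", "", project.lower())
    # Turn spaces and punctuation in the description into single hyphens.
    description_part = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    extension = ext.lower().lstrip(".")
    return f"{project_part}_{collected.isoformat()}_{description_part}_v{version:02d}.{extension}"


def follows_standard(filename: str) -> bool:
    """Check whether an existing file name matches the assumed convention."""
    return NAME_PATTERN.fullmatch(filename) is not None


name = build_name("BloodStudy", date(2024, 3, 5), "Sample 12 raw counts", 1, "csv")
print(name)                                      # bloodstudy_2024-03-05_sample-12-raw-counts_v01.csv
print(follows_standard(name))                    # True
print(follows_standard("final data (2).xlsx"))   # False
```

Putting the date in ISO 8601 form near the front means files sort chronologically in a directory listing, and a zero-padded version number keeps v02 ahead of v10.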

Particularly relevant keywords:

  •     metadata
  •     data management
  •     data integrity
  •     data reusability