Proposal for structuring a dataset description

The problem:
Free text descriptions serve a variety of purposes. Without some form of structure they may
be so minimal as to be effectively empty or so verbose as to be impenetrable. Collections of
data are not homogeneous, so descriptions need to be flexible. But it may be possible to
provide some minimalist prompts that balance the need for structure with the
freedom for text to be fit for purpose.
Suggested structural prompts for a collection, type: dataset
1. What field of research is this data related to?
Example: ...this data is part of a longitudinal climate study.
2. What is the problem or the research question?
Example: ...the data has been collected in order to measure whether regional changes in
rainfall have been observed in response to climate change.
3. How has the data been collected?
Example: ...the data has been collected daily using 50 rain gauges in sites across the
4. Has any special software been used to collect the data?
Example: ...rainfall data is obtained from a telemetry-equipped rain gauge. Data is
transmitted to a central receiving station. The amount of rain is accumulated and
every 5 minutes the data is exported to a database for an accurate history of the
rainstorm. The database is read by the ‘Rainfall Conditions’ Internet application.
5. In what format is the data recorded and in what units?
Example: ...rainfall data is measured in millimetres and recorded beside the date as
yyyy/mm/dd in an Excel spreadsheet.
6. What identifiers or tags have been used?
Example: ...each rain gauge is identified alpha-numerically as RG001, RG002 etc.
7. Is this just raw data, or have any analyses, transformations or normalisations
also been applied?
Example: ...rainfall figures have been aggregated for the region and expressed as
monthly and yearly averages.
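
As an illustration only, the seven prompts above could be held as a simple record, so that unanswered prompts are easy to spot before the answers are joined into a free-text description. This is a minimal sketch, not part of the proposal: the class name, the field names and the two helper methods are hypothetical, and the sample answers are shortened from the examples above.

```python
from dataclasses import dataclass, fields
from typing import List


# Hypothetical schema: one field per prompt, in the order the prompts are listed.
# The proposal defines the prompts only; these names are assumptions.
@dataclass
class DatasetDescription:
    research_field: str      # prompt 1
    research_question: str   # prompt 2
    collection_method: str   # prompt 3
    software: str            # prompt 4
    format_and_units: str    # prompt 5
    identifiers: str         # prompt 6
    processing: str          # prompt 7

    def missing(self) -> List[str]:
        """Return the names of prompts that have been left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def as_text(self) -> str:
        """Join the answered prompts, in order, into one free-text description."""
        answers = (getattr(self, f.name).strip() for f in fields(self))
        return " ".join(a for a in answers if a)


description = DatasetDescription(
    research_field="This data is part of a longitudinal climate study.",
    research_question="It measures regional changes in rainfall.",
    collection_method="The data has been collected daily using 50 rain gauges.",
    software="",  # deliberately left blank to show how gaps are reported
    format_and_units="Rainfall is in millimetres; dates are yyyy/mm/dd.",
    identifiers="Each rain gauge is identified as RG001, RG002 etc.",
    processing="Figures are aggregated as monthly and yearly averages.",
)

print(description.missing())
print(description.as_text())
```

Keeping the answers as plain strings preserves the freedom of free text; the structure only records which prompts have been addressed.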
Prepared by Simon Pockley 28/09/2011