Publish and share

Research data are increasingly regarded as a valuable research output in their own right, on a par with communicating research results through journal articles and monographs. For information on publishing research outputs, please see the University of Salford Institutional Repository (USIR). The following guidance relates to the publication of research data.

Benefits of publishing data:

  • Build up your academic reputation – making data openly available facilitates discovery and reuse, and is associated with increased citation rates
  • Meet funder requirements about sharing research data
  • Enable data validation and facilitate scientific progress
  • Increase collaboration opportunities and stimulate novel interdisciplinary research
  • Avoid duplication and increase the return on public investments in scientific research

There are also valid reasons for restricting access to research data:

  • To allow an embargo period so researchers have exclusive use of the data and the opportunity to exploit the results
  • To protect data that is patentable
  • To protect confidentiality and for other ethical and legal considerations

Research data about people, including sensitive data, can be shared ethically and legally if appropriate strategies are used, such as obtaining informed consent and anonymising the data.

The various concerns that researchers may have about sharing their data are discussed in Carly Strasser's Data Pub blogpost.

There are different ways to publish data depending on what your funder expects, or standard practices within your discipline or personal preference:

1. Deposit with a data repository (specialist data centre, institutional repository)

A data repository provides online archival storage, which is usually open access and cares for digital materials, ensuring that they remain usable over time.

2. Publish in a data journal

Data journals are publications whose primary purpose is to expose datasets by providing the infrastructure and scholarly reward opportunities that will encourage researchers, funders and data centre managers to share research data outputs. The data is peer reviewed and made publicly available under a citable unique identifier.

A list of data journals can be found here. Some highlights include:

More information is available in the Data Journals Guide from the Australian National Data Service, and details of the data publication process have been outlined by the University of Bristol.

3. Supplementary materials to a journal publication

Some publishers provide specific areas for uploading supplementary materials associated with a published article. Refer to individual publisher guidelines to find out if this is available. For example, the Journal of Applied Physics has a ‘Data & Media’ tab for each article, and the International Journal for Numerical Methods in Engineering allows Supporting Information to be submitted alongside the article.

4. Dissemination via a project or institutional website

Project websites can offer easy, immediate storage and access, but they offer less sustainability and it is difficult to control and analyse who is using the data and how.

These options allow proper citation, so that reuse of the data can be recognised. Whatever form of publishing is used, research data needs to be licensed to indicate what users may or may not do with the data.

To meet the RCUK Policy on Open Access, any published research paper must include a data access statement or an explanation of why the data cannot be made accessible. Many academic publishers also require data underpinning research results to be made available.

Licensing research data is essential for clarifying what users may or may not do to the data.

When depositing data in a repository, a license or legal agreement should be chosen. Licenses are granted by the holder of the intellectual property in the data, who should be identified when planning the research project.

Information about different licenses for research data is available from:

EUDAT have created a License Selector Tool to help you choose an appropriate license. Some of the most common licenses for research data are those provided by Creative Commons.

Creative Commons Attribution 3.0 Unported (CC BY 3.0)

You are free to:

  • Share — copy and redistribute the material in any medium or format
  • Adapt — remix, transform, and build upon the material for any purpose, even commercially.

Under the following terms:

  • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

Public Domain Dedication (CC0 1.0 Universal)

The person who associated a work with this deed has dedicated the work to the public domain by waiving all of his or her rights to the work worldwide under copyright law, including all related and neighbouring rights, to the extent allowed by law.

You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.

N.B. CC0 waives an author’s moral rights; the author has to rely on scholarly convention to ensure attribution.

Creative Commons licenses are generally not suitable for software deposited in a data repository; open source software licenses, such as the GPL or Apache licenses, are more appropriate.

Legal agreements

Sometimes a license is not sufficient to cover the terms of use of the data. The following legal agreements should be considered:

  1. Non-Disclosure Agreements - These place an obligation of confidentiality on the University. It is necessary to ensure that data access is permitted only to authorised parties.
  2. Data deposit/sharing agreements
  3. Commercial contracts - It is necessary to ensure that data access is permitted only to authorised parties. Typically, company data must be kept confidential and used only for the specific project purpose, so data deposit agreements need to reflect this.

To ensure any conditions about data storage and access imposed via the research contract (in particular confidentiality) are respected please contact the Research Contract Support Team for guidance.

For any queries about data licensing or agreements please contact the Intellectual Property Manager:
Tel: 0161 295 2905

Data access statements are used in research publications to explain where supporting data can be found and under what conditions they can be accessed.

Data access statements are required for all publications arising from publicly funded research. The EPSRC audits papers that acknowledge its funding to check that they include a data access statement.

Data access statements need to include a persistent URL directly linking to the dataset or to supporting documentation that describes the data in detail, how it may be accessed and any constraints that may apply. If compelling legal or ethical reasons exist to protect access to the data these should be noted in the statement. A simple ‘contact the author’ instruction is not sufficient.

The format and placement of the data access statement will be influenced by the publisher's house-style. Some journals provide a separate section in articles for the data access statement. If this is not available include the data access statement with the acknowledgement of funder support. Alternatively, a formal data citation could be included either within the main references or in a data citation section.

The data access statement should be included in submitted papers, even if a persistent URL or DOI has not yet been issued. The statement should be updated to include any persistent identifiers as they become available, usually when the paper is accepted for publication.

Examples of data access statements

  • All data supporting this study are openly available from the University of Salford Data Repository
  • All data supporting this study are provided as supplementary information accompanying this paper.
  • All data are provided in full in the results section of this paper.
  • Due to ethical concerns, supporting data cannot be made openly available. Further information about the data and conditions for access are available at the University of Salford Data Repository
  • Due to the (commercially, politically, ethically) sensitive nature of the research, no participants consented to their data being retained or shared. Additional details relating to other aspects of the data are available from the University of Salford Data Repository at
  • No new data were created during this study.

Published research data may be referred to or re-used as the basis for further research and must be correctly cited in the following circumstances:

  • When publishing a research paper acknowledging RCUK funding - a data access statement with a persistent link to the dataset or an explanation of why the data cannot be made accessible must be included
  • If an academic publisher requires data underpinning research results to be made available – data may be accessed through supplementary published materials or a data access statement
  • If you use a third party/secondary dataset as part of your research, you will be expected to cite this dataset in the text or references section.

Benefits of data citation:

  • allowing other researchers to find the data
  • enabling easy reuse, repurposing and verification of data
  • allowing the impact of data to be tracked through metrics, similar to publications
  • creating a scholarly structure that recognises and rewards data creators.

Data citation should be viewed as part of good research practice and a number of Guiding Principles have been agreed by The Future of Research Communications and e-Scholarship.

How to cite a dataset

A dataset citation includes all of the same components as any other citation:

  • author
  • title
  • year of publication
  • publisher (for data this is often the archive where it is housed)
  • edition or version
  • access information (a URL or other persistent identifier such as a DOI)

Standards for data citation have not yet been finalised, but many data repositories and publishers provide their own guidelines. If no format is suggested for datasets, use a standard data citation style, such as the DataCite format below:

  • Creator (Publication Year): Title. Publisher. Identifier

It may also be desirable to include information about Version and Resource Type. In this case use this format:

  • Creator (Publication Year): Title. Version. Publisher. Resource Type. Identifier

DataCite recommends that DOI names are displayed as linkable, permanent URLs:
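
The two DataCite formats above can be assembled mechanically from the citation components. The following Python sketch is illustrative only: the class name `DatasetCitation`, its field names and the example values (including the DOI `10.1234/example`) are hypothetical placeholders, not part of any DataCite tool. It assumes the DOI is stored as a bare name and displayed as a link through the `https://doi.org/` resolver.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetCitation:
    """Hypothetical container for the citation components listed above."""
    creator: str
    year: int
    title: str
    publisher: str
    doi: str  # bare DOI name, e.g. "10.1234/example" (placeholder value)
    version: Optional[str] = None
    resource_type: Optional[str] = None

    def doi_url(self) -> str:
        # Display the DOI as a linkable URL via the doi.org resolver.
        return f"https://doi.org/{self.doi}"

    def format(self) -> str:
        # Creator (Publication Year): Title. [Version.] Publisher. [Resource Type.] Identifier
        parts = [f"{self.creator} ({self.year}): {self.title}."]
        if self.version:
            parts.append(f"{self.version}.")
        parts.append(f"{self.publisher}.")
        if self.resource_type:
            parts.append(f"{self.resource_type}.")
        parts.append(self.doi_url())
        return " ".join(parts)


# Example with made-up values; replace them with the details of a real dataset.
citation = DatasetCitation(
    creator="Smith, J.",
    year=2016,
    title="Example survey dataset",
    publisher="University of Salford",
    doi="10.1234/example",
    version="Version 2",
    resource_type="Dataset",
)
print(citation.format())
# Smith, J. (2016): Example survey dataset. Version 2. University of Salford. Dataset. https://doi.org/10.1234/example
```

Leaving `version` and `resource_type` unset produces the shorter form, Creator (Publication Year): Title. Publisher. Identifier. Where a repository or publisher specifies its own citation style, that style takes precedence over this sketch.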

Further information is available from:

The scope and format of data may determine the potential for re-use, but sometimes the most exciting discoveries arise from re-examining data, or using data for a different purpose than originally intended.

Potential options for reuse include:

  • Providing description and historical context
  • Comparative research, restudy or follow up
  • Secondary analysis
  • Replication or validation of published work
  • Research design and methodological advancement
  • Teaching and learning

When analysing or re-using data it may be beneficial to utilise data cleaning and visualisation tools.

Data cleaning and transformation tools:

  • OpenRefine - a tool for data cleaning
  • Find and replace, in any text editor – look for patterns and repetition in a file
  • Regular Expressions or regexes – for when patterns are observed in a file but there aren’t exact character matches
  • Data Wrangler - interactive transformation of messy, real-world data into the data tables analysis tools expect
  • Tabula - a tool for liberating data tables locked inside PDF files
  • Mr. Data Converter - converts Excel data into one of several web-friendly formats, including HTML, JSON and XML
  • GNU PSPP - a free replacement for the proprietary program SPSS
  • The R Project – a free software environment for statistical computing and graphics
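
As an illustration of the regular-expression approach listed above, the following Python sketch rewrites dates that were typed with inconsistent separators into a single ISO 8601 form. It is a minimal example under stated assumptions: the function name `normalise_dates` is hypothetical, the dates are assumed to be in day-first order with a four-digit year, and the sample text is invented.

```python
import re

# Day, month and four-digit year separated by "/", "-" or "."
# (day-first order is an assumption about the source data).
DATE_PATTERN = re.compile(r"\b(\d{1,2})[/.-](\d{1,2})[/.-](\d{4})\b")


def normalise_dates(text: str) -> str:
    """Rewrite every day-first date found in text as YYYY-MM-DD."""
    def to_iso(match):
        day, month, year = match.groups()
        # Zero-pad day and month so that all dates have the same width.
        return f"{year}-{int(month):02d}-{int(day):02d}"

    return DATE_PATTERN.sub(to_iso, text)


sample = "Sampled 3/7/2015 and 12-11-2016; retested 01.02.2017"
print(normalise_dates(sample))
# Sampled 2015-07-03 and 2016-11-12; retested 2017-02-01
```

The pattern matches on shape only, so it does not check that a date is valid (for example, it would accept a month of 13). Cleaned values should still be spot-checked against the original file.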

There are a variety of data visualisation tools available, but here are some highlights:

  • Data Driven Documents D3.js – a JavaScript library for manipulating documents based on data, using HTML, SVG, and CSS
  • FusionCharts - JavaScript charting library, with over 90 charts and 900 maps
  • Dygraphs - JavaScript charting library allowing users to explore and interpret large/dense data sets
  • Tableau Public – free service that lets anyone publish interactive data to the web
  • Datawrapper – open source software for creating charts and maps
  • Timeline JS – open-source tool that enables you to build visually-rich interactive timelines
  • Google Fusion Tables - experimental data visualization web application (retired by Google in December 2019)