Step-by-Step Guide on Data Publication for ETH Zurich Researchers © 2021 by ETH Library is licensed under CC BY 4.0
The step-by-step guide for data preparation and data publication helps you document your metadata and datasets and prepare them for publication in a FAIR data repository. Following this guide will help you take into account good practice principles as well as funder requirements and guidelines that apply at ETH Zurich. It also tells you where to find relevant information and support at ETH Zurich. The guide is structured in three parts: “Part I – Metadata Preparation”, “Part II – Data Preparation”, and “Part III – Data Publication”. Each of these parts is presented below (transcriptions included), and a collection of all three illustrations is available in the following PDF file for download (without transcriptions). It is strongly recommended to work through Part I and Part II before proceeding with Part III of the step-by-step guide.
Parts Overview
Part I – Metadata Preparation
Part I of the step-by-step guide supports you in collecting important metadata (i.e., data about your data), which is a crucial step before arranging your actual research data for reuse, publication, or archiving.
The hyperlinks with supporting information are available in the PDF above or in the description under each picture.
Description
I am planning to publish my research data 1
1. Metadata Documentation
I have to document my data with metadata 2 (entered via a repository’s interface and provided e.g. in a README text file).
Condition A
I developed code, scripts, or other forms of software.
If condition A applies, I proceed with the following steps. Otherwise, I proceed with “Part II: Data Preparation” to prepare my data for publication.
2. Code Description
I add to the README text file:
- the programming language used
- the version, libraries, compiler, packages etc.
- the specifications of the machine I used
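Some of these items can be collected automatically. The following is a minimal sketch, assuming the code was written in Python; the function name describe_environment and the package names in the usage example are illustrative and not part of the guide. It reports the language version, the compiler used to build the interpreter, the operating system and architecture, and the installed version of each listed package.

```python
import platform
from importlib import metadata


def describe_environment(packages):
    """Return README lines describing the language, machine and package versions."""
    lines = [
        f"Programming language: Python {platform.python_version()}",
        f"Interpreter: {platform.python_implementation()}",
        f"Compiler: {platform.python_compiler()}",
        f"Operating system: {platform.system()} {platform.release()}",
        f"Machine architecture: {platform.machine()}",
        "Packages:",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report packages that are listed but not installed here.
            version = "not installed"
        lines.append(f"- {name} {version}")
    return lines


if __name__ == "__main__":
    # Hypothetical package list; replace with the libraries the project uses.
    print("\n".join(describe_environment(["numpy", "pandas"])))
```

The printed lines can be pasted into the README text file. Hardware details such as memory or GPU model are not covered by this sketch and would have to be added by hand.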
3. Registration
I have to register 3 my code, scripts or software. (ETH Zurich requires code, scripts and software to be registered at the ETH transfer office.)
4. License Information
I add to the README text file:
- the license used for the code, scripts or software
5. Completion
I include the code, scripts and software after registration when proceeding to “Part II: Data Preparation”
I proceed with “Part II: Data Preparation” to prepare my data for publication
Footnotes
1 Reasons for publishing one’s research data, for example:
- According to ETH Zurich’s RDM Guidelines, “Research Data and Programming Code that are considered as directly relevant for a result publication based on Community Standards must be published and deposited in a FAIR repository […]” (RDM Guidelines, Article 6(1)); AND
- I am planning to submit/publish a paper AND/OR
- My funder requires me to publish my research data underlying a paper AND/OR
- I am planning to publish my research data to make them reusable
- etc.
2 Required metadata:
- a title (a name given to the dataset or the research project that produced it)
- an abstract of the project or dataset (description)
- creator (the names of the persons who collected the data or contributed to data collection), identified by ORCID identifier
- the date or period of collection
- a short description of each file
- the persistent identifier of the publication (e.g., DOI, ISBN)
- the selected license
- the collection methods, tools and software used
- discipline-specific metadata relevant in the respective research field (if applicable)
- A guide for writing README files is available here.
3 For registration at ETH transfer office, see information on software licensing to third parties and open-source licensing here.
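The required metadata from footnote 2 can be checked for completeness before a README text file is written. The following is a minimal sketch under stated assumptions: the field names, the README layout and the functions missing_fields and render_readme are hypothetical and not prescribed by the guide or by any repository.

```python
# Hypothetical field names mirroring the required metadata in footnote 2.
REQUIRED_FIELDS = [
    "title",
    "abstract",
    "creators",                # list of (name, ORCID identifier) pairs
    "collection_period",
    "file_descriptions",       # mapping of file name to short description
    "publication_identifier",  # e.g. DOI or ISBN of the publication
    "license",
    "methods",                 # collection methods, tools and software
]


def missing_fields(meta):
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not meta.get(field)]


def render_readme(meta):
    """Build the README text, refusing to continue if metadata are missing."""
    missing = missing_fields(meta)
    if missing:
        raise ValueError("Missing required metadata: " + ", ".join(missing))
    lines = [
        f"Title: {meta['title']}",
        f"Abstract: {meta['abstract']}",
        "Creators:",
    ]
    lines += [f"- {name} (ORCID: {orcid})" for name, orcid in meta["creators"]]
    lines += [
        f"Date or period of collection: {meta['collection_period']}",
        f"Related publication: {meta['publication_identifier']}",
        f"License: {meta['license']}",
        f"Collection methods, tools and software: {meta['methods']}",
        "Files:",
    ]
    lines += [f"- {name}: {text}" for name, text in meta["file_descriptions"].items()]
    # Discipline-specific metadata are optional ("if applicable").
    if meta.get("discipline_specific"):
        lines.append(f"Discipline-specific metadata: {meta['discipline_specific']}")
    return "\n".join(lines) + "\n"
```

Calling missing_fields on a partially filled dictionary returns the fields that still need to be documented, which makes it usable as a checklist while the metadata are being collected.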
Part II – Data Preparation
Part II of the step-by-step guide helps you decide which part of your data can be published directly and for which part of your data only metadata can be provided. This is a crucial step before the actual publication or archiving of your research data.
Description
I prepare my data for reuse, publication or archiving
Condition A
Some of the data are protected by copyright or contracts with third parties or I am planning to patent them. If condition A applies to me, those data will not be published and I proceed with Condition D.
If condition A does not apply, I proceed to the next step.
Condition B
Some of the data are personal or confidential data 1. (According to ETH guidelines, strictly confidential data must not be distributed, and access can only be granted to a specified group of persons; for a definition, see the ETH Zurich directive.)
If condition B does not apply to me, I proceed with "Option 1". If condition B applies, I proceed with the next step.
Condition C
Data can be anonymized 2 and consent from study participants is obtained. (As required by Swiss and international law, I handle sensitive data with the appropriate care.)
If condition C applies to me, I proceed with "Option 2". If condition C does not apply, those data will not be published and I proceed with the next step.
Condition D
The metadata themselves are personal or confidential data 1.
If condition D applies to me, I end up in "Option 0". If condition D does not apply, I end up in "Option 3".
Option 0
The metadata will not be published, because they are sensitive. This is a dead end, because nothing will be published.
Option 1
- I include the data
- I include the README file from “Part I: Metadata Preparation”
- I include the code, scripts and software after registration
The README file and related metadata and the data will be published. I proceed with “Part III: Data Publication”.
Option 2
- I anonymize the data and include it
- I add to the README text file from Part I:
  - license information
  - access conditions (define how and by whom data can be accessed and used)
  - planned storage period
  - categories of personal data
The README file and related metadata and the fully anonymized data will be published. I proceed with “Part III: Data Publication”.
Option 3
- I add to the README text file from Part I:
  - license information
  - access conditions (define how and by whom data can be accessed and used)
  - planned storage period
  - categories of personal data
Only the README file and related metadata will be published (these must not include any sensitive data). I proceed with “Part III: Data Publication”.
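The conditions and options above form a small decision tree, which can be sketched in code. This is an illustration only: the names Option and choose_option and the four boolean parameters are assumptions introduced here, and the function applies to one subset of the data at a time, since the conditions refer to "some of the data".

```python
from enum import Enum


class Option(Enum):
    NOTHING_PUBLISHED = 0           # Option 0: metadata are sensitive too
    DATA_AND_README = 1             # Option 1: data and README are published
    ANONYMIZED_DATA_AND_README = 2  # Option 2: anonymized data and README
    README_ONLY = 3                 # Option 3: only README and metadata


def choose_option(*, protected, sensitive, anonymizable_with_consent,
                  metadata_sensitive):
    """Map the answers to Conditions A to D onto Options 0 to 3."""
    if not protected:                  # Condition A does not apply
        if not sensitive:              # Condition B does not apply
            return Option.DATA_AND_README
        if anonymizable_with_consent:  # Condition C applies
            return Option.ANONYMIZED_DATA_AND_README
    # The data themselves are not published; Condition D decides
    # whether at least the metadata can be.
    if metadata_sensitive:
        return Option.NOTHING_PUBLISHED
    return Option.README_ONLY
```

For example, data that are not protected, are personal, and cannot be anonymized lead to Option 3 as long as the metadata themselves are not sensitive.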
Footnotes
1 Personal or otherwise (strictly) confidential data are data such as:
- data related to identifiable persons (e.g. body- or health-related data; household income data)
- contract-related data (e.g. with third-party ownership or rights to the data at a hospital or company)
Additional information is available in the ETH Zurich directive and in the factsheet “Data Protection in Research Projects”.
2 Anonymization: “all items which, when combined, would enable the data subject to be identified without disproportionate effort, must be irreversibly masked or deleted” (Human Research Ordinance, Art. 25)
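As an illustration of the "deleted" part of this definition, the following minimal sketch removes identifying columns from tabular data in CSV form. The function name drop_columns and the column names in the usage note are hypothetical. Removing direct identifiers alone is generally not sufficient for anonymization, because the remaining columns may still identify a person when combined; which columns must go is a case-by-case decision that this sketch does not make.

```python
import csv
import io


def drop_columns(csv_text, identifying_columns):
    """Return the CSV text without the given columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    # Fail loudly on a misspelled column name instead of silently
    # leaving an identifier in the output.
    unknown = set(identifying_columns) - set(fieldnames)
    if unknown:
        raise ValueError("Unknown columns: " + ", ".join(sorted(unknown)))
    kept = [name for name in fieldnames if name not in identifying_columns]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=kept, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # dropped columns are ignored by the writer
    return out.getvalue()
```

A call such as drop_columns(text, {"name", "email"}) returns only the remaining columns. The original file with identifiers must then be kept out of the deposit.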
Part III – Data Publication
Part III of the step-by-step guide supports you when you finally want to publish your data collection or dataset, including your data, metadata, and potentially also accompanying code or scripts. It is strongly recommended to work through Part I and Part II before proceeding with this third part of the step-by-step guide.
Description
I want to publish my data collection or dataset (data + metadata [+ code])
Condition A
I will use the ETH Research Collection as a FAIR repository.
If condition A does not apply to me, I proceed with the option "External Repository". If condition A applies, I proceed with the option "Research Collection".
External Repository
I choose another FAIR data repository (e.g., Zenodo or a discipline-specific repository listed on www.re3data.org).
I proceed with step "Publication".
Research Collection
I optionally reserve a DOI. This applies if I need a DOI to refer to my research data in my manuscript or paper, but I do not want to upload my data before my manuscript is accepted.
- I follow the instructions to reserve a DOI at this site of the ETH library documentation, and I include the DOI in my manuscript.
- I proceed to the next step during or after the paper publication process.
Deposit data in the Research Collection
To deposit my data in the ETH Research Collection, I follow the video tutorial “How to publish research data”. I proceed with the next step.
Publication
I enter the required metadata via the repository’s interface and I choose a suitable license. Guidance is available here.
If the data were collected during a project funded e.g. by the SNSF or the EU, I have to comply with the funder's requirements.
I deposit my data (data + metadata [+ code]) from “Part II: Data Preparation” including the README file in the chosen repository.
Optionally, I can also link DOIs: repository items can be linked, e.g. to connect the DOI of a dataset with the DOI of a paper.
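When DOIs are linked or cited in a README file, it helps to write them in one consistent form. The following is a minimal sketch that normalizes a DOI to its https://doi.org/ resolver URL and builds a linking sentence; the function names and the sentence wording are hypothetical, and the sketch does not reflect how any particular repository stores such links.

```python
def doi_url(doi):
    """Normalize a bare DOI, a 'doi:' form or a resolver URL to https://doi.org/..."""
    value = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if value.lower().startswith(prefix):
            value = value[len(prefix):]
            break
    # Every DOI starts with the directory indicator "10." and contains
    # a "/" between prefix and suffix.
    if not value.startswith("10.") or "/" not in value:
        raise ValueError(f"Not a DOI: {doi!r}")
    return "https://doi.org/" + value


def related_publication_line(dataset_doi, paper_doi):
    """Return a README sentence connecting a dataset DOI with a paper DOI."""
    return (f"Dataset {doi_url(dataset_doi)} "
            f"supplements publication {doi_url(paper_doi)}")
```

The check in doi_url is deliberately loose: it catches obvious mistakes such as a pasted title, but it does not confirm that the DOI is registered.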
Important information
For additional support, see the Research Collection Manual, or contact research-collection@library.ethz.ch.
You have reached the end of the last step-by-step guide.