Research data management
Research data has increasingly become the focus of research funders in the last few years. They have recognised that making research data accessible and reusable allows research projects to be carried out more efficiently, more cost-effectively and on a sound basis. The organisation, reusability and storage of research data are therefore gaining significance for the planning of research projects in all scientific disciplines.
Which services are offered by the University Library in relation to research data?
The University Library offers interdisciplinary guidance and training for the entire data management cycle with its expertise in the area of information and knowledge organisation, specialising in the area of Open Science and digitisation.
A data management plan (DMP) describes how to handle research data during the entire life cycle of the data. Some funding institutions already require a first version of the DMP with applications for research project funding. A DMP created early on can be very helpful as a checklist for your own data management. At the planning stage, you should clarify in particular the ethical, legal, technical and financial implications of gathering and processing the data, and later of publishing and archiving it.
The University Library regularly offers DMP Writing Labs before application deadlines for funding agencies and is available for individual consultation sessions.
The SNSF expects that the so-called FAIR principles are adhered to when publishing data. These guarantee that the data will be Findable, Accessible, Interoperable and Reusable.
For researchers, this means, among other things, the following:
- The SNSF recommends the use of non-commercial repositories that comply with the FAIR principles. For the social sciences and humanities, FORSbase and DaSCH can be recommended, for example.
- In order to guarantee findability and reusability, research data must be enriched with comprehensive, systematic and standardised (as far as possible) metadata.
- Making the data accessible does not mean that the data must be completely open and freely accessible. Rather, it means that information is provided about HOW the data can be accessed. If research data cannot be published due to ethical or legal reasons, then the metadata should still be accessible.
- When storing the data, suitable file formats should be selected that have widespread use in the scientific community, are compatible with different systems and ideally should be free of charge and open source. File formats that require specialised algorithms or uncommon programs should be avoided.
- The research data should be provided with a persistent identifier, so that they can be permanently and uniquely identified, e.g. through a DOI (Digital Object Identifier) or an ARK (Archival Resource Key). Many repositories automatically generate a persistent identifier.
- Data should be provided with a licence that regulates its reuse, for example a Creative Commons licence.
The University Library will support you in finding the most suitable repository for your data and managing your data according to the FAIR principles.
When starting a project, it is strongly recommended that you select a logical and consistent data organisation that will enable you and others to easily find, access and use your data, thereby avoiding duplication of work, and ensuring that your data can be backed up. The following tips can help you to develop your own organisation system:
- Use a hierarchical and clearly arranged file structure.
- Separate active and completed work and delete all unused temporary files.
- Establish a naming convention for file names.
- Use file formats that are widespread and preferably open source.
- Describe your files using good documentation and metadata.
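As an illustration of a naming convention, the following Python sketch builds file names from a date, a project short name, a description and a version number. The pattern itself (ISO date first, lowercase, hyphens within a part, underscores between parts, two-digit versions) and the function name `build_filename` are assumptions chosen for this example, not a University Library standard.

```python
import re
from datetime import date


def build_filename(day: date, project: str, description: str, version: int, ext: str) -> str:
    """Build a name such as 2024-03-15_project-x_interview-transcript_v02.txt."""

    def slug(text: str) -> str:
        # Lowercase the text and replace every run of other characters with a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    extension = ext.lstrip(".").lower()
    return f"{day.isoformat()}_{slug(project)}_{slug(description)}_v{version:02d}.{extension}"


# Hypothetical example values.
name = build_filename(date(2024, 3, 15), "Project X", "Interview transcript", 2, ".TXT")
print(name)  # 2024-03-15_project-x_interview-transcript_v02.txt
```

Putting the date first in ISO format means that an alphabetical file listing is also chronological.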
Documentation means recording information so that it can be used and applied later. The aim is to make information or documents findable and reproducible. Structured information about an object is called metadata.
If research data are to be published or archived in a repository, metadata that reflect the contents of a document are essential, thereby making it easier to find the document. A description of the contents can be in the form of subject headings and abstracts. For indexing, the use of a standardised vocabulary is recommended. An overview of freely-accessible vocabularies can be found here: http://bartoc.org/.
Ideally, documentation and metadata should already be recorded on an ongoing basis during your research. It is recommended that an internal project standard should define how data will be annotated and stored. Meaningful naming of files and information in the individual documents (e.g. information about time, place and interviewee in an interview transcript) should also be included.
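One possible way to record metadata on an ongoing basis is to keep a small structured record next to each data file. The sketch below does this for an interview transcript, using the time, place and interviewee mentioned above. The field names, the function `make_metadata` and all values are hypothetical. A real project would follow its own internal standard or the schema of the chosen repository.

```python
import json


def make_metadata(title, creator, date_recorded, place, interviewee, keywords):
    """Return a metadata record for one interview transcript as a plain dict."""
    return {
        "title": title,
        "creator": creator,
        "date_recorded": date_recorded,  # ISO 8601 date string
        "place": place,
        "interviewee": interviewee,      # a code or pseudonym where required
        "keywords": sorted(keywords),    # ideally taken from a standardised vocabulary
    }


# Placeholder values for illustration only.
record = make_metadata(
    title="Interview 07",
    creator="Example Researcher",
    date_recorded="2024-03-15",
    place="Basel",
    interviewee="P07",
    keywords=["migration", "education"],
)

# Serialise as JSON so that the record can be stored alongside the data file.
text = json.dumps(record, indent=2, ensure_ascii=False)
print(text)
```

A plain-text format such as JSON keeps the record readable without specialised software.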
The requirements of research funders differ with regard to the recommended duration of data archiving and the definition of "long term". The University of Basel recommends storing data for at least 5 years after publication of the research results. However, many research sponsors recommend keeping the data for longer; the SNSF recommends 10 years. Fees charged by the archive for the preparation and inclusion of the data can be addressed directly in the funding application. The following aspects should therefore be considered as early as possible in your data management planning:
- selection of your research data
- suitable file format
- detailed documentation and metadata
- the data set has a persistent identifier
- long term financing of the archive is guaranteed (for example, linked to an institution)
The University Library is currently establishing an archiving solution for its own holdings and will be happy to advise you on the preparation of your data for long-term storage and on finding a suitable repository. The University Library does not have an archive or repository for research data.
Research data are increasingly openly accessible via data archives, supplementary material in scientific journals and the websites of research groups. Multidisciplinary and subject-specific archives can be located with the help of Re3data (Registry of Research Data Repositories). Some data records can also be found through search engines for data.
Research data must be cited in the same way as other publications in the spirit of good academic practice. We recommend the following details in the usual citation style: author, data record name, repository, version, persistent identifier.
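The recommended elements can be assembled into a citation string as in the following sketch. The punctuation, the added publication year and the function name `format_data_citation` are assumptions, since the actual layout depends on the citation style in use. The authors, repository name and DOI are invented placeholders.

```python
def format_data_citation(authors, year, title, repository, version, identifier):
    """Join author, data record name, repository, version and identifier."""
    author_part = "; ".join(authors)
    return f"{author_part} ({year}). {title}. {repository}, version {version}. {identifier}"


# Placeholder values; the DOI below does not point to a real data set.
citation = format_data_citation(
    authors=["Muster, A.", "Example, B."],
    year=2024,
    title="Interview transcripts",
    repository="Example Repository",
    version="1.0",
    identifier="https://doi.org/10.1234/example-dataset",
)
print(citation)
```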
Frequently asked questions
Persistent identifiers enable resource providers (e.g. journals, storage, …) to clearly identify individuals or to cite data records in the research community. The most important identifiers are DOI and ORCID.
A Digital Object Identifier (DOI) is a unique alphanumeric character string, with which content can be uniquely identified and a permanent link to its location in the Internet can be provided.
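A DOI becomes a permanent link when it is appended to the resolver address https://doi.org/. The following sketch normalises a few common ways of writing a DOI into such a link. The function name `doi_to_url` and the sample DOI are made up for illustration, and the validity check is deliberately loose (it only tests for the "10." prefix and a slash).

```python
from urllib.parse import quote

# Common ways in which a DOI is written in references.
_PREFIXES = ("https://doi.org/", "http://doi.org/", "https://dx.doi.org/", "http://dx.doi.org/", "doi:")


def doi_to_url(doi: str) -> str:
    """Return a resolver link for a DOI given in any of the forms above."""
    doi = doi.strip()
    for prefix in _PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI starts with "10." and separates prefix and suffix with a slash.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are not safe in a URL path.
    return "https://doi.org/" + quote(doi, safe="/")


link = doi_to_url("doi:10.1234/example-dataset")  # placeholder DOI, not a real data set
print(link)
```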
ORCID (Open Researcher and Contributor Identifier) provides an identifier that authors can use with their name when they carry out research, grant and innovation activities. You can find more information about this identifier here: orcid.org/about.
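An ORCID iD consists of 16 characters, the last of which is a check character calculated with the ISO 7064 MOD 11-2 algorithm. A mistyped iD can therefore often be detected before it is stored with a data record. The following is a minimal sketch. The function names are chosen for this example, and the check only shows that an iD is well formed, not that it is registered.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed_orcid(orcid: str) -> bool:
    """Return True if the iD has 16 characters and a matching check character."""
    compact = orcid.replace("-", "").upper()
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_character(base) == compact[15]


# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
print(is_well_formed_orcid("0000-0002-1825-0097"))  # True
print(is_well_formed_orcid("0000-0002-1825-0098"))  # False
```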
With regard to research data, the research funding institutions demand that ...
- at least the data on which a publication is based should be made openly accessible, provided that there are no legal, ethical, copyright or other clauses to the contrary.
- the data be published in a repository that follows the FAIR principles.
- a preliminary data management plan be submitted, depending on the funding institution, either with the application or after the project has been approved, and then be adapted and supplemented during and at the end of the project.
- the data on which a publication is based be archived. The recommended retention period is 10 years. The SNSF acknowledges that, depending on the subject area, this is not always equally meaningful.
Metadata describes how data was created and the context in which it exists. It should make the data identifiable and explain by whom, how and in which context the data was generated, how it was processed, where it has already been published, and under which conditions it is accessible and reusable for others.
The following types of metadata exist:
- Technical metadata (e.g. the size of an image file)
- Metadata that describes the content (title, authors, keywords)
- Administrative metadata (rights, licences, publication date, usage, access rights)
- Relational metadata (reference to other data sets or the associated publication)
Research data is all data that is the subject of research or that arises during the research process. It serves to document and validate the research results. It is factual material, which can be very different depending on the discipline and research area and arises through different methods such as measurements, experiments, surveys, source research, digitisation, and many more. The term research data is deliberately kept open and there is therefore no general definition.
Data retention means protecting your data in a secure environment in the long term, in such a way that it remains usable, understandable and accessible. This means more than just making backup copies of your data. If you only back up your data, it can ...
- become unreadable with future software because the file format is not compatible.
- be altered when opening the file with new software, so that it is no longer reliable for research.
- be altered by someone, as there is no access control.
- become incomprehensible because no documentation or metadata have survived.
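Unnoticed alteration of a file can be detected by recording a checksum when the file is stored and comparing it again later. Checksums are not mentioned above. They are shown here only as one common technique, in a minimal sketch that uses SHA-256 from the Python standard library. The file name and contents are invented, and a temporary directory stands in for the real storage location.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file as a hexadecimal string."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so that large files do not have to fit into memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "measurements.csv"  # hypothetical data file
    data_file.write_text("id,value\n1,3.2\n", encoding="utf-8")
    recorded = sha256_of(data_file)  # checksum noted when the file was stored

    # Simulate an unintended change to the file.
    data_file.write_text("id,value\n1,3.3\n", encoding="utf-8")
    unchanged = sha256_of(data_file) == recorded

print("unchanged" if unchanged else "file has changed since the checksum was recorded")
```

A checksum only reveals that a file has changed. It does not replace access control, documentation or suitable file formats.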
Good data management is part of the research process and contributes to the integrity and transparency of research. If you begin with data management planning at the start of your project, ...
- you can budget for the required infrastructure, legal questions and costs, which are sometimes reimbursed by your sponsor.
- you can save time and resources in the long term and increase the efficiency of your research.
- you will prevent data loss by increasing data security.
- you will avoid unnecessary data duplication and additional workload.
- you will enable the reproducibility and verifiability of your research.
- you will meet the requirements of the funding institutions.
Your contacts at the University Library for research data
Research Data Management Network
The University Library is a member of the Research Data Management Network at the University of Basel and serves as its coordination centre. In cooperation with the Vice President’s Office for Research, the library coordinates the participants in the network, supports its further development and monitors developments in the area of research data management.