Guidelines for handling research data
Research data are of great importance in the scientific research process, which is why a number of national and international guidelines and policies address them. The LUH guideline for handling research data sets out principles that give an overview of the requirements and responsibilities for research data management (RDM) at LUH. On this page, we have compiled supplementary explanations and information on various aspects of data management for you.
Data description and documentation
What is meant by "research data"?
All data generated and processed in scientific activities are considered research data. They constitute the basis of the research results. Data types can be quite diverse and may comprise measurements, secondary analyses, visualisations, models, and the results of polls and surveys. Equally manifold are the file formats used to hold numbers, text, code, or graphics. Even physical samples such as minerals or tissues are sometimes defined as "data".
Which file formats meet the requirements of the guideline?
Data management begins with data collection. A variety of methods, e.g. measurements, simulations, surveys or text analyses, are used to generate or collect data. The data is therefore available in different forms and file formats, such as tables, CAD data, image and raster data, transcripts, program code and much more. The chosen methodology, the type of data and its file format determine whether and how data can be processed automatically. They also influence how compatible the data is with other hardware and software systems and whether it remains readable in the long term.
The data type determines the form in which data is presented and stored. For example, survey data is better processed in tabular form than as running text. Complex data collections should be stored in a database rather than in an Excel sheet.
Consider the file format of your data carefully. Some devices and many application programs store data in a proprietary format, which may not be readable with other software. It is therefore better to save the data in an open format or convert it to one. This facilitates data exchange and long-term archiving.
Recommendations on data formats can be found on the website of the RADAR repository.
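As a small illustration of saving tabular data in an open format, the following sketch writes survey-style records to CSV using only the Python standard library. The column names and values are invented placeholders, and CSV is only one possible open format; the repository recommendations mentioned above may suggest others for your data type.

```python
import csv
import io

# Invented example records; real survey data would come from your own source.
rows = [
    {"respondent_id": 1, "age_group": "18-29", "answer": "yes"},
    {"respondent_id": 2, "age_group": "30-44", "answer": "no"},
]


def to_csv(records):
    """Serialise a list of dictionaries to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv(rows)
print(csv_text)
```

To store the result on disk, the text can be written to a file with an explicit encoding such as UTF-8, so that other programs can read it without guessing.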
What does good file naming and folder structure look like?
Folder and file names should consist of elements that allow quick classification of content. For example, you can provide information about the creation date, the file version and the person editing the research data. Arrange these elements in a uniform order and format. Make sure that naming conventions are agreed on in advance, set out in writing and adhered to during the research process.
The more information file names contain, the longer they become, and some programs cannot process very long file names. Information that is the same for all files in a folder should therefore be stored in the folder name instead.
Tips for file naming
- Dates in YYMMDD format, for example, 150828 for August 28, 2015.
- Shorten personal data, e.g. to initials.
- Use only the following characters for file names: A-Z a-z 0-9 _ (underscore)
- Do not use umlauts, spaces or special characters, as many programs interpret these characters differently or do not display them correctly.
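The tips above can be sketched as a small helper that assembles file names and rejects disallowed characters. The element order (date, topic, initials, version) and the function name `build_file_name` are assumptions made for this illustration; your own convention may differ.

```python
import re
from datetime import date

# Only the characters recommended above (A-Z a-z 0-9 _), followed by a
# simple extension. Allowing a single dot before the extension is an assumption.
ALLOWED = re.compile(r"[A-Za-z0-9_]+\.[A-Za-z0-9]+")


def build_file_name(created, topic, initials, version, extension):
    """Assemble a name such as 150828_soilsamples_AB_v02.csv (hypothetical scheme)."""
    stem = "_".join([
        created.strftime("%y%m%d"),  # date in YYMMDD format
        topic,
        initials,                    # person shortened to initials
        f"v{version:02d}",           # two-digit version number
    ])
    name = f"{stem}.{extension}"
    if not ALLOWED.fullmatch(name):
        raise ValueError(f"disallowed characters in file name: {name!r}")
    return name


print(build_file_name(date(2015, 8, 28), "soilsamples", "AB", 2, "csv"))
# prints 150828_soilsamples_AB_v02.csv
```

A topic containing a space or an umlaut raises a `ValueError`, which makes deviations from the agreed convention visible early.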
What is meant by documentation of data? What is metadata?
Metadata is used for the structured documentation and description of research data. It consists of descriptive and technical information. Research data should always be stored together with its metadata. In public repositories, this usually happens automatically and is mandatory. Before publication, metadata can be saved in a number of formats, e.g.
- in a database.
- in tables (e.g. Excel).
- in a readme file (text, PDF).
- in a structured XML file.
- in the data file (e.g. in the file header).
Good documentation ensures that data
- is findable,
- can be processed automatically, e.g. by search engines,
- can be reused better or at all,
- can be cited and thus directly associated with the creator of the data,
- is more valuable for science, since content, quality and state of processing can be evaluated.
Basic descriptive metadata
- Unique identifier
- Title
- Creator (primarily responsible researchers)
- Collection date (also versions)
- Format (if necessary: required software)
- Subject area
- Data description / abstract
- Data collection (spatial / temporal)
- Organisation
- Legal Rights / License
- Relations to other objects (data, texts...)
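One way to keep data and metadata together before publication is a small metadata file stored next to the data file. The sketch below is only an illustration: the field names are derived from the list above, the `.metadata.json` suffix and both function names are assumptions, and all values are invented placeholders. It does not implement any particular metadata standard.

```python
import json
import tempfile
from pathlib import Path

# Field names modelled on the "Basic descriptive metadata" list above.
REQUIRED_FIELDS = (
    "identifier", "title", "creator", "collection_date", "format",
    "subject_area", "description", "coverage", "organisation",
    "license", "relations",
)

# Placeholder values only; this is not a real data set.
record = {
    "identifier": "example-0001",
    "title": "Example soil moisture measurements",
    "creator": ["A. Researcher"],
    "collection_date": "2015-08-28",
    "format": "CSV (no special software required)",
    "subject_area": "Soil science",
    "description": "Placeholder abstract describing the data set.",
    "coverage": "Placeholder site, August 2015",
    "organisation": "Leibniz Universität Hannover",
    "license": "CC0",
    "relations": [],
}


def missing_fields(metadata):
    """Return the required fields that are absent, None or an empty string."""
    return [f for f in REQUIRED_FIELDS
            if f not in metadata or metadata[f] in ("", None)]


def write_sidecar(metadata, data_file):
    """Write the metadata as JSON next to the data file and return its path."""
    gaps = missing_fields(metadata)
    if gaps:
        raise ValueError(f"metadata incomplete, missing: {gaps}")
    target = data_file.with_name(data_file.name + ".metadata.json")
    target.write_text(json.dumps(metadata, indent=2, ensure_ascii=False),
                      encoding="utf-8")
    return target


# Demonstration in a temporary folder so that nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    sidecar = write_sidecar(record, Path(tmp) / "150828_soilmoisture_AB_v01.csv")
    print(sidecar.name)
```

When the data is later deposited in a repository, the same fields can usually be transferred into the repository's own metadata form.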
Related Links
Information on forschungsdaten.info (mainly in German)
General metadata schemas:
How is a data management plan structured?
The handling of research data should be recorded in a data management plan (DMP).
The content of a DMP:
- Project overview
- What kind of data is used in my project? (Self-generated data, pre-existent data)
- How is the data managed? (filenames, storage location (internal / external), backups)
- How is the data processed?
- Which legal aspects must be considered? (Data protection, licenses, how do I distribute my data?)
- Data sharing and publication
- Who is involved in what with the data (roles and responsibilities)?
- What resources are available to me? (money, material, human resources)
In general, you should draft the data management plan in as much detail as possible for internal project use. If the research process deviates from the original planning or if certain aspects are to be specified, the data management plan must be adapted.
Online tools for the development of DMP
DMPonline
free, online tool for creating data management plans in English from the Digital Curation Centre (DCC)
RDMO
free DMP tool for institutional use with own entities
Further Information
LUH template for creating data management plans (in German)
forschungsdaten.info (mainly in German)
How to Develop a Data Management and Sharing Plan, Sarah Jones (DCC)
Checklist for Data Management Plans from the DCC
Which legal aspects have to be considered?
Before you collect, process and publish scientific research data, you should check the legal framework and guidelines for handling research data. Personal data, for example, is regulated by data protection laws. You can get detailed advice on this topic from the Stabsstelle Datenschutz. If you process data collected by other persons or institutions, check in advance whether its use is permitted. Is the data released under a specific license? Also consult your employer to find out what exploitation rights you have to the data you have collected. Further information can be found in our FAQs on legal aspects of handling research data and in this expert assessment of legal aspects of research data management by the DataJus project (both only in German).
Regulations
- German Copyright Law
Many raw data are not protected by copyright, but there may be certain exceptions.
- Data protection laws:
- General Data Protection Regulation (GDPR; DSGVO in German)
- Federal Data Protection Law (in German)
- Data Protection Law of Lower Saxony (in German)
Related Links
- Information on forschungsdaten.info (mainly in German)
- Legal aspects of Open Science (in German)
- German Copyright Law
Saving, archiving and publishing
Where can data be securely stored?
Losing your data and the analyses that build on it can seriously set back your research, because collecting the data involved considerable money, time and effort. Anyone who digitally generates and evaluates research data must therefore ensure that nothing is lost and that the results are stored securely for a long time. The following principles should be observed:
- Regularly back up relevant research data on suitable devices, or use professional backup services.
- The backup intervals determine the possible loss rate in the event of a fault - the more frequently you save, the lower the possible loss of data.
- Every backup is only as good as the data recovery: Test recovery on your computer before an emergency occurs.
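The last principle, checking that a backup can actually be read back, can be sketched as follows. This is a minimal illustration with hypothetical function names, not a replacement for a professional backup service: it copies a single file and compares SHA-256 checksums of the original and the copy.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up_and_verify(source, backup_dir):
    """Copy a file into backup_dir and confirm the copy matches the original."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"backup of {source} does not match the original")
    return target
```

A call such as `back_up_and_verify(Path("results.csv"), Path("/mnt/backup"))` would copy the file and raise an error if the copy differs; both paths here are placeholders. A full recovery test should also confirm that the restored file opens in the software you use.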
Backup & Restore at the LUIS
If possible, use the server of your institute for data storage, which is regularly backed up by the LUIS. With the Backup & Restore service, institutes and central institutions create backup copies of server data that change regularly or that belong to current projects. The service automatically stores copies on LUIS servers, where they are secured for a limited period of time.
Alternatively, the "Sync & Share" service Seafile is available as part of the central file service, which automatically copies selected data to a LUIS server. In addition, this data can be distributed to other devices.
Backup programs
The data on your PC can be secured on external media (USB storage, DVD, tapes) with the built-in mechanisms of your operating system (Windows Vista or higher: Backup & Restore) or with special software. A list of these programs can be found on Wikipedia.
Related links
- Tips on data organisation from Forschungsdaten.info (in German)
- Tips on data security from forschungsdaten.info (in German)
How do I store access-restricted data?
Think carefully about where and how you store and secure your data. If you work with data that requires protection, you should restrict access to your immediate collaborators. Typically, these restrictions are governed by the read and write permissions and sharing privileges on institute servers or in file services such as Seafile. If you are working with personal research data, you can get advice from the data protection office to check whether the technical and organizational safety measures are sufficient.
Free cloud storage services and unencrypted USB media are not an appropriate place for sensitive data. Personal research data must always be stored encrypted. You can encrypt entire file systems on mass storage such as hard disks or portable USB media so that unauthorized persons cannot access them. Most operating systems, such as macOS, Windows and Linux, already come with built-in software (FileVault, BitLocker, dm-crypt). For Windows, the open source VeraCrypt is also recommended. Alternatively, you can encrypt individual folders and files directly (file encryption). This is possible with the archive manager 7-Zip, some file managers or tools like GPG or OpenSSL.
Further information
Where can data be published in accordance with the FAIR principles?
Archiving and publishing data in a dedicated data repository is a way to make data accessible and citable in the long term. Most repositories have specific requirements for the data to be hosted, which should ideally be considered before the data is created. Usually these are some or all of the following requirements:
- Use open data formats that facilitate long-term archiving and access.
- Mandatory metadata for documentation to increase findability and usability.
- The assurance of the data provider that archiving and access to the data does not violate copyright or data protection laws.
- Use of licenses or agreements that facilitate subsequent use (e.g. Open Access, Open Access after an embargo period).
What to consider when choosing a repository
- Guaranteed data preservation for at least 10 years
- Affordable fees for long-term data preservation
- Metadata acquisition for each record at least complying with DataCite or Dublin Core standards
- Unique, long-term persistent identifiers, e.g. a DOI, are assigned for each data set.
Repositories
re3data.org
Index of interdisciplinary and subject-specific repositories
RIsources
DFG Portal for Research Infrastructures
Leibniz Universität Hannover
- LUIS data archive (non-public archive)
- LUH data repository
RADAR
Generic Data Repository operated by FIZ Karlsruhe and TIB
ZENODO
Generic Repository, funded by the European Union and operated at CERN.
Which data should be published?
Funders, universities and scientific organisations require or recommend that research data and other results are openly accessible after a project ends. This makes it easier to evaluate research findings and enables other researchers to reuse data.
However, it is neither possible nor useful to publish all data generated during the research process. Data worth publishing may be all data that are needed to understand a project’s outcome. The most important criteria are as follows:
- Uniqueness: No duplicates of the data have already been published elsewhere.
- Extremely limited reproducibility: The data cannot be re-generated or only at great expense.
- High professional relevance: The data is of particular interest to your professional community or even across disciplines.
- Basis of text publications: You have published a book or article based on the analysis of this data.
To ensure that your data is reusable, please note the following:
- Adequate documentation: Provide sufficient descriptive metadata so that the data set can be searched in a database (e.g. a repository).
- Readability: If possible, save the data in open, widely used formats that can be opened independently of platforms and do not require special (possibly not permanently available) hardware and software.
- Rights: Check whether the rights of third parties may prevent publication (e.g. copyrights or personal rights). If this is the case, try to have all necessary rights granted in writing by the persons concerned. Provide your data with an open license (e.g. CC0) so that it can be used by anyone without restrictions.
Which licences are recommended?
Before sharing data with other parties, the requirements for re-use should be clarified. Researchers at LUH are recommended to use open licenses for data publications. By assigning an open license, the author grants other persons the right to use, modify and redistribute the data without restriction. There are licenses that limit these rights, but these are no longer regarded as "open". Granting a standardised license is usually a prerequisite for publication in repositories.
Licenses
- Open Data Commons Licenses
Open data and database licenses
- Creative Commons Licenses
Licenses suitable for open licensing of research data are:
- Data license Germany (in German)
This is a license for data specifically developed for the German legal area, which is intended to standardise and simplify licensing for German users.
Guidelines and project planning
How do I create an internal policy for my project or my institute?
The Team Research Data Management provides a guide to support designing an internal policy for handling research data (German only). It provides an overview of the necessary steps from preliminary considerations to implementation and evaluation of a policy. It also contains an overview of possible principles defined therein.
Technische Universität Berlin provides another and very detailed guide resulting from the sub-project "Research Data Policies for Research Projects" of the DFG joint project "FDNext". You can find it here.
The Collaborative Research Centre 1464 TerraQ has created and published a guideline on research data management. It constitutes a good example.
What costs are incurred for research data management?
The extent to which costs for research data management measures arise is different in each project. Three example calculations can be found here. Important influencing factors are the volume of data, the number of files and the degree of homogeneity. The more manual work is required, the higher the labor requirement and thus the costs. The publication of data is free of charge in many repositories. However, larger volumes of data or the use of additional services (such as data preparation and curation) may incur costs.
Integration of RDM topics into teaching
Where do I find materials for the integration of RDM topics into my teaching?
There are numerous digitally available teaching materials on research data management for free reuse. When using the material, pay attention to the licence terms of the material, e.g. whether you have to name the authors in case of a CC-BY licence.
The Team Research Data Management provides various materials in the section "Training, instructions and other materials".
The state initiative Hessian Research Data Infrastructures (HeFDI) has developed a learning module on research data management, the "HeFDI Data Learning Materials" (German only), and made it available for reuse. The customisable learning module can be imported into ILIAS.
When (further) developing your own teaching materials, you can apply the learning objectives matrix on research data management (RDM) as a basis for content and objectives.
Where do I get advice and support?
If you have any questions regarding research data management, feel free to contact the Service Team at Dezernat 4 anytime or write to forschungsdaten(at)uni-hannover.de. Helpful links and further reading are listed here. Please contact the Data Protection Office if you have any questions regarding the handling of personal data.