Blog der Hauptbibliothek

Data in the long run: Tips for long-term storage

5 June 2019 | Martina Gosteli


Open, well-documented data formats are key to the long-term availability of data. The general principles are:

Open, non-proprietary file formats are preferable to closed, proprietary ones, and text-based formats are preferable to binary formats. Depending on the research area, this is not always possible.

Certain data formats can be converted/exported into formats that are suitable for long-term archiving:

  • Export Microsoft DOCX files to PDF: In the “Save as” or “Export” dialog, select “Options…” and activate the checkbox “ISO 19005-1 compatible (PDF/A)” (cf. p. 2).

    If you use LaTeX, you can, for example, include the “pdfx” package to create a PDF/A-compliant document.

    Check PDF/A compatibility in Adobe Acrobat Reader DC: A corresponding info icon appears in the left navigation bar. A good alternative is veraPDF (EU-funded).
  • Convert existing PDF files into PDF/A: Either with Adobe Acrobat Professional, which requires a license, or with certain free tools such as the PDF24 Creator.
  • Export Microsoft XLSX files as CSV (Comma-Separated Values): This is useful if you deal with simple numerical tables, e.g. measurement values. You can do this using either the “Save as” or the “Export” dialog.
  • Text encoding: UTF-8 is a good default. It is an encoding of Unicode, a standard that aims to combine the characters and symbols of different writing systems into a single character set.
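
The CSV and UTF-8 points above can also be applied in a script. The following is a minimal sketch in Python using only the standard library. It assumes the table is already available as a list of rows. Reading an XLSX file directly would need a third-party library and is not shown. The function names and the sample measurements are hypothetical and serve only as an illustration.

```python
import csv


def write_csv_utf8(rows, path):
    """Write rows (a list of lists of strings) to a UTF-8 encoded CSV file."""
    # newline="" lets the csv module control the line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as f:
        csv.writer(f).writerows(rows)


def read_csv_utf8(path):
    """Read a UTF-8 encoded CSV file back into a list of rows."""
    with open(path, "r", encoding="utf-8", newline="") as f:
        return [row for row in csv.reader(f)]


# Hypothetical measurement table; the degree sign is a non-ASCII character.
rows = [
    ["sample", "temperature_°C", "value"],
    ["A1", "21.5", "0.42"],
    ["B2", "22.0", "0.57"],
]
```

Calling write_csv_utf8(rows, "measurements.csv") and then read_csv_utf8("measurements.csv") should return the same rows. Stating the encoding explicitly when both writing and reading avoids depending on the default encoding of the operating system.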

The ETH Library provides a general overview of suitable file formats.

Often, complex information can only be represented through the interplay of several parameters and data formats. BIDS (Brain Imaging Data Structure), for example, is based on a predefined naming convention and folder structure, which enables the visualization of multidimensional images in the field of magnetic resonance imaging.
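
To illustrate what such a naming convention looks like, here is a simplified sketch in Python that builds a BIDS-style path from a few parts. The function bids_path is hypothetical and covers only the subject, session and task entities. The actual BIDS specification defines many more entities and rules, so this is an illustration and not a validator.

```python
from pathlib import PurePosixPath


def bids_path(subject, datatype, suffix, session=None, task=None,
              extension=".nii.gz"):
    """Build a simplified BIDS-style relative path (illustration only)."""
    entities = [f"sub-{subject}"]   # key-value pairs used in the file name
    folders = [f"sub-{subject}"]    # folder levels above the data type
    if session is not None:
        entities.append(f"ses-{session}")
        folders.append(f"ses-{session}")
    if task is not None:
        entities.append(f"task-{task}")
    # Entities are joined with underscores; the suffix comes last.
    filename = "_".join(entities + [suffix]) + extension
    return PurePosixPath(*folders, datatype, filename)


print(bids_path("01", "anat", "T1w"))
print(bids_path("01", "func", "bold", session="02", task="rest"))
```

Because the folder and the file name repeat the same information, such as the subject and the session, a file can still be identified after it has been moved out of its folder.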

The Data Services team will be happy to answer any questions you may have about research data. You can reach us via mail and our website.
