Publishing linguistic research data with LaRS

The “Language Repository of Switzerland” – “LaRS” for short – has been online since September 2022. Linguistics researchers can now archive and publish their research data free of charge with LaRS. How does it work? It’s very simple:

Get started with a SWITCH edu-ID

As a linguistics researcher whose university is represented in the CLARIN CH consortium, you can log in to SWISSUbase with your SWITCH edu-ID. When you create a project, select the subject area “Linguistics”.

“Fast Data Deposit” with FAIR metadata

To meet the requirements of research funding bodies, LaRS supports researchers in publishing data according to the FAIR principles (findable, accessible, interoperable and reusable). Two different publication workflows are implemented on SWISSUbase:

  • Fast Data Deposit: You can upload datasets to SWISSUbase in six simple steps, specifying only the minimum required project and dataset metadata. Data can also be added to an existing project in this way. Before completing the deposit, you can either add further metadata (recommended) to increase visibility and findability in the catalogue, or submit the datasets with the minimum information.
  • Create New Study: Here you work with the standard interface. Fill in the project metadata and add datasets to your project. In the two-step deposit process, you can first publish a project and only then add further datasets – or do both at the same time.

FAIR metadata includes discipline-specific metadata. To meet this criterion, complete at least the mandatory “Language” metadata block. The following metadata blocks are provided by LaRS:

  • Language: Metadata about the language or languages of the data. This includes the language name with its ISO code or the linguality type.
  • Annotation: Metadata about the annotation method used, for example the annotation type and controlled vocabularies.
  • Text, audio, video and image: Each of these data types has its own metadata block.
  • Tools: Here you can record which software and applications were used to generate, process and annotate the data.
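As a rough illustration of what the “Language” block captures, the following Python sketch models one entry and checks it for obvious gaps. The class name, field names and the list of linguality types are assumptions made for this example, as is the choice of three-letter ISO 639-3 codes; none of this is the actual SWISSUbase schema.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Assumed vocabulary for the linguality type; the list used by LaRS may differ.
LINGUALITY_TYPES = {"monolingual", "bilingual", "multilingual"}


@dataclass(frozen=True)
class LanguageBlock:
    """Hypothetical stand-in for one entry in the "Language" metadata block."""

    name: str
    iso_code: str  # assumed to be an ISO 639-3 code such as "gsw"
    linguality_type: str

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means the entry looks complete."""
        issues = []
        if not self.name.strip():
            issues.append("language name is empty")
        if not re.fullmatch(r"[a-z]{3}", self.iso_code):
            issues.append("ISO code is not three lowercase letters")
        if self.linguality_type not in LINGUALITY_TYPES:
            issues.append("unknown linguality type")
        return issues


# "gsw" is the ISO 639-3 code that covers Swiss German.
block = LanguageBlock(name="Swiss German", iso_code="gsw", linguality_type="monolingual")
print(block.problems())
```

A check like this only catches formal gaps. Whether the metadata is plausible for the data is a separate question.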

Publish data, but for whom?

“Sharing is caring”, but it is not always possible to share everything with everyone. Therefore, you can publish your data in different ways:

  • CC licences: You can choose from all available Creative Commons licences. Interested parties can then download your data from SWISSUbase without access restrictions, subject to the terms of the licence you selected.
  • “Closed contract without prior agreement”: In some cases you may not want to make the data freely available for download, for example if personal or sensitive data are involved, or if the data may only be used for scientific and/or educational purposes. This option requires interested parties to log in to SWISSUbase first. Only after specifying a reason and accepting the “Closed contract” can they download the data.
  • “Closed contract with prior agreement”: In addition, you can specify that you will only release your data upon request. You decide in each case whether you offer the data to a specific user for download – or not.
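The three access options can be read as a small decision rule. The Python sketch below is a simplified model of the behaviour described in the list, not SWISSUbase code: all names are hypothetical, and it assumes that the login, reason and contract steps also apply when prior agreement is required.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    """Hypothetical labels for the three publication options."""

    CC_LICENCE = "cc-licence"
    CLOSED_WITHOUT_PRIOR_AGREEMENT = "closed-without-prior-agreement"
    CLOSED_WITH_PRIOR_AGREEMENT = "closed-with-prior-agreement"


@dataclass
class DownloadRequest:
    """What is known about the person asking for the data (illustrative fields)."""

    logged_in: bool = False
    reason: str = ""
    contract_accepted: bool = False
    approved_by_depositor: bool = False


def may_download(access: Access, request: DownloadRequest) -> bool:
    """Return True if the request satisfies the conditions of the access option."""
    if access is Access.CC_LICENCE:
        # CC-licensed data can be downloaded without logging in.
        return True
    # Both closed options need a login, a stated reason and an accepted contract.
    contract_ok = (
        request.logged_in
        and bool(request.reason.strip())
        and request.contract_accepted
    )
    if access is Access.CLOSED_WITHOUT_PRIOR_AGREEMENT:
        return contract_ok
    # With prior agreement, the depositor additionally decides case by case.
    return contract_ok and request.approved_by_depositor


request = DownloadRequest(logged_in=True, reason="teaching", contract_accepted=True)
print(may_download(Access.CLOSED_WITHOUT_PRIOR_AGREEMENT, request))
print(may_download(Access.CLOSED_WITH_PRIOR_AGREEMENT, request))
```

In this model the same request succeeds under the second option but not under the third, because the depositor has not yet approved it.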

Journals often require early access to the data. SWISSUbase has developed the “Share by URL” function specifically for this purpose, so that third parties can view the data in read-only mode even before you have published them. You decide how long the link remains valid.

Your data: curated, addressed, published

As soon as you submit your datasets to LaRS, a team of data curators checks the plausibility of the metadata entered. Random samples of the submitted data are checked for readability. If necessary, the data curators will send you a list of suggested changes. They may, however, make minor changes without consulting you.

In order to meet scientific requirements and the FAIR principles, each dataset is assigned a persistent identifier (DOI) which you can use to cite the dataset.
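A DOI is usually cited as a link through the public resolver at https://doi.org/. The short Python sketch below turns a DOI in any of its common spellings into such a link. The function name is made up for this example, and the DOI shown is a placeholder, not a real LaRS dataset.

```python
def doi_to_url(doi: str) -> str:
    """Normalise a DOI string and return its https://doi.org/ resolver link."""
    cleaned = doi.strip()
    # Strip the common ways a DOI is prefixed in citations.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # DOI names begin with the directory indicator "10.".
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


# Placeholder DOI for illustration only.
print(doi_to_url("doi:10.1234/example-dataset"))
```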

Once the data curation is complete, your metadata can be found in the SWISSUbase catalogue and is stored exclusively in Switzerland. In the future (from around mid-2023), your datasets will also be made visible with metadata in the European catalogue of the linguistic research network CLARIN.

Further information

You can find more information on our website and on SWISSUbase. You are also welcome to write to us at swissubase@ub.uzh.ch if you have questions or suggestions, or if you are interested in using LaRS and SWISSUbase.

Florian Steurer, Team Open Science Services