The LIA Sápmi corpus is a speech corpus with recordings from 1960 – 1990 of Sami dialects from the northern parts of Norway, Finland and Sweden. Some recordings come from NRK Sami Radio and some from UiT, mostly collected by Niels Jernsletten. The interviews and conversations are typically about old trades and traditional life.
The corpus has about 190 000 tokens and 122 speakers from 19 places.
Automatic lemmatization, morphological tagging and translation to Norwegian were carried out by Giellatekno.
Extended metadata
resource Common Info:
resource Type: corpus
identification Info:
resource Name: LIA sápmi – Sámegiela hállangiellakorpus
resource Name: LIA sápmi – LIA-korpuset for samiske dialekter
resource Name: LIA sápmi – the LIA corpus of Sami dialects
description: The LIA Sápmi corpus is a speech corpus with recordings from 1960 – 1990 of Sami dialects from the northern parts of Norway, Finland and Sweden. Some recordings come from NRK Sami Radio and some from UiT, mostly collected by Niels Jernsletten. The interviews and conversations are typically about old trades and traditional life.
The corpus has about 190 000 tokens and 122 speakers from 19 places.
Automatic lemmatization, morphological tagging and translation to Norwegian were carried out by Giellatekno.
non Standard Conditions Of Use: The corpus has audio and video recordings classified as personal data. In agreement with NSD, the Data Protection Official in Norway, the corpus is accessible only through Glossa, a search and post-processing tool developed by the Text Laboratory.
The audio excerpts provided by the search interface cannot be presented in public unless you have an agreement with the Text Laboratory.
Please note that every individual researcher is responsible for treating the participants in the corpus with respect and sincerity. Furthermore, the participants must be kept anonymous in every published paper or other output.
licensor:
actor Info:
actor Type: organization
organization Info:
organization Name: University of Oslo
organization Name: Universitetet i Oslo
organization Short Name: UiO
organization Short Name: UoO
department Name: Department of Linguistics and Scandinavian Studies
department Name: Institutt for lingvistiske og nordiske studier (ILN)
unstandardised Genre: conversations and informal interviews
classification Info:
genre Info:
genre Type: speechGenre
genre: semi formal
unstandardised Genre: interviews
time Coverage Info:
time Coverage: 1960 – 1987
geographic Coverage Info:
geographic Coverage: Sami areas in northern Norway, Finland and Sweden
recording Info:
recording Device Type: tapeVHS
recording Environment: other
dc:type
corpus
dc:title
LIA sápmi – the LIA corpus of Sami dialects
dc:identifier
oai:tekstlab.uio.no:lia-sapmi
dc:description
The LIA Sápmi corpus is a speech corpus with recordings from 1960 – 1990 of Sami dialects from the northern parts of Norway, Finland and Sweden. Some recordings come from NRK Sami Radio and some from UiT, mostly collected by Niels Jernsletten. The interviews and conversations are typically about old trades and traditional life.
The corpus has about 190 000 tokens and 122 speakers from 19 places.
Automatic lemmatization, morphological tagging and translation to Norwegian were carried out by Giellatekno.