LIA Sápmi – the LIA corpus of Sami dialects

The LIA Sápmi corpus is a speech corpus with recordings from 1960 to 1990 of Sami dialects from the northern parts of Norway, Finland and Sweden. It includes some recordings from NRK Sami Radio and some from UiT, mostly collected by Niels Jernsletten. The interviews and conversations are typically about old trades and traditional life.
The corpus has about 190 000 tokens and 122 speakers from 19 places.
Automatic lemmatization, morphological tagging and translation into Norwegian were done by Giellatekno.
