Speech, Text, Video  08.04.2021

Corpus of American Nordic Speech v.3.1

CANS v.3.1 (Corpus of American Nordic Speech) is a speech corpus with speakers from the USA and Canada speaking Norwegian and Swedish. Most of the informants learnt to speak their Nordic language as …

  • Language: Norwegian Bokmål, Swedish
  • Origin: CLARINO Text Laboratory Centre
  • Licence: CLARIN_ACA-NC-LOC-PRIV-ND-*
Text  08.04.2021

Corpus of American Nordic Speech – downloadable transcriptions

CANS v.3.1 (Corpus of American Nordic Speech) is a speech corpus with speakers from the USA and Canada speaking Norwegian and Swedish. Most of the informants learnt to speak their Nordic language as …

  • Language: Norwegian Bokmål, Swedish
  • Origin: CLARINO Text Laboratory Centre
  • Licence: Creative_Commons-BY-NC-SA (CC-BY-NC-SA)
Speech, Text  06.04.2021

LIA Norwegian – Corpus of historical dialect recordings

LIA Norwegian is a speech corpus with old recordings (1939–1996) from four Norwegian universities: NTNU, UoB, UoO and UoT. The recordings were made mainly for dialect and onomastic research and the …

  • Language: Norwegian, Norwegian Nynorsk
  • Origin: CLARINO Text Laboratory Centre
  • Licence: CLARIN_ACA-NC-LOC-PRIV-ND-*
Text  06.04.2021

The BigBrother Corpus – downloadable transcriptions

The BigBrother Corpus is a speech corpus with recordings from the first season of the BigBrother show, broadcast on Norwegian television by TVNorge in the first half of 2001. The participants in BigBrother …

  • Language: Norwegian, Norwegian Bokmål
  • Origin: CLARINO Text Laboratory Centre
  • Licence: Creative_Commons-BY-NC-SA (CC-BY-NC-SA)
Text  06.04.2021

Transcriptions and selected audio files from LIA Norwegian for download

All transcriptions from LIA Norwegian are downloadable in plain text format. A folder containing 553 transcriptions from LIA Norwegian, in ELAN format, along with their corresponding audio, can …

  • Language: Norwegian, Norwegian Nynorsk
  • Origin: CLARINO Text Laboratory Centre
  • Licence: Creative_Commons-BY-NC-SA (CC-BY-NC-SA)
Lexicon  15.03.2021

Lingit Pronunciation Lexicon for Norwegian Nynorsk

This pronunciation lexicon for Nynorsk was originally developed by Lingit AS for use in their text-to-speech voices, which were first released in 2008. The resource consists of a set of lexical …

  • Language: Norwegian Nynorsk
  • Origin: Language Bank
  • Licence: Creative_Commons-ZERO (CC-ZERO)
Speech, Text, Video  12.03.2021

The BigBrother Corpus

The BigBrother Corpus is a speech corpus with recordings from the first season of the BigBrother show, broadcast on Norwegian television by TVNorge in the first half of 2001. The participants in BigBrother …

  • Language: Norwegian, Norwegian Bokmål
  • Origin: CLARINO Text Laboratory Centre
  • Licence: CLARIN_ACA-NC-LOC-PRIV-ND-*
Text  09.03.2021

The Abkhaz National Corpus

The Abkhaz National Corpus is a comprehensive and open, grammatically annotated text corpus. It makes the Abkhaz language accessible to scientific investigations from various perspectives …

  • Language: Abkhaz
  • Origin: CLARINO Bergen Centre
  • Licence: CLARIN_PUB-BY-NC-ND
Speech, Text, Video  08.03.2021

Norsk talespråkskorpus – Oslodelen

NoTa-Oslo is a speech corpus with interviews and conversations from 166 informants born and raised in Oslo and the Oslo area. The informants are carefully selected with respect to sociolinguistic variables and …

  • Language: Norwegian, Norwegian Bokmål
  • Origin: CLARINO Text Laboratory Centre
  • Licence: CLARIN_ACA-NC-LOC-PRIV-ND-*
Text  03.03.2021

Corona texts from NRK

The «Corona texts from NRK» treebank is a syntactically annotated corpus. It is based on data transcribed from the two newscasts Dagsrevyen and Supernytt, produced by the Norwegian Broadcasting …

  • Language: Norwegian, Norwegian Bokmål
  • Origin: CLARINO Bergen Centre
  • Licence: Creative_Commons-BY (CC-BY)