Corpus of American Nordic Speech – downloadable transcriptions

CANS v.3.1 – Corpus of American Nordic Speech – is a speech corpus of speakers from the USA and Canada speaking Norwegian and Swedish. Most of the informants learnt to speak their Nordic language as children at home. The corpus comprises 268 speakers from 63 places and more than 774 000 tokens in total. It contains both conversations and interviews.

The downloadable version of the corpus contains all transcriptions in the corpus, some in txt format and some in html format. The transcriptions are available in two versions: one phonetic and one orthographic.

CANS v.3.1 includes Norwegian recordings from Janne Bondi Johannessen et al. (2010–2016) together with older recordings and transcriptions from Didrik Arup Seip and Ernst W. Selmer (1931), Einar Haugen (1942) and Arnstein Hjelde (1987, 1990, 1992). The Swedish recordings were collected by Ida Larsson et al. (2011–2014).

Go to resource page http://tekstlab.uio.no/norskiamerika/english/corpus.html

dc:type	corpus
dc:title	Corpus of American Nordic Speech – downloadable transcriptions
dc:identifier	oai:tekstlab.uio.no:cans-transcriptions
dc:description	CANS v.3.1 – Corpus of American Nordic Speech – is a speech corpus of speakers from the USA and Canada speaking Norwegian and Swedish. Most of the informants learnt to speak their Nordic language as children at home. The corpus comprises 268 speakers from 63 places and more than 774 000 tokens in total. It contains both conversations and interviews. The downloadable version of the corpus contains all transcriptions in the corpus, some in txt format and some in html format. The transcriptions are available in two versions: one phonetic and one orthographic. CANS v.3.1 includes Norwegian recordings from Janne Bondi Johannessen et al. (2010–2016) together with older recordings and transcriptions from Didrik Arup Seip and Ernst W. Selmer (1931), Einar Haugen (1942) and Arnstein Hjelde (1987, 1990, 1992). The Swedish recordings were collected by Ida Larsson et al. (2011–2014).
dc:publisher
dc:format	downloadable
dc:date	2010-01-01
dc:date	2019-11-01
dc:rights	Public
dc:rights	Creative Commons (CC)
dc:rights	Creative_Commons-BY-NC-SA (CC-BY-NC-SA)
dc:rights	http://creativecommons.org/licenses/by-nc-sa/4.0/
dc:creator	The Text Laboratory
dc:creator	Ida Larsson
dc:lang	Norwegian Bokmål
dc:lang	Swedish
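
Several fields in the record above (dc:date, dc:rights, dc:creator, dc:lang) occur more than once, and dc:publisher has no value. A minimal sketch of reading such a record into a mapping from field name to a list of values might look like the following. It assumes that each line is a field name followed by a tab and a value; the function name parse_dc_record and the SAMPLE text are hypothetical and not part of the resource itself.

```python
from collections import defaultdict

# Hypothetical sample: a few lines copied from the record above,
# assuming field name and value are separated by a tab character.
SAMPLE = (
    "dc:type\tcorpus\n"
    "dc:publisher\n"
    "dc:date\t2010-01-01\n"
    "dc:date\t2019-11-01\n"
    "dc:lang\tNorwegian Bokmål\n"
    "dc:lang\tSwedish\n"
)


def parse_dc_record(text):
    """Collect 'key<TAB>value' lines into a dict of lists.

    Repeated keys accumulate their values in order; a key with no
    value (such as dc:publisher) maps to an empty list.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        if not line.startswith("dc:"):
            continue  # skip anything that is not a Dublin Core field line
        key, _, value = line.partition("\t")
        key = key.strip()
        value = value.strip()
        if value:
            fields[key].append(value)
        else:
            fields[key]  # touch the key so the empty field is still recorded
    return dict(fields)


record = parse_dc_record(SAMPLE)
print(record["dc:date"])
print(record["dc:publisher"])
```

Keeping every value in a list avoids silently overwriting the first dc:date or dc:lang entry when a later one with the same field name is read.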
