Corpus of American Nordic Speech – downloadable transcriptions

CANS v.3.1 – Corpus of American Nordic Speech – is a speech corpus of speakers from the USA and Canada speaking Norwegian and Swedish. Most of the informants learnt to speak their Nordic language as children at home. The corpus covers 268 speakers from 63 places and contains more than 774 000 tokens in total. It includes both conversations and interviews.

The downloadable version of the corpus contains all transcriptions in the corpus, some in txt format and some in html format. The transcriptions are available in two versions: one phonetic and one orthographic.

CANS v.3.1 includes Norwegian recordings from Janne Bondi Johannessen et al. (2010 – 2016) together with older recordings and transcriptions from Didrik Arup Seip and Ernst W. Selmer (1931), Einar Haugen (1942) and Arnstein Hjelde (1987, 1990, 1992). The Swedish recordings were collected by Ida Larsson et al. (2011 – 2014).
