Skip to content

The NDC Treebank

The NDC Treebank includes 4637 speech segments and 66 042 tokens from the Norwegian part of Nordic Dialect Corpus. The segments are taken from 30 transcribed interviews from 17 places in Norway. The treebank is annotated with morphological and dependency-style syntactic analysis and manually corrected. The treebank is available in two versions: A downloadable version in conllx format and a searchable version in the search interface Glossa.
Nordic Dialect Corpus is a corpus of Norwegian, Swedish, Danish, Faroese, Icelandic and Övdalian spontaneously spoken dialects.

The NDC Treebank includes 4637 speech segments and 66 042 tokens from the Norwegian part of Nordic Dialect Corpus. The segments are taken from 30 transcribed interviews from 17 places in Norway. The treebank is annotated with morphological and dependency-style syntactic analysis and manually corrected. The treebank is available in two versions: A downloadable version in conllx format and a searchable version in the search interface Glossa.
Nordic Dialect Corpus is a corpus of Norwegian, Swedish, Danish, Faroese, Icelandic and Övdalian spontaneously spoken dialects.

Extended metadata

Download resources

Go to resource page