Randomized extraction of the New Norwegian Corpus (Nynorskkorpuset).
Contains sentences in New Norwegian (Nynorsk) from the year 2000 onward. The data is tab-separated with one word per line, lemmatized and morphologically tagged, and includes year and domain information. Annotation was done with the Oslo-Bergen tagger. Sentences in the Bokmål standard have been removed.
This corpus is intended for use in the development of language technology.
Size: 3.3 million sentences, 57.5 million words.
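The tab-separated, one-word-per-line layout described above could be read roughly as in the following Python sketch. The record does not specify the column order or how sentences are delimited, so the sketch assumes blank lines separate sentences and keeps every row as a plain list of fields. The function name `read_sentences` and the sample rows (including the tag strings) are made up for illustration and are not actual Oslo-Bergen tagger output.

```python
import io
from typing import Iterator, List


def read_sentences(stream) -> Iterator[List[List[str]]]:
    """Yield sentences as lists of token rows.

    Each row is the list of tab-separated fields found on one line.
    Assumption: a blank line marks a sentence boundary.
    """
    sentence: List[List[str]] = []
    for line in stream:
        line = line.rstrip("\r\n")
        if not line.strip():
            # Blank line: close the current sentence, if any.
            if sentence:
                yield sentence
                sentence = []
            continue
        sentence.append(line.split("\t"))
    # Emit the last sentence when the input does not end with a blank line.
    if sentence:
        yield sentence


# Invented sample in an assumed "form, lemma, tags" column order.
sample = "Eg\teg\tpron\nles\tlese\tverb pres\n\nHo\tho\tpron\n"
sentences = list(read_sentences(io.StringIO(sample)))
```

For the real files, the same function can be given an open file handle (for example `open(path, encoding="utf-8")`), once the actual column layout and sentence delimiter have been checked against the data.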
Extended metadata
dc:type
dc:title
Randomized extraction of the New Norwegian corpus
dc:identifier
oai:repo.clarino.uib.no:11509/140
dc:description
Randomized extraction of the New Norwegian Corpus (Nynorskkorpuset).
Contains sentences in New Norwegian (Nynorsk) from the year 2000 onward. The data is tab-separated with one word per line, lemmatized and morphologically tagged, and includes year and domain information. Annotation was done with the Oslo-Bergen tagger. Sentences in the Bokmål standard have been removed.
This corpus is intended for use in the development of language technology.
Size: 3.3 million sentences, 57.5 million words.