Norwegian Newspaper Corpus annotated (2001-2009)

This is a subset of the Norwegian Newspaper Corpus for Bokmål, automatically annotated with each word’s lemma, part of speech (word class) and morphological analysis. Like the full Norwegian Newspaper Corpus, this annotated subset consists of freely accessible text and represents modern written Norwegian Bokmål. It comprises 35 692 210 tokens and covers the period 2001-2009.
Through the search interface Corpuscle, you can search for any running word (token) in the text and restrict searches by the following attributes: lemma, pos (part of speech), morphology, source (newspaper name), year, date, gender of the author, author and language (Norwegian Bokmål and Nynorsk).

Extended metadata

Download metadata (CMDI XML)

dc:type	corpus
dc:title	Norwegian Newspaper Corpus annotated (2001-2009)
dc:identifier	oai:clarino.uib.no:avis
dc:description	This is a subpart of the Norwegian Newspaper Corpus for bokmål, grammatically annotated with information about each word’s lemma, part of speech (word class) and morphological analysis based on an automatic analysis. Like the full Norwegian Newspaper Corpus, this annotated subpart is freely accessible text and represents modern Norwegian in the written variety Norwegian Bokmål, comprising 35 692 210 tokens and covering the time span 2001-2009. Through the search interface Corpuscle, you may search for all running words in the text (tokens) and also search with the following attributes: lemma, pos (part of speech), morphology, source (newspaper name), year, date, gender of the author, author and language (Norwegian Bokmål and Nynorsk).
dc:publisher
dc:format	accessibleThroughInterface
dc:date	1998
dc:date
dc:rights	Public
dc:rights	Creative Commons (CC)
dc:rights	Creative_Commons-BY-NC (CC-BY-NC)
dc:rights	http://creativecommons.org/licenses/by-nc/4.0/
dc:lang	Norwegian
dc:lang	Norwegian Bokmål
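
The listing above is a set of tab-separated field/value pairs in which a field may repeat (dc:rights, dc:lang) or carry no value (dc:publisher). As a minimal sketch of how such a listing could be read programmatically, the Python below groups values by field name. The function name `parse_dc` and the sample string are illustrative assumptions, not part of the repository or of Corpuscle; the sample lines are copied from the record above.

```python
from collections import defaultdict

# A few lines copied from the record above; field and value are tab-separated.
RECORD = """\
dc:type\tcorpus
dc:identifier\toai:clarino.uib.no:avis
dc:publisher
dc:rights\tPublic
dc:rights\tCreative Commons (CC)
dc:lang\tNorwegian
dc:lang\tNorwegian Bokmål
"""


def parse_dc(text):
    """Group values by field name (hypothetical helper).

    Repeated fields collect all their values in order;
    fields with no value map to an empty list.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, value = line.partition("\t")
        values = fields[name.strip()]  # registers the field even if empty
        if value.strip():
            values.append(value.strip())
    return dict(fields)


record = parse_dc(RECORD)
```

Keeping every field as a list avoids silently dropping the second dc:rights or dc:lang value, and an empty list makes a field with no value visible rather than missing.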
