This is a subset of the Norwegian Newspaper Corpus for Bokmål, automatically annotated with each word’s lemma, part of speech (word class) and morphological analysis. Like the full Norwegian Newspaper Corpus, this annotated subset is freely accessible text and represents modern written Norwegian in the Bokmål variety. It comprises 35 692 210 tokens and covers the period 2001-2009.
Through the Corpuscle search interface, you can search for any running word in the text (token) and also filter by the following attributes: lemma, pos (part of speech), morphology, source (newspaper name), year, date, gender of the author, author and language (Norwegian Bokmål and Nynorsk).
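The attribute-based search described above can be pictured as filtering token records on one or more fields. The following is only a minimal sketch of that idea in Python, not the Corpuscle query language or API: the `Token` class, the `search` helper, the sample records and the tag values (for example `"subst"`) are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical record for one running word and a few of its attributes."""
    word: str
    lemma: str
    pos: str
    source: str  # newspaper code, e.g. "AP" for Aftenposten
    year: int


def search(tokens, **criteria):
    """Return the tokens whose attributes equal every given criterion."""
    return [
        t for t in tokens
        if all(getattr(t, name) == value for name, value in criteria.items())
    ]


# Invented sample data; not taken from the corpus.
SAMPLE = [
    Token("avisene", "avis", "subst", "AP", 2003),
    Token("leser", "lese", "verb", "DB", 2005),
    Token("avis", "avis", "subst", "DB", 2008),
]

by_lemma = search(SAMPLE, lemma="avis")
by_lemma_and_source = search(SAMPLE, lemma="avis", source="DB")
```

Combining criteria narrows the result in the same way that adding attributes to a query would: here `by_lemma` holds both forms of the lemma, while `by_lemma_and_source` keeps only the one from the chosen newspaper.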
Extended metadata
resource Common Info
resource Type: corpus
identification Info
resource Name: Norwegian Newspaper Corpus annotated (2001-2009)
resource Name: Norsk aviskorpus annotert (2001-2009)
description: This is a subset of the Norwegian Newspaper Corpus for Bokmål, automatically annotated with each word’s lemma, part of speech (word class) and morphological analysis. Like the full Norwegian Newspaper Corpus, this annotated subset is freely accessible text and represents modern written Norwegian in the Bokmål variety. It comprises 35 692 210 tokens and covers the period 2001-2009.
Through the Corpuscle search interface, you can search for any running word in the text (token) and also filter by the following attributes: lemma, pos (part of speech), morphology, source (newspaper name), year, date, gender of the author, author and language (Norwegian Bokmål and Nynorsk).
title: The annotated part of the Norwegian Newspaper Corpus (NNC) contains a subset of the full newspaper text in the NNC and includes text from the following newspapers:
AP – Aftenposten
DB – Dagbladet
DT – Dag og tid
NA – Nationen
SO – Sogn avis
linguality Info
linguality Type: monolingual
language Info
language Id: no
language Name: Norwegian
language Info
language Id: nb
language Name: Norwegian Bokmål
modality Info
modality Type: writtenLanguage
size Info
size: 35 692 210
size Unit: tokens
annotation Info
annotation Type: other
annotation Format: Annotated with newspaper title and date.
classification Info
genre Info
genre Type: textGenre
genre: newspapers and magazines
time Coverage Info
time Coverage: 2001 – 2009
dc:type
corpus
dc:title
Norwegian Newspaper Corpus annotated (2001-2009)
dc:identifier
oai:clarino.uib.no:avis
dc:description
This is a subset of the Norwegian Newspaper Corpus for Bokmål, automatically annotated with each word’s lemma, part of speech (word class) and morphological analysis. Like the full Norwegian Newspaper Corpus, this annotated subset is freely accessible text and represents modern written Norwegian in the Bokmål variety. It comprises 35 692 210 tokens and covers the period 2001-2009.
Through the Corpuscle search interface, you can search for any running word in the text (token) and also filter by the following attributes: lemma, pos (part of speech), morphology, source (newspaper name), year, date, gender of the author, author and language (Norwegian Bokmål and Nynorsk).