Norwegian Newspaper Corpus annotated (2001-2009)

CLARINO UiB - Corpuscle

License: Creative Commons BY (CC-BY)

Updated: 2016-02-12

This is a subset of Norsk aviskorpus, grammatically annotated and classified. It comprises 35 692 210 tokens and covers Norwegian Bokmål for the period 2001-2009.

The full Norwegian Newspaper Corpus (NNC) Bokmål version is a large monitor corpus representing contemporary Norwegian in the written varieties Norwegian Bokmål and Norwegian Nynorsk.

The corpus is compiled through daily harvesting and processing of published texts from the web editions of Norwegian newspapers.

The annotated version available in Corpuscle is annotated with newspaper title and date, and covers the years 2001 to 2009.

To search the full corpus up to today's date, go to: That search interface has fewer functions than Corpuscle, but allows you to search the newest material. The Corpuscle material will be updated regularly.

For questions about the contents of the corpus, contact Knut Hofland (Uni Research Computing). For questions related to the Corpuscle version, contact Paul Meurer (Uni Research Computing). See Contact information in metadata.

To read more about the corpus, cf.

