Norwegian Newspaper Corpus Bokmål

The Norwegian Newspaper Corpus (NNC) Bokmål version is a large monitor corpus representing contemporary Norwegian language in the written variety Norwegian Bokmål.
A corresponding corpus is available for Norwegian Nynorsk; see the URL in the metadata.
The corpus is compiled through daily harvesting and processing of texts published in the web editions of Norwegian newspapers.

The version available in Corpuscle is annotated with newspaper title and date, and spans from October 1998 to May 2020.

To search the full corpus up to today’s date, go to: http://avis.uib.no/avis/sok/copy_of_sok-i-hele-korpuset. That search interface has fewer functions than Corpuscle, but it allows you to search the newest material. The Corpuscle material will be updated regularly.

For questions about the contents of the corpus, contact Knut Hofland (Uni Research Computing). For questions related to the Corpuscle version, contact Paul Meurer (Uni Research Computing). See Contact information in metadata.

To refer to the Norwegian Newspaper Corpus, we suggest the following references:
Norwegian Newspaper Corpus Bokmål. 2020. Created by the project Norsk aviskorpus. Distributed by the CLARINO Bergen Centre: hdl:11495/D9B5-0349-4330-0

Andersen, Gisle, and Knut Hofland. 2012. “Building a Large Corpus Based on Newspapers from the Web.” In Exploring Newspaper Language: Using the Web to Create and Investigate a Large Corpus of Modern Norwegian, edited by Gisle Andersen, 1–28. Studies in Corpus Linguistics 49. Amsterdam/Philadelphia: John Benjamins Publishing Company.
