NoWaC is a web-based corpus of Bokmål Norwegian containing about 700 million tokens. The corpus was built by crawling, downloading and processing web documents in the .no top-level internet domain between November 2009 and January 2010. NoWaC was built with permission from the Norwegian Ministry of Culture (Kulturdepartementet).
There is no information about author, publisher, genre, etc. in the corpus.
NoWaC can be downloaded (scrambled version) or accessed through a search interface (Glossa).