N-grams for Norwegian Bokmål (based on NNC and NST news text)

CLARINO NB – Språkbanken

License: Creative_Commons-ZERO (CC-ZERO)

Updated: 2016-01-29

These n-grams are derived from the Norwegian Newspaper Corpus (NNC) and a part of the Text Corpus from Nordisk språkteknologi (NST). In total, the source material consists of 1175 million words of running text. This version provides the n-grams in two orderings: sorted alphabetically and sorted by frequency. Frequency lists (unigrams) are published in a separate distribution. A "light" version is also available, listing the 1000 most frequent n-grams.
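
For readers unfamiliar with the terms, the following is a minimal Python sketch of how n-grams can be counted from tokenized text, ordered alphabetically or by frequency, and cut down to a "light" top-k list. It is an illustration only, not the procedure used to build this resource: the function names, the toy sentence, and the whitespace tokenization are all assumptions, and the file format of the actual distribution is not described here.

```python
from collections import Counter


def count_ngrams(tokens, n):
    """Count every run of n consecutive tokens (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sort_alphabetically(counts):
    # Plain code-point order; proper Norwegian collation (æ, ø, å last)
    # would need a locale-aware sort key, which is omitted here.
    return sorted(counts.items(), key=lambda item: item[0])


def sort_by_frequency(counts):
    # Most frequent first; ties are broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def light_version(counts, k=1000):
    """Keep only the k most frequent n-grams."""
    return sort_by_frequency(counts)[:k]


# Toy input; whitespace splitting stands in for real tokenization.
tokens = "det er en bil og det er en båt".split()
bigrams = count_ngrams(tokens, 2)

for ngram, freq in light_version(bigrams, k=3):
    print(" ".join(ngram), freq)
```

The same counts feed both orderings, so the alphabetical and frequency-sorted lists differ only in their sort key.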

