Resources from the resource bank Archive - Språkbanken

National Library of Norway Språkbanken

Innbyggjarnamn (Demonyms)

This dataset contains demonyms from different places in Norway. It is based on the Innbyggjarnamn table from the Språkrådet website: …

Language: Norwegian
Distributed by: Language Bank
Licence: Norwegian Licence for Open Government Data (NLOD)
Type: Lexicon
Updated: 03.11.2025

SNOMED CT – English Terms Translated to Norwegian Bokmål and Norwegian Nynorsk

SNOMED CT is a systematic collection of healthcare-related concepts used to document and share information in patient care within the health and care services. The terminology covers healthcare …

Language: English, Norwegian Bokmål, Norwegian Nynorsk
Distributed by: Language Bank
Licence: Creative_Commons-ZERO (CC-ZERO)
Type: Lexicon
Updated: 03.07.2025

Administrative law concepts in Norwegian sign language

This dataset consists of 32 films with explanations of key administrative law concepts in Norwegian sign language. The films are produced at the Department of Public and International Law, The Faculty …

Language: Norwegian Sign Language
Distributed by: Language Bank
Licence: Creative_Commons-BY-NC (CC-BY-NC)
Type: Text, Video
Updated: 06.05.2025

Norwegian Newspaper Corpus annotated (2001-2009)

This is a subset of the Norwegian Newspaper Corpus for Bokmål, grammatically annotated with each word’s lemma, part of speech (word class) and morphological analysis based on an …

Language: Norwegian, Norwegian Bokmål
Distributed by: CLARINO Bergen Centre
Licence: Creative_Commons-BY-NC (CC-BY-NC)
Type: Text
Updated: 16.04.2025

Norwegian Newspaper Corpus Nynorsk

The Norwegian Newspaper Corpus (Nynorsk) is a freely accessible text corpus representing modern Norwegian in the written variety Norwegian Nynorsk. As of today, the material contains texts from 1998 …

Language: Norwegian, Norwegian Nynorsk
Distributed by: CLARINO Bergen Centre
Licence: Creative_Commons-BY (CC-BY)
Type: Text
Updated: 14.04.2025

Målfrid 2025 – Freely Available Documents from Norwegian State Institutions

This corpus consists of documents from 493 domains of Norwegian state institutions and comprises approximately 2.4 billion tokens in total. In addition to Norwegian Bokmål and Nynorsk texts, the …

Language: Norwegian Bokmål, Norwegian Nynorsk, English, Northern Sami, Southern Sami, Lule Sami
Distributed by: Language Bank
Licence: Norwegian Licence for Open Government Data (NLOD)
Type: Text
Updated: 31.01.2025

Synthetic text images for North, South, Lule and Inari Sámi

This dataset contains synthetic line images meant for fitting OCR models for North, South, Lule and Inari Sámi. Clean line images are created using Pillow and they are subsequently distorted using …

Language:
Distributed by: Language Bank
Licence: Creative_Commons-BY (CC-BY)
Type: Tool
Updated: 28.01.2025

OCR Models for Sámi Languages

This is a collection of models for OCR (optical character recognition) of Sámi languages. These can be used to recognize text in images of printed text (scanned books, magazines, etc.) in North …

Language:
Distributed by: Language Bank
Licence: Creative_Commons-BY (CC-BY)
Type: Tool
Updated: 22.01.2025

Norwegian idioms

This dataset consists of 3537 Norwegian idioms and phrases that appear more than 100 times in the online library of the National Library of Norway. There are 3455 idioms in Norwegian Bokmål and 88 in …

Language: Norwegian Bokmål, Norwegian Nynorsk
Distributed by: Language Bank
Licence: Creative_Commons-ZERO (CC-ZERO)
Type: Text
Updated: 10.10.2024

Norwegian Government Press Conference Speech Corpus

The Norwegian Government Press Conference Speech Corpus (NorGovPCC) consists of approximately 138 hours of speech generated from audio with aligned subtitles from press conferences published by the …

Distributed by: Language Bank
Licence: Norwegian Licence for Open Government Data (NLOD)
Type: Speech
Updated: 10.07.2024