Resources from the resource bank Archive - Page 2 of 126 - Språkbanken

National Library of Norway Språkbanken

TeflonNorL2 NOCASA Challenge Dataset

This is a specialized version of the dataset used for the Non-native Children’s Automatic Speech Assessment Challenge (NOCASA), https://teflon.aalto.fi/nocasa-2025/, hosted by the …

Distributed by:
Language Bank
Licence:
all rights reserved
Type:
Speech, Text
Updated:
23.03.2024
Grapheme-to-Phoneme Models for Norwegian Bokmål

This resource contains Grapheme-to-Phoneme (G2P) models for Norwegian Bokmål, which have been adapted to the G2P system Phonetisaurus (https://github.com/AdolfVonKleist/Phonetisaurus). The G2P models …

Distributed by:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Tool
Updated:
09.02.2024
Målfrid 2024 – Freely Available Documents from Norwegian State Institutions

This corpus consists of documents from 497 domains of Norwegian state institutions and comprises approximately 2.6 billion tokens in total. In addition to Norwegian Bokmål and Nynorsk texts, the …

Language:
Norwegian Bokmål, Norwegian Nynorsk, English, Northern Sami, Southern Sami, Lule Sami
Distributed by:
Language Bank
Licence:
Norwegian Licence for Open Government Data (NLOD)
Type:
Text
Updated:
31.01.2024
Glossa

Glossa is a tool for researchers who want to search linguistically annotated corpora. Glossa is designed to make it easy for researchers to: - create complex searches - explore the result via e.g. …

Distributed by:
CLARINO Text Laboratory Centre
Licence:
MIT license
Type:
Tool
Updated:
11.01.2024
NST Norwegian ASR Database (16 kHz) – Reorganized

This database was created by Nordic Language Technology for the development of automatic speech recognition and dictation in Norwegian. In this version (from 2022), the organization of the data has …

Language:
Norwegian
Distributed by:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Speech, Text
Updated:
19.12.2023
Mapping between Norwegian municipalities and dialect regions

This resource provides a mapping between Norwegian municipalities and dialect regions, and can be used, e.g., to infer the dialect region of a speaker in a speech dataset based on their place of …

Distributed by:
Language Bank
Licence:
Creative_Commons-BY (CC-BY)
Type:
Tool
Updated:
20.11.2023
Stortinget Speech Corpus version 1.0

The Stortinget Speech Corpus (SSC) is a speech dataset of more than 5,000 hours for weakly supervised ASR, created from audio and aligned proceedings text from Stortinget, the Norwegian Parliament. It contains …

Language:
Norwegian
Distributed by:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Speech, Text
Updated:
15.11.2023
NDT 2.0 with Constituent Structure

In this version of the Norwegian Dependency Treebank 2.0, a constituent structure (c-structure) similar to the one found in NorGramBank has been added. This can be used to train one syntactic parser for …

Language:
Norwegian Bokmål, Norwegian Nynorsk
Distributed by:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Text
Updated:
27.10.2023
spaCy for Norwegian Nynorsk

These spaCy models are trained on the NorNE dataset in a version compatible with Universal Dependencies. spaCy is a widely used Python library for language technology applications. spaCy does not …

Distributed by:
Language Bank
Licence:
MIT license
Type:
Tool
Updated:
20.10.2023
Norwegian Dependency Treebank 2.0

This is version 2.0 of the Norwegian Dependency Treebank (NDT), developed by the National Library of Norway in 2011-2014. In version 2.0 of NDT, the grammatical annotations remain the same as in the …

Language:
Norwegian Bokmål, Norwegian Nynorsk
Distributed by:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Text
Updated:
24.08.2023