Resources from the resource bank Archive - Språkbanken

Nasjonalbiblioteket Språkbanken

NoReC: The Norwegian Review Corpus

While the NoReC dataset was primarily created for training and evaluating models for document-level sentiment analysis, many other use cases are of course possible. The corpus comprises more than …

Distributed by:
CLARINO Bergen Centre
The Level Stress Recordings: NGbr_89 and NGbr_90

Recording equipment: The recordings were made with a cassette recorder (Sony TC-D5M) and Sony lavalier microphones. The recordings took place in the speakers’ homes or in a hotel room. Sigurd …

Distributed by:
CLARINO Bergen Centre
The Level Stress recordings: Vinäs_08

Recording equipment: The recordings were made with a digital recorder (Fostex FR-2LE) and two AKG C451 B microphones placed on the table in front of the speakers. The recording took place in one …

Distributed by:
CLARINO Bergen Centre
The Level Stress Recordings

This collection consists of scripted recordings from different rural dialects spoken in Norway and Sweden, in total 33 recordings of 46 different speakers. The speakers’ year of birth ranges from …

Distributed by:
CLARINO Bergen Centre
Norwegian Sign Language Corpus – Halvorsen (2012)

The Norwegian Sign Language Corpus is a collection of four datasets, collected at different times and for different projects: – The first dataset was collected as part of a doctoral research …

Distributed by:
CLARINO Bergen Centre
Randomized extraction of the New Norwegian corpus

Randomized extraction of the New Norwegian Corpus (Nynorskkorpuset). Contains sentences in New Norwegian (Nynorsk) from the year 2000 and after. Tab-separated, one word per line, lemmatized and …

Distributed by:
CLARINO Bergen Centre
The Kola Peninsula Spoken Corpus (KoPeSC) 1: Spoken Corpus to “Речь поморов Терского берега Белого моря: Звучащая хрестоматия” [“Pomor Speech on the Ter Coast of the White Sea: A spoken anthology”] (Slavica Bergensia 15)

The Kola Peninsula Spoken Corpus (KoPeSC) is a dataset of sound recordings and their transcriptions in ELAN of Pomor Russian dialect speech and of Sámi and Russian speech as spoken by the indigenous …

Distributed by:
CLARINO Bergen Centre
[MCSQ]: The Multilingual Corpus of Survey Questionnaires

The Multilingual Corpus of Survey Questionnaires (MCSQ) is the first publicly available multilingual database composed of international survey texts. Its latest version (Rosalind Franklin) is …

Distributed by:
CLARINO Bergen Centre
Norwegian Sign Language corpus – Depicting Perspective

The Norwegian Sign Language Corpus is a collection of four datasets, collected at different times and for different projects: – The first dataset was collected as part of a doctoral research …

Distributed by:
CLARINO Bergen Centre
WAB XML transcriptions of Wittgenstein’s Nachlass > 1st subset of 5000 pages with license CC BY-NC 3.0

During his lifetime, the Austrian-British philosopher Ludwig Wittgenstein (1889–1951) published only one philosophical book, the Logisch-philosophische Abhandlung / Tractatus logico-philosophicus …

Distributed by:
CLARINO Bergen Centre