Språkbanken - a language technology resource collection for Norwegian
Språkbanken offers digital language resources for use in research and in the development of language technology. The resources can be downloaded from Språkbanken’s website free of charge. The collection is expanding continuously.
Språkbanken is a service for the part of the ICT industry that develops language-based ICT solutions, for researchers in language technology and linguistics, and for public enterprises that develop electronic solutions for public services. Among other things, Språkbanken contains corpora of written and spoken language, i.e. large collections of text and speech in machine-readable format.
Språkbanken is a national infrastructure initiative designed to ensure that language technology solutions based on the Norwegian language will be developed, and thereby to prevent domain loss for Norwegian in technology-dependent areas (see the white paper Mål og meining – “Language and Meaning”).
The goal of CLARINO is to realise the Norwegian part of the European CLARIN project. The ultimate goal is to make existing and future language resources readily accessible to researchers and to bring e-science to the humanities.
Examples of such resources are text corpora (large collections of digital text), speech collections, digital dictionaries and databases of various types.
CLARIN is all about conservation, reuse, disclosure and sharing of research data within the humanities. The idea is that researchers at European research institutions will gain access, as simply as possible, to research resources in their own and other European countries via a joint search system. The institutions in the CLARIN network will set up services for searching their own resources. Depending on the restrictions imposed on the individual resource (copyright, privacy etc.), the individual researcher will be given access to the resource by authenticating themselves via their home institution.
The primary task of the National Library in CLARINO is to act as the national coordinator for the harvesting and exchange of metadata between the Norwegian CLARINO institutions and to make these metadata available to all institutions in the CLARIN network. The National Library also functions as a content provider of language resources, to the CLARIN network and others.
In Mål og meining – ein heilskapleg norsk språkpolitikk (White paper No. 35, 2007-2008), the Government signalled that Språkbanken should be established.
On assignment from the Ministry of Culture, a working group appointed by the Language Council of Norway prepared a plan for the establishment of Språkbanken in 2008. In 2009, questions of organisation and affiliation were more closely examined, and the National Library was commissioned to realise Språkbanken.
The above-mentioned white paper sets out the overarching goals of Språkbanken. Språkbanken shall build up and secure the quality of digital language resources for Norwegian and ensure that these resources are easily accessible for the language technology industry, linguistic research and education, and for public administration. In this way, the establishment of Språkbanken is a major language policy measure to ensure the development of language technology solutions based on the Norwegian language. Språkbanken is a high-priority action to strengthen the use of Norwegian in an area that is typically dominated by the major world languages.
The need for a publicly available language resource collection was pointed out as early as the 1990s, and preparatory reports have been presented earlier. One document available in English is the following report from 2002:
Consolidating and increasing the availability of Norwegian human language technology resources (pdf) – The Language Council of Norway
The white paper is available (in Norwegian only) at the Government’s website:
To ensure that Språkbanken maintains a continuous contact and dialogue with stakeholders in the market and the research communities, the National Library has appointed an advisory council of professionals.
The Council is an arena for the exchange of information between relevant institutions and stakeholders in the language technology market. It will thus contribute to identifying user needs and help Språkbanken to prioritise between them, and it has a strategic and professional guiding function in the further development and expansion of Språkbanken. The Council shall also ensure that Språkbanken is developed in line with prevailing language policy guidelines.
The Council consists of representatives from the Universities of Bergen, Oslo and Tromsø, the Norwegian University of Science and Technology (Trondheim), the Research Council of Norway, the Norwegian School of Economics (NHH), Telenor, IBM Norway, Companybook, the Language Council of Norway and the National Library. As of August 2012, its members are:
- Companybook: Fredrik Jørgensen
- The Language Council of Norway: Nina Teigland
- The National Library: Jon Arild Olsen
- IBM Norway: Roar Fundingsrud
- The Norwegian School of Economics (NHH): Marita Kristiansen
- Norwegian University of Science and Technology: Torbjørn Svendsen
- The Research Council of Norway: Siri Lader Bruhn
- Telenor: Knut Kvale
- The University of Bergen: Victoria Rosén
- The University of Oslo: Janne Bondi Johannessen
- The University of Tromsø: Trond Trosterud
The Council meets at least once a year.