Digitized books in XML format

CLARINO NB – Språkbanken

License: Creative_Commons-ZERO (CC-ZERO)

Updated: 2016-02-03

The National Library is in the process of digitizing its entire collection. This digitization generates a large number of XML files that contain all information about each digitized object: bibliographic metadata, structural analysis (chapterisation, pagination, sections), OCR analysis, etc.

From the links below you can download XML versions of the digitized written material (mostly books) that can be distributed freely. This includes older material as well as recent white papers and similar publications. The material comprises approximately 9,000 titles.

The index provides an overview of the contents of this material. The index is in plain text format (tab-separated) with the following columns:

1. Digibok_ID: This identifier can be used to retrieve each individual title in the data files (e.g. digibok_2009073101106).

2. Year of publication: The first four digits give the year in which the book or publication was published (e.g. 20011231). The last four digits are always 1231. If the first four digits are 9999, the year of publication is unknown.

3. Title: Title of the publication (e.g. Ny livsforsikringslovgivning: utredning nr 7 fra Banklovkommisjonen: utredning fra Banklovkommisjonen oppnevnt ved kongelig resolusjon 6. april 1990: avgitt til Finansdepartementet 29. juni 2001).

4. Author: Name of the author or institution (e.g. Finansdepartementet).

5. Publisher: Publishing house or institution (e.g. Statens forvaltningstjeneste, Informasjonsforvaltning).
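
As an illustration of the column layout above, the following Python sketch reads index rows into records and decodes the year field. It is not an official tool. It assumes that the index has no header row, that fields are not quoted, and that each row has exactly the five columns in the order listed. The names IndexEntry, parse_year and read_index are hypothetical, and the sample rows are assembled from the per-column examples above (the second row is invented to show the 9999 case).

```python
import csv
import io
from typing import List, NamedTuple, Optional, TextIO


class IndexEntry(NamedTuple):
    """One row of the index (hypothetical record type for this sketch)."""
    digibok_id: str
    year: Optional[int]  # None when the year of publication is unknown
    title: str
    author: str
    publisher: str


def parse_year(field: str) -> Optional[int]:
    """Return the year from a value such as '20011231', or None for 9999."""
    year = int(field[:4])  # the trailing '1231' carries no information
    return None if year == 9999 else year


def read_index(stream: TextIO) -> List[IndexEntry]:
    """Parse tab-separated index rows; assumes no header row and no quoting."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    entries = []
    for row in reader:
        if len(row) != 5:
            continue  # skip blank or malformed lines
        digibok_id, year, title, author, publisher = row
        entries.append(
            IndexEntry(digibok_id, parse_year(year), title, author, publisher)
        )
    return entries


# Illustrative rows only; not copied from the real index file.
SAMPLE = (
    "digibok_2009073101106\t20011231\tNy livsforsikringslovgivning\t"
    "Finansdepartementet\tStatens forvaltningstjeneste, Informasjonsforvaltning\n"
    "digibok_0000000000000\t99991231\tUkjent tittel\tUkjent\tUkjent\n"
)

for entry in read_index(io.StringIO(SAMPLE)):
    print(entry.digibok_id, entry.year, entry.author)
```

To read a downloaded index file instead of the sample string, pass an open file object, for example open(path, encoding="utf-8", newline=""), where the path and the UTF-8 encoding are assumptions to check against the actual file.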

