<OAI-PMH xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.openarchives.org/OAI/2.0/" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/          http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2026-07-03T04:18:25.424Z</responseDate>
  <request verb="GetRecord">https://www.nb.no/sprakbanken/oai</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:nb.no:sbr-63</identifier>
        <datestamp/>
      </header>
      <metadata>
        <cmd:CMD xmlns:cmd="http://www.clarin.eu/cmd/1" xmlns="http://www.clarin.eu/cmd/" xmlns:cmdp="http://www.clarin.eu/cmd/1/profiles/clarin.eu:cr1:p_1407745711925" CMDVersion="1.2" xsi:schemaLocation="http://www.clarin.eu/cmd/1 https://infra.clarin.eu/CMDI/1.x/xsd/cmd-envelop.xsd http://www.clarin.eu/cmd/1/profiles/clarin.eu:cr1:p_1407745711925 https://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/1.1/profiles/clarin.eu:cr1:p_1407745711925/1.2/xsd">
          <cmd:Header>
            <cmd:MdCreator>Arne Martinus Lindstad</cmd:MdCreator>
            <cmd:MdCreationDate>2021-06-22</cmd:MdCreationDate>
            <cmd:MdSelfLink>https://www.nb.no/sprakbanken/oai?verb=GetRecord&amp;identifier=oai:nb.no:sbr-63&amp;metadataPrefix=cmdi</cmd:MdSelfLink>
            <cmd:MdProfile>clarin.eu:cr1:p_1407745711925</cmd:MdProfile>
            <cmd:MdCollectionDisplayName>Språkbanken NB</cmd:MdCollectionDisplayName>
          </cmd:Header>
          <cmd:Resources>
            <cmd:ResourceProxyList>
              <cmd:ResourceProxy id="doffin_tm">
                <cmd:ResourceType mimetype="application/x-gtar">Resource</cmd:ResourceType>
                <cmd:ResourceRef>https://www.nb.no/sbfil/tekst/2020_doffin_tm.tar.gz</cmd:ResourceRef>
              </cmd:ResourceProxy>
              <cmd:ResourceProxy id="doffin_readme">
                <cmd:ResourceType mimetype="application/pdf">Resource</cmd:ResourceType>
                <cmd:ResourceRef>https://www.nb.no/sbfil/dok/2020_doffin_tm.pdf</cmd:ResourceRef>
              </cmd:ResourceProxy>
            </cmd:ResourceProxyList>
            <cmd:JournalFileProxyList/>
            <cmd:ResourceRelationList>
              <cmd:ResourceRelation>
                <cmd:RelationType>describes</cmd:RelationType>
                <cmd:Resource>
                  <cmd:Role>
                    <cmd:Resource>
                      <cmd:Role/>
                    </cmd:Resource>
                  </cmd:Role>
                </cmd:Resource>
              </cmd:ResourceRelation>
            </cmd:ResourceRelationList>
          </cmd:Resources>
          <cmd:IsPartOfList/>
          <cmd:Components>
            <cmdp:corpusProfile>
              <cmdp:resourceCommonInfo>
                <cmdp:resourceType>corpus</cmdp:resourceType>
                <cmdp:identificationInfo>
                  <cmdp:resourceName xml:lang="en">Translation Memory from Doffin</cmdp:resourceName>
                  <cmdp:resourceName xml:lang="nn">Omsetjingsminne frå Doffin</cmdp:resourceName>
                  <cmdp:description xml:lang="en">This corpus contains data from Doffin, the Norwegian web-based database for notices of public procurement and procurement in the utility sector, managed by the Norwegian Agency for Public and Financial Management.

The Language Bank received the data in the form of an XML database dump. The dump consisted of 41,143 document pairs (original and translation). 40,631 of these were translations from Norwegian to English. Only these Norwegian-to-English pairs are included in the corpus. Of the originally Norwegian documents, 39,893 were in Norwegian Bokmål and 736 in Norwegian Nynorsk.

Originals and translations were first aligned at the document level using an internal document identifier. The sentences were then extracted using the NLTK Punkt Sentence Tokenizer and aligned using Hunalign. Exact duplicate translations were discarded.

We recorded a total of 293,649 translation units (TUs) for Norwegian Bokmål to English, and 6,342 TUs for Norwegian Nynorsk to English. A TU is a translation pair consisting of an original text and its aligned translation. It usually corresponds to a more or less meaningful linguistic unit, typically a sentence or a heading, but may also consist of a single word or several clauses. The translation units for the two source languages are distributed as two separate files, both in TMX 1.4 format (a variant of XML).</cmdp:description>
                  <cmdp:description xml:lang="nn">Dette korpuset inneheld data frå Doffin, den nasjonale kunngjeringsbasen for offentlege anskaffingar, forvalta av Direktoratet for Forvaltning og Økonomistyring (DFØ). Språkbanken fekk dataa i form av ein dump av ein XML-database.

Dumpen bestod av 41.143 dokumentpar (originalar og omsetjingar). 40.631 av desse var omsetjingar frå norsk til engelsk. Berre desse er inkluderte i korpuset. Av dei opphavleg norske dokumenta er 39.893 på bokmål og 736 på nynorsk.

Original og omsetjing vart først parallelliserte på dokumentnivå ved hjelp av ein intern dokumentidentifikator, deretter vart setningane identifiserte med NLTK Punkt Sentence Tokenizer og parallelliserte ved å nytte Hunalign. Dupliserte omsetjingar (eksakte duplikat) vart kasserte.

Totalt fann me 293.649 omsetjingseiningar (Translation Units – TU) for bokmål til engelsk, og 6.342 TUar for nynorsk til engelsk. Ein TU er eit omsetjingspar med ei originaltekst og ei parallellstilt omsetjing, og svarar vanlegvis til ei meir eller mindre meiningsberande språkleg eining, typisk ei setning, overskrift eller liknande. Ein TU kan òg bestå av eit enkeltord eller fleire setningar. Omsetjingseiningane for bokmål og nynorsk vert distribuerte som to separate filer, begge i TMX 1.4-format (ein variant av XML).</cmdp:description>
                  <cmdp:url cmd:description="resource homepage">https://www.nb.no/sprakbanken/ressurskatalog/oai-nb-no-sbr-63/</cmdp:url>
                  <cmdp:PID cmd:description="hdl">hdl:21.11146/63</cmdp:PID>
                  <cmdp:identifier>sbr-63</cmdp:identifier>
                </cmdp:identificationInfo>
                <cmdp:distributionInfo>
                  <cmdp:licenceInfo>
                    <cmdp:userCategory>Public</cmdp:userCategory>
                    <cmdp:distributionAccessMedium>downloadable</cmdp:distributionAccessMedium>
                    <cmdp:downloadLocation cmd:description="resource homepage">https://www.nb.no/sprakbanken/ressurskatalog/oai-nb-no-sbr-63/</cmdp:downloadLocation>
                    <cmdp:licence>
                      <cmdp:licenceFamily>Creative Commons (CC)</cmdp:licenceFamily>
                      <cmdp:licenceName>Creative_Commons-ZERO (CC-ZERO)</cmdp:licenceName>
                      <cmdp:licenceURL>https://creativecommons.org/publicdomain/zero/1.0/</cmdp:licenceURL>
                    </cmdp:licence>
                    <cmdp:licensor>
                      <cmdp:actorInfo>
                        <cmdp:actorType>organization</cmdp:actorType>
                        <cmdp:role xml:lang="en">Licensor</cmdp:role>
                        <cmdp:organizationInfo>
                          <cmdp:organizationName xml:lang="en">Norwegian Agency for Public and Financial Management</cmdp:organizationName>
                          <cmdp:organizationName xml:lang="nn">Direktoratet for Forvaltning og Økonomistyring</cmdp:organizationName>
                          <cmdp:organizationShortName xml:lang="en">DFØ</cmdp:organizationShortName>
                          <cmdp:organizationShortName xml:lang="nn">DFØ</cmdp:organizationShortName>
                          <cmdp:departmentName xml:lang="en">Doffin</cmdp:departmentName>
                          <cmdp:departmentName xml:lang="nn">Doffin</cmdp:departmentName>
                        </cmdp:organizationInfo>
                      </cmdp:actorInfo>
                    </cmdp:licensor>
                    <cmdp:distributionRightsHolder>
                      <cmdp:actorInfo>
                        <cmdp:actorType>organization</cmdp:actorType>
                        <cmdp:role xml:lang="en">Distribution Rights Holder</cmdp:role>
                        <cmdp:organizationInfo>
                          <cmdp:organizationName xml:lang="en">National Library of Norway</cmdp:organizationName>
                          <cmdp:organizationName xml:lang="nn">Nasjonalbiblioteket</cmdp:organizationName>
                          <cmdp:organizationShortName xml:lang="en">NLN</cmdp:organizationShortName>
                          <cmdp:organizationShortName xml:lang="nn">NB</cmdp:organizationShortName>
                          <cmdp:departmentName xml:lang="en">The Language Bank</cmdp:departmentName>
                          <cmdp:departmentName xml:lang="nn">Språkbanken</cmdp:departmentName>
                        </cmdp:organizationInfo>
                        <cmdp:communicationInfo>
                          <cmdp:email>sprakbanken@nb.no</cmdp:email>
                          <cmdp:url>https://www.nb.no/sprakbanken/</cmdp:url>
                          <cmdp:address>P.O. Box 2674 Solli</cmdp:address>
                          <cmdp:zipCode>0203</cmdp:zipCode>
                          <cmdp:city>Oslo</cmdp:city>
                          <cmdp:region>Oslo</cmdp:region>
                          <cmdp:country>Norway</cmdp:country>
                        </cmdp:communicationInfo>
                      </cmdp:actorInfo>
                    </cmdp:distributionRightsHolder>
                  </cmdp:licenceInfo>
                </cmdp:distributionInfo>
                <cmdp:contact>
                  <cmdp:actorInfo>
                    <cmdp:actorType>organization</cmdp:actorType>
                    <cmdp:role xml:lang="en">Contact</cmdp:role>
                    <cmdp:organizationInfo>
                      <cmdp:organizationName xml:lang="en">National Library of Norway</cmdp:organizationName>
                      <cmdp:organizationName xml:lang="nn">Nasjonalbiblioteket</cmdp:organizationName>
                      <cmdp:organizationShortName xml:lang="en">NLN</cmdp:organizationShortName>
                      <cmdp:organizationShortName xml:lang="nn">NB</cmdp:organizationShortName>
                      <cmdp:departmentName xml:lang="en">The Language Bank</cmdp:departmentName>
                      <cmdp:departmentName xml:lang="nn">Språkbanken</cmdp:departmentName>
                    </cmdp:organizationInfo>
                    <cmdp:communicationInfo>
                      <cmdp:email>sprakbanken@nb.no</cmdp:email>
                      <cmdp:url>https://www.nb.no/sprakbanken/</cmdp:url>
                      <cmdp:address>P.O. Box 2674 Solli</cmdp:address>
                      <cmdp:zipCode>0203</cmdp:zipCode>
                      <cmdp:city>Oslo</cmdp:city>
                      <cmdp:region>Oslo</cmdp:region>
                      <cmdp:country>Norway</cmdp:country>
                    </cmdp:communicationInfo>
                  </cmdp:actorInfo>
                </cmdp:contact>
                <cmdp:metadataInfo>
                  <cmdp:metadataCreationDate>2020-12-18</cmdp:metadataCreationDate>
                  <cmdp:metadataLanguageName>English</cmdp:metadataLanguageName>
                  <cmdp:metadataLanguageId>en</cmdp:metadataLanguageId>
                  <cmdp:metadataLastDateUpdated>2023-08-07</cmdp:metadataLastDateUpdated>
                  <cmdp:metadataCreator>
                    <cmdp:actorInfo>
                      <cmdp:actorType>person</cmdp:actorType>
                      <cmdp:role xml:lang="en">Metadata Creator</cmdp:role>
                      <cmdp:personInfo>
                        <cmdp:surname xml:lang="nn">Lindstad</cmdp:surname>
                        <cmdp:givenName xml:lang="nn">Arne Martinus</cmdp:givenName>
                        <cmdp:affiliation>
                          <cmdp:organizationInfo>
                            <cmdp:organizationName xml:lang="en">National Library of Norway</cmdp:organizationName>
                            <cmdp:organizationName xml:lang="nn">Nasjonalbiblioteket</cmdp:organizationName>
                            <cmdp:organizationShortName xml:lang="en">NLN</cmdp:organizationShortName>
                            <cmdp:organizationShortName xml:lang="nn">NB</cmdp:organizationShortName>
                            <cmdp:departmentName xml:lang="en">The Language Bank</cmdp:departmentName>
                            <cmdp:departmentName xml:lang="nn">Språkbanken</cmdp:departmentName>
                          </cmdp:organizationInfo>
                        </cmdp:affiliation>
                      </cmdp:personInfo>
                      <cmdp:communicationInfo>
                        <cmdp:email>sprakbanken@nb.no</cmdp:email>
                        <cmdp:url>https://www.nb.no/sprakbanken/</cmdp:url>
                        <cmdp:address>P.O. Box 2674 Solli</cmdp:address>
                        <cmdp:zipCode>0203</cmdp:zipCode>
                        <cmdp:city>Oslo</cmdp:city>
                        <cmdp:region>Oslo</cmdp:region>
                        <cmdp:country>Norway</cmdp:country>
                      </cmdp:communicationInfo>
                    </cmdp:actorInfo>
                  </cmdp:metadataCreator>
                </cmdp:metadataInfo>
                <cmdp:versionInfo>
                  <cmdp:version>2020</cmdp:version>
                  <cmdp:lastDateUpdated>2020-11-04</cmdp:lastDateUpdated>
                </cmdp:versionInfo>
                <cmdp:validationInfo>
                  <cmdp:validated>true</cmdp:validated>
                  <cmdp:validationType>formal</cmdp:validationType>
                  <cmdp:validationMode>automatic</cmdp:validationMode>
                  <cmdp:validationModeDetails>Sentences extracted using the NLTK Punkt Sentence Tokenizer, aligned using Hunalign.</cmdp:validationModeDetails>
                  <cmdp:validationExtent>full</cmdp:validationExtent>
                  <cmdp:validator>
                    <cmdp:actorInfo>
                      <cmdp:actorType>person</cmdp:actorType>
                      <cmdp:role xml:lang="en">Resource Validator</cmdp:role>
                      <cmdp:personInfo>
                        <cmdp:surname xml:lang="nn">Birkenes</cmdp:surname>
                        <cmdp:givenName xml:lang="nn">Magnus Breder</cmdp:givenName>
                        <cmdp:affiliation>
                          <cmdp:organizationInfo>
                            <cmdp:organizationName xml:lang="en">National Library of Norway</cmdp:organizationName>
                            <cmdp:organizationName xml:lang="nn">Nasjonalbiblioteket</cmdp:organizationName>
                            <cmdp:organizationShortName xml:lang="en">NLN</cmdp:organizationShortName>
                            <cmdp:organizationShortName xml:lang="nn">NB</cmdp:organizationShortName>
                            <cmdp:departmentName xml:lang="en">The Language Bank</cmdp:departmentName>
                            <cmdp:departmentName xml:lang="nn">Språkbanken</cmdp:departmentName>
                          </cmdp:organizationInfo>
                        </cmdp:affiliation>
                      </cmdp:personInfo>
                      <cmdp:communicationInfo>
                        <cmdp:email>sprakbanken@nb.no</cmdp:email>
                        <cmdp:url>https://www.nb.no/sprakbanken/</cmdp:url>
                        <cmdp:address>P.O. Box 2674 Solli</cmdp:address>
                        <cmdp:zipCode>0203</cmdp:zipCode>
                        <cmdp:city>Oslo</cmdp:city>
                        <cmdp:region>Oslo</cmdp:region>
                        <cmdp:country>Norway</cmdp:country>
                      </cmdp:communicationInfo>
                    </cmdp:actorInfo>
                  </cmdp:validator>
                </cmdp:validationInfo>
                <cmdp:resourceDocumentationInfo>
                  <cmdp:documentationUnstructured>
                    <cmdp:role>documentation</cmdp:role>
                    <cmdp:documentUnstructured>https://www.nb.no/sbfil/dok/2020_doffin_tm.pdf</cmdp:documentUnstructured>
                  </cmdp:documentationUnstructured>
                </cmdp:resourceDocumentationInfo>
                <cmdp:resourceCreationInfo>
                  <cmdp:creationEndDate>2020-11-04</cmdp:creationEndDate>
                  <cmdp:resourceCreator>
                    <cmdp:actorInfo>
                      <cmdp:actorType>organization</cmdp:actorType>
                      <cmdp:role xml:lang="en">Resource Creator</cmdp:role>
                      <cmdp:organizationInfo>
                        <cmdp:organizationName xml:lang="en">Norwegian Agency for Public and Financial Management</cmdp:organizationName>
                        <cmdp:organizationName xml:lang="nn">Direktoratet for Forvaltning og Økonomistyring</cmdp:organizationName>
                        <cmdp:organizationShortName xml:lang="en">DFØ</cmdp:organizationShortName>
                        <cmdp:organizationShortName xml:lang="nn">DFØ</cmdp:organizationShortName>
                        <cmdp:departmentName xml:lang="en">Doffin</cmdp:departmentName>
                        <cmdp:departmentName xml:lang="nn">Doffin</cmdp:departmentName>
                      </cmdp:organizationInfo>
                    </cmdp:actorInfo>
                    <cmdp:actorInfo>
                      <cmdp:actorType>person</cmdp:actorType>
                      <cmdp:role xml:lang="en">Resource Creator</cmdp:role>
                      <cmdp:personInfo>
                        <cmdp:surname xml:lang="nn">Birkenes</cmdp:surname>
                        <cmdp:givenName xml:lang="nn">Magnus Breder</cmdp:givenName>
                        <cmdp:affiliation>
                          <cmdp:organizationInfo>
                            <cmdp:organizationName xml:lang="en">National Library of Norway</cmdp:organizationName>
                            <cmdp:organizationName xml:lang="nn">Nasjonalbiblioteket</cmdp:organizationName>
                            <cmdp:organizationShortName xml:lang="en">NLN</cmdp:organizationShortName>
                            <cmdp:organizationShortName xml:lang="nn">NB</cmdp:organizationShortName>
                            <cmdp:departmentName xml:lang="en">The Language Bank</cmdp:departmentName>
                            <cmdp:departmentName xml:lang="nn">Språkbanken</cmdp:departmentName>
                          </cmdp:organizationInfo>
                        </cmdp:affiliation>
                      </cmdp:personInfo>
                      <cmdp:communicationInfo>
                        <cmdp:email>sprakbanken@nb.no</cmdp:email>
                        <cmdp:url>https://www.nb.no/sprakbanken/</cmdp:url>
                        <cmdp:address>P.O. Box 2674 Solli</cmdp:address>
                        <cmdp:zipCode>0203</cmdp:zipCode>
                        <cmdp:city>Oslo</cmdp:city>
                        <cmdp:region>Oslo</cmdp:region>
                        <cmdp:country>Norway</cmdp:country>
                      </cmdp:communicationInfo>
                    </cmdp:actorInfo>
                  </cmdp:resourceCreator>
                </cmdp:resourceCreationInfo>
              </cmdp:resourceCommonInfo>
              <cmdp:corpusInfo>
                <cmdp:corpusType>Written Corpus</cmdp:corpusType>
                <cmdp:corpusPartInfo>
                  <cmdp:mediaType>text</cmdp:mediaType>
                  <cmdp:corpusTextInfo>
                    <cmdp:textFormatInfo>
                      <cmdp:mimeType>application/x-tmx+xml</cmdp:mimeType>
                      <cmdp:sizePerTextFormat>
                        <cmdp:sizeInfo>
                          <cmdp:size>299991</cmdp:size>
                          <cmdp:sizeUnit>units</cmdp:sizeUnit>
                        </cmdp:sizeInfo>
                        <cmdp:sizeInfo>
                          <cmdp:size>2</cmdp:size>
                          <cmdp:sizeUnit>files</cmdp:sizeUnit>
                        </cmdp:sizeInfo>
                      </cmdp:sizePerTextFormat>
                    </cmdp:textFormatInfo>
                    <cmdp:characterEncodingInfo>
                      <cmdp:characterEncoding>UTF-8</cmdp:characterEncoding>
                    </cmdp:characterEncodingInfo>
                  </cmdp:corpusTextInfo>
                </cmdp:corpusPartInfo>
                <cmdp:corpusPartGeneralInfo>
                  <cmdp:lingualityInfo>
                    <cmdp:lingualityType>multilingual</cmdp:lingualityType>
                    <cmdp:multilingualityType>parallel</cmdp:multilingualityType>
                    <cmdp:multilingualityTypeDetails>translation memory</cmdp:multilingualityTypeDetails>
                  </cmdp:lingualityInfo>
                  <cmdp:languageInfo>
                    <cmdp:languageId>nb</cmdp:languageId>
                    <cmdp:languageName>Norwegian Bokmål</cmdp:languageName>
                  </cmdp:languageInfo>
                  <cmdp:languageInfo>
                    <cmdp:languageId>nn</cmdp:languageId>
                    <cmdp:languageName>Norwegian Nynorsk</cmdp:languageName>
                  </cmdp:languageInfo>
                  <cmdp:languageInfo>
                    <cmdp:languageId>en</cmdp:languageId>
                    <cmdp:languageName>English</cmdp:languageName>
                  </cmdp:languageInfo>
                  <cmdp:modalityInfo>
                    <cmdp:modalityType>writtenLanguage</cmdp:modalityType>
                  </cmdp:modalityInfo>
                  <cmdp:sizeInfo>
                    <cmdp:size>299991</cmdp:size>
                    <cmdp:sizeUnit>units</cmdp:sizeUnit>
                  </cmdp:sizeInfo>
                  <cmdp:sizeInfo>
                    <cmdp:size>2</cmdp:size>
                    <cmdp:sizeUnit>files</cmdp:sizeUnit>
                  </cmdp:sizeInfo>
                  <cmdp:annotationInfo>
                    <cmdp:annotationType>alignment</cmdp:annotationType>
                    <cmdp:segmentationLevel>sentence</cmdp:segmentationLevel>
                    <cmdp:annotationMode>automatic</cmdp:annotationMode>
                    <cmdp:annotator>
                      <cmdp:actorInfo>
                        <cmdp:actorType>person</cmdp:actorType>
                        <cmdp:role xml:lang="en">Resource Annotator</cmdp:role>
                        <cmdp:personInfo>
                          <cmdp:surname xml:lang="nn">Birkenes</cmdp:surname>
                          <cmdp:givenName xml:lang="nn">Magnus Breder</cmdp:givenName>
                          <cmdp:affiliation>
                            <cmdp:organizationInfo>
                              <cmdp:organizationName xml:lang="en">National Library of Norway</cmdp:organizationName>
                              <cmdp:organizationName xml:lang="nn">Nasjonalbiblioteket</cmdp:organizationName>
                              <cmdp:organizationShortName xml:lang="en">NLN</cmdp:organizationShortName>
                              <cmdp:organizationShortName xml:lang="nn">NB</cmdp:organizationShortName>
                              <cmdp:departmentName xml:lang="en">The Language Bank</cmdp:departmentName>
                              <cmdp:departmentName xml:lang="nn">Språkbanken</cmdp:departmentName>
                            </cmdp:organizationInfo>
                          </cmdp:affiliation>
                        </cmdp:personInfo>
                        <cmdp:communicationInfo>
                          <cmdp:email>sprakbanken@nb.no</cmdp:email>
                          <cmdp:url>https://www.nb.no/sprakbanken/</cmdp:url>
                          <cmdp:address>P.O. Box 2674 Solli</cmdp:address>
                          <cmdp:zipCode>0203</cmdp:zipCode>
                          <cmdp:city>Oslo</cmdp:city>
                          <cmdp:region>Oslo</cmdp:region>
                          <cmdp:country>Norway</cmdp:country>
                        </cmdp:communicationInfo>
                      </cmdp:actorInfo>
                    </cmdp:annotator>
                  </cmdp:annotationInfo>
                </cmdp:corpusPartGeneralInfo>
              </cmdp:corpusInfo>
            </cmdp:corpusProfile>
          </cmd:Components>
        </cmd:CMD>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>