<OAI-PMH xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.openarchives.org/OAI/2.0/" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2026-05-29T23:05:08.942Z</responseDate>
  <request verb="GetRecord">https://www.nb.no/sprakbanken/oai</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:nb.no:sbr-91</identifier>
        <datestamp/>
      </header>
      <metadata>
        <cmd:CMD xmlns:cmd="http://www.clarin.eu/cmd/1" xmlns="http://www.clarin.eu/cmd/" xmlns:cmdp="http://www.clarin.eu/cmd/1/profiles/clarin.eu:cr1:p_1407745711925" CMDVersion="1.2" xsi:schemaLocation="http://www.clarin.eu/cmd/1 https://infra.clarin.eu/CMDI/1.x/xsd/cmd-envelop.xsd http://www.clarin.eu/cmd/1/profiles/clarin.eu:cr1:p_1407745711925 https://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/1.1/profiles/clarin.eu:cr1:p_1407745711925/1.2/xsd">
          <cmd:Header>
            <cmd:MdCreator>Arne Martinus Lindstad</cmd:MdCreator>
            <cmd:MdCreationDate>2023-11-16</cmd:MdCreationDate>
            <cmd:MdSelfLink>https://www.nb.no/sprakbanken/oai?verb=GetRecord&amp;identifier=oai:nb.no:sbr-91&amp;metadataPrefix=cmdi</cmd:MdSelfLink>
            <cmd:MdProfile>clarin.eu:cr1:p_1407745711925</cmd:MdProfile>
            <cmd:MdCollectionDisplayName>Språkbanken NB</cmd:MdCollectionDisplayName>
          </cmd:Header>
          <cmd:Resources>
            <cmd:ResourceProxyList>
              <cmd:ResourceProxy id="sbr-91_data">
                <cmd:ResourceType mimetype="application/x-gtar">Resource</cmd:ResourceType>
                <cmd:ResourceRef>https://www.nb.no/sbfil/talegjenkjenning/ssc/ssc_v1_0.tar.gz</cmd:ResourceRef>
              </cmd:ResourceProxy>
              <cmd:ResourceProxy id="sbr-91_md">
                <cmd:ResourceType mimetype="text/markdown">Resource</cmd:ResourceType>
                <cmd:ResourceRef>https://www.nb.no/sbfil/talegjenkjenning/ssc/ssc_v1_0_dataset_card.md</cmd:ResourceRef>
              </cmd:ResourceProxy>
              <cmd:ResourceProxy id="sbr-91_pdf">
                <cmd:ResourceType mimetype="application/pdf">Resource</cmd:ResourceType>
                <cmd:ResourceRef>https://www.nb.no/sbfil/talegjenkjenning/ssc/SSC_1.pdf</cmd:ResourceRef>
              </cmd:ResourceProxy>
            </cmd:ResourceProxyList>
            <cmd:JournalFileProxyList/>
            <cmd:ResourceRelationList>
              <cmd:ResourceRelation>
                <cmd:RelationType>describes</cmd:RelationType>
                <cmd:Resource>
                  <cmd:Role>
                    <cmd:Resource>
                      <cmd:Role/>
                    </cmd:Resource>
                  </cmd:Role>
                </cmd:Resource>
              </cmd:ResourceRelation>
              <cmd:ResourceRelation>
                <cmd:RelationType>describes</cmd:RelationType>
                <cmd:Resource>
                  <cmd:Role>
                    <cmd:Resource>
                      <cmd:Role/>
                    </cmd:Resource>
                  </cmd:Role>
                </cmd:Resource>
              </cmd:ResourceRelation>
            </cmd:ResourceRelationList>
          </cmd:Resources>
          <cmd:IsPartOfList/>
          <cmd:Components>
            <cmdp:corpusProfile>
              <cmdp:resourceCommonInfo>
                <cmdp:resourceType>corpus</cmdp:resourceType>
                <cmdp:identificationInfo>
                  <cmdp:resourceName xml:lang="en">Stortinget Speech Corpus version 1.0</cmdp:resourceName>
                  <cmdp:resourceName xml:lang="nn">Stortinget Speech Corpus versjon 1.0</cmdp:resourceName>
                  <cmdp:description xml:lang="en">The Stortinget Speech Corpus (SSC) is a speech dataset of more than 5000 hours for weakly supervised ASR, created from audio and aligned proceedings text from Stortinget, the Norwegian Parliament. It contains speech segments of up to 30 seconds, with transcriptions in Norwegian Bokmål (nob) and Norwegian Nynorsk (nno) taken from the official proceedings.

The dataset is distributed as a JSONL file. Audio files, proceedings files and transcription files (with ASR output) are included in this repository and are referenced by relative file paths in the JSONL file. Note that only segmented audio files are part of the release.

Dataset statistics
- Number of segments: 724 783
- Total duration in hours: 5 190
- Number of unique speakers: 729

For more detailed information, see the documentation files.</cmdp:description>
                  <cmdp:description xml:lang="nn">Stortinget Speech Corpus (SSC) er eit taledatasett på meir enn 5000 timar for svakt overvaka taleattkjenning laga av lydopptak og tekst frå Stortingsforhandlingane. Det inneheld taleeiningar på inntil 30 sekund med transkripsjonar på bokmål og nynorsk frå dei offisielle Stortingsforhandlingane.

Datasettet vert distribuert som ei JSONL-fil. Lydfiler, tekstfiler og transkripsjonsfiler (med output frå taleattkjenninga) er inkluderte i datasettet, linka med relative filstiar i JSONL-fila. Merk at berre segmenterte lydfiler er del av korpuset.

Statistikk
- Antal segment: 724 783
- Total varigheit i timar: 5 190
- Antal unike talarar: 729

For meir detaljert informasjon, sjå dokumentasjonsfilene.</cmdp:description>
                  <cmdp:resourceShortName xml:lang="en">SSC</cmdp:resourceShortName>
                  <cmdp:resourceShortName xml:lang="nn">SSC</cmdp:resourceShortName>
                  <cmdp:url cmd:description="resource homepage">https://www.nb.no/sprakbanken/ressurskatalog/oai-nb-no-sbr-91/</cmdp:url>
                  <cmdp:PID cmd:description="handle">hdl:21.11146/91</cmdp:PID>
                  <cmdp:identifier>sbr-91</cmdp:identifier>
                </cmdp:identificationInfo>
                <cmdp:distributionInfo>
                  <cmdp:licenceInfo>
                    <cmdp:userCategory>Public</cmdp:userCategory>
                    <cmdp:distributionAccessMedium>downloadable</cmdp:distributionAccessMedium>
                    <cmdp:downloadLocation cmd:description="resource homepage">https://www.nb.no/sprakbanken/ressurskatalog/oai-nb-no-sbr-91/</cmdp:downloadLocation>
                    <cmdp:licence>
                      <cmdp:licenceFamily>Creative Commons (CC)</cmdp:licenceFamily>
                      <cmdp:licenceName>Creative_Commons-ZERO (CC-ZERO)</cmdp:licenceName>
                      <cmdp:licenceURL>https://creativecommons.org/publicdomain/zero/1.0/</cmdp:licenceURL>
                    </cmdp:licence>
                    <cmdp:licensor>
                      <cmdp:actorInfo>
                        <cmdp:actorType>organization</cmdp:actorType>
                        <cmdp:role xml:lang="en">Licensor</cmdp:role>
                        <cmdp:organizationInfo>
                          <cmdp:organizationName xml:lang="en">National Library of Norway</cmdp:organizationName>
                          <cmdp:organizationName xml:lang="nn">Nasjonalbiblioteket</cmdp:organizationName>
                          <cmdp:organizationShortName xml:lang="en">NLN</cmdp:organizationShortName>
                          <cmdp:organizationShortName xml:lang="nn">NB</cmdp:organizationShortName>
                        </cmdp:organizationInfo>
                        <cmdp:communicationInfo>
                          <cmdp:email>sprakbanken@nb.no</cmdp:email>
                          <cmdp:email>ai-lab@nb.no</cmdp:email>
                          <cmdp:url>https://www.nb.no/sprakbanken/</cmdp:url>
                          <cmdp:url>https://ai.nb.no</cmdp:url>
                          <cmdp:address>P.O. Box 2674 Solli</cmdp:address>
                          <cmdp:zipCode>0203</cmdp:zipCode>
                          <cmdp:city>Oslo</cmdp:city>
                          <cmdp:region>Oslo</cmdp:region>
                          <cmdp:country>Norway</cmdp:country>
                        </cmdp:communicationInfo>
                      </cmdp:actorInfo>
                    </cmdp:licensor>
                  </cmdp:licenceInfo>
                </cmdp:distributionInfo>
                <cmdp:contact>
                  <cmdp:actorInfo>
                    <cmdp:actorType>organization</cmdp:actorType>
                    <cmdp:role xml:lang="en">Contact</cmdp:role>
                    <cmdp:organizationInfo>
                      <cmdp:organizationName xml:lang="en">National Library of Norway</cmdp:organizationName>
                      <cmdp:organizationName xml:lang="nn">Nasjonalbiblioteket</cmdp:organizationName>
                      <cmdp:organizationShortName xml:lang="en">NLN</cmdp:organizationShortName>
                      <cmdp:organizationShortName xml:lang="nn">NB</cmdp:organizationShortName>
                      <cmdp:departmentName xml:lang="en">The Language Bank</cmdp:departmentName>
                      <cmdp:departmentName xml:lang="nn">Språkbanken</cmdp:departmentName>
                    </cmdp:organizationInfo>
                    <cmdp:communicationInfo>
                      <cmdp:email>sprakbanken@nb.no</cmdp:email>
                      <cmdp:url>https://www.nb.no/sprakbanken/</cmdp:url>
                      <cmdp:address>P.O. Box 2674 Solli</cmdp:address>
                      <cmdp:zipCode>0203</cmdp:zipCode>
                      <cmdp:city>Oslo</cmdp:city>
                      <cmdp:region>Oslo</cmdp:region>
                      <cmdp:country>Norway</cmdp:country>
                    </cmdp:communicationInfo>
                  </cmdp:actorInfo>
                </cmdp:contact>
                <cmdp:metadataInfo>
                  <cmdp:metadataCreationDate>2023-11-16</cmdp:metadataCreationDate>
                  <cmdp:metadataLanguageName>English</cmdp:metadataLanguageName>
                  <cmdp:metadataLanguageName>Norwegian Nynorsk</cmdp:metadataLanguageName>
                  <cmdp:metadataLanguageId>en</cmdp:metadataLanguageId>
                  <cmdp:metadataLanguageId>nn</cmdp:metadataLanguageId>
                  <cmdp:metadataLastDateUpdated>2024-01-12</cmdp:metadataLastDateUpdated>
                  <cmdp:metadataCreator>
                    <cmdp:actorInfo>
                      <cmdp:actorType>organization</cmdp:actorType>
                      <cmdp:role xml:lang="en">Metadata Creator</cmdp:role>
                      <cmdp:organizationInfo>
                        <cmdp:organizationName xml:lang="en">National Library of Norway</cmdp:organizationName>
                        <cmdp:organizationName xml:lang="nn">Nasjonalbiblioteket</cmdp:organizationName>
                        <cmdp:organizationShortName xml:lang="en">NLN</cmdp:organizationShortName>
                        <cmdp:organizationShortName xml:lang="nn">NB</cmdp:organizationShortName>
                        <cmdp:departmentName xml:lang="en">The Language Bank</cmdp:departmentName>
                        <cmdp:departmentName xml:lang="nn">Språkbanken</cmdp:departmentName>
                      </cmdp:organizationInfo>
                      <cmdp:communicationInfo>
                        <cmdp:email>sprakbanken@nb.no</cmdp:email>
                        <cmdp:url>https://www.nb.no/sprakbanken/</cmdp:url>
                        <cmdp:address>P.O. Box 2674 Solli</cmdp:address>
                        <cmdp:zipCode>0203</cmdp:zipCode>
                        <cmdp:city>Oslo</cmdp:city>
                        <cmdp:region>Oslo</cmdp:region>
                        <cmdp:country>Norway</cmdp:country>
                      </cmdp:communicationInfo>
                    </cmdp:actorInfo>
                  </cmdp:metadataCreator>
                </cmdp:metadataInfo>
                <cmdp:versionInfo>
                  <cmdp:version>1.0</cmdp:version>
                  <cmdp:lastDateUpdated>2023-11-15</cmdp:lastDateUpdated>
                </cmdp:versionInfo>
                <cmdp:validationInfo>
                  <cmdp:validated>false</cmdp:validated>
                </cmdp:validationInfo>
                <cmdp:resourceDocumentationInfo>
                  <cmdp:documentationUnstructured>
                    <cmdp:role>documentation</cmdp:role>
                    <cmdp:documentUnstructured>https://www.nb.no/sbfil/talegjenkjenning/ssc/SSC_1.pdf</cmdp:documentUnstructured>
                  </cmdp:documentationUnstructured>
                </cmdp:resourceDocumentationInfo>
                <cmdp:resourceCreationInfo>
                  <cmdp:creationStartDate>2019-08-01</cmdp:creationStartDate>
                  <cmdp:creationEndDate>2023-11-15</cmdp:creationEndDate>
                  <cmdp:resourceCreator>
                    <cmdp:actorInfo>
                      <cmdp:actorType>organization</cmdp:actorType>
                      <cmdp:role xml:lang="en">Resource Creator</cmdp:role>
                      <cmdp:organizationInfo>
                        <cmdp:organizationName xml:lang="en">National Library of Norway</cmdp:organizationName>
                        <cmdp:organizationName xml:lang="nn">Nasjonalbiblioteket</cmdp:organizationName>
                        <cmdp:organizationShortName xml:lang="en">NLN</cmdp:organizationShortName>
                        <cmdp:organizationShortName xml:lang="nn">NB</cmdp:organizationShortName>
                        <cmdp:departmentName xml:lang="en">The Language Bank / The AI-lab</cmdp:departmentName>
                        <cmdp:departmentName xml:lang="nn">Språkbanken / AI-laben</cmdp:departmentName>
                      </cmdp:organizationInfo>
                      <cmdp:communicationInfo>
                        <cmdp:email>sprakbanken@nb.no</cmdp:email>
                        <cmdp:email>ai-lab@nb.no</cmdp:email>
                        <cmdp:url>https://www.nb.no/sprakbanken/</cmdp:url>
                        <cmdp:url>https://ai-lab.nb.no/</cmdp:url>
                        <cmdp:address>P.O. Box 2674 Solli</cmdp:address>
                        <cmdp:zipCode>0203</cmdp:zipCode>
                        <cmdp:city>Oslo</cmdp:city>
                        <cmdp:region>Oslo</cmdp:region>
                        <cmdp:country>Norway</cmdp:country>
                      </cmdp:communicationInfo>
                    </cmdp:actorInfo>
                    <cmdp:actorInfo>
                      <cmdp:actorType>organization</cmdp:actorType>
                      <cmdp:role xml:lang="en">Resource Creator</cmdp:role>
                      <cmdp:organizationInfo>
                        <cmdp:organizationName xml:lang="en">Norwegian University of Science and Technology</cmdp:organizationName>
                        <cmdp:organizationName xml:lang="nn">Noregs teknisk-naturvitskaplege universitet</cmdp:organizationName>
                        <cmdp:organizationShortName xml:lang="en">NTNU</cmdp:organizationShortName>
                        <cmdp:organizationShortName xml:lang="nn">NTNU</cmdp:organizationShortName>
                        <cmdp:departmentName xml:lang="en">Department of Electronic Systems</cmdp:departmentName>
                        <cmdp:departmentName xml:lang="nn">Institutt for elektroniske system</cmdp:departmentName>
                      </cmdp:organizationInfo>
                    </cmdp:actorInfo>
                  </cmdp:resourceCreator>
                </cmdp:resourceCreationInfo>
              </cmdp:resourceCommonInfo>
              <cmdp:corpusInfo>
                <cmdp:corpusType>Multimodal Corpus</cmdp:corpusType>
                <cmdp:corpusPartInfo>
                  <cmdp:mediaType>audio</cmdp:mediaType>
                  <cmdp:corpusAudioInfo>
                    <cmdp:audioSizeInfo>
                      <cmdp:sizeInfo>
                        <cmdp:size>5190</cmdp:size>
                        <cmdp:sizeUnit>hours</cmdp:sizeUnit>
                      </cmdp:sizeInfo>
                      <cmdp:sizeInfo>
                        <cmdp:size>724783</cmdp:size>
                        <cmdp:sizeUnit>units</cmdp:sizeUnit>
                      </cmdp:sizeInfo>
                      <cmdp:durationOfAudioInfo>
                        <cmdp:size>5190</cmdp:size>
                        <cmdp:durationUnit>hours</cmdp:durationUnit>
                      </cmdp:durationOfAudioInfo>
                    </cmdp:audioSizeInfo>
                    <cmdp:audioFormatInfo>
                      <cmdp:mimeType>audio/mpeg</cmdp:mimeType>
                      <cmdp:samplingRate>16000</cmdp:samplingRate>
                    </cmdp:audioFormatInfo>
                  </cmdp:corpusAudioInfo>
                </cmdp:corpusPartInfo>
                <cmdp:corpusPartInfo>
                  <cmdp:mediaType>text</cmdp:mediaType>
                  <cmdp:corpusTextInfo>
                    <cmdp:textFormatInfo>
                      <cmdp:mimeType>text/jsonl</cmdp:mimeType>
                    </cmdp:textFormatInfo>
                    <cmdp:characterEncodingInfo>
                      <cmdp:characterEncoding>UTF-8</cmdp:characterEncoding>
                    </cmdp:characterEncodingInfo>
                  </cmdp:corpusTextInfo>
                </cmdp:corpusPartInfo>
                <cmdp:corpusPartGeneralInfo>
                  <cmdp:lingualityInfo>
                    <cmdp:lingualityType>monolingual</cmdp:lingualityType>
                  </cmdp:lingualityInfo>
                  <cmdp:languageInfo>
                    <cmdp:languageId>no</cmdp:languageId>
                    <cmdp:languageName>Norwegian</cmdp:languageName>
                    <cmdp:sizePerLanguage>
                      <cmdp:sizeInfo>
                        <cmdp:size>724783</cmdp:size>
                        <cmdp:sizeUnit>units</cmdp:sizeUnit>
                      </cmdp:sizeInfo>
                      <cmdp:sizeInfo>
                        <cmdp:size>5190</cmdp:size>
                        <cmdp:sizeUnit>hours</cmdp:sizeUnit>
                      </cmdp:sizeInfo>
                      <cmdp:sizeInfo>
                        <cmdp:size>62</cmdp:size>
                        <cmdp:sizeUnit>gb</cmdp:sizeUnit>
                      </cmdp:sizeInfo>
                    </cmdp:sizePerLanguage>
                    <cmdp:languageVarietyInfo>
                      <cmdp:languageVarietyType>other</cmdp:languageVarietyType>
                      <cmdp:languageVarietyName>formal</cmdp:languageVarietyName>
                    </cmdp:languageVarietyInfo>
                  </cmdp:languageInfo>
                  <cmdp:modalityInfo>
                    <cmdp:modalityType>spokenLanguage</cmdp:modalityType>
                    <cmdp:modalityTypeDetails>formal speech, parliamentary speech</cmdp:modalityTypeDetails>
                  </cmdp:modalityInfo>
                  <cmdp:annotationInfo>
                    <cmdp:annotationType>alignment</cmdp:annotationType>
                  </cmdp:annotationInfo>
                </cmdp:corpusPartGeneralInfo>
              </cmdp:corpusInfo>
            </cmdp:corpusProfile>
          </cmd:Components>
        </cmd:CMD>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>