{"id":31736,"date":"2025-02-10T10:39:23","date_gmt":"2025-02-10T09:39:23","guid":{"rendered":"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/"},"modified":"2025-08-27T09:51:20","modified_gmt":"2025-08-27T07:51:20","slug":"oai-nb-no-sbr-94","status":"publish","type":"language-resource","link":"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/","title":{"rendered":"TeflonNorL2 NOCASA Challenge Dataset"},"content":{"rendered":"<p><?xml version='1.0' encoding='utf-8'?><br \/>\n<record><\/p>\n<header>\n        <identifier>oai:nb.no:sbr-94<\/identifier><br \/>\n        <datestamp>2025-08-25<\/datestamp><br \/>\n      <\/header>\n<p>      <metadata><br \/>\n        <CMD xmlns=\"http:\/\/www.clarin.eu\/cmd\/\"><br \/>\n          <Header><br \/>\n            <MdCreator>Arne Martinus Lindstad<\/MdCreator><br \/>\n            <MdCreationDate>2024-03-23<\/MdCreationDate><br \/>\n            <MdSelfLink>https:\/\/www.nb.no\/sprakbanken\/oai?verb=GetRecord&amp;identifier=oai:nb.no:sbr-94&amp;metadataPrefix=cmdi<\/MdSelfLink><br \/>\n            <MdProfile>clarin.eu:cr1:p_1407745711925<\/MdProfile><br \/>\n            <MdCollectionDisplayName>Spr\u00e5kbanken NB<\/MdCollectionDisplayName><br \/>\n          <\/Header><br \/>\n          <Resources><br \/>\n            <ResourceProxyList><br \/>\n              <ResourceProxy id=\"teflon1\"><br \/>\n                <ResourceType mimetype=\"application\/x-tgz\">Resource<\/ResourceType><br \/>\n                <ResourceRef>https:\/\/www.nb.no\/sbfil\/teflon\/train_audio.tgz<\/ResourceRef><br \/>\n              <\/ResourceProxy><br \/>\n              <ResourceProxy id=\"teflon2\"><br \/>\n                <ResourceType mimetype=\"application\/x-tgz\">Resource<\/ResourceType><br \/>\n                <ResourceRef>https:\/\/www.nb.no\/sbfil\/teflon\/test_audio.tgz<\/ResourceRef><br \/>\n              <\/ResourceProxy><br \/>\n              <ResourceProxy id=\"teflon3\"><br \/>\n                
<ResourceType mimetype=\"application\/x-gzip\">Resource<\/ResourceType><br \/>\n                <ResourceRef>https:\/\/www.nb.no\/sbfil\/teflon\/train.csv.gz<\/ResourceRef><br \/>\n              <\/ResourceProxy><br \/>\n              <ResourceProxy id=\"teflon4\"><br \/>\n                <ResourceType mimetype=\"application\/x-gzip\">Resource<\/ResourceType><br \/>\n                <ResourceRef>https:\/\/www.nb.no\/sbfil\/teflon\/test.csv.gz<\/ResourceRef><br \/>\n              <\/ResourceProxy><br \/>\n              <ResourceProxy id=\"teflon5\"><br \/>\n                <ResourceType mimetype=\"application\/x-gzip\">Resource<\/ResourceType><br \/>\n                <ResourceRef>https:\/\/www.nb.no\/sbfil\/teflon\/test_full.csv.gz<\/ResourceRef><br \/>\n              <\/ResourceProxy><br \/>\n            <\/ResourceProxyList><br \/>\n            <JournalFileProxyList \/><br \/>\n            <ResourceRelationList \/><br \/>\n          <\/Resources><br \/>\n          <IsPartOfList \/><br \/>\n          <Components><br \/>\n            <corpusProfile><br \/>\n              <resourceCommonInfo><br \/>\n                <resourceType>corpus<\/resourceType><br \/>\n                <identificationInfo><br \/>\n                  <resourceName xml:lang=\"nb\">TeflonNorL2 NOCASA Challenge Dataset<\/resourceName><br \/>\n                  <resourceName xml:lang=\"en\">TeflonNorL2 NOCASA Challenge Dataset<\/resourceName><br \/>\n                  <description xml:lang=\"nb\">This is a specialized version of the data set that has been used for the Non-native Children\u2019s Automatic Speech Assessment Challenge (NOCASA), https:\/\/teflon.aalto.fi\/nocasa-2025\/, hosted by the IEEE International Workshop on Machine Learning for Signal Processing (MLSP) 2025, https:\/\/2025.ieeemlsp.org\/en\/<\/p>\n<p>The full dataset is described here:<\/p>\n<p>Anne Marte Haug Olstad, Anna Smolander, Sofia Str\u00f6mbergsson, Sari Ylinen, Minna Lehtonen, Mikko Kurimo, Yaroslav Getman, Tam\u00e1s 
Gr\u00f3sz, Xinwei Cao, Torbj\u00f8rn Svendsen, and Giampiero Salvi. 2024. Collecting Linguistic Resources for Assessing Children\u2019s Pronunciation of Nordic Languages. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 3529\u20133537, Torino, Italia. ELRA and ICCL.<\/p>\n<p>The specialized version of the data and the challenge are described here:<\/p>\n<p>Getman, Y., Gr\u00f3sz, T., Kurimo, M., &amp; Salvi, G. (2025). &laquo;Non-native Children&#8217;s Automatic Speech Assessment Challenge (NOCASA)&raquo;. IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Istanbul, Turkey.<\/p>\n<p>Compared to the full dataset, a number of modifications have been made to the challenge data:<\/p>\n<p>&#8211; some recordings were excluded<br \/>\n&#8211; the data was split into training and test sets following a procedure that should keep a similar distribution of speaker characteristics<br \/>\n&#8211; the file names were anonymized to hide the speaker identities (it should not be possible to infer which recordings correspond to the same speaker)<br \/>\n&#8211; metadata was limited to orthographic transcription and assessment score for the training data and only orthographic transcription for the test data<\/p>\n<p>Here, we also release assessment scores for the test data separately.<\/p>\n<p>Files:<br \/>\n&#8211; train_audio.tgz: audio files for the training set<br \/>\n&#8211; test_audio.tgz: audio files for the test set<br \/>\n&#8211; train.csv.gz: metadata for the training data (orthographic transcriptions and assessment scores)<br \/>\n&#8211; test.csv.gz: metadata for the test data (orthographic transcriptions)<br \/>\n&#8211; test_full.csv.gz: metadata for the test data (orthographic transcriptions and assessment scores)<\/p>\n<p>Scroll down to download the files.<\/p>\n<p>Contact Professor Giampiero Salvi (giampiero.salvi@ntnu.no) at NTNU if you 
have any questions about the dataset.<\/description><br \/>\n                  <description xml:lang=\"en\">This is a specialized version of the data set that has been used for the Non-native Children\u2019s Automatic Speech Assessment Challenge (NOCASA), https:\/\/teflon.aalto.fi\/nocasa-2025\/, hosted by the IEEE International Workshop on Machine Learning for Signal Processing (MLSP) 2025, https:\/\/2025.ieeemlsp.org\/en\/<\/p>\n<p>The full dataset is described here:<\/p>\n<p>Anne Marte Haug Olstad, Anna Smolander, Sofia Str\u00f6mbergsson, Sari Ylinen, Minna Lehtonen, Mikko Kurimo, Yaroslav Getman, Tam\u00e1s Gr\u00f3sz, Xinwei Cao, Torbj\u00f8rn Svendsen, and Giampiero Salvi. 2024. Collecting Linguistic Resources for Assessing Children\u2019s Pronunciation of Nordic Languages. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 3529\u20133537, Torino, Italia. ELRA and ICCL.<\/p>\n<p>The specialized version of the data and the challenge are described here:<\/p>\n<p>Getman, Y., Gr\u00f3sz, T., Kurimo, M., &amp; Salvi, G. (2025). &laquo;Non-native Children&#8217;s Automatic Speech Assessment Challenge (NOCASA)&raquo;. 
IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Istanbul, Turkey.<\/p>\n<p>Compared to the full dataset, a number of modifications have been made to the challenge data:<\/p>\n<p>&#8211; some recordings were excluded<br \/>\n&#8211; the data was split into training and test sets following a procedure that should keep a similar distribution of speaker characteristics<br \/>\n&#8211; the file names were anonymized to hide the speaker identities (it should not be possible to infer which recordings correspond to the same speaker)<br \/>\n&#8211; metadata was limited to orthographic transcription and assessment score for the training data and only orthographic transcription for the test data<\/p>\n<p>Here, we also release assessment scores for the test data separately.<\/p>\n<p>Files:<br \/>\n&#8211; train_audio.tgz: audio files for the training set<br \/>\n&#8211; test_audio.tgz: audio files for the test set<br \/>\n&#8211; train.csv.gz: metadata for the training data (orthographic transcriptions and assessment scores)<br \/>\n&#8211; test.csv.gz: metadata for the test data (orthographic transcriptions)<br \/>\n&#8211; test_full.csv.gz: metadata for the test data (orthographic transcriptions and assessment scores)<\/p>\n<p>Scroll down to download the files.<\/p>\n<p>Contact Professor Giampiero Salvi (giampiero.salvi@ntnu.no) at NTNU if you have any questions about the dataset.<\/description><br \/>\n                  <url description=\"Resource landing page\">https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/<\/url><br \/>\n                  <PID description=\"handle\">hdl:21.11146\/94<\/PID><br \/>\n                  <identifier>sbr-94<\/identifier><br \/>\n                <\/identificationInfo><br \/>\n                <distributionInfo>\n                  <licenceInfo>\n                    <userCategory>Restricted<\/userCategory><br \/>\n                    
<distributionAccessMedium>downloadable<\/distributionAccessMedium><br \/>\n                    <downloadLocation description=\"resource homepage\">https:\/\/www.nb.no\/sprakbanken\/en\/resource-catalogue\/oai-nb-no-sbr-94\/<\/downloadLocation>\n                    <licence>\n                      <licenceFamily>none<\/licenceFamily>\n                      <licenceName>all rights reserved<\/licenceName>\n                      <conditionsOfUse>ID<\/conditionsOfUse><br \/>\n                      <conditionsOfUse>PERM<\/conditionsOfUse>\n                    <\/licence>\n                  <\/licenceInfo>\n                <\/distributionInfo><br \/>\n                <contact><br \/>\n                  <actorInfo><br \/>\n                    <actorType>person<\/actorType><br \/>\n                    <role xml:lang=\"en\">Contact person<\/role>\n                    <personInfo>\n                      <surname xml:lang=\"en\">Salvi<\/surname><br \/>\n                      <givenName xml:lang=\"en\">Giampiero<\/givenName>\n                      <position>Professor<\/position>\n                      <affiliation><br \/>\n                        <organizationInfo><br \/>\n                          <organizationName xml:lang=\"nb\">Norges teknisk-naturvitenskapelige universitet<\/organizationName><br \/>\n                          <organizationName xml:lang=\"en\">Norwegian University of Science and Technology<\/organizationName><br \/>\n                          <organizationShortName xml:lang=\"nb\">NTNU<\/organizationShortName><br \/>\n                          <organizationShortName xml:lang=\"en\">NTNU<\/organizationShortName><br \/>\n                        <\/organizationInfo><br \/>\n                      <\/affiliation>\n                    <\/personInfo>\n                    <communicationInfo><br \/>\n                      <email>giampiero.salvi@ntnu.no<\/email><br \/>\n                      <url>https:\/\/www.ntnu.edu\/employees\/giampiero.salvi<\/url><br \/>\n      
              <\/communicationInfo><br \/>\n                  <\/actorInfo><br \/>\n                <\/contact><br \/>\n                <metadataInfo><br \/>\n                  <metadataCreationDate>2024-03-23<\/metadataCreationDate><br \/>\n                  <metadataLastDateUpdated>2025-08-25<\/metadataLastDateUpdated><br \/>\n                  <metadataCreator><br \/>\n                    <actorInfo><br \/>\n                      <actorType>organization<\/actorType><br \/>\n                      <role xml:lang=\"en\">Metadata Creator<\/role><br \/>\n                      <organizationInfo><br \/>\n                        <organizationName xml:lang=\"nb\">Nasjonalbiblioteket<\/organizationName><br \/>\n                        <organizationName xml:lang=\"en\">National Library of Norway<\/organizationName><br \/>\n                        <organizationShortName xml:lang=\"nb\">NB<\/organizationShortName><br \/>\n                        <organizationShortName xml:lang=\"en\">NLN<\/organizationShortName><br \/>\n                        <departmentName xml:lang=\"nb\">Spr\u00e5kbanken<\/departmentName><br \/>\n                        <departmentName xml:lang=\"en\">The Language Bank<\/departmentName><br \/>\n                      <\/organizationInfo><br \/>\n                    <\/actorInfo><br \/>\n                  <\/metadataCreator><br \/>\n                <\/metadataInfo><br \/>\n                <resourceCreationInfo><br \/>\n                  <creationEndDate>2024-03-23<\/creationEndDate><br \/>\n                <\/resourceCreationInfo><br \/>\n              <\/resourceCommonInfo><br \/>\n              <corpusInfo><br \/>\n                <corpusType>Multimodal Corpus<\/corpusType><br \/>\n                <corpusPartInfo><br \/>\n                  <mediaType>audio<\/mediaType><br \/>\n                <\/corpusPartInfo><br \/>\n                <corpusPartInfo><br \/>\n                  <mediaType>text<\/mediaType><br \/>\n                <\/corpusPartInfo><br 
\/>\n                <corpusPartGeneralInfo \/><br \/>\n              <\/corpusInfo><br \/>\n            <\/corpusProfile><br \/>\n          <\/Components><br \/>\n        <\/CMD><br \/>\n      <\/metadata><br \/>\n    <\/record><\/p>\n","protected":false},"template":"","categories":[],"tags":[],"language-resource-type":[7641,7569],"language-resource-origin":[7562],"class_list":["post-31736","language-resource","type-language-resource","status-publish","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.1 (Yoast SEO v27.1.1) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>TeflonNorL2 NOCASA Challenge Dataset - Spr\u00e5kbanken<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/\" \/>\n<meta property=\"og:locale\" content=\"nb_NO\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"TeflonNorL2 NOCASA Challenge Dataset\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/\" \/>\n<meta property=\"og:site_name\" content=\"Spr\u00e5kbanken\" \/>\n<meta property=\"article:modified_time\" content=\"2025-08-27T07:51:20+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Ansl. 
lesetid\" \/>\n\t<meta name=\"twitter:data1\" content=\"4 minutter\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/\",\"url\":\"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/\",\"name\":\"TeflonNorL2 NOCASA Challenge Dataset - Spr\u00e5kbanken\",\"isPartOf\":{\"@id\":\"https:\/\/www.nb.no\/sprakbanken\/#website\"},\"datePublished\":\"2025-02-10T09:39:23+00:00\",\"dateModified\":\"2025-08-27T07:51:20+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/#breadcrumb\"},\"inLanguage\":\"nb-NO\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.nb.no\/sprakbanken\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Resources from the resource bank\",\"item\":\"https:\/\/www.nb.no\/sprakbanken\/en\/resource-catalogue\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"TeflonNorL2 NOCASA Challenge Dataset\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.nb.no\/sprakbanken\/#website\",\"url\":\"https:\/\/www.nb.no\/sprakbanken\/\",\"name\":\"Spr\u00e5kbanken\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.nb.no\/sprakbanken\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"nb-NO\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"TeflonNorL2 NOCASA Challenge Dataset - Spr\u00e5kbanken","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/","og_locale":"nb_NO","og_type":"article","og_title":"TeflonNorL2 NOCASA Challenge Dataset","og_url":"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/","og_site_name":"Spr\u00e5kbanken","article_modified_time":"2025-08-27T07:51:20+00:00","twitter_card":"summary_large_image","twitter_misc":{"Ansl. lesetid":"4 minutter"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/","url":"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/","name":"TeflonNorL2 NOCASA Challenge Dataset - Spr\u00e5kbanken","isPartOf":{"@id":"https:\/\/www.nb.no\/sprakbanken\/#website"},"datePublished":"2025-02-10T09:39:23+00:00","dateModified":"2025-08-27T07:51:20+00:00","breadcrumb":{"@id":"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/#breadcrumb"},"inLanguage":"nb-NO","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.nb.no\/sprakbanken\/ressurskatalog\/oai-nb-no-sbr-94\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.nb.no\/sprakbanken\/"},{"@type":"ListItem","position":2,"name":"Resources from the resource bank","item":"https:\/\/www.nb.no\/sprakbanken\/en\/resource-catalogue\/"},{"@type":"ListItem","position":3,"name":"TeflonNorL2 NOCASA Challenge 
Dataset"}]},{"@type":"WebSite","@id":"https:\/\/www.nb.no\/sprakbanken\/#website","url":"https:\/\/www.nb.no\/sprakbanken\/","name":"Spr\u00e5kbanken","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.nb.no\/sprakbanken\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"nb-NO"}]}},"lang":"nb","translations":{"nb":31736,"en":31742},"pll_sync_post":[],"_links":{"self":[{"href":"https:\/\/www.nb.no\/sprakbanken\/wp-json\/wp\/v2\/language-resource\/31736","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.nb.no\/sprakbanken\/wp-json\/wp\/v2\/language-resource"}],"about":[{"href":"https:\/\/www.nb.no\/sprakbanken\/wp-json\/wp\/v2\/types\/language-resource"}],"wp:attachment":[{"href":"https:\/\/www.nb.no\/sprakbanken\/wp-json\/wp\/v2\/media?parent=31736"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.nb.no\/sprakbanken\/wp-json\/wp\/v2\/categories?post=31736"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.nb.no\/sprakbanken\/wp-json\/wp\/v2\/tags?post=31736"},{"taxonomy":"language-resource-type","embeddable":true,"href":"https:\/\/www.nb.no\/sprakbanken\/wp-json\/wp\/v2\/language-resource-type?post=31736"},{"taxonomy":"language-resource-origin","embeddable":true,"href":"https:\/\/www.nb.no\/sprakbanken\/wp-json\/wp\/v2\/language-resource-origin?post=31736"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}