Norsk koreferansekorpus
Extended metadata
- resource Common Info
- resource Type: corpus
- identification Info
- resource Name: Norwegian Anaphora Resolution Corpus
- resource Name: Norsk koreferansekorpus
- description: Norwegian-BokmaalNARC and Norwegian-NynorskNARC are conversions of the Bokmål and Nynorsk parts, respectively, of the Norwegian Anaphora Resolution Corpus (NARC), the first publicly available corpus annotated with anaphoric relations between noun phrases for Norwegian. The annotation enriches the existing annotation of the Norwegian Dependency Treebank (NDT). The corpus contains 15,742 sentences and 245,515 tokens for Bokmål, and 12,481 sentences and 206,660 tokens for Nynorsk. The accompanying paper by Mæhlum et al. at CRAC 2022 describes the annotation effort in more detail.
- resource Short Name: NARC
- url: https://www.nb.no/sprakbanken/ressurskatalog/oai-nb-no-sbr-82/
- PID: hdl:21.11146/82
- identifier: sbr-82
- distribution Info
- licence Info
- user Category: Public
- distribution Access Medium: downloadable
- download Location: https://www.nb.no/sprakbanken/ressurskatalog/oai-nb-no-sbr-82/
- licence
- licence Family: Creative Commons (CC)
- licence Name: Creative_Commons-BY-SA (CC-BY-SA)
- licence Url: https://creativecommons.org/licenses/by-sa/4.0/
- conditions Of Use: BY
- conditions Of Use: SA
- licence Info
- contact
- actor Info
- actor Type: organization
- organization Info
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- communication Info
- email: sprakbanken@nb.no
- url: https://www.nb.no/sprakbanken/
- address: P.O. Box 2674 Solli
- zip Code: 0203
- city: Oslo
- region: Oslo
- country: Norway
- actor Info
- metadata Info
- metadata Creation Date: 2023-05-02
- metadata Language Name: English
- metadata Language Id: en
- metadata Last Date Updated: 2023-05-09
- metadata Creator
- actor Info
- actor Type: organization
- organization Info
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- communication Info
- email: sprakbanken@nb.no
- url: https://www.nb.no/sprakbanken/
- address: P.O. Box 2674 Solli
- zip Code: 0203
- city: Oslo
- region: Oslo
- country: Norway
- actor Info
- version Info
- version: 1.1
- last Date Updated: 2023-02-24
- validation Info
- validated: false
- resource Creation Info
- creation End Date: 2023-02-24
- corpus Info
- corpus Type: Treebank
- corpus Part Info
- media Type: text
- corpus Text Info
- text Format Info
- mime Type: text/plain
- size Per Text Format
- size Info
- size: 28223
- size Unit: sentences
- size Info
- size: 452175
- size Unit: tokens
- size Info
- character Encoding Info
- character Encoding: UTF-8
- text Format Info
- corpus Part General Info
- linguality Info
- linguality Type: monolingual
- multilinguality Type: other
- language Info
- language Id: nb
- language Name: Norwegian Bokmål
- size Per Language
- size Info
- size: 2
- size Unit: files
- size Info
- size: 15742
- size Unit: sentences
- size Info
- size: 245515
- size Unit: tokens
- size Info
- language Info
- language Id: nn
- language Name: Norwegian Nynorsk
- size Per Language
- size Info
- size: 2
- size Unit: files
- size Info
- size: 12481
- size Unit: sentences
- size Info
- size: 206660
- size Unit: tokens
- size Info
- modality Info
- modality Type: writtenLanguage
- modality Type Details: Blog text, news text, parliamentary proceedings, government white papers
- annotation Info
- annotation Type: discourseAnnotation-coreference
- linguality Info
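
The per-language sizes listed above should add up to the corpus-wide totals given under the text format sizes. The following is a minimal sketch of that consistency check. The numbers are copied from this record, while the names `per_language`, `reported_total` and `sum_sizes` are hypothetical and chosen only for illustration.

```python
# Sizes copied from the metadata record above.
per_language = {
    "nb": {"sentences": 15742, "tokens": 245515},  # Norwegian Bokmål
    "nn": {"sentences": 12481, "tokens": 206660},  # Norwegian Nynorsk
}
reported_total = {"sentences": 28223, "tokens": 452175}


def sum_sizes(sizes):
    """Add up each size unit across all languages."""
    totals = {}
    for counts in sizes.values():
        for unit, n in counts.items():
            totals[unit] = totals.get(unit, 0) + n
    return totals


computed_total = sum_sizes(per_language)
# True when the per-language figures match the reported totals.
print(computed_total == reported_total)
```

The same check can be rerun against the downloaded files once their sentence and token counts are known, to confirm that a local copy matches this record.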
dc:type | corpus |
dc:title | Norsk koreferansekorpus |
dc:identifier | oai:nb.no:sbr-82 |
dc:description | Norwegian-BokmaalNARC and Norwegian-NynorskNARC are conversions of the Bokmål and Nynorsk parts, respectively, of the Norwegian Anaphora Resolution Corpus (NARC), the first publicly available corpus annotated with anaphoric relations between noun phrases for Norwegian. The annotation enriches the existing annotation of the Norwegian Dependency Treebank (NDT), and the corpus contains 15,742 sentences and 245,515 tokens for Bokmål, and 12,481 sentences and 206,660 tokens for Nynorsk. The accompanying paper by Mæhlum et al. from CRAC 2022 describes the annotation work in detail. |
dc:format | downloadable |
dc:date | 2023-02-24 |
dc:rights | Public |
dc:rights | Creative Commons (CC) |
dc:rights | Creative_Commons-BY-SA (CC-BY-SA) |
dc:rights | https://creativecommons.org/licenses/by-sa/4.0/ |
dc:lang | Norwegian Bokmål |
dc:lang | Norwegian Nynorsk |