The Sofie Parallel Treebank is a syntactically annotated parallel corpus based on the first chapters of the novel “Sofies verden” by Jostein Gaarder, published by Aschehoug forlag. The treebank is a product of the META-NORD project and its goal of promoting the accessibility of existing treebanks for the languages in the project.
SOURCE TEXT
The Norwegian novel Sofies verden (Gaarder 1991) was chosen as a suitable basis for treebanking because it is linguistically rich and has been professionally translated into many languages, and because treebanks already existed for text selections from this material in some languages in the META-NORD area.
Those earlier treebanks were created by the Nordic Treebank Network, funded by the Nordic Language Technology Program (2001-2005), but they had not been maintained and were no longer accessible. It was decided to gather those treebanks, document them, supplement them with additional treebanks for languages where this effort was feasible, and make the resulting resources accessible. The resulting work has been a joint effort between META-NORD and the INESS project, which hosts the treebank.
The rights for the Finnish treebank have not been cleared, and this treebank is currently unavailable.
More information about the treebank development in META-NORD is available in the META-NORD Deliverable 3.4 on Parallel Treebanks (http://www.meta-nord.eu).
Extended metadata
resource Common Info
resource Type: corpus
identification Info
resource Name: META-NORD Sofie Parallel Treebank
description: The Sofie Parallel Treebank is a syntactically annotated parallel corpus based on the first chapters of the novel “Sofies verden” by Jostein Gaarder, published by Aschehoug forlag. The treebank is a product of the META-NORD project and its goal of promoting the accessibility of existing treebanks for the languages in the project.
SOURCE TEXT
The Norwegian novel Sofies verden (Gaarder 1991) was chosen as a suitable basis for treebanking because it is linguistically rich and has been professionally translated into many languages, and because treebanks already existed for text selections from this material in some languages in the META-NORD area.
Those earlier treebanks were created by the Nordic Treebank Network, funded by the Nordic Language Technology Program (2001-2005), but they had not been maintained and were no longer accessible. It was decided to gather those treebanks, document them, supplement them with additional treebanks for languages where this effort was feasible, and make the resulting resources accessible. The resulting work has been a joint effort between META-NORD and the INESS project, which hosts the treebank.
The rights for the Finnish treebank have not been cleared, and this treebank is currently unavailable.
More information about the treebank development in META-NORD is available in the META-NORD Deliverable 3.4 on Parallel Treebanks (http://www.meta-nord.eu).
attribution Text: The "Sofie analyses" is research material based on the novel "Sofies verden" [Sophie's world] by Jostein Gaarder, published by Aschehoug Forlag. If you use INESS in your research, please link to the INESS webpage (http://clarino.uib.no/iness) in materials included with your data. We suggest the following reference in your scientific publications: Victoria Rosén, Koenraad De Smedt, Paul Meurer, and Helge Dyvik. An open infrastructure for advanced treebanking. In Jan Hajič, Koenraad De Smedt, Marko Tadić, and António Branco (eds.) META-RESEARCH Workshop on Advanced Treebanking at LREC2012, pages 22–29, Istanbul, Turkey, May 2012.
source: The present metadata are authoritative metadata. They are based on metadata from the project META-NORD (project end date 31.01.2013), published in the META-SHARE catalogue.
validation Mode Details: All alignments have been manually validated. See the descriptions of the monolingual treebanks for validation/evaluation of the individual annotations.
validation Extent: partial
corpus Info
corpus Type: Treebank
corpus Part Info
media Type: text
corpus Part General Info
source Work Info
title: Sofies verden
work Description: The novel Sofies verden (Sophie's world), ISBN: 9788203254147.
multilinguality Type Details: All language pairs have been aligned at sentence (parse unit) level.
language Info
language Id: no
language Name: Norwegian
size Per Language
size Info
size: 255
size Unit: sentences
language Info
language Id: sv
language Name: Swedish
size Per Language
size Info
size: 215
size Unit: sentences
language Info
language Id: da
language Name: Danish
size Per Language
size Info
size: 103
size Unit: sentences
language Info
language Id: et
language Name: Estonian
size Per Language
size Info
size: 52
size Unit: sentences
language Info
language Id: ka
language Name: Georgian
size Per Language
size Info
size: 1025
size Unit: sentences
language Info
language Id: de
language Name: German
size Per Language
size Info
size: 528
size Unit: sentences
language Info
language Id: is
language Name: Icelandic
size Per Language
size Info
size: 194
size Unit: sentences
language Info
language Id: en
language Name: English
size Per Language
size Info
size: 225
size Unit: sentences
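The per-language sizes listed above can be summarised with a short script. This is only a minimal sketch: the dictionary name and the two helper functions are hypothetical and not part of INESS or META-SHARE, and the counts are copied by hand from the size Info entries of this record.

```python
# Sentence counts per language, copied from the size Info entries of this record.
# The names below are illustrative assumptions, not part of any INESS API.
SENTENCES_PER_LANGUAGE = {
    "no": 255,   # Norwegian
    "sv": 215,   # Swedish
    "da": 103,   # Danish
    "et": 52,    # Estonian
    "ka": 1025,  # Georgian
    "de": 528,   # German
    "is": 194,   # Icelandic
    "en": 225,   # English
}


def total_sentences(sizes):
    """Return the sum of all per-language sentence counts."""
    return sum(sizes.values())


def largest_language(sizes):
    """Return the language Id with the highest sentence count."""
    return max(sizes, key=sizes.get)


if __name__ == "__main__":
    print("total sentences:", total_sentences(SENTENCES_PER_LANGUAGE))
    print("largest treebank:", largest_language(SENTENCES_PER_LANGUAGE))
```

The counts differ considerably between languages, so any downstream use should read the size for each language from the record rather than assume equal coverage.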
modality Info
modality Type: writtenLanguage
classification Info
genre Info
genre Type: textGenre
genre: fiction and drama
creation Info
dc:type
corpus
dc:title
META-NORD Sofie Parallel Treebank
dc:identifier
oai:clarino.uib.no:sofie-par
dc:description
The Sofie Parallel Treebank is a syntactically annotated parallel corpus based on the first chapters of the novel “Sofies verden” by Jostein Gaarder, published by Aschehoug forlag. The treebank is a product of the META-NORD project and its goal of promoting the accessibility of existing treebanks for the languages in the project.
SOURCE TEXT
The Norwegian novel Sofies verden (Gaarder 1991) was chosen as a suitable basis for treebanking because it is linguistically rich and has been professionally translated into many languages, and because treebanks already existed for text selections from this material in some languages in the META-NORD area.
Those earlier treebanks were created by the Nordic Treebank Network, funded by the Nordic Language Technology Program (2001-2005), but they had not been maintained and were no longer accessible. It was decided to gather those treebanks, document them, supplement them with additional treebanks for languages where this effort was feasible, and make the resulting resources accessible. The resulting work has been a joint effort between META-NORD and the INESS project, which hosts the treebank.
The rights for the Finnish treebank have not been cleared, and this treebank is currently unavailable.
More information about the treebank development in META-NORD is available in the META-NORD Deliverable 3.4 on Parallel Treebanks (http://www.meta-nord.eu).