Extended metadata
- resource Common Info
- resource Type: corpus
- identification Info
- resource Name: Translation Memories from Semantix AS
- resource Name: Omsetjingsminne frå Semantix AS
- description: This corpus contains translation memories provided to the National Library of Norway by Semantix AS. The translations were carried out on behalf of various public agencies and institutions. The corpus consists of texts originally written in English or Norwegian Bokmål, each aligned with its translation into the other language. A very small number of translations into Norwegian Nynorsk occur in the material; for simplicity, these have been classified as Norwegian Bokmål. Translations from English to Norwegian Bokmål are collected in one file, and translations from Norwegian Bokmål to English in another. The files are in TMX 1.4 format (a variant of XML). In the files, every translation unit (TU) is marked with the institution for which the translation was carried out. A TU corresponds (more or less) to a meaningful linguistic unit, typically a sentence or a heading, but it may also consist of a single word or of several clauses. The corpus contains a total of 1,325,013 TUs, distributed as follows: English > Norwegian Bokmål: 250,053 TUs; Norwegian Bokmål > English: 1,074,960 TUs. The documentation file contains an overview of the agencies and institutions, and the number of TUs belonging to each institution.
- description: This corpus contains a number of translation memories that the National Library of Norway has taken over from Semantix AS. The translations were carried out on behalf of public offices and institutions. The corpus consists of texts with an English or Norwegian (Bokmål) original and translations into Norwegian (Bokmål) or English, respectively. There are a very small number of translations into Nynorsk in the material, but these have been classified as Bokmål. There is one file with translations from English to Bokmål and one file with translations from Bokmål to English. The files are in TMX 1.4 format (a variant of XML). In the files, each translation unit (TU) is marked with the institution for which the translation was made. A translation unit corresponds more or less to a meaningful unit, typically a sentence, a heading or similar. It may also be a single word or a longer sequence. In total, the corpus contains 1,325,013 translation units, distributed as follows: English > Bokmål: 250,053 TUs; Bokmål > English: 1,074,960 TUs. The documentation file contains an overview of which institutions are covered and the number of TUs for each institution.
- url: https://www.nb.no/sprakbanken/resource/translation-memories-from-semantix-as/
- identifier: sbr-62
- distribution Info
- licence Info
- user Category: Public
- distribution Access Medium: downloadable
- download Location: https://www.nb.no/sprakbanken/wp-json/resource/v1/sbr-62
- execution Location:
- attribution Text:
- licence
- licence Family: Creative Commons (CC)
- licence Name: Creative_Commons-ZERO (CC-ZERO)
- licence Url: https://creativecommons.org/publicdomain/zero/1.0/
- conditions Of Use:
- non Standard Conditions Of Use:
- distribution Rights Holder
- actor Info
- actor Type: organization
- role: Distribution Rights Holder
- organization Info
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- communication Info
- email: sprakbanken@nb.no
- url: https://www.nb.no/sprakbanken/
- address: P.O. Box 2674 Solli
- zip Code: 0203
- city: Oslo
- region: Oslo
- country: Norway
- actor Info
- licensor:
- actor Info
- actor Type: organization
- role: Licensor
- organization Info
- organization Name: Semantix AS
- organization Name: Semantix AS
- licence Info
- ipr Holder
- contact
- actor Info
- actor Type: organization
- role: Contact
- organization Info
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- communication Info
- email: sprakbanken@nb.no
- url: https://www.nb.no/sprakbanken/
- address: P.O. Box 2674 Solli
- zip Code: 0203
- city: Oslo
- region: Oslo
- country: Norway
- actor Info
- metadata Info
- metadata Creation Date: 17.12.2020
- metadata Language Name: English
- metadata Language Id: eng
- metadata Last Date Updated: 18.12.2020
- metadata Creator
- actor Info
- actor Type: person
- role: Metadata Creator
- person Info
- surname: Lindstad
- given Name: Arne Martinus
- affiliation:
- organization Info
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- communication Info
- email: sprakbanken@nb.no
- url: https://www.nb.no/sprakbanken/
- address: P.O. Box 2674 Solli
- zip Code: 0203
- city: Oslo
- region: Oslo
- country: Norway
- actor Info
- version: 1.0
- revision:
- last Date Updated: 11.11.2020
- validated: yes
- validation Type: content
- validation Mode: manual
- validation Mode Details: manual translation
- validation Extent: full
- validator:
- actor Info
- actor Type: organization
- role: Resource Validator
- organization Info
- organization Name: Semantix AS
- organization Name: Semantix AS
- documentation Unstructured
- role: documentation
- document Unstructured: content overview
- creation Start Date:
- creation End Date: 11.11.2020
- resource Creator
- actor Info
- actor Type: organization
- role: Resource Creator
- organization Info
- organization Name: Semantix AS
- organization Name: Semantix AS
- actor Info
- actor Type: person
- role: Resource Creator
- person Info
- surname: Birkenes
- given Name: Magnus Breder
- affiliation:
- organization Info
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- communication Info
- email: sprakbanken@nb.no
- url: https://www.nb.no/sprakbanken/
- address: P.O. Box 2674 Solli
- zip Code: 0203
- city: Oslo
- region: Oslo
- country: Norway
- actor Info
- corpus Info
- corpus Type: Written Corpus
- corpus Part Info
- media Type: text
- corpus Audio Info
- corpus Text Info
- text Format Info
- mime Type: application/x-tmx+xml
- size Per Text Format
- size Info
- size: 2
- size Unit: files
- size Info
- size: 1325013
- size Unit: units
- size Info
- character Encoding Info
- character Encoding: UTF-8
- text Format Info
- corpus Text Ngram Info
- ngram Info
- base Item:
- order:
- ngram Info
- corpus Part General Info
- linguality Info
- linguality Type: multilingual
- multilinguality Type: parallel
- multilinguality Type Details: translation memory
- language Info
- language Id: nob
- language Name: Norwegian Bokmål
- language Variety Info
- language Variety Type: other
- language Variety Name: formal
- language Info
- language Id: eng
- language Name: English
- language Variety Info
- language Variety Type: other
- language Variety Name: formal
- modality Info
- modality Type: writtenLanguage
- modality Type Details:
- size Info
- size: 1325013
- size Unit: units
- annotation Info
- annotation Type: alignment
- segmentation Level: clause
- annotation Mode: mixed
- linguality Info
dc:type | corpus |
dc:title | Translation Memories from Semantix AS |
dc:identifier | oai:nb.no:sbr-62 |
dc:description | This corpus contains translation memories provided to the National Library of Norway by Semantix AS. The translations were carried out on behalf of various public agencies and institutions. The corpus consists of texts originally written in English or Norwegian Bokmål, each aligned with its translation into the other language. A very small number of translations into Norwegian Nynorsk occur in the material; for simplicity, these have been classified as Norwegian Bokmål. Translations from English to Norwegian Bokmål are collected in one file, and translations from Norwegian Bokmål to English in another. The files are in TMX 1.4 format (a variant of XML). In the files, every translation unit (TU) is marked with the institution for which the translation was carried out. A TU corresponds (more or less) to a meaningful linguistic unit, typically a sentence or a heading, but it may also consist of a single word or of several clauses. The corpus contains a total of 1,325,013 TUs, distributed as follows: English > Norwegian Bokmål: 250,053 TUs; Norwegian Bokmål > English: 1,074,960 TUs. The documentation file contains an overview of the agencies and institutions, and the number of TUs belonging to each institution. |
dc:publisher | |
dc:format | downloadable |
dc:date | |
dc:date | 2020-11-11 |
dc:rights | Public |
dc:rights | Creative Commons (CC) |
dc:rights | Creative_Commons-ZERO (CC-ZERO) |
dc:rights | https://creativecommons.org/publicdomain/zero/1.0/ |
dc:creator | Semantix AS |
dc:creator | Magnus Breder Birkenes |
dc:lang | Norwegian Bokmål |
dc:lang | English |
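
The description says that the files are TMX 1.4 and that every translation unit is marked with the institution it was translated for. The record does not say how that marking is encoded, so the sketch below rests on an assumption: that the institution sits in a `<prop>` child of each `<tu>`. The property type `x-institution`, the sample data and the function names are all hypothetical, and the real files should be inspected before relying on any of them. The sketch streams a TMX document, counts TUs per institution, and checks that the two per-direction totals given in the description add up to the stated overall total.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Tiny hand-written sample; "x-institution" is a hypothetical property type.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <prop type="x-institution">Agency A</prop>
      <tuv xml:lang="en"><seg>Annual report</seg></tuv>
      <tuv xml:lang="nb"><seg>Aarsrapport</seg></tuv>
    </tu>
    <tu>
      <prop type="x-institution">Agency B</prop>
      <tuv xml:lang="en"><seg>Contents</seg></tuv>
      <tuv xml:lang="nb"><seg>Innhold</seg></tuv>
    </tu>
    <tu>
      <prop type="x-institution">Agency A</prop>
      <tuv xml:lang="en"><seg>Summary</seg></tuv>
      <tuv xml:lang="nb"><seg>Sammendrag</seg></tuv>
    </tu>
  </body>
</tmx>"""


def iter_units(source):
    """Yield (props, segments) for each <tu> in a TMX file or file-like object.

    props maps each <prop> type to its text; segments maps each language
    code to the text of the corresponding <seg>.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "tu":
            continue
        props = {p.get("type"): (p.text or "").strip() for p in elem.findall("prop")}
        segments = {}
        for tuv in elem.findall("tuv"):
            seg = tuv.find("seg")
            # TMX 1.4 uses xml:lang; fall back to a plain "lang" attribute.
            lang = tuv.get(XML_LANG) or tuv.get("lang")
            segments[lang] = "".join(seg.itertext()) if seg is not None else ""
        yield props, segments
        elem.clear()  # release the unit's children to keep memory use low


def count_by_prop(source, prop_type):
    """Count translation units per value of the given <prop> type."""
    return Counter(
        props.get(prop_type, "(unmarked)") for props, _segments in iter_units(source)
    )


counts = count_by_prop(io.BytesIO(SAMPLE), "x-institution")
for institution, n in counts.most_common():
    print(f"{institution}: {n} TUs")

# Totals quoted in the description: the two directions sum to the corpus size.
EN_TO_NB, NB_TO_EN, TOTAL = 250_053, 1_074_960, 1_325_013
print(EN_TO_NB + NB_TO_EN == TOTAL)
```

For the real corpus, a file path can be passed to `count_by_prop` in place of the `io.BytesIO` object. Streaming with `iterparse` is chosen here because the larger file holds over a million units and may not fit comfortably in memory as a full tree.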