The BigBrother Corpus

Clarino - Textlab


Updated: 2017-06-08

The BigBrother Corpus is a speech corpus with recordings from the first season of the BigBrother show, broadcast on Norwegian television by TVNorge in the first half of 2001. The participants in BigBrother speak different dialects, but most of them come from eastern Norway. They are aged 23 to 36.

The BigBrother Corpus contains audio and video recordings of almost all of the 100 broadcasts that were shown on television, approximately 550,000 words in total. The recordings are linked to orthographic transcriptions, which are also morphologically tagged.

The BigBrother Corpus is a unique speech corpus in which the participants work together, discuss, argue, quarrel, cry, laugh, shout, make love, etc. In contrast to controlled recordings that are limited to interviews and dialogue, the BigBrother material contains conversations about all possible topics and in different genres. Sometimes strong feelings are in play, which may conceivably affect the language.
