This distribution represents only the morphological information encoded in BulTreeBank, an HPSG-based treebank of Bulgarian. It comprises about 214,000 tokens and was used to train the TreeTagger for Bulgarian.
It contains sentences from Bulgarian grammar textbooks, newspapers, literature, and other text sources.
Full documentation of the treebank (style book, tagset description) can be found at: http://www.bultreebank.org/TechRep.html
The link leads to an external site. We take no responsibility whatsoever for the content of external links.