Basic Research on Corpus Annotation

Project leader: MAEKAWA Kikuo
Professor, Department of Corpus Studies, NINJAL
Project period: October 2009 - March 2016
Keywords: corpus, annotation


The value of a corpus is a function of its size and the quality of its annotation. Focusing on the latter, this project explores a range of annotation types that can make existing corpora more useful, thereby contributing to the standardization of annotation schemes for the Japanese language.

The annotations under investigation include the argument structure of predicates, clause boundary classification, word-sense tagging, modality tagging, and multi-word structure tagging, among others. Although most of these annotations target written or transcribed texts, the project is not limited to written language; annotation of spoken language will also be pursued.

