Study on Documents and Meta-languages for Designing a Corpus of Modern Japanese

Abbreviation: Corpus of Modern Japanese
Project Leader: TANAKA Makiro
Associate Professor, Department of Corpus Studies, NINJAL
Research field: Japanese linguistics
Keywords: Corpus, History of Modern Japanese, Documents of Modern Japanese


This project investigates Modern Japanese texts in preparation for a future project to design a corpus of Modern Japanese. Ultimately, the Modern Japanese corpus is expected to complement both the Diachronic Corpus, which covers the period from ancient times to early modern times, and the Balanced Corpus of Contemporary Written Japanese, which will be completed in 2011. Drawing on the Taiyo Corpus created by the former NINJAL and on digitized texts of Modern Japanese, this project will create a prototype of the Corpus of Modern Japanese and use it to develop methods for corpus studies of Modern Japanese. The project will also compile a list of important documents, examine methods for selecting documents for the corpus, and develop a methodology for describing the structure of the texts and the morphology of the language in them. The aim is to advance the research to the stage where subsequent projects can begin actual construction of the Corpus of Modern Japanese.