Center for Language Resource Development

The Center for Language Resource Development (言語資源開発センター) contributes to the creation of language resource science by maintaining and developing corpora and other language resources, and by promoting their use in research and education, in cooperation with research departments and the research community. Specifically, the center will organize and standardize the information necessary for corpus creation, develop and provide related technologies, and establish an environment for building data together with the research community.
The corpora that we have developed and released in collaboration with the Research Departments include the Corpus of Spontaneous Japanese (CSJ), the Balanced Corpus of Contemporary Written Japanese (BCCWJ), the Corpus of Historical Japanese (CHJ), the International Corpus of Japanese as a Second Language (I-JAS), the Corpus of Everyday Japanese Conversation (CEJC), and the Corpus of Japanese Dialects (COJADS).
Beyond corpora, we have developed other language resources: UniDic, a short-unit dictionary for Japanese morphological analysis; the Word List by Semantic Principles, a Japanese thesaurus; and online corpus search tools such as Shonagon and Chunagon.

Ninjal Corpora Portal Site:

We also hold an annual "Language Resource Workshop" as a forum for exchanging information on language resources, and offer tutorials on corpora and tools as needed.

Profiles of researchers at the Center for Language Resource Development