Kobun Resources: Tools for Reading Classical Japanese Texts
In Japan, an enormous number of historical documents and texts have been preserved over the centuries. However, many are written in archaic styles (often referred to as ‘kobun’) that are unreadable to all but a handful of specialists. This page introduces open-access digital projects that aim to address this issue by making historical texts more accessible to non-specialists.
Chusei Monjo Mediaeval Manuscript Database
The National Museum of Japanese History has released Chusei Monjo, a new database of Japanese mediaeval manuscripts from the late Heian period to the Azuchi-Momoyama period (around the 11th to 16th centuries). Along with images of the original manuscripts, the database includes the transliterated text, an explanation in modern Japanese, and audio for each text. The site also incorporates an online forum where scholars and students can ask questions, comment on the texts and discuss mediaeval Japanese history.
Chusei Monjo website (Japanese only)
KuroNet Character Recognition Platform
At Japan’s ROIS-DS Centre for Open Data in the Humanities (CODH), researchers Alex Lamb, Tarin Clanuwat and Asanobu Kitamoto have built an optical character recognition (OCR) platform called KuroNet, which uses artificial intelligence to convert images of documents written in kuzushiji (classical cursive script) into contemporary Japanese script. The model was trained on the CODH’s extensive Kuzushiji Dataset, which contains over 4000 character classes and a million character images from 44 books printed in the Edo period (around the 17th to 19th centuries).
See the KuroNet page on the CODH website (Japanese only)
See instructions for using the KuroNet Kuzushiji Recognition Viewer (Japanese only)
Read an article by project researcher Alex Lamb in The Gradient magazine on CODH’s work in this area