Machine Translation between Language Stages
Extracting Historical Grammar from a Parallel Diachronic Corpus of Polish
Philosophische Fakultät II
This paper explores methods for extracting correspondences from a small parallel diachronic corpus drawn from the Modern and Middle Polish Bible, in an attempt to answer the question “can historical grammar and lexica be derived directly from a corpus?” The extraction problem is approached from a machine translation point of view: texts from different periods are treated as language models for their respective language stages, and historical grammar as a translation model mapping one language stage onto another. This notion is explored through the automatic extraction of morphological, lexical and syntactic correspondences.
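One way to make the machine translation analogy concrete is the standard noisy-channel decomposition used in statistical machine translation. The formulation below is a sketch under assumptions, not a statement of the method used in this paper: the symbols $o$ (a Middle Polish sentence), $m$ (a candidate Modern Polish rendering) and the choice of translation direction are introduced here purely for illustration.

```latex
% Illustrative assumption: translate from Middle Polish (o) into Modern Polish (m).
% P(m)     : language model estimated from Modern Polish text
% P(o | m) : translation model, playing the role of the historical grammar
\hat{m} \;=\; \arg\max_{m} P(m \mid o)
        \;=\; \arg\max_{m} \frac{P(m)\, P(o \mid m)}{P(o)}
        \;=\; \arg\max_{m} P(m)\, P(o \mid m)
```

The last step drops $P(o)$ because it does not depend on $m$. Under this reading, the correspondences extracted from the parallel corpus would supply the parameters of $P(o \mid m)$, while the texts of each period supply the language model for their own stage.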