The Beginnings of German Modern Poetry Corpus
The corpus was compiled as part of the project "The Beginnings of Modern Poetry," which uses digital methods to study German-language literature from about 1850 to 1920. It consists of texts in German-language poetry anthologies published in the second half of the 19th century and the early 20th century. The selected anthologies focus on poetry that was contemporary at the time, and, in the case of the anthologies published around 1900, on poems that the anthologists considered "modern". In total, the corpus consists of more than 20 anthologies containing more than 6000 poems.
The corpus was compiled for, and is used in, the project "The Beginnings of Modern Poetry" (PIs: Fotis Jannidis, Simone Winko). The project investigates the history of German-language literature from about 1850 to 1920. More information on the project can be found here.
The corpus consists of poems in German-language poetry anthologies from about 1850 to 1920. The anthologies bear titles such as "German Poets of the Present Day. A Lyrical Album" (edited by Robert Prutz, 1859) or "German Poetry since Liliencron" (edited by Hans Bethge, 1905). They contain texts that were contemporary at the time of publication and, in the case of the anthologies from around 1900, were considered "modern" by the anthologists. The anthologies include texts by authors who are well known today, such as Theodor Fontane and Rainer Maria Rilke, but also poems by writers who are now less well known, such as Karl Gerok and Martin Greif. In total, the corpus consists of more than 20 anthologies containing more than 6000 poems. A more detailed description of the corpus and the anthologies can be found here.
All anthologies are encoded in TEI XML. The OCR output and aspects such as verse boundaries were checked manually by student assistants, whom we would like to thank here: Julia Bartels, Juljana Battenberg, Aylin Bozyel, Jana Eckardt, Isabel Schlie and Lena Walter. The files also include GND identifiers for most authors. Prefaces were sometimes excluded from digitization for pragmatic reasons. Indexes, tables of contents, etc. are always excluded. Structural information about rhyme, meter, etc. is not included either.
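For readers who want to work with the files programmatically, the following is a minimal sketch of how poems and verse lines might be read from a TEI file using Python's standard library. It rests on assumptions that are not confirmed by this description: that each poem is a `<div>` whose stanzas are direct `<lg>` children containing `<l>` verse lines, and that the files use the standard TEI namespace. The sample fragment and the function name `extract_poems` are hypothetical; the actual corpus files may be structured differently.

```python
import xml.etree.ElementTree as ET

# Standard TEI namespace (assumed to be used by the corpus files).
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical miniature TEI fragment, for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="poem">
      <head>Example title</head>
      <lg><l>first line</l><l>second line</l></lg>
      <lg><l>third line</l></lg>
    </div>
  </body></text>
</TEI>"""


def extract_poems(xml_string):
    """Return a list of poems, each with a title and a list of stanzas.

    Assumption: a poem is any <div> with <lg> elements as direct children.
    Poems whose <l> lines are not wrapped in <lg> are skipped by this sketch.
    """
    root = ET.fromstring(xml_string)
    poems = []
    for div in root.iterfind(".//tei:div", TEI_NS):
        # Only direct <lg> children, so enclosing <div>s are not counted twice.
        stanza_elements = div.findall("tei:lg", TEI_NS)
        if not stanza_elements:
            continue
        head = div.find("tei:head", TEI_NS)
        title = "".join(head.itertext()).strip() if head is not None else None
        stanzas = [
            ["".join(line.itertext()).strip() for line in lg.findall("tei:l", TEI_NS)]
            for lg in stanza_elements
        ]
        poems.append({"title": title, "stanzas": stanzas})
    return poems


poems = extract_poems(SAMPLE)
for poem in poems:
    print(poem["title"], sum(len(stanza) for stanza in poem["stanzas"]), "lines")
```

Reading a file from disk would use `ET.parse(path).getroot()` in place of `ET.fromstring`. Since verse boundaries were checked manually, line counts derived this way should reflect the printed verse structure, provided the assumed element names match the actual encoding.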
All corpus texts are in the public domain and can be reused without restrictions. Please consider referencing this collection using the citation suggestion below.
As part of the project "The Beginnings of Modern Poetry", human annotators have provided extensive annotations for many of the corpus texts. Among other things, there are annotations on emotion representation for more than 1000 poems and annotations on text similarity for more than 800 poems. The annotation data can be found on Zenodo.
Citation Suggestion
The Beginnings of German Modern Poetry Corpus, edited by Simone Winko, Leonard Konle, Fotis Jannidis and Merten Kröncke. Göttingen/Würzburg 2022. DOI: https://doi.org/10.5281/zenodo.6053952.