VEP Early Modern 1080 Collection

Before downloading the Early Modern 1080 corpus, read about the format of the text files and about our text processing workflow. Corpora are generated from Text Creation Partnership (TCP) XML files. Our downloads do not contain texts from EEBO-TCP Phase II, which will not enter the public domain until five years after the TCP project completes Phase II.

The corpus is released under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Early Modern 1080

A corpus of 1080 digitized texts, built from EEBO-TCP Phase I and ECCO-TCP and used to generate a topic model for Serendip. The selected texts were originally published between 1530 and 1799. The corpus was built by randomly sampling 40 texts per decade (27 decades, for 1080 texts in total) in an attempt to provide a less biased cross-section than a selection of well-known texts alone.
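The per-decade sampling described above can be illustrated with a short Python sketch. This is not the procedure or code used to build the corpus; the function name `sample_per_decade`, the fixed seed, and the made-up metadata records are all assumptions for illustration. It only shows how drawing 40 texts from each decade between 1530 and 1799 yields 1080 texts.

```python
import random
from collections import defaultdict


def sample_per_decade(records, per_decade=40, start=1530, end=1799, seed=0):
    """Draw up to `per_decade` text IDs at random from each decade.

    `records` is an iterable of (text_id, year) pairs. This helper is
    hypothetical and is not part of any VEP tooling.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    by_decade = defaultdict(list)
    for text_id, year in records:
        if start <= year <= end:
            by_decade[year // 10 * 10].append(text_id)

    sample = []
    for decade in sorted(by_decade):
        pool = by_decade[decade]
        # If a decade has fewer texts than requested, take them all.
        k = min(per_decade, len(pool))
        sample.extend(rng.sample(pool, k))
    return sample


# Made-up metadata: 100 placeholder texts for every year from 1530 to 1799.
records = [
    (f"text-{year}-{i}", year)
    for year in range(1530, 1800)
    for i in range(100)
]

sample = sample_per_decade(records)
print(len(sample))  # 27 decades x 40 texts = 1080
```

Sampling a fixed number per decade, rather than sampling from the whole date range at once, keeps decades with many surviving texts from dominating the corpus.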

Credits: Metadata prepared by Mattie Burkert and Katie Lanning, under the supervision of Michael Witmore and Robin Valenza. XML files processed and curated by Deidre Stuffer for release as plain text files.