VEP Early Modern Science Collection
VEP proudly presents two corpora of early modern scientific writing, curated by Alan Hogarth. They are released under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Before downloading the corpora, read about the format of the text files and about our text processing workflow. The corpora are generated from Text Creation Partnership (TCP) XML files.
Please note that the downloadable corpora do not contain texts from EEBO-TCP Phase II, which will not enter the public domain until five years after the completion of the TCP project for Phase II. However, metadata for EEBO-TCP Phase II texts is available for download.
Early Modern Science Corpora
The Big Names of Science
This corpus contains 329 early modern scientific texts by 100 ‘Big Name’ authors, published between 1530 and 1724. The authors were selected on the basis of their reputation and influence as early modern writers who address scientific subjects.
- Download the Big Names of Science SimpleText plain text files
  - zip contents: 272 unrestricted SimpleText plain text files; 1 metadata csv; README for Big Names of Science Metadata.pdf; README_SimpleText_files.txt; tcp_restricted.txt
  - size: 35 MB zipped; 99.3 MB unzipped
- Download Big Names of Science Metadata (via the Metadata Builder)
- Download Big Names of Science Metadata README (PDF)
- Download Big Names of Science Ubiqu+Ity Tokens Files
  - zip contents: 272 Ubiqu+Ity Tokens csvs; TextViewer.html; README_Ubiquity_tokens_files.txt
  - size: 80.7 MB zipped; 480 MB unzipped
- Download the Big Names of Science 1-Grams (csv, right-click save as)
The Super Science
The Super Science corpus lists information about every ‘scientific’ text from EEBO-TCP Phases I and II, ECCO-TCP, and Evans-TCP. The corpus comprises 1,979 texts and covers the period 1482-1710.
- Download the Super Science SimpleText plain text files
  - zip contents: 1,130 unrestricted SimpleText plain text files; 1 metadata csv; README for Super Science Metadata.pdf; README_SimpleText_files.txt; tcp_restricted.txt
  - size: 143 MB zipped; 397 MB unzipped
- Download Super Science Metadata (via the Metadata Builder)
- Download Super Science Metadata README (PDF)
- Download Super Science Ubiqu+Ity Tokens Files
  - zip contents: 1,130 Ubiqu+Ity Tokens csvs; TextViewer.html; vep_super_science_v2_ubiq321_ds.csv; README_Ubiquity_tokens_files.txt
  - size: 213 MB zipped; 1.88 GB unzipped
- Download the Super Science 1-Grams (csv, right-click save as)
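After unzipping either SimpleText download, a quick check of the folder against the zip contents listed above can catch an incomplete extraction. The following Python sketch is only an illustration under stated assumptions: the function name `summarize_corpus` is hypothetical, the internal folder layout of the zip is not documented here (so the search is recursive), the metadata csv is assumed to be the only csv in the folder, and its encoding and column names are not assumed beyond being UTF-8-readable with a header row.

```python
import csv
from pathlib import Path

# File names taken from the zip contents listed above; these are not corpus texts.
NON_TEXT_FILES = {"README_SimpleText_files.txt", "tcp_restricted.txt"}


def summarize_corpus(folder):
    """Count corpus texts and metadata rows in an unzipped SimpleText download.

    Hypothetical helper: the folder layout and csv structure are assumptions.
    """
    root = Path(folder)
    # Search recursively because the zip's internal layout is an assumption.
    texts = sorted(p for p in root.rglob("*.txt") if p.name not in NON_TEXT_FILES)
    csvs = sorted(root.rglob("*.csv"))

    columns, rows = [], 0
    if csvs:
        # "utf-8-sig" reads plain UTF-8 and also tolerates a leading byte-order mark.
        with csvs[0].open(newline="", encoding="utf-8-sig") as handle:
            reader = csv.DictReader(handle)
            columns = list(reader.fieldnames or [])  # header row
            rows = sum(1 for _ in reader)            # remaining data rows

    return {
        "text_files": len(texts),
        "metadata_columns": columns,
        "metadata_rows": rows,
    }
```

Calling `summarize_corpus` on the unzipped folder returns a small dictionary. If the layout matches the listing above, `text_files` should come to 272 for the Big Names of Science and 1,130 for the Super Science; a different count suggests either an incomplete extraction or a layout that differs from these assumptions.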
Credits: Metadata prepared by Alan Hogarth, with supervision by Jonathan Hope. XML files processed and curated by Deidre Stuffer for release as plain text files.