The AI for Humanists project is developing resources to enable DH scholars to explore how large language models and AI technologies can be used in their research and teaching. Find an annotated bibliography of research papers and tools, a glossary of relevant terms, code tutorials, and information about our workshops.
Developed through collaboration among various institutions and projects, CATMuS provides an inter-compatible dataset covering more than 200 manuscripts and incunabula in 10 different languages, comprising over 160,000 lines of text and 5 million characters and dating from the 8th century to the 16th.
The Corpus Glossariorum Latinorum Online (CGLO) provides digital access to the Corpus Glossariorum Latinorum (CGL, 1888–1923) and relevant archival material at the Thesaurus linguae Latinae (TLL).
AGAPE is an open-access database which aims to map the reception of the Greek Church Fathers in print throughout early modern Europe. It represents the main outcome of the four-year FNS Ambizione project The Greek Imprint on Europe: Patristics and Publishing in the Early Swiss Reformation, led by Paolo Sachet and based at the Institut d’histoire de la Réformation, University of Geneva.
St. Catherine's Monastery of the Sinai, in partnership with the Early Manuscripts Electronic Library (EMEL) and the UCLA Library, welcomes you to the Sinai Manuscripts Digital Library. Widely recognized as the world’s oldest continually operating library, St. Catherine’s holdings represent an unparalleled resource to study the history and literature of the Eastern Mediterranean from late antiquity until early modernity.
odyCy is a state-of-the-art NLP library for Ancient Greek, capable of part-of-speech tagging, morphological analysis, dependency parsing, lemmatization, and more.
Whether Old English, Early New High German, or Old Norse: in his font project "Elstob", Peter Baker brings together the almost forgotten glyphs of medieval languages. The result is an extensively developed typeface family that is free to download.
The string2string library is an open-source tool that offers a comprehensive suite of efficient algorithms for a broad range of string-to-string problems. It includes both traditional algorithmic solutions and recent advanced neural approaches to address various problems in pairwise string alignment, distance measurement, lexical and semantic search, and similarity analysis. Additionally, the library provides several helpful visualization tools and metrics to facilitate the interpretation and analysis of these methods.
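As a rough illustration of the "distance measurement" family of problems mentioned above, here is a minimal pure-Python sketch of the classic Levenshtein edit distance. It does not use string2string and does not reflect that library's API; the function name `levenshtein` is chosen here for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # prev[j] holds the distance between the already processed prefix of a
    # and the prefix b[:j]; it starts as the distance from the empty string.
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,         # delete ch_a
                curr[j - 1] + 1,     # insert ch_b
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Keeping only two rows of the dynamic-programming table holds memory proportional to the length of the second string, which is usually enough when only the distance, and not the alignment itself, is needed.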
Quo vadam? Start here! / How to Use Scrapbox? / Social Networks (Current) / Social Networks (Research) / Initials / Arrows in Manuscripts / Cats and Manuscripts / Standoff Annotation (TEI) / Marginal
The Lat-Epig interface allows you to query the EDCS and save the search result in a TSV file and plot the results on a map of the Roman Empire without any prior knowledge of programming. (GitHub: mqAncientHistory/Lat-Epig)
PyGWalker: Turn your pandas dataframe into a Tableau-style User Interface for visual analysis. (GitHub: Kanaries/pygwalker)
In this tutorial, you'll explore the different ways of creating and modifying PDF files in Python. You'll learn how to read and extract text, merge and concatenate files, crop and rotate pages, encrypt and decrypt files, and even create PDFs from scratch.
The Archives de littérature du Moyen Âge (ARLIMA) were founded for students and researchers specializing in the Middle Ages, for whom compiling a bibliography on an author or a text has become an increasingly arduous task, owing to the proliferation not only of publications but also of the printed and electronic bibliographic tools at their disposal.
The DFG Practical Guidelines on Digitisation (DFG-Praxisregeln „Digitalisierung“) are a central foundation for DFG-funded projects in the „Digitalisierung und Erschließung“ (Digitisation and Indexing) programme: they set out standards and provide information on organisational, methodological, and technical questions relating to the digitisation and indexing of objects relevant to research. They thereby make an important contribution to the sustainability, accessibility, and interoperability of funded projects and of the infrastructure that emerges from them. This document is an updated version of the guidelines last published by the DFG in 2016. It was prepared, in consultation with the DFG Head Office, by a group of authors convened by the NFDI consortium NFDI4Culture; most of its members have long been involved in shaping the guidelines and are actively engaged in the NFDI consortia NFDI4Culture, NFDI4Memory, NFDI4Objects, and Text+. The revised guidelines serve as a starting point for the communities to differentiate them further by material and by community. All communities and institutions involved in digitising research-relevant objects are invited to contribute their expertise to the ongoing process.
Beautiful visualizations of how language differs among document types. (GitHub: JasonKessler/scattertext)
The Virtuelle Schatzkammer (Virtual Treasury) of the Stadtbibliothek im Bildungscampus Nürnberg is an online platform offering digitised copies of historically valuable cultural heritage and of writings relevant to local history (Norica). These include manuscripts and printed works in the public domain, dating from the 15th century into the 20th. As digitisation proceeds, the stock of digitally accessible resources is continually expanded.
Research on ancient written artefacts produces an ever-increasing amount of digital data in different forms, ranging from raw images of artefacts to automatically generated data from advanced acquisition techniques. The manual analysis of this data is typically time-consuming and can be subject to human error and bias. Therefore, a set of Pattern Analysis Software Tools (PAST) has been developed for the automatic analysis of visual and tabular patterns in the research data from the study of ancient written artefacts. These software tools have been developed by Hussein Mohammed to facilitate a more efficient study of written artefacts and to help scholars benefit from the rapid advancements in the fields of pattern analysis and artificial intelligence. Furthermore, these tools can provide new insights which can only be derived from the statistical analysis of research data. Each tool in PAST is developed and tested in close collaboration with experts from relevant fields of research in order to ensure its usability and applicability to actual research questions.
The BERT for Humanists project is developing resources to enable DH scholars to explore how BERT-like models can be used in their research and teaching. Find an annotated bibliography of research papers and tools, a glossary of relevant terms, code tutorials, and information about our virtual workshop in June 2021.
Craft batch publishing with serverless IIIF for digital research and scholarship.
Aperitiiif is a workflow and set of components for batch publishing IIIF-compliant image collections. It addresses the needs of research and scholarly collections—needs often distinct from collections formally acquired and stewarded by research institutions.
CRMtex is an extension of CIDOC CRM created to support the study of ancient documents by identifying relevant textual entities and by modelling the scientific process related to the investigation of ancient texts and their features, in order to foster integration with other cultural heritage research fields, such as archaeology and history.
In this step-by-step tutorial, you'll learn how to use spaCy. This free and open-source library for natural language processing (NLP) in Python has a lot of built-in capabilities and is becoming increasingly popular for processing and analyzing data in NLP.
Repo for the Harvard / LaTeX Ninja "Beyond TEI" workshop, May 2022. (GitHub: sarahalang/Harvard_BeyondTEI_Workshop_SLang2022)
C. Schroeder and A. Zeldes. (2019). arXiv:1912.05082. Comment: 9 pages; paper presented at the Stanford University CESTA Workshop "Collecting, Preserving and Disseminating Endangered Cultural Heritage for New Understandings Through Multilingual Approaches".
A. von Stockhausen. Kirche und Kaiser in Antike und Spätantike, volume 136 of Arbeiten zur Kirchengeschichte, Walter de Gruyter, Berlin; Boston (2017).