Investigating the Origins of Islamicate Manuscripts Using Computational Methods

Project title: Investigating the Origins of Islamicate Manuscripts Using Computational Methods
Funder: Cambridge Humanities Research Grant
PI: Yasmin Faghihi, Cambridge Digital Humanities
Dates: September 2021–August 2022

Provenance is amongst the most ambiguous and, in some cases, controversial aspects of manuscript studies. Current methods for dating and placing manuscripts from the Islamicate world employ a variety of criteria, often in combination. For a minority of manuscripts, dates and places are derived from notes by former readers, owners, librarians, or other custodians. Other methods rely on investigation of physical features, such as watermark imaging and analysis, XRF, multi-spectroscopy, or radiocarbon dating of parchment. In the absence of secure dates or scientific evidence, researchers rely on codicological observations. These include paleographical analysis, layout, ruling size, and analysis of ornamentation and illustration. We therefore rely heavily on codicological data, and on the extrapolation of information from a range of witnesses, to ascertain the place and date of production and the wider history of the object.

For the past ten years we have been working collaboratively to collect manuscript descriptions sourced from legacy data and new research (with the manuscript in hand), and recording these findings in FIHRIST, a union catalogue for Islamicate manuscripts in UK repositories. We have been using TEI/XML to produce detailed descriptions of manuscripts, including physical descriptions, bibliographical information relating to the text, and contextual information. As capacity for this work varied between contributing institutions, the data varied in complexity and suffered from internal inconsistencies. In 2019 we received seed funding from CDH to embark on a massive data clean-up, deduplicating about 25% of the authority entries in preparation for analytical approaches to the dataset.
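To give a sense of what a TEI/XML description makes available for analysis, the following is a minimal sketch in Python that reads a few physical and origin fields from a manuscript description. The sample record is invented and heavily simplified (it is not schema-valid TEI and is not taken from FIHRIST), and the choice of fields and the function name `extract_features` are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# TEI elements live in this namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, heavily simplified record; not a real catalogue entry.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><sourceDesc>
    <msDesc>
      <msIdentifier><idno>MS Example 1</idno></msIdentifier>
      <physDesc>
        <objectDesc>
          <supportDesc material="chart"/>
          <layoutDesc><layout writtenLines="15"/></layoutDesc>
        </objectDesc>
      </physDesc>
      <history>
        <origin>
          <origDate when="1650">1650</origDate>
          <origPlace>Isfahan</origPlace>
        </origin>
      </history>
    </msDesc>
  </sourceDesc></fileDesc></teiHeader>
</TEI>"""


def extract_features(xml_text):
    """Return a flat dict of selected fields from the first msDesc found."""
    root = ET.fromstring(xml_text)
    ms = root.find(".//tei:msDesc", TEI_NS)

    def text(path):
        el = ms.find(path, TEI_NS)
        return el.text.strip() if el is not None and el.text else None

    def attr(path, name):
        el = ms.find(path, TEI_NS)
        return el.get(name) if el is not None else None

    return {
        "shelfmark": text(".//tei:msIdentifier/tei:idno"),
        "material": attr(".//tei:supportDesc", "material"),
        "written_lines": attr(".//tei:layout", "writtenLines"),
        "orig_date": attr(".//tei:origDate", "when"),
        "orig_place": text(".//tei:origPlace"),
    }


features = extract_features(SAMPLE)
print(features)
```

Fields that are absent from a record come back as `None`, which reflects the uneven level of detail across contributing institutions described above.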

Following this successful operation, we have now secured a £20,000 grant from CHRG to apply computational methods to the dataset and investigate questions around the provenance of items in FIHRIST. We will be using Linked Open Data to cluster descriptions of manuscripts that share codicological features, and will attempt to generate hypotheses about provenance by cross-referencing data within these clusters. The aim is to trace movements of manuscripts before the point of institutional acquisition (e.g. manuscripts produced in one area and collected in another) in cases where this history has not been recorded, in order to generate a better understanding of provenance in relation to multiple acts of ownership, appropriation and acquisition over time.
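The general idea of clustering by shared features and cross-referencing within clusters can be sketched as follows. This is an illustration under stated assumptions, not the project's method: it uses plain Python rather than Linked Open Data, it measures similarity with the Jaccard index and groups records by single linkage, and the records, feature labels, threshold and function names are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records; features and places are invented for illustration.
RECORDS = {
    "MS A": {"features": {"paper", "naskh", "15 lines", "gold rulings"},
             "place": "Shiraz"},
    "MS B": {"features": {"paper", "naskh", "15 lines", "gold rulings",
                          "catchwords"},
             "place": "Shiraz"},
    "MS C": {"features": {"paper", "naskh", "15 lines", "catchwords"},
             "place": None},
    "MS D": {"features": {"parchment", "kufic", "7 lines"},
             "place": None},
}


def jaccard(a, b):
    """Share of features two descriptions have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(records, threshold=0.5):
    """Single-linkage grouping: link any pair at or above the threshold."""
    parent = {key: key for key in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(records, 2):
        if jaccard(records[a]["features"], records[b]["features"]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for key in records:
        groups.setdefault(find(key), set()).add(key)
    return list(groups.values())


def suggest_places(records, clusters):
    """For unplaced items, propose the most common known place in the cluster."""
    suggestions = {}
    for members in clusters:
        known = Counter(records[m]["place"] for m in members
                        if records[m]["place"])
        if not known:
            continue  # nothing to cross-reference against
        place, _count = known.most_common(1)[0]
        for m in members:
            if records[m]["place"] is None:
                suggestions[m] = place
    return suggestions


clusters = cluster(RECORDS)
hypotheses = suggest_places(RECORDS, clusters)
print(clusters)
print(hypotheses)
```

A suggestion produced this way is only a hypothesis to be checked against the manuscript and other evidence; a cluster with no recorded place of production, such as the one containing only "MS D" here, yields no suggestion at all.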