Cambridge University Library has acquired digital archives from Gale Cengage, a publisher of large primary source collections, including historical documents and newspapers. These digital archives are now available within a new resource called the “Gale Digital Scholar Lab”, which is designed specifically for text mining and analysis.

Using the Lab, you can search the archives as you would on their native platforms and build content sets from the search results. You can create multiple content sets and analyse the resulting corpus with the tools provided in the Lab. The tools currently available are all open source: Topic Modelling (MALLET); Frequencies (Lucene); Clustering (scikit-learn); Parts-of-Speech Tagger (spaCy); Sentiment Analysis (OpenNLP); Named Entity Recognition (spaCy); Ngrams (Lucene). The publisher intends to expand this set over time.
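For readers new to text mining, the short Python sketch below illustrates the kind of counting that lies behind the Frequencies and Ngrams tools listed above. It is an illustration only and not the Lab's implementation (the Lab uses Lucene for these tools); the sample sentence and the helper names `tokenize` and `ngram_counts` are invented for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into runs of the letters a-z (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Invented sample text standing in for OCR output from an archive.
sample = "The Times reported the news. The Times printed the news again."

tokens = tokenize(sample)
word_freq = Counter(tokens)        # single-word frequencies
bigrams = ngram_counts(tokens, 2)  # two-word sequences (bigrams)

print(word_freq.most_common(3))
print(bigrams.most_common(2))
```

Real OCR text is much noisier than this sample, so counts produced from archive material usually need checking against the page images.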

The Lab promises to open up new possibilities for relative newcomers to digital scholarship, allowing natural language processing tools to be applied to raw text data (OCR) and so facilitating new discoveries and insights. The Lab places strong emphasis on visualising results and data, and thus lends itself to scholarly sharing and “bridging the gap between scholarly resources and faculty researchers/students”. It also helps users organise content sets, including renaming, duplicating and versioning them, and identifies the searches used to create each content set, which makes sharing and reproducing research projects easier than is usually the case.

Archives included in the Lab to which Cambridge has access for analysis are:

17th and 18th century Burney collection

19th century UK periodicals

British Library newspapers

Economist historical archive, 1843–2014

Eighteenth century collections online

Illustrated London News historical archive, 1842–2003

Making of modern law: legal treatises, 1800–1926

Nineteenth century U.S. newspapers

Times digital archive

Times literary supplement historical archive

U.S. declassified documents online

Access to the Lab is on a trial basis, to help Cambridge assess its usefulness to practitioners and to promote the resource to digital humanities scholars in Cambridge generally. Access is available now, using the details below, until 31 December 2018. We have requested further “guide”-type materials to help complete novices get started with the Lab and hope to be able to forward these soon.

The Gale Digital Scholar Lab can be accessed via the University of Cambridge. To obtain Lab access, please contact ejournals@lib.cam.ac.uk.

 

 

Posted 17 Oct 2018

Cambridge Digital Humanities

Tel: +44 1223 766886
Email: enquiries@crassh.cam.ac.uk