The Vialab team (Christopher Collins and Adam James Bradley, University of Ontario Institute of Technology) is developing a software suite of three research and browsing tools for exploring vast corpora of textual documents. The project aims to respond in an innovative way to the current and emerging needs of researchers in the humanities and social sciences (HSS). Vialab’s approach rests on a paradigm shift: rather than providing answers to pre-existing questions, the team wishes to develop tools that prompt new ones.
The first tool, called Textension, is a simple solution for interacting with textual documents in image format, which are very common in digital libraries as a result of projects to digitize archives and historical documents. Among other things, it will let users expand blank spaces to take notes, insert data visualizations, translate texts automatically, and manually clean up the OCR of these documents.
The second tool is a document analyser that uses machine learning to help researchers find relevant documents in the Érudit corpus, in French or in English. It lets users browse by the concepts and ideas found in these documents and provides various visualizations for exploring the results.
The third tool is a cultural map based on named entities extracted from the plain text of the Érudit corpus. It lets users explore and analyze networks whose nodes and dimensions they can configure themselves.