Tools: Weave & Scribe

Matthew Milner: March 3, 2016 at 13:20

I've been busy migrating two new tools for NanoHistory. Both are geared towards making life easier for users, as the main problem for NanoHistory is the density of data entry and the number of connections which need to be made quickly, easily, and accurately. This work focuses on two things: the rapid creation of new events or connections between existing nodes or entities, and the transcription or documentation of entities and events from online source materials that aren't well suited to automated processing - mainly manuscript sources. Like existing NanoHistory tools, I've given them one-word names - Weave and Scribe.

Weave

In Making Publics, in many ways NanoHistory's progenitor, we created a workbench tool consisting of overlaid canvas elements that allowed users to drag and drop entities from a defined workbench container. As Making Publics grew, it became clear that populating this 'container' of easily draggable entities was a hindrance to the rapid creation of new events. In NanoHistory I've opted instead to let users search for the required entities and drag between nodes directly within the workspace itself. Click and keyup actions call modal boxes to search NanoHistory for appropriate entities, and to define verbs for events. An event is added directly to NanoHistory upon completion. This allows users to create events rapidly by clicking and dragging connections between existing nodes. At the same time, adding a new entity to the workspace displays its existing events with entities already populating the workspace, showing existing networks prior to the creation of new events.
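The last behaviour, showing existing events when an entity joins the workspace, can be sketched roughly as follows. This is a minimal illustration rather than Weave's actual code: the `Event` record, the `events_to_display` function, and the placeholder entity names are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event record: subject --verb--> object."""
    subject: str
    verb: str
    obj: str


def events_to_display(new_entity, workspace, events):
    """Return the events linking new_entity to entities already in the workspace."""
    shown = []
    for event in events:
        if event.subject == new_entity and event.obj in workspace:
            shown.append(event)
        elif event.obj == new_entity and event.subject in workspace:
            shown.append(event)
    return shown


# Placeholder data, invented purely for illustration.
all_events = [
    Event("entity_a", "wrote to", "entity_b"),
    Event("entity_b", "visited", "entity_c"),
    Event("entity_a", "lived in", "entity_d"),
]
workspace = {"entity_b", "entity_c"}

# Dropping entity_a onto the workspace surfaces only its link to entity_b,
# because entity_d is not on the workspace.
shown = events_to_display("entity_a", workspace, all_events)
```

In the real tool the events would come from a search against NanoHistory rather than an in-memory list, but the filtering idea is the same: only connections to nodes already on the workspace are drawn.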

Weave has several limitations. First, at the moment it does not filter existing events by date, but shows all events between entities within the workspace. Secondly, and perhaps more importantly, it doesn't handle nested events. This means that users can't add events as objects to the workspace, which is a major inconvenience for complex networks. This is a limitation of the D3 force-directed graph visualization which I need to overcome elsewhere as well. Once a clear solution emerges for visualizing both a nested event and the nodes contained within it, along with their relationships to other nodes within a network visualization, I'll implement it in Weave.

Scribe

NanoHistory doesn't have any means for storing users' documents, or handling or processing them in any meaningful way. Although one of the longer term plans is to present its data in various ways to act as a gazetteer for Named Entity Recognition (NER) tools, I don't have any plans to turn NanoHistory into an NER platform. It seems easier to pre-process texts elsewhere, or create pipelines into NanoHistory from existing platforms for such activities. That said, there are a) online archives which are not suited to OCR or NER processes simply because they are too onerous or complex at the moment, and b) researchers who have little time or energy to learn the skills needed to conduct OCR or NER on them, yet are well versed in these archives and eager to use them. I'm looking at you, medievalists! And so I've ported over the old transcription tool from Making Publics to allow users to view documents, pages, or images housed elsewhere directly within NanoHistory. In order to use Scribe, users must first create a thing, and then provide the necessary link to it within NanoHistory. Scribe then uses the link in its viewer.

The tool is fairly simple - on the left is a viewer which determines what kind of link is being used: a PDF file, an HTML page, or an International Image Interoperability Framework (IIIF) manifest. For IIIF, we're using DivaJS, an image tiling and viewing library created by Andrew Hankinson in Ichiro Fujinaga's DDMAL Lab in the School of Music at McGill University. It's free and open source. On the right there are a number of forms and lists for adding new entities to NanoHistory, and collating those which are already noted as 'mentioned' or 'cited' in the document being viewed.
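To give a sense of what 'determining the kind of link' might involve, here is a rough sketch. It is not Scribe's actual logic: the function name `classify_link`, the optional `content_type` argument, and the URL heuristics are all assumptions for illustration. A robust check for IIIF would more likely fetch the link and inspect the JSON it returns.

```python
from urllib.parse import urlparse


def classify_link(url, content_type=None):
    """Guess which viewer a link needs: 'pdf', 'iiif', or 'html'.

    Heuristic only. content_type is an optional, hypothetical hint such as
    the Content-Type header returned when the link was first checked.
    """
    path = urlparse(url).path.lower()
    ctype = (content_type or "").lower()

    if path.endswith(".pdf") or ctype.startswith("application/pdf"):
        return "pdf"
    # IIIF manifests are JSON documents, commonly served at .../manifest
    # or .../manifest.json (an assumption, not a guarantee).
    if path.endswith("/manifest") or path.endswith("manifest.json") or "json" in ctype:
        return "iiif"
    # Anything else is treated as an ordinary web page.
    return "html"


kind = classify_link("https://example.org/iiif/ms-1/manifest")
```

The viewer on the left could then branch on the result, handing IIIF manifests to the image viewer and everything else to a PDF or HTML frame.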