Text Interoperability Tool

The Center for Digital Research in the Humanities at the University of Nebraska and Northwestern University's Academic and Research Technologies are pleased to announce the first fruits of a collaboration between the Abbot and EEBO-MorphAdorner projects: the release of some 2,000 18th-century texts from the TCP-ECCO collections in TEI-P5 format with linguistic annotation. More texts will follow shortly, subject to the access restrictions that will govern the use of TCP texts for the remainder of this decade. Read the full announcement or download the 2,000 ECCO texts here.

Abbot is a tool designed to convert dissimilar collections of XML texts into a common interoperable form. Abbot's key feature is the ability to read an XML schema file and output procedures that convert source files into a valid form consistent with the target schema. Abbot's schema-harvesting procedures focus on TEI, but are flexible and format-agnostic. Abbot makes no particular judgment or demand concerning the type of interoperability sought. It can transform texts into a variety of TEI schemas, and it accommodates user customization.
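The schema-driven idea described above can be illustrated with a small sketch. This is not Abbot's own code, and it makes several assumptions: the target schema is a toy W3C XML Schema rather than a real TEI schema, the function names harvest_element_names and convert are hypothetical, and the rename table stands in for whatever user customization a real conversion would need. It only shows the general pattern of harvesting permitted element names from a schema and then using them to drive and check a conversion.

```python
import xml.etree.ElementTree as ET

# Namespace prefix for W3C XML Schema elements (the toy schema below uses XSD).
XS = "{http://www.w3.org/2001/XMLSchema}"


def harvest_element_names(schema_xml: str) -> set:
    """Collect the names of all elements declared in an XSD schema string."""
    root = ET.fromstring(schema_xml)
    return {el.get("name") for el in root.iter(XS + "element") if el.get("name")}


def convert(source_xml: str, allowed: set, renames: dict) -> tuple:
    """Rename source elements per `renames` (a hypothetical, user-supplied
    mapping) and list any element names the target schema still lacks."""
    root = ET.fromstring(source_xml)
    unresolved = []
    for el in root.iter():
        el.tag = renames.get(el.tag, el.tag)
        if el.tag not in allowed:
            unresolved.append(el.tag)
    return ET.tostring(root, encoding="unicode"), unresolved


# Toy target schema: an assumption for illustration, not a real TEI schema.
TARGET_SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="text"/>
  <xs:element name="body"/>
  <xs:element name="p"/>
  <xs:element name="hi"/>
</xs:schema>"""

# Toy source document using non-target element names.
SOURCE = "<text><body><para>An <ital>old</ital> book.</para><note>n</note></body></text>"

allowed = harvest_element_names(TARGET_SCHEMA)
converted, unresolved = convert(SOURCE, allowed, {"para": "p", "ital": "hi"})
print(converted)
print(unresolved)
```

In this sketch, elements with a known counterpart are renamed, while anything the target schema does not declare is reported rather than silently dropped, so that a person can decide how to handle it.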

Abbot is more likely than conventional file-conversion methods to spot and deal with problems because it operates consistently and algorithmically across large numbers of texts. Abbot sets its course on an ambitious but sensible path: moving toward total interoperability while accepting the uniqueness of individual text collections. Abbot's method allows for different forms of interoperability, from small, one-off instances to the creation of large, permanent digital libraries.

Abbot was developed at the Center for Digital Research in the Humanities by Brian L. Pytlik Zillig, Stephen Ramsay, Martin Mueller, and Frank Smutniak. Support for Abbot was provided by the Andrew W. Mellon Foundation.

Power users may want to try the command-line version of Abbot, available here.

For further information please contact us at

Please see the Abbot software license.
