CollateX

CollateX is software that (a) reads multiple versions of a text, (b) identifies differences by aligning tokens, and (c) outputs the alignment results for further processing, for instance to support the production of a critical apparatus or the stemmatic analysis of a text's genesis.

1 mention | 4 contributors | 2279 commits | Last commit 10 months ago

What CollateX can do for you

CollateX is software that can

  1. read multiple (≥ 2) versions of a text, splitting each version into parts (tokens) to be compared,
  2. identify similarities and differences between the versions (including moved/transposed segments) by aligning tokens, and
  3. output the alignment results in a variety of formats for further processing, for instance to support the production of a critical apparatus or the stemmatic analysis of a text's genesis.
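
The read, align and output steps can be illustrated with a minimal sketch. It uses only Python's standard `difflib` module and is not CollateX's own API or alignment algorithm. The helper names `tokenize` and `align` and the two sample witnesses are assumptions made for this illustration. The sketch compares just two versions and does not detect moved/transposed segments.

```python
import difflib
import re


def tokenize(text):
    """Split a version into lower-cased word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text.lower())


def align(base, other):
    """Align two token lists and return rows of (operation, base segment, other segment)."""
    matcher = difflib.SequenceMatcher(a=base, b=other, autojunk=False)
    rows = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        rows.append((tag, " ".join(base[i1:i2]), " ".join(other[j1:j2])))
    return rows


# Two hypothetical witnesses of the same sentence.
witness_a = "The quick brown fox jumps over the dog."
witness_b = "The brown fox leaps over the lazy dog."

table = align(tokenize(witness_a), tokenize(witness_b))

# Print a simple alignment table; "-" marks a segment missing from one witness.
for tag, seg_a, seg_b in table:
    print(f"{tag:8} | {seg_a or '-':12} | {seg_b or '-'}")
```

Each row corresponds to one column of an alignment table: `equal` rows are shared text, while `delete`, `insert` and `replace` rows are the variants an editor would have to assess.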

It resembles software that computes differences between files (e.g. diff) and the sequence-alignment tools commonly used in bioinformatics. While CollateX shares some techniques and algorithms with those tools, it aims mainly for a flexible and configurable approach to finding similarities and differences in texts, sometimes trading computational soundness or complexity for the user's ability to influence results.
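
To make the comparison with bioinformatics concrete, here is a small sketch of Needleman-Wunsch global alignment, a classic sequence-alignment technique, applied to word tokens instead of nucleotides. It is offered as an illustration of that family of algorithms, not as a description of what CollateX implements. The function name and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for the example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align token lists a and b.

    Returns (pairs, score), where pairs is a list of (token_a, token_b)
    tuples and None marks a gap in one of the sequences.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per token.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def substitution(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(i, j),  # pair the two tokens
                score[i - 1][j] + gap,                     # gap in b
                score[i][j - 1] + gap,                     # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs = []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution(i, j):
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs, score[-1][-1]


pairs, total = needleman_wunsch("the black cat".split(), "the cat".split())
for left, right in pairs:
    print(f"{left or '-':6} | {right or '-'}")
```

Changing the scoring parameters changes which alignment counts as optimal, a simple instance of how user-supplied settings can influence the result.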

As such, it is primarily designed for use cases in disciplines like philology or, more specifically, the field of textual criticism, where the assessment of findings is based on interpretation and can therefore be supported by computational means but is not necessarily computable.

Please go to http://collatex.net/ for further information.

Programming languages
  • HTML 44%
  • Java 35%
  • Python 12%
  • Pug 3%
  • JavaScript 2%
  • Kotlin 2%
  • Jupyter Notebook 1%
  • Other 1%

Participating organisations

Int

Contributors

Gregor Middell
Tara L. Andrews