TEI XML Conversion Services

Produce Educational, Scholarly and Research Material in TEI XML

TEI (Text Encoding Initiative) is a set of guidelines for encoding machine-readable texts, mostly in the humanities, social sciences and linguistics. The Guidelines are maintained in a source format called 'ODD' ('One Document Does it All'), which holds the TEI source code, the TEI schemas and the prose documentation in one place.

SunTec Digital can help you convert large volumes of textual and research material into TEI XML format with precision. We support academic institutions, libraries, museums, publishers and individual scholars in digitally presenting literary and linguistic texts, journals, manuscripts, historical archives, dictionaries, research papers, books, transcribed text, spoken corpora and facsimiles for teaching, web research and information security.

Producing Educational, Scholarly and Research Material in TEI XML

Drawing on our experience and technical competency, we accurately convert files from HTML, PDF and other formats into TEI XML documents, and vice versa. We can also encode the major structural and presentational features of letters, prose and poetry in TEI XML.
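As an illustration of the target format, the sketch below builds a minimal TEI document (a teiHeader plus a text body) from a title and a list of paragraphs using Python's standard library. The function name build_minimal_tei and the placeholder header wording are assumptions made for this example; it is not a description of SunTec Digital's actual tooling, and a real conversion would carry far richer metadata and markup.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialize TEI elements with a default namespace instead of an "ns0:" prefix.
ET.register_namespace("", TEI_NS)


def tei(tag):
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def build_minimal_tei(title, paragraphs):
    """Hypothetical helper: wrap plain paragraphs in a minimal TEI skeleton."""
    root = ET.Element(tei("TEI"))

    # The header's fileDesc needs a titleStmt, publicationStmt and sourceDesc.
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title
    publication = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(publication, tei("p")).text = "Unpublished sample."  # placeholder
    source = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source, tei("p")).text = "Converted from a plain-text source."  # placeholder

    # The transcribed content goes into text/body as paragraphs.
    text = ET.SubElement(root, tei("text"))
    body = ET.SubElement(text, tei("body"))
    for paragraph in paragraphs:
        ET.SubElement(body, tei("p")).text = paragraph

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_minimal_tei("Sample letter", ["Dear Sir,", "Yours faithfully,"]))
```

The output is only a skeleton; it would still need to be validated against the project's chosen schema.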

The TEI XML conversion specialist team at SunTec Digital strictly adheres to the TEI guidelines for text encoding:

  • Using specialist vocabularies like SVG, MathML and XInclude, whenever required
  • Compiling the collection of TEI elements, with their descriptions, examples, attributes, content models and datatypes, constraints, usage notes, equivalences, etc.
  • Managing the infrastructure of model and attribute classes

Our teams are adept at working with the following tools and technologies to transform the content in TEI XML:

  • XSLT for XML transformation
  • XSL-FO for XML document formatting
  • XHTML, CSS, JavaScript, JSON for web display
  • XQuery, XPath, XForms, RDF and SPARQL for searching and indexing
  • Tools like oXygen, TEI Boilerplate, Omeka, OxGarage, eXist-db, etc., to simplify the TEI XML conversion process
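To give a sense of how XPath-style queries support searching and indexing, here is a minimal sketch that pulls every persName out of a TEI fragment. It uses Python's xml.etree.ElementTree, which implements only a subset of XPath; the sample fragment and the function name list_person_names are invented for illustration, and the fragment omits the teiHeader a complete TEI document would need.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace mapping used in the query below.
NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented fragment for illustration only (not a complete TEI document).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p><persName>Ada Lovelace</persName> wrote to <persName>Charles Babbage</persName>.</p>
  </body></text>
</TEI>"""


def list_person_names(tei_xml):
    """Hypothetical helper: return the text of every persName, in document order."""
    root = ET.fromstring(tei_xml)
    return [element.text for element in root.findall(".//tei:persName", NS)]


if __name__ == "__main__":
    for name in list_person_names(SAMPLE):
        print(name)
```

A full XPath or XQuery engine would be needed for anything beyond simple path lookups of this kind.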

TEI XML Conversion: Our Workflow

To facilitate interchange of scholarly information, our teams can create documents in TEI XML format, conforming to P5, the latest version of the TEI Guidelines, or as requested by the client.

Step One: Document Analysis

  • Deciding the number of items that are to be reviewed
  • Looking through the materials and preparing a list of all the 'features' present
  • Studying the content & structure of the documents

Step Two: Requirement Assessment

  • Understanding the type of documents used by the client
  • Analyzing the content of reference queries

Step Three: TEI Tag Set Compilation

  • Comparing text features with the project requirements to find out referencing, structural and semantic needs
  • Sample encoding

Step Four: Encoding Guidelines to Follow

  • Devising a solution corresponding to the source files, TEI version standard, document structure, etc.
  • File naming scheme
  • DTD or Schema
  • Tag list
  • Character encoding
  • Attachments
  • XML/TEI Sample
  • Normalization, authority control & cross-referencing
  • Structural & semantic features that require additional explanation

Step Five: Text Encoding

  • Digitizing the original files and documents
  • Performing both manual & automated Quality Control on image files, and documenting the results
  • Running auto QC on XML files
  • Performing automated DTD and Schematron validation on XML files, and making the necessary changes
  • Verifying the TEI file names
  • Uploading the encoded file to the client server
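The automated checks in this step could start with something like the sketch below, which tests a file name against a naming scheme, confirms the XML is well-formed, and looks for a TEI root element with a teiHeader. The naming pattern and the function name quick_qc are assumptions made for this example, not the scheme any particular project uses, and these checks do not replace DTD, schema or Schematron validation.

```python
import re
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical naming scheme: lowercase alphanumeric parts joined by
# underscores, ending in ".xml" (for example "letter_001.xml").
FILE_NAME_PATTERN = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*\.xml$")


def quick_qc(file_name, xml_text):
    """Hypothetical helper: return a list of problems found (empty if none)."""
    problems = []

    if not FILE_NAME_PATTERN.match(file_name):
        problems.append("file name does not match the naming scheme")

    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Nothing else can be checked if the file does not parse.
        problems.append(f"not well-formed: {exc}")
        return problems

    # Assumes single-document files; a teiCorpus root would need its own rule.
    if root.tag != f"{{{TEI_NS}}}TEI":
        problems.append("root element is not TEI in the TEI namespace")
    if root.find(f"{{{TEI_NS}}}teiHeader") is None:
        problems.append("missing teiHeader")

    return problems


if __name__ == "__main__":
    sample = '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader/><text/></TEI>'
    print(quick_qc("letter_001.xml", sample))
```

Returning a list of problems rather than raising on the first one makes it easy to log every issue per file in a QC report.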

Final Step: Delivering TEI-conformant documents, validated against the TEI Schema.

Getting Started!

To find out how SunTec Digital can help you create and convert files into TEI XML-conformant documents, kindly email us at info@suntecdigital.com.