Transcription Standards

Old Uyghur manuscripts are transcribed differently across scholars and traditions — Clauson, Gabain, Erdal, and Chinese sources each have their own conventions. BITIG Data adopts a unified Latin transcription based on international Turcological practice.


Digitization Workflow

This is a prototype workflow illustrating how raw manuscript materials can be transformed into structured, searchable digital data. We do not claim to have a fully automated pipeline — this is a demonstration of the process.

1. Source Acquisition: PDF scans, manuscript photos, published editions
2. Text Recognition: Manual transcription or assisted OCR of Old Uyghur script
3. Transcription Normalization: Convert source-specific conventions to unified BITIG standard
4. Lexical Annotation: Segment, gloss, tag POS, link to dictionary entries
5. Web Presentation: Static site that is searchable, browseable, citable, extensible
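
As a rough illustration of the Transcription Normalization step, the sketch below rewrites source-specific spellings into a single target form using per-source replacement tables. The source labels (`source_a`, `source_b`), the replacement pairs, and the function name `normalize` are all hypothetical and chosen only for demonstration; they are not the actual BITIG rules, nor the conventions of any particular scholar.

```python
import unicodedata

# Hypothetical replacement tables, one per source convention.
# These pairs are illustrative assumptions, not the real BITIG mapping.
SOURCE_RULES = {
    "source_a": [("ng", "ŋ"), ("sh", "š"), ("ch", "č")],
    "source_b": [("ñ", "ŋ"), ("ş", "š"), ("ç", "č")],
}


def normalize(text: str, source: str) -> str:
    """Rewrite `text` from the given source convention into the unified form."""
    # Compose combining marks first so that visually identical letters compare equal.
    text = unicodedata.normalize("NFC", text)
    # Apply the replacements for this source in the order they are listed.
    for old, new in SOURCE_RULES[source]:
        text = text.replace(old, new)
    return text
```

With these assumed tables, `normalize("tängri", "source_a")` gives `täŋri` and `normalize("kişi", "source_b")` gives `kiši`. Plain string replacement is deliberately naive: it would also rewrite an `n` followed by `g` across a morpheme boundary, so a real rule set would likely need context-sensitive rules and manual review.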

Current scope: curated samples (5 texts, ~20 dictionary entries). The pipeline is designed to be extensible: new texts and lexical entries can be added by appending to the JSON data files, with no database or server-side processing required.
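
Adding an entry by appending to a JSON data file could look roughly like the sketch below. It assumes each data file holds a single JSON array of objects and that every object carries a unique `id` field; the function name `append_entry`, the field names, and any file name passed to it are hypothetical, since the actual BITIG file layout is not described here.

```python
import json
from pathlib import Path


def append_entry(path, entry, key="id"):
    """Append `entry` to the JSON array stored at `path` and return the new count.

    Assumes the file contains one JSON array of objects (a hypothetical layout).
    """
    file = Path(path)
    # Start from an empty list if the data file does not exist yet.
    entries = json.loads(file.read_text(encoding="utf-8")) if file.exists() else []
    # Refuse to add a second entry with the same key value.
    if any(existing.get(key) == entry[key] for existing in entries):
        raise ValueError(f"duplicate {key}: {entry[key]}")
    entries.append(entry)
    # ensure_ascii=False keeps letters such as ŋ and š readable in the file.
    file.write_text(
        json.dumps(entries, ensure_ascii=False, indent=2) + "\n",
        encoding="utf-8",
    )
    return len(entries)
```

A call such as `append_entry("dictionary.json", {"id": "täŋri", "pos": "noun", "gloss": "heaven, god"})` would then add one record to the file. Rewriting the whole file on each call is acceptable at the current scale of a few dozen entries and keeps the data usable by a static site without a server.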