
Tibetan Script Classification Benchmark

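Holdout benchmark for Tibetan script classification, covering six Tibetan script classes plus multiscript and non_tibetan (8 classes in total). It contains a test split only and is not used during training. All images are BDRC manuscript page scans, balanced by class (60 images each); subclass counts vary.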

| Class | Images | Subclasses |
|---|---|---|
| Danyig | 60 | DraDring: 25, DraRing: 9, Drathung: 17, Gongshabma: 3, Tsegdrig: 6 |
| Druma | 60 | Dhumri: 22, DruDring: 20, DruRing: 10, Druchen: 2, Druthung: 6 |
| Gyuyig | 60 | Khyuyig: 31, Tsumachug: 15, Yigchung: 14 |
| Pedri | 60 | Peri: 44, Petsuk: 16 |
| Tsugdri | 60 | Trinyig: 36, Tsugchung: 14, Tsugthung: 10 |
| Uchen | 60 | Uchen SugDring: 53, Uchen SugRing: 3, Uchen Sugthung: 4 |
| multiscript | 60 | Multi-Scripts: 60 |
| non_tibetan | 60 | Other: 60 |

Total: 480 images across 8 classes.
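The class and subclass counts above can be checked against the loaded rows. The following is a minimal sketch, not part of the benchmark tooling: the helper names `summarize` and `check_balance` are hypothetical, and it assumes each row exposes `script` and `script_type` as plain strings, as in the schema.

```python
from collections import Counter

# Expected images per class, taken from the table above.
EXPECTED_PER_CLASS = 60


def summarize(rows):
    """Count rows per class and per (class, subclass) pair.

    `rows` is any iterable of dict-like records with "script" and
    "script_type" keys (hypothetical helper, not a benchmark API).
    """
    rows = list(rows)
    per_class = Counter(row["script"] for row in rows)
    per_subclass = Counter((row["script"], row["script_type"]) for row in rows)
    return per_class, per_subclass


def check_balance(per_class, expected=EXPECTED_PER_CLASS):
    """Return only the classes whose count differs from `expected`."""
    return {name: n for name, n in per_class.items() if n != expected}
```

Passing the loaded test split to `summarize` (ideally after dropping the image column so rows are cheap to iterate) should reproduce the table, and `check_balance` should then return an empty dict.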

Parquet schema

| Column | Type | Description |
|---|---|---|
| id | string | BDRC page id (e.g. W00KG09391-I00KG093950005) |
| image_bytes | binary | JPEG/PNG page image |
| script | string | One of: Danyig, Druma, Gyuyig, Pedri, Tsugdri, Uchen, multiscript, non_tibetan |
| script_type | string | Sub-script / subclass name (e.g. DraDring, Multi-Scripts, Other) |
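As an illustration of the schema, a row can be sanity-checked with the standard library alone. This is a sketch under assumptions: `sniff_image_format` and `validate_row` are hypothetical helpers, and `image_bytes` is assumed to arrive as raw bytes, as the schema states. The format check relies only on the standard PNG and JPEG file signatures.

```python
# Standard file signatures for the two formats named in the schema.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"

# Class names copied from the `script` column description.
SCRIPTS = {
    "Danyig", "Druma", "Gyuyig", "Pedri",
    "Tsugdri", "Uchen", "multiscript", "non_tibetan",
}


def sniff_image_format(data: bytes) -> str:
    """Return "png", "jpeg", or "unknown" based on the leading bytes."""
    if data.startswith(PNG_MAGIC):
        return "png"
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    return "unknown"


def validate_row(row: dict) -> list:
    """Return a list of human-readable problems; empty means the row looks fine."""
    problems = []
    if not isinstance(row.get("id"), str) or not row["id"]:
        problems.append("id must be a non-empty string")
    if row.get("script") not in SCRIPTS:
        problems.append("script is not one of the eight class names")
    if not isinstance(row.get("script_type"), str):
        problems.append("script_type must be a string")
    data = row.get("image_bytes")
    if not isinstance(data, (bytes, bytearray)):
        problems.append("image_bytes must be raw bytes")
    elif sniff_image_format(bytes(data)) == "unknown":
        problems.append("image_bytes is neither JPEG nor PNG")
    return problems
```

If the loader decodes the image column into an image object instead of raw bytes, the `image_bytes` check would need to be adapted accordingly.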

Load in Python

from datasets import load_dataset

ds = load_dataset("BDRC/tibetan-script-classification-benchmark", split="test")
print(len(ds))  # 480

row = ds[0]
# row["id"], row["image_bytes"], row["script"]

Evaluate a model

from experiments.benchmark_eval.eval import run_benchmark
run_benchmark(model, repo_id="BDRC/tibetan-script-classification-benchmark")
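The internals of `run_benchmark` are not described here, so the following is only a sketch of the kind of metrics such an evaluation might report, not its actual implementation. The function names are hypothetical, and labels are assumed to be the class names from the `script` column.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly predicted rows for each true class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


def summarize_accuracy(y_true, y_pred):
    """Overall accuracy, macro-averaged accuracy, and the per-class breakdown."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("no labels to score")
    per_class = per_class_accuracy(y_true, y_pred)
    overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    macro = sum(per_class.values()) / len(per_class)
    return {"overall": overall, "macro": macro, "per_class": per_class}
```

Because every class has the same number of images (60), overall and macro-averaged accuracy coincide on the full benchmark; the per-class breakdown is what shows which scripts a model confuses.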

Citation

@misc{bdrcscriptbenchmark,
  title  = {Tibetan Script Classification Benchmark},
  author = {Buddhist Digital Resource Center and OpenPecha},
  year   = {2026},
  url    = {https://huggingface.co/datasets/BDRC/tibetan-script-classification-benchmark},
  note   = {Images from BDRC. MIT.}
}

Acknowledgements

Images from the Buddhist Digital Resource Center (BDRC). Developed by Dharmaduta / OpenPecha.
