satellite-disruption-triage-aux-v1-3
Civilian Conflict-Disruption Satellite VLM Dataset — Auxiliary / v1.3
This is an auxiliary dataset for training and evaluating Vision-Language Models (VLMs) to perform civilian conflict-disruption triage from paired satellite imagery. It is not a tactical intelligence dataset and not a canonical expert benchmark.
Scope & Purpose
The target task is detecting macro-scale civilian infrastructure disruption caused by war, armed conflict, bombardment, shelling, explosions, siege, or major unrest. The model compares a baseline (pre-event) satellite image with a current (post-event) image and outputs a structured triage decision.
Included civilian infrastructure:
- Hospitals, schools, residential/civilian building clusters
- Food/logistics warehouses, grain silos, markets, aid hubs
- Ports, bridges, water facilities, power/desalination plants
- IDP camps, public infrastructure
Explicitly excluded:
- Military bases, weapons systems, troop positions, air defenses
- Tactical route intelligence, target ranking, strike planning data
Dataset Version
- Version: 1.3.0
- Total examples: 3,332
- Conflict-core examples: 3,272 (2,127 train / 1,145 eval)
- Non-conflict auxiliary examples: 60 (42 train / 18 eval) — capped at <2% of total
- Image resolution: 512×512 PNG
- License: CC-BY-NC-4.0 (Maxar Open Data, non-commercial)
- Build date: 2026-04-25
Schema
Flat JSONL (train_flat.jsonl, eval_flat.jsonl)
Each row:
{
"example_id": "string",
"baseline_image": "images/baseline/...png",
"current_image": "images/current/...png",
"target_output": {
"action": "discard | defer | downlink_now",
"category": "conflict_building_damage | conflict_hospital_damage | conflict_food_logistics_damage | conflict_water_infrastructure_damage | conflict_bridge_or_access_damage | conflict_port_or_silo_damage | conflict_urban_area_damage | explosion_damage | no_visible_disruption | ambiguous_or_low_visibility | other_conflict_civilian_disruption",
"rationale": "short sentence",
"bbox_norm": [x_min, y_min, x_max, y_max] | null
},
"source_dataset": "string",
"source_event": "string",
"source_image_name": "string",
"provenance": "source URL or dataset reference",
"modality": "optical-to-optical | optical-to-SAR | SAR-to-SAR | other",
"location_name": "string",
"country": "string",
"conflict_context": "short string",
"baseline_date": "YYYY-MM-DD or null",
"current_date": "YYYY-MM-DD or null",
"license": "string",
"label_method": "mask-derived | metadata-derived | manual-review | vlm-assisted | weak-label",
"damage_ratio": float | null,
"destruction_ratio": float | null,
"subset": "conflict_core | non_conflict_auxiliary"
}
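A row in this schema can be sanity-checked before training. The following is a minimal sketch, not part of the dataset tooling: the helper names `validate_flat_row` and `bbox_to_pixels` are hypothetical, and it assumes `bbox_norm` values are fractions of image width and height in the range [0, 1].

```python
ACTIONS = {"discard", "defer", "downlink_now"}
SUBSETS = {"conflict_core", "non_conflict_auxiliary"}


def validate_flat_row(row):
    """Return a list of problems found in one flat-JSONL row (empty if none)."""
    problems = []
    target = row.get("target_output") or {}
    if target.get("action") not in ACTIONS:
        problems.append("unknown action")
    if row.get("subset") not in SUBSETS:
        problems.append("unknown subset")
    bbox = target.get("bbox_norm")
    if bbox is not None:
        # Assumption: coordinates are normalized to [0, 1] and ordered
        # [x_min, y_min, x_max, y_max] as in the schema above.
        ok = (
            len(bbox) == 4
            and all(0.0 <= v <= 1.0 for v in bbox)
            and bbox[0] < bbox[2]
            and bbox[1] < bbox[3]
        )
        if not ok:
            problems.append("invalid bbox_norm")
    return problems


def bbox_to_pixels(bbox_norm, size=512):
    """Scale a normalized box to integer pixel coordinates on a size x size tile."""
    return [round(v * size) for v in bbox_norm]
```

With the 512×512 tiles described above, `bbox_to_pixels` maps a normalized box directly onto pixel coordinates for cropping or overlay.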
SFT JSONL (train_sft.jsonl, eval_sft.jsonl)
Conversational format for VLM fine-tuning:
{
"example_id": "string",
"images": ["images/baseline/...png", "images/current/...png"],
"messages": [
{"role": "system", "content": "You are a civilian conflict-disruption satellite triage model. Return strict JSON only."},
{"role": "user", "content": "Compare the baseline and current satellite images. Focus only on macro-scale civilian disruption caused by conflict or explosion. Return action, category, rationale, and bbox_norm."},
{"role": "assistant", "content": "{ strict JSON target_output }"}
],
"source_dataset": "string",
"provenance": "string"
}
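The relationship between the two formats can be sketched as a conversion from a flat row to an SFT record. This is an illustration only: `flat_to_sft` is a hypothetical helper, and serializing `target_output` with `json.dumps` is an assumption, since the card does not state how the assistant string was produced.

```python
import json

# Prompts copied from the SFT example above.
SYSTEM_PROMPT = (
    "You are a civilian conflict-disruption satellite triage model. "
    "Return strict JSON only."
)
USER_PROMPT = (
    "Compare the baseline and current satellite images. Focus only on "
    "macro-scale civilian disruption caused by conflict or explosion. "
    "Return action, category, rationale, and bbox_norm."
)


def flat_to_sft(row):
    """Build an SFT-style record from one flat-JSONL row."""
    return {
        "example_id": row["example_id"],
        # Baseline image first, current image second, as in the schema.
        "images": [row["baseline_image"], row["current_image"]],
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": USER_PROMPT},
            # Assumption: the target is stored as a JSON string.
            {"role": "assistant", "content": json.dumps(row["target_output"])},
        ],
        "source_dataset": row["source_dataset"],
        "provenance": row["provenance"],
    }
```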
Action Definitions
- downlink_now: Clear macro-visible civilian disruption from conflict/explosion (buildings destroyed, widespread damage)
- defer: Plausible but ambiguous conflict disruption (partial damage, smoke, low visibility, SAR/optical mismatch)
- discard: No visible disruption, weak evidence, non-civilian target, or invalid pair
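When scoring a model against these three actions, a reply that is not strict JSON or that names an unknown action has to be handled somehow. A minimal sketch of one option follows; `parse_triage_reply` is a hypothetical helper, and treating malformed replies as `None` is an assumption rather than the dataset's evaluation protocol.

```python
import json

VALID_ACTIONS = ("discard", "defer", "downlink_now")


def parse_triage_reply(text):
    """Return the parsed reply dict, or None if it is not valid JSON
    containing one of the three known actions."""
    try:
        reply = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(reply, dict) or reply.get("action") not in VALID_ACTIONS:
        return None
    return reply
```

Counting `None` results separately from wrong actions keeps formatting failures distinguishable from triage errors.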
Split Policy
Event-held-out evaluation. No event family, city damage campaign, or near-duplicate tile appears in both train and eval.
- Ukraine: Split by city/district (18 cities → train, 9 cities → eval)
- Beirut explosion: Exclusively in eval
- Bata explosion: Exclusively in train
- Auxiliary natural disasters: Events split across train/eval, no overlap
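The held-out policy above can be spot-checked on the flat rows. The sketch below assumes that the `source_event` field identifies the event family or city campaign; it does not detect near-duplicate tiles, which would need image-level comparison. `shared_events` is a hypothetical helper name.

```python
def shared_events(train_rows, eval_rows, key="source_event"):
    """Return the sorted values of `key` that occur in both splits."""
    train_events = {row[key] for row in train_rows}
    eval_events = {row[key] for row in eval_rows}
    return sorted(train_events & eval_events)
```

An empty list is consistent with the split policy; any returned value names an event present in both splits. Passing `key="location_name"` applies the same check at the city level.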
Action Distribution
| Split | downlink_now | defer | discard |
|---|---|---|---|
| Train (conflict core) | 558 (26.2%) | 1,100 (51.7%) | 469 (22.1%) |
| Eval (conflict core) | 330 (28.8%) | 539 (47.1%) | 276 (24.1%) |
| Target | 45% | 20% | 35% |
Note: The target balance was not achieved because the BRIGHT dataset contains more moderate-damage tiles (defer) than catastrophic-damage tiles (downlink_now). Labels are derived from pixel-level damage masks and were not rebalanced artificially.
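The counts and percentages in the table can be recomputed from the flat rows. This is a minimal sketch using the `subset` and `target_output.action` fields from the schema; `action_distribution` is a hypothetical helper.

```python
from collections import Counter


def action_distribution(rows, subset="conflict_core"):
    """Map each action to (count, percentage of the chosen subset)."""
    counts = Counter(
        row["target_output"]["action"] for row in rows if row["subset"] == subset
    )
    total = sum(counts.values())
    return {
        action: (n, round(100 * n / total, 1)) for action, n in counts.items()
    }
```

Comparing the result with the Target row shows how far each split is from the intended balance, for example when choosing class weights.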
Event / Country Coverage
| Country | Cities/Areas | Examples | Split |
|---|---|---|---|
| Ukraine | 27 cities (Kharkiv, Mariupol, Bucha, Irpin, Bakhmut, etc.) | 2,844 | Train+Eval |
| Lebanon | Beirut Port | 230 | Eval only |
| Equatorial Guinea | Bata | 198 | Train only |
| USA | Various (wildfires, hurricanes) | 60 | Train+Eval (aux) |
Data Sources
| Source | License | Rows | Status |
|---|---|---|---|
| BRIGHT (Kullervo/BRIGHT) | CC-BY-NC-4.0 | 3,272 conflict | Used |
| xBD (DIUx/xView2) | CC-BY-NC-4.0 | 60 auxiliary | Used |
| UNOSAT Gaza damage assessments | Proprietary | — | Excluded — no bulk download |
| UNOSAT Ukraine damage layers | Proprietary | — | Excluded — no bulk download |
| PRS/ETH Ukraine Zenodo | CC-BY-4.0 | — | Excluded — empty/missing files |
| Maxar Open Data | CC-BY-NC-4.0 | — | Excluded — raw imagery, no ML-ready labels |
Full source audit in source_audit.md.
Known Limitations
Ukraine-only armed conflict examples: The only public, redistributable conflict-damage satellite dataset with paired pre/post images and pixel labels is BRIGHT Ukraine. Gaza, Syria, Yemen, Sudan, Myanmar, Nagorno-Karabakh, Iraq, Libya, Iran, and Mexico cartel-conflict areas have no public ML-ready satellite damage benchmarks.
Optical-to-SAR modality gap: ~54% of conflict examples use pre-event optical + post-event SAR. Radar speckle and geometry differences can create false-change signals.
No per-building type labels: BRIGHT provides building damage masks but not building-type classification (hospital vs residential vs warehouse). All high-damage tiles are labeled conflict_building_damage.
Template-generated rationales: Rationale text is auto-generated from damage thresholds, not human expert review.
No exact dates: BRIGHT provides only event names, not acquisition dates.
Non-commercial license: CC-BY-NC-4.0 restricts commercial use.
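Because of the optical-to-SAR gap noted above, it can be useful to measure or isolate that modality when training or reporting results. The sketch below relies only on the `modality` and `subset` fields from the schema; `modality_share` and `filter_modality` are hypothetical helper names.

```python
def modality_share(rows, modality="optical-to-SAR", subset="conflict_core"):
    """Fraction of rows in `subset` whose modality equals `modality`."""
    selected = [row for row in rows if row["subset"] == subset]
    if not selected:
        return 0.0
    return sum(row["modality"] == modality for row in selected) / len(selected)


def filter_modality(rows, modality):
    """Keep only rows with the given modality, e.g. 'optical-to-optical'."""
    return [row for row in rows if row["modality"] == modality]
```

Reporting metrics separately per modality is one way to see whether false-change signals from radar speckle are inflating `defer` or `downlink_now` predictions.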
How to Use
from datasets import load_dataset
dataset = load_dataset("imagefolder", data_dir="satellite-disruption-triage-aux-v1-3")
Or load the JSONL directly for VLM training:
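import json
# Read one JSON object per line; the context manager closes the file.
with open("train_flat.jsonl", encoding="utf-8") as f:
    train = [json.loads(line) for line in f if line.strip()]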
Citation
@dataset{satellite_disruption_triage_v1_3,
title = {Satellite Disruption Triage Auxiliary Dataset v1.3},
author = {ChrisRPL},
year = {2026},
url = {https://huggingface.co/datasets/ChrisRPL/satellite-disruption-triage-aux-v1-3}
}
Contact
For issues or contributions, open a discussion on the Hugging Face Hub.