arxiv:2605.07630
Zhengyang Tang
tangzhy
AI & ML interests
None yet
Recent Activity
authored a paper about 14 hours ago
Safe, or Simply Incapable? Rethinking Safety Evaluation for Phone-Use Agents
submitted a paper 1 day ago
Safe, or Simply Incapable? Rethinking Safety Evaluation for Phone-Use Agents
authored a paper 9 days ago
Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows