ultrafeedback_grpo/llama32_3b_armorm_e1_bs128_g4-20260420.234605
- Policy model: meta-llama/Llama-3.2-3B-Instruct
- Reward model: RLHFlow/ArmoRM-Llama3-8B-v0.1
- Dataset: openbmb/UltraFeedback
- Algorithm: GRPO
- Epochs: 1
- Learning rate: 3e-6
- Warmup ratio: 0.1
- Scheduler: cosine
- Global train batch size: 128
- Group size: 4
- Max response length: 1024
- Local checkpoint root: /home/jovyan/kraft/outputs/ultrafeedback_grpo_llama32_3b/runs/ultrafeedback_grpo/llama32_3b_armorm_e1_bs128_g4-20260420.234605/checkpoints
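
As a rough illustration of how two of these settings are commonly interpreted, the sketch below shows group-relative reward normalization over a group of 4 responses and a warmup-plus-cosine learning-rate curve. It is not the training code for this run. The function names, the `eps` term, the use of population standard deviation, linear warmup, decay to zero, and the `total_steps` value in the usage note are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev

GROUP_SIZE = 4        # "Group size" from the list above
LEARNING_RATE = 3e-6  # "Learning rate" from the list above
WARMUP_RATIO = 0.1    # "Warmup ratio" from the list above


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize one prompt's rewards against its own group (GRPO-style).

    Assumption: population std plus a small eps; real trainers may differ.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def lr_at(step, total_steps, peak=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to `peak`, then cosine decay to zero (assumed shape)."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        return peak * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, `group_relative_advantages([0.1, 0.2, 0.3, 0.4])` scores four hypothetical reward-model outputs for one prompt relative to each other, and `lr_at(step, total_steps=100)` traces the schedule for a hypothetical 100-step run; the actual step count for this run is not stated here.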