Exp-1
Small language model (9.9M parameters) trained from scratch.
Architecture
| Property | Value |
|---|---|
| Layers | 11 |
| Hidden size | 256 |
| Intermediate size | 704 |
| Attention heads | 8 (GQA kv=2) |
| Max sequence length | 1024 |
| Vocab size | 8192 |
| Tied embeddings | True |
| Total parameters | 9.853M |
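The total in the table can be reproduced from the other rows. The sketch below is a back-of-the-envelope count, not the author's code. It assumes a Llama-style decoder block, which the card does not state: bias-free attention projections, a gated MLP with three weight matrices, two RMSNorm weights per layer, and one final norm. The helper name `llama_style_param_count` is hypothetical.

```python
def llama_style_param_count(layers, hidden, intermediate, heads, kv_heads, vocab, tied=True):
    """Count parameters for an ASSUMED Llama-style decoder (no biases)."""
    head_dim = hidden // heads                 # 256 // 8 = 32
    kv_dim = kv_heads * head_dim               # GQA: 2 KV heads * 32 = 64
    # q_proj and o_proj are hidden x hidden; k_proj and v_proj are hidden x kv_dim
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim
    # Gated MLP (assumption): gate, up, and down projections
    mlp = 3 * hidden * intermediate
    # Two norm weight vectors per layer (assumption: RMSNorm, no bias)
    norms = 2 * hidden
    per_layer = attn + mlp + norms
    embed = vocab * hidden
    lm_head = 0 if tied else vocab * hidden    # tied embeddings reuse the input matrix
    final_norm = hidden
    return layers * per_layer + embed + lm_head + final_norm


total = llama_style_param_count(
    layers=11, hidden=256, intermediate=704, heads=8, kv_heads=2, vocab=8192, tied=True
)
print(f"{total:,} parameters ({total / 1e6:.3f}M)")  # 9,852,672 parameters (9.853M)
```

Under these assumptions the count agrees with the 9.853M listed above. About 2.1M of it is the tied embedding matrix.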
Training
- Tokens seen: 2,514,124,800
- Val loss: 2.5423
- Val PPL: 12.71
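The two validation figures are consistent if the loss is the mean per-token cross-entropy in nats, in which case perplexity is its exponential. That interpretation is an assumption, as the card does not say how the loss was computed. The short check below also derives a tokens-per-parameter ratio from the reported numbers; the variable names are illustrative only.

```python
import math

val_loss = 2.5423             # reported validation loss (assumed: nats per token)
tokens_seen = 2_514_124_800   # reported training tokens
params = 9.853e6              # reported total parameters

# Perplexity is exp(mean cross-entropy) when the loss is measured in nats.
ppl = math.exp(val_loss)
print(f"perplexity = {ppl:.2f}")  # perplexity = 12.71

# Rough data-to-model ratio implied by the reported figures.
tokens_per_param = tokens_seen / params
print(f"tokens per parameter = {tokens_per_param:.0f}")
```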
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("GODELEV/Exp-1")
model = AutoModelForCausalLM.from_pretrained("GODELEV/Exp-1")

# Tokenize a prompt and generate up to 50 new tokens
inputs = tokenizer("Hello", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=50)

# Decode the generated token IDs back to text
print(tokenizer.decode(output[0], skip_special_tokens=True))