Matrix 2

Model Description

Matrix 2 is a fine-tuned version of DeepSeek-R1-Distill-Qwen-7B, trained on a focused mixture of chain-of-thought reasoning, math, coding, and logic data. It is the flagship reasoning model of the Inelly lineup -- built for deep, accurate, step-by-step problem solving.

  • Developed by: Bry (GenueAI)
  • Base model: DeepSeek-R1-Distill-Qwen-7B
  • Fine-tuning method: QLoRA (4-bit NF4, rank 16)
  • Parameters: 7.62B (base) + ~6.5M trainable (LoRA adapters)
  • License: MIT (inherited from DeepSeek-R1)
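The QLoRA setup described above could be expressed with the `transformers` and `peft` configuration objects roughly as follows. This is a sketch, not the author's actual training script: rank, alpha, and dropout are the values listed under Training Hyperparameters, while the compute dtype and `target_modules` are assumptions, since the card does not state them.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization for the frozen base weights (as stated in the card).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # assumption: compute dtype is not stated
)

# LoRA adapter settings: rank, alpha, and dropout are taken from the card.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumption: target modules are not stated
    task_type="CAUSAL_LM",
)
```

The number of trainable parameters depends directly on which modules are targeted, so a different `target_modules` choice would change the ~6.5M figure.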

Intended Use

Matrix 2 is intended for:

  • Deep Chain-of-Thought reasoning – Multi-step problem solving with clear logic
  • Mathematics – Algebra, arithmetic, word problems, multi-step calculations
  • Code generation – Python functions with proper logic and comments
  • Logical deduction – Syllogisms, puzzles, transitive reasoning
  • Scientific explanations – Physics, biology, general science
  • Complex instruction following – Multi-part tasks requiring structured thinking

Out of Scope

  • Not intended for production deployment without further safety evaluation
  • Safety alignment inherited from DeepSeek-R1 base; fine-tuning data did not include adversarial safety examples
  • Larger memory footprint than 1.5B/3B variants (~5.2GB)

Training Data

Matrix 2 was fine-tuned for 1 epoch on ~5,225 samples drawn from:

| Dataset | Samples | Purpose |
|---|---|---|
| Bespoke-Stratos-35k | 3,000 | Chain-of-thought math & reasoning |
| OpenThoughts-114k | 2,500 | Code generation with reasoning |
| dolphin-r1 | 2,000 | General reasoning (DeepSeek-R1 distill) |

All samples were deduplicated and reasoning-weighted (2x oversample for CoT examples). Maximum sequence length: 512 tokens.
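One way to implement the deduplication and 2x reasoning weighting described above is sketched below. The field names `text` and `is_cot` and the helper `build_mixture` are hypothetical; the card does not say how samples were keyed for deduplication or how CoT examples were identified.

```python
def build_mixture(samples, cot_weight=2):
    """Drop exact-duplicate samples, then repeat CoT samples cot_weight times.

    `samples` is a list of dicts with a "text" field and an optional
    boolean "is_cot" field (both hypothetical names).
    """
    seen = set()
    mixture = []
    for sample in samples:
        key = sample["text"].strip()
        if key in seen:
            continue  # exact duplicate of an earlier sample
        seen.add(key)
        copies = cot_weight if sample.get("is_cot") else 1
        mixture.extend([sample] * copies)
    return mixture


raw = [
    {"text": "Prove that 2 + 2 = 4.", "is_cot": True},
    {"text": "Prove that 2 + 2 = 4.", "is_cot": True},   # duplicate
    {"text": "What is the capital of France?", "is_cot": False},
]
mixture = build_mixture(raw)
```

Here the duplicate is removed first, then the remaining CoT sample appears twice and the non-CoT sample once.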


Training Hyperparameters

| Parameter | Value |
|---|---|
| Base model | DeepSeek-R1-Distill-Qwen-7B |
| Quantization | 4-bit NF4 (bitsandbytes) |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Learning rate | 2e-4 |
| Batch size | 8 (gradient accumulation) |
| Epochs | 1 |
| Max seq length | 512 |
| Optimizer | AdamW 8-bit |
| LR scheduler | cosine |
| Warmup ratio | 0.05 |
| Training time | ~74 min |
| Hardware | RTX 3090 (24GB VRAM) |
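To illustrate how the learning rate, cosine scheduler, and warmup ratio above interact, here is a minimal pure-Python sketch of a linear-warmup, cosine-decay schedule. The step count assumes an effective batch size of 8 over ~5,225 samples for one epoch, which is an interpretation of the table rather than a confirmed detail; `lr_at` is a hypothetical helper, not the training code.

```python
import math


def lr_at(step, total_steps, base_lr=2e-4, warmup_ratio=0.05):
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumption: one optimizer step per 8 samples, ~5,225 samples, 1 epoch.
total_steps = math.ceil(5225 / 8)
schedule = [lr_at(s, total_steps) for s in range(total_steps + 1)]
```

Under these assumptions the rate climbs to 2e-4 over the first 5% of steps and then decays smoothly toward zero by the end of the epoch.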

Model Architecture

| Property | Value |
|---|---|
| Model type | Qwen2ForCausalLM |
| Hidden size | 3,584 |
| Layers | 28 |
| Attention heads | 28 |
| Head dim | 128 |
| Intermediate size | 18,944 |
| Vocab size | 152,064 |
| Context length | 131,072 |
| Total parameters | ~7.62B |
| Trainable parameters | ~6.5M (LoRA) |
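The ~7.62B total can be sanity-checked from the architecture values above. The sketch below assumes a standard Qwen2 layout: grouped-query attention with 4 key/value heads, biases on the q/k/v projections only, a gated MLP with three weight matrices, RMSNorm weights, and untied input/output embeddings. None of these details appear in the table, so treat them as assumptions.

```python
hidden, layers, heads, head_dim = 3584, 28, 28, 128
intermediate, vocab = 18944, 152064
kv_heads = 4  # assumption: not listed in the table above


def count_params():
    q_dim = heads * head_dim        # 3584
    kv_dim = kv_heads * head_dim    # 512
    attn = (
        hidden * q_dim + q_dim              # q_proj weight + bias
        + 2 * (hidden * kv_dim + kv_dim)    # k_proj and v_proj weight + bias
        + q_dim * hidden                    # o_proj weight (no bias)
    )
    mlp = 3 * hidden * intermediate         # gate, up, and down projections
    norms = 2 * hidden                      # two RMSNorm weights per layer
    per_layer = attn + mlp + norms
    embeddings = 2 * vocab * hidden         # assumption: untied embed + lm_head
    final_norm = hidden
    return layers * per_layer + embeddings + final_norm


total = count_params()
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Under these assumptions the count comes to 7,615,616,512, which rounds to the 7.62B listed.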

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model in half precision and place it on the available device(s).
model = AutoModelForCausalLM.from_pretrained("path/to/matrix-2", torch_dtype=torch.float16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("path/to/matrix-2")

# Format the prompt with the model's chat template.
messages = [{"role": "user", "content": "Solve for x: 3x + 7 = 22. Show all steps."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sampling must be enabled for temperature and top_p to take effect.
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)
# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
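DeepSeek-R1 distilled models typically wrap their reasoning in `<think>` ... `</think>` tags before the final answer. Assuming Matrix 2 keeps that behavior after fine-tuning (the card does not confirm it), a small helper like the hypothetical `split_reasoning` below can separate the reasoning trace from the answer in the decoded `response`.

```python
from typing import Tuple


def split_reasoning(response: str) -> Tuple[str, str]:
    """Split a decoded response into (reasoning, answer).

    Handles the case where the opening <think> tag is missing, for example
    when the chat template already emitted it as part of the prompt.
    """
    head, sep, tail = response.partition("</think>")
    if not sep:
        # No closing tag found: treat the whole response as the answer.
        return "", response.strip()
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


reasoning, answer = split_reasoning("<think>\n3x = 15, so x = 5\n</think>\n\nx = 5")
```

If the generation is cut off by `max_new_tokens` before the closing tag appears, the helper returns the truncated text as the answer, so a larger token budget may be needed for long reasoning traces.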

Performance

Informal, qualitative GPU testing across 8 categories (six are reported below; no benchmark scores were collected):

| Category | Result |
|---|---|
| Chain-of-Thought reasoning | ✅ Excellent multi-step logic |
| Math | ✅ Accurate with detailed work shown |
| Code generation | ✅ Clean, well-commented Python |
| Logic puzzles | ✅ Thorough deductive reasoning |
| General knowledge | ✅ Accurate, detailed explanations |
| Complex reasoning | ✅ Handles multi-step word problems well |

Inelly / GenueAI Model Family

| Model | Size | Focus |
|---|---|---|
| Matrix 2 (this model) | 7B | Deep CoT reasoning, math, coding |
| Inelly 4.5 | 3B | Conversation + politeness + CoT |
| Inelly 4.5 Blaze | 1.5B | Fast reasoning + CoT |

Limitations

  • Safety: Inherited from DeepSeek-R1 base; not specifically safety-tuned. May occasionally follow harmful instructions.
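  • Memory: The ~5.2GB VRAM figure is consistent with 4-bit quantized inference; unquantized FP16 weights for 7.62B parameters need roughly 15GB on their own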
  • Context length: Fine-tuned on 512-token sequences; base supports 128K but fine-tuned performance is optimized for shorter contexts
  • Factual accuracy: May hallucinate in specialized domains (law, medicine, finance)
  • Speed: Slower than 1.5B/3B variants due to size
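A back-of-the-envelope estimate of weight memory at different precisions helps put the memory figure above in context. The sketch below counts weights only; KV cache, activations, and quantization metadata are ignored, so real usage will be higher. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 7.62e9  # total parameter count from the card

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # about 15.2 GB
nf4_gb = weight_memory_gb(N_PARAMS, 4)    # about 3.8 GB

print(f"FP16 weights:  {fp16_gb:.2f} GB")
print(f"4-bit weights: {nf4_gb:.2f} GB")
```

The 4-bit estimate plus runtime overhead lands in the same range as the ~5.2GB quoted in this card, while FP16 inference needs roughly three times that.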

Acknowledgments


Citation

@misc{matrix2,
  title = {Matrix 2: A 7B Chain-of-Thought Reasoning Model},
  author = {Bry},
  organization = {GenueAI},
  year = {2026},
  note = {Fine-tuned from DeepSeek-R1-Distill-Qwen-7B using QLoRA},
}