Turkish-Gemma-9b-v0.1
Turkish-Gemma-9b-v0.1 is based on Gemma-2-9b and was developed through a combination of continual pre-training, supervised fine-tuning (SFT), direct preference optimization (DPO), and model merging.
Turkish-Gemma-9b-v0.1 is designed for Turkish text generation tasks, providing coherent, contextually relevant continuations and answers. Because the training data is diverse (large-scale pre-training corpora, instruction-tuning data, and human preference data), the model may exhibit biases. Users should be aware of these and deploy the model responsibly.
A demo of the model will be available here (coming soon): https://cosmos.yildiz.edu.tr/cosmosllm
To evaluate model performance, we compiled a dataset of 1,450 carefully designed questions across diverse categories. Each question was reviewed and rated by 18 human annotators, allowing for a reliable comparison across multiple models.
The table below summarizes the evaluation results:
Model Comparison: Win Rates
| Model Name | Win Rate |
|---|---|
| Qwen/Qwen3-30B-A3B | 62.39% |
| gpt-4o-mini | 62.12% |
| google/gemma-3-12b-it | 61.61% |
| google/gemma-2-27b-it | 57.91% |
| ytu-ce-cosmos/Turkish-Gemma-9b-v0.1 | 57.30% |
| google/gemma-2-9b-it | 54.13% |
| ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1 | 36.89% |
Voting Methodology
A question and two answers from different models were presented to human judges, who selected the better answer according to their own preferences. For example, in the question below, the judge selected the answer on the right:
[Image: example question with two model answers shown side by side]
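The pairwise voting described above can be turned into per-model win rates in a few lines. The sketch below is only an illustration: the card does not state how ties or unequal numbers of comparisons were handled, so it assumes every vote names exactly one winner and defines win rate as wins divided by comparisons. The function name `win_rates` and the model names are hypothetical.

```python
from collections import Counter

def win_rates(votes):
    """Return {model: win rate in percent} from (model_a, model_b, winner) votes.

    Assumption: each vote has exactly one winner (no ties), and a model's
    win rate is its wins divided by the comparisons it took part in.
    """
    wins, games = Counter(), Counter()
    for model_a, model_b, winner in votes:
        if winner not in (model_a, model_b):
            raise ValueError(f"winner {winner!r} is not one of the compared models")
        games[model_a] += 1
        games[model_b] += 1
        wins[winner] += 1
    return {model: 100.0 * wins[model] / games[model] for model in games}

# Hypothetical votes, not data from the evaluation above.
votes = [
    ("model-x", "model-y", "model-x"),
    ("model-x", "model-z", "model-z"),
    ("model-y", "model-z", "model-y"),
    ("model-x", "model-y", "model-x"),
]
rates = win_rates(votes)
for model, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{model}: {rate:.2f}%")
```

Because each comparison has one winner and one loser, win rates computed this way average to 50% only when every model appears in the same number of comparisons.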

Turkish Evaluation Benchmark Results (via malhajar17/lm-evaluation-harness_turkish)
| Model Name | Average | MMLU | TruthfulQA | ARC | HellaSwag | GSM8K | Winogrande |
|---|---|---|---|---|---|---|---|
| Qwen/Qwen2.5-72B-Instruct | 67.69 | 77.28 | 59.86 | 61.52 | 61.98 | 83.6 | 61.92 |
| google/gemma-3-27b-it | 67.36 | 70.2 | 57.06 | 66.98 | 66.58 | 77.52 | 65.8 |
| google/gemma-2-27b-it | 65.57 | 66.49 | 57.45 | 63.65 | 63.86 | 76.54 | 65.4 |
| meta-llama/Llama-3.1-70B-Instruct | 63.92 | 74.00 | 51.41 | 59.64 | 64.31 | 66.13 | 66.90 |
| Qwen/Qwen2.5-32B-Instruct | 63.74 | 70.93 | 57.87 | 57.00 | 57.04 | 77.83 | 61.77 |
| ytu-ce-cosmos/Turkish-Gemma-9b-v0.1 | 63.31 | 63.85 | 54.21 | 59.64 | 64.19 | 73.42 | 64.53 |
| google/gemma-3-12b-it | 62.94 | 63.92 | 57.16 | 60.67 | 62.00 | 72.06 | 61.77 |
| Qwen/Qwen2.5-14B-Instruct | 60.34 | 65.28 | 59.00 | 50.00 | 52.22 | 76.77 | 58.77 |
| google/gemma-2-9b-it | 59.14 | 61.07 | 55.77 | 56.31 | 56.48 | 63.10 | 62.09 |
| ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1 | 55.03 | 51.97 | 57.56 | 51.02 | 52.96 | 59.87 | 57.77 |
| Qwen/Qwen2.5-7B-Instruct | 53.42 | 56.31 | 55.99 | 42.06 | 44.71 | 64.16 | 59.66 |
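The Average column appears to be the unweighted mean of the six benchmark scores. The card does not state this explicitly, so treat it as an inference from the numbers. A quick check using the Turkish-Gemma-9b-v0.1 row:

```python
# Scores copied from the Turkish-Gemma-9b-v0.1 row of the table above.
scores = {
    "MMLU": 63.85,
    "TruthfulQA": 54.21,
    "ARC": 59.64,
    "HellaSwag": 64.19,
    "GSM8K": 73.42,
    "Winogrande": 64.53,
}

# Assumption: "Average" is the plain arithmetic mean, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 63.31, matching the Average column for this row
```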
Transformers pipeline
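import transformers
import torch

model_id = "ytu-ce-cosmos/Turkish-Gemma-9b-v0.1"

# Build a text-generation pipeline in bfloat16, placed automatically on the available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    # English: "A function named RD returns the multiplicative inverse of the number given to it.
    # For example, RD(3)=1/3. Accordingly, how many values of X make RD(X)=X true?"
    {"role": "user", "content": "İsmi RD olan bir fonksiyon ona verilen sayının çarpmaya göre tersini döndürmektedir. Örneğin RD(3)=1/3. Buna göre RD(X)=X ifadesini doğru yapan kaç X değeri vardır?"}
]

# Stop at either the EOS token or Gemma's end-of-turn marker
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<end_of_turn>")
]

outputs = pipeline(
    messages,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# The last message in the returned conversation is the assistant's reply
print(outputs[0]["generated_text"][-1]["content"])
# Example output (sampling is enabled, so results may vary):
# RD(X) = X ifadesi, bir sayının çarpmaya göre tersinin kendisiyle eşit olması anlamına gelir. Yani, X ile 1/X aynı olmalıdır. Bu durum yalnızca X'in karesi 1 olduğunda gerçekleşir:
# X² = 1
# Bu denklemin çözümleri:
# X = 1 ve X = -1
# Dolayısıyla, RD(X) = X eşitliğini sağlayan *iki* X değeri vardır: *1* ve *-1*.
# (English: RD(X) = X means a number equals its own multiplicative inverse, which holds only when X² = 1, so two values satisfy it: 1 and -1.)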
Transformers AutoModelForCausalLM
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "ytu-ce-cosmos/Turkish-Gemma-9b-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    # English: "A function named RD returns the multiplicative inverse of the number given to it.
    # For example, RD(3)=1/3. Accordingly, how many values of X make RD(X)=X true?"
    {"role": "user", "content": "İsmi RD olan bir fonksiyon ona verilen sayının çarpmaya göre tersini döndürmektedir. Örneğin RD(3)=1/3. Buna göre RD(X)=X ifadesini doğru yapan kaç X değeri vardır?"}
]

# Apply the chat template and append the prompt that opens the model's turn
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop at either the EOS token or Gemma's end-of-turn marker
# (uses the tokenizer loaded above; no pipeline object exists in this example)
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<end_of_turn>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=False,
)

# Keep only the newly generated tokens, dropping the prompt
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
# RD(X) = X ifadesi, bir sayının çarpmaya göre tersinin kendisiyle eşit olması anlamına gelir. Yani, X ile 1/X aynı olmalıdır. Bu durum yalnızca X'in karesi 1 olduğunda gerçekleşir:
# X² = 1
# Bu denklemin çözümleri:
# X = 1 ve X = -1
# Dolayısıyla, RD(X) = X eşitliğini sağlayan *iki* X değeri vardır: *1* ve *-1*.
# (English: RD(X) = X means a number equals its own multiplicative inverse, which holds only when X² = 1, so two values satisfy it: 1 and -1.)
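Both examples above pass `<end_of_turn>` as a terminator because the Gemma-2 chat format closes every turn with that token. The sketch below shows roughly what the prompt looks like after templating. It assumes this model keeps the base Gemma-2 chat template, and `format_gemma_prompt` is a hypothetical helper for illustration only. In real code, `tokenizer.apply_chat_template` is the authoritative source.

```python
def format_gemma_prompt(messages):
    """Hypothetical helper: render chat messages in the Gemma-2 turn format.

    Assumption: the model uses the unmodified Gemma-2 chat template, in which
    the assistant role is named "model" and each turn ends with <end_of_turn>.
    """
    parts = ["<bos>"]
    for message in messages:
        role = "model" if message["role"] == "assistant" else message["role"]
        content = message["content"].strip()
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")
    # Open the model's turn so that generation continues from here.
    parts.append("<start_of_turn>model\n")
    return "".join(parts)

prompt = format_gemma_prompt([{"role": "user", "content": "Merhaba!"}])
print(prompt)
```

Generation then continues inside the open `model` turn, and the model is expected to emit `<end_of_turn>` when its answer is complete, which is why that token is listed alongside the EOS token.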
Acknowledgments
- Thanks to the generous support from the Hugging Face team, it is possible to download models from their S3 storage 🤗
- Computing resources used in this work were provided by the National Center for High Performance Computing of Turkey (UHeM) under grant numbers 1016912023 and 1018512024.
Contact
COSMOS AI Research Group, Yildiz Technical University Computer Engineering Department
https://cosmos.yildiz.edu.tr/
cosmos@yildiz.edu.tr
license: gemma2