AuriStream - Speech Language Model

AuriStream is a speech language model by Greta Tuckute and Klemen Kotar.

This repository contains the shared model code for AuriStream models.

Overview

AuriStream is a GPT-like transformer model for cochlear token prediction with optional multi-token prediction (MTP) heads.

The model predicts cochlear tokens, the discrete audio tokens produced by a tokenizer such as WavCochCausalV8192.
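To make the idea of multi-token prediction concrete, the sketch below builds the target sequences that several prediction heads could be trained against. It assumes the common formulation in which head k predicts the token k steps ahead of the current position; the function name `mtp_targets` and the example token IDs are hypothetical, and the actual AuriStream training code may align or pad targets differently.

```python
from typing import List


def mtp_targets(tokens: List[int], n_pred_steps: int) -> List[List[int]]:
    """Return one target sequence per prediction step.

    Assumption: step k (1-based) predicts tokens[t + k] for input position t.
    Positions without a valid target are dropped rather than padded.
    """
    if n_pred_steps < 1:
        raise ValueError("n_pred_steps must be >= 1")
    return [tokens[k:] for k in range(1, n_pred_steps + 1)]


# Hypothetical cochlear token IDs (made-up values in the range 0..8191).
tokens = [17, 4093, 256, 8000, 42]
targets = mtp_targets(tokens, n_pred_steps=2)
# targets[0] is the ordinary next-token target; targets[1] looks two steps ahead.
```

With `n_pred_steps=1` this reduces to standard next-token prediction, which matches the documented default.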

Usage

This repository is not meant to be used directly. Instead, use one of the checkpoint repositories that reference this base code:

AuriStream7B_40Pred_BigAudioDataset_500k

To load a checkpoint:

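from transformers import AutoModel

# trust_remote_code=True is required because the model class is defined in
# custom code (modeling_auristream.py) rather than in the transformers library.
model = AutoModel.from_pretrained(
    "TuKoResearch/AuriStream7B_40Pred_BigAudioDataset_500k",
    trust_remote_code=True,
)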

Model Architecture

The AuriStream model includes:

RMSNorm for normalization layers
Rotary Position Embeddings (RoPE)
SiLU activation in MLP layers
Multi-token prediction heads
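The first three components listed above can be illustrated with a framework-free sketch in plain Python. This is not the code in modeling_auristream.py: the `eps` value, the RoPE `base` of 10000, and the interleaved pairing of dimensions are assumptions chosen to match common practice, and real implementations operate on batched tensors rather than lists.

```python
import math
from typing import List


def rms_norm(x: List[float], weight: List[float], eps: float = 1e-6) -> List[float]:
    """Scale x by the reciprocal of its root mean square, then by a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v: float) -> float:
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def rope(x: List[float], position: int, base: float = 10000.0) -> List[float]:
    """Rotate each pair (x[2i], x[2i+1]) by the angle position * base**(-2i/d).

    Assumes len(x) is even and that dimensions are paired in interleaved order.
    """
    d = len(x)
    out = list(x)
    for i in range(d // 2):
        theta = position * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * c - b * s
        out[2 * i + 1] = a * s + b * c
    return out


h = rms_norm([3.0, 4.0], weight=[1.0, 1.0])    # output has unit root mean square
q0 = rope([1.0, 0.0, 1.0, 0.0], position=0)    # position 0 leaves the vector unchanged
q5 = rope([1.0, 0.0, 1.0, 0.0], position=5)    # a rotation, so the norm is preserved
```

Unlike LayerNorm, RMSNorm does not subtract the mean, and RoPE encodes position by rotating query and key vectors instead of adding a position embedding.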

Configuration Options

| Parameter | Description | Default |
|---|---|---|
| `vocab_size` | Number of cochlear tokens | 8192 |
| `n_embd` | Hidden dimension | 768 |
| `n_layer` | Number of transformer layers | 12 |
| `n_head` | Number of attention heads | 12 |
| `n_pred_steps` | Number of prediction steps (MTP) | 1 |
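As a quick sanity check on how these defaults relate to each other, the sketch below mirrors them in a small dataclass and derives the per-head dimension. `SketchConfig` is a hypothetical stand-in, not the class in configuration_auristream.py, and the rule that `n_embd` is split evenly across `n_head` heads is an assumption based on standard multi-head attention.

```python
from dataclasses import dataclass


@dataclass
class SketchConfig:
    """Hypothetical stand-in mirroring the documented defaults."""
    vocab_size: int = 8192
    n_embd: int = 768
    n_layer: int = 12
    n_head: int = 12
    n_pred_steps: int = 1

    def __post_init__(self) -> None:
        # Assumption: the hidden dimension is split evenly across heads.
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")

    @property
    def head_dim(self) -> int:
        return self.n_embd // self.n_head


cfg = SketchConfig()
print(cfg.head_dim)  # 64, since 768 // 12 == 64
```

Under the same assumption, changing `n_embd` or `n_head` independently is only valid when the division stays exact.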

Files

configuration_auristream.py - Configuration class
modeling_auristream.py - Model implementation

Tokenizer

This model uses cochlear tokens from WavCochCausalV8192.
