AuriStream - Speech Language Model

AuriStream is a speech language model by Greta Tuckute and Klemen Kotar.

This repository contains the shared model code for AuriStream models.

Overview

AuriStream is a GPT-like transformer model for cochlear token prediction with optional multi-token prediction (MTP) heads.

Its inputs and prediction targets are cochlear tokens produced by a tokenizer such as WavCochCausalV8192.
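To make the multi-token prediction idea concrete, here is a minimal sketch of how per-position targets could be laid out. It assumes that prediction step k targets the token k positions ahead; the function name `mtp_targets` and the toy token ids are illustrative and do not come from this repository's code.

```python
from typing import List, Optional


def mtp_targets(tokens: List[int], n_pred_steps: int) -> List[List[Optional[int]]]:
    """For each position t, list the next n_pred_steps tokens (None past the end)."""
    targets = []
    for t in range(len(tokens)):
        row = []
        for k in range(1, n_pred_steps + 1):
            # Assumption: step k predicts the token k positions ahead of t.
            row.append(tokens[t + k] if t + k < len(tokens) else None)
        targets.append(row)
    return targets


# Toy cochlear token ids; real ids would lie in [0, vocab_size).
tokens = [5, 17, 42, 8]
print(mtp_targets(tokens, n_pred_steps=1))  # [[17], [42], [8], [None]]
print(mtp_targets(tokens, n_pred_steps=2))  # [[17, 42], [42, 8], [8, None], [None, None]]
```

With `n_pred_steps=1` this reduces to ordinary next-token prediction, which matches the default in the configuration.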

Usage

This repository is not meant to be used directly. Instead, use one of the checkpoint repositories that reference this base code, such as the one in the example below.

To load a checkpoint:

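from transformers import AutoModel

# trust_remote_code=True is needed because the model classes are custom code
# hosted on the Hub (see Files), not part of the transformers library.
model = AutoModel.from_pretrained(
    "TuKoResearch/AuriStream7B_40Pred_BigAudioDataset_500k",
    trust_remote_code=True,
)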

Model Architecture

The AuriStream model includes:

  • RMSNorm normalization layers
  • Rotary Position Embeddings (RoPE)
  • SiLU activation in MLP layers
  • Multi-token prediction heads
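The first three components in the list above can be sketched in a few lines. This is a dependency-free illustration in plain Python, not the code in modeling_auristream.py: the function names, the `eps` and `base` defaults, and the interleaved pairing used for RoPE are all assumptions, and a real implementation would operate on tensors.

```python
import math
from typing import List


def rms_norm(x: List[float], weight: List[float], eps: float = 1e-6) -> List[float]:
    """Scale x by the reciprocal of its root mean square, then by a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v: float) -> float:
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def rope(x: List[float], position: int, base: float = 10000.0) -> List[float]:
    """Rotate each pair (x[i], x[i+1]) by an angle proportional to the position."""
    d = len(x)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even number of dimensions")
    out = []
    for i in range(0, d, 2):
        # Pair j = i // 2 uses frequency base ** (-2j / d) = base ** (-i / d).
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out.append(x[i] * c - x[i + 1] * s)
        out.append(x[i] * s + x[i + 1] * c)
    return out


def dot(a: List[float], b: List[float]) -> float:
    return sum(p * q for p, q in zip(a, b))


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]
# The attention score depends only on the relative offset (here 2) between positions.
print(abs(dot(rope(q, 3), rope(k, 1)) - dot(rope(q, 5), rope(k, 3))) < 1e-9)
```

The final check illustrates why RoPE is applied to queries and keys: their dot product then encodes relative rather than absolute position.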

Configuration Options

| Parameter    | Description                      | Default |
|--------------|----------------------------------|---------|
| vocab_size   | Number of cochlear tokens        | 8192    |
| n_embd       | Hidden dimension                 | 768     |
| n_layer      | Number of transformer layers     | 12      |
| n_head       | Number of attention heads        | 12      |
| n_pred_steps | Number of prediction steps (MTP) | 1       |
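As a rough illustration of how these defaults relate to one another, the sketch below mirrors the table in a small dataclass. `SketchConfig` is a hypothetical stand-in, not the class defined in configuration_auristream.py. The divisibility check assumes standard multi-head attention, and the embedding count assumes a single token embedding table of shape vocab_size × n_embd.

```python
from dataclasses import dataclass


@dataclass
class SketchConfig:
    """Hypothetical stand-in for the real config class; defaults copied from the table."""
    vocab_size: int = 8192
    n_embd: int = 768
    n_layer: int = 12
    n_head: int = 12
    n_pred_steps: int = 1

    def __post_init__(self) -> None:
        # Assumption: standard multi-head attention splits n_embd evenly across heads.
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")
        if self.n_pred_steps < 1:
            raise ValueError("n_pred_steps must be at least 1")

    @property
    def head_dim(self) -> int:
        """Per-head dimension under the even-split assumption."""
        return self.n_embd // self.n_head

    @property
    def embedding_params(self) -> int:
        """Size of one vocab_size x n_embd token embedding table."""
        return self.vocab_size * self.n_embd


cfg = SketchConfig()
print(cfg.head_dim)          # 64
print(cfg.embedding_params)  # 6291456
```

These defaults describe a small model; individual checkpoints may override any of the values in their own configuration.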

Files

  • configuration_auristream.py - Configuration class
  • modeling_auristream.py - Model implementation

Tokenizer

This model uses cochlear tokens from WavCochCausalV8192.
