
appvoid

https://ko-fi.com/appvoid

AI & ML interests

Working on symbolic models

Recent Activity

liked a model 4 days ago

owensong/Inflect-Nano-v1

Organizations

replied to their post 7 days ago

Also, you can help by sharing ideas directly with me. Bonus points if they are about byte-level architectures.

posted an update 7 days ago

Post

Everyone is talking about Anthropic’s Fable 5.

Bigger models. Longer tasks. More power.

That’s exciting, but if the future of AI only belongs to people with massive compute, then it is not really for everyone.

That’s why the Hugging Face smol models event matters.

Small models are not just tiny versions of big models.
They are the models that can run cheaper, faster, closer to the user, and eventually on normal machines.

That is the lane I’m building in.

I’m working on a rewrite model series for messy real-world input:

voice transcripts, rambling thoughts, half-correct instructions, and all the “no wait, actually…” moments where the intent is there, but buried.

A giant model can clean that up.

Cool.

I want small open models to get better at it too.

Current goal:

$54.43/month
Colab Pro+ + fees
11% funded

That lets me iterate faster with better datasets, techniques, and compute. Yep.

If you care about useful small models and open experimentation, even $1 helps:

https://ko-fi.com/appvoid

If you are curious about my work, you can check my latest palmer version. That should give you an idea of my recent work.

🤏 smol for the win! 💪

appvoid/palmer-005-nano

1 reply

posted an update 17 days ago

Post

151

yikes! i missed the small model hackathon. i guess i'll have to make sota for people to notice

posted an update 18 days ago

Post

posted an update 25 days ago

Post

177

As an advocate for small language models, I just want to say: it might not actually be the end for small models. We are just getting started! Now that we have super good models, we can find creative ways to replicate their behavior at small scale!

I'll show you in a few weeks what a small model is capable of. You will be surprised.

1 reply

replied to codelion's post 25 days ago

Great!

posted an update about 1 month ago

Post

199

Cheers for a year of sota AI on cpus 🥂 people actually liked my last model, here's another sota for you. This one should feel way different in terms of quality.

appvoid/palmer-005-core

replied to their post about 1 month ago

Next release will be above most 1b models while being 1/3 the size

posted an update about 1 month ago

Post

4530

As promised

appvoid/palmer-005-nano

1 reply

replied to their post about 1 month ago

as promised

posted an update about 1 month ago

Post

140

new model tomorrow, less than 100m params, 1b level on text editing tasks

🤖 palmer is coming back...

1 reply

posted an update 2 months ago

Post

162

Yesterday someone faked an anthropic account: https://huggingface.co/Anthropic-ai/claude
Be careful... all I'm saying.

1 reply

replied to their post 3 months ago

I guess the reason it's slow is that llama.cpp is not optimized...

replied to their post 4 months ago

Correct! It's causal modeling (for now) with a char-level tokenizer with only 8 tokens.

The model learns by looking for relationships between sequences for a single token, so the only way it learns is by literally nudging weights towards a generalized solution using pure sequences.

In short, it learns to learn.

Will there be any app to... convert the dots to something meaningful?

Not yet, I'm focusing on getting the core right first. But once the model is general enough, I don't see why not. Though you might need to finetune it for your use case.
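
For readers who want a concrete picture of the char-level causal setup described in this reply, here is a minimal sketch. The 8-symbol vocabulary, the function names, and the sample sequence are all assumptions for illustration. The post does not list dot's actual tokens or its training code.

```python
# Hypothetical 8-symbol vocabulary (an assumption; the real dot vocabulary is not given).
VOCAB = ["●", "○", "+", "-", "=", " ", "\n", "?"]
STOI = {ch: i for i, ch in enumerate(VOCAB)}
ITOS = {i: ch for ch, i in STOI.items()}


def encode(text: str) -> list[int]:
    """Map each character to its token id; characters outside VOCAB raise KeyError."""
    return [STOI[ch] for ch in text]


def decode(ids: list[int]) -> str:
    """Map token ids back to the original characters."""
    return "".join(ITOS[i] for i in ids)


def causal_pairs(ids: list[int]) -> list[tuple[list[int], int]]:
    """Build (context, next_token) pairs: each token is predicted from all tokens before it."""
    return [(ids[:t], ids[t]) for t in range(1, len(ids))]


ids = encode("●● + ● = ●●●")
pairs = causal_pairs(ids)
```

With a vocabulary this small, every training example is just a short sequence of ids, and a causal model only ever sees the prefix when predicting the next symbol.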

posted an update 4 months ago

Post

2521

Let's keep the momentum for small models. I just published dot. It's the first pretrained causal model that is trained on math/symbols rather than English. The goal is to get an agnostic few-shot meta-learner that learns from reality itself instead of language.

It's already decent at some tasks, with the next version coming in a few weeks.

appvoid/dot

5 replies

replied to their post 4 months ago

The first model proudly trained from scratch on "physical" reasoning instead of chunky language tokens was published.

posted an update 4 months ago

Post

246

Are you ready for some ●s? Tomorrow will be a good day.

4 replies

replied to their post 4 months ago

if you need raw power, though slow, rwkv 0.4b has you covered; if you need something in between, choose lfm2 350m

posted an update 4 months ago

Post

957

granite-4.0-350m, rwkv7-g1d-0.4b and LFM2-350M are currently the best sub-0.5b models for few-shot, simple language tasks

no one is saying this:

if you need absolute speed + small size + quality, granite 350m is the current king

3 replies
