Jerry Towler

Self-driven.

Feb 21, 2026

The path to ubiquitous AI | Taalas:

Taalas’ silicon Llama achieves 17K tokens/sec per user, nearly 10X faster than the current state of the art, while costing 20X less to build, and consuming 10X less power.

Hot dang, that’s fast. Their live demo is so fast it seems fake.
It’s still a 2.5 kW server, but that’s not very far from residential! No word on cost…
I wonder when the models they can bake into hardware will get “good enough” for the churn to stop. The chips are crazy fast, but spinning new silicon for every new model sounds insane at the current pace of progress.
