Note · June 21, 2026

Usefulness Over Hype

GLM 5.2 is benchmaxxing right now, and on X people keep saying the same thing: that it is genuinely good for agentic coding. I think they are right. There is something real here. The one place it still falls short is multimodal understanding, and that gap is not closed yet.

And here is the part everyone skips. GLM 5.2 is now beating frontier models on the benchmarks, but beating a benchmark is not the same as being useful. When it comes to real usefulness, Claude Opus is still at the top of the game. There is no question about it for me.

Fable was the mythos, the guardrailed model, and from what I hear it got taken down by the US government. Meanwhile the open-weight models are getting really good, and there has been even more benchmaxxing in just the last couple of days. The pace is wild.

From my own experience, the MiniMax model is really fast and good for some coding work. I reach for it and it delivers. But when I put it next to Opus, Opus is simply better. There is no question about that one either.

Sonnet is good too, but it tends to drift on most frontend work. Give it a strong foundation, with the tokens, the branding, and the structure already in place, and it holds up just fine. And for the premium creative design side of frontend, Sonnet 4.6 is better than any model I have used.

Opus is the one I trust for functional user experience design, the parts that actually have to work. So that is the thing I have realized. Each model has its own lane, and the real skill now is knowing which one to put where. Stop judging models by the leaderboard, and start judging them by the job.
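
To make the "lanes" idea concrete, here is a rough sketch of what routing by job could look like. The lane names, the `LANES` table, and `pick_model` are all made up for illustration, and the model names are plain labels from this note, not real API identifiers. Treat it as one way to write the idea down, not a recommended setup.

```python
# Hypothetical lane labels mapped to the models named in this note.
# These strings are labels only, not real API model identifiers.
LANES = {
    "agentic_coding": "GLM 5.2",
    "fast_coding": "MiniMax",
    "creative_frontend_design": "Sonnet 4.6",
    "functional_ux_design": "Claude Opus",
}

# Assumption: fall back to the note's overall pick for usefulness.
DEFAULT_MODEL = "Claude Opus"


def pick_model(job: str) -> str:
    """Return the model label for a job, or the default if the job has no lane."""
    return LANES.get(job, DEFAULT_MODEL)


if __name__ == "__main__":
    for job in ("fast_coding", "creative_frontend_design", "data_cleanup"):
        print(f"{job}: {pick_model(job)}")
```

The point of writing it as a table is that the decision becomes about the job, not the leaderboard. When a model earns a new lane, you change one line.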
