More

fnordpiglet · 2026-04-29T00:00:52 1777420852

Microsoft specializes in taking successful products and pumping them full of malware, spyware, bloatware, and adware once they have a critical mass of users. It is often preceded by quality dropping significantly due to under investment and McKinsey being brought in to find a way to prop up declining revenues - of course the answer is never to invest in making it a superior product again, but monetization strategies.

fnordpiglet · 2026-04-28T23:50:23 1777420223

Two 9’s? You have to work pretty hard to do that badly. That’s like bragging you graduated with a C average from Harvard after your father endowed a chair to get you in.

Given GitHub has become a utility service globally this should be frankly worrisome to everyone let alone the developer community actively using it. It’s intertwined into many things now beyond simply source code hosting and PRs. And I am surprised GitHub leadership is ok with the state of things. Having worked at a lot of 5-6 9’s shops, this would have been all hands on deck, all roadmaps paused, figure it out or perish sorts of stuff.

gen220 · 2026-04-29T00:08:29 1777421309

Oh yea I'm not trying to defend it as amazing or tolerable, just clarifying the actual benchmark/reality of their performance!

I think at least three 9's would be the baseline. But I'm also sympathetic that they had a *post-exit* scaling of write volume of like, idk, 1000x???

Very few orgs would survive that kind of scale and retain three 9's.

fnordpiglet · 2026-04-28T05:32:19 1777354339

Gemini helped them build it but didn’t / couldn’t attribute it from its corpus. I think we will see a surge of “rediscovery” that’s unattributed training surfacing of prior work that wasn’t widely recognized at the time.

oofbey · 2026-04-28T13:40:37 1777383637

Gemini is perfectly capable of searching the web. Pretty good at it really. As are most agents. If such a surge happens, it’s purely because of laziness.

fnordpiglet · 2026-04-28T21:38:46 1777412326

Laziness, aka, the human condition?

fnordpiglet · 2026-04-27T03:34:54 1777260894

When their customers start using those buildings with computers in them to autonomously determine who to kidnap and execute, I suspect you might understand their point. I’d also note we are one refusal away from the US president declaring DPA control over frontier model providers and their infrastructure a national defense necessity and under his personal control.

tt24 · 2026-04-27T04:47:17 1777265237

Then complain about the US president forcing Microsoft to do X rather than just preemptively criticizing Microsoft for doing nothing

fnordpiglet · 2026-04-26T20:26:32 1777235192

There were always engineers who didn’t think and depended on crutches around them like senior engineers and politicizing the perf cycle. Most people got into this because their parents told them it makes a lot of money, and they never had the drive and curiosity to develop the passion required to truly think through the problems in computing and computer science. They will continue to use crutches to survive. Those that are driven by the problems for the problems will continue to think and use AI as a tool for leverage. This is no different than any other assistive technology.

fnordpiglet · 2026-04-26T18:08:25 1777226905

Database benchmarks are another.

I have empirical experience though building classifiers that can have no precision measurement because the classifier performs invariably better than humans. They become the state of the art benchmark themselves and can’t be benchmarked except against themselves. These are for tasks that are non trivial and complex, but less logical than coding and less sustained reasoning. There may come a day though, when there is no calibrated benchmark that is independent of the models it’s measuring.

fnordpiglet · 2026-04-26T16:47:03 1777222023

I think sometimes though there harness LLMs providing guidance. For instance I’ve seen recently coding agents doing an analysis then mid response saying “no wait, that’s not right” and course correcting. This feels implausible as an auto regressive rhetorical tick. LLM harnesses are widely used in advanced agentic systems and I’m sure the Pro level reasoning models exploit them extensively. I’m not saying this is what happened here, but there is a chance it was something injected by the hardness into its thinking.

fnordpiglet · 2026-04-23T05:39:57 1776922797

This is obtuse. The assertion was a deviation in the areas of Florida experiencing hurricane penetration. This is a localized effect. You’re discussing the gross effects of an entire nation, in this comment, of an entire state in the prior. However no one is discussing Florida or the US. They’re discussing the orange growing regions of Florida, which is a region that has not historically had hurricanes, but has had them recently.

It’s like saying the UV radiation hitting the earth is the same as it was historically so therefore an ozone hole in Australia didn’t exist and cataracts can’t be higher there.

tgarrett · 2026-04-23T06:08:31 1776924511

So what you are saying is that, yes there has not been an overall increase in hurricanes hitting the US over the last 175 years, but climate change has been specifically and precisely steering the hurricanes towards the orange growing regions of Florida in recent years, and is therefore to blame for the crop failures.

You have to diagnose a problem correctly in order to have a chance at solving it.

fnordpiglet · 2026-04-23T15:49:05 1776959345

I’m asserting nothing other than the article asserted the pattern of hurricanes changed to target the orange growing region more often and that you’re using gross geographic data to discuss an orthogonal point. However you make it seem like the assertion is nature intentionally targeting orange groves rather than shifts in patterns implies patterns shifted from where they were to where they were not hitting - this is definitional in the concept of a pattern shift. Your evidence for your assertions are unrelated to that topic of pattern shift, indicating you’ve misunderstood the problem to diagnose.

It’s great you’re bringing data to the table but you’re overstating its validity to the assertion dramatically.

Finally I’d note you’re asserting an analysis you’ve done without providing the data, method, or any reproducibility. So while you might personally feel you’ve done an accurate job, your assertions are citing exclusively yourself, against hidden methods, making it of no more quality than a puff piece article citing research without citation that you’re arguing against.

tgarrett · 2026-04-23T16:56:01 1776963361

I did provide the data in my first comment, here it is again:

https://www.aoml.noaa.gov/hrd/hurdat/All_U.S._Hurricanes.htm...

The analysis is easy: copy and paste the data from that link into a new text file, then write a python script that goes through it and counts the number of Cat 1, 2, 3, 4 & 5 hurricanes that make landfall per year (the "Highest Saffir-Simpson U.S. Category" column), and then make the plots: I used gnuplot. You can then do fits to the data if you'd like, but the flat trend lines over the last 175 years are obvious.

I encourage you to not trust me and to do it yourself, but I'm also happy to share my script, let me know.

As far as the hurricane trajectory trend lines go, they are clearly highly stochastic: check out e.g. both the spaghetti plot predictions for various storms from previous years, and ask google for a map of where they grow (grew...) oranges in Florida.

JumpCrisscross · 2026-04-23T09:02:41 1776934961

Well now I'm thoroughly confused. Because your data does seem to overturn the conventional wisdom.

Do actual climate scientists claim we're getting more, and stronger, hurricanes now than we did before?

Auracle · 2026-04-23T15:51:40 1776959500

By the way, I know I saw someone point out the same data at least 5 years ago - probably more like 10.

At some point the discourse changed from “just because it’s a cold winter doesn’t mean that global warming isn’t happening” to “every hurricane/wildfire is due to climate change” and it’s ridiculous.

I honestly think a lot of young people don’t realize that while climate change is probably real our weather and variability hasn’t changed that much - yet, at least.

ben_w · 2026-04-23T17:59:56 1776967196

> I honestly think a lot of young people don’t realize that while climate change is probably real our weather and variability hasn’t changed that much - yet, at least.

"Much" is one of those vague words, where it's true and false depending on your meaning.

If you live on any of the transition zones between climates, as I did growing up, it is directly visible: My experience of snow in the south coast of the UK was almost entirely in the early years of my childhood, and family photos of my older siblings show that they had even more than me. My parents had experiences of even deeper and longer cold, with ponds freezing completely solid, not just a layer of ice on the top.

I can easily imagine someone who lives in the parts of the US where all the winter urban snow photos come from, may not notice the loss of a 1-2 centimetres out of 100cm of snowfall, but when it's your last centimetre, it's much easier to spot.

slibhb · 2026-04-23T12:26:01 1776947161

> Do actual climate scientists claim we're getting more, and stronger, hurricanes now than we did before?

The general line is that climate change has probably increased the amount of rainfall associated with hurricanes, possibly the severity of hurricanes (due to sea level rise and warmer water) but there isn't good evidence that it has increased the frequency of hurricanes.

WorldMaker · 2026-04-23T18:58:41 1776970721

I've heard climate scientists that describe climate change as a "more energy in the system" phenomenon. The overall system for now is mostly the same, but every event inside of it has "more energy" than it had before.

For hurricanes this seems especially problematic because the historical categorization system is based on radar-observed width of the storm. "More energy" means that the categories stay the same over time, but every category is getting worse (more rainfall, heavier/faster winds, further travel, higher damage).

As with so many statistical phenomenon, it's also a reminder to be careful what metrics you are trying to compare. Comparing just the hurricane categories to historic values may just be the exact sort of wrong metric, for these "more energy" concerns.

JumpCrisscross · 2026-04-23T20:12:37 1776975157

> the historical categorization system is based on radar-observed width of the storm

Now I'm confused again, because OP used data going back to 1851. We didn't have radar in the 19th century.

WorldMaker · 2026-04-24T05:11:24 1777007484

Ah, sorry. I suppose it is only fair to mention using the wrong metrics and getting the exact metric wrong myself. Today it is radar-observed wind speed and historically there were other less efficient means to test or at least estimate wind speed.

The original point still stands that Hurricanes are defined by only the one metric and other metrics have room to grow bigger as the category stays the same:

> The Saffir-Simpson Hurricane Wind Scale is a 1 to 5 rating based only on a hurricane's maximum sustained wind speed. This scale does not take into account other potentially deadly hazards such as storm surge, rainfall flooding, and tornadoes.

From: https://www.nhc.noaa.gov/aboutsshws.php

newsclues · 2026-04-23T09:40:29 1776937229

Climate science is largely based on models, not just raw data.

Climate science is also highly political, and seems to have a big economic impact…

fnordpiglet · 2026-04-23T01:59:52 1776909592

I think this is a narrow view. Aws and azure build their own data centers and partner closely with Nvidia and build their own silicon too. TPUS are non standard, no one else can run them - Nvidia build on fabrics and technologies well under and well integrated for a long time (mellanox etc) and clearly work very closely with the aws and azure hardware and data center build teams. I’d not bet that Google can do things better than everyone else - that’s certainly something Googlers always believe about themselves but it’s not the case that you can’t build a best of breed that meets or exceeds total in house builds.

fnordpiglet · 2026-04-22T16:25:21 1776875121

For coding often quality at the margin is crucial even at a premium. It’s not the same as cranking out spam emails or HN posts at scale. This is why the marginal difference between your median engineer and your P99 engineer is comp is substantial, while the marginal comp difference between your median pick and packer vs your P99 pick and packer isn’t.

I’d also say it keeps the frontier shops competitive while costing R&D in the present is beneficial to them in forcing them to make a better and better product especially in value add space.

Finally, particularly for Anthropic, they are going for the more trustworthy shop. Even ali is hosting pay frontier models for service revenue, but if you’re not a Chinese shop, would you really host your production code development workload on a Chinese hosted provider? OpenAI is sketchy enough but even there I have a marginal confidence they aren’t just wholesale mining data for trade secrets - even if they are using it for model training. Anthropic I slightly trust more. Hence the premium. No one really believes at face value a Chinese hosted firm isn’t mass trolling every competitive advantage possible and handing back to the government and other cross competitive firms - even if they aren’t the historical precedent is so well established and known that everyone prices it in.

ozgrakkurt · 2026-04-22T19:33:00 1776886380

I just assume any of those companies would steal my work and wouldn't care about it.

Everything they have done so far indicates this.

Running your own is the only option unless you really trust them or unless you have the option to sue them like some big companies can.

Or if you don't really care then you can use the chineese one since it is cheaper.

What makes you trust Anthropic more than Alibaba?

fnordpiglet · 2026-04-22T21:08:44 1776892124

There’s a difference between stealing for model training and direct monitoring of actionable trade secrets and corporate espionage. Anthropic and OpenAI wouldn’t do this simply because they would be litigated out of existence and criminally investigated if they did. In China it’s an expected part of the corporate and legal structure with virtually no recourse for a foreign firm and when it’s in states interest domestic either. I’m surprised you don’t realize the US has fairly strong civil, criminal, and regulatory protections in place for theft of actionable material and reuse of corporate and trade secrets, let alone copyright materials. I assure you their ToS also do not allow them to do this and that in itself is a contractual obligation you can enforce and win in court.

trvz · 2026-04-22T21:40:29 1776894029

Anthropic already admitted to heavily monitoring user requests to protect against distillation. They have everything in place, turning on learning from user data would literally be just a couple lines of code at this point. Anyone trusting them not to do it is a fool.

anon373839 · 2026-04-22T22:51:48 1776898308

Absolutely. Plus as these companies become hungrier for revenue and to get out of the commodity market they are in, they are only going to get more aggressive in their (ab)use of customer data.

Zetaphor · 2026-04-23T02:18:33 1776910713

How exactly do you propose that a local weights model that I can run without an internet connection is going to exfiltrate my trade secrets to the Chinese government?

fnordpiglet · 2026-04-23T03:48:19 1776916099

If you read I’m talking about their service only models.

dTal · 2026-04-23T09:16:47 1776935807

Why? No one else was. The discussion was about OpenAI / Anthropic's lack of moat when there are open weights models that are almost as good. You can host them anywhere you like. Pay a US company to do so if you want.

bigbadfeline · 2026-04-22T17:11:17 1776877877

> For coding often quality at the margin is crucial even at a premium

That's a cryptic way to say "Only for vibe-coding quality at the margin matters". Obviously, quality is determined first and foremost by the skills of the human operating the LLM.

> No one really believes at face value a Chinese hosted firm isn’t mass trolling every competitive advantage possible

That's much easier to believe than the same but applied to a huge global corp that operates in your own market and has both the power and the desire to eat your market share for breakfast, before the markets open, so "growth" can be reported the same day.

Besides, open models are hosted by many small providers in the US too, you don't have to use foreign providers per se.

fnordpiglet · 2026-04-22T17:27:44 1776878864

1) model provider choices don’t obviate the need to make other good choices

2) I think there is a special case for Chinese providers due to the philosophical differences in what constitutes fair markets and the regulatory and civil legal structure outside China generally makes such things existentially dangerous to do; hence while it might happen it is extraordinarily ill advised, while in China is implicitly the way things work. However my point is Ali has their own hosted version of Qwen models operating on the frontier that are at minimum hosted exclusively before released. Theres no reason to believe they won’t at some point exclusively host some frontier or fine tuned variants for purposes for commercial reasons. This is part of why they had recent turnover.

rohansood15 · 2026-04-22T16:35:48 1776875748

Most code is not P99 though.

Also, have you considered that your trust in Anthropic and distrust in China may not be shared by many outside the US? There's a reason why Huawei is the largest supplier of 5G hardware globally.

fnordpiglet · 2026-04-22T17:30:17 1776879017

I find it hard to believe anyone who has ever done business inside China doesn’t know that the structure of Chinese business is built around massive IP theft and repurposing on a state wide systematic level. It’s not a nationalism point, it’s an objective and easily verified truth.

Most code is not P99, but companies pay a premium to produce code that is. That’s my point.

Zetaphor · 2026-04-23T02:22:50 1776910970

I'll ask you the same thing I asked the other guy. How is a an open weights model that I can run on my own hardware without an internet connection going to exfiltrate my trade secrets to the Chinese government?

dTal · 2026-04-23T09:13:53 1776935633

It's the same user and they already answered you: "If you read I’m talking about their service only models."

But yes this is a non-sequitor. The original question was "What competitive advantage does OpenAI/Anthropic has when companies like Qwen/Minimax/etc are open sourcing models that shows similar (yet below than OpenAI/Anthropic) benchmark results?"

Even if you don't trust Chinese companies, and you want a hosted model, you can always pay a third party to host a Chinese open weight model. And it'll be a lot cheaper than OpenAI.

rohansood15 · 2026-04-23T05:32:03 1776922323

Chinese companies are built on IP theft, and Anthropic/Open AI are not?

And in world where code generation costs are trending to zero, goodluck commanding a premium to produce any kind of code.

There is a whole bunch of P99 code that is open-source. What makes code P99 is not the model that produces it, but the people who verify/validate/direct it.

piperswe · 2026-04-24T14:40:39 1777041639

Didn’t the major American labs pirate a whole bunch of their training data?

runjake · 2026-04-22T16:39:30 1776875970

You're right, but perspective is important, and that's because China and the US are engaged in economic warfare (even before the current US regime), vying for the dubious title of "superpower".

DiogenesKynikos · 2026-04-23T01:05:45 1776906345

Are you claiming that major Chinese cloud providers like Tencent and Alibaba are pilfering trade secrets from their customers' data? To my knowledge, there's no evidence for that whatsoever. If it were true and came out, it would instantly tank their cloud businesses (which is why they don't do it, and why AWS, Azure, etc. also don't do it).

If it were to happen, Chinese law does offer recourse, including to foreign firms. It's not as if China doesn't have IP law. It has actually made a major effort over the last 10+ years to set up specialized courts just to deal with IP disputes, and I think foreign firms have a fairly good track record of winning cases.

> No one really believes at face value

This says a lot more about the prejudices and stereotypes in the West about China than it does about China itself.

Zetaphor · 2026-04-23T02:29:01 1776911341

In every one of these threads for a new Chinese open weights model, it's always the same tired discussion of how this is all actually a psyop by the Chinese government to undermine US interests and it can't answer questions about Tienanmen Square.

Meanwhile I'm over here solving real world business problems with a model that I can securely run on-prem and not pay out the nose for cloud GPU inference. And then after work I use that same model to power my personal experiments and hobby projects.

There are no Chinese labs with different financial and political motivations, there's only "China" the monolith. The last thread for Qwen's new hosted model was full of folks talking about how "China" is no longer releasing open weights models, when the next day Moonshot AI releases Kimi 2.6. A few days later and here's Qwen again with another open release.

For some reason this country gets what I assume are otherwise smart Americans to just completely shut off their brains and start repeating rhetoric.

andriy_koval · 2026-04-23T17:02:31 1776963751

> The last thread for Qwen's new hosted model was full of folks talking about how "China" is no longer releasing open weights models, when the next day Moonshot AI releases Kimi 2.6. A few days later and here's Qwen again with another open release.

looks like you declared win argument, because you now see that 2.6 was released, but at that time your opponents argument stand.

Also, you can't predict if Chinese labs will continue releasing open frontier models. Looks like Kimi is the only one left, Qwen is much smaller model.

Zetaphor · 2026-04-24T16:40:20 1777048820

> looks like you declared win argument, because you now see that 2.6 was released, but at that time your opponents argument stand.

Their argument was based entirely on speculation, but stated as a matter of fact, despite Alibaba making very clear statements that they were going to continue releasing open models.

And the core of my argument is that they were conflating a single company with the motivations of multiple companies in a country. Nobody talks about US companies by saying "The Americans are going to do X", they say "OpenAI/Anthropic/Google is going to do X".

ginko · 2026-04-23T09:24:38 1776936278

>but if you’re not a Chinese shop, would you really host your production code development workload on a Chinese hosted provider?

The point of open source models is that you host them locally. I trust neither Chinese nor American providers with this.

andriy_koval · 2026-04-23T16:57:40 1776963460

another point is that there could be multiple inference providers, so market will be healthier, and not dominated by one player who charges NN% margin.

otabdeveloper4 · 2026-04-22T17:03:29 1776877409

> For coding often quality at the margin is crucial even at a premium.

For coding, quality is not measurable and is based entirely on feels (er, sorry, "vibes").

Employers paying for SOTA models is nothing but a lifestyle status perk for employees, like ping-pong tables or fancy lunch snacks.

fnordpiglet · 2026-04-22T18:13:21 1776881601

I’m building my own company and I consider model choice crucial to my marginal ability to produce a higher quality product I don’t regret having built. Every higher end dev shop I’ve worked at over the last few years perceives things the same. There are measurable outcomes from software built well and software not, even if the code itself isn’t easily measurable. I would rather pay a few thousand more per year for a better overall outcome with less developer struggle against bad model decisions than end up with an inferior end product and have expensive developer spin wheels containing a dumb as a brick model. But everyone’s career experiences are different and I’d feel sad to work at a place where SOTA is a lifestyle choice rather than a rational engineering and business choice.

otabdeveloper4 · 2026-04-23T05:22:09 1776921729

"Rational engineering and business choice" and "AI" are two words that do not go together.

Wait five years and come back. Right now AI is 100% FOMO and lifestyle signaling and nothing more.

j-bos · 2026-04-22T19:42:54 1776886974

"based entirely on feels"

Now there's a word I haven't heard in a long, long time.

OtomotO · 2026-04-22T20:14:49 1776888889

> but if you’re not a Chinese shop, would you really host your production code development workload on a Chinese hosted provider?

As opposed to an US-american shop? Yup, sure, why not? It's the same ballpark.

donmcronald · 2026-04-22T16:52:22 1776876742

Given the very limited experience I have where I've been trying out a few different models, the quality of the context I can build seems to be much more of an issue than the model itself.

If I build a super high quality context for something I'm really good at, I can get great results. If I'm trying to learn something new and have it help me, it's very hit and miss. I can see where the frontier models would be useful for the latter, but they don't seem to make as much difference for the former, at least in my experience.

The biggest issue I have is that if I don't know a topic, my inquiries seem to poison the context. For some reason, my questions are treated like fact. I've also seen the same behavior with Claude getting information from the web. Specifically, I had it take a question about a possible workaround from a bug report and present it as a de-facto solution to my problem. I'm talking disconnect a remote site from the internet levels of wrong.

From what I've seen, I think the future value is in context engineering. I think the value is going to come from systems and tools that let experts "train" a context, which is really just a search problem IMO, and a marketplace or standard for sharing that context building knowledge.

The cynic in me thinks that things like cornering the RAM market are more about depriving everyone else than needing the resources. Whoever usurps the most high quality context from those P99 engineers is going to have a better product because they have better inputs. They don't want to let anyone catch up because the whole thing has properties similar to network effects. The "best" model, even if it's really just the best tooling and context engineering, is going to attract the best users which will improve the model.

It makes me wonder of the self reinforced learning is really just context theft.

jameson · 2026-04-23T19:23:42 1776972222

Apologies for my ignorance but how can you know the quality of the context?

swiftcoder · 2026-04-22T20:18:20 1776889100

> For coding often quality at the margin is crucial even at a premium

For some problems, sure, and when you are stuck, throwing tokens at Opus is worthwhile.

On the other hand, a $10/month minimax 2.7 coding subscription that literally never runs out of tokens will happily perform most day-to-day coding tasks

solenoid0937 · 2026-04-23T05:16:30 1776921390

"Literally never runs out of tokens?" lol, no. Tokens are just energy. There is always a way to run out of tokens, and no one will subsidize free tokens forever.

swiftcoder · 2026-04-23T07:48:56 1776930536

"Never runs out of tokens" in the sense that running 8 hours a day 7 days a week is still under the subscription limit

solenoid0937 · 2026-04-23T16:39:59 1776962399

You can also do that on an API without hitting a limit!

swiftcoder · 2026-04-23T17:38:39 1776965919

Not typically at predictable monthly spend, which turns out to be important to some folks

pistoriusp · 2026-04-23T07:21:56 1776928916

if you run it at home then the sun is a pretty good way to get "free energy."

sumedh · 2026-04-22T21:57:51 1776895071

Why pay for two subscriptions though?

Claude also has other models which use less tokens.

swiftcoder · 2026-04-23T17:39:39 1776965979

Redundancy, mostly. And having left over tokens when Opus eats all of those tokens

AJ007 · 2026-04-22T16:33:27 1776875607

Not sure how your last point matters if 27b can run on consumer hardware, besides being hosted by any company which the user could certainly trust more than anthropic.

OpenAI & Anthropic are just lying to everyone right now because if they can't raise enough money they are dead. Intelligence is a commodity, the semiconductor supply chain is not.

datadrivenangel · 2026-04-22T17:51:47 1776880307

The challenge is token speed. I did some local coding yesterday with qwen3.6 35b and getting 10-40 tokens per second means that the wall time is much longer. 20 tokens per second is a bit over a thousand tokens per minute, which is slower than the the experience you get with Claude Code or the opus models.

Slower and worse is still useful, but not as good in two important dimensions.

fnordpiglet · 2026-04-22T18:16:35 1776881795

Also benchmark measures are not empirical experience measures and are well gamed. As other commenters have said the actual observed behavior is inferior, so it’s not just speed.

It’s ludicrous to believe a small parameter count model will out perform a well made high parameter count model. That’s just magical thinking. We’ve not empirically observed any flattening of the scaling laws, and there’s no reason to believe the scrappy and smart qwen team has discovered P=NP, FTL, or the magical non linear parameter count scaling model.

dTal · 2026-04-23T09:35:19 1776936919

Ooh, car analogy time!

It's kinda like saying a car with a 6L engine will always outperform a car with a 2L engine. There are so many different engineering tradeoffs, so many different things to optimize for, so many different metrics for "performance", that while it's broadly true, it doesn't mean you'll always prefer the 6L car. Maybe you care about running costs! Maybe you'd rather own a smaller car than rent a bigger one. Maybe the 2L car is just better engineered. Maybe you work in food delivery in a dense city and what you actually need is a 50cc moped, because agility and latency are more important than performance at the margins.

And if you're the only game in town, and you only sell 6L behemoths, and some upstart comes along and starts selling nippy little 2L utility vehicles (or worse - giving them away!) you should absolutely be worried about your lunch. Note that this literally happened to the US car industry when Japanese imports started becoming popular in the 80s...

anon373839 · 2026-04-22T22:56:53 1776898613

This is just blind belief. The model discussed in this topic already outperforms “well made” frontier LLMs of 12-18 months ago. If what you wrote is true, that wouldn’t have been possible.

datadrivenangel · 2026-04-22T23:26:23 1776900383

It's amazing that we can run models better than state of the art ~36 months ago on local consumer devices!

rmacqueen · 2026-04-22T18:39:16 1776883156

> This is why the marginal difference between your median engineer and your P99 engineer is comp is substantial, while the marginal comp difference between your median pick and packer vs your P99 pick and packer isn’t.

That's an interesting analogy.