Anthropic says Alibaba illicitly extracted Claude AI model capabilities (reuters.com)

808 points|by htrp|4d ago|1,304 comments|Read full story on reuters.com

Comments (1304)

1. zakkl|4d ago|context

It sounds like Anthropic is eagerly trying to show to USG that they are willing to heavily monitor ‘foreign adversaries’ on their platforms.
This, combined with no implementation of KYC, makes it seem like they want to find a middle ground with Fable where it's off export controls but they promise to prevent China and specific others from using it.
2. ninefathom|3d ago|context

This seems to me like a stab in the right direction.
Obviously their actions are going to be fiscally motivated at the root, but sussing out how they intend the precise dynamics to play out is more nuanced.
Thinking of this as an effort to woo the defense hawks cuts a very clear path.
3. verdverm|3d ago|context

This is not the first time it has happened. What have they done to improve the situation? I suspect it's more of a cat & mouse game, with a lot more cats playing.
4. drillsteps5|4d ago|context

I'm looking forward to the trial where Anthropic will have to disclose the sources of their training data, and then explain why they are entitled to charge customers for using regurgitated training data but Alibaba, which trains its models on Anthropic's models, is not.
Should be fun.
Edit: clarification
5. ninefathom|3d ago|context

While I love the sentiment, I feel like the odds of this actually ever reaching a trial are low, given the international positioning of the parties, and the... um... complex relationships involved.
Anthropic's actions seem performative. Others have already speculated on the likely audience(s).

6. AdieuToLogic|3d ago|context

> While I love the sentiment, I feel like the odds of this actually ever reaching a trial are low ...

As cited in a peer comment here[0]:

  In June 2025, Judge William Alsup of the U.S. District 
  Court for the Northern District of California ruled on 
  summary judgment that using books without permission to 
  train AI was fair use if they were acquired legally, but he 
  denied Anthropic’s request for summary judgment related to 
  piracy—finding that the piracy was not fair use.[1]

Of note in the judge's finding: "the piracy was not fair use".

0 - https://news.ycombinator.com/item?id=48667411

1 - https://authorsguild.org/advocacy/artificial-intelligence/wh...

7. appplication|3d ago|context

Being logically consistent isn’t as profitable as being aggressive and loud.
8. conception|3d ago|context

They already did and paid 1.5B https://authorsguild.org/advocacy/artificial-intelligence/wh...
9. cr125rider|3d ago|context

Meta/Facebook got away with it though right?
10. gaiagraphia|3d ago|context

Quite amusing that the library of libgen is worth 1.5bil for unlimited access.
It's about the same valuation as bun, lol.
11. mastermedo|3d ago|context

$3,000 per title.
12. scotty79|3d ago|context

Do you think many authors would give you rights to create derivative works en masse for that money?
13. gf000|3d ago|context

For endlessly reselling the whole work verbatim? Well, where can I buy such a license in the real world, because then I would like to buy a couple of those!
14. mannanj|3d ago|context

That's a great cost-benefit ratio. Can you and I steal and do illegal things and pay the same cost?
15. eviks|3d ago|context

Sure, but only if you get the same benefits
16. mannanj|3d ago|context

looks like we can't today. Man it would be great to figure out how to be above the law just like how these other rich people in different social classes are.
17. fg137|3d ago|context

That's only a fraction of the training data.
18. drillsteps5|2d ago|context

Interesting. Looks like the judge ruled using legally obtained knowledge (books, articles, etc) to train AI constitutes "fair use".
Given that the US legal system is precedent-based, that... changes things.
19. HarHarVeryFunny|2d ago|context

That's a tiny drop in the bucket of the value these AI companies have appropriated from society.
Just to give an idea of the scale of it:
Let's say a modern SOTA LLM has 1T params and is therefore trained on 100T tokens
1000 tokens of text = 750 words of prose, which may take 15 min to 3hr to write (Gemini's estimate)
1000 tokens of code = 50-70 lines of code, which may take 15min to 5hr to write
We just want a rough estimate of the value of this, so let's say that 1000 tokens took 1hr of human labor to generate at an average wage of $50/hr
So, if 1000 tokens cost $50 of human labor, then that 100T of training data cost $5T.
So, the value of what the AI companies took from society might better be estimated in the trillions of dollars, not billions.
And of course what they are doing with all this data is building generative AI, so it's not just the value of what they took, but more importantly the future opportunity they are stealing from everyone by replacing human labor with their automaton, whose profits they intend to keep for themselves.
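For anyone who wants to check the back-of-envelope arithmetic above, here is a minimal Python sketch. It only restates the assumptions already given in this comment (100T training tokens, 1 hour of labor per 1,000 tokens, $50/hr); the function name is made up for illustration, and none of the inputs are measured figures.

```python
def training_data_labor_value(total_tokens, tokens_per_hour=1_000, wage_per_hour=50):
    """Rough dollar value of the human labor behind a training corpus.

    All defaults are the rough guesses from the comment above, not measured data.
    """
    hours = total_tokens / tokens_per_hour  # hours of human writing
    return hours * wage_per_hour            # dollars of labor


value = training_data_labor_value(100e12)   # 100T tokens
print(f"${value / 1e12:.1f}T")              # value expressed in trillions of dollars
```

Changing any one assumption scales the result linearly, so a 20T-token corpus at the same rate would come out to a fifth of the figure.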
20. Artoooooor|3d ago|context

And if it includes at least one GPL source, they should release the weights under a GPL license.
21. rvz|4d ago|context

Notice how Anthropic is now scapegoating Chinese model providers like Alibaba and outright accusing them of distilling their models.
Whether it is true or not, this is part of their effort to use them as an example to scare everyone into getting Congress to ban powerful models from being accessed outside of the US and also ban powerful local models from being released.
Anthropic does not care about you, and they are not your friends.
22. re-thc|3d ago|context

> Whether it is true or not
If it was just "that easy" then I doubt only "Chinese models" would be doing it and we'd already be packed with competition.
Distilling might be a thing but it isn't a free win.
23. skeledrew|3d ago|context

Only China really has the resources (multiple labs invested in the space), culture (Asians are generally collectively-inclined, so sharing is in their core) and political bent (there will be no diplomatic repercussions) to put up a fight.
24. re-thc|3d ago|context

> Only China really has the resources (multiple labs invested in the space)
That's not the point. Why is it a country thing? There are plenty of non-China startups in this space with resources at that scale. The "China has resources" line is just "Western media narrative" speak. So Meta should have won a long time ago? Or xAI?
> culture (Asians are generally collectively-inclined, so sharing is in their core)
Just stereotype it? So we've gone from China -> "Asian"? Then where is your Korean or Japanese model etc? And somehow you know they're sharing.
> political bent (there will be no diplomatic repercussions) to put up a fight
More inferring from "Western media news"?
Where's the reality?
The media hyped up Gemini / Google TPU free-win last year. How did that go?
25. skeledrew|3d ago|context

> Why is it a country thing?
Because the China vs US geopolitical situation is a thing. Meta is a social media company, not an AI company, and they direct their focus as such. xAI just never got serious traction so now they're selling their compute. Also if a US company were caught distilling, I think Anthropic could actually take them to court, and I'd guess they don't want that kind of PR.
> Just stereotype it?
Is China not Asian? Are Asians not generally collective/cooperative, as opposed to individualistic/competitive?
The "and" that joined those 3 items is very important: it means you can't pull them apart and address them independently as they each contribute to the context. I'm not too sure about Korea, but in a way Japan is a US colony in all but name. Both are very much politically intertwined with the West (along with RoC/Taiwan), which means nothing major that may be against US interest happens.
The reality is that China and the US are essentially in a trade war, where the latter is trying its best to keep the former in the Dark Ages, because "national security", but the former is refusing to take it lying down and continues to make progress regardless[0], because they have the resources and will.
[0] https://thenextweb.com/news/china-lineshine-supercomputer-to...
26. re-thc|3d ago|context

> Because the China vs US geopolitical situation is a thing.
By the media? It's easy to point fingers at a blackhole.
> Meta is a social media company, not an AI company,
Alibaba (the discussion here) is not an AI company too (by your definition).
> Also if a US company were caught distilling, I think Anthropic could actually take them to court, and I'd guess they don't want that kind of PR.
Meta has been to Congress. Microsoft, Google etc. have been in lots of court cases and continue to be. Do you really think that is what stops them?
> Is China not Asian? Are Asians not generally collective/cooperative, as opposed to individualistic/competitive?
This is exactly the "media" view you get. It's just stereotypes and generalization.
And yes, that is wrong by the way. It's evident in real data. "China" as a whole wins market share in many areas, but no one company has as much of a monopoly as US companies do. Why? There's so much competition that it is scary. So are you sure they don't compete?
> but in a way Japan is a US colony in all but name
Again, I almost give up seeing this. Clearly not. If a whole country, one of the world's top 5 by GDP, is only that to you, something is wrong with what you're seeing, not with the country.
> Both are very much politically intertwined with the West (along with RoC/Taiwan), which means nothing major that may be against US interest happens
On the table? You do know that China is a top trading partner with all of these on your list. Despite whatever spat you might see in the media.
> The reality is that China and the US are essentially in a trade war
No. That's what the US government wants you to believe. It was even documented that in his 1st term Trump, wanting a grand policy, asked Kushner, who then suggested China (pretty randomly), and so they went with it. Trump has done so few "China"-related things lately, due to all the backlash, that you'd think he has moved on and found new toys.
Until very recently, the GPU export ban had such a loophole (Chinese companies were able to use subsidiaries outside of China to buy GPUs and train on them) that the whole thing was meaningless.
Conclusion: stop getting brainwashed by media articles. It's all a show to get someone like you riled up.
27. skeledrew|2d ago|context

> By the media?
What is it that you have against the media? If it's bias that you're trying to avoid then it's futile as there's bias everywhere; it's inherently human to be biased. What you do is go for multiple varied sources, with that awareness, find the intersections, and apply your own judgment to your findings.
> Alibaba (the discussion here) is not an AI company too
No, but that doesn't stop them from having an internal AI lab. Meta may not be an AI company but it definitely has a lab. Same for Google (a primarily information retrieval company), and they have a very good AI lab (though part of that is because their primary business benefits greatly from AI, so they must invest heavily in it).
> Do you really think that is what stops them?
I really don't know as I haven't looked into it much; that's mostly speculation on my part.
> It's just stereotypes and generalization
It's a generalization that holds well. I lived among and went to college with Asians (fresh out of Asia) for several years, and during my studies (I have a degree in Cross-Cultural Studies) my focus was Asia, primarily Japan. As such it's something that I've come to learn "from the horses' mouths".
Sure there can be a lot of competition in business and academia (especially in the younger generation), but that happens within a core framework of cooperation (group obligation and social harmony). It's kind of like a race to get the furthest fastest, to then share the secret to that endurance and speed with those who "lost". You only have to take a look at a random sampling of research released publicly, particularly in tech: you will note a predominantly Chinese authorship in general across the board. And it comes back to my original 3 points:
- China has the resources to pour into R&D in any area of interest,
- there may be a lot of competition, but a primarily collectivist/cooperative core means results are shared broadly so everyone can advance, and there are few qualms about taking from others (the West calls this "IP infringement" or something) because ultimately it's an ingrained cultural expectation that such things are available,
- China isn't in bed with (ie controlled by) the US politically (unlike so many other countries in Asia), so they have no problem going against their wishes/culture.
All points together make for a country that can most efficiently - ironically - compete with the US and not be subject to its foreign policies, which scares them to their core, and that fear is expressed in the abundance of export controls and other restrictions the US has in place and forces other Western countries to adopt.
> I almost give up seeing this
Are you aware that:
- Japan was invaded and occupied by the US for several years?
- the Constitution that governs Japan today was drawn up by the occupying force during said occupation?
- Japan is forbidden by the US from building a military (with the token offering that they can maintain a "Self Defense Force"; US has promised to "protect" them otherwise)?
- there are over 20 US military bases in Japan?
Do you think there's anything that Japan can do without the allowance of the US? Just imagine if they were to somehow get a public consensus to update that Constitution that's not in the US's interest (will never happen since they're infiltrated at all levels, but still); what do you think would happen if they then started to enact those changes? Think the military at those bases will just sit and idly watch?
> You do know that China is a top trading partner with all of these on your list.
That's irrelevant. The way the world is structured today is that with many countries, trade happens regardless of political rifts and sanctions (which are almost never 100%). For example many EU countries still buy fossil fuels from Russia, which is pretty heavily sanctioned. And regardless of the increasing export controls and sanctions, the China and US economies are pretty interdependent.
>Until very recently[...]
Yes there are and will always be some kind of loophole, because of economic interdependence. One can't fully shut out the other, or do anything that leads to a direct China-US hot war. Just look at the politics at play over RoC/Taiwan. Neither will back down, and neither will do anything first that leads to escalation.
Conclusion: don't dismiss any source outright, but evaluate them carefully. There are always multiple sides to a story, and it's on you to put the pieces together.
28. re-thc|2d ago|context

> What is it that you have against the media?
Certain outlets clearly aren't reporting but manipulating, through paid actors. It's nowhere near the truth. Not even 50% of it.
> I really don't know as I haven't looked into it much; that's mostly speculation on my part.
And I'll stop there. That's 99% of your post. Ungrounded facts. Misguided claims.
You claim you have studied history and maybe that's your bias. History was a long time ago. Back then the United Kingdom, i.e. Great Britain, ruled the world too. Are we going to go on about that? The world changed. History doesn't apply. Stop talking about it. So your view of Asians might be old.
It's also just sampling bias. You go to 1 place. See a few "Asians" and it represents a whole continent?
Following your logic I go to 1 restaurant and will claim all American food is terrible based on it.
> That's irrelevant.
You're just blinded and misguided. The end.
> Do you think there's anything that Japan can do without the allowance of the US?
Yes. Iran can do what they want without the allowance of the US. The consequence is that military action broke out. But they can. So why can't Japan? All the same. You're mixing logic up.
> Conclusion: don't dismiss any source outright, but evaluate them carefully.
Which you failed to do.
29. skeledrew|2d ago|context

> It's nowhere near the truth
So... where do you get your "truth" from?
> Ungrounded facts. Misguided claims
Point them out with your - apparently non-media - sources which prove otherwise.
> You claim you have studied history
You'll have to also point out where I made such a claim. And I'm pretty sure my view of Asians isn't old, as I still communicate with some regularly.
This isn't about going "to 1 place". This is about doing deep related research (secondary sources) over several years and the results of said research for the most part agreeing with experience with and words of the primary sources (those I interacted with over several years). Your restaurant comparison doesn't fly at all.
> So why can't Japan? All the same.
You have everything there. Read the points again. Try to follow the flow; it isn't that hard. I don't see how you can say a country where the US has had continuous military presence for decades and a country where they were afraid/would be ill-advised to send in ground troops are "same" in this context.
30. re-thc|2d ago|context

> I don't see how you can say a country where the US has had continuous military presence for decades and a country where they were afraid/would be ill-advised to send in ground troops are "same" in this context.
It is the "same". Every country is in striking distance or you can move troops over.
You're speaking like Napoleon or someone? More history lessons? What troops?
These days you can nuke or bomb most places easily. Aircraft carriers, drones, etc. The US has strategic bases everywhere. Inside or outside. No difference. You're 90% arguing over semantics that sound nice on paper. Not so in reality.
Get on with the times.
> And I'm pretty sure my view of Asians isn't old, as I still communicate with some regularly.
That doesn't make it not old. If I speak to my grandfather, who hasn't changed in 20 years, that's still regular AND old.
31. skeledrew|2d ago|context

> It is the "same".
It's not. There are different costs - political, logistical, etc. There would be a very high cost if US troops went into Iran, or just if the war were to continue. Note there are still issues and Iran can/will still close the strait whenever they feel slighted. There's almost no cost if the US wanted to influence Japan as they're already inside.
And it seems you're also against history. So no media for current affairs, and no history for contextualizing current state of affairs. Again, what are your sources for your arguments? So far it's like you're pulling everything out of thin air, with zero support or even basic logic.
Also it would behoove you to look into Asian attitudes to social interaction, although it'll be difficult if you don't spend some time communicating with some of them, and nigh on impossible if you completely dismiss the history that shaped what is today.
32. re-thc|1d ago|context

> There's almost no cost if the US wanted to influence Japan as they're already inside.
Then why did Japan need a tariff? That's a cost. If it was as you say, you wouldn't need it.
> although it'll be difficult if you don't spend some time communicating with some of them
How do you know I don't? Your problem is your conclusions are all superficial. This includes your continued generalization.
> And it seems you're also against history.
I'm not against history. You've just applied it wrong. History says the Asians you speak so much about have had just as many wars and internal conflicts as you can imagine. So how can you conclude they work together as a social contract. Maybe on the surface.
33. skeledrew|13h ago|context

> Then why did Japan need a tariff?
What is this about and how does it relate to the context here?
> How do you know I don't?
I never made a conclusion whether you do or don't. I only said it'd be difficult to draw an accurate picture if you don't.
> So how can you conclude they work together as a social contract.
It's something that has been studied for decades, and found to continually hold generally (despite increasing Western influence). Here are a few links to help you along in your discovery of Asian collectivist attitudes:
- https://www.simplypsychology.org/what-are-collectivistic-cul...
- https://www.sohoinchina.com/culture-collectivism/
- https://www.numberanalytics.com/blog/collectivism-in-traditi...
- https://worldpopulationreview.com/country-rankings/collectiv...
Please, if you're going to continue this, provide decent sources and/or logical reasoning to advance your arguments. Going around in circles is tedious.
34. Larrikin|2d ago|context

In what way is Japan a "colony" but Germany or any of the other European nations with US bases not?
The SDF is very capable and being "defensive" is a thing the people could vote away whenever they want. It's simply convenient for not being dragged into pointless wars
35. skeledrew|2d ago|context

> Germany or any of the other European nations with US bases not
Never said anything about them as not relevant in the context of this convo. But Germany's Basic Law was also shaped while they were under occupation. Main difference I'd say is the US mostly played a supervisory role there (since it was a collective Allied occupation) rather than the one doing the writing.
> The SDF is very capable[...]
Doesn't really matter; point is the status was forced on them by another, and they can't change that status on their own whether they want to or not. Theoretically, the people could vote it away. Practically, if the US isn't fully on board, there won't even be a vote.
36. re-thc|2d ago|context

> Main difference I'd say is the US mostly played a supervisory role there
Does it matter? So this is about history. Not today.
Well history says the US came from... so are we going to add that argument? Then none of the places you mentioned are superior to one another and the whole argument is moot.
37. Larrikin|2d ago|context

>Theoretically, the people could vote it away. Practically, if the US isn't fully on board, there won't even be a vote.
Provide any evidence. This idea seems to come from your misinformed opinion and trying to provide some superficial difference between Europe and Japan
38. skeledrew|2d ago|context

It's more a matter of logic and human nature. The Japanese have been essentially traumatized from being nuked and then being officially occupied by the US for years (unofficially still occupied, given all those military bases). A good amount are against doing anything that'll "rock the boat". General MacArthur (the guy who led the invasion and occupation of Japan as well as made their Constitution) designed the Constitution so it's extremely difficult to update (need a super majority of the Diet, before a national referendum). A continuous, overt US presence means they can easily apply a little pressure to prevent any Constitutional changes, while any actions without such change would be illegal (and the Japanese are generally sticklers for the law). If there was anything that the US really wanted to change (can't think of anything, since it serves them well as is) they can always start a campaign, perhaps similar to La Tilde[0].
This[1] gives an idea of just how difficult it is to change the Constitution (there have been 0 amendments to date).
I really can't say much about Europe as I haven't done any research about any countries there.
[0] https://theintercept.com/2026/06/02/la-tilde-propaganda-lati...
[1] https://www.nippon.com/en/in-depth/d00847/
39. re-thc|1d ago|context

> It's more a matter of logic and human nature.
It's a matter of you arguing over semantics. Useless ones too.
The US, and in particular the current president, Trump, has shown it is all useless. Every government has enough levers and workarounds to make it happen.
Trump has classified the attacks on Iran as "self-defense" so self-defense force, whatever... does it matter?
> and the Japanese are generally sticklers for the law
No, they're not. As usual you just take it at face value and read fake news. Japan's actual prosecution rate is ~30-40%. Post-indictment conviction rate is 99%. Guess why? They fudge the numbers, cheat the system and make it up.
40. skeledrew|13h ago|context

Cite your sources and show your reasoning.
41. sheepscreek|3d ago|context

I think it’s more than that. Piecing together the perspective of a few commentators in this post - it’s plausible Anthropic is trying to shift the narrative from US vs. Rest of the world to US vs. China.
In other words, they want to sell Fable or future more powerful models to the rest of the world (presumably all future models are going to be more powerful than current gen). One way they can sell this to the government is by scapegoating China (which is their primary concern anyway).
This is working on the presumption that non-US companies form a material portion of their current revenue.
42. zb3|3d ago|context

If true then Alibaba is doing us a public service, good job, I hope this extraction was successful.
43. 0xbadcafebee|3d ago|context

There's two basic kinds of distillation: 1) the massive [and dumb] method where you ask a question and use the answer as reinforcement (Black Box), and 2) more targeted distillation where you use one model to directly inform/train/guide another model (RLAIF).
The latter is basically fine-tuning the model with direction from another model. Thousands of businesses do this every day to fine-tune. This is almost certainly what the Chinese labs are doing, since it has a much better effect on the end result than just getting simple answers to simple questions.
These complaints of distillation are inflating the problem to make it sound worse than it is, because they want the USG to block/ban Chinese model providers as protectionism. They have already called for more export controls on chips (which is funny because DeepSeek v4 was designed to run on Huawei chips and now the other Chinese providers are following suit). But they can't come right out and say that, so their claim is that they're asking for more export controls because distilled models might not be as safe as their own. But if you show them a jailbreak of their model that bypasses their safety, they'll tell you that any model can eventually be jailbroken so don't worry about safety.
44. dannyw|3d ago|context

If you’re doing evals, you’re basically doing RLAIF without training a model; just looking at the results.
Fundamentally it is very difficult to stop this while still making your AI models useful.
45. zmgsabst|3d ago|context

Similarly, if you did a corpus study on bioRxiv to summarize recent science findings, you could use the same questions and answers to fine tune a model.
There is no way to communicate information at scale to companies through the API, for anything approaching a real application, without that information forming a corpus another model can be trained on.
But it wouldn’t be the first time they broke a model:
Their “guardrails” that cause it to reject user prompts also mean it relies on its pop science summary of medicine to tell you why bioRxiv is wrong rather than accurately summarize the papers.
They’ve successfully created a smug, argumentative average of the internet which refuses to even consider it might be wrong or that it’s reading a science paper which is based on measurements and not vibes — but why would I pay for that?
I get it for free online.
46. janalsncm|3d ago|context

Yeah I think the technical term is something more like “pseudo-labeling”. The OG distillation requires logits which Anthropic doesn’t provide.
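To make the difference concrete, here is a minimal pure-Python sketch of the two training signals. It assumes toy three-token logits that are made up for illustration, and the function names are hypothetical. The first loss needs the teacher's full logits (the classic soft-label setup); the second only needs the token the teacher actually emitted, which is all an API response gives you.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions; requires teacher logits."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def pseudo_label_loss(teacher_token_id, student_logits):
    """Cross-entropy against the single token the teacher emitted."""
    q = softmax(student_logits)
    return -math.log(q[teacher_token_id])


# Toy numbers, purely illustrative.
teacher_logits = [2.0, 1.0, 0.1]
student_logits = [1.5, 1.2, 0.3]
print(soft_label_loss(teacher_logits, student_logits))
print(pseudo_label_loss(0, student_logits))
```

The soft-label loss carries information about every candidate token per position, while the pseudo-label loss only sees the sampled one, which is part of why the black-box route needs far more samples.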
47. mannanj|3d ago|context

>But if you show them a jailbreak of their model that bypasses their safety, they'll tell you that any model can eventually be jailbroken so don't worry about safety.
Yes this is in line with what Anthropic said in their public statements about their Fable access restriction by the government directive. The hypocrisy and inconsistency in their statements and behavior feels quite childish and controlling. I believe our companies and their leaders, friends among our other influential leaders and leaders from rich social classes, want to actively hurt most people as this behavior looks to be quite self-interested.
48. topato|3d ago|context

Not to mention, the person who brought this quote unquote jailbreak to the Trump Administration was Amazon’s new CEO. They know their IPOs are coming up, so locking their competitors out of the U.S. (even if just for the weeks surrounding the IPO date) would be a major boon. The White House seems to love making announcements just for the sake of making the market move... Coincidentally, right after POTUS buys a massive amount of the benefiting company’s stock (Buy Dell Computers, lol)
49. gmerc|3d ago|context

https://research.nvidia.com/labs/lpr/slm-agents/ - Distillation data is a natural byproduct of using these models. There's no effective defence against it. Anthropic is degrading thinking blocks to summaries to slow it down and hide model internals, but in the end, the math says you're SOL and it works on MNC/Large Corporate scale well enough that the moment cost becomes a priority, you're left without the lock in you need to keep customers paying.
50. alfiedotwtf|3d ago|context

Byproduct? It’s essentially the only part of an LLM that is useful, because it’s the WHOLE product!
It’s the same reason why DRM for audio and video is a non sequitur - if you want a person to see or hear audio or video, eventually at the end of the chain, it’s going to be converted to audio for the ear and light for the eyes - that’s where you attach your tap.
Without a model generating tokens, what’s the point? So if Anthropic somehow disables quality token generation, what’s the point!
51. TeMPOraL|3d ago|context

That's why the harness is moving server-side: because generating tokens is not the actual point of the model, not for the users. Especially with tool calling giving us agents that can act, most of the tokens generated are not, themselves, critical to the end users. Specifically, a lot of tokens go into orchestrating actual tool calls, and then most "thinking tokens" are relevant to users only in so far as they help users keep track of and verify what the LLM is doing. So all those tokens can be hidden or replaced by partial summaries, and all of that can happen server-side, and then there's very little to distill from.
52. TSiege|3d ago|context

I haven't heard of this happening, do you have links to any explainers on this?
53. TeMPOraL|3d ago|context

Claude on the Web (which includes also at least the Android and Desktop apps) and ChatGPT web app are two examples - they keep gaining agentic capabilities.
Perhaps most striking example for me - I've been using a lot of Claude Code in the past month, most of it was through the web, Desktop (app) or phone interface, running actual harness "remotely" (somewhere on Anthropic-controlled infra).
One way of looking at it: web surfaces are slowly catching up with (fraction of the power of) agentic coding tools. But another way is, the major players are building up SaaS harnesses that start to compete with (their own) local ones. The reason may be ease of use, but the practical side effect is making it much harder to use their models to train competition, as these SaaS harnesses create an abstraction layer on top of LLMs that resides entirely in the vendor's cloud and therefore cannot be worked around.
54. fnord77|3d ago|context

Can you reach into the model and "transplant" weights directly?
55. antonvs|3d ago|context

You can do things like that - one example is averaging weights between related models - but not with Anthropic's models, because outsiders don't have access to the weights.
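A minimal sketch of what weight averaging looks like, assuming two checkpoints with identical architectures represented as plain dicts of parameter name to a flat list of floats. Real checkpoints would be tensors, and the names here are hypothetical.

```python
def average_weights(model_a, model_b, alpha=0.5):
    """Linearly interpolate two checkpoints that share the same architecture.

    alpha=0.5 is a plain average; alpha=1.0 returns model_a's weights.
    """
    if model_a.keys() != model_b.keys():
        raise ValueError("models must share the same parameter names")
    merged = {}
    for name in model_a:
        a, b = model_a[name], model_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]
    return merged


# Toy "checkpoints", purely illustrative.
base = {"layer0.weight": [1.0, 3.0], "layer0.bias": [0.0]}
tuned = {"layer0.weight": [3.0, 5.0], "layer0.bias": [2.0]}
print(average_weights(base, tuned))
```

This tends to work only for related models (for example, fine-tunes of the same base), since unrelated models have no reason to store the same features in the same positions.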
56. fulafel|3d ago|context

Weights are just data on a server, so we don't know whether outsiders have access (either via break-in or arrangement).
57. antonvs|3d ago|context

Yes, obviously. That's not the point.
58. parineum|3d ago|context

If you have access to the weights, you can just use them as is...
59. HarHarVeryFunny|3d ago|context

Anthropic are not saying they have been hacked - they are saying that Alibaba have been sending a lot of requests to their servers.
60. jorisw|3d ago|context

No, you'd need to have the model on your filesystem for direct access, and then the architecture would need to be the same.
61. X-Ryl669|3d ago|context

I'm not 100% sure it's not possible. If (I don't know) it's possible to freeze the temperature of the model so it's deterministic, and if you could make a map of produced words back to tokens (via HMM probably), then you can probably alter a minimal input and observe the output to model it. If you perform waves of such minimal alterations, you can expect to be able to locate the depth at which each alteration impacts the model (the idea being that a small alteration of the output is likely due to the last layers of the model, and a large alteration is likely due to the deeper layers). Once you've located most of the last layer(s?) weights, you can try to solve for them. With a model of hundreds of billions of weights, the last layers will likely be so huge that it's probably technically infeasible, but it's theoretically possible.
62. cheesecakegood|2d ago|context

Yes you can! Well, mostly, depends on how pedantic you are with definitions: you can transplant layers but not weights, which in common parlance are conceptually similar. But usually it isn’t a good idea for a few reasons.
There’s a really fascinating example[1] where a guy identifies a particular set of layers and transplants them. Overgeneralizing, early layers are encoders and the later layers are decoders, and in the middle some blocks seem to do specific things or tasks related to “reasoning”. So you can actually create a FrankenLLM and it sometimes works.
This needs architectures to be roughly similar, however, and internal representations to be consistent-ish, so for “stealing” it’s not really a thing (other practical concerns aside).
[1] https://dnhkng.github.io/posts/rys/
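To make the transplant idea concrete, here is a toy sketch, not the method from the linked post: a "model" is just a list of layer functions, and `transplant_layers` (a made-up helper) swaps a block of the donor's layers into the recipient. It assumes equal depth and compatible layer inputs/outputs, which is exactly the condition that rarely holds between unrelated models.

```python
def transplant_layers(recipient, donor, start, end):
    """Return a new layer stack with donor[start:end] replacing the same slots."""
    if len(recipient) != len(donor):
        raise ValueError("toy example assumes both models have the same depth")
    if not 0 <= start <= end <= len(recipient):
        raise ValueError("invalid layer range")
    return recipient[:start] + donor[start:end] + recipient[end:]


def run(layers, x):
    """Feed x through every layer in order."""
    for layer in layers:
        x = layer(x)
    return x


def add_one(x):
    return x + 1


def double(x):
    return x * 2


recipient = [add_one, add_one, add_one]
donor = [double, double, double]

# Replace only the middle layer with the donor's middle layer.
franken = transplant_layers(recipient, donor, 1, 2)
print(run(franken, 1))  # add_one -> double -> add_one
```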
63. anon373839|3d ago|context

> These complaints of distillation are inflating the problem to make it sound worse than it is
Unfortunately, the Reuters piece itself is complicit in this dramatization. The lede paragraph parrots Anthropic's talking point that distillation is an "attack", without using quotes that would alert the reader that this framing is a corporate talking point. Distillation is NOT an attack.
64. p4coder|3d ago|context

Agreed! I had to do a double take and check the URL. I thought I was reading a press release rather than actual reporting.
65. da_grift_shift|3d ago|context

Same thing nowadays :^)
66. verisimi|3d ago|context

It always was.
67. soperj|3d ago|context

That's exactly what they pay the publicist for.
68. friendzis|3d ago|context

https://news.ycombinator.com/item?id=13155538
69. verdverm|3d ago|context

ironically, I think this is why the jobs apocalypse is overblown: AI is only good at a thing if the people using it are also good at that thing, and people are crediting AI with superhuman ability at things they do not know themselves
70. inigyou|3d ago|context

AI doesn't have to be able to do your job to convince your boss that it does
71. verdverm|3d ago|context

Ford just rehired 350, new signals are coming through for the industry as reality sets in. Seeming more and more like we are post peak hype
72. inigyou|3d ago|context

Bet they work harder and get paid less than the 350 laid off
73. verdverm|3d ago|context

they were "rehired", so presumably the same people
74. blazespin|3d ago|context

The hilarious thing here is that Anthropic is basically admitting that most of their capabilities can be easily copied.
75. verdverm|2d ago|context

I listened to a podcast on AI and security today where they said they got access to a hacker's workdir (after they were caught) and they had hacked 14 companies using GPT-5.2 and Claude <4.5 (forget minor). GLM-5.2 came up because, while not as good as Mythos, it's almost as good, i.e. you have to prompt more / cannot just give a fuzzy request and sit back. The harness likely matters more than the model
https://www.youtube.com/watch?v=ldEJq_JxYuM
76. geye1234|2d ago|context

Once you see it, you can't unsee it:
https://www.paulgraham.com/submarine.html
And if you have more time:
https://dn790002.ca.archive.org/0/items/kupdf.net_bernays-et...
77. dist-epoch|3d ago|context

Reuters is probably the most rigorous news agency in the world.
> it said was the largest known attack
> Anthropic said in the letter it was supportive of the U.S. government's efforts to combat the attacks
both times the word "attack" appears it's clearly stated that the word was used by the company, it's a direct company quote.
actually putting it into quotes would be editorializing
> Unfortunately, the Reuters piece itself is complicit in this dramatization
how would you feel if somebody quoting you would turn your word dramatization into "dramatization" because they don't agree with your assessment
78. psychoslave|3d ago|context

Well, let’s say you put the picture of some political figure, and put in highly contrasted red, bold large catchy font, "TERRORIST THAT KILLED MILLION PEOPLE", then below that in barely visible contrast, in tiny discrete letters, "is what this person probably will claim to be against".
This whole sentence will technically be correct, 100% guaranteed, whatever this person actually said or thinks.
From a propaganda point of view, framing the elements of language is even more important than what the statements actually state to be true or possibly true.
79. dist-epoch|3d ago|context

nice slippery slope you manufactured there - what if Reuters becomes Daily Mail
what framing are you talking about? they are literally quoting a company.
please explain what Reuters should have done here. Should they have added in parentheses: (editor note: we don't agree with Anthropic calling this an "attack")
Is that what you want? News outlets giving their opinion and moral judgement on company quotes? I mean, Fox News/CNN do have a large following, so there is clearly a market for that.
80. dghlsakjg|3d ago|context

If you’re going to call out their use of slippery slope as a fallacy then it should be pointed out that your original argument was framed on an appeal to authority of Reuters as a leading news agency.
Both are logically unsound.
81. anon373839|3d ago|context

> please explain what Reuters should have done here
This is very straightforward: use direct quotes or use neutral language. The article describes the alleged incident as both an “attack” and a “strike” in the first two paragraphs. And neither is within verbatim quoted text.
Reuters, however highly you may regard them, simply adopted Anthropic’s framing uncritically in this instance.
82. dist-epoch|3d ago|context

You are confusing stylistic choice with framing.
A lot of times Reuters paraphrases instead of "quoting quotes".
> "uncritically"
You are mistaking Reuters with CNN or FoxNews. If you want "critical" reporting you should read some bloggers instead of news agencies.
83. anon373839|2d ago|context

It’s not the paraphrase that’s the problem. It is the adoption of the viewpoint by the writer. Reuters has parroted Anthropic’s judgment of what distillation is in Reuters’s own voice, which imparts legitimacy to the PR spin Anthropic is trying to broadcast. A journalist’s job is not to be the mouthpiece for PR flacks.
If you don’t think a reader who is unfamiliar with the concepts took away a sense that “distillation” is inherently a Bad Thing, I think you’re reading this obtusely.
84. Laurel1234|3d ago|context

> how would you feel if somebody quoting you would turn your word dramatization into "dramatization" because they don't agree with your assessment
This is exactly what a news agency should be doing though. When the dude showed up to Comet Pizza to look for Hillary Clinton or whatever, do you figure they should've printed "Local hero saves children from predatory cabal"?
85. qup|3d ago|context

I want them to report the facts, not their opinions.
Reporting that corporate called it attacks is good. I do prefer direct quotes.
However, when they quote one word, the journalists are inserting their own opinion about it. I want to make my own opinions based on the facts. I don't need the reporter to draw the conclusions for me.
86. HarHarVeryFunny|2d ago|context

The problem here is that Reuters are in effect just acting as a propaganda arm of Anthropic.
Anthropic craft a piece to manipulate the US government to shut down Chinese AI competition, and CC Reuters. Reuters then just publish chunks of this verbatim ("Anthropic said X & Y") using Anthropic's emotionally wrought and manipulative language.
What's missing here is even the most basic analysis of what Anthropic are saying - what are they referring to as a "fraudulent" account, what do they mean by "distillation" (probably not what you imagine), can you actually create a similar capability model by distillation (or is Mythos-cloning just BS), etc, etc.
87. dist-epoch|1d ago|context

> Reuters then just publish chunks of this verbatim
You should read what a news agency is, and what it does.
https://en.wikipedia.org/wiki/News_agency
88. echelon|3d ago|context

Anthropic raped everyone without asking and stole their labor to build their career-commoditizing tech.
Distillation is Robin Hooding it back so that one trillion dollar company doesn't reap all the benefits of their automation of the workforce.
Distillation is Prometheus bringing fire from the gods to give to ordinary humans. Something we all own anyway, but that was kept from us.
Distillation is freedom.
Everyone should be pro-distillation. We should all work together to distill every proprietary model.
Anthropic stole. OpenAI stole. Google stole. ElevenLabs stole. Suno stole.
We should be able to get it all back.
89. SillyUsername|3d ago|context

And a number of Qwen variants are available to self host. Do Anthropic have any like that?
90. echelon|3d ago|context

I'm more excited by open weights models you can't self host and need to spin up on H200s (RunPod or bare metal). This is where the real power lies and is where the open source world will trend.
It's far cheaper to spin up an H200 hourly or to simply consume a managed version of an open weights model than it is to use a proprietary hyperscaler API. And you own the model itself and can fine tune, tweak, lobotomize, etc.
The stuff you can run on your own RTX cards is neat, but it's rather hobbyist. The real power is in the cloud. Renting cloud hardware is fine, because the core problem is ownership of the weights, not the server rack or ISP fiber lines. Those are already commodity.
Big businesses will eventually run open weights models in the cloud, and it'll be a rather large part of the future AI economy.
91. mrngld|3d ago|context

Eaaaaasy now, the Chinese labs aren't freedom fighters on behalf of the common man. They're not non-profits, they're not neutral transnational organizations only dedicated to open source efforts.
They're Chinese companies offering open source models now as loss leaders to keep themselves in the game because they know virtually nobody, especially in the corporate world, would contract with them and give them access to their data. They might as well just send a Dropbox link of all their sensitive data directly to their Chinese competitors, same end effect.
They're also doing it as the digital equivalent of what they've done in other industrial sectors for decades. Undercut and flood the market and once you've killed or severely hindered your competition, then you have the market cornered. The moment they can afford to, these open source releases will stop.
Then the world will be stuck, just the way the world is largely stuck on rare earths. Instead of being able to regulate the leading companies from DC and Brussels, they'll be stuck watching Beijing call the shots.
That world would likely always have guys like Mistral and Trinity, but it's an open question if they'll ever catch up to the frontier.
And then Beijing will enjoy access to the data (ask any multinational operating in China for more than 2 seconds how useful contracts and China's legal system are for protecting IP), and these companies will roll in the money, and the Chinese supply chain will grow up behind the labs.
So, let's not pretend they've got the moral high ground. No. That boot just isn't on your neck yet. They're playing the long game -- and they're good at it.
92. idiotsecant|3d ago|context

It doesn't matter why Chinese firms are stealing models and open sourcing them. The fact that they are doing it is a very, very good thing for basically everyone other than the people who paid to build the original models, but I've got no sympathy for them considering they stole all the content to train them in the first place. This is some kind of beautiful irony.
93. reaperducer|3d ago|context

The "why" always matters in everything in life.
94. rescbr|3d ago|context

The whole AI industry was built upon stealing IP.
The extreme of this is to make IP laws irrelevant and that everything should be in the public domain.
Which maybe is not a bad outcome for humanity as a collective after all.
95. Saline9515|3d ago|context

The main problem is how they accessed the IP, but then using it to train a model is fair use. But yeah, IP theft doesn't exist because nothing is stolen really: Hollywood studios still have their movies.
96. sdellis|3d ago|context

Um, yeah. They stole the IP and then they stored the pirated IP. It was literally stolen and stored on their servers. That proves that IP theft exists. It's not complicated.
97. Saline9515|2d ago|context

IP theft doesn't exist, if I copy a Disney movie, Disney still has their movies. If I steal someone's bike, they can't use their bike. You deprive Disney of nothing by "stealing" their movies since it is just a copy.
98. civet_java|3d ago|context

Can you please tell me, as someone who is neither Chinese nor American, "why" I should care if a Chinese company stole from another American company (that in turn stole from everyone) to give me a cheaper service that fits my use case?
99. w0m|3d ago|context

> to give me a cheaper service that fits my use case?
Because they aren't giving you a cheaper service that fits your use case.
Best Case scenario, it's a trillion-dollar behemoth stealing from a billion-dollar behemoth so they can add their own explicit restrictions/weights on top to influence the masses.
There is no 'robin hood' here, any perceived value you get is clearly and explicitly tainted. "I don't care if it doesn't show me non-party-line results - It makes me a cheap UI !". Ethics/morals be damned.
100. amanaplanacanal|3d ago|context

> There is no 'robin hood' here, any perceived value you get is clearly and explicitly tainted. "I don't care if it doesn't show me non-party-line results - It makes me a cheap UI !". Ethics/morals be damned.
I can't tell if you are talking about Anthropic or Alibaba here.
101. w0m|3d ago|context

and honestly that's my entire point. There is no Good Guy here.
102. civet_java|3d ago|context

In a world which already has the likes of Anthropic and OpenAI, having Chinese labs be a counter balance is decidedly better than the hypothetical where American companies had a global monopoly on LLMs.
If your argument is that all present LLM offerings are unethical then that is something I am sympathetic to. That said, I am also unable to offer a conceivable roadmap to undoing the opening of the LLM Pandora's box so I tend not to ground my arguments in anti-LLM advocacy; that would be very 2023 of me.
103. butlike|3d ago|context

I don't think that's true. Sometimes the 'why' is lost in time as no one's around to tell it, so we end up with an "if a tree falls in the woods and no one's around to hear it, does it make a sound?" scenario. It doesn't really matter. The thing now exists without a 'why.'
104. philipallstar|3d ago|context

> it is a very, very good thing for basically everyone other than the people who paid to build the original models
It's not a good thing if you think there's more discovery and progress to be made, rather than cannibalising a fully mature field with cheaper alternatives. Drowning R&D early is not good for everyone.
105. how_gauche|3d ago|context

Is leveraging an enormous capital advantage to strip-mine the Internet and sell it back to us cannibalism or not? Confused on this point. I think they are exploiting a loophole in copyright law (and kind of redefining the meaning of "derivative work" in my opinion, but hey I'm not a lawyer) that collectively we tolerate because the end result is so useful
106. philipallstar|3d ago|context

I think that's a slightly different topic, but: a) strip-mining the internet is definitely the most misleading way to think about it. Strip mining means aggressively removing something to the area's detriment, and nothing has been removed. If all AI is turned off today the internet has not lost all of its natural resources, and silly phrases like that fuel inappropriate emotions and consequent conclusions and b) the internet is not being sold back to us - that is also a highly misleading phrase, if not an outright lie. The internet is still there and we can use it. No one is selling back to us what we already had. AI is not the internet cordoned off and resold.
107. svachalek|3d ago|context

What does further progress get us? Mass unemployment? Extinction? Pick your dark future science fiction?
The happy ending where we're all living in a garden of eden cared for by benevolent AI is hardly worth considering when you look at the cast of characters who are in charge of the world right now.
108. philipallstar|2d ago|context

I don't think those are the only two options. The most likely option is: AI does what self-driving cars have done, which is to say, get so far and no further.
109. pseudony|3d ago|context

I don’t think many outside the US are actively hoping to be governed by Sam, Dario and Elon.
110. philipallstar|2d ago|context

I don't see how this is relevant to AI. You don't need to be governed by them.
111. HarHarVeryFunny|3d ago|context

The Chinese companies don't have to be open weights, and it's not all about competing with the west. For example, most of Zhipu's (GLM) business in China is supporting private on-prem instances rather than selling API access. They make money by selling support services - much like Red Hat's business model.
112. binary0010|3d ago|context

I think most of us know why they're doing it. We are just very pleased with it regardless.
1. I get great products for nearly free
2. Anthropic/openai/etc will hopefully be destroyed since they stole everyone's work and are trying to capitalize on pure theft.
Win-win. The why of it is not really that relevant.
113. w0m|3d ago|context

>We are just very pleased with it regardless
You don't trust the multi-billion dollar behemoth, but you trust the militarized multi-trillion dollar behemoth to play 'robin hood'?
i can't get my brain around the mental loops here.
114. binary0010|3d ago|context

I don't get it? I use the open weights deepseek on opencode Go hosted in the us/etc.
What are the mental loops here?
I would genuinely like to know if I'm missing something.
115. svachalek|3d ago|context

If you don't think Anthropic and OpenAI are multi-trillion dollar militarized behemoths you need to catch up on some news.
Both are planning $trillion+ IPOs this year. OpenAI is collaborating with the Department of War, and Anthropic is under intense pressure to do the same and their top model is being held hostage right now. This week, the Department of War wrote a statement that xAI should not be held accountable for environmental laws because Grok is a vital weapon system of the US and was used to fire over 2000 missiles at Iran. The pentagon's statement mentions there are 3-4 such models so you may be able to guess which they are.
116. Daishiman|3d ago|context

> You don't trust the multi-billion dollar behemoth, but you trust the militarized multi-trillion dollar behemoth to play 'robin hood'?
Nobody's trusting anyone, we're just enjoying the benefits of true competition much like the working middle class gained benefits between the ideological competition of the Cold War.
117. zobzu|3d ago|context

you dont get it - usa is the goliath in all scenarios online. these are us based companies. most of the world would like to see them and the us fail.
118. crispyambulance|3d ago|context

The standard of neutrality that people here pretend to require from news organizations is not even remotely realistic.
It was a timely story from Reuters. They do fast news feeds, like APnews. Could it have been better or more accurate? Sure, they could have gone into why distillation may or may not be seen as "an attack". But then it would have been a more involved story, defeating the purpose of a news feed.
The Reuters piece was "good enough". Some other place like the NYTimes or WSJ can follow up with more detailed investigative coverage if it's a worthwhile story.
119. crmd|3d ago|context

I don’t want or need fast and “good enough” news and i’m gonna try and make a case that you don’t either.
Until very recently, all of modern civilization was built by people who got their news at most once a day. Reputable bureaus like Reuters took that day to get it right.
I’m not the national security advisor, so I don’t need a push notification that there was an earthquake in Nepal, or a bullshit rush-job briefing on Chinese AI distillation tactics.
120. dahart|3d ago|context

The fast part isn’t for your benefit, primarily, and news media would love to go slower and have more time if they could, and still survive. The race to break news first - in order to be the one to tell their audience something “new”, something they hadn’t heard elsewhere - is real and it has been around for all of modern civilization, for hundreds if not thousands of years. A one day turnaround was a thing purely due to daily newspaper print runs being the fastest distribution, it wasn’t because it was long enough to get it right. The reason they had a day is because the competition couldn’t get something out faster than that. Then for a while there were twice daily print runs to be more competitive. Then the internet came along, and now the only way for a site to get attention and be talked about on Hacker News is to report it before any other sites do.
There are some news media that do go slower and take their time, but I think they’re struggling to stay alive. Reuters is still reputable, but they no longer necessarily take a day. The big question is how do we get humanity to prefer slow & correct over fast, and is it even possible? When you hear about an earthquake in Venezuela, how do we stop people from Googling it immediately, and get them to wait for the best, most correct story rather than reading whatever’s available now? In the case of natural disasters, I don’t think it’s possible anymore, no matter what case you make. I’m not sure it’s possible with stories like AI distillation either, even if you can absolutely cement the case for slow news. The fact that it’s async/internet now and that first still counts means we (you and I) are still going to give traffic and attention to sites that have the first information on a breaking topic, statistically, despite having a preference for correctness over speed. The one thing we can do is vote with our dollars by subscribing to whatever news media does a better job than others.