Meta buys into Scale AI in the quest for fresh data

Hot on the heels of blowing over $100 billion on the Metaverse, Facebook/Meta has started doing the same with “AI.”

Facebook just bought 49% of Scale AI for $14.3 billion, acqui-hiring founder-CEO Alexandr Wang, in a deal designed to keep antitrust regulators at bay and Scale’s original investors sweet. [Reuters; The Information, paywalled]

Scale will continue as a separate company — but Wang’s day job will be at Facebook.

Zuckerberg’s been poaching hot AI researchers for his new “superintelligence” unit, offering them tens of millions of dollars over several years. Zuckerberg has rearranged desks at Facebook’s office in Menlo Park so the new staff will sit near him. [Bloomberg, archive; Bloomberg, archive]

Wang will join this new unit, where they’re super-sure they will super-build an Artificial General Intelligence!

Wang isn’t even an AI researcher. He’s a business guy with some tech knowledge. The Financial Times says “His talents lie in promoting the company rather than managing its staff or furthering AI research.” Zuckerberg seems to think this sales guy is the new Facebook executive he wants. [FT, archive]

Scale creates datasets. It outsources this data creation to underpaid gig workers — who it then rips off.

The big problem for all the LLMs is they’ve been fed the whole internet and they’ve run out of data. But Scale’s army of underpaid workers generate new data. That’s the other thing Facebook really wants out of this deal. [Semafor]

We compared “AI” to “cocaine” the other day. AI is God’s way of saying Facebook is making too much money.

Disney sues AI image generator Midjourney

Midjourney runs a diffusion model that you can ask to generate pictures. Disney and Universal and several other movie studios have sued because Midjourney keeps spitting out their copyrighted characters. [Complaint, PDF; case docket]

This is the first suit by big Hollywood movie studios against a generative AI company. Disney first went for one of the mid-range gen-AI companies, not a giant one. I expect they think they can get a precedent here.

The suit is not about particular copyrighted works, but copyright in the *characters*. The legal precedent on character copyright is solid.

The filing contains a whole pile of images showing an official character image next to what Midjourney gave them when they prompted it for that character. Sometimes they didn’t even name a character: the prompt “Superhero fight scene” gave a picture of Spider-Man fighting another Spider-Man.

Midjourney also spat out images that were really obviously taken straight from its training data, like stills from Avengers: Infinity War.

Disney and Universal first contacted Midjourney about this a year ago.

Disney knows how generative AI works. The Disney content can’t be removed from Midjourney without retraining the model from scratch.

Midjourney is also odd in that it didn’t take money from outside investors and it’s actually profitable selling monthly subscriptions. This is an AI company that is not a venture capital money bonfire; it’s an actual business.

I suspect Disney isn’t out to just shut Midjourney down. Disney’s goal is to gouge Midjourney for a settlement and a license.

ChatGPT goes down — and fake jobs grind to a halt worldwide

ChatGPT suffered a worldwide outage from 06:36 UTC Tuesday morning. The servers weren’t totally down, but queries kept returning errors. OpenAI finally got it mostly fixed later in the day. [OpenAI, archive]

But you could hear the screams of the vibe coders, the marketers, and the LinkedIn posters around the world. The Drum even ran a piece about marketing teams grinding to a halt because their lying chatbot called in sick. [Drum]

There’s some market in the enterprise for AI slop generators: wrong text summaries, LLM search engines with data leaks and privacy violations, and rambling slop emails.

But the enterprise chatbot market is jobs and tasks that are substantially … fake. Where it doesn’t matter if it’s wrong. The market for business chatbots is “we pretend to work and they pretend to pay us.”

So what do you do in your chatbot-dependent job next time OpenAI goes down?

The AI bubble is not sustainable. It’s going to pop. And it’s not clear what it would cost to serve GPT without the billions in venture capital dumb money — if it had to pay its way as a for-profit service.

Read more →

Doctors against data abuse — BMA and RCGP protest NHS AI data handover

A pile of patient data that doctors sent to NHS England for Covid-19 research just happened to get poured into the NHS’s new LLM, Foresight. The British Medical Association and the Royal College of General Practitioners have referred NHS England to the Information Commissioner’s Office: [RCGP; Politico]

The methodology appears to be new, contentious, and potentially with wide repercussions. It appears unlikely that a proposal would have been supported without additional, extraordinary agreements to permit it. The self-declared scope of this project appears inconsistent with the legal basis under which these data were to be used.

Foresight wants to use anonymised data from everyone in England to produce exciting new insights into health. It runs in-house at University College London on a copy of Facebook’s Llama 2. [UCL]

The British public overwhelmingly supports good use of health data. But people also worry about their data being sent off to private companies like Palantir. [Understanding Patient Data; Pharmacy Business; Guardian, 2023]

And, of course, there’s not really such a thing as anonymised data. It’s notoriously easy to de-anonymise a data set. NHS England is mostly using “anonymised” as an excuse to get around data protection issues. [New Scientist, archive]
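
How easy? A linkage attack is often all it takes: match a handful of quasi-identifiers against any other dataset that still has names attached. Here’s a toy sketch of the idea in Python; the records, column names, and the link() helper are all invented for illustration, not anything from Foresight or the New Scientist piece.

```python
# Toy linkage attack on "anonymised" records. All names, diagnoses, and
# column names below are invented for illustration.

anonymised_health_records = [
    {"postcode_prefix": "SW1A", "birth_year": 1961, "sex": "F", "diagnosis": "type 2 diabetes"},
    {"postcode_prefix": "M1",   "birth_year": 1984, "sex": "M", "diagnosis": "asthma"},
]

# Any dataset that still has names attached will do: the electoral roll,
# a marketing database, a leaked customer list.
public_records = [
    {"name": "Jane Doe", "postcode_prefix": "SW1A", "birth_year": 1961, "sex": "F"},
    {"name": "John Roe", "postcode_prefix": "M1",   "birth_year": 1984, "sex": "M"},
]

QUASI_IDENTIFIERS = ("postcode_prefix", "birth_year", "sex")

def link(health_rows, public_rows):
    """Re-identify 'anonymised' rows by matching on quasi-identifiers."""
    for h in health_rows:
        key = tuple(h[k] for k in QUASI_IDENTIFIERS)
        matches = [p for p in public_rows
                   if tuple(p[k] for k in QUASI_IDENTIFIERS) == key]
        if len(matches) == 1:   # a unique match is a re-identification
            yield matches[0]["name"], h["diagnosis"]

for name, diagnosis in link(anonymised_health_records, public_records):
    print(f"{name} -> {diagnosis}")
```

Real-world matching is fuzzier than this, but date of birth, sex, and a partial postcode go a remarkably long way.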

NHS England says it’s “paused” data collection and launched an internal audit of the project. They insist that taking care of it themselves is fine and they don’t need some outsiders — like, say, the data protection authority for the UK — sticking their noses in. The investigation proceeds.

 

Apple Intelligence at WWDC 2025! Don’t expect much — it still doesn’t work

Apple Intelligence is most famous for mangling news headlines and text messages and making spam texts look legitimate. Bloomberg wrote in March: [Bloomberg, archive]

People just aren’t embracing Apple Intelligence. Internal company data for the features indicates that real world usage is extremely low.

The 2025 Worldwide Developers’ Conference starts shortly! The keynote will mention the company’s strategy for Apple Intelligence. Third-party developers get to tap into Apple’s on-device AI models! [Bloomberg, archive]

The keynote will not promise exciting new products, because there aren’t any, and last year’s still don’t work.

Apple is encountering “challenges with updating Siri” because they just can’t get generative AI to give reliable responses. This is why Robby Walker, the executive then in charge of Siri, got fired in March. [FT, archive]

Apple also really wants its AI models to run on the device — and a model running on an iPhone is just not going to be as good as the much larger, ridiculously expensive, and heavily subsidised models its competitors run in the cloud.

Google has forced Gemini onto everyone’s Android phone whether they want it or not. It’s reportedly terrible at accents it wasn’t trained on, like the wild variety of accents in the UK. [Reddit]

Amazon’s Alexa+ chatbot edition still hasn’t been seen in the wild, a month after Amazon CEO Andy Jassy told investors it had over 100,000 users, honest! No, you can’t talk to one.

Maybe you’ll get your chatbot Siri next year. Or the year after.

Update: Steve Farrugia live-skeets the keynote. Yeah, it’s tweaks and polish on existing features. “Anything AI was more ML.” [Bluesky]

Apple: ‘Reasoning’ AIs fail hard if they actually have to think

Apple Machine Learning Research has a new preprint: “The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity.” [Apple, PDF]

“Large Reasoning Models” don’t do logic. Worse yet, “frontier LRMs face a complete accuracy collapse beyond certain complexities.” If the problem’s too hard … they just give up!

Instead of the rigged benchmarks that are standard in the AI bubble industry, Apple used a “controllable puzzle environment.” You feed the AI a simple logic puzzle, such as Tower of Hanoi, and dial the complexity up and down.

On simple puzzles, the plain LLMs did better than the reasoning models. On middling puzzles, the reasoning models pulled ahead.

But at a certain level of complexity, both plain LLMs and reasoning models fail hard. Their accuracy drops to near zero. They just give up — they actually spend fewer tokens on answering than they did on the medium puzzles.

The researchers even tried telling the models the exact algorithm to solve the puzzle. The models still failed at the complex level.
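
The clever bit of the setup is that puzzle difficulty is a single knob you can turn. Here’s a minimal sketch of what that kind of harness could look like, using Tower of Hanoi with the number of disks as the knob; the prompt wording, the check_solution() interface, and the commented-out model() call are my own placeholders, not Apple’s actual code.

```python
# Rough sketch of a "controllable puzzle environment" in the spirit of
# Apple's paper, using Tower of Hanoi. Prompt text, check_solution(),
# and the model() call are placeholders invented for this example.

def hanoi_prompt(n_disks: int) -> str:
    """Build a puzzle whose difficulty is set by a single dial: n_disks."""
    return (f"Solve Tower of Hanoi with {n_disks} disks on pegs A, B, C. "
            "All disks start on peg A. List the moves one per line as "
            "'from peg -> to peg', e.g. 'A -> C'.")

def check_solution(n_disks: int, moves: list[tuple[str, str]]) -> bool:
    """Replay the model's moves, enforcing the rules and the end state."""
    pegs = {"A": list(range(n_disks, 0, -1)), "B": [], "C": []}
    for src, dst in moves:
        if not pegs[src]:
            return False                     # moving from an empty peg
        disk = pegs[src][-1]
        if pegs[dst] and pegs[dst][-1] < disk:
            return False                     # larger disk onto a smaller one
        pegs[dst].append(pegs[src].pop())
    return pegs["C"] == list(range(n_disks, 0, -1))

# Sweep the difficulty dial and record accuracy and token use at each level.
for n in range(3, 12):
    prompt = hanoi_prompt(n)
    # answer = model(prompt)                 # hypothetical model call
    # score = check_solution(n, parse_moves(answer))
```

An n-disk puzzle needs 2**n − 1 moves at minimum, so a harness like this can record exactly where accuracy collapses and how many tokens the model spent at each rung.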

The same team put out a preprint in October last year which warned that LLMs aren’t doing formal “reasoning” at all. This new paper builds on that one. [arXiv, 2024, PDF]

Subbarao Kambhampati put up a preprint in May that says the same. It’s literally called “Stop Anthropomorphizing Intermediate Tokens As Reasoning/Thinking Traces!” With an exclamation mark. [arXiv, PDF]

Neural networks only know what they were trained on. Fast-talking marketers handwaving about “reasoning” and a few rigged benchmarks don’t change that.

 

UK — High Court to lawyers: cut the ChatGPT or else

US lawyers have spent the past two years filing briefs full of citations to nonexistent cases — because a chatbot made them up. You’d think these intelligent and educated people would know better than to pull this. But it keeps happening.

The UK is no different. In two recent cases, a chatbot made up citations — one over a local council’s failure to provide housing (Ayinde v. Haringey), the other a claim for £89 million in financial damages (Al-Haroun v. Qatar National Bank).

In the housing case, a junior lawyer working for a legal charity submitted a bunch of nonexistent citations. [BAILII, PDF; Prospect]

In the finance case, an expensive and high-powered legal firm accepted a list of made-up cases from the client, including a fake citation in the name of the judge hearing the case!

The lower courts referred the made-up citation issues to the England and Wales High Court — which has supervisory power over lower courts and lawyers as officers of the court. The judgement is directed at “every individual currently providing legal services within this jurisdiction.”

The High Court heard the combined case in May — and it considered contempt proceedings against the lawyers over wasting everyone’s time with fake citations. [Law Society Gazette]

The judgement dropped yesterday. It’s just 29 pages, it’s very readable, and it’s a banger. [BAILII; Legal Futures]

Read more →

Builder․ai even more fake: creative accounting, no AI, no money

Builder.ai collapsed a couple of weeks ago. Its much-touted AI website builder seemed to be fully 100% AGI — A Guy Instead.

More details have come out in Builder’s US bankruptcy. Engineer.AI Corporation filed for Chapter 7 liquidation on Monday. The company has about $50–100 million in debts and only $5–10 million in assets. [Petition, PDF; case docket]

It’s now confirmed that Builder.ai didn’t have any “AI” at all. It was 700 human engineers in India, paid $8 to $15 an hour. [Times of India, archive]

The Natasha chatbot was Builder.ai’s secret sauce. They claimed Natasha would analyse your business requirements for you. Builder was hoping to sell Natasha to Microsoft as a business app development tool. There’s a Builder ad directly promoting Natasha as being “AI.” [YouTube]

Natasha was, of course, the same bunch of Guys Instead. “Natasha” was a running joke in the development office. The engineers called Builder.ai “a call centre with better marketing.” And Builder.ai spun out this fake AI scam for eight years. [TFN]

The investors turned out not to care so much about the tawdry details of the AI being completely fake. But they do care a lot about the money — and the extremely creative accounting.

Read more →

Generative AI runs on gambling addiction — just one more prompt, bro!

You’ll have noticed how previously normal people start acting like addicts to their favourite generative AI and shout at you like you’re trying to take their cocaine away.

Matthias Döpmann is a software developer. He’s been trying out AI autocomplete for coding. He was very impressed by how good the tools are for the first 80% of the code. But that last 20% is the hard part — where you have to stare into space and think for a bit, work out structure, and understand your problem properly: [Revontulet.dev]

For a good 12 hours, over the course of 1 1/2 days, I tried to prompt it such that it yields what we needed. Eventually, I noticed that my prompts converged more and more to be almost the code I wanted. After still not getting a working result, I ended up implementing it myself in less than 30 minutes.

So he used the chatbot as a vastly more wasteful rubber duck. He adds:

… This experience is shared among peers, where AI traps you into the thinking “I am only one prompt away”, whilst clearly it just does not know the answer.

That is: generative AI works the same way as gambling addiction. Spin the gacha! Just one more spin, bro, I can feel it!

Read more →

Washington Post goes AI to clean up amateur right-wing op-eds

The Washington Post put up a new mission statement in January that called AI a “key enabler” of the paper’s success. It aspired to make the Post “an A.I.-fueled platform for news.” [NYT, archive]

The first part of the AI-fueled platform plan is called Ripple. The Post wants to fill out the opinion section with “nonprofessional writers.” How will they quality-check these? They’ll run them past an AI! [NYT]

Why does the Post want to do this? Because they can’t find enough good writers to create the owner’s desired libertarian propaganda.

Read more →