Reddit’s cofounder had a gut feeling about OpenAI data

According to Business Insider, Reddit cofounder Alexis Ohanian revealed that around 2015-2016, he strongly opposed giving the platform’s data to Sam Altman and OpenAI, saying he “felt in my bones” it was the wrong move. This internal debate happened shortly after Altman helped Reddit secure its $50 million Series B funding round in 2014, just as OpenAI was launching as a nonprofit. Ohanian clashed with cofounder Steve Huffman over Altman’s request to “aggressively scrape Reddit” data, and ultimately lost the argument. Fast forward to 2024, when Reddit and OpenAI announced a formal licensing deal allowing OpenAI to train its AI models on Reddit content. Ohanian acknowledged that Altman recognized the value of Reddit’s human-generated data “before anyone else,” while also noting that Altman “doesn’t seem like the most philanthropically minded guy” despite his nonprofit claims.

The data wars begin

Here’s the thing about this revelation: it shows how early the battle lines were being drawn over training data. We’re talking about 2015-2016, when most people barely understood what large language models were. Sam Altman was already thinking about scraping one of the internet’s largest repositories of human conversation. That’s pretty prescient, honestly. But Ohanian’s gut reaction? Also spot-on. He recognized that Reddit’s treasure wasn’t just the content itself, but that it was “all human” back then. In an age where we’re now drowning in AI-generated sludge, that authentic human voice is becoming the rarest commodity on the internet.

The dead internet theory comes to life

Ohanian’s comments about the “dead internet theory” hit particularly hard. He’s basically saying what many of us are feeling – that “most of the social media we consume now is fake.” It’s either bots, AI-generated, or AI-assisted content. And he’s not wrong. Scroll through any major platform and you can feel the emptiness creeping in. The most interesting part? He sees the solution in smaller, trusted spaces like group chats where people are “verifiably human.” That’s why he’s investing in businesses that bolster human connection, from sports to resurrecting platforms like Digg. In an age of AI everywhere, he believes “the fundamentally human stuff is going to do even better.”

Reddit’s balancing act

Meanwhile, Reddit is trying to have it both ways. They’re taking OpenAI’s money for licensing deals while publicly committing to keeping Reddit “a trusted place for human conversation.” Current CEO Steve Huffman wrote about this “next chapter” earlier this year, emphasizing their focus on human content. The company says they’re exploring ways to confirm accounts are human without compromising privacy. But let’s be real – how do you effectively police this at scale? When AI can generate convincingly human-sounding posts, and when your business model depends on licensing content to AI companies, there’s an inherent conflict. The very data that makes Reddit valuable – authentic human discussion – could be undermined by the partnerships they’re forming.

The value shift in content

What’s fascinating here is how quickly the value proposition has flipped. For years, tech companies treated user-generated content as free raw material to be mined. Now, as Ohanian points out, that authentic human content is becoming the premium product. The irony is thick enough to cut with a knife: the same AI companies that need human data to train their models are creating the synthetic content that makes human data more valuable. So where does this leave us? Probably heading toward a two-tier internet: one layer of AI-generated noise that’s mostly free, and another layer of verified human content that people might actually pay for. Ohanian lost the battle back around 2015-2016, but his vision of prioritizing human connection might just win the war.
