Mumsnet, the UK-based parenting forum with a colossal archive of over six billion words, is heading to court against OpenAI. The platform, known for its highly engaged community of mostly female users, claims that OpenAI used its data without permission. After failed licensing negotiations, Mumsnet’s CEO, Justine Roberts, says legal action is now unavoidable.
The friction began when Mumsnet discovered AI companies were scraping its vast content library. In an interview with Wired, Roberts explained that Mumsnet attempted to strike licensing deals with major AI players, including OpenAI. Initially, talks with OpenAI seemed promising.
Licensing Talks Break Down
Mumsnet entered discussions with OpenAI in the hope of securing a licensing deal. According to Roberts, OpenAI expressed interest in datasets exceeding 1 billion words. With Mumsnet’s 6 billion words of predominantly female-driven content, Roberts believed the platform could offer a unique dataset. Mumsnet and OpenAI engaged in a series of discussions, exchanged NDAs, and shared detailed information.
However, after more than a month of talks, OpenAI reversed its stance. An email exchange reviewed by Wired revealed that OpenAI was no longer interested in the partnership, saying that Mumsnet’s dataset was too small and largely accessible to the public online. OpenAI also emphasized its focus on datasets that capture broader human experience.
In a statement to Wired, OpenAI spokesperson Kayla Wood confirmed that the company seeks partnerships for “large-scale datasets that reflect human society” and does not prioritize publicly available information. The company also allows publishers and content creators to control how their material interacts with AI models.
Disappointment at Mumsnet
Roberts expressed frustration at the outcome. She explained that OpenAI initially appeared enthusiastic about Mumsnet’s content because it consists of high-quality, largely female conversations, a rarity in AI training datasets. “It’s 90 percent female conversation, which is quite unusual,” Roberts said.
OpenAI has secured data-licensing deals with other media outlets, including Vox Media, Reddit, Time, and Condé Nast, among others. However, it remains unclear what dataset sizes OpenAI prioritizes for such agreements, as the specifics of these deals haven’t been publicly disclosed.
Mumsnet Pursues Legal Action
OpenAI’s rejection has now evolved into a legal matter. Mumsnet plans to pursue copyright infringement claims, as well as breach of its terms of use and database right infringement, which occurs when large portions of a database are extracted without consent. Mumsnet sent OpenAI a letter in July announcing its intention to take legal action.
OpenAI responded with a list of questions but did not deny that scraping had occurred. Mumsnet is now considering whether to file its lawsuit in the UK’s High Court or an intellectual property court. Although OpenAI acknowledged receipt of Mumsnet’s complaint, the AI company declined to comment on the legal claims.
Despite the pending legal battle, Mumsnet continues to explore licensing opportunities with other AI firms. Roberts confirmed discussions with Google and intermediary startups focused on facilitating data licensing deals.
Concerns About AI’s Impact on Publishers
Roberts voiced concerns about how large language models (LLMs) could harm smaller publishers like Mumsnet. “I’m quite worried about the ecosystem, where these big LLMs are allowed to march all over small publishers to build their models, and then people have less reason to visit the websites,” Roberts said. She believes a solution that compensates content creators for their work is essential.
Mumsnet users appear to support the company’s licensing efforts, particularly given the platform’s predominantly female audience. Roberts also highlighted the importance of gender representation in AI training. “There’s something to be said for it being trained on verified female voices,” she noted.
While there are no immediate plans to compensate Mumsnet’s users, Roberts indicated that it’s something the company would consider if data licensing proves lucrative.
Legal Precedent and Future
The legal landscape for AI companies is complex. In the US, AI firms have defended themselves against numerous copyright infringement lawsuits by invoking “fair use,” a legal doctrine that allows limited use of copyrighted material without permission. The UK’s equivalent, “fair dealing,” offers similar protections, though it is more restrictive.
Despite these complexities, Roberts is optimistic about Mumsnet’s chances in court. “We think we have a good chance,” she said. Even if the outcome remains uncertain, Mumsnet is committed to protecting its content and setting a precedent for other platforms facing similar challenges.