Microsoft Thinks It’s Okay to Steal Content from the Web to Train Its AI

Mustafa Suleyman, Microsoft’s AI chief, argues that all content on the web is effectively open source, implying that the company can use it without paying or seeking permission.

According to Suleyman, once you upload content to the internet, it no longer belongs solely to you. It becomes available for anyone to use, including for training artificial intelligence. This view sums up Microsoft’s approach as Suleyman describes it: web content is free to copy, recreate, or reproduce.

In a CNBC interview, Suleyman was asked whether AI companies, including Microsoft, infringe on intellectual property rights. The co-founder of DeepMind and head of Microsoft’s AI division took a clear stance: any content that has been available on the open web since the 1990s is fair game.

“With respect to content that’s already on the open web, the social contract for that content since the 1990s has been that it’s fair use,” Suleyman said. “Anybody can copy it, recreate it, reproduce it. It’s been ‘free software,’ that’s been the understanding.”

Suleyman also recognizes a “gray area” around protected content. He points to websites that explicitly prohibit crawling or scraping of their content for any purpose other than indexing. These cases, he says, will have to be resolved in court.

During the interview, Suleyman acknowledged that some companies have accessed such protected data.

“So far, some have accessed this information, I don’t know anyone who hasn’t, but legal action will likely follow,” he commented.

Microsoft (and Other Companies) Would Not Have to Compensate Content Creators

Companies such as OpenAI, Google, Microsoft, and Midjourney are known to have trained their AI models on datasets that include copyrighted material. According to Mustafa Suleyman, these companies believe they do not need to seek permission or compensate content creators, because the “social contract” allows them to copy, recreate, or reproduce data.

Microsoft’s AI chief further suggests that intellectual property laws must adapt to the impending changes in the information economy.

“We are going to reduce the cost of knowledge production to zero. It is very difficult for people to assimilate, but in 15 or 20 years we will produce new scientific and cultural knowledge at almost zero marginal cost. It will be open source and available to everyone.”

While Suleyman and other executives advocate for the free use of data, content creators and copyright holders take the opposite view. Recent examples include the music industry’s lawsuit against two companies for using copyrighted songs, Getty’s legal action against Stability AI, the company behind Stable Diffusion, and George R. R. Martin’s lawsuit against OpenAI, the maker of ChatGPT, which alleges systematic, large-scale infringement of intellectual property.
