AI and media companies negotiate landmark deals over news content
The world’s biggest tech companies are in talks with leading media outlets to strike landmark deals over the use of news content to train artificial intelligence technology.
OpenAI, Google, Microsoft and Adobe have met news executives in recent months to discuss copyright issues around their AI products such as text chatbots and image generators, according to several people familiar with the talks.
These people said that publishers including News Corp, Axel Springer, The New York Times and The Guardian have each been in discussions with at least one of the tech companies.
Those involved in the discussions, which remain in the early stages, added that the deals could involve media organisations being paid a subscription-style fee for their content in order to develop the technology underpinning chatbots such as OpenAI’s ChatGPT and Google’s Bard.
The talks come as media groups express concern over the threat to the industry posed by the rise of AI, as well as fears over the use of their content by OpenAI and Google without deals in place. Some companies such as Stability AI and OpenAI are facing legal action from artists, photo agencies and coders, who allege breach of contract and copyright infringement.
Speaking in May at INMA, a media conference, News Corp chief executive Robert Thomson summed up the industry’s outrage, saying “[media’s] collective IP is under threat and for which we should argue vociferously for compensation”.
He added that AI was “designed so the reader will never visit a journalism website, thus fatally undermining that journalism”.
A deal would set the blueprint for news organisations in their dealings with generative AI companies worldwide.
“Copyright is a crucial issue for all publishers,” said the Financial Times, which is also in discussions over the matter. “As a subscriptions business, we need to protect the value of our journalism and our business model. Engaging in constructive dialogue with the relevant companies, as we are, is the best way to achieve that.”
Media industry executives want to avoid the mistakes of the early internet era, when many offered articles online for free, a move that ultimately undermined their business models. Big Tech groups such as Google and Facebook then accessed that information to help build multibillion-dollar online advertising businesses.
As the popularity of generative AI has grown, so have the news industry’s concerns, given the technology’s ability to produce convincing swaths of humanlike text.
Google recently announced a generative search function, which returns an AI-written information box above its traditional format of web links. It has launched in the US, and is gearing up for release worldwide.
Some discussions currently involve trying to find a pricing model for news content used as training data for AI models. One number that had been discussed by publishers is $5mn-$20mn a year, according to an industry executive.
Mathias Döpfner, chief executive of Politico owner Axel Springer, which has met leading AI companies Google, Microsoft and OpenAI, said his first choice would be to create a “quantitative” model similar to the one developed by the music industry, in which radio stations, nightclubs and streaming services pay record labels each time a track is played. That would first require AI companies to disclose their usage of media content, something they are currently not doing.
Döpfner, whose Berlin-based media company also owns the German tabloid Bild and the broadsheet Die Welt, said an annual agreement for unlimited use of a media company’s content would be a “second best option”, because that model would be harder for small regional or local news outlets to take advantage of.
“We need an industry-wide solution,” said Döpfner. “We have to work together on this.”
Google has been leading the negotiations with UK news outlets, meeting the Guardian and NewsUK. The Alphabet-owned company has long-running partnerships with many media organisations to use data from content such as articles to ensure that content is optimised to appear in its search engine. The company has used the data to train its large language models, according to two people familiar with the arrangement.
“Google has put a licensing deal on the table,” said an executive at a newspaper group. “They have accepted the principle that there needs to be payment . . . but we have not got to the point of talking zeros. They have acknowledged that there is a money conversation that we need to have over the next few months, which is the first step.”
After this article was first published, Google said that the newspaper executive’s comment regarding a potential licensing deal is “not accurate. It’s very early days and we’re continuing to work with the ecosystem, including news publishers, to get their input.”
Google would not comment on financial discussions. However, the search company said it was having “ongoing conversations” with news outlets, large and small, in the US, UK and Europe, and that it already trained its AI on “publicly available information”, which could include paywalled websites.
The Silicon Valley giant added that another option it was considering was to give publishers more “choice and control” over whether their content became part of a training data set for AI, similar to how it allows websites to opt out of their content being used in search.
Since launching ChatGPT in November, OpenAI chief Sam Altman has met News Corp and The New York Times, according to people familiar with the discussions. The company acknowledged it had held talks with publishers and publishing associations around the world on how they could work together.
Developing a financial model for the use of news content to train AI will be extremely difficult, according to publishing leaders. A senior executive at one major US publisher said the news industry was working retroactively because tech companies had launched these products without consulting it.
“There was no discussion, and so now we have to try to get paid after it happened,” the executive said. “The way they launched these products, the total secrecy, the fact that there is zero transparency, no communication before it happened, there’s reasons to be pretty pessimistic.”
Media analyst Claire Enders said talks were “very complicated at present”, adding that, as each organisation takes its own approach, a single commercial arrangement for media groups was unlikely and could be counterproductive.
Enders added: “Chatbots won’t be credible tools if they are literally trained primarily on the sewers of misogyny and racism that make up most of open, accessible text.”
The technology companies building AI are keen to focus on its utility in driving efficiencies within newsrooms and enhancing journalism, and are happy to pay millions to preserve longstanding relationships with the industry, people involved in the talks said.
Brad Smith, Microsoft’s vice-chair, said it was “in the early days of conversations with media and publishers, and part of that is just helping everybody learn about how models are trained”.
“I think our bigger opportunity is really to work with publishers first to think about how they can use AI to generate more revenue,” he added.
Adobe’s chief executive Shantanu Narayen said he had met Disney, Sky and the UK’s Daily Telegraph in the past few weeks to discuss how it might develop custom models for the companies to use its generative AI for images.
Adobe’s model is trained on pictures in its own library of stock images, as well as openly licensed and public domain content where the copyright has expired. Narayen said bespoke deals and pricing would depend on the company, but clients could add their proprietary content to the tool.
Axel Springer’s Döpfner expressed optimism that deals would be reached because both media organisations and policymakers have grasped the scale of the challenge more quickly than during the last big wave of technological disruption.
AI companies “know that regulation is coming, and they are fearful of it”, he said, adding: “It is in the interest of all parties to come up with a solution for a healthy ecosystem. If there is no incentive to create intellectual property, there is nothing to crawl. And artificial intelligence will become artificial stupidity.”