This Startup Has Built an Algorithm to Pay Creators for Their Work Used to Train A.I.
OpenAI, the creator of ChatGPT, has come under fire from publishers and artists who alleged the company scraped their work from the internet to train GPT, its large language model, without their consent. These concerns have sparked lawsuits against the A.I. giant on accusations of copyright infringement, highlighting a major ethical dilemma that comes with pushing A.I.’s capabilities forward. Some startups are exploring a solution that focuses on sharing revenue with content creators. In August, Perplexity AI, an A.I.-powered search engine, introduced a program to pay publishers a portion of ad revenue generated by search queries if their content informs its outputs. ProRata.ai, a startup founded by a pioneer of the early internet monetization model, is developing a similar algorithm to compensate publishers, authors and other creators whose work is used to train generative A.I.
ProRata claims it has created an algorithm that can review an A.I.-generated output, identify the source of information based on novel facts and textual styles, and calculate how much each source contributed to the response. These percentages are then used to cut checks to these creators at the end of every month—a model that, in theory, could help protect the livelihoods of creatives and prevent future lawsuits around intellectual property.
“If you don’t share, then creativity is unsustainable. There’s no way for you to make a living,” ProRata’s co-founder and CEO Bill Gross told Observer regarding the careers of artists. Gross is credited as the inventor of the pay-per-click monetization model for internet search with a company he founded in the late 1990s that was later acquired by Yahoo, according to ProRata’s website.
The startup, which raised $25 million from venture capital firms Mayfield Fund, Prime Movers Lab, Revolution Ventures and IdeaLab Studio in a series A funding round in August, is set to showcase the algorithm through an A.I.-powered search engine expected to launch in October. The engine will monetize queries through advertisements and subscription payments, with subscriptions starting at $19 a month, according to Gross. While 50 percent of the revenue generated will go to ProRata, the other half will be split proportionately across creators.
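To make the arithmetic concrete, here is a minimal sketch of how a 50/50 split followed by a proportional payout could be computed. It is an illustration only, not ProRata's algorithm: the function name `split_revenue`, the creator names, the revenue figure and the attribution shares are all hypothetical, and the sketch assumes the attribution percentages have already been calculated.

```python
from decimal import Decimal, ROUND_DOWN

CENT = Decimal("0.01")


def split_revenue(revenue, attribution, creator_share="0.5"):
    """Split revenue between the platform and creators (hypothetical helper).

    revenue       -- total revenue for the period, e.g. "1000"
    attribution   -- mapping of creator name -> attribution weight
                     (assumed to be computed elsewhere; need not sum to 1)
    creator_share -- fraction of revenue set aside for creators; the
                     article describes a 50 percent share
    """
    revenue = Decimal(str(revenue))
    weights = {name: Decimal(str(w)) for name, w in attribution.items()}
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("attribution weights must sum to a positive number")

    # Pool reserved for creators; the remainder stays with the platform.
    pool = revenue * Decimal(str(creator_share))

    # Each creator receives a slice of the pool proportional to their
    # weight, rounded down to the cent so payouts never exceed the pool.
    return {
        name: (pool * w / total_weight).quantize(CENT, rounding=ROUND_DOWN)
        for name, w in weights.items()
    }


# Illustrative figures only: $1,000 in revenue and three made-up creators.
payouts = split_revenue(
    "1000", {"publisher_a": 0.5, "author_b": 0.3, "label_c": 0.2}
)
for name, amount in payouts.items():
    print(name, amount)
# publisher_a 250.00
# author_b 150.00
# label_c 100.00
```

Rounding down to the cent is one possible design choice; it can leave a few cents undistributed, which a real system would have to account for in some way.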
ProRata’s ultimate goal isn’t to create an alternative to Google Search, but to introduce a new business model that search engines could adopt to ensure creators get paid for their contributions to A.I. “We want to make that the industry standard,” Gross said. While A.I. search features from Google and Microsoft’s Bing don’t directly share ad revenue with publishers, they refer users to links from publishers as a way to drive traffic to their sites.
The answer engine will only be trained on data from creators who partner with ProRata. That means the model will draw from a limited pool of data, which could compromise the accuracy of its outputs. Still, ProRata isn’t focused on making its A.I. search engine a standalone product but rather on having the pay-per-use model adopted by major search engines.
So far, the company has inked deals with publishers like The Atlantic, Fortune, Financial Times, Time, and Axel Springer, the German company that owns Politico and Business Insider. Authors like Walter Isaacson, Adam Grant, and Ian Bremmer have also signed on, as have music industry veterans like Universal Music Group. ProRata hasn’t encountered any resistance or skepticism from its partners yet, according to Gross. “Most people just want us to be wildly successful so they’ll get a paycheck,” the CEO said. The real challenge, he notes, is convincing Big Tech companies that have been crawling web data for free to adopt ProRata’s business model.
“It’s amazing to me that some of the people think that crawling is not stealing,” Gross said. “Basically, Mustafa, the CEO of Microsoft A.I., came out and said, ‘Hey, if it’s available on the web, it’s free for us to use.’ And that’s just bullshit,” Gross added, referring to comments made by Google DeepMind co-founder Mustafa Suleyman during a CNBC interview in July when asked if training A.I. models on web content is akin to intellectual property theft. “Just because something is available and visible doesn’t mean it’s open source,” Gross said.
Paying creators may be a temporary “Band-Aid” solution
Financial compensation may not fully address the ethical concerns of having a creator’s work used for A.I. training without explicit permission, according to Star Kashman, a tech lawyer and partner at Cyber Law Firm with expertise in digital copyright law. She cites the example of actress Scarlett Johansson, who allegedly refused to give OpenAI permission to use her voice for ChatGPT despite financial offers.
“Many authors and creators have personal, moral objections to their work being utilized for A.I. training, regardless of compensation,” Kashman told Observer. “Without explicit permission, paying creators may be a temporary ‘Band-Aid’ solution, but it may not be an all-encompassing resolution to deeper concerns about consent and the impact on creative works.”
The “pay-per-use” model could also lead to a new crop of legal issues. Creators may disagree over whether the payment they receive “accurately reflects” what they contributed to the A.I. systems, especially if they can’t set their own rates, Kashman said. Moreover, A.I. tools may favor the work of bigger, more established creators over smaller ones, even when the smaller creators’ content is more relevant to a particular query, similar to how search engine optimization (SEO) works. Compensation may also not fully protect A.I. companies from being sued for intellectual property theft, which she said could be easier to prove in court with concrete attribution.
“There will continue to be many IP cases until the Copyright Act is amended to allow scraping on copyrighted content for the purposes of training LLMs,” Gabriel Vincent, another partner at Cyber Law Firm, told Observer, echoing Kashman’s comments.
ProRata plans to diversify its model to include more than just text. After the October launch, the startup will focus on collaborating with music companies, according to Gross. He also hopes to collaborate with video and movie brands as well as smaller, independent creators, and the company plans to license its attribution technology to A.I. companies that can implement it in their own models.
“A.I. is so amazing, but it needs to be fair to all parties,” Gross said.