No, they won’t get permission first

Cover of "Everywhere Man" by Jim Nelson

In 2011, I wrote a novella about a Silicon Valley startup that trains its virtual reality software from tourist photos it scrapes off the Internet. Millions of these photos are stitched together to create a virtual cable car ride across San Francisco.

This story became Everywhere Man, which was also recorded as an audiobook that you listened to while riding the actual, real-life cable cars. It was one of several literary tours that Oakland-based Invisible City Audio Tours offered. Their idea was to see cities through the lens of literature, and not as a mere collection of landmarks and commercial sights.

During the development of the book, Invisible City’s publisher asked me: “Wouldn’t the startup need to get permission from the people who took the photos?”

Fourteen years later, we’ve received the answer to her question: “Yes, they should get copyright permission. No, they won’t do that, though.”

Of course, today the startups in question are not producing virtual reality tours. They are AI companies feeding massive amounts of copyrighted data into their large language models (LLMs), which in turn power their artificial intelligence behemoths—ChatGPT, Claude, Grok, and so forth.

And just like my fictional Silicon Valley startup, these AI companies are being challenged over their use of intellectual property. Shouldn’t these companies have to get copyright permission before using creative works to build their software?

The answer, predictably, is that they don’t believe they need to:

  • Nick Clegg, former Meta (Facebook) executive: Asking artists’ permission before AI companies scrape copyrighted content would “basically kill the AI industry in this country overnight.”
  • In a statement, OpenAI, creator of ChatGPT, asserts “the federal government can both secure Americans’ freedom to learn from AI and avoid forfeiting our AI lead to the [People’s Republic of China] by preserving American AI models’ ability to learn from copyrighted material.”
  • Meanwhile, “OpenAI and Google are pushing the US government to allow their AI models to train on copyrighted material. Both companies outlined their stances in proposals published this week, with OpenAI arguing that applying fair use protections to AI ‘is a matter of national security.’”

It’s not even a matter of asking permission at this point—these companies have already trained their LLMs with copyrighted material and made their AI available to the public. Their earliest defense was that training on copyrighted material was “transformational” and covered as Fair Use. Later they began to frame the argument as a matter of national security. (OpenAI is particularly prone to this claim.) At some point, they’ve all stated publicly, in so many words, that needing to obtain permission from content creators would destroy their business model. (In Nick Clegg’s case, it’s not even so many words. He came right out and stated it.)

Facebook went so far as to use a massive database of flagrantly pirated texts (called LibGen) to train its AI. An internal company document reveals that they kept the source of their texts secret because “if there is media coverage suggesting we have used a dataset we know to be pirated, such as LibGen, this may undermine our negotiating position with regulators on these issues.”

The Atlantic has helpfully produced a searchable database to see if an author’s work was included among LibGen’s trove of pirated texts. Sure enough, three of my books are in the set: Bridge Daughter, Hagar’s Mother, and—you guessed it—Everywhere Man, a book about a Silicon Valley company using stolen intellectual property to train their software.

There is a familiar smell about all of this. In a recent video on typeface piracy (a practice which goes back hundreds of years), designer Linus Boman observes “every time there is a massive technological shift, intellectual property rights suddenly, and very conveniently, become a blind spot. … Is it only considered piracy if the people who do it lack resources and respectability?” Apparently so.


Maybe it’s time to stop telling ourselves that AI will never produce a passable novel, song, or movie—that AI lacks the fiery human spirit to produce creative work of value. Maybe we should concede that AI is more than capable of producing better-than-mediocre works of art.

The open market tells us this is the case. Writers have been caught using AI to produce tens, even hundreds, of novels, many of them profitable bestsellers with attentive and loyal fan bases. That’s because AI is a master of imitating others’ work.

Couldn’t an AI be trained only with public domain texts published on or before 1929, the current cut-off point for copyright protection in the United States? Well, it could, but then all those romance novels it produced would read like Jane Eyre and Wuthering Heights. That’s not going to sell many copies.

AI has practical, world-bettering applications in the sciences, healthcare/medicine, mathematics, and beyond. I’m not arguing against AI as a general technology. But it seems all avenues of creating works with AI lead to less-than-optimal market conditions for AI companies and their users if copyright protections are upheld. Why, though, should their short-term profit margins suddenly erase basic copyright law, a legal concept that goes back to the time of Shakespeare?

Ask yourself: Are you better off reading AI-generated novels? Or listening to AI-produced music, or watching AI-generated movies? I see no evidence that AI-produced work is being sold at a lower price than human-generated content, or offering a better experience. What’s in it for me? Lower-quality mass-produced books sold at the same or higher price than before? How is this progress?

Alternate cover of "Everywhere Man" by Jim Nelson

Flight of the Big Blue Bird


I’ve been bird-watching. I’ve followed the events at Twitter this week with a morbid fascination: Elon Musk’s arrival at Twitter HQ bearing a sink; the outrage at a billionaire buying up a major cultural outlet (which overlooks all the other billionaires making similar purchases, and most of all, that Twitter itself helped make founder Jack Dorsey one—but for some reason, this time is different); the questionable sagacity of predicting Twitter is doomed after a mere seven days under new ownership (this from the same media that told us the Twitter sale itself was doomed from the outset); the layoff of half of Twitter’s workforce; and a notable, but not mass, migration to Mastodon, a Twitter lookalike with a more distributed modus operandi (and no billionaire owner).

I’ve been on Mastodon since 2018. I’ve never liked the Pepsi-or-Coke situation with Twitter and Facebook, so I dipped my toe in the Mastodon waters four years ago in the hope of finding a better situation. I didn’t. My Mastodon feed was tumbleweeds, mostly cat photos and random musings on how much better Mastodon is than Twitter. The way to build your Mastodon feed is to follow more people, but I could find no one there I knew or cared to follow—and if I did, they were on Twitter too, so might as well follow them there.

My dusty Mastodon feed greened in the past week. It has more interesting content now, and my own messages (“toots” in Mastodon parlance) are getting some engagement as well.

With the growth comes growing pains. I’m already having a knee-jerk hipster reaction to the increased traffic there, similar to that sinking feeling one gets when your favorite cafe tucked away in a quiet neighborhood gets Yelped.


Worse, I’m already starting to see the kind of toots my Twitter feed was flooded with a few years ago: smug, taunting, highly politicized messages supposedly proving how people not-like-the-message’s-author are idiots. This was one of the reasons I wanted to find an alternative to Twitter in 2018.

(How did I halt the flow of those messages on Twitter? I stopped following people who retweeted those kinds of messages. Harsh, yes, but if you’re repeatedly propagating material I don’t want to read, I reserve the right to stop following you. The Twitter algorithm picked up on my change of reading habits, and pretty soon that kind of content disappeared from my feed.)

The real question is whether this mild shift in traffic snowballs into the Mass Twitter Migration of 2022 that leads to its collapse.

I’m not holding my breath. Twitter has tremendous inertial energy behind it, no matter its ownership. The blue-checked accounts and users with six-figure follower counts have a ton of investment in the system. Ten percent of Twitter users produce ninety percent of its content. Power users are a big draw, and I don’t see any of them packing their bags quite yet.

Mostly, I think those capable and willing to leave Twitter won’t. They’ll simply maintain one more social media account. Most people already juggle Twitter, Facebook, and Gmail. Adding one more to the mix might be annoying, but it’s hardly some massive additional investment of time. And if people are active on both Twitter and Mastodon, then—surprise!—Twitter lives.

My Mastodon account is here. More information on joining Mastodon is here.

Author Q&A with Bookies: Monday, October 2nd

I’ll be leading a Q&A on Bookies’ Facebook page this Monday, October 2nd, from 5pm to 7pm (Pacific time). The event is part of Bookies’ #Authorberfest, their yearly October event giving independent authors a chance to meet readers and answer their questions.

This year I will be discussing Bridge Daughter and its upcoming sequel, Hagar’s Mother. I’m also working on a giveaway as part of the Q&A, so come and check it out!

More information about the event is on my Facebook page. Learn more about #Authorberfest at Bookies. If you like and share their announcement, you’ll be entered in a drawing to receive a $10 Amazon gift card.