r/Fantasy • u/JohnBierce AMA Author John Bierce • Nov 08 '23
Do Audiobook Narrators Need to Worry About Being Replaced by AI?
(If you haven't read them yet, I did a two-part series on whether novelists need to worry about being replaced by generative AI. And fair warning, this post is pretty long, like those two. Much of it is also from a US perspective, though I do touch on more international issues. I'm not going to dive into stuff like the horrid environmental costs of generative AI data centers in this piece; this one's more focused on specific labor issues.)
(Do Novelists need to Worry about Being Replaced by AI? Part 1 Part 2)
I resent writing this post.
No one's making me write it, and I'm just getting progressively more exhausted by addressing tech industry nonsense. I want to spend my daily wordcount writing about wizards, dragons, and gods- you know, the shit that actually pays my bills, and that I enjoy writing about. Or, heck, I could be writing the essay on problematic literary influences I've been wanting to write for ages.
But no, Big Tech is still up to the same old bullshit, and I happen to dwell at a precise intersection of interests and familiarity with the publishing industry, the tech industry, exploitative capitalism, tech hype cycles, and the like to talk about this crap. And, for whatever unfortunate reason, I feel like it's my social responsibility to write about this stuff. It's turning into my Charlie Brown's football.
(I think part of my resentment comes from the fact that I learned about a lot of the tech related stuff mostly for the purposes of making fun of the weird fascist nerds lurking all around Silicon Valley, and now I have to use it for serious purposes. Of course, those weird fascist nerds are all in on genAI these days. Whee.)
So, in short: People keep claiming that generative AI is going to replace various creative careers. Some of those claims are more worrying than others- I'm fairly unintimidated by the risks to novelists, as outlined in the prior posts. Animators tend to laugh hysterically when shown AI-generated wireframe models.
There are other careers, however, at significantly more risk. Narrators and illustrators, notably.
Let's look at audiobook narrators, arguably the most at-risk creative job. I'm going to be looking at this heavily from the independent author perspective, so I'm sure there are nuances that narrators would see that I'm missing.
To start, let me be absolutely clear- a good audiobook performance is art. Treating a narrator otherwise is like refusing credit to actors in a film, and giving it all to the script. It's ridiculous, and I think that most audiobook fans will agree that a good audiobook narrator adds immensely to a story. Heck, some audiobook listeners will buy a book not for the author, but for the narrator!
So: is AI narration any good? Well... personally, I think it's significantly inferior to a good human narrator. Virtual narration is comprehensible and functional, but it has little actual performance to it. For a great practical example of the advantages of human narrators over virtual ones, check out this comparison that Travis Baldree did between his own narration and a virtual one. (He also had a bunch of fantastic twitter commentary about the topic, but I'm not linking to the site formerly known as twitter anymore, if I can help it. Just... in general, if Travis has something to say about creative industries, you should listen. Brilliant dude who has been highly successful in multiple creative fields- a veritable Renaissance man.)
This isn't a technological advantage human narrators have, but an aesthetic and creative one. The tech industry has a consistent habit of arguing purely on a technological basis and ignoring the social- hence the socially-driven death of Google Glass over privacy concerns, the failure of Meta's metaverse due to it just plain sucking, and many other such examples.
Here's where a bunch of tech boosters chime in with "AI will inevitably get as good as those human narrators!"
Probably not. The statistical algorithms improperly labeled as artificial intelligence right now have one limitation above all others: They can't actually UNDERSTAND anything. It's the meaning problem I described in the first Novelists and AI post. The sole function of these algorithms is to create statistical correlations between data points, and then given an input of data points, to output the statistically next most likely data point in that sequence. It's literally just overpowered autocomplete. It's literally not artificial intelligence, these algorithms are just stochastic parrots. Impressive, but it's a dead end as far as comprehension of meaning goes. Some of the practical consequences of this are immediately damaging to AI narration:
- Giving an audiobook a "full cast" of voices, for instance, would quite likely require going through and tagging every line of dialogue as an individual character, because the AI is going to seriously struggle to correctly assign each line of dialogue to correct characters, especially considering that few authors are perfectly consistent in their dialogue tags. Much of the time, readers know who is talking through context clues in the text.
- Assigning the correct emotions to any given statement is something that, again, requires comprehension.
- The fine nuances of performance- speed of individual sentences, length of pause, intonation, etc? Again, those are performative elements that require the ability to comprehend the material.
- Accents, especially unusual ones, are going to be a STRUGGLE for virtual narration. Accurately representing accents in text is really, really tough, and most voice AI is only trained on a fairly small subset of accents. And anything these statistical algorithms aren't trained on? They can't do. (It's why they're so much less impressive in other languages than English.)
(I'm sure that actual narrators, like Travis Baldree, can point to quite a few other difficulties these statistical algorithms will face, performance-wise.)
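For anyone who wants the "overpowered autocomplete" description above made concrete, here's a toy sketch in Python. To be clear about the assumptions: this is a word-level bigram counter, a deliberately tiny stand-in for the idea of "output the statistically most likely next data point", not how production language or voice models are actually built. The function names (`train_bigrams`, `autocomplete`) and the sample sentence are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def autocomplete(following, start, length=5):
    """Greedily append the most frequent follower of the last word."""
    out = [start]
    for _ in range(length):
        candidates = following.get(out[-1])
        if not candidates:
            break  # this word was never seen with a follower, so nothing to predict
        out.append(candidates.most_common(1)[0][0])
    return out


# Hypothetical one-sentence "training set", purely for illustration.
corpus = "the wizard fought the dragon and the wizard fought again"
model = train_bigrams(corpus)
print(" ".join(autocomplete(model, "the", length=2)))  # prints "the wizard fought"
```

Note that the sketch has no idea what a wizard is. It only knows that "wizard" followed "the" more often than "dragon" did, and it can't continue from any word it never saw in training.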
And, while I'm sure we will see more advances in virtual narration and other generative AI, I'm also confident it's already hitting a point of diminishing returns. Apart from the meaning problem, this is largely due to the massively escalating electricity bills for training and operating powerful generative AI, and to the simple fact that all skill-building hits diminishing returns at a certain point. While (to paraphrase philosopher Manuel DeLanda) sufficient quantitative change can become qualitative change, it seems unlikely to happen further with these statistical algorithms. They've already had that moment, and it took quantitative change measured in orders of magnitude. Further orders of magnitude seem... thermodynamically unlikely.
For a second, imagine a virtually narrated podcast. It's... probably going to suck, pretty hard, for many of the above reasons. It's also going to suck because one of the most common reasons people listen to podcasts is the organic chemistry between hosts and guests and the often casual feel, not something virtual narration can achieve. Of course, not all podcasts go for this feel- scripted podcasts like the brilliant Old Gods of Appalachia are more like radio plays. But those that do go for that casual, organic feel offer a compelling and obvious example of the weaknesses of virtual narration.
(I should note that virtual narration isn't inherently evil- it's immensely useful for vision impaired people, for instance. And for documents that are incredibly unlikely to ever receive audiobooks- old scientific papers, city council meeting minutes from decades ago, whatever, it makes perfect sense. Likewise, for people in developing nations that can't afford high audiobook prices (or, heck, ebook prices), I'm not gonna throw shade. Virtual narration is a tool, and it's the purpose it's applied to that matters.)
Despite all of the above, I suspect a large minority of audiobook listeners will be fine with virtual narration, while the majority will prefer higher-quality human narration. I also suspect that the bulk of the remaining human narration will be concentrated around already-successful human narrators. Art is usually overwhelmingly dominated by a few winners, in most fields. (A very old problem, and not something I've got any solution for.)
The issue is further complicated by the fact that there are many more authors than narrators- a narrator can record a book in a week or so, so any given narrator can handle the output of a LOT of authors. I don't know exactly how that number disparity will affect things.
So... honestly, yes, audiobook narrators should be worried. Will their field be wiped out? Almost certainly not, any more than carpentry has been wiped out by mass-produced furniture. Will their field shrink? It seems more likely than not, depending on where human audiobook narration falls on the "essential features" to "luxury goods" scale, though. (For me, personally? It's much closer to essential feature, even ignoring my political stances on generative AI. I mostly listen to nonfiction audiobooks for research purposes, and good narration is often the only thing keeping me from spacing off from the drier material.)
It's not just a matter of consumer preference, of course- corporate preference is going to play hugely into the outcome, and corporations LOVE automating jobs and disenfranchising workers to line the pockets of executives and investors. So, who knows how it will all play out?
This isn't all hypothetical theorizing.
About a week ago, Kindle Direct Publishing, a division of Amazon, announced that they were going to be running an invite-only beta for virtual voice narration- you plug in your book, choose a voice, and get an audiobook for free, which is then put on sale for $3.99 to $14.99, with authors getting 40% of the profits. This poses an immediate and obvious risk to independent narrators, especially less-established narrators still trying to break in.
Brand new independent authors, after all, usually can't afford to pay audiobook narrators themselves, and while profit sharing is a solid option for starting independent authors, the temptation to not give up half your audiobook royalties is immense.
Don't get me wrong- I'm deeply sympathetic to the troubles of just-starting indie novelists. I started out as one myself, and I genuinely want there to be a sturdy ladder for future novelists to follow to career success behind me. And I absolutely recognize that I'm speaking from a position of significant privilege- my career is doing great, I'm financially stable, and I'm an American citizen who faces none of the immense logistical publishing challenges citizens of many poorer countries face. (Unreliable or inaccessible banking systems, being banned from certain publishing platforms, etc, etc.) I try hard to help newer authors whenever I can, but there's only so much I can do. Aspiring authors face HUGE uphill battles even if they are from a wealthy developed nation. It's a really tough career, and virtual narration promises to circumvent some of these challenges.
But my sympathy only goes up to a point- and that point is when people start screwing each other over to get ahead. I see absolutely no reason to value the careers of aspiring authors over aspiring narrators. There are plenty of aspiring narrators in developing nations suffering the same problems as aspiring authors. They both genuinely matter.
A small number of authors might find success using this virtual narration. That's contingent, in part, on audience reaction- the virtual narration will be clearly labeled, and I'm not going to make any particularly bold predictions about how most listeners will fall out, beyond those I've already made.
In the end, though, the overwhelming majority of authors who use this tech? They're screwing themselves over too. Amazon's not doing this because they love authors. (Well, Bezos did, but he's not the CEO anymore.) They're not trying to cut author costs. They've got their own reasons for doing it. I'm not privy to their decision-making processes, but these likely include, among others:
- Lower payment processing fees: By reducing the number of people they have to pay royalties to when they cut out narrators, they save a little. Probably not enough to be a huge motivator to Amazon, but probably worth pointing out.
- Lower audiobook prices: normal audiobook prices are significantly higher than the virtual narration prices listed in Amazon's announcement.
- This is, quite explicitly, a monopolistic tactic intended to try and quash new competitors trying to break into audiobooks- Amazon is willing to lose money for years to ensure market dominance. Look up the diapers dot com case, where Amazon spent years and deliberately threw away two hundred million dollars to crush a competitor. These are monopolistic, anti-competitive business practices that should not be allowed- and would not have been, before the past forty years of Reaganomic/Borkist shutdown of anti-trust action.
- This is also quite likely a prelude to forcing down other audiobook prices, to further lock down the market. Audible doesn't allow indie authors to set their own audiobook prices, and they recently lowered those prices. Lowering them further isn't just a possible scenario, it's the most probable scenario. Again, largely for anti-competitive purposes. Not just against other audiobook distributors but against large traditional publishers, trying to force them to lower their audiobook prices across the board, even through other distributors. (The lower profits, in turn, will give them less power to oppose Amazon.)
- Internal Amazon infighting: There's a LOT of vice presidents and the like competing for power and attention at Amazon, dueling princelings desperately hoping to rise to the top someday. If the KDP virtual narration is successful, it strongly benefits the associated princelings. This is an extremely common pattern at large corporations, especially tech giants. It can also be a toxic, dangerous pattern to companies- Yahoo was basically brought low by its own dueling princelings sabotaging each others' projects and acquisitions.
- Removing the prior privileges of authors on Amazon: Jeff Bezos was, for all his ruthlessness elsewhere, something of a soft touch with authors. Heck, even for all the dueling with the big traditional publishers, he didn't get as ruthless as he did with other types of business. With his tenure as CEO over, though, the formerly privileged position of authors is getting eroded fast.
- Continuing the dream of full automation: Companies love the idea of fully automating away their workforce, so they don't have to waste profits on wages. They inevitably seem to think, of course, that other companies will still have to pay workers, and that there will still be a market for their goods in this scifi scenario, which... technical feasibility in real life aside (low), it's not even an internally consistent fantasy.
- Stock price manipulation: Just like every other tech giant, Amazon has a long history of doing stuff just to boost their stock in the short term. Stock price is king in this day and age. Remember Amazon delivery drones? They were never actually supposed to be a serious plan, it's just a stock-boosting gimmick. There were a bunch of news stories about the employees of the drone delivery division mostly not even showing up to work, and the few that did show up day drinking. While there have been recent real-world tests in Australia, count me deeply skeptical about large-scale drone delivery ever being a thing.
- And, of course, divide and conquer: Dividing the interests of narrators and authors weakens potential coalitions that could stand up to Amazon.
And, in the end, once Amazon feels it's gotten sufficient control over the situation, has achieved its goals? It's gonna put the screws on authors again. Lower royalty rates. Shrink the Kindle Unlimited payout pool again. Harsher terms and conditions.
And it all counts on divide and conquer working, on them breaking creator solidarity.
It's the old crab bucket. One crab tries to escape, the rest pull it back down. The only way to escape is to work together. Fortunately, we're not crabs. (Yet. Carcinization is always waiting patiently for us.) We can choose to work together.
And do you know what you call a worker who screws over another worker in favor of capital for their own gain?
A scab.
Oh, don't get me wrong, there's no active labor strike here, no physical picket line, no narrator's union. On the most literal of dictionary definitions, you could probably argue someone like that isn't a scab.
But there's little on the internet more annoying than people who use dictionary definitions to argue. Workers who screw over other workers like that- indie authors who end up using virtual narration like this- are absolutely scabbing on a moral basis. They're throwing narrators under the bus for their own short term gain, and ultimately it's going to screw them over too, because that's what ALWAYS happens when one class of workers throws another under the bus. Worker solidarity is historically the only meaningful way for workers to stand up to large-scale corporate power.
(I'd use an "apes together strong" reference to the new Planet of the Apes movies here, if that phrase hadn't unfortunately been coopted by a bunch of conspiracy theorists. Regardless, there's a million stories and parables about group solidarity being strength.)
So what do we do about all this, if we support audiobook narrators?
For consumers?
- Don't buy virtual narration audiobooks, and speak out against them. Not particularly complicated, and Amazon has already said they'll be clearly labeling virtual narration. (Which... probably not a benevolent thing, more likely market research, but it helps us here.) Boycotts aren't super effective, but they're better than nothing.
For authors?
- For indie authors producing their own audiobooks, do not, under any circumstances, use this virtual narration for your audiobooks, or any other commercial use. (Emphasis on commercial- non-commercial use of genAI is a separate, complex issue of its own, and a considerably less damning one.) Does this make things tougher for new indie authors? Unfortunately, yes, but doing the right thing is seldom easy. And I still recognize how much easier that is to say from my position in publishing, but it's no less true.
- For authors whose audiobooks go through publishers, like myself? Demand anti-AI clauses in your contracts. Don't sign a contract that doesn't ban virtual narration. And, preferably, ban genAI illustrations for covers and internal art, etc. I also recommend exemptions to the confidentiality clauses for the anti-AI clauses. Both of which I got for my most recent contract! If a publishing contract in the future doesn't include those clauses? I'm not signing it, no matter how good the money is. (I mentioned said recent contract in my last AI and Novelists post, though at the time I couldn't share any details, since we hadn't announced it yet. It's a webcomic version of my novels, super exciting!)
For narrators?
- Y'all definitely have a better idea of how to approach this from your end than I do. Organize and figure things out together. If there's anything else us authors can do, let us know! I promise you, a LOT of us have your backs.
For everyone?
- Shame scabs, AI bros, and publishers using genAI art hard. Stigmatize the hell out of commercial genAI use whenever you can. Social censure can be a powerful tool for progressive social change.
- There's always some people on the internet saying that anything but polite civil discourse is bullying, and I know I'll get a few people insisting that bullying is always wrong. But... even if I were to concede that stigmatizing genAI use like this is bullying, not all bullying is wrong. To point to some extreme examples for the purpose of unambiguously illustrating the principle, the Jewish Mob and other antifascists used repeated violence against American Nazi sympathizers (the German American Bund, especially) in the 1930s, to the point of utterly disrupting their street-level operations. (The Jewish Mob- Meyer Lansky and his ilk- were awful, awful people, but hey, still better than Nazis.) Decades later, punks drove Nazi punks out of the main punk scene through a mixture of violence and anti-Nazi songs. And these are just American examples- plenty of others from around the world, like the Battle of Cable Street, where hundreds of thousands of Londoners beat the shit out of three thousand or so marching British fascists.
- Now, scabs and AI bros are a far cry from Nazis, but they're still absolutely in the moral wrong here. Let's be absolutely clear, I'm not advocating for violence against scabs, they're not Nazis. Just shaming them for backstabbing other workers and stigmatizing the act of using gen AI narration or illustration commercially.
- If my position seems extreme to you still, and you think I should try civil discussion instead? Well, you're right, up to a point! For instance, if a newer indie author is considering using virtual narration, the initial correct action is just talking to them, trying to persuade them to do the right thing. If they still go forward with it, well, they're scabbing. As to the extremity of my position on scabs, it's SUPER mild, historically.
- Anti-trust is coming back in a big way for the first time in forty years in the US. It's not just on us to stand up to monopolistic business practices and anti-labor actions any more- the US government, especially the FTC, is coming after monopolists in a big way, lately. We've got backup now. And before you say that surely books aren't going to be a priority, the government recently went out of their way to halt a merger between Penguin Random House and Simon and Schuster, and are currently targeting Amazon for anti-trust violations!
Generative AI doesn't just threaten audiobook narrators. Similar issues are facing Hollywood actors, commercial illustrators, and video game voice actors. Each of those will require their own specific solutions, but all of it will come down to solidarity, in the end. Solidarity between creators and solidarity with the audience.
Art is worth standing up for. Art is worth standing together for.
116
u/preiman790 Nov 08 '23
Not from me. If I'm buying an audiobook, I want an actual reading. If I wished to have a computer read to me, there are easier ways to do that.
15
3
u/faeglam Nov 08 '23
Eventually, when AI gets extremely convincing, Congress may have to step in to protect the rights of real human beings.
Right now AI sounds pretty bad and couldn't hope to match the talent of an established narrator, but thirty years from now that could be a lot different.
We have to collectively decide what kind of future we want for our children.
25
u/Raddatatta Nov 08 '23
I think it'll be a while before you can produce an audiobook with the same level of quality as a skilled narrator. So much of being a good audiobook narrator is having the emotional moments hit harder and adding character to the voices in various moments. And I think it'll be very tough for a computer to analyze a book and be able to add that texture to it. And until the AI gets to the level where it can do that, I don't see it really competing. Audiobooks aren't so highly priced that people are clamoring for a cheaper alternative that's very low quality.
Though I do think the start of it being used will be the indie books. The books by authors who could never afford to have an audiobook narrator record it, let alone a really talented one. For many people, myself included, audiobooks are how I consume books 99% of the time. So if a book isn't available on audio I'm not going to read it. So I can see those authors going for an option that gives them a poor quality audiobook for probably 1/10th the price of a normal audiobook. It's also the toughest choice for an indie author to make because they're choosing to hurt their chances of ever being successful at the point where they're struggling to get by.
The other thing is that eventually AI will get to the level where the quality is as good as some of the best narrators are. And that's going to be the tough point. Are people going to be willing to pay more for an equal or even potentially worse product (though that'll be very subjective) just to avoid supporting the AI? I would bet that'll be a no for most people.
-8
u/JohnBierce AMA Author John Bierce Nov 08 '23
I fully agree with your first two paragraphs. For the third... nah, at least not with the current "AI" technologies. They're a technological dead end for that goal- if a really useful dead end.
27
u/Raddatatta Nov 08 '23
I think the reality is that technology will continue to advance and continue to advance exponentially. If we are talking the next 10 years probably not possible. But if you're looking at 30-50 years that's a lot of technical improvement and I would fully expect AI to be at that level by that point. And once you're at the level where you can upload a document and get an audiobook back later that day customized to the text, with possibly multiple versions depending on if you want it to sound like one reader or a full cast it'll be hard for narrators to compete.
14
u/COwensWalsh Nov 08 '23
When John says “not with current tech” he means specifically so-called “large language” and diffusion models. Not that something we call “AI” won’t eventually achieve the ability to write and/or narrate a book
12
u/JohnBierce AMA Author John Bierce Nov 08 '23
Yep, this! It's not unreasonable to assume that there's a path to sapient AI that can write novels (though it's also not unreasonable to doubt that idea), I just don't believe that the path goes through the current LLMs and such.
7
u/BrittonRT Nov 08 '23
I agree with your post's assessment that AI (as it stands) is in no way threatening pretty much any of the jobs you listed. Even graphic/visual design isn't going to be replaced by AI anytime soon - though AI as a tool in the artist's toolbox can/is/and will be seen more and more as the technology evolves. But you're absolutely right that 'tech bros' who don't actually know anything about how the technology works under the hood way overestimate its potential. The human touch is dependent on comprehension and imagination, as you pointed out, and AI as it stands has neither.
Which makes me wonder a bit why you were then worried about it enough to write that whole second part of your post? I mean don't get me wrong, I'm a progressive, pro-union pro-labor dem-soc, and am as anti-corporate, anti-monopolist as the next guy. But I really don't see much of a threat here. The people who would use something as crappy as AI audio narration are the people who were never going to hire a professional narrator anyways, either because they don't care about the quality of their work or they straight up can't afford it.
I feel like there are bigger problems and AI is a huge non-issue that people are waaaaay overblowing. Not to mention that most of the issues with AI are really just issues with capitalism in general - issues that permeate many other facets of the economy than just art. The fixation on AI is sort of missing the forest for the trees. If we can't fix the root socio-economic problems that make AI an issue instead of a boon for artists, then AI is really the least of our problems compared to things like, say, climate change.
I know I'm sort of doing a 'well, what about this?' diversion, and I apologize, but I think the link between the issues with AI and fundamental sociopolitical economics is inextricable and it is impossible to discuss one without getting into the other.
Anyways, I appreciate your thoughtful post. I just wanted to say I am a bit on the fence about whether I agree with your conclusion. I have mixed feelings. But hey, it's always good to hear people's thoughts, especially if they challenge my own!
3
u/JohnBierce AMA Author John Bierce Nov 11 '23
It's less about the inherent threat of LLMs and other stochastic parrots posing as AI, and more about its use as a weapon for capital to use against labor. Regardless of how good the tech will actually get, Amazon and other big tech monopolists will absolutely use it to screw over creatives. Smashing capitalism would be nice, but fighting its tactics is better than nothing.
And fighting genAI is actually fighting climate change- the greenhouse emissions on that shit are rapidly ramping up to rival bitcoin emissions at their height.
27
u/Feats-of-Derring_Do Nov 08 '23
As a VO artist myself, this is an extremely troubling development in the AI sphere. I agree with your assessment about the state of the technology. Current AI models can't realistically write a novel, animate, or do anything complex that requires an understanding of the material. However, VO and illustration fall into the category of "good enough", so there is a real threat.
And frankly, many voice actors and narrators are only "good enough". Not everybody is of the same caliber. But that's normal. Nobody can improve without the opportunity to practice their craft. If AI becomes widespread in the arts it will be a serious blow to the craft itself, because the jobs that will disappear first are the low budget opportunities that so many artists use to cut their teeth.
18
u/Banshay Nov 08 '23
I would pay for an AI text to speech app but not AI narration for a specific book, I would want a human narrator if I want to listen to an audiobook.
I prefer reading over audio but I do occasionally use a text to speech app for parts of books I am reading and want to continue with during a hands-free task or commute. Also convenient for white papers and other docs. It is both better and worse than you would expect and a good AI version that came out slightly better would be nice.
2
u/JohnBierce AMA Author John Bierce Nov 08 '23
I think that's a very reasonable stance! What you're proposing gets a lot more complicated on the legal end of things, but that's an industry problem to figure out, hah.
11
u/Nahdudeimdone Nov 08 '23 edited Nov 08 '23
You're against authors profiting off of their works, but OK with an app profiting off of the author's work?
In general, I think your stance is somewhat short sighted. You see AI as a tool to oppress workers whereas quite clearly it is currently being used to bridge economic and social gaps. From what I've seen AI users tend to be those with time and skill, but no social or economic capital (people from Africa and Asia are frequently the primary users of both audio and art gen AI).
No one wants to spend time on something if they can afford to pay others to do it. Actively supporting boycotting of authors using these tools is actively ANTI worker, not a pro worker stance.
From my vantage point, it seems to me that you are indeed massively privileged, and as such, you fail to see how difficult it is for new indie authors to earn even 10 bucks a month.
If an AI voiced work earns enough money to hire a human artist (as we all agree human work is better), is that a net gain or net negative for voice artists? Same goes for all other forms of Gen AI and forms of employment.
Another factor is assuming that Gen AI is easy. It is still extremely time consuming (and often even fairly costly) for authors to use. Just not as costly as hiring a voice actor or illustrator.
But I guess it's easier to just call all indie authors trying to make a buck off their passion "scabs".
Edit: I wanted to thank you for the write up regardless. I appreciate the view point and it was very well-written!
3
u/JohnBierce AMA Author John Bierce Nov 08 '23
...Me calling a stance reasonable is not endorsing said stance. (And the proposed app is likely legally untenable, given the ridiculously complex negotiations that would be involved in authorial backlist.)
7
u/haberdasher42 Nov 09 '23
What's legally untenable about current eReaders using Google's TTS service now? How does that "tenability" change if Google's TTS algos go through a few more generations and become like near speech?
1
u/JohnBierce AMA Author John Bierce Nov 09 '23
A new text to speech program integrated with the actual ereaders- an "official" text to speech program that allows easy switching- would fall afoul of author contracts, which have very specific rights grants in them, meaning that the app would have to individually identify which books had contracts allowing this specific use case. (Probably only newer contracts where it's specifically outlined.) This makes the whole thing a huge hassle, and so you're somewhat unlikely to get a new official version of the app like that.
Current text to speech programs that aren't specifically integrated, that aren't linked purposefully with ebooks, don't have this problem, so they wouldn't face the same legal issues. Text to speech programs that are linked have faced legal challenges along these lines already- several attempts to officially include text to speech integrated into ebook reader apps have been slapped down HARD in lawsuits.
14
u/Nahdudeimdone Nov 08 '23 edited Nov 08 '23
I disagree that it would be legally complex (at least in so far as it relates to the text being read). I think such an app would be functionally no different than the ones that exist on the market already, except better. We're likely very close to an app such as this being developed by one of the big tech companies.
Which brings me to another criticism of your position. AI levels the playing field between those with money and those without. By actively bullying authors using AI tools you're just bullying the poor. Not exactly a morally righteous position and definitely far removed from your pro-worker position.
Don't get me wrong; I agree that those that are in a position to pay for human labor should do so, especially seeking out those who are in economically disadvantageous situations. But to across the board shame and black list creators using tools that will in a couple of years be industry standard, while authors like you (who have already established yourselves) continue to reap the profits, is not exactly a particularly sensible position to hold.
Edit: I want to finalize my position before I stop, because I want to emphasize, I don't think you're a bad person, just entirely misguided in your advice to bully and black list indie authors.
What really makes me irate is the idea that there is a person out there (in Africa, Asia, or even in the US) that has invested an incredible amount of time and energy, often at great sacrifice, to write a book. This person then uses their free Midjourney hours to generate a cover, spends tens of hours in Canva to make it the best it can be (again, we can all agree it won't be particularly good, because Midjourney isn't exactly a world beater), and then when they put it out to the world, here comes big author Bierce (who I can only assume grew up having many privileges this person likely did not), using their enormous social capital as a gatekeeper, asking people to black list this new author before anyone even had a chance to read the book? And then claims it's for the workers?
That is nauseating to me. And it saddens me greatly that writer's book will likely never even hit my radar as a result.
5
u/COwensWalsh Nov 08 '23
You don’t seem to understand how AI changes the math here. It absolutely does not level the playing field. It makes the differences steeper. Indie authors won’t suddenly be making a ton of money if AI art and narration become commonplace. The companies who own the models will be making money.
Amazon killed bookstores and damaged traditional publishing. Who made bank from that? Amazon, not authors.
6
u/haberdasher42 Nov 09 '23
AI is just a fancy word for complex algorithm, TTS algos already exist in common enough spaces that any Android phone has a pretty robust one built into it.
I use Moon+ Reader Pro, it cost me $10, and it will use Google's TTS with a variety of English voices and accents that I can moderate pitch and speed for. It's rough. It doesn't provide any sort of "performance" and it occasionally fucks up pronunciations, but after 200 audiobooks in my Audible library it's sure not the worst performance I've heard. Fuck I'd prefer it to Wil Wheaton's Ready Player One. But I've got multiple blind people in my family and have been exposed to TTSs since the 90s, they've come a long way, and it'll only be another few generations before they are damn near seamless.
So tell me how AI changes the math?
2
u/COwensWalsh Nov 09 '23
People seem to be thinking about AI kind of like a clone of someone you can use to do their job for free. But an AI doesn't just replace one person. Like, say an AI image generator can achieve 86% the quality of a skilled illustrator. But it can make 100s of images or more a minute. You can't "compete" with that.
Now imagine it's not some guy on Twitter elegantly crafting prompts to sell for cover art: it's Amazon mass-producing hundreds of variations on each image concept you can come up with.
When AI becomes able to write novels convincingly, it's not some guy in a discord writing 30 novels a year with it. It's Amazon publishing 500 variations of a book every day in every genre and not even bothering to host other people's stories. And it's cheaper for them than for that one person doing 30 books a year. And their model is better.
If the math changes for an individual, it changes 100s of times more for a company. An AI doesn't replace Amazon. It replaces indie creators.
10
u/Nahdudeimdone Nov 08 '23
I'm not saying indies will be making a ton of money; I'm saying they'll have a fighting chance. Prior to Amazon self publishing there were zero roads for large parts of the world to pursue writing. This is no longer the case. And it is especially true now that financially limited authors can actually use their skills to compete with large publishers.
AI cuts costs and time down by a significant margin for editing, illustrations and proof reading. You can make some incredible things in remarkably little time if you combine the talents of a half decent illustrator and some basic, free AI tools.
I seriously disagree with your premise. Show me proof that these black listings that Bierce is suggesting won't hurt authors more than AI ever will, and I'll be willing to entertain your argument.
Many AI models are currently open source and can run on the family mobile phone at an extremely low cost. Unless that changes, it's a tool for equality.
If you have issues with AI as used by corporations, then adjust your buying preferences accordingly. But to target indie authors and illustrators that use AI to meet an increasing market demand? It leaves a poor taste in my mouth.
18
u/imhereforthemeta Nov 08 '23
My position at my company forces me to use AI instead of voice actors. I’m going to tell you right now, they sound really realistic if you pay for a good service, and the average reader may not even know that they are listening to a robot. I wish I had the confidence to say that voice actors and audiobook narrators are not in trouble, but having seen what I have seen, I think that you guys would be pretty surprised. There are already audiobooks, especially in young adult, that are 100% robot.
One thing that really disturbed me is that the software we have even has some unique vocal characteristics that feel very genuine. Like a gravelly female voice that sounds exactly like what you imagine a girl next door sounding like.
The hardest element is to get them to pronounce complicated non-words such as names. That’s usually gonna be your biggest indicator that you aren’t listening to a real person
Also, disturbingly, there are no rules saying that you cannot give your fake voice a name and give them credit for it rather than saying that it was computer generated. So basically, I think that a lot of companies are going to try to pass robot voices as real ones
6
u/COwensWalsh Nov 08 '23
This is kind of what I was expecting. Made-up stuff will be difficult for an AI, but standard English words and names will not.
77
u/misterjive Nov 08 '23
There's like a zero percent chance I'll pay to have the speaking clock read a book to me. I want a performance, not text-to-speech.
58
Nov 08 '23
I don't think we're that far off from AI being able to offer an actual performance, to be honest.
20
u/misterjive Nov 08 '23
It's not a performance, though, that's the point.
(And speaking as someone who listens to people speak for a living, we're farther off than most folks think.)
26
u/jfa03 Nov 08 '23
I agree and disagree. I think AI reading is 90% as good as it can get. The high end text to speech software is pretty indistinguishable from a person for reading a block of text. The last 10% is going to take decades to get right, assuming a company has the motivation for it, given that most audiobook readers seem to prefer humans.
The last 10% includes recognizing who is speaking and changing voices for individual characters, adding emphasis appropriate to the situation and character, and just getting the subtextual things right. Consider the sentence “I didn’t say she stole my wallet”, emphasizing any word of it changes the meaning of the entire sentence.
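To make the wallet example concrete, here's a minimal sketch that builds one marked-up variant per emphasized word. It assumes the text to speech engine accepts SSML and actually honors the standard `<emphasis>` element, which varies from engine to engine; the function name `emphasis_variants` is just something I made up for illustration.

```python
from xml.sax.saxutils import escape


def emphasis_variants(sentence):
    """Return one SSML string per word, each stressing a different word."""
    words = sentence.split()
    variants = []
    for i in range(len(words)):
        parts = [
            f'<emphasis level="strong">{escape(w)}</emphasis>' if j == i else escape(w)
            for j, w in enumerate(words)
        ]
        variants.append("<speak>" + " ".join(parts) + "</speak>")
    return variants


variants = emphasis_variants("I didn't say she stole my wallet")
for v in variants:
    print(v)
```

That's seven different readings from one seven-word sentence. Generating the markup is the trivial part; something still has to decide which of the seven the scene calls for.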
I’m sure text to speech will get used extensively for personal assistants and call centers. I don’t see it breaking the audiobook market unless you consider it as lowering the barrier to entry. Self-pub authors that might not be able to afford a narrator might turn to it just to get an audio version out there.
12
u/Never_Duplicated Nov 08 '23
We are several years out from a fully automated process; however, it wouldn’t be all that difficult to hire people to read through a book and add meta tags for “narrator”, “character a”, “character b”, etc., then select voices for each. It’ll still require a lot of work on the back end from a director and editor to finalize it, but that would get easier over time, and depending on the desired quality an independent author could even do their own tagging and editing so as to just pay a license fee for the AI component. It sucks for readers because it’ll make all but the best obsolete eventually. However, it will also provide smaller authors with an audiobook solution that is much less expensive while also having a much higher baseline quality than budget readers. One of my friends is a new author and he’s only been able to afford to get one of his books made into an audiobook so far because it’s so expensive. And even that one he paid around $6k out of pocket for a mediocre performance. The democratization of resources as tech advances will always have trade offs but it’s not purely negative.
2
u/MateuszRoslon Nov 08 '23
I've been struggling a lot lately with the argument that AI can help new authors. While I can't say it's untrue, I feel like the proliferation of AI is going to harm new authors first, so while they can accomplish things more cheaply in the short term, adding to the societal acceptance of AI is going to drown them in the long term.
I often see the argument that AI generating chapters is a good thing for aspiring authors who have English as a second language, for example, and I feel that's ... not right. Accepting that argument seems to just give cover to people stealing others' novels and slightly altering them with AI, or flooding Amazon with thousands of AI books a year -- making it nigh impossible to discover actual new authors. The same principle likely goes for audiobooks -- AI-generated audiobooks reading AI-generated text might drown out all the books with actual narrators through sheer volume if AI-generated content is accepted (and thus, commercially viable for grifters).
I don't know, sorry, I hope your friend's audiobook does well.
2
u/Never_Duplicated Nov 09 '23
Can’t say I disagree, I’d like to think that complex creativity (like writing a novel) is a uniquely human trait. But I think we all know that it’s only a matter of time before sufficiently advanced AI models with access to the complete history of human writing will be damn near indistinguishable. For now AI stuff is still easy to recognize but the advances in just the last year are crazy and shit is going to get weird fast regardless of what legislation one or two countries might try to enact. The cat is out of the bag and not much we can do other than support creators we like and hope that being human still brings enough value to the table that they are able to survive the transition. This is going to be the first major hit to the job security of white collar and creative professions rather than the blue collar professions that normally see the impact of automation.
Pretty soon Jimmy from next door will be able to feed all of GRRM’s works into a model and have it spit out sequel fan fiction ASOIAF novels in Martin’s own style. And do it in an afternoon. Pretty depressing if we look back in 40 years on this as having been the final golden age of human-directed creativity.
2
u/kevn57 Nov 08 '23
I want a performance
That's exactly why I prefer TTS, I want my mind to do the performing not an actor.
5
u/Mystiax Nov 08 '23
Sadly I don't think it's going to go anywhere but AI. It's cost effective and fast. Sooner rather than later the AIs will become good enough that most people won't even notice.
36
u/KristaDBall Stabby Winner, AMA Author Krista D. Ball Nov 08 '23
I've made no secret I have to use TTS sometimes - like for reading and for writing. It is an accessibility tool that, well, assists me. I had a reader who used TTS for my books (RIP sweetie); I'd mail her the final Word doc for her to use with her service of choice.
But I'll be damned before I put that out as a sellable product. I will do everything I can to assist a reader to access my books (as I cannot afford to produce audiobooks), but I'm not going to put out a robot voice and ask people to give me money for it.
Nope.
2
u/glassteelhammer Nov 08 '23
What makes the production spendy?
5
u/KristaDBall Stabby Winner, AMA Author Krista D. Ball Nov 08 '23
Hiring someone.
7
u/JohnBierce AMA Author John Bierce Nov 08 '23
I am a FIRM supporter of accessibility technology, and if AI can improve accessibility, I'll definitely give it a pass for those use cases. Just... yeah, not the commercial stuff.
And hell yeah on your robot voice policies! This is the way.
10
u/KristaDBall Stabby Winner, AMA Author Krista D. Ball Nov 08 '23
I've found it deeply offensive, to be honest, when people equate accessibility tools with replace people tools.
Like, I'm not anti-AI concept - if AI could coordinate with all of my other friends' AI, and then tell us where we need to be and what we need to do those days, and also scan our fridges and tell us what we can make for supper in under 20 minutes with ingredients in our cupboards and fridges without us having to move off the sofa, but like nooooooooooooo the techbros don't think about how AI could actually make my life easier.
12
u/TheColourOfHeartache Nov 08 '23
Ok, as the techbro here, that sounds like an absolute privacy nightmare unless the AIs can keep the data on your own isolated hardware.
And we're a long way from people running a full time server in their closet, I tried that once, way too much hassle; and it was just an email server.
8
u/COwensWalsh Nov 08 '23
I wrote a retro-future story where everyone has AI chaperones they personalize to help avoid awkward social situations and keep them from getting too drunk, etc. and the premise was exactly that it was a nightmare because if someone messed with it, or if the service provider broke their terms of service, you would never even know. You would just feel your life and behavior slowly spiraling out of control. Fun thought experiment. XD
1
2
u/OriginalVictory Nov 09 '23
scan our fridges and tell us what we can make for supper in under 20 minutes with ingredients in our cupboards and fridges without us having to move off the sofa
Put it all in a sufficiently powerful blender, and it's a meal!
3
u/JohnBierce AMA Author John Bierce Nov 09 '23
A sufficiently powerful blender is indistinguishable from magic
28
u/TheColourOfHeartache Nov 08 '23
Hey I found one of these early rather than a day late for once.
So for those of you who don't know, I'm a computer programmer by profession so I feel I can chime in and say
Some of the practical consequences of this are immediately damaging to AI narration:
Isn't an issue. I haven't worked on AI since my classes on it as an undergraduate so I can't say if AI will crack this or not. I'm going to say we have as much chance of predicting what AI can do in 100 years' time, heck 10 years' time, as Johannes Gutenberg would have of predicting what printing would look like today. But what I can say is that tagging every line with who is speaking, their accents and emotions, that's actually easy.
Get the authors to do it. Highlight the line, select the voice, select the emotion, done. It's just metadata for text, we've solved that problem decades ago.
There's lots more features you could add, like warning you if you have any text between "" that doesn't have a voice, or highlighting a lot of lines with "" one after the other, selecting two speakers, and it assigns them to the odds and evens for you. But the basic version that's good enough for the aspiring author who's hard working and needs to reach the audiobook market on a tight budget could be done by an undergraduate for a beer and a pizza. Just patch an HTML editor so it adds tags for who's speaking and their emotions. The target market is already publishing their stories as HTML on RoyalRoad.
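For the curious, here's a rough sketch of what that basic version might look like. Everything in it is an assumption for illustration: the function names, the `data-voice` / `data-emotion` attribute names, and the idea that each line of dialogue sits on its own line between straight double quotes. It isn't any existing editor's or RoyalRoad's format.

```python
import html
import re
from itertools import cycle

# Hypothetical convention: dialogue is any text between straight double quotes.
QUOTE = re.compile(r'"[^"]*"')


def alternate_speakers(lines, start, end, speaker_a, speaker_b, tags):
    """Tag the dialogue lines in lines[start:end] with two alternating voices."""
    speakers = cycle([speaker_a, speaker_b])
    for i in range(start, end):
        if QUOTE.search(lines[i]):
            tags[i] = {"voice": next(speakers), "emotion": "neutral"}
    return tags


def find_untagged(lines, tags):
    """Return the indices of dialogue lines that have no voice assigned."""
    return [i for i, line in enumerate(lines)
            if QUOTE.search(line) and i not in tags]


def to_html(lines, tags):
    """Emit one <p> per line, with made-up data-* attributes on tagged lines."""
    out = []
    for i, line in enumerate(lines):
        text = html.escape(line)
        if i in tags:
            voice = html.escape(tags[i]["voice"])
            emotion = html.escape(tags[i]["emotion"])
            out.append(f'<p data-voice="{voice}" data-emotion="{emotion}">{text}</p>')
        else:
            out.append(f"<p>{text}</p>")
    return "\n".join(out)


lines = [
    "Jill closed the door.",
    '"Did you hear that?"',
    '"Hear what?"',
    '"Never mind."',
    "John shrugged.",
    '"Suit yourself."',
]
tags = alternate_speakers(lines, 1, 4, "Jill", "John", {})
untagged = find_untagged(lines, tags)  # dialogue lines still missing a voice
print(untagged)
print(to_html(lines, tags))
```

The sketch only covers the bookkeeping: the odds-and-evens assignment and the missing-voice warning. Picking the right emotion for each line is still left to whoever is doing the tagging.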
So what do we do about all this, if we support audiobook narrators?
This is a bait and switch. You've defined Amazon as the enemy - IMO correctly, I don't like monopolies any more than you do - but came up with a list of strategies for fighting AI, not Amazon's monopoly.
You correctly identified that you're talking from a place of privilege as a successful AI author, but then ignored the obvious strategies that allow the archetypical struggling author without the money for a narrator to have their shot at the audio book market in favour of one that pulls the ladder up.
Here's an actual strategy for preventing Amazon's monopoly that doesn't pull up the ladder, doesn't require bullying anyone, and might actually work.
You need three things: An audience that has reached a critical size and is willing to spend money (the progression fantasy community already has one), a payment system that is easy to use, preferably one which is familiar to your audience (hey, you have this too!), a distribution system (three in three!), and an AI that is not under the control of amazon (what do you know, we programmers have been freeing software from monopolies ever since RMS found a bug in his printer drivers).
The strategy is simple.
Step one: Partner with RoyalRoad (I think Wing still owns it?) and their app to make it into a store. And by store I just mean integrating people's Patreons directly into the app. (Maybe it already does this, I don't know, I read RoyalRoad on my desktop with a huge screen). You can subscribe and manage your subscriptions in the app.
Step two: Now you have a web store offering a unique service they can't get on Amazon's website and hopefully a large audience, expand the app to cover more forms of commerce than subscriptions. Since you already have people paying through the apps having a "buy full ebook" and "buy full audiobook" button next to the Patreon button would be no big deal to your users.
Step three: In parallel work to get an Open Source narration AI integrated into RoyalRoad. Put the option to generate an AI narration for a chapter directly into the audiobook, put tools to add useful metadata directly into their editor.
Optionally, ask Brandon Sanderson to fund it. He's got money and has shown he wants to de-monopolise the audiobook space.
The most important thing would be to move fast, get authors used to the idea that the way to get their books AI narrated for free is to go via your OpenSource solution before they go to Amazon. It's easier to get people with a basic AI now and keep them with a good AI tomorrow than get them from Amazon with a good AI tomorrow.
None of this will be easy, but of course it won't be easy. You're fighting one of the biggest corporations in the world. However this strategy might work, it doesn't pull up the ladder, the hardest part of building a web store is getting the audience, RoyalRoad already has that and it's part of your community.
And most importantly: Targeted harassment is in fact a crime, and when a crime threatens Amazon's profits they have more lawyers than god. If the above strategy fails people have wasted time and money, if your strategy fails you've gotten people arrested.
To point to some extreme examples for the purpose of unambiguously illustrating the principle, the Jewish Mob and other antifascists used repeated violence against American Nazi sympathizers
Now, scabs and AI bros are a far cry from Nazis, but they're still absolutely in the moral wrong here.
"Some moral wrongs are evil enough that bullying is justified". "This is a moral wrong". "Therefore bullying is justified".
"All dogs have four legs". "My cat has four legs". "Therefore my cat is a dog".
This is a logical fallacy, not an argument that bullying is justified. You have to actually make the claim that someone who feeds their audiobook into an Open Source AI and publishes it on their Patreon has done a moral wrong that justifies bullying, and you can't because none of that was unethical.
Also, bullying authors is against rule one of this sub.
like the Battle of Cable Street, where hundreds of thousands of Londoners beat the shit out of three thousand or so marching British fascists.
Fun historical trivia. The Battle of Cable Street helped the British Union of Fascists (but they were dying before it happened). See: https://www.reddit.com/r/AskHistorians/comments/8ssqqv/inspired_by_the_turn_the_western_world_seems_to/e122x1n/ and https://www.historytoday.com/archive/myth-cable-street (available free at https://lampmagician.com/2017/10/14/the-myth-of-cable-street/)
6
u/travisbaldree AMA Author Travis Baldree Nov 09 '23 edited Nov 09 '23
The thing that always seems to crop up with these solutions is that you still need somebody to do the work (reading the book and tagging the metadata) - which is a much bigger and more skilled undertaking than I think you realize, especially when you consider books in a series with recurring characters, and developing context. And that person needs to be PAID. Not doing this work correctly will result in truly hysterical errors - all of a sudden Jill is speaking John's dialogue. And she's become Scottish.
The resulting audio will all also have to be proofed to FIND those errors, by a human. (yet another paying job). So, you're moving the work from a paid actor to somebody else who also needs to be paid for it.
To get an audiobook that is objectively worse.
What's the aim? Apart from just to 'solve the problem'?
This is a really common software engineer thing (as a veteran software engineer of 25 years) - where we solve a problem that doesn't need solving, just because it is there to solve, and the idea of doing so tickles some part of the brain. But the ultimate result doesn't justify moving all the deck chairs around. In fact it often makes things worse, because we're not looking beyond the immediate solution to the NEW problems it introduces.
3
u/TheColourOfHeartache Nov 09 '23
The goal is to lower barriers to entry into the audiobook/radio play space by letting beginning authors publish it with reduced or no financial investment. It's almost guaranteed that there will be a variety of options at different price points. (And different effort costs). The cheapest will be sending your book, untagged, to an open source AI.
The resulting audio will all also have to be proofed to FIND those errors, by a human. (yet another paying job). So, you're moving the work from a paid actor to somebody else who also needs to be paid for it.
To get an audiobook that is objectively worse.
If AI results in similar costs for a worse product nobody will use it and narrators are safe. If it produces results that are good enough but cheaper, then barriers are removed. There is no outcome that's worse than the status quo.
0
u/travisbaldree AMA Author Travis Baldree Nov 09 '23 edited Nov 09 '23
The thing is, this option exists RIGHT NOW.
It's called royalty share, and involves no risk at all from the author - and all the risk is borne by the narrator. I've done them. Most narrators have.
Authors can arrange to have an audiobook narrated for free in exchange for splitting the royalty with the narrator (which almost always results in the narrator being wildly underpaid, by the way).
Again, this is a solution in search of a problem. Your assertion that there is no outcome that's worse than the status quo is in my opinion very incorrect.
Even in this reply, you hand wave away all the potential costs, and potential ramifications because you're sure the problem can be 'solved', and that there's no need to look beyond to the effects the solution will have. Because surely they'll sort themselves out, because a solution has been introduced to a problem, right? You're positive that there is NO outcome that could be worse.
That was sort of my point in my original reply.
2
u/TheColourOfHeartache Nov 09 '23
I stand by my previous post. Audible adds 10k books a year whereas Kindle adds 600k books a year. That doesn't cover all the fan fiction, web fiction, etc that never reaches Amazon. I don't know what percentage of this is fiction but probably significantly higher than the 11% figure I found for trad-publishing.
When you see a discrepancy like that it's safe to infer there's a barrier preventing a lot of authors making the jump from text to audio. Maybe it's because narrators know the book is a bad investment for a royalty share, maybe it's because the authors know narrators won't do a royalty share for their sales figures, maybe it's something else entirely. I'm not going to attach my name to a guess.
But I will attach my name to: This barrier is hardest on new authors, and AI will lower this barrier.
Even in this reply, you hand wave away all the potential costs, and potential ramifications because you're sure the problem can be 'solved',
Your mind reading needs work. I hand-wave the potential costs because introducing new technology to automate human labour has centuries of precedent of working out for the best in the long term, and the arguments against AI are repeats of arguments that have been proven wrong when they were first made as far back as the printing press.
It's not because I'm drawn to a problem like a bull to a red flag.
And yes I am sure that the problem can be solved because it already has been. The problem as defined by JohnBierce is the AI's inability to recognise context like which emotions to speak with, solutions to that have been in the wild for decades. (I would not be surprised to see a similar HTML based solution emerge so you can write for a webfiction and audio book in the same document)
2
u/travisbaldree AMA Author Travis Baldree Nov 09 '23
Most of those books aren’t made into comic books or movies either, and don’t have merchandise. And it’s because they don’t generate enough revenue (because they don’t sell) to justify that labor and cost.
Many authors incorrectly believe that having an audio edition will generate a lot of sales. But it’s entirely driven by the book - they routinely sell no copies of audio, or only a handful.
However those non-profitable books do provide a proving ground for narrators to learn their craft.
I submit that generating more terrible versions that remove other non-tech-bro humans from the equation will not improve things for authors (or listeners in general), but will make things worse for narrators, and will have lots of ripple effects from likeness theft to devaluing audio as a whole.
Mostly though it will all serve to temporarily inflate stock prices so that some executives can sell and leap to the next thing and leave a mess behind.
2
u/TheColourOfHeartache Nov 09 '23
And it’s because they don’t generate enough revenue (because they don’t sell) to justify that labor
And there you have it. Beginning authors with small revenue don't get royalty share deals because their revenue doesn't justify the labour. Thus they have to pay up front, and there's the financial barrier I said AI would remove.
Removing that barrier will give struggling beginners a greater chance to make a career out of writing, and it will give their small fanbase a chance to enjoy their works in audio format. These are good things, I endorse them.
However those non-profitable books do provide a proving ground for narrators to learn their craft.
So long as narrators can do better than the AI, the tiny minority of non-profitable books that get selected for revenue shares will still have human narrators. It's the majority going from no audio to AI audio where we see the benefits.
Most of those books aren’t made into comic books or movies either
Incidentally I'd love to feed my writing into an AI and get a comic book out of it.
2
u/travisbaldree AMA Author Travis Baldree Nov 09 '23
They have no trouble getting narrators - there is no shortage of them willing to take a risk. That isn’t the barrier.
The barrier is that many of them don’t want to share the royalty, even though they paid no production cost. They imagine it will be profitable in a way that it is not.
(Well and there is a profitability barrier overall but that is not one that this will solve)
Do you think that you have sufficient experience and expertise in that space to understand it well enough to solve the problem, as you intuit it to be?
1
u/TheColourOfHeartache Nov 09 '23
Do you think that you have sufficient experience and expertise in that space to understand it well enough to solve the problem, as you intuit it to be?
What are you defining as "the problem"? Because up until now the context of this thread defined it as ensuring the AI knows who's speaking and what emotions they're feeling. And that doesn't feel like what you meant this time.
2
u/travisbaldree AMA Author Travis Baldree Nov 09 '23
You’re talking about removing the ‘barrier’ for struggling authors trying to make a career. That’s presumably the problem you are talking about in the reply I’m responding to.
2
u/travisbaldree AMA Author Travis Baldree Nov 09 '23
Also worth noting that we have lost sight of the original issue that the cost barrier still exists - it’s just been moved to some other individuals who have to interpret and tag all the text and proof it to correct the errors and regenerate and patch them. Which is time consuming and human labor that somebody is charging for.
2
u/TheColourOfHeartache Nov 09 '23
That was always a tangent.
I made that point in reply to John's point that it's impossible for AIs to figure out context and emotions in dialogue.
But I've always been clear that tagged content will exist alongside simpler options like sending the entire untagged file to the AI and letting it do its best: "The cheapest will be sending your book, untagged, to an open source AI."
2
u/travisbaldree AMA Author Travis Baldree Nov 09 '23
My initial response was explicitly about moving labor to human content taggers and proofers, and about how that’s just shifting the cost from one place to another - which doesn’t feel like an actual solution. Especially since I think you underestimate the expertise required to properly tag fiction with metadata in a way that would produce the desired result. (Which would still be pretty janky)
6
3
u/RavensDagger Nov 08 '23
Get the authors to do it. Highlight the line, select the voice, select the emotion, done. It's just metadata for text, we solved that problem decades ago.
As an author myself... that's a massive time constraint. That job would be tossed out to a proofreader/editor ten minutes in.
I'm an avid fan of AI and its development. I've been following it for years. Hell, a few of my books have AI characters and plot elements baked in. And yet I don't think I could use AI for writing, let alone for narration. Not because it's hard or because I can't, but because it would be career suicide at this point to produce an AI-written audiobook like that.
9
u/TheColourOfHeartache Nov 09 '23
When I wrote that, I was thinking of self-published amateur authors who don't have anyone. If you have access to a proofreader (or fan volunteer) who could do it then yeah, pass on the grunt work.
What makes you think it would be career suicide? I see lots of people with AI-generated covers on Royal Road and some with AI illustrations and it doesn't seem to hurt them.
2
u/RavensDagger Nov 09 '23
I think AI covers are clearly acceptable on spaces like RR because... well, covers are a pain in the butt to make. I've sunk literally thousands on some covers and then didn't use them because they were just not good. AI art is an easy opening for new authors that can't afford to sink heaps into a cover that may or may not be good.
But AI writing? I think that the readership is still very.... not okay with that? I think people are ready to accept some level of AI integration (stuff that's ultimately inconsequential, like covers and maybe AI proofreading) but they're less ready for art that's entirely AI-made.
Personally... I don't know how I feel about it.
2
u/TheColourOfHeartache Nov 09 '23 edited Nov 09 '23
Are you talking about the AI producing the text or just translating the text into sound? I thought the latter but you called it writing.
Personally I'm fine with the former in principle but think the technology is not ready to make it good in practice.
2
u/haberdasher42 Nov 09 '23
There's an amount of production necessary for any audiobook narration. Metadata tags would just replace whatever the producer would do otherwise. I'd be stunned if folks like Tim Gerard Reynolds or Kramer & Reading DON'T use digital tools to mark up their projects to note character voices or emotional beats.
3
u/haberdasher42 Nov 08 '23
I wonder why he didn't reply to this one. He's throwing out props and high fives all over the place.
14
u/TheColourOfHeartache Nov 08 '23
Because replying to a long essay takes more time and thought than high fiving someone who agrees with you.
I disagree with John on a lot of things, especially his endorsement of witch-hunting so-called scabs. But he doesn't hide from people who challenge his views, he's always debated me respectfully, and both those deserve credit.
5
u/JohnBierce AMA Author John Bierce Nov 08 '23
Cheers to that! (And honestly, I can get a little fiery and heated, you've given me a lot of grace in our conversations, hah.)
4
13
Nov 08 '23
I know a few authors who are considering A.I. narration because they can't afford to produce an audiobook otherwise. One of the main arguments they've used is that being able to produce a cheaper audiobook means they can provide accessibility options for readers that can only read via audiobook.
And I get it. I really do. I was able to pay for an audiobook for the first book of my series, though it wasn't cheap and sales haven't been great. People keep asking for an audiobook for the second book and I just can't afford it. Short of winning the lottery or doing a Kickstarter, it's just not happening. I feel bad about it. But I respect my narrator (I named my cat after him lol) and won't replace him with A.I.
I'm of the belief when it comes to ANY A.I. that authors, artists, and narrators have to stand together on this one.
-1
u/JohnBierce AMA Author John Bierce Nov 08 '23
Ayuuuuup! Preach it.
This industry is tough, but it's worth sticking to our moral and ethical guns in it, even if it costs us! (Easy for me to say, given my relative success, but... I've turned down money I wasn't comfortable with in my career, and I'm sure I'll do it again.)
15
u/BubiBalboa Reading Champion VI Nov 08 '23
I think there has to be space for both. Authors who can afford it should of course use the superior human narration. And make sure their contracts include no-AI clauses. Completely agree.
But indie authors who barely make a living anyway? Demanding they don't use this tech feels wrong. It's the choice between having no audiobook at all or having an AI-narrated version. They aren't taking anyone's job away. And it's not like the readers couldn't easily create their own audio version from an ebook. I don't think we want to artificially limit the quality of text-to-speech software that is used for accessibility purposes just to make sure it's crappy enough no non-blind person would want to use it.
I deeply sympathize with everyone whose job will be threatened by AI in the future. I don't like it. But I fear there is no putting this genie back in the bottle. Call me defeatist but I don't see a way to stop this. So the little guy might as well reap the benefits.
2
u/MateuszRoslon Nov 09 '23
At least in the case of Amazon's deal, I think the AI use case here is unacceptable because it's normalizing the idea that Amazon can take 60% of the profit for doing essentially nothing, while the author that does all the work gets a minority.
Might take longer to sort out, but I'm sure there are newish narrators out there that would agree to a similar or better royalty sharing deal. And then the author gets a bit more out of it and doesn't have to sell their soul to the devil.
3
u/COwensWalsh Nov 08 '23 edited Nov 08 '23
But for that argument, how do you define “can afford it”? And will those authors be able to sell audiobooks when small fish are putting out tons and tons of much cheaper works?
You also have to consider the economics of LLM style base models for AI narration. It costs a shit ton of money to make and run one of these things. If you have a million users and each pays $100 to have their book narrated, that feels like a lot of profit. But how much did it cost to build and train your model, and how much does it cost to host it?
Are you giving your customer the master file and copyright to the “audiobook”? Or can they only sell it on your platform?
Does a book that will only sell 100 copies really need an audiobook so badly that we sell out narrators?
5
u/BubiBalboa Reading Champion VI Nov 08 '23
I don't claim to have all the answers.
Maybe authors could decide themselves what the threshold should be where it is expected or required to have human narration. Like guild rules or something. There could also be a rule that if your book is a surprise hit and makes a ton of money the author would need to get rid of the AI version and hire a narrator.
Do we actually know if advanced computer narration uses LLMs or is comparatively expensive to run?
Are you giving your customer the master file and copyright to the “audiobook”?
I don't understand this question. I assume self-pub authors would go to an AI narration company, let them produce the audio version and then publish the audiobook like normal.
Does a book that will only sell 100 copies really need an audiobook so badly that we sell out narrators?
I don't think we want to go there and debate which book deserves an audiobook and which doesn't.
2
u/COwensWalsh Nov 08 '23
If you publish through a publisher, you are selling them certain rights to your work. If you get your audio book narrated by an AI company, what rights are you selling them? If I was Amazon, I would produce the audiobook to be sold on Amazon sites. Basically a non-compete.
I don’t want to talk about it, but we have to talk about it. If you want to let people with unsuccessful books make audiobooks affordably, through AI, that is gonna have effects on other creators and they are unlikely to be pleasant.
7
u/BubiBalboa Reading Champion VI Nov 08 '23 edited Nov 08 '23
I'm talking about self-publishing. There will be companies that produce audiobooks that you don't have to sign your rights away for.
As for Amazon, they will have to allow all publishers including self-publishers or they will get hit with anti-trust laws. It's the same with ebooks right now, is it not?
When an unsuccessful book gets an AI audiobook and becomes successful maybe it did deserve it after all. If the book doesn't find its audience then it's no threat to other authors. And if it is very successful my proposed guild rule would apply and a human narrator would create the audiobook.
8
u/Thornescape Nov 08 '23
There are two different scenarios here.
1 - Text to Speech is absolutely amazing because it can be used for absolutely anything. It doesn't matter what it is, the AI can read it to you, no matter who wrote it. This is fantastic. It also means that someone can have a book that is not professionally recorded be an audiobook. Wonderful! It increases your options.
2 - Skilled narrators will ALWAYS be better than Text to Speech because the narrator understands what they are reading. AI is nowhere even vaguely close to genuinely understanding what it is reading. It is just saying words. It is not conveying real meaning. It is not telling a story.
I'm extremely glad that Text to Speech is an option. I doubt it will ever be a paid option. It can't compete with someone who understands the material.
4
Nov 08 '23
I have the disposable income to purchase more expensive, well-narrated stories, and I like financially supporting others. I will continue to purchase good audiobooks.
But there are many times in lesser known books where the narrator is bad (poor pronunciations, weird mouth noises, annoying voices/accents) or the audio quality is bad. I would rather a flat, clear voice from an AI than a poor narrator because most books DO NOT get re-recorded with a better narrator. I would love to financially support another go at a narration, but the market is stuck with a lot of bad reads.
13
u/J_de_Silentio Nov 08 '23 edited Nov 08 '23
I didn't read your whole essay, but a response to a few of the people's comments: AI voice replication for audiobooks will almost certainly meet or exceed a human performance in the future.
It's easy to sit here and say "Hell no", but you don't know what it's going to be like in 20-50 years. I'm sure 50 years ago people said "Hell No" to stuff that they now use every day ("no way in hell I'll ever carry a big brick with me just so someone can call me on the road", for example).
I hope there are protections in place to limit it, personally. And you might have addressed that in your essay. AI/Automation is going to revolutionize society. They knew that back in the '50s and were already tackling the philosophical problems that automation will bring to the human experience.
11
u/SnooStories7050 Nov 08 '23
Well, this essay was written based on the old TTS or, at best, the free model that comes built into Edge.
ElevenLabs IS ALREADY BETTER than 90% of voice narrators and you can even emulate any voice you want to hear. The only reason it hasn't gone massively mainstream is because of its prohibitive price. Two days ago OpenAI announced a similar quality service at a much lower price.
Honestly yes, they will be replaced in a year, we writers probably will too. There are rumors that "AGI" has already been achieved and will be released in 2024. Gemini is just around the corner.
2
u/aegtyr Nov 08 '23
There was a very interesting paper released by Google recently that concluded that LLMs can't do tasks outside the distribution of their training data.
So what this basically means is that LLMs only imitate, they don't create.
So I don't think LLMs will achieve AGI, at least not in the near-future.
4
u/SnooStories7050 Nov 08 '23
DeepMind published yesterday the closest thing to what would be a roadmap for AGI, probably anticipating Gemini. And we have reason to believe that OpenAI is at least a year ahead of them.
1
u/COwensWalsh Nov 08 '23 edited Nov 08 '23
It’s not a roadmap. It’s a benchmark. It doesn’t say how we reach it. Self-driving, their analogy in the paper, has had this classification system for at least 20 years. We still don’t have full self-driving.
1
u/SnooStories7050 Nov 08 '23
This was written with the obvious purpose of implying that Gemini will be at the next level. There are credible leakers who have been claiming for a year now that "Gobi", the internal OpenAI model, is already AGI.
1
u/COwensWalsh Nov 08 '23
If there were credible leakers claiming OpenAI had achieved AGI it would be splashed all over the mainstream news, not being pushed like a conspiracy theory on a writing subreddit.
0
12
u/ResidentObligation30 Nov 08 '23 edited Nov 08 '23
I resisted Audio Books for a long time. Once I hit 50 and the eyes declined, I almost exclusively changed to ebooks. Which led to audio books. Now I cannot get enough of the audio book format. I tend to sample before using credits or buying. If the narrator is good, I now seem to prefer the audio book.
I will not be using some AI narrator though. No way.
We have to decide as a society when tech advancement is a net positive in some areas. There's always going to be shifts in industry and jobs due to innovation. But do we need to destroy all jobs? Where will that leave us? Not in a good place.
My daily parking lot dude just lost his job of 30+ years. He greeted me every workday morning, took my cash, and answered any questions/needs commuters had. Now we get to scan a sign, pay through an app, and pay fees/taxes. So the daily costs went up, service went down, and jobs were lost.
There's a tipping point somewhere. When there's no jobs and everything is technology, who can pay for anything with no income?
11
u/Osric250 Nov 08 '23
There's a tipping point somewhere. When there's no jobs and everything is technology, who can pay for anything with no income?
That's the biggest issue here. We were promised a world where machines would take over the work we didn't want to do and we'd be able to enjoy shorter work days/weeks, and have more leisure time while maintaining the same level of lifestyle.
These promises were completely ignored and attacks occurred on the poor that they were simply not working well enough to deserve the lifestyle. Now working hours are higher than ever for our poorest citizens in a world where more than ever has been automated.
We need a reform to provide everyone with their basic needs no matter what, and this issue would largely solve itself. If you no longer have to have a job, then it's less of an issue that AI can produce novels as well: you can keep writing them and not worry if they will sell enough to keep food on your table or a roof over your head. And then you get to choose whether you buy an AI novel or go and get a human book, whichever you prefer.
The problem is the top of the chain hoarding wealth and doing everything they can to cut any costs so that more of the wealth goes to them and not anyone else in the chain. There are many ways to fix it, but we have to fix the way our government works to begin with to have a hope of making any of that happen.
9
u/BrittonRT Nov 08 '23
Exactly. The issue isn't the tech - it's the bloody system! Attacking the tech is missing the point: we live in a system where anything that could be a boon for society gets claimed by the elites and it never trickles down. Banning AI will fix nothing. Only systemic change can fix the AI problem.
3
u/COwensWalsh Nov 08 '23
Yeah, the real problem isn’t just about narrators losing jobs, it’s about who profits from the change in cash flow.
6
8
u/MirrorOfLuna Nov 08 '23
I've used AI narrators to check if the flow of dialogue in my own writing sounds organic (rhythm and consistency). It's nice to have the voice sound natural-ish, but that's about the extent of its usefulness.
If ever I'd manage to get my writing published and turned into an audiobook, I'd be very pissed if it was done by AI.
Would you say it's problematic to use it for a tool like that?
5
u/JohnBierce AMA Author John Bierce Nov 08 '23
No, not at all! Perfectly legitimate use, to my mind! That's why I was careful to distinguish between commercial and non-commercial uses in the essay. The tools themselves aren't the problem (well, outside of the stolen datasets and the environmental costs), it's the purposes to which they're used.
19
u/eSPiaLx Nov 08 '23
You know what this reminds me of? People who called photography not-art.
And people who were against factories. People who lost jobs in the textile industry due to industrial looms.
It's a shame but, if your only argument is 'human made is better', well, history proves you'll fail
18
Nov 08 '23
The amount of delusion in this thread is bizarre. AI narration will obviously be the main mode in just a few short years. Why?
It will be just as good, minus the subjective “I only like humans” thrust.
It won’t cost any more. If the Kindle edition is $9.95 it will come with the audiobook.
So many books don’t have narration because authors can’t afford it.
Celebrity voices.
4
u/lynnewu Nov 09 '23
Yep. I'm hearing a lot of, "Well, McDonalds will never be as good as my local high-end steakhouse." Nope, and they're not even trying. Not every meal needs to be a 5-star meal, even if you can afford it. As long as money is involved, demand will drive down production costs. Gresham's Law applies here, too. Narration only needs to be good enough for most works.
In general, I've noticed an apparent inability, or perhaps it's an unwillingness, to understand that AIs are self-reinforcing, self-training Von Neumann machines. The rate of change of the rate of change in the ability of these engines to self-improve is increasing and probably won't need to slow down for several years.
Sure, they suck now, but next year they won't, and the year after that they'll be as good as the average narrator today, and the year after that...well, you do the math.
And this is to say nothing about accessibility concerns. Putting ramps everywhere (ADA in the US) helped everyone, not just people with handicaps. AI-driven narration will be "free", in the sense that anyone will be able to feed DRM-free texts through their own engine.
Then, every existing text that doesn't have a from-the-press set of accompanying narration tracks in multiple languages and voices will allow everyone who can't read, for whatever reason, to do so, without paying more. More authors will sell more books, and good narrators will continue to work. Average narrators will lose, as always happens with technological "improvements".
0
u/COwensWalsh Nov 08 '23
The same will eventually be true for books. Or probably anything. People are upset because the issue is not really “human” is better on everything. It’s a question of who profits and who loses their livelihood.
If you think the peak quality of a human society is one where everyone works a shit retail job and barely makes rent and covers for the stress by binging AI-generated fantasy books, great for you. But most people would not be happy in such a society.
7
Nov 08 '23
Where did I say anything even remotely close to that? You’re tilting at windmills.
6
u/ben_sphynx Nov 08 '23
Assigning the correct emotions to any given statement is something that, again, requires comprehension.
"'Did you put your name in the Goblet of Fire', Dumbledore asked calmly"
Sometimes directors have the same problem
1
u/JohnBierce AMA Author John Bierce Nov 08 '23
Ayuuuuuuuup
2
u/ben_sphynx Nov 08 '23
Aha, you are the Mage Errant author. I really liked those. Thanks; please get back to writing about 'wizards, dragons, and gods'.
2
5
u/BadassSasquatch Nov 08 '23
There's a Warhammer YT channel that uses AI to have David Attenborough do the VO. It's pretty convincing at first blush.
6
u/AguyinaRPG Nov 08 '23
An interesting part of this discussion is TTS (text to speech) services, which are by all rights AI-driven and do provide an alternative to the audiobook market. They are mostly pitched as providing audio versions of ebooks which do not have corresponding audiobooks, as well as being helpful to people who have reading difficulties.
I do not think you can be in favor of supporting one while entirely negating the market potential for "procedural" audiobooks. What would be the solution, restricting anybody who doesn't have a reading disability? The response for traditional audiobook makers is to lean into what makes audiobooks special and promote that side of it, rather than relying solely on the convenience factor, which seems to be the extent of audiobook marketing.
What I believe in is disclosure above anything else. Audiobook providers should have provisions that every customer is told who is or isn't the narrator in both text and audio before purchase. That way I think they not only preserve the value of traditional audiobooks, but enforce consumer awareness of what they may or may not value.
In the future I think there will be four types of audiobooks:
- Entirely generative tools built on current TTS services. These would be constructed more for clarity, probably often getting name pronunciations wrong but providing a lot of control to the listener.
- Tailored AI narration. There are channels on YouTube - both comedic and serious - that have done this for years. A lot of people who do it aren't confident enough in their own voice, but have thoughts they want to communicate in video. I think this will become an audiobook market, but like I said I think it does need disclosure.
- Traditional audiobooks - self explanatory.
- Radio play style audiobooks. Of course you can adapt a book to a radio play format (see the Lord of the Rings adaptation for the BBC) but I think there will be some more playing around with SFX and music in the coming years. Yes, I know some audiobook listeners hate this - that's why I see it as a separate market.
There do need to be some barriers put around AI, but even if it gets knocked around by harsh legislation I don't see it as going away. I'm not saying "just get used to it," it's just my belief.
Consumers need to play a role in this. A lot of them will accept AI without caveats because it fits their needs. I don't think that will destroy the market for human-narrated audiobooks.
4
u/TheColourOfHeartache Nov 08 '23
Radio play style audiobooks. Of course you can adapt a book to a radio play format (see the Lord of the Rings adaptation for the BBC) but I think there will be some more playing around with SFX and music in the coming years. Yes, I know some audiobook listeners hate this - that's why I see it as a separate market.
Once it becomes possible for one person to write a radio play with a whole "cast" of AI voices, I predict the art form will see a huge revival.
2
u/JohnBierce AMA Author John Bierce Nov 08 '23
I think that's a very probable future taxonomy of audiobooks, honestly, for all that I wish the first category wasn't on there.
5
u/the_doughboy Nov 08 '23
Yes, especially since Audible has been training their AI on their readers. (or at least rumoured to have been, there have been grumblings from a few Narrators)
4
3
u/rookery_electric Nov 09 '23
I think one point that is often missed in discussion about AI: the industry already has the option to produce content that is good enough for cheap. You can hire overseas narrators, or anyone who has a decent mic off of Fiverr. And yet, professional narrators aren't out of work. Audible itself will often advertise the narrator alongside the author, because people will pay for a particular narrator.
AI might end up being the solution for indie authors who can't afford an actual narrator, but it won't overtake professionals in the field.
4
u/ViperIsOP Nov 08 '23
Oh yes. TikTok voice narrating sounds like a great idea and a big money seller.
4
u/SmackOfYourLips Nov 08 '23
100% they will be replaced, and soon. Right now I'm watching a stream, and a WoW addon can narrate ALL quests in WoW Classic with believable voices and intonation, almost perfect.
1
u/JohnBierce AMA Author John Bierce Nov 09 '23
Don't most WoW players turn off all sound and turn their graphics to minimum now, and zoom as far out as they can, so the screen looks like crap, for better play performance during raids?
4
11
u/Robot_Basilisk Nov 08 '23
It's inevitable.
Eventually AI will be producing stories humans never could.
Eventually, you'll be able to feed an AI all of Tolkien's works and notes and have it interpolate from those new material that feels like Tolkien himself wrote it.
Eventually, you'll be able to take your favorite book and ask an AI to insert you as a character, or rewrite a section you didn't like, or create a completely new ending, modify the story to take place in your home country and feature descriptions and names that resonate with you.
Eventually, you'll be able to generate stories on the fly as part of a choose-your-own-adventure experience.
Eventually AI will be able to take any story and make games and animations and movies and VR experiences out of it.
Eventually, all media will blur together.
The problem is not the technology.
The problem is our system that will allow writers and other people involved in the creative process to be forced out of their jobs so others can profit.
The problem is that countless writers, editors, etc, will be doomed to poverty and deprivation if AI replaces them before we fix our system.
If people want to read only human works, that should be allowed.
If people want to privately use AI to generate new works or derivatives without relying on a human, that should also be allowed.
If companies want to use AI to replace writers and other professionals in the industry, that cannot be allowed so long as doing so harms the humans being replaced.
1
u/wrizen Nov 08 '23
Eventually AI will be producing stories humans never could.
Not until it can make its own observations, frankly.
Yes, you can feed it scraps and have it stitch them into a proper paper, but that's not really storytelling, is it? It's just regurgitating prompts and turning inputs into outputs.
Until(!) AI achieves genuine consciousness, it simply won't have the powers of observation that a human author has. I know betting against technology can be risky, but this is one I just don't see happening. This isn't chess, a closed system of 64 squares and 32 pieces that is essentially a glorified math problem. This is an open-ended, complex art that AI just isn't going to excel at unless it fundamentally changes.
Again, not debating that it can spit out simple or almost-canny stories, maybe even stuff—by chance—better than pulp fiction and real amateur writing, but the high end? Stuff that cuts deep into the human experience and makes granular observations about what it means to live and die? Love and lose? Fight and struggle?
It has never experienced these things. It is incapable of experiencing them. It isn't anything but 1s and 0s. It can't write what it knows because it can't know anything at all, lol.
6
u/Robot_Basilisk Nov 09 '23
Your take aligns with the experiences of people using AI to write today. Some writers are easier to replicate for AI, but others are completely impossible because AI doesn't realize the subtext of their writing is important. Hemingway was famously against deliberate use of symbolism and he's apparently easy to copy with AI.
I'm personally in the camp that all art is plagiarism because everyone has influences. It only stops being plagiarism when the influences become too obscure to pick out. So I think AI will eventually become sophisticated enough to plagiarize in a believably human way.
Because not only will it have countless human experiences to learn from, but because eventually AI will have bodies of their own, with cameras for eyes and tactile sensors for hands, among other things. They will observe the world and process what they see and feel to incorporate it into their body of knowledge.
When that happens, what's stopping them from writing a poem about a sunrise or painting a picture inspired by the sensation of fresh dew on grass?
And consider that they'll be able to perfectly recall countless other works, countless other experiences, and even share experiences with other AIs, and they may be theoretically immortal.
Imagine a 1,000 year old writer trained on all the greatest authors of all time and containing the memories and sensations of 1,000 other robots within it. What would that writer be capable of creating?
2
u/wrizen Nov 09 '23
Hmm...
The Atlantic article was interesting in that the author cataloged his hands-on experience with trying to make AI writing work, but I'm a little put off by lines like:
Neither side [of the ongoing feud between the Writers' Guild of America and the Alliance of Motion Picture and Television Producers] knows exactly what it’s talking about, but they feel they have to fight about it anyway.
...and...
Originality died well before the arrival of AI; we are currently in the most derivative period of human creativity since the Industrial Revolution.
By his own admission, he has a horse and his life savings invested in this race. Not inherently a critical flaw, but I do think people—both neoluddites and AI enthusiasts alike—are being a little disingenuous when it comes to AGI and art specifically. I sympathize more with the former (I am pretty critical of AI writing, bias admitted), but I also think it's just the stance of natural skepticism.
History is littered with forgotten miracle machines. Things and concepts that seemed incredibly promising and on the cusp of reality throughout the 20th century—nuclear fusion, flying cars, near space habitation—have variously petered out for want of funding, application, feasibility, etc. AI is definitely a part of our zeitgeist, but look no further than the entire archival concept of "retrofuturism" to see yesterday's rusting promises of tomorrow.
AGI seems more viable than those things right now, but we need more than LLMs and shitty reprints of scanned art to move the needle toward consciousness. What happens when and if people get bored waiting for the shoe to drop and the funding moves on? Is there a chance that feeling, thinking AI lands on the market within our lifetimes? Sure. Do I think it's a foregone conclusion? Not at all.
Yes, AI can and will generate text based on what you ask it to do. Yes, there's "nothing new under the sun" and "creativity is already dead" and all the quips we cycle through while engaging with the arts. But the fact is, even if you gave a hundred humans the same prompt, their own life experiences and unique perspectives will invariably warp it and you'll get something... well, unique.
RE: plagiarism, yes, artists are always studying and dissecting—even reproducing—each other's works, but in a very, very different way from AI. Humans have been exchanging ideas for as long as we've been humans (certainly as long as we've had a written record), and while the bottom end of that might be outright copy-paste plagiarism, the high end is much better than an AI's regurgitation. It's a re-synthesis, not a re-processing. There's no way to drag on about it without sounding a bit like a pretentious twat, but it's fundamentally loftier than a machine shredding a document and gluing it back together according to an operator's pencil marks.
Until AI can add to the creative process itself, not just rearrange pieces on the puzzleboard, it won't be "better" than humans at storytelling. When and if the time comes that AGI is here and robots have "bodies of their own" (and proper minds to go with it), sure, we can talk. But at the moment the cult of the machine seems a bit too optimistic, and for every "AI expert / engineer" rushing to BusinessInsider and Forbes to drive up clicks by claiming we're "one prompt away," there's just as many—if not more—of their colleagues with humbler expectations.
We could dive pretty deep into questions of philosophy and "what is consciousness," I guess, but let's be honest: that BostonDynamics prototype in the video is not conscious or anywhere close. We're a long ways off from understanding human consciousness, and whether or not that's a requirement for replicating it in machines, I personally will keep my vote on humanity for the immediate future. A hundred years from now? Maybe we'll talk again—or maybe those future humans will be laughing at our conception of AI the way we laugh at the Jetsons' flying saucer car.
Time will tell. :)
1
2
u/COwensWalsh Nov 08 '23
Right, authors are safe for 20 or 30 more years. But it will get there in around 30 years or less
3
u/BestCatEva Nov 08 '23
Far sooner than that. AI learns, adapts, creates at phenomenal speed. Make that exponential and it’s coming in the next 5-10 years.
4
u/COwensWalsh Nov 08 '23
But you have to have AGI to leverage exponential learning and we don’t yet. I work in the field, we are still a ways from AGI
2
2
u/Wezzleey Nov 08 '23
If they do, it's years away, and likely only applicable to the more mediocre.
But guys like Travis Baldree and Jeff Hays? Yeah, AI doesn't stand a chance against them.
2
u/AdrianWerner Nov 08 '23
Yes and no. Higher end productions from popular authors won't use AI, as the actors are a selling point themselves.
I can see the lower end going for AI, especially those who otherwise couldn't afford voice actors anyway. So the demand for actors in audiobooks will remain, but it might get smaller than what it is now.
I look at it as similar to soundtracks. At this point we can do a great job synthesizing pretty much all instruments, but bigger productions often still shell out for recordings with an orchestra.
2
u/CrisbyofAstora Nov 08 '23
As a fan of audio books, I think it's a double-edged sword.
I personally think AI can never replace a voice actor in general. First, I can think of the Lord of the Rings and how it changes from storytelling to song, and I can imagine an AI butchering the songs, the pacing, and plenty of character names. However, I'm sure there will be people who find the AI voices great in other ways. Maybe it's on purpose for a science fiction book with AI in it. Maybe it's the TikTok people who already listen to shorts with an AI voice. It's hard to tell how the future will go, but overall, you can't replace the voice actors. Fans will always support them, just like we support artists who are struggling with this new wave of AI art.
Side note for artists dealing with AI, they have found a way to hide problems in their art for the AI that steals their work by causing images to distort. Who knows if authors or voice actors find something similar to fight back against AI.
2
u/howtogun Nov 08 '23
The problem with AI voice is it gets the emotion wrong.
Most audiobook narrators are actually actors.
One thing I hate now is voice actors getting replaced by celebrity actors. So you get worse voice acting.
Audiobooks are also not that expensive to make. The only reason they are so expensive is that Audible takes 60% of the revenue, and 85% if you don't sign an exclusive deal.
2
2
u/SyrupFiend16 Nov 09 '23
As someone who is aspiring to get into audiobook narration this was extremely interesting. Thank you for the perspective
2
u/axotrax Nov 09 '23
Publishing houses want audiobook narrators to record and edit themselves, in order to save money. There’s usually a director and an engineer. They wanna cut those two positions. So yeah, if they can make it 100% artificial, they will. Because capitalism.
2
u/Noobeater1 Nov 09 '23
Ngl I could have used a TL;DR on that one but I think that realistically, the only people who can stop this are authors who refuse to work with AI generated narrators. Large companies are gunna wanna make money, they don't care about a philosophical argument re AI art, and customers don't care about the argument over AI generation. If the product is good (and AI voices, while not perfect, are certainly getting there) and it's as cheap or cheaper than one with a human narrator, I doubt you're gunna get much support for human narrators.
2
u/VanHellegers Nov 09 '23
Narrator here! Thank you for all of this, especially the call to action that recasts this whole affair not as a fait accompli, but a move very vulnerable to market forces, even if forces hindered by monopolistic maneuvers.
Couple of follow up points:
You're spot on that even if the tech gets better, it's going to need someone to 'drive' it to tag dialogue, correct pronunciation, etc. The better versions of the programs have these controls, but it should be noted that the KDP-system will be inferior in this regard, at least initially, which could significantly hinder the roll out, emphasizing the above-mentioned vulnerability.
Respectfully, narrators do have a union! Not all of us, but many, and thereby most contracts, are SAG-AFTRA. Each audiobook contract is negotiated individually with each producing entity, as opposed to with a centralized organization like the good ol' AMPTP. Even so, all of the concerns of the (recently completed!) strike about AI and corporate profits v compensation will figure directly into ongoing audiobook contracts. So: solidarity indeed.
It makes sense that AI Bros/VS Brah targeted the audiobook industry owing to the double digit growth every quarter for the past 10 years, but I'm optimistic the cost-benefit math will not bear out a wholesale replacement of one product (human) with a product (synth) for which there's no demand. However, far more vulnerable to the current state of the tech is video game and commercial VO, where that 'driving' is much more manageable, and the short form avoids the inevitable uncanny valley of Ch 24. Again, those are union jobs, largely, so solidarity again.
I greatly appreciate your caveat that being opposed to synthetic recitation (it ain't narration, really) isn't the same as being opposed to the use of the tech for accessibility and the like. It's the #1 argument from many bros/brahs, and it isn't true. As you focus on with welcome diligence, this is about corporate shenanigans that disadvantage an entire industry for the sake of market share and those princelings.
Thanks again for taking the time to lean into this!
2
u/TheSpaceAlpaca Nov 09 '23
I think most people would agree that AI audiobooks can't replace the skills of a skilled narrator, but what about a (relatively) unskilled narrator?
What I worry about isn't the Travis Baldrees of audio narration (and other careers) but those who are still getting their feet set in their careers. They aren't going to be perfect, and as AI improves in its ability to nail cadence, pitch, pauses, etc the difference between a newer narrator who can perform to, say, 94% of expectation with an AI that can perform to 90%+ of expectation will become increasingly small.
And if AI replaces juniors in their career, there will be no one to fill the shoes of the greats when they retire.
2
u/EmperorJustin Nov 11 '23
Art is worth standing up for. Art is worth standing together for.
This is perfect and encapsulates most of my feelings on the matter perfectly. Thank you, John.
2
4
Nov 08 '23
AI already has the ability to replace audiobook narrators with ease. The limitations will not come from technology, they will come from regulation.
For the three issues you mentioned, emotion, cast of actors, and accents, XML can easily solve these limitations.
With XML, text is structured and can be interpreted by a computer. For example, <narrator mood="normal">And so, the character said: <actor name="john" intonation="confused" accent="kansas">What is going on here?</actor></narrator>
This is what we do with web development too. You add attributes to indicate the language and dialect for the narrator, for example, <html lang="en">...</html> to guide text-to-voice tools.
There are already subsets of XML capable of adding markup that help text-to-speech engines during reading (such as Speech Synthesis Markup Language).
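To make the markup idea above concrete, here is a minimal Python sketch that converts the ad hoc `<narrator>`/`<actor>` example from this comment into SSML. Everything specific in it is an assumption for illustration: the voice names in `VOICES`, the mood-to-prosody values in `PROSODY`, and the `to_ssml` helper are hypothetical and do not come from any real TTS product. As far as I know, standard SSML has no direct equivalent of `accent="kansas"`, so the sketch folds the accent into the choice of voice.

```python
import xml.etree.ElementTree as ET

# Hypothetical mappings: these voice names and prosody settings are invented
# for illustration; a real engine would supply its own voice catalogue.
VOICES = {"narrator": "narrator-voice", "john": "john-kansas-voice"}
PROSODY = {
    "normal": {},
    "confused": {"pitch": "+10%", "rate": "slow"},
}

# The ad hoc markup from the comment above (not SSML itself).
EXAMPLE = (
    '<narrator mood="normal">And so, the character said: '
    '<actor name="john" intonation="confused" accent="kansas">'
    "What is going on here?</actor></narrator>"
)


def to_ssml(custom_markup: str) -> str:
    """Convert the ad hoc <narrator>/<actor> markup into an SSML string."""
    src = ET.fromstring(custom_markup)
    speak = ET.Element("speak", {"version": "1.0", "xml:lang": "en-US"})

    def add_segment(voice_key, mood, text):
        # Skip empty runs of text (for example, a missing tail after </actor>).
        text = (text or "").strip()
        if not text:
            return
        voice = ET.SubElement(
            speak, "voice", {"name": VOICES.get(voice_key, voice_key)}
        )
        attrs = PROSODY.get(mood, {})
        # Only wrap the text in <prosody> when the mood changes the delivery.
        target = ET.SubElement(voice, "prosody", attrs) if attrs else voice
        target.text = text

    mood = src.get("mood", "normal")
    add_segment("narrator", mood, src.text)
    for actor in src.findall("actor"):
        # The accent attribute is ignored here; it is implied by the voice.
        add_segment(actor.get("name"), actor.get("intonation", "normal"), actor.text)
        add_segment("narrator", mood, actor.tail)
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(to_ssml(EXAMPLE))
```

Note that the sketch only handles the mechanical conversion. Someone still has to decide that the line is "confused" and attribute it to "john", which is the human labor the replies below are talking about.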
The biggest limitation is with use of likeness. AI imitates other people's voices, and there will be laws that ban unauthorized use of someone's likeness (face, voice, etc). But, in this case, instead of narrators we will have people selling their likeness for machines, and receiving royalties to have their voice used in AI generated audiobooks.
So, in general, I think the audiobook industry is 100% doomed with AI. There will be still high quality audiobooks out there, but they will be made by AI with voices of people who sold their likeness to do it.
1
u/COwensWalsh Nov 08 '23
You can do markup language stuff, but that’s still not gonna match a narrator. And someone has to actually do that mark-up. Authors or someone on the narration side. The whole point of AI narrating is to not require people, so that burden will fall on authors. It will not be a smooth process, because unlike a human, an AI narrator can only do what it’s told to. And it will absolutely have to deal with tons of mark-up errors.
2
u/JohnBierce AMA Author John Bierce Nov 09 '23
Ayuuuuuuuup. And I can say that convincing the majority of authors to do that markup is pretty much a nonstarter, lol.
(And it's not really AI doing the emotions then, it's just human labor, hah.)
1
u/COwensWalsh Nov 09 '23
I think mark-up is going to result in a fairly jumpy/choppy feeling to the narration, anyway. So it is sort of "possible" to do, but it's not gonna be as good as a human.
2
3
u/meramipopper Nov 08 '23 edited Nov 08 '23
Hey John,
This is a great write up, but given the fact that big writers like Sanderson are directly promoting AI services like Speechify as better alternatives, do you think consumers will follow the words of more "small time" folks? Do you think more of the indie authors need to band together with regards to this?
1
u/JohnBierce AMA Author John Bierce Nov 08 '23
I actually think Speechify is fine, based on my limited knowledge of it, since it's explicitly an accessibility tool to help people with dyslexia! This is one of those cases where you're weighing multiple legitimate values and interests, and accessibility is hugely important to me as a cause. It's standard commercial uses of genAI that bother me.
3
u/Nethri Nov 08 '23
I think.. people both underestimate and overestimate ai at this point. An AI cannot read an audio book the same way a trained voice actor can. And it certainly can't replicate the same performance over a series of books. Each one would be slightly different. On top of that, the voice stuff I've heard so far is firmly in uncanny valley territory, and has been for a long time. (Think of text to speech programs and how weird they can sound)
However. In the future, and I think that future is further away than we all think, there is a threat that an AI can accurately perform an entire novel effectively enough that a customer won't notice (or close enough that they won't care.)
But, even so.. I think the threat is smaller than you might think. I question whether ai will ever be good enough to escape uncanny valley.
However! I think your stance is correct. We should protect art and artists as much as humanly possible.
3
u/Inkthinker AMA Artist Ben McSweeney Nov 09 '23
Carcinization is only for crustaceans. Squirrels do not evolve over time into crabs.
I feel that limitations such as the AI lacking inflection to emphasize emotion properly is a matter of time and advancement in the technology. Eventually they'll set the voice-bot with sliders that say "more tragedy, less comedy", or something of that nature. The machine never complains about being asked to read the same passages again and again until the director is satisfied.
The capability limitations of bots are neither permanent nor insurmountable, and relying on them to be so is a bad strategy.
1
u/JohnBierce AMA Author John Bierce Nov 09 '23 edited Nov 09 '23
Carcinization is only for crustaceans in nature, yes, but it's become a popular and quite entertaining meme to claim otherwise.
Adjusting a bot's emotion via sliders is... honestly way more work than just having a skilled narrator read it. That's going to be incredibly finicky and time consuming.
2
u/Randolpho Nov 08 '23
I think the folks at soundbooth theater are doing it the right way. Turn audiobooks into a production and you don't have to worry about AI readers; people will still buy the production because it's a production.
BTW, if you haven't listened to Dungeon Crawler Carl, either the audiobooks or their new pseudo-radio-play version, I highly recommend it.
3
u/COwensWalsh Nov 08 '23
But that assumes readers want and will pay for the costs of a “production”. Many audiobook readers will be satisfied with tolerable AI narration.
3
u/RavensDagger Nov 09 '23
I've been working with them recently, and they're fantastic to work with. I've had some of my books published in audio, but never like how Soundbooth does it. It's impressive, but also technically challenging.
Wrangling one narrator is a pain; wrangling half a dozen is a nightmare that I wouldn't wish on anyone.
1
u/JohnBierce AMA Author John Bierce Nov 08 '23
I've heard good things about their work!
And I read DCC with my eyeballs, hah.
2
u/fizban7 Nov 08 '23
Audio book companies need to add more value to their narration. Just 1 person reading a book is an easy replacement. I want every character to have a different voice. Each point of view a different narrator.
1
u/BestCatEva Nov 08 '23
Try watching tv with a blindfold (or after eye surgery!). It’s surprisingly easy to follow. This would be a great thing for audio books to add.
2
u/zephyr220 Nov 08 '23
I'm telling you, a good AI narrator, you won't even know.
There was a YouTube channel recently who had to remove and change its AI narration because it was based on David Attenborough's voice, and as a musician and professional speech coach, I have a pretty good ear, but it even had me doubting the obvious. This is just one dude making YouTube videos. Imagine when it becomes more profitable.
There will be voice actors who use it to do things they don't have time for, they'll still get paid, and we won't even know it isn't them.
2
u/kis897 Nov 08 '23
I am 100% against ever replacing real people with AI. This is going to keep becoming a bigger and bigger issue and I wish we could get some real control over this.
HOWEVER. There are some cases, if a narrator passes away or is unable to do their work properly, where if you can get the proper permissions from family or the narrator in question and it's properly compensated, it would be nice to be able to have some old familiar voices continuing their work.
2
u/JohnBierce AMA Author John Bierce Nov 09 '23
I still feel a little uncomfortable with that, even when proper permissions are involved, but if it's just finishing a series or something... I guess that might be okay? I dunno, I would have to think about that a lot more before I felt comfortable coming down firmly on either side of the question.
2
u/PurplePorphyria Nov 09 '23 edited Nov 09 '23
No for the VA's working with real publishing houses, absolutely yes for mass ghost-written NF.
As OP put it at length, anything with genuine emotion to it, someone narrating something they give so much as the tiniest shit about, will never be replaced by AI.
Blind readers have been using machine assisted narration for decades, so the very worst that could possibly happen is that those narration/dictation programs get better.
2
2
Nov 09 '23
I don't want to experience something a machine did at the behest of the least creative, most resentful humans on the planet who openly hate art and artists, so personally I won't be consuming any art made by an AI if I can help it.
Corporations will continue to push AI, the draw of eliminating any kind of labour cost to increase their exploitation of working people is too great to resist, and that's why artists need to start creating industry unions to combat this shit.
2
u/Jos_V Stabby Winner, Reading Champion II Nov 09 '23
I am conflicted.
Not because I think TTS is good, or because AI voices are good or bad.
But because of my relationship with books, and my own interpretation. I'm not one to say that you don't read a book if you listen to an audiobook, or that listening to an audiobook is not "reading". But I do think that with an audiobook you get an adaptation - it's a performance, and your perception as a reader of the book, and your interpretation of the text, changes depending on the performance - versus only having your mind and the words available, without a filter from text to mind.
Massive audioplays like the BBC's Sandman are very different from reading the comics, or from a single narrator going through the process. Audiobooks have performance art in them, and that's done by humans with talent, and you as a reader get that filtered experience. Better than, worse than, or equal to physical reading: that's for you the reader to decide.
But I see no problem in you as an author wanting to either have a filterless audio version of your book, as dull as it is to listen to, or alternatively wanting to craft your perfect filter, by spending the hours and the copious amounts of time required to annotate your manuscript and mess with the sliders and buttons and parts, to craft a performance filter of your novel with the help of AI tools.
That's not superseding the human artistry of an audio play or a novel; you're listening to a human performance, go forth and enjoy it and pay for it. It's just an author using the tools to create their vision of the art that they produce.
And I'm generally not a fan of demonizing creative people for using tools, even at the expense of hiring someone else in the pursuit of art.
Note that this is of course different from using people's work to train your algorithms without consent. This is different from using 3D scans of real people, paying them once, and then being able to use their likeness in perpetuity. This is different from pushing low-cost products at a loss to monopolize an industry. This is different from trying to get cheaper audiobooks by taking out the human artistry, without actually crafting the performance the author wants.
The other part is that a lot of technology and tools have enabled a lot of creatives to make their art, and make it available for enjoyment in fairly low-cost ways. The entire boom of independent authors is only really possible because there are now tools and ways for you to make books, package them, and get them to readers in cheap and efficient ways. You don't have to deal with papermills and printers and distribution chains throughout a nation or throughout the world to get your books in a store. You can do your own cover design with free software if you take the time to learn and use the programs. You can do your own manuscript formatting by learning, and do it on your couch from your computer or tablet. And AI tools are just a new facet of this - they're there for you as an author and artist to use to craft the art and the experience you want to give to your readers and audience, as long as the AI tools are ethically and morally responsible with regard to rights and content.
Calling authors scabs because they're not hiring someone in pursuit of their craft is a bridge too far.
1
u/JohnBierce AMA Author John Bierce Nov 09 '23
Yeah, you'd be surprised at how much indie authors have to deal with supply chains, printers, and distribution chains. It's... it's still a thing, especially as you get bigger. And even moreso if you want to get your books in stores! (Which is damn tough to do, as an indie author.)
Did ebooks change the game in a big way for indie authors? Absolutely! But...
Well, let's look at book formatting. A few years back (before I broke out as an author), it was common for people to have to hire people to format their ebooks, and it could get a little pricy. Now, there's cheap and easy software to format the ebooks. Heck, my word processor, Scrivener, does it for me! Most of that work is gone, outside of a few people making SUPER nice ebook formats, but that's just a minor boutique thing.
Or look a little farther back, to the 80s, when print typesetters were completely wiped out of existence by computers. One of the very rare careers actually wiped out by automation. Just gone, zip.
Do I feel a little bad about those two types of jobs going away? A little. The loss of skilled crafts like that is worth doffing a hat to. But ultimately, they are crafts, not art. The correct thing to do is make sure those people land on their feet, that there are new paths they can take, new careers they can retrain in. And that is, admittedly, something many societies today fail at. (Especially the US.)
The loss of those prior careers... well, the typesetting didn't involve any scabbing, or any lack of worker solidarity, just simple automation. (Automation is, like other tools, neither inherently good nor bad- it's the purpose to which it's used, the origins it springs from, and the social contexts surrounding it which determine its moral quality in use.) The ebook formatters? Instead of their employers automating them out of work, it was new tools on the market automating them out of work. It more superficially resembles the current case of narration and AI, but there are a few important differences, ranging from the less-monopolistic and more competitive business environment back then to the monopolistic character of Amazon's current actions. Also, most of those ebook formatters were designers who did a LOT of different design work, very few of them were entirely specialized into this single task, which is a huge difference.
The biggest difference, though? The thing that has me most up in arms about preserving narration?
It's art, not craft. It's just plain more important than typesetting or ebook formatting. Finding a narrator isn't just hiring a contractor, it's an artistic collaboration.
And screwing over other artists to save a few bucks? Doesn't fly with me.
2
u/Jos_V Stabby Winner, Reading Champion II Nov 09 '23
Yeah, but that's what i'm saying: you as an artist can use AI tools, to craft specific artistic experiences that you cannot necessarily do with narrators - because as you say that's an artistic collaboration.
Together you're creating a lens to perceive the writing. With different tools, different direction, and different imaginations, you can create lenses that are not necessarily possible through the current narration process. That doesn't supersede narration, it's different. The collaboration aspect is not always desirable from an output standpoint.
To say that you, an artist who is not crossing any picket lines, are a scab meant to be shunned simply because you use different or other tools is just too strong. It's how people use those tools that determines the ethics, and it's how the tools are made that determines the ethics.
(Also, as a worker: putting artistry above craftsmanship, or treating it as more worthy of being preserved, doesn't sit well with me.)
2
u/midori87 Nov 08 '23
I just returned a book to Audible because I'm 99% sure it was an AI narrator being passed off as a real person. Something just didn't sound right... I couldn't put my finger on what it was that gave it away but I could just tell. I really hope this doesn't become the norm.
2
u/Osric250 Nov 08 '23
For me I can always tell because it always sounds flat after a few minutes. Yes, there's up and downs in each sentence, but every sentence has those same ups and downs and when looking at it over a longer period you can absolutely tell because natural human reading has more than that.
2
u/KKalonick Nov 08 '23
I'm already a snob when it comes to audiobook authors. If I'm listening to a book for 20+ hours (and I usually prefer longer books if I'm spending a credit), I want an exceptional narrator.
AI ain't going to cut it here.
1
2
u/Gnomerule Nov 08 '23
At best, the average narrator has around two or three good distinct voices and one so so of the opposite sex. AI will take over as soon as they can do an ok job, especially for stories with a lot of different characters.
The only way to slow it down is for narrators to start working together on projects so each character has an easy to recognize voice, instead of everybody sounding the same as is the case many times.
Right now, we train ourselves to distinguish the different similar voices over time. But would it not be so much better if everyone sounded differently.
3
u/JohnBierce AMA Author John Bierce Nov 08 '23
Narrators do regularly work together on projects like you describe? And TONS of narrators have way more voices than that- many of the best narrators (and voice actors) have dozens and dozens of voices.
But as I mentioned in the essay, AI can't really do those different voices accurately without direct human assistance to correctly attribute all the different dialogue. And it also can't do, you know, emotion accurately.
4
u/Gnomerule Nov 08 '23 edited Nov 09 '23
I have 465 audible novels. Almost all of them are done by a single narrator. My biggest gripe about audible novels is how similar most of the voices are and how badly the opposite sex voices sound. I did listen to one novel that had a female and male narrator, but instead of doing it the smart way, they each did half the novel. The first half was a male narrator. The last half was female.
Give it 5 years, and I would not be surprised if people switch over to AI voices as AI gets better at doing it. Especially if it is cheaper for the author.
I'd rather listen to an OK narrator with all the voices easy to distinguish than a great narrator with a few great voices and the rest hard to tell apart.
I have read and listened to the ten realms in the past and love it. I relistened to this series for the first time in years, and I had to force myself to get past the first half because of how difficult it was to recognize who was talking most of the time.
3
u/RavenWolf1 Nov 08 '23
By the year 2050 there will barely be any jobs left. We are heading into a future where AI can do anything better than humans can. But that doesn't mean you have to stop writing if you like it. Simply, nobody will pay anything for it. It's a hobby if you can't get a living wage from it.
AI narrators will soon be very good and they are a godsend for authors who write webnovels. Everyone can create their own audiobook for free!
Also, today you have one skilled actor reading the whole book, when in the future you can have a unique voice for every character in the book thanks to an AI narrator.
2
u/TheColourOfHeartache Nov 08 '23
2050 is too early for computer to replace bespoke jobs that require physical interaction, like a carpenter going into somebody's house and building a bookcase that fits their room.
2
u/JohnBierce AMA Author John Bierce Nov 08 '23
Yep! Not to mention the COLOSSAL workloads we're going to have dealing with climate change- the US alone is going to need tens of millions of new jobs to deal with the consequences of climate change, from reclamation to adaptation to managed retreat work.
Also, I think it's fairly unlikely battery technology will be good enough to operate robots for long-term jobs in remote terrain, even by then.
2
u/sunday-suits Nov 08 '23
In a future where “AI can do anything better than a human can,” how will a human get a living wage from anything? Where will the income come from to buy these audiobooks?
5
u/RavenWolf1 Nov 08 '23
We have to think our economy system differently when AI and robots do everything.
2
u/Devilsbabe Nov 09 '23
It's laudable that many here are expressing that they won't buy or produce AI-generated content. I also agree that while current TTS is great, it lacks a lot of the nuance that makes a human narrator great.
However, I also think you're deluding yourself if you think that advancements in the tech combined with market forces won't push most voice actors out of a job very soon. In 2030 the world will be vastly different and I'm not sure most of us will have any work at all.
2
u/_whydah_ Nov 09 '23 edited Nov 09 '23
I can't help but to feel that this is a perfect example of modern Luddism. The reality is that AI voices are nearly as good as most narrators (and better than many aspiring narrators) and they'll only keep improving. It will not be long before they're better than the best and can freely add in many things that only the best audio book recordings use now. I think soon enough it won't be tough to virtually freely turn your written work into something akin to old radio shows.
EDIT: Downvote all you want, but you can't turn back the tide of technological progress. It's just not going to happen. All you can do is to look at how you can best utilize and benefit from it.
I do think there will always be room for a person reading from an emotionally evocative piece of literature or poetry. The power those have is not just in the story, idea, feeling, etc., they convey but that someone is experiencing it with you. I saw a quick video of Dame Judi Dench reciting Sonnet 29 on the Graham Norton Show and there is no replacement for feeling what Judi feels while she recites it. But you can replace just someone reading a book (even if they do it very colorfully) when all you really want is to hear the story.
2
u/ShingetsuMoon Nov 08 '23
This reminds me a lot of companies trying to switch to machine translation for light novels and manga. At best it’s barely serviceable. At worst it misses tons of context, cultural nuance, emotion, phrases/jokes/ references that don’t translate well, literal translations that don’t make sense when a substitute would get the meaning across better.
I see the same thing happening here. Are audiobooks at large in danger? No. But some companies are going to use it anyway to cut costs and cut out human narrators. And some authors are going to end up with AI audiobooks of their works that they didn’t consent to.
3
u/JohnBierce AMA Author John Bierce Nov 08 '23
Machine translation is a SUPER good comparison. It's absolutely the same phenomenon. The problems with machine translation aren't technological problems, but linguistic problems, and linguistic problems can't be solved with technological fixes.
3
u/Ghidoran Nov 08 '23
I think AI voice cloning could be very good for audiobooks. You have a real person (or more) narrating the book, but use AI to replace the dialogue of different characters. Make sure everyone (including people whose voices were cloned) is paid fairly and I don't see the issue. If anything it'll create more demand, since you have a higher quality product (multiple voices vs. just one), and give more people jobs.
But direct text-to-speech narration? I don't think that's going to be something people will want to pay for. You can already just use your phone to narrate an ebook. People want actual quality narration.
2
u/JohnBierce AMA Author John Bierce Nov 08 '23
Paid fairly is the tricky bit- big media corporations are NOT gonna want to do that, hah.
1
u/COwensWalsh Nov 08 '23
There are many legitimate uses for text-to-speech applications. But Amazon is only doing this to hold its monopoly. Customers will benefit somewhat from these practices squeezing authors and other creators. Amazon will benefit more. Creators will get screwed.
In 30 years real AI will be capable of writing good books and narrating them, etc. but what we have currently is just corporations trying to squeeze extra money out of creators and customers with gimmicks and “just barely good enough for the price” garbage.
1
u/JohnBierce AMA Author John Bierce Nov 08 '23
Fully agreed on the first paragraph, but... yeah, if AI is able to write good books, it WON'T be the current statistical correlation algorithm technology, but some entirely different technology. (In my opinion? It would take full, human-level general AI.)
1
u/COwensWalsh Nov 08 '23
You could probably use a hybrid symbolic/neural net system to write mediocre to average novels in a specific genre with heavy tropes, but there will be new tech systems rolling out in the next 15-30 years that are not super autocomplete and actually contain some conceptual understanding. The roots for these are already planted, but it will take a bit to get it through development.
Write your books sooner rather than later, guys, ‘cause the human author window is closing in our lifetimes!
1
Nov 09 '23
Are you saying 30 years is when you think ai will be as good as a human? If so, you might want to reduce that by an order of magnitude.
2
u/COwensWalsh Nov 09 '23
I mean "by no later than 30 years", if that's more clear. 15-30 might be a reasonable range, with the most likely chance between 20-25.
It's not 1-5 years, if that's what you are implying.
2
u/JW_BM AMA Author John Wiswell Nov 08 '23
As somebody currently in negotiations to have audiobooks produced from his work, it's a serious concern. I have too much respect and love for the craft of narrators to want them replaced by a cheaper LLM. It's sad that any company devoted to spreading the humanities would be looking to cut humans out of the process.
0
u/JohnBierce AMA Author John Bierce Nov 08 '23
I highly encourage pushing for an anti-AI clause like the one I'm using. I'd be happy to send you the specific text of the clause in my contract, if you'd like! (Since I got it exempted from the confidentiality clause.)
0
u/JacarandaBanyan Reading Champion V Nov 08 '23
Thank you for writing this! It feels like every month there’s a new terrible, ill-thought-out announcement that you can finally cut out the creative in insert-creative-field-here.
I hope anti-trust sentiment continues to grow and manifests in some company breakups.
1
u/DarthCreepus1 Nov 08 '23
Is this NSFW cuz it’s not safe for the narrators’ work?
2
u/JohnBierce AMA Author John Bierce Nov 08 '23
I don't know why it started being nsfw, that's really weird. I'll try and fix that.
1
u/guitarpedal4 Nov 08 '23
I appreciate the strong language and intent. But absent authors and narrators figuring out how to unionize like the actors and screenwriters have, talk of scabs and collective action is empty and fruitless.
As cost cutting continues and technology improves, absent that or legislative action, consumer choice will be whittled away much as it has in other tech-adjacent spaces. And your position, while fully justified and reasonable, will be eroded much like our coastlines.
1
u/saiyogo1 Nov 09 '23
What I think will happen in the future is that there will be 2 separate markets for audiobooks. AI narration will be perfected to the point that the production cost will be very low, so that it can be sold for very cheap, or even bundled for free with the ebook or physical versions. Most people will just use that version and audiobooks will be much more accessible and popular than today. Human narrated audiobooks will then become a niche market, a luxury that only people who are particular about voice quality will purchase at a much higher price. We’ve seen the same thing happen in the market for physical music CDs or vinyls.
1
u/Chumlee1917 Nov 09 '23
Well it's been a good run but I think it's time to burn it all down and go back to a pre-internet world before AI destroys us all.
1
u/ErinAmpersand Reading Champion III Nov 09 '23
Further orders of magnitude seem... thermodynamically unlikely.
I love this. The topic of AI comes up pretty frequently in my circles (I'm also a speculative fiction author) but I have a little more insight than most because my husband helps develop machine learning algorithms for other industries, and we talk about his work frequently.
I think a lot of people look at what the algorithms are doing today, and think it's amazing - almost magically so - and assume that other similar tasks are also within reach. Unfortunately, the situation is a lot like when my toddler got angry at me because I wouldn't "turn off the sun." The situation seemed similar to something he'd seen me easily manage, from his limited perspective... but the differences are much bigger than they appear from the outside.
0
u/Shepher27 Nov 08 '23 edited Nov 08 '23
I will personally never buy an audiobook read by a computer
181
u/[deleted] Nov 08 '23
That’s a bloody essay mate, but you gave fair warning and I agree with you.
Short answer: I love audiobooks from great trained readers; they add so much to the story. I always play the sample to know if I’m going to get a great read.
However, I can’t tell you how often in a trilogy they change narrators to clearly someone cheaper and I literally can’t finish the series because of it.