# Algorithm predicts book success based on word usage



## Devor (Jan 9, 2014)

Saw this this morning.

Will Your Novel Be a Best Seller? Ask This Super Accurate Algorithm

I don't have a comment, but I'm sure the rest of you do.


----------



## Mythopoet (Jan 9, 2014)

Just one: lol.


----------



## Twook00 (Jan 9, 2014)

> So, what makes a best-seller? There are a few key findings, according to the researchers:
> 
> Successful books make heavy use of conjunctions—like "and" and "but"—as well as large numbers of nouns and adjectives.
> Unsuccessful works include more verbs and adverbs, explicitly describing actions and emotions—like "wanted," "took" or "promised."
> Verbs in successful books more commonly describe thought processes—like "recognized" or "remembered."



So essentially it is distinguishing between well written books and poorly written books?


----------



## BWFoster78 (Jan 9, 2014)

There ya go.   Proof that you shouldn't use adverbs.


----------



## AnneL (Jan 9, 2014)

Wow, I overuse "recognized" and "realized" and "remembered." Guess I'd better start shopping for my mansion.


----------



## BWFoster78 (Jan 9, 2014)

AnneL said:


> Wow, I overuse "recognized" and "realized" and "remembered." Guess I'd better start shopping for my mansion.



Congratulations on your future success!

I'm editing as we speak to add more of such verbs.


----------



## T.Allen.Smith (Jan 9, 2014)

BWFoster78 said:


> There ya go.   Proof that you shouldn't use adverbs.



Finally....validation!


----------



## Ireth (Jan 9, 2014)

BWFoster78 said:


> There ya go.   Proof that you shouldn't use adverbs.



I guess the Harry Potter books are an exception to the "no adverbs" rule, then?  (Seriously, they're all over the place. :/)


----------



## skip.knox (Jan 9, 2014)

I'm so excited. Now I have a guideline for making my book -- pardon me, "writings" --  just like other writings. I'm eager to get to work on my writings so as to incorporate discriminative unigrams, clausal tags, and constituent tags. Especially constituent tags, uh huh. I was also fascinated, as I'm sure everyone was, to learn that academic writing differs from fiction writing. I have always suspected this but now I learn it's true on account of all them charts. Numbers are truth.

What's really needed, folks, is an algorithm that can predict if my writings is gonna get picked up by an agent. Stylometric *that*!

Thanks to @Devor for the reference!


----------



## Mindfire (Jan 9, 2014)

Hmmmm. Apparently they only used books that have fallen into the public domain. In other words, old and outdated books. How well does this algorithm stack up against _modern_ tastes? Maybe they should refine the algorithm using data from more recently published books.


----------



## T.Allen.Smith (Jan 9, 2014)

Mindfire said:


> Hmmmm. Apparently they only used books that have fallen into the public domain. In other words, old and outdated books. How well does this algorithm stack up against modern tastes? Maybe they should refine the algorithm using data from more recently published books.



I can't believe someone is actually taking this seriously....


----------



## Svrtnsse (Jan 9, 2014)

Mindfire said:


> Hmmmm. Apparently they only used books that have fallen into the public domain. In other words, old and outdated books. How well does this algorithm stack up against _modern_ tastes? Maybe they should refine the algorithm using data from more recently published books.



Yes, probably.

There's a note at the end of the article about how unsuccessful books on Amazon were also rated as likely to be unsuccessful by the algorithm. However, I would say it's safe to assume that what's considered good/successful writing changes over time (right?).
Would this mean that the definition of "good" changes while the definition of "not good" stays the same?


----------



## Svrtnsse (Jan 9, 2014)

T.Allen.Smith said:


> I can't believe someone is actually taking this seriously....



There's serious and then there's serious.
It's interesting as an experiment. If it's possible to estimate with such high accuracy whether a book was successful or not by simply analysing the words, it probably tells us something.

I think that what I'm taking away from it is that there really is some kind of point to all these "rules" about writing that are being bounced around, both here and elsewhere. There's a point to making sure your writing is easily readable; it's not just grammar snobbery.


----------



## T.Allen.Smith (Jan 9, 2014)

Svrtnsse said:


> If it's possible to estimate with such high accuracy whether a book was successful or not by simply analysing the words, it probably tells us something.



Seems to me that having readers enjoy your work is a much better gauge. To me, these programs that try to boil success down to some measurement are beyond ludicrous. 

If art success could be reduced to an algorithm, we'd have THE blueprint for success. From there, it'd just be a paint by numbers. What a fabulous museum that would be...  



Svrtnsse said:


> I think that what I'm taking away from it is that there really is some kind of point to all these "rules" about writing that are being bounced around, both here and elsewhere. There's a point to making sure your writing is easily readable; it's not just grammar snobbery.



The point to rules is to use techniques and methods that complement the style you're trying to achieve. Being easily readable is only one facet of an overwhelming number of variables. Seems to me this is why people force computations onto art. Outside of amusement alone, I see no practical value.


----------



## psychotick (Jan 9, 2014)

Hi,

There's only one algorithm that matters to our selling, and that's Amazon's sales algorithm, which determines how our books are placed on advertising lists, ranked and seen. And it doesn't care about our adverb usage.

Cheers, Greg.


----------



## Svrtnsse (Jan 9, 2014)

T.Allen.Smith said:


> Seems to me that having readers enjoy your work is a much better gauge. To me, these programs that try to boil success down to some measurement are beyond ludicrous.



Yes, clearly. I'm not saying it tells us everything. I'm saying it tells us something. 

> If art success could be reduced to an algorithm, we'd have THE blueprint for success. From there, it'd just be a paint by numbers. What a fabulous museum that would be...




T.Allen.Smith said:


> The point to rules is to use techniques and methods that complement the style you're trying to achieve. Being easily readable is only one facet of an overwhelming number of variables. Seems to me this is why people force computations onto art. Outside of amusement alone, I see no practical value.



I didn't mean to give the impression that I thought sticking to the rules was a guaranteed recipe for success. However, just faffing about without knowing what you're doing is unlikely to get you anywhere at all - unless you're extremely talented and/or lucky. 
There are always exceptions to all things.

I don't think this algorithm has the recipe for success. It may have some interesting points when it comes to measurable language statistics for successful vs unsuccessful books, but other than that, it's more amusing than anything else.

The question asked, though, whether the results would change if the system were trained on modern literature instead of on classics from Project Gutenberg, is still interesting (to me). I'm a bit of a maths geek and I find this kind of thing interesting.


----------



## T.Allen.Smith (Jan 9, 2014)

Svrtnsse said:


> Yes, clearly. I'm not saying it tells us everything. I'm saying it tells us something....
> 
> I didn't mean to give the impression that I thought sticking to the rules was a guaranteed recipe for success. However, just faffing about without knowing what you're doing is unlikely to get you anywhere at all - unless you're extremely talented and/or lucky. There are always exceptions to all things.  I don't think this algorithm has the recipe for success. It may have some interesting points when it comes to measurable language statistics for successful vs unsuccessful books, but other than that, it's more amusing than anything else.


I'm just poking fun at the algorithm idea, Svrtnsse. I'm not really trying to argue for or against.


----------



## Svrtnsse (Jan 9, 2014)

T.Allen.Smith said:


> I'm just poking fun at the algorithm idea, Svrtnsse. I'm not really trying to argue for or against.



Fair enough. I thought you thought I seriously thought... etc...


----------



## Devor (Jan 9, 2014)

The reactions have been fun to read.  It did not disappoint.


----------



## danr62 (Jan 9, 2014)

Great, now I can hire someone to write a program that will create bestsellers on autopilot.


----------



## Svrtnsse (Jan 9, 2014)

...and on that note: A Computer Program Made A Game By Itself. Well, That's Unsettling.
Admittedly, I didn't try it out so I don't know how good it is, but still.


----------



## ThinkerX (Jan 9, 2014)

Like I told a poster here a while back:

Soon, on the shelves of your local bookstore, you will see:

'Fantasy Epic' by 'A. Computer'


----------



## Svrtnsse (Jan 9, 2014)

ThinkerX said:


> Like I told a poster here a while back:
> 
> Soon, on the shelves of your local bookstore, you will see:
> 
> 'Fantasy Epic' by 'A. Computer'



I can see that happening - except there will be a name that sounds like it could be a real person. People won't want to read a book written by a computer, even if it's technically better written than something written by a person.

Then a while later, maybe a generation of readers or so, it will be commonplace. People will be able to customise and download their own epics generated by their favourite algorithms. You'd be able to tune the amount of humour, action, romance, violence, sex and all kinds of other things to get a book that suits your own personal preferences.


----------



## Penpilot (Jan 9, 2014)

Eureka... I have emailed out my manuscript that follows the algorithm to the letter to all the big name publishers, accompanied by a copy of this article. I can't wait to get my millions and my book tour with appearances on Leno and Conan. I'm going down to the car dealer right now to pick up a Ferrari. What do you guys think, red or blue? 

OMG... I've already gotten a reply with a title suggestion, _When Hell Freezes Over_. Hmm... seems like I'll have to fight for one of my original titles, _One Born Every Minute_ or _Counting Before Hatching_. 

Mark this day folks. This is my first step to becoming rich and famous.


----------



## Philip Overby (Jan 9, 2014)

I'd read a book written by a computer just to say I've read a book written by a computer. Who knows if I'd enjoy it or not.


----------



## psychotick (Jan 9, 2014)

Hi,

I think you already have. Isn't "The Phone Book" written by computers? Though come to think of it, it is a best seller!

Cheers, Greg.


----------



## AnneL (Jan 9, 2014)

Svrtnsse said:


> Then a while later, maybe a generation of readers or so, it will be commonplace. People will be able to customise and download their own epics generated by their favourite algorithms. You'd be able to tune the amount of humour, action, romance, violence, sex and all kinds of other things to get a book that suits your own personal preferences.



Now see, this would be a good story.


----------



## BWFoster78 (Jan 10, 2014)

> I'm going down to the car dealer right now to pick up a Ferrari. What do you guys think, red or blue?



Penpilot,

This is an absolutely ridiculous response.

Blue?  Really?  You'd buy a BLUE Ferrari?


----------



## psychotick (Jan 10, 2014)

Hi,

Honestly a Ferrari? When everyone knows that Lamborghini was built specifically to beat Ferrari. You should at least aim a little higher!

Cheers, Greg.


----------



## Svrtnsse (Jan 10, 2014)

Cars?

I'll get me a dragon. I'll buy me an island in the Pacific, set up a state-of-the-art research facility in an underground cave and have the world's greatest scientist build me a dragon from genetic samples of dinosaurs and the latest in nano-bio-tech.


----------



## Devor (Jan 10, 2014)

Well, I guess this proves it.  I use too many verbs and adverbs and won't give them up . . . so I guess I'll . . . just - quit writing, now.

:frown2:

More seriously, I posted this so I suppose I should say that I'm disappointed.  Their scientific algorithm boils down to little more than an elaborate word count.  They couldn't even look at, say, sentence length and structure?  Or even the items you already see calculated on Pro Writing Aid?

Honestly, a good application of market research and what's come to be called "big data" could tell us so much about the industry, about who buys what books and why and when.  But in every survey and study I've stumbled upon, I've seen very little evidence that any of the principles of modern market research have hit the publishing industry.  I guess that's because the industry is in the process of fragmenting, when big data takes consolidated resources to implement.  But still, my disappointment is extreme.


----------



## BWFoster78 (Jan 10, 2014)

In all seriousness, Devor, I, too, am disappointed.

While I don't necessarily think that this type of analysis can provide a "paint by numbers" path to success, it could have provided some kind of useful information.  Instead, the paper seemed pretty worthless.


----------



## Penpilot (Jan 11, 2014)

BWFoster78 said:


> Blue?  Really?  You'd buy a BLUE Ferrari?



To match my blue tuxedo.



psychotick said:


> Hi,
> You should at least aim a little higher!



B-B-B-But my mail order bride seems to like it. Got her on the same day as the car. Three more payments for the car. Five more for the bride and to bring over her aunts and uncles and parents and siblings. I also got a couple more title suggestions for my to-be-award-winning book, _We Have A Restraining Order_ and _Please No More_.

All right I'll stop now. 

On a more serious note, and actually dealing with the original post: writing is art, so an algorithm that predicts a successful book is in essence predicting what successful art is, or at the very least what popular art is. Yeah... good luck with that.


----------

