There are some people who spend their lives determining what makes an R-rated film and what makes a PG-rated film. And, of course, there are all sorts of factors. But what if you took the visuals out and compared just the dialogue? What words appear in R-rated films but not PG-rated films? And vice versa?
I used the Internet Movie Script Database and several computer programs to build a list of the most common words in the 519 R-rated films it had scripts for, and the most common words in its 332 PG-rated film scripts. Obviously the majority of the words contained in these scripts are pretty unsurprising. The most common word in both is the, which is also the most common word in the English language. Where it starts to get interesting is where the proportions differ.
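For the curious, here is a rough sketch of how a comparison like this could be put together. It is not the actual code behind these numbers. It assumes each script is already loaded as a plain string, that a word counts once per script no matter how often it is said, and that "disparity" means the gap in percentage of scripts between the two ratings. The function names and the tokenising rule are made up for illustration.

```python
import re
from collections import Counter


def document_frequency(scripts):
    """Return, for each word, the fraction of scripts it appears in."""
    counts = Counter()
    for text in scripts:
        # Assumption: a word is a run of letters or apostrophes, lowercased.
        # Using a set means each word counts once per script.
        counts.update(set(re.findall(r"[a-z']+", text.lower())))
    return {word: n / len(scripts) for word, n in counts.items()}


def rank_by_disparity(r_scripts, pg_scripts):
    """Sort words from 'most R' to 'most PG'.

    Assumption: disparity is the R fraction minus the PG fraction.
    """
    r_freq = document_frequency(r_scripts)
    pg_freq = document_frequency(pg_scripts)
    words = set(r_freq) | set(pg_freq)
    gaps = {w: r_freq.get(w, 0.0) - pg_freq.get(w, 0.0) for w in words}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


# Toy usage with invented one-line "scripts".
ranked = rank_by_disparity(
    ["the damn gun", "damn the ship", "the damn damn"],
    ["the ship", "the tower"],
)
print(ranked[:3])
```

Words at the top of the resulting list lean R, words at the bottom lean PG, and words like the that turn up everywhere land in the middle with a gap of zero.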
Some differences you’d expect – the word fucking has the greatest disparity, occurring in 80% of R scripts and only 33% of PG ones. The rest of the top four are taken up by fuck, fucked and fuckin. In at 5th is asshole (56% of R; 31% of PG). Then comes jesus, shortly followed by christ, so presumably it’s the two words together that are making the difference. It’s amusing to think that the phrase “jesus christ” turns up so much more often than either “jesus” or “christ” on its own.
Shortly behind our lord and saviour is motherfucker and the top ten is rounded off nicely by bullshit and fucker (29% of R; 6% of PG). Some of the words I might have expected to appear in the top ten are nowhere to be seen – murder is way down at 20th and gun comes in at a miserable 517th (just 18% of R-rated films and 12% of PG-rated ones). It’s interesting that cigarette (in 12th) beats joint (in 13th), which in turn beats drugs (29th).
Of course, there must be some words that appear in PG films but not R films, right? Well, there are, but the difference is less pronounced. The biggest hitter is the word ship, which appears in 40% of PG films but only 23% of R ones. It’s followed closely by bows, excitement, cloud, larger and tower (41% of PG; 26% of R), but sadly I think any statistician would tell you these differences don't mean much on a data set of this size.
Without further ado, here are the words that appear the most in R-rated films compared with PG-rated films. The words are sized by discrepancy.
And, in case you’re interested, here is the entire list in a big spreadsheet. Want to guarantee a PG rating for your film? Steer clear of any sort of fucks, don’t mention cigarettes (other drugs are ok) and don’t blaspheme. Perhaps add a nautical theme and your film is practically guaranteed a PG. Unless its primary visuals involve people being gunned down while conducting sexual acrobatics with a chain-smoking Shetland pony.
It’s possible that films are rated on more than just dialogue.