Voyant Tools

With the database of apocalyptic fiction now at around 1500 entries, I ran it back through one of the tools I had used early on in the process, Voyant Tools.

Voyant Tools provides a number of visualizations, such as word clouds and trend graphs. The process can be started very simply by pasting a text corpus into the dialogue box, and can then become more complex as stop-words are assigned and specific trends are highlighted.

Word Cloud

Below is a word cloud of the whole database, which pulls up the frequency of words (minus stop-words such as “the,” “and,” etc.), with some fairly expected results:


Of the “Type” column, “post-apocalypse” is one of the largest words in the cloud, with “apocalypse” and “dystopia” both present, but notably smaller; “pre-apocalypse” and “near-apocalypse” do not make an appearance. Of the causes, “nuclear” and “war,” predictably, represent the largest frequencies, though “environmental” is surprisingly close, and “technology,” “science,” and “plague” are not too far behind.

“fiction” and “series” represent the largest “Format” categories, though “film,” “comics,” “video,” and “game” appear prominently too. “les,” “cycle,” “des,” “glaces,” “ice,” “deathlands,” “outlanders,” “ashes,” and “arnaud” also reflect the prominence of fiction series in the database, referring to G.-J. Arnaud’s “Les cycle des glaces” (or the Ice Series) and the “Ashes,” “Deathlands,” and “Outlanders” series.

Most interesting, and rather unexpected, is that every year of the 1980s is represented here, and no other years appear at all. Of these, 1983 is the largest, with 1984 close behind. This strongly suggests that I should look at that era in more depth, and at the context surrounding the boom in texts at that time.
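For anyone wanting to check these counts outside Voyant, the tallying behind a word cloud can be roughly sketched in a few lines of Python. This is only an approximation under my own assumptions: the stop-word list is a tiny stand-in for Voyant's much longer one, the tokenizing rule (keeping hyphenated terms like “post-apocalypse” whole) is a guess rather than Voyant's actual behaviour, and the sample text is made up.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; Voyant uses longer lists).
STOP_WORDS = {"the", "and", "of", "a", "an", "in", "to", "is"}

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count lowercase word tokens, skipping stop-words.

    Hyphenated terms such as "post-apocalypse" are kept as one token.
    """
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    return Counter(t for t in tokens if t not in stop_words)

# Made-up sample text standing in for the pasted database.
sample = (
    "Post-apocalypse nuclear war fiction 1983. "
    "The apocalypse and the nuclear war of 1984. "
    "Post-apocalypse environmental fiction series 1983."
)

freqs = word_frequencies(sample)
for word, count in freqs.most_common(5):
    print(word, count)
```

The larger a word's count, the larger it would be drawn in the cloud; years are counted like any other token, which is how a cluster of 1980s dates can surface.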

Word trends

Next, I rearranged the spreadsheet to be listed by year, and fed it into Voyant again to look for word trends over the course of the time period (1805-2013) the spreadsheet covered. First I looked at the different types of destruction depicted, with “post-apocalypse,” “apocalypse,” “dystopia,” “pre-apocalypse,” and “near-apocalypse”:


The steady rise of post-apocalyptic texts is probably to be expected, as science fiction began to take off as a genre, but the steep peak here in the middle was interesting. I had a look at what part of the corpus this represented:


Funnily enough, it’s back in the mid-1980s, around 1986. Again, this is a strong indication that the median of post-apocalyptic texts falls in the 1980s. So, next I looked at what kinds of destruction were being depicted, by looking for “war,” “nuclear,” “environmental,” “technology,” and “terrorism”:


According to this graph, war seems always to have been a major concern, though it peaks at segment 2 (around 1948), has a slight resurgence in segment 4 (1972), has a significant peak at segment 6 (1986), then drops again before steadily rising from segment 9 (2003) onwards. “nuclear” follows a similar trajectory: not all wars in the database are nuclear, but most things listed as nuclear are wars (the exception being nuclear accidents, which could explain the steeper climb in segment 6, around the time of the Chernobyl disaster). “environmental” was surprisingly strong, particularly in segment 6. So, though many apocalyptic fictions were produced around man-made apocalyptic events in world history, not all of the disasters they depicted were man-made. As is to be expected, “technology” started strong as modernity and the industrial revolution began to unsettle many writers, and has had a recent boom again, as artificial intelligence, computers, and the internet have opened new fields for speculation. “terrorism” flatlines all the way to segment 6, and only becomes relatively strong in the 2000s, though it still seems to be an underrepresented area in comparison to the others.
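The segment numbers above come from the way the trend graph divides the year-sorted text into equal chunks. A minimal sketch of that idea, assuming equal-sized chunks of tokens and relative frequency (term count divided by chunk length), is below. The function name, the tokenizing rule, and the sample text are all my own inventions for illustration, and Voyant's exact calculation may differ.

```python
import re

def segment_trends(text, terms, n_segments=10):
    """Split the tokens into n_segments consecutive chunks and return,
    for each term, its relative frequency in every chunk."""
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    # Chunk boundaries chosen so the chunks are as equal in size as possible.
    bounds = [i * len(tokens) // n_segments for i in range(n_segments + 1)]
    segments = [tokens[a:b] for a, b in zip(bounds, bounds[1:])]
    return {
        term: [seg.count(term) / len(seg) if seg else 0.0 for seg in segments]
        for term in terms
    }

# Made-up, already "sorted by year" text: 12 tokens, split into 3 segments.
sample = (
    "war technology war nuclear "
    "nuclear war environmental war "
    "terrorism technology terrorism war"
)

trends = segment_trends(sample, ["war", "nuclear", "terrorism"], n_segments=3)
for term, values in trends.items():
    print(term, values)
```

Because the chunks are equal in word count rather than in years, a decade with many entries takes up more segments than a sparse one, which is worth keeping in mind when matching a segment to a date.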

Back to word clouds

With these things in mind, I cut down the database to just works released in the 1980s, and removed all descriptions, quotes, authors, publishers, and formats, so that only the titles, years, types, causes, cities, states, countries, and tags remained. Below is the word cloud for this:

[Image: voyant 4, word cloud of the 1980s entries]

The size of “post-apocalypse” seemed to dwarf the other results here, and the prominence of “new” and “york” may have partly been due to the state name accompanying the city name. Trimming the text down still further, I removed everything except years, causes, cities, and tags:

[Image: voyant 5, word cloud of the 1980s entries with years, causes, cities, and tags only]

“nuclear” and “war” are very prominent here, but “environmental” still proves to be very prevalent. “new” and “york” are now much closer to the size of “los” and “angeles,” and “san” and “francisco” are notably smaller. A strong ’80s aesthetic seems to manifest here, too, with a fairly strong representation of “aliens,” “mutation,” “pollution,” “gangs,” “motorcycles,” “science,” “plague,” “city,” “darkness,” and “technology,” which speak to a decade that saw the rise of cyberpunk and apocalyptic movies like Escape from New York (1981), Blade Runner (1982), The Terminator (1984), Aliens (1986), and Akira (1988).
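I did the trimming by hand in the spreadsheet, but the same cut could be scripted. The sketch below assumes the spreadsheet is exported as CSV with column headers named “Year,” “Cause,” “City,” and “Tags”; those header names and the three sample rows are hypothetical placeholders, not the real database.

```python
import csv
import io

# Hypothetical column names; adjust to match the real spreadsheet headers.
KEEP = ["Year", "Cause", "City", "Tags"]

def trim_rows(rows, keep=KEEP, start=1980, end=1989):
    """Yield one line of text per row whose Year falls in [start, end],
    keeping only the chosen columns (empty cells are skipped)."""
    for row in rows:
        if start <= int(row["Year"]) <= end:
            yield " ".join(row[col] for col in keep if row.get(col))

# Made-up rows for illustration only.
sample_csv = """Title,Year,Type,Cause,City,Tags,Description
Example A,1983,post-apocalypse,nuclear war,New York,gangs,Long text here
Example B,1995,dystopia,technology,Los Angeles,computers,More text
Example C,1986,apocalypse,environmental,Tokyo,pollution,Even more text
"""

corpus = "\n".join(trim_rows(csv.DictReader(io.StringIO(sample_csv))))
print(corpus)
```

The resulting text can then be pasted into Voyant as before. Note that this sketch expects every row to have a numeric year, so blank or approximate dates would need handling first.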

Primarily, what these results show is the importance of the 1980s to the genre, but they also show that, despite the concerns with nuclear war, the decade also produced a large amount of natural disaster fiction. This looks to be an area I will need to place extra focus on in my research.

6 thoughts on “Voyant Tools”

  1. Good day! I can't see the terms of use for this information. Is it possible to copy the text you write on your website if one links to this page?
