Using AI for literary analysis: Infegy Atlas reads books
by Henry Chapman on May 1, 2023
A journey into Infegy Atlas’ ability to read long-form text
Infegy Atlas specializes in analyzing social media posts from various platforms such as Twitter, TikTok, Pinterest, and others. It also has a custom data capability that allows clients to upload data and then use the full suite of Infegy Atlas analytics to analyze and explore it (Infegy Atlas API has the flexibility to accommodate different sources of data for analysis).
Recently, our research team decided to experiment with non-traditional sources of data to test Infegy IQ's sentiment and Emotions analysis capabilities. The team found that classic works of literature presented an excellent data source to test Infegy Atlas's abilities.
Here, we'll explore just how well you can use social listening and text analytics tools to analyze long-form text and novels. We will leave you with some suggestions on how to use these tools to review more business-focused documents, such as stock earnings reports, customer service call transcripts, or your product reviews on Amazon.
How the Infegy text analytics process works
To begin, we needed a well-known, readily available novel.
Since we're huge Tolkien fans at Infegy, we settled on a copy of The Lord of the Rings that we discovered hosted at this GitHub repository. First, we had to do a little data cleaning to convert the novel into a CSV file. We accomplished this using a simple Python script.
We split the entire text into a series of sentences using our proprietary tokenizer. The tokenizer uses statistics to decide whether a period actually ends a sentence, which results in more accurate sentence splits. Infegy Atlas’ data visualization tools require time-based data, so we then transformed the book into a narrative dataset, with each sentence representing a new "hour." Finally, we uploaded our dataset into our custom data portal.
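For readers who want to try something similar, here is a minimal Python sketch of that preparation step. It is not the script we used: the regex splitter is a naive stand-in for the proprietary tokenizer, and the function names, the `timestamp` and `text` column names, and the start date are all assumptions made for illustration.

```python
import csv
import io
import re
from datetime import datetime, timedelta


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace.

    A stand-in only; it will mis-split abbreviations such as "Mr. Baggins".
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def to_rows(sentences, start=datetime(2000, 1, 1)):
    """Pair each sentence with a timestamp one hour after the previous one."""
    return [
        ((start + timedelta(hours=i)).isoformat(), sentence)
        for i, sentence in enumerate(sentences)
    ]


def write_csv(rows, handle):
    """Write (timestamp, text) rows with a header line (hypothetical columns)."""
    writer = csv.writer(handle)
    writer.writerow(["timestamp", "text"])
    writer.writerows(rows)


# Made-up sample text, not a quotation from the novel.
sample = "The road goes on. Where does it lead? Nobody knows!"
sentences = split_sentences(sample)
rows = to_rows(sentences)

buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

To process a whole book, you would read the text file into a string and pass an open file handle to `write_csv` instead of the in-memory buffer.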
Let’s dig into the analysis!
Social intelligence tools for long-form text analysis: Results
Infegy sentiment analysis of The Lord of the Rings
Sentiment is the most obvious place to start. We used Infegy Atlas sentiment analysis to chart the positivity and negativity of the plot as it advances (Figure 1).
Related: You can learn more about how to use social intelligence tools to boost your brand sentiment analysis here.
The Lord of the Rings starts positively within the first chapter of the book, which describes an idyllic portrait of the Shire. The sentiment then rapidly turns negative when the plot moves to the fall of Isildur, the rise of Mordor and Sauron, and the disappearance of the One Ring. Negative sentiment stays moderately elevated when Gandalf and Boromir die at the end of The Fellowship of the Ring and when the Ents capture Isengard.
Figure 1: Net positive and negative sentiment of the three Lord of the Rings novels with dates generated automatically; Infegy Atlas data.
Next, negative sentiment surges towards the end of The Return of the King when Aragorn fights the Battle of Pelennor Fields and Frodo and Sam move towards Mordor. Finally, negative sentiment surges one last time at the end of the book during the Scouring of the Shire. The text ends with higher positive sentiment when Frodo, Gandalf, and the rest of our characters get on ships in the Grey Havens and head to the Undying Lands – Tolkien’s version of Heaven.
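If you want to reproduce a trend line like this from your own labeled data, a net sentiment score per stretch of text can be sketched as follows. This is an illustration, not Infegy Atlas's actual calculation: it assumes each sentence already carries a "positive", "negative", or "neutral" label, and the `net_sentiment` function, the window size, and the scoring formula are all hypothetical.

```python
def net_sentiment(labels, window):
    """Return one net score per consecutive window of sentence labels.

    Assumed formula: (positive - negative) / (positive + negative).
    A window with no positive or negative sentences scores 0.0.
    """
    scores = []
    for i in range(0, len(labels), window):
        chunk = labels[i:i + window]
        pos = chunk.count("positive")
        neg = chunk.count("negative")
        total = pos + neg
        scores.append((pos - neg) / total if total else 0.0)
    return scores


# Made-up labels for eight consecutive sentences.
labels = [
    "positive", "positive", "neutral", "negative",
    "negative", "negative", "neutral", "neutral",
]
scores = net_sentiment(labels, window=4)
print(scores)
```

Scores near 1 indicate a mostly positive stretch and scores near -1 a mostly negative one, so plotting them in order gives a rough picture of how the mood shifts as the plot advances.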
Conducting an Emotions analysis
An Emotions analysis of the plot is even more interesting than a mere sentiment analysis because it gives us a more granular understanding of what's happening within a text.
Related: You can check out an example of how we used Infegy Atlas to analyze how Silicon Valley Bank customers reacted to a bank failure here.
Infegy Emotions can detect and track which specific feelings social media users have about a particular topic. Emotion metrics usually show us which percentage of the conversation contains a specific emotion. Infegy Atlas can detect 10 emotions. Extending this to our literary analysis, we can track how Frodo felt about Gandalf’s death using sentence keywords like “loss” or “cried out.”
This goes beyond just positivity and negativity; we can see if our characters feel fear, anticipation, sadness, or love. As with our sentiment analysis, we used Infegy Atlas to track the characters' emotions through the plot progression (Figure 2).
Figure 2: Emotions across the three Lord of the Rings novels; Infegy Atlas data.
We found that Anticipation peaks towards the middle of The Fellowship of the Ring, which is when the Fellowship is formed, and the characters anticipate a long journey into the heart of Mordor. Anger and Sadness peak towards the end of The Fellowship when Gandalf and Boromir die. Anger spikes towards the end of The Two Towers when Treebeard and the Ents discover the destruction of Fangorn Forest.
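The "percentage of the conversation" idea behind these emotion metrics can be illustrated with a toy keyword counter. This is only a sketch under stated assumptions: the keyword lists and the `emotion_share` function are hypothetical, and simple substring matching is far cruder than what Infegy Emotions does.

```python
# Hypothetical keyword lists; "loss" and "cried out" come from the example above.
EMOTION_KEYWORDS = {
    "sadness": ["loss", "cried out", "wept"],
    "fear": ["afraid", "terror", "dread"],
}


def emotion_share(sentences, keywords=EMOTION_KEYWORDS):
    """Percentage of sentences containing at least one keyword per emotion.

    Uses plain substring matching, so "loss" would also match "glossy".
    """
    shares = {}
    for emotion, terms in keywords.items():
        hits = sum(
            any(term in sentence.lower() for term in terms)
            for sentence in sentences
        )
        shares[emotion] = 100.0 * hits / len(sentences) if sentences else 0.0
    return shares


# Made-up sentences, not quotations from the novel.
sample_sentences = [
    "Frodo cried out in the dark.",
    "The loss weighed on them all.",
    "Sam was afraid of the water.",
    "They walked on until evening.",
]
shares = emotion_share(sample_sentences)
print(shares)
```

Running the same count over successive slices of a book would give one share per emotion per slice, which is the shape of data a chart like Figure 2 needs.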
By the way, if you enjoyed seeing the Sentiment and Emotions trends, check out The Lumen. The Lumen is our free tool that shows you trending topics and sentiment-at-large on TikTok, Instagram, and Twitter.
Using Topics to identify characters
Infegy Topics algorithms identify the critical terms in each sentence, namely its subjects and objects. We can use the Topic metrics to create word clouds that show the frequency of mentions; word clouds are also colorized by sentiment. This works very well in isolating the important characters within The Lord of the Rings (Figure 3).
Figure 3: Top topics with sentiment color scale attributed to Lord of the Rings; Infegy Atlas data.
Frodo pops up as the most important topic within the work. Gandalf, Aragorn, and Pippin follow “Frodo” in terms of mentions/size. Faramir, Gimli, Legolas, and Saruman are mentioned less often and therefore appear smaller within the word cloud.
Next, we'll use a linking analysis on those topics to see which frequently appear together (Figure 4). Interestingly, the two towers, Mordor and Isengard, appear as related topics within our clustering algorithm. The tragic father-son pair, Denethor and Faramir, also occur within an isolated cluster. Finally, “Théoden,” “Rohan,” “fields,” and “ride” all appear together as a horse-related cluster.
Figure 4: Top topics displayed with link analysis across all three Lord of the Rings novels; Infegy Atlas data.
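One simple way to approximate a linking analysis is to count how often two topics appear in the same sentence. The sketch below does only that; the `cooccurrence` function is hypothetical, the sentences are made up, and Infegy Atlas's clustering algorithm is certainly more involved than raw pair counts.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(sentences, topics):
    """Count how often each pair of topics appears in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        lowered = sentence.lower()
        # Sort so each pair is always stored in the same order.
        present = sorted(t for t in topics if t.lower() in lowered)
        counts.update(combinations(present, 2))
    return counts


topics = ["Denethor", "Faramir", "Rohan", "Théoden"]

# Made-up sentences, not quotations from the novel.
linked_sentences = [
    "Denethor looked at Faramir.",
    "Théoden rode out from Rohan.",
    "Faramir answered Denethor slowly.",
    "Rohan answered the call.",
]
pairs = cooccurrence(linked_sentences, topics)
print(pairs.most_common())
```

Pairs with high counts are candidates for the same cluster, which is roughly why “Denethor” and “Faramir” would end up linked in a graph like Figure 4.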
Related: Learn how to use social intelligence metrics like the ones we’ve discussed above to prep winning new business pitches.
Using Infegy Interests metrics to identify general ideas
While Infegy Entities identify proper nouns, Infegy Content Interests identify general topics that appear within the body of the text. For example, if we analyzed a section that contained reviews about a youth tennis center, we’d get interests like “Tennis,” “Youth Sports,” or “Youth Education.”
Using Interests to identify general subjects within long-form content can provide surprising results – some of which were downright amusing in our analysis of The Lord of the Rings (Figure 5).
For example, Interests revealed that the most prominent topic within the three novels was “Celebrity Deaths,” followed by “Equine Sports” and “Games and Puzzles.” The interest in celebrity deaths can be attributed to the numerous deaths of notable characters throughout the trilogy. From Boromir's death at the hands of orcs to the demise of Saruman in The Return of the King, death is a recurring theme in the series.
Figure 5: Top Interests across The Lord of the Rings; Infegy Atlas data.
The interest in equine sports reflects the prominent role of horses in The Lord of the Rings. The riders of Rohan are famed for their horsemanship and use their horses to significant effect in battle. Lastly, the interest in “Games and Puzzles” can be seen in The Fellowship of the Ring: the Fellowship must decipher the meaning of a cryptic message on the doors of the Mines of Moria, and there is lyrical, riddle-like poetry throughout the text.
The business case for long-form text analysis
If you've made it this far through this post, you're either really interested in The Lord of the Rings or looking for a way to analyze long-form text in your business.
Thus far, we’ve been analyzing a classic English novel. All these analytical tools work just as well for business-specific long-form text such as stock earnings reports, customer service call transcripts, or product reviews from Amazon. You could upload your custom data and then use various Infegy Atlas tools and capabilities to draw out insights.
For example, Figure 6 is the Top Topics word cloud we got after analyzing the January 2023 Apple earnings report with Infegy Atlas. Wearables were a potent growth engine for the company that quarter, so you see that term pop with high positive sentiment.
Figure 6: Top topics associated with Apple’s January 2023 Earnings conference call, colorized by sentiment; Infegy Atlas data.
We then followed the same process detailed above to analyze The Mueller Report, a 448-page document. This time, we used Infegy Narratives, our dynamic topic-clustering function. Narratives identifies related conversations, clusters them together, and labels the categories. We wanted to see how Infegy Atlas Narratives would identify the main categories of inquiry that the Justice Department undertook as it investigated Russian interference in the 2016 election. It scaled to the project impressively well, as expected. If you had to read and summarize the report yourself, the task would take hours. With Infegy Atlas Narratives, you can accomplish the task in seconds.
Figure 7: Infegy Narratives’ identification of clusters associated with The Mueller Report; Infegy Atlas data.
Infegy Atlas AI for diverse business analytics
There are numerous ways you can use Infegy Atlas for social media intelligence and other forms of business-related analytics. Here, we tested our custom data tool on the classic novel, The Lord of the Rings, to explore its ability to analyze long-form text.
The results showed that Infegy Atlas AI isn’t just for social intelligence. It can be used to track sentiment accurately, identify critical characters, and highlight general subjects across many forms of long-form text.
Extending our analysis to more business-focused documents, we were able to show that Infegy Atlas’ capabilities with long-form analysis can be a time-saving tool. Ultimately, our research demonstrated Infegy Atlas’ flexibility in accommodating different data sources, making it a valuable asset for companies seeking insights from large volumes of data.