For the last couple of years, it seems we’ve hit never-seen-before peaks of attention across the internet devoted to Taylor Swift and her music, personal life, and everything else in between. Now the discussions around Swift and her music have popped up in academic circles. Recently, highly regarded universities such as Harvard, Stanford, NYU, and UT Austin (among others) have offered university subjects based on the singer. In 2024 academics from across the world are converging on Melbourne, Australia for a conference entirely devoted to the research and study of Taylor Swift.

So, you might ask: What does the study of Taylor Swift actually look like? In the study of music, society, and popular culture, you might expect the field to be solely the domain of sociologists, historians, philosophers, psychologists, law researchers, and of course music academics. As an applied mathematics and science academic at Griffith University, I found myself wondering (and wanting to challenge myself to come up with an answer): Where would maths fit in with this Taylor-Swift-ology? What could maths techniques bring to the table for studying the music of Taylor Swift?

Insights from mathematical analysis

One option we can explore comes from a field called information theory, which studies how information is communicated in one form or another. In particular, we can look at the words in a song’s lyrics and measure which words occur and how frequently. From the probability, p, of each word occurring in the lyrics, a formula gives us a single score for each song, called the “entropy” of the text.
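The entropy formula referred to above is not reproduced in this text. A common choice, and the one assumed here, is the Shannon entropy, where the symbols H, N, n_i and the base-2 logarithm are assumptions rather than details confirmed by the article:

```latex
% Assumed form: Shannon entropy of a text with N words in total,
% where the i-th distinct word occurs n_i times.
H = -\sum_{i} p_i \log_2 p_i , \qquad p_i = \frac{n_i}{N}

% Worked example: a "song" made of two words used equally often.
% p_1 = p_2 = 1/2, so
H = -\left( \tfrac{1}{2}\log_2\tfrac{1}{2} + \tfrac{1}{2}\log_2\tfrac{1}{2} \right) = 1 \text{ bit}
```

Under this assumed form, a text that repeats a single word has p = 1 and scores H = 0, which is the lowest possible value.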

This measure of word entropy can be thought of as a way to measure the amount of information content within a text. A very repetitive text would have a lower score, while a text with a wide vocabulary often has a higher score. Some people also describe the entropy score as a measure of the diversity, the variability, or sometimes the “surprise” in a text. As an introduction to this idea, we can compute the score for the Billboard Hot 100 songs of each year from 1965 to 2015, which produces a graph like the one below. Each song is represented by a pale blue dot; the darker the blue, the more dots there are packed together.
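To make the idea concrete, here is a minimal Python sketch of a word-entropy score. It is not the code behind the article's graphs: the function name word_entropy, the simple lower-case tokenisation, and the base-2 logarithm are all assumptions made for illustration, and the two sample strings are made-up text rather than real lyrics.

```python
import math
import re
from collections import Counter


def word_entropy(text):
    """Shannon entropy (in bits) of the word distribution in `text`.

    Assumption: words are lower-cased runs of letters and straight
    apostrophes; a real analysis may tokenise lyrics differently.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    total = len(words)
    # Sum p * log2(1/p) over the distinct words, with p = count / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Made-up sample texts, not real song lyrics.
repetitive = "la la la la la la la la"
varied = "one two three four five six seven eight"

print(word_entropy(repetitive))
print(word_entropy(varied))
```

The repetitive sample uses one word throughout, so it scores lower than the varied sample, in which every word is different.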

Freestyling to the fore

The red line shows the average lyrical diversity of songs in each year, which seems to increase over time, with a surge from around 1980 onwards; some of the high scores come from tracks by Kool & the Gang and Run DMC. The notable stack of points around the year 2000 that lifts the yearly average comes from artists like Eminem, Jay-Z, and OutKast, among others. The mathematics is showing the popularisation and widespread adoption of hip-hop and rap music!

If we now apply this formula to the lyrics of Taylor Swift’s discography, we can produce a graph like the one shown below. Each dot represents a song on an album. The data for this was surprisingly easy to find, as an academic has kindly compiled an easy-to-use database.

Taylor Swift’s lyrical diversity to date

Entropy by eras

For this example, the actual numbers of the score aren’t so much what matters. For convenience we have dropped in a green line showing the average entropy score for each album. What is interesting to consider is how the general trend changes over time, in the context of how Taylor Swift’s music genres change, as well as her campaigns for ownership and self-determination of her creative output. If we consider the song-writing credits on each song of each album, we see that the bulk of her early songs were written completely solo, or with the assistance of one main collaborator. Indeed, the Speak Now album was written entirely by her alone. The lyrical patterns, habits, and techniques that occur naturally in lyrics written primarily by one author have previously been proposed as a way to identify songwriters uniquely: think of them as a signature or fingerprint of their style. These techniques have even been used as a forensic tool to help decide song-writing disputes in the courts.

The impact of collaborations

As more pop-focused albums are released, the song-writing credits on the albums get more complicated. There are fewer completely solo-credited songs, and many producers and co-writers provide input, with some tweaks here and there perhaps based on their experience of pop song-writing.

This is also reflected in what the mathematics tells us! As a result of the input of others, and perhaps changing genres, we see a slight decrease in the average entropy scores compared to earlier albums, as well as a much wider spread of scores for albums such as Red and 1989. As you might expect from having many cooks in the song-writing kitchen, the mathematics shows us that there can be a much larger range of values the entropy score might take.
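The comparison of averages and spreads between albums can be sketched as a small per-album summary. This is only an illustration: the function name summarise_by_album, the choice of standard deviation and range as measures of spread, and the album labels are assumptions, and the numbers are placeholders, not entropy scores from the article's data.

```python
from statistics import mean, pstdev


def summarise_by_album(scores_by_album):
    """Return the mean, standard deviation and range of scores per album."""
    summary = {}
    for album, scores in scores_by_album.items():
        summary[album] = {
            "mean": mean(scores),                 # the per-album average
            "spread": pstdev(scores),             # standard deviation of scores
            "range": max(scores) - min(scores),   # highest minus lowest score
        }
    return summary


# Placeholder numbers for illustration only; NOT real entropy scores.
toy_scores = {
    "album_a": [6.0, 6.2, 6.1],   # tightly clustered scores
    "album_b": [5.0, 6.0, 7.0],   # widely spread scores
}

summary = summarise_by_album(toy_scores)
for album, stats in summary.items():
    print(album, stats)
```

In this toy input, album_b has the wider spread and range, which is the kind of pattern described above for albums with many contributors.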

As mentioned previously, it is important to read what the numbers appear to tell us against the backdrop of Taylor’s evolution in song-writing ownership and creative direction. Following a period of collaboration and input from many wide-ranging creative influences, the entropy scores in recent years appear to revert to trends and structures similar to those of her earlier albums. Many of these newer songs are indeed co-credited, mostly with well-known songwriters such as Jack Antonoff and Aaron Dessner. Even so, judging by the textual structure and the mathematical features of the word choices in the lyrics, the trend appears, for the most part, to be reverting to that of Taylor’s older albums.

Image: Taylor Swift. Credit: Raph_PH.

This example highlights an interesting way that maths may be used to study a topic that, at first glance, seems to have absolutely nothing to do with maths.

However, if we pick some suitable tools and keep an open mind, we are able to see a remarkably clear narrative from just one graph, in this case of Taylor Swift’s tracks. This approach is a key idea behind what many scientists do when we build a mathematical model: we want to capture some elements of interesting scenarios that occur in the real world and have them reflected in the mathematics. I have no doubt there are many other clever things that could be done to analyse music in new ways.

As a final note, a little piece of trivia: which song does this brief analysis using information theory indicate is the most “lyrically diverse” or “information-rich” in the Taylor Swift catalogue? Checking the results from the above graph, it might come as no surprise that many avid ‘Swifties’ will already know the answer All Too Well!

Article updated 24 April 2024: Following the release of The Tortured Poets Department (TTPD) the new “maximum word entropy” song But Daddy I Love Him knocks off All Too Well.


Dr Nathan Garland is a lecturer in Applied Mathematics and Physics at Griffith University, Australia. Prior to joining Griffith, Nathan was a post-doctoral researcher in the Theoretical Division at Los Alamos National Laboratory in New Mexico, USA. His areas of research interest are based around computational modelling of plasmas in various applications, and the integration of high-quality atomic input data into plasma modelling frameworks.