Storytelling With and Around Data
Contributor: Patrick Danner
Affiliation: Misericordia University
Email: pdanner at misericordia.edu
Released: 15 August 2020
Published: Fall 2020 (Issue 25.1)
The last decade has seen data and big data carve out a space for themselves in our cultural moment. The infamous rising line graph in Al Gore’s (Guggenheim, 2006) An Inconvenient Truth, the founding of FiveThirtyEight in the lead-up to the 2008 election, FiveThirtyEight’s predictions for the 2016 election, the increasing prevalence of data journalism from the New York Times to CNN: All of these point to a broader public fascination with visual and numerical data. We’ve come to use quantifiable data to predict outcomes of football games and presidential elections; we use it to definitively close arguments on everything from climate change to institutional racism. That is, quantitative evidence is often treated as something extrinsic (not invented by the rhetor) to be applied to a case à la Aristotle’s (2001) inartistic proofs (Rhetoric, book 1, ch. 15).
This webtext is designed to unravel such treatment of data and data visualizations and to explore ways we can introduce both in our writing pedagogy. I open with a basic premise, one pored over by scholars of digital humanities (Drucker, 2014; Gitelman, 2013; Halpern, 2014) and those closer to writing studies (Gries, 2017; Welhausen, 2015; Wolfe, 2010) alike: Data is never raw. It is not objective. It relies on a certain level of context to do work. However, this webtext also seeks to trouble the ways in which we have come to understand how data and data visualization do persuasive work through (1) appeals to pathos (Kostelnick, 2016; Wolfe, 2010); (2) kairotic, cybernetic encounters (Drucker, 2014; Halpern, 2014); (3) interpretive strategies and heuristics (Jones, 2016; Kim, 2005); and (4) definitional and aesthetic considerations. Specifically, it troubles those forms of persuasion by excavating the trajectory such visuals take into becoming. I argue that data is not raw because it is delivered through story, a composed form that is negotiated, revised, and shaped by technical and situational affordances. Ultimately, this webtext is designed to fine-tune a lens on numerical data and data visualization that reaches into classrooms, professional practice, and our everyday encounters with numbers, charts, and graphs.
Storytelling: Where We Are
Handbooks in technical and business communication increasingly embrace an idea of storytelling with data. Cole Nussbaumer Knaflic’s (2015) Storytelling with Data: A Data Visualization Guide for Business Professionals is perhaps most on the nose in this regard. Knaflic suggested that stories can capitalize on the pathos those like Joanna Wolfe (2010) and Charles Kostelnick (2016) find in contemporary data visuals. Yet her text goes to great lengths to provide stories' formal features, whether pulling from Aristotle’s (2001) theory of drama (the three acts), film (protagonists and desire), or the written word (parroting Kurt Vonnegut: “Keep it simple. Edit Ruthlessly. Be authentic”) (pp. 166–170). Stephanie D.H. Evergreen’s (2017) Effective Data Visualization takes a more rhetorical tack, finding stories interpreted in visuals themselves and then capitalized on for “effective decision making” and “learning” (p. 225). Nancy Duarte’s (2010) Resonate perhaps gets to the heart of the matter. Though she reflects on the emotional appeals and rhetorical power of visual data (not unlike Evergreen, 2017; Knaflic, 2015; Kostelnick, 2016; and Wolfe, 2010), she also notes a clear purpose for the story form. For Duarte (2010), data and visuals as standalone items don’t do the necessary work of persuasion. “Information is static; stories are dynamic,” she wrote, “they help an audience visualize what you do or what you believe. Tell a story and people will be more engaged and receptive to the ideas you are communicating” (p. 16). And though these texts ultimately present three different approaches to the story form (Knaflic’s cinematic, Evergreen’s spatial and contextual, and Duarte’s presentation-driven, TEDTalk form), they rest on essentially formal understandings, focused on the shape of the product rather than the craft or process of composing those stories.
TEDTalks themselves, and their viewership on YouTube, can reasonably be seen as reflective of this formalist approach to data-driven storytelling. There is something captivating about a solitary speaker in front of an animated line graph, or moving dot plots. But what I want to suggest in this webtext is that, far from studying form, it is necessary for scholars of rhetoric and writing studies to consider the process of situated encounters with data, how we visualize it, and how we package it. We should imbue discussions of data with something like Johanna Drucker’s (2014) cybernetic encounter and re-remind ourselves, per Lisa Gitelman (2013), that data is never raw. Storytelling might be a worthwhile way to conceptualize this activity, but if so, we should emphasize the verb form here—how people craft stories—and impart those lessons to students and practitioners alike.
In what follows I present two case studies to drive home this point: Data visualizations and the stories we tell with them are tenuous; they arise from a series of situational and technical limitations and have uncertain futures in the public sphere. I move through each case in turn, highlighting what the story form can’t account for in this rhetorical activity, and close by outlining ways to bring this activity into our classrooms and workshops.
Case Study 1: Workers in Poverty
The first case study is a line graph that visualizes workers in poverty from 2005 to 2018. This graph comes from an ongoing workplace ethnographic study of the Metro Data Coalition (MDC; a research non-profit that specializes in data-driven analyses of its home city, which I refer to as Gateway). In 2018, MDC data scientists were directed to expand and update employment data for an upcoming report to partner non-profits and Gateway’s broader civic sector. This was not unusual work, but I observed in their meetings an unusual amount of emphasis placed on the multiple ways of interpreting the draft visual (see Figure 1, below). Each of the stories that report writers wanted to tell had wildly different implications:
1) Poverty is on the rise among Gateway’s families.
2) Despite an uptick in poverty rates since 2013, Gateway is stronger than it was during the recession.
3) Between 12% and 16% of workers are living in poverty in Gateway.
4) These may be low-wage jobs, but people are finding employment at a quick rate in Gateway.
Each of these stories points to different impact points and thus different calls-to-action. The decision of which story to tell is, at least initially, up to the MDC itself. To say "poverty is on the rise," effectively zooming in on the slow arc upward from 2012 to 2015, is to call for action in areas of job market expansion. To say the city is “stronger than it was during the recession” is to depress the call to action, at least until the arc creeps closer to the 2006 peak. To focus on those currently in poverty is to redirect attention to wage gaps and wage policy. To declare low-wage jobs part of the equation at all, and so to introduce that specific element to the story, is to redirect the reader entirely to the long slope downward, and to prompt questions about the value of employment rather than poverty itself.
And, of course, there are other possible stories to tell here. One would track this trend alongside national trends, to focus on how the city performed against the national average in the wake of the recession. Others could introduce educational data, or wage data, or dig more deeply into how poverty itself is defined here (by household size and income). We can introduce numerous variables to the equation here, find a visual correlation, and circulate a tentative story.
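The way a single series supports competing stories can be sketched in a few lines of Python. This is a minimal illustration, not the MDC's method: the yearly rates below are invented and are not the MDC's data, and the helper names `frame` and `span` are hypothetical. The sketch only shows that the choice of start and end year, made by the storyteller, changes the sentence the same numbers produce.

```python
# Hypothetical yearly poverty rates (percent of workers). These numbers are
# invented for illustration only and are NOT the MDC's data.
rates = {
    2005: 15.0, 2006: 16.0, 2007: 15.6, 2008: 15.1, 2009: 14.5,
    2010: 13.9, 2011: 13.2, 2012: 12.0, 2013: 12.4, 2014: 12.9,
    2015: 13.5, 2016: 13.6, 2017: 13.4, 2018: 13.7,
}


def frame(series, start, end):
    """Describe the change between two years chosen by the storyteller."""
    delta = round(series[end] - series[start], 1)
    if delta > 0:
        direction = "rose"
    elif delta < 0:
        direction = "fell"
    else:
        return f"From {start} to {end}, the rate held steady."
    return f"From {start} to {end}, the rate {direction} by {abs(delta)} points."


def span(series):
    """Return the lowest and highest rate across the whole series."""
    return min(series.values()), max(series.values())


if __name__ == "__main__":
    # Story 1: zoom in on the recent upward arc.
    print(frame(rates, 2012, 2015))
    # Story 2: compare the latest year to the earlier peak.
    print(frame(rates, 2006, 2018))
    # Story 3: report only the band the rate has moved within.
    low, high = span(rates)
    print(f"Across all years, the rate stayed between {low}% and {high}%.")
```

Nothing in the data selects one of these sentences over the others; the window is a rhetorical choice, which is the point the stories above make.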
However, as we begin to look outward toward the circulation of this visual representation, we realize that—to paraphrase Catherine Chaput (2010)—rhetorical situations are decidedly not determinable in our present hyper-connected moment. Rather, the multiple publics MDC data circulates through determine the final form(s) of the stories data tell.
Case Study 2: The Ocular Test
The second case study is a series of maps from the Metro Data Coalition 2015 report that provided readers an ocular test (to use subject-reported terminology) to compare local racial geography to geographies of wealth. For the MDC, this series tells an obvious story: A history of segregationist policy and redlining has created a severe geographic wealth gap. For those who took up the data and put it to work elsewhere, however, such historical policies only serve to bolster the true culprit of the racial wealth gap: a lack of Black ownership in any part of the city.
The MDC team values its positionality within a broader network of non-profit and civic actors in its home city. Because of this, each report publication is celebrated with a launch event featuring speakers from across the city, all representatives of these non-profits. These maps became a particular sticking point for one member of a local Urban League branch. After publication, these maps were accepted warmly, as “things we already know” but inspiring enough to act on. A year later, after the publication of a subsequent report, the same panel member referred to these maps rather than more current ones. She declared them “too weak” to “move the needle” on issues of segregation. With several government, non-profit, and corporate leaders in the room she spoke on the issues of ongoing racial and income segregation: housing, generational wealth, Black ownership of business, the migration of jobs to the east end of the city, and so on. She changed the story in an effort to re-activate its readers.
When I spoke with this subject in advance of the launch event, she emphasized the immediacy of this data to her work world. “We work with this data every day, whether it’s MDC data or not,” she said. I take that to mean that as a non-profit actor entrenched in issues of poverty and segregation, the story, to her, was more detailed than the MDC maps alone allowed. And without those details on hand, other actors were unable to respond.
The Purpose(s) of Storytelling
These case studies provide the broadest view of the activity of telling stories. Inherent to each are moments where choices were made (e.g., how to define “impoverished” for a given visual, how to account for sampling error in racial demographic data). To an extent, such technical issues are best addressed in data analytics and policy courses. Yet, for the purposes of rhetoric and writing, they should also be understood as part of the activity of storytelling. Stories are actively crafted and deployed for specific purposes. In the context of the MDC they are to be compelling for a specific audience and actionable for that same audience. To understand story, then, as a discursive form to merely slip into, is to do it a disservice.
Storytelling is best understood through the contexts and actions of how stories are crafted and told. Stories should, moreover, be understood through their public uptake. Whether or not a story is acted upon may be one metric of success, and may point to areas where storytelling activity can be reflected upon and critiqued. Had the MDC couched these two maps in the context of redlining, homeownership rates, or the location of jobs, they may not have proven weak.
Ultimately, though the discussion is not exhaustive here, readers of the webtext should take away a few truths about storytelling with data and further consider three places where stories may be out of our hands as storytellers: (1) statistical malleability, and the ability of viewers to reject correlational reality in favor of their own interpretation or whatever they bring to that data; (2) relatedly, the ecologies of public rhetorical action that inform the way stories are told and the language we use; and (3) the cybernetic spaces of encounter that encapsulate both of these things, the inputs and outputs of storytelling activity and the baggage that viewers of visuals bring to them.
Future Praxis 1: Exploratory and Explanatory Power
In the classroom and in industry, it is vital that we implore storytellers and data scientists to recognize the tenuousness of the cybernetic encounters their work will wade into. One way to tackle this project is to consider the dual functions of data visuals as exploratory and explanatory.
Consider the figure below, made with the data visualization program Tableau. Tableau is a user-friendly, drag-and-drop dominant program for creating interactive dashboards with given data sets. It does not have the range of functions available in RStudio or similar coding-dependent programs, but it also doesn’t require learning new coding languages to operate.
In a recent data visualization course, the students and I had access to a corporate partner’s sales data, and were able to map sales-by-volume across the state of Wisconsin, highlighting high- and low-volume zip codes. I asked students to consider the explanatory power of the visual: What can it tell us? Students considered the high volume sold in areas around Milwaukee, Madison, and Green Bay, and the gap in the map in the southwestern corner of the state. Importantly, the deepest insights came from those with more familiarity with Wisconsin geography, speaking to the cybernetic nature of the encounter and the ability of interpretations to drive the story itself. I then asked students to speak to the exploratory power of the same visual: What questions does it raise? Students returned to the declarative statements from before and probed them, asking about population density, predominant industries, and median income in parts of the state. They then went further, questioning what was possible in Tableau. Can we import demographic data? Can we go more granular than the zip code level?
With the classroom flipped, students were able to toy with the on-hand data. I added to the original, partner-provided data set by geocoding addresses in the Wisconsin data to provide a more granular study of geography. I assisted students (though many were able to find the functions themselves) in finding baked-in map layers, driven by census data, in Tableau: median age, median income, household density, housing trends, and the like. I then asked students to craft two stories based on their updated, more detailed versions. One resulting map looked like this:
The updated maps introduced clearer, more detailed stories: Certain products sell better in certain corners of the state; relatively high-income areas of the state are being missed by the company’s distributor; population density is a good predictor of higher volume sales. Others overlaid median age by census block or percentage of white-collar professionals, seeking out correlation between product sales and specific demographics. Students toggled map layers, filters, and displays in groups, turning controls over to those least familiar with the technology to gain experience with specific functions.
Through this, students developed stories by mining the exploratory and explanatory functions of maps and learning the limitations of the datasets and Tableau. For example, questions about layering neighborhood boundaries over city maps or filtering out individual areas by median income brought technical limitations—which in turn limit the exploratory and explanatory functions of the data visual—to the fore. The situatedness of this storytelling ultimately became central to this class activity. Groups discussed the types of questions our corporate partner had raised on the first day of class (questions about gaps in distribution, comparing on-premise and off-premise sales volume, etc.) and explored the capacity of Tableau to help answer those questions and demonstrate findings. Technical and rhetorical affordances abounded.
Future Praxis 2: Data Visuals as Situated Responses
I have inverted the activity and run it in simpler, less-technologically-dependent forms. In one version, I provided students with a fictional dataset for a fictional company (see Figure 6, below). Students were then prompted with a scenario (their need to hire new staff for the upcoming season) and tasked with designing two data visuals to support their argument. Students were given the option to draw the visuals by hand (to approximate scale) or design the visuals with Excel. Students produced a range of creative options, largely preferring to hand-draw their visuals, including bar graphs with layered sales lines, simple pie charts, single infographic numbers, etc. But the most surprising response was from one participant who pushed against the scenario entirely. Interpreting the data, they declared that the rate of missed orders was evidence of a worker slowdown rather than a need for more hires. This activity impressed upon me the tenuousness of meaning within data: the situation (for all but one participant) can often determine how quantitative information is interpreted.
Taken together, these types of activities round out the picture of what Johanna Drucker (2014) called the cybernetic spaces in which humans encounter data, visualized or otherwise. The ability of the situation to hone interpretative meaning, the limitations placed on the explanatory and exploratory power by the technology on hand, the social circulation of data and the ways it can be taken up and reinterpreted: each of these needs to be taken into consideration when we consider how to present students and practitioners across fields of professional or technical communication with questions of how to incorporate quantitative information in their projects.
Practical and Conceptual Limitations
Technical and professional writing in the classroom often sets as a goal the approximation of real-world communicative scenarios. Taken together, however, the case studies and vignettes above point to a difficulty of approximating the real-world uncertainty that comes with the composing (including circulation and uptake) of data and data visualization. The available tools will lead students to mine the explanatory power of databases and visuals or add features for greater persuasive velocity. Setting up such scenarios, in essence, provides students rhetorical sandboxes in which to negotiate their story form through particular emphases.
The cybernetic development and circulation of data pose real difficulties. In the absence of a highly engaged corporate or non-profit partner, such as the one in the mapping lesson outlined above, it is incredibly difficult to mirror the range of interpretations available to publics and stakeholders encountering this work, or to create scenarios in which public(s) can encounter the data with student composers in the room. Setting up these types of course activities requires inventing scenarios for students to respond to or compose their way into.
There are also concerns about obtaining or designing datasets for students. There are some freely accessible datasets. Census data could suit some coursework or individual projects. The newsletter Data Is Plural provides weekly lists of newly released, publicly accessible datasets that range in subject from football scores to burrito density, utilities, and parking meter locations. Such resources can give students a range of ways into developing stories from reliable, real-world datasets.
Ultimately, the uncertainty of this material can be a benefit if the class has ample time to debrief. As has been shown, the development and circulation of data and data visuals is part of their rhetorical complexity, and the end-product that is circulated results from negotiated interpretations of context, meaning, and stories. The ability of publics to bring their own interpretations to the table, the difficulty of contending with technological limitations, and the limitations of the datasets themselves are real-world concerns that students and practitioners working to communicate quantitative information would do well to practice navigating.
References
Aristotle. (2001). Rhetorica. In Richard McKeon & C. D. C. Reeve (Eds.). The basic works of Aristotle (pp. 1325–1455). Modern Library.
Chaput, Catherine. (2010). Rhetorical circulation in late capitalism: Neoliberalism and the overdetermination of affective energy. Philosophy and Rhetoric, 43(1), 1–25. DOI: 10.1353/par.0.0047
Drucker, Johanna. (2014). Graphesis: Visual forms of knowledge production. Harvard University Press.
Duarte, Nancy. (2010). Resonate: Present visual stories that transform audiences. John Wiley & Sons.
Evergreen, Stephanie D.H. (2017). Effective data visualization: The right chart for the right data. Sage.
Gitelman, Lisa. (Ed.). (2013). “Raw data” is an oxymoron. MIT Press.
Gries, Laurie. (2017). Mapping Obama Hope: A data visualization project for visual rhetorics. Kairos: A Journal of Rhetoric, Technology, and Pedagogy, 21(2). http://kairos.technorhetoric.net/21.2/topoi/gries/index.html
Guggenheim, Davis. (Director). (2006). An inconvenient truth (Film). Lawrence Bender Productions.
Halpern, Orit. (2014). Beautiful data: A history of vision and reason since 1945. Duke University Press.
Jones, Natasha. (2016). Found things: Genre, narrative, and identification in a networked activist organization. Technical Communication Quarterly, 25(4), 298–318. DOI: 10.1080/10572252.2016.1228790
Kim, Loel. (2005). Tracing visual narratives: User-testing methodology for developing a multimedia museum show. Technical Communication, 52(2), 121–127.
Knaflic, Cole Nussbaumer. (2015). Storytelling with data: A data visualization guide for business professionals. John Wiley & Sons.
Kostelnick, Charles. (2016). The re-emergence of emotional appeals in interactive data visualization. Technical Communication, 63(2), 116–135.
Welhausen, Candice A. (2015). Power and authority in disease maps: Visualizing medical cartography through yellow fever mapping. Journal of Business and Technical Communication, 29(3), 257–283. DOI: 10.1177/1050651915573942
Wolfe, Joanna. (2010). Rhetorical numbers: A case for quantitative writing in the composition classroom. College Composition and Communication, 61(3), 452–475.