Telling stories with data

by Wei Yin, Research Support & Data Services Librarian, Columbia University Libraries

Why is storytelling with data important?

An article in Forbes argues that data storytelling is the essential skill everyone needs in the big data era. A best-selling book on Amazon, “Storytelling with Data: A Data Visualization Guide for Business Professionals”, stresses the importance of choosing the most effective form of visualization (e.g. tables, graphs, or bar charts) to engage an audience both intellectually and emotionally. In other words, data visualization helps tell a meaningful story. Storytelling with data is not only useful in business; it is also part of people’s daily lives. Strategic storytelling with government-released open data is becoming increasingly popular in city governance, where it helps build campaign platforms and promote legislation and policies at the administrative level. For better storytelling with data, a research paper by two Stanford scholars suggests striking a balance between author-driven and reader-driven visualization. Author-driven visualization is good at delivering a message, while reader-driven visualization lets readers think interactively about the data currently displayed.

Four kinds of free or open-source data visualization tools for storytelling are popular in academia today.

Storytelling with Tableau Public

Tableau brands itself as a powerful business intelligence tool for visual analytics. Rather than simply displaying graphs and tables to an audience, it provides interactive views of the data that help both presenter and audience understand it better. Tableau products include Tableau Desktop (for personal use), Tableau Server (for enterprise use) and Tableau Online (for cloud service). Though these products are not free, Tableau offers Tableau Public, a free edition for publishing visualizations publicly, as well as a one-year free desktop license for students at K12 and postsecondary levels around the world.

Here are featured storytelling examples by Tableau Public. If you have done data visualization in Excel, you probably won’t struggle with Tableau. However, it takes time to develop expertise with new tools.

Storytelling with D3.js

D3 is short for “Data-Driven Documents”. It is a JavaScript library for manipulating documents based on data, and it uses HTML, CSS, and SVG to build almost any type of data visualization you can imagine. The library is free and open source.

Ritchie S. King’s book “Visual Storytelling with D3: An Introduction to Data Visualization in JavaScript” suggests that learning D3.js can be a challenge, but you can find many freely available tutorials and examples online. You may also want to check out the book’s supplementary materials on GitHub.

Storytelling with R

R is widely used in academia and is well known for its data visualization capabilities. Jeff Chen and Star Yang from Commerce Data Service wrote a detailed introduction to both 2D and 3D visualization using R. The 2D visual tools include the packages “ggplot2”, “datatables” and “dygraphs”, and the 3D visual tools include the libraries “Threejs” and “leaflet” (for mapping).

Storytelling with Python

Python is another good programming tool for data scientists because it has extensive built-in functions and libraries. This article compares five essential data visualization libraries (Pandas, Seaborn, Bokeh, Pygal and Plotly) to help you choose the right data visualization tool.

Jeremy Manning has a great GitHub repository for his Fall 2017 course “Storytelling with Data” at Dartmouth. Students are asked to write code in Python and organize it in a Jupyter notebook. For those interested in developing deep expertise, this is a good resource.
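
To make the idea of author-driven visualization a little more concrete, here is a minimal Python sketch of the kind of code that could live in a Jupyter notebook. It is an illustration under stated assumptions, not taken from the course or from any of the libraries’ documentation: the monthly figures are made up, the function names find_story and plot_story are hypothetical, and the chart is drawn with matplotlib, which is not one of the five libraries named above but is the plotting library that Pandas and Seaborn build on.

```python
from statistics import mean

# Hypothetical monthly counts, made up purely for illustration.
MONTHLY_REQUESTS = [
    ("Jan", 120), ("Feb", 135), ("Mar", 160),
    ("Apr", 240), ("May", 180), ("Jun", 150),
]


def find_story(rows):
    """Pick out the point the author wants the audience to notice: the peak."""
    labels = [label for label, _ in rows]
    values = [value for _, value in rows]
    peak_index = max(range(len(values)), key=values.__getitem__)
    return {
        "labels": labels,
        "values": values,
        "peak_index": peak_index,
        "average": mean(values),
        "headline": f"Requests peaked in {labels[peak_index]} at {values[peak_index]}",
    }


def plot_story(rows, path="story.png"):
    """Draw an author-driven chart: muted bars, one highlighted bar, a headline."""
    # Imported here so find_story still works where matplotlib is not installed.
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    story = find_story(rows)
    # Gray out everything except the bar that carries the message.
    colors = ["lightgray"] * len(story["values"])
    colors[story["peak_index"]] = "tab:red"

    fig, ax = plt.subplots()
    ax.bar(story["labels"], story["values"], color=colors)
    ax.axhline(story["average"], linestyle="--", color="gray", label="average")
    ax.set_title(story["headline"])  # the title states the takeaway outright
    ax.set_ylabel("Requests")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling plot_story(MONTHLY_REQUESTS) writes the chart to story.png. The design choice is the author-driven one described earlier: the headline and the single highlighted bar tell the reader what to look at. A reader-driven version would instead expose the data through an interactive tool such as Bokeh or Plotly and let the audience explore it.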