The art of numbers: Who knew Big Data could look so cool?
A bright orange line emerges from the horizon at the left side of the screen. Getting brighter, it arcs upward—and then suddenly turns ashen gray before falling back to the horizon. More lines follow the first. Thousands upon thousands of lines. The visual is at once beautiful and daunting.
That’s because the lines aren’t merely lines. Each one represents the life of an American who was suddenly and violently snuffed out by gunfire. Each one is just a statistic in a database, but the “socially conscious” data visualization firm Periscopic brought the numbers to life in a dramatic illustration that makes a powerful statement: Some 7,500 people have been killed by guns so far in 2013 (the orange lines), and those victims lost a projected 330,000-plus total years of life, liberty, and the pursuit of happiness (the gray lines).
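To put those two totals in proportion, the arithmetic can be sketched in a few lines of Python. This is only an illustration built on the rounded figures quoted above. The function names are hypothetical, and the single example arc uses made-up ages rather than Periscopic's data or method.

```python
# Rounded figures quoted in the article; "330,000-plus" is treated as 330,000.
DEATHS = 7_500
YEARS_LOST = 330_000


def average_years_lost(total_years, total_deaths):
    """Average number of projected years lost per victim."""
    return total_years / total_deaths


def arc_segments(age_at_death, projected_age):
    """Split one life into (years lived, years lost).

    In the visualization's terms: the orange part, then the gray part.
    Both ages are hypothetical inputs, not real records.
    """
    return age_at_death, max(projected_age - age_at_death, 0)


print(f"{average_years_lost(YEARS_LOST, DEATHS):.0f} years lost per victim, on average")
print(arc_segments(age_at_death=29, projected_age=81))
```

On these rounded numbers, each gray line stands for roughly 44 years, on average, that were never lived.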
Such is the power of the data visualization, a unique blend of graphic art and data science that’s helping both researchers and everyday people make sense of the ever-growing databases that increasingly influence our lives.
When done well (as in Periscopic’s gun-death example), data visualizations can express the meanings hidden in massive data sets in an immediate and intuitive way. “I was shocked. I realized the scale I was looking at, and then I saw the volume,” says Periscopic cofounder Dino Citraro. “It scared me.”
Helping Big Data ‘speak’
Of course, infographics are nothing new. People have made careers out of visualizing data since William Playfair created his first charts and graphics in the 18th century, and since Edward Tufte popularized modern information graphics in the 1980s. But today’s data visualizations wrangle much more information, and they’re far more sophisticated in their use of multimedia graphics and computer engineering platforms.
Big Data is helping to fuel the revolution. Governments, corporations, and individuals are collecting more and more data about people, behaviors, demographics, habits, and preferences. All this information comes from a multitude of sources, including satellites, wearable fitness devices, surveillance cameras, the Internet, cell networks, and social media. A new NSA data warehouse, opening soon in Utah, will store exabytes of surveillance data, for example. (An exabyte is roughly 1 billion gigabytes.)
Databases in both the public and private sectors are growing so big and multidimensional that perceiving patterns and trends within them is becoming a huge challenge. Enter the new breed of visualizations, which make it much easier to see patterns and trends that would otherwise be hidden inside giant spreadsheets. They can also express big-picture messages that raw numbers can’t easily explain.
The process of making visualizations varies, but generally it starts with the collection of raw data, which the visualization creators then analyze and model to tease out initial shapes and patterns. The creators can use a variety of tools for this task, such as the commercial business platform Tableau, free statistical software such as R, and the open-source package Processing. Artists then bring the visuals to life, with the help of Adobe design software and even movie-industry tools such as Autodesk Maya. Finally, the visualization is coded, rendered, and served up on websites and apps for viewing on a PC or tablet.
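The collect, analyze, and render steps described above can be sketched in miniature. The example below is not how Tableau, R, or Processing work. It is a plain Python illustration with made-up records and hypothetical function names, and it draws a text bar chart in place of finished artwork.

```python
from collections import Counter

# Step 1, collect: hypothetical raw records of (month, event id).
RAW = [
    ("Jan", "a"), ("Jan", "b"),
    ("Feb", "c"),
    ("Mar", "d"), ("Mar", "e"), ("Mar", "f"),
]


def aggregate(records):
    """Step 2, analyze: count records per month, in first-seen order."""
    counts = Counter()
    for month, _event in records:
        counts[month] += 1
    return dict(counts)


def render(counts, width=20):
    """Step 3, render: scale each count to a bar of '#' characters."""
    peak = max(counts.values())
    lines = []
    for label, n in counts.items():
        bar = "#" * round(n / peak * width)
        lines.append(f"{label} {bar} {n}")
    return "\n".join(lines)


print(render(aggregate(RAW)))
```

Even at this toy scale, the pattern (a spike in March) is easier to spot in the bars than in the raw list, which is the point of the whole process.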
“They help people who don’t have backgrounds in data see what data is,” says Nathan Yau, author of the Flowing Data visualization blog and the recently released book Data Points: Visualization That Means Something.
Good visualizations are apolitical
In addition to being beautiful, visualizations seem trustworthy and objective because they’re rooted in hard numbers—lots and lots of numbers. They allow visitors to explore raw data to discover their own story, says Periscopic’s Citraro.
When Periscopic released its gun-violence visualization in February 2013, it received the expected accolades from gun opponents and criticism from gun advocates. But the irrefutability of the data, as well as the sheer volume of it, allowed the numbers to speak for themselves, Citraro says. And as a result, the visualization created a new conversation between the two sides of a polarizing issue.
“It’s not drawing those normal lines in the sand that we tend to draw,” says Citraro’s partner and Periscopic cofounder Kim Rees.
Today’s visualization experts form a passionate, collaborative, and vibrant community. They are artists, engineers, coders, statisticians, and scientists. They have blogs. They converse on Twitter. They get together informally at data-visualization meet-ups throughout the country, and more formally at annual conferences such as the Eyeo Festival and OpenVis Conf (Open Web Data Visualization Conference). Many of them create visualizations for academic, corporate, and personal interests to help advance the form.
“The camaraderie is strange,” says Periscopic’s Citraro. “They don’t have reservations about sharing their ideas. It’s so refreshing.”
The community has some corporate champions. General Electric is a cofounder of Visualizing.org, a site for creators and enthusiasts to share ideas. GE also employs visualizations to communicate the impact of its energy, health, and transportation technologies, which are largely hidden from the public eye.
In addition, GE hosts contests that challenge data enthusiasts to use Internet-collected data to solve key industry problems, such as flight delays. A current contest, Flight Quest, asks participants to use flight and weather data to help airlines shave minutes (and money lost) from crowded flight schedules. Phase I of Flight Quest resulted in a data visualization that showed the economic benefit of a medium-size airline fleet saving 2.5 minutes of time per flight for one year: $26 million.
Visualizing natural science
Visualizations are particularly effective in helping the public understand natural-sciences data. And for 25 years, the NASA/Goddard Space Flight Center Scientific Visualization Studio has been creating thousands of them. “Our data is real physical data of the real world,” says Studio leader Horace Mitchell.
Changes in environmental data collected over time can be hard to notice in their raw form. Such is the case with Arctic sea ice, which melts throughout the summer and reaches its annual minimum size each September. Scientists use this benchmark to measure the effect of climate change on the Earth. But watching sea ice melt can be as interesting as watching paint dry—unless you can help people see it in context.
In August 2013, Mitchell’s group released its most recent Arctic Sea Ice update. It uses information from a satellite far above the Earth to show the daily shifting pattern of melting ice in the Arctic Ocean over a three-month period. This year’s melt won’t break any records, but if you want to feel more unsettled, look at the visualization of the aggregated data over a 32-year period.
In addition to creating public awareness, data visualizations are also an important tool for scientists themselves. NASA’s Curiosity Mars Rover, for example, enables scientists to explore the surface of Mars from 140 million kilometers away. But scientists need to see the data in as realistic a form as possible to be able to understand it.
“Is it silicate rock? Is it volcanic rock? Was this rock ever underwater? We have to be as accurate as we can,” says Eric De Jong, who leads the Rover’s visualization team at NASA’s Jet Propulsion Laboratory.
So the Rover’s visualizations use images fortified with extra data about what’s in the photo. Because colors and textures look different in Mars’s atmosphere, the team uses color calibration and computer modeling to make adjustments.
Visualizations beyond the Web
Not all visualizations sit in front of you on a screen. Some let you walk right inside them.
The Electronic Visualization Laboratory at the University of Illinois at Chicago created an immersive environment called CAVE, or the Cave Automatic Virtual Environment (the “cave” part is a nod to Plato’s allegory of the cave in The Republic). CAVE allows scientists and engineers to step inside a data visualization the size of a room, wearing the same sort of 3D glasses people use to watch 3D TV. Through the visualization, users can explore an underground lake based on data collected by a roving robot, for example. Or they can examine highly detailed car design prototypes with everything displayed in actual size.
The original CAVE, released in 1992, used projectors to throw images onto the walls of a room. The new CAVE2 (unveiled in October 2012) shows images on LCD flat-panel screens that form the walls of the environment. The result is brighter images that take better advantage of the physical space.
“We think of CAVE2 as a special kind of lens for looking at big data,” says Electronic Visualization Laboratory director Jason Leigh, who created CAVE2 along with UIC associate professor Andrew Johnson.
In a neuroscience application of CAVE2, scientists can explore a map of neural connections in the brain (called a connectome) based on data taken from an MRI scan. Resembling some sort of neurological superhighway, green, red, and blue streamlines depict white matter fiber tracts in the brain. The colors indicate the primary direction of the lines.
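One common way to make color indicate direction can be sketched as follows. The article does not say which axis maps to which color in CAVE2, so the assignment here (red for left-right, green for front-back, blue for up-down) is an assumption borrowed from a convention widely used in diffusion imaging, and the function names are hypothetical.

```python
import math


def direction_to_rgb(dx, dy, dz):
    """Map a 3D direction to an (R, G, B) tuple of ints in 0..255.

    Assumed convention: x (left-right) -> red, y (front-back) -> green,
    z (up-down) -> blue. The sign is ignored, so a fiber running
    left-to-right gets the same color as one running right-to-left.
    """
    length = math.sqrt(dx * dx + dy * dy + dz * dz)
    if length == 0:
        return (0, 0, 0)  # no direction, so no color
    return tuple(round(255 * abs(c) / length) for c in (dx, dy, dz))


def dominant_color(dx, dy, dz):
    """Name the color channel of the strongest direction component."""
    names = ("red", "green", "blue")
    components = [abs(dx), abs(dy), abs(dz)]
    return names[components.index(max(components))]


print(direction_to_rgb(1, 0, 0))         # a purely left-right segment
print(dominant_color(0.1, 0.2, -0.9))    # a mostly up-down segment
```

Under this scheme, a streamline that curves from one axis toward another shifts gradually between colors, which is what gives such maps their highway-like look.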
Turning the CAVE2 environment into a 3D visualization of the brain required an interdisciplinary team. Olusola Ajilore, a UIC psychiatrist conducting research on how depression affects the brain, developed the map of the bundles of neural fibers. Electronic Visualization Lab faculty members and students took his data and modeled it in virtual reality.
The future of visualizations
As Big Data gets bigger, demand for data visualizations will grow, and the visualizations themselves will become better and better. Jer Thorp, former data artist in residence at the New York Times and current cofounder of the Office for Creative Research, notes three reasons why.
First, display screens for visualizing data—think video walls and projected surfaces—are getting better and bigger. Second, gestural interfaces (screens that respond to body movement) will offer more flexible and intuitive ways to work with big data sets. Third, software tools, particularly open-source tools, are getting better and more accessible.
Visualization experts say that as visualization tools improve and become more common, people will pressure organizations and institutions to make their data sets more publicly available. Fast-forward to a future presidential election during which you can use cutting-edge visualization tools to evaluate all the facts and figures the pundits throw out—in the context of the whole data set from which the pundits selectively chose them.
What if visualizations of big data made us smarter about what people were saying about the world around us? Visualize that.