Data Visualization

Data visualization transforms data into visual formats (e.g. charts, graphs, maps, and infographics) to reveal patterns, trends, and insights that may be difficult to detect in raw form. Data visualization makes complex data easier to discover, read, and understand, and supports theme exploration, sensemaking, and communication.

Learning

Data Visualization Design Workflow

1. Identify Audience and Purpose

Data visualization offers a clear and accessible way to communicate complex information to audiences with different levels of familiarity with data science. When designing your visualization, start by asking yourself:

  • Who is my audience? What is their level of familiarity with the topic?
  • Is this visual intended for exploration or explanation?
  • What message am I trying to convey? If I have multiple messages, do they require more than one visualization?
  • Where will this visualization be displayed? Will it appear in a graph, infographic, poster, presentation, or another format?
  • Will the visualization be interactive? If so, what types of interactivity (e.g., filtering, zooming, tooltips) will enhance the audience’s understanding?

Recommended readings:

2. Select, Prepare, and Understand Data

The next step is to gather, prepare, and process your data. You may obtain data from a single source or compile it from multiple sources. In either case, it is crucial to clean and normalize the data. Consider some tools that can help with this:

 
When cleaning your data, start by asking yourself:
 
  • Is my dataset primarily qualitative, quantitative, or a mix of both?
  • What type of data am I working with? Is it geographic, textual, temporal, network, or statistical/numeric?
  • Are there missing values or outliers in the dataset? If so, should they be removed, replaced, or estimated?
  • What is the size and format of my dataset? How many variables does it contain, and how many are actually needed for my analysis?
  • Does the dataset include sensitive or personal information? If so, what strategies will I use to anonymize or clean such data responsibly?
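
A minimal sketch of how the missing-value and outlier questions above could be checked, using only the Python standard library. The sample data, the column name `population_k`, and the helper `profile_column` are hypothetical, and the 1.5 × IQR rule is just one common convention for flagging outliers, not the only valid choice.

```python
import csv
import io
import statistics

# Hypothetical sample data; in practice you would open your own CSV file.
SAMPLE_CSV = """city,year,population_k
A,2020,120
B,2020,135
C,2020,
D,2020,128
E,2020,131
F,2020,990
G,2020,122
H,2020,125
I,2020,133
"""


def profile_column(rows, column):
    """Count missing values and flag outliers in one numeric column.

    Outliers are flagged with the 1.5 x IQR rule (an assumed convention).
    """
    raw = [row[column].strip() for row in rows]
    values = [float(v) for v in raw if v != ""]
    missing = len(raw) - len(values)

    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low or v > high]

    return {"rows": len(raw), "missing": missing, "outliers": outliers}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = profile_column(rows, "population_k")
print(report)
```

A flagged value is a prompt to investigate, not a verdict: whether to remove, replace, or keep it depends on what the value represents in your dataset.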

Recommended readings:

  • What is data cleaning? by IBM explains the importance, benefits, and techniques of data cleaning.
  • Research Data Management (RDM) guideline at DKU outlines how to manage research data responsibly in alignment with the DKU RDM Policy.
  • How to deal with sensitive data by OpenAIRE explains what sensitive data is and how to store it.
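
For the sensitive-data question in this step, one common first measure is to replace direct identifiers with keyed hashes before any visualization work begins. The sketch below is an illustration under stated assumptions: the records, the field names, the key, and the helper `pseudonymize` are all hypothetical. Note that this is pseudonymization rather than full anonymization, since other columns may still allow re-identification.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256).

    The digest is truncated to 16 hex characters for readability.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


# Hypothetical key and records, for illustration only.
# Keep the real key outside the dataset and outside version control.
SECRET_KEY = b"replace-with-a-key-stored-outside-the-dataset"
records = [
    {"student_id": "s001", "score": 88},
    {"student_id": "s002", "score": 92},
    {"student_id": "s001", "score": 75},
]

# Drop the original identifier and keep only the pseudonym.
cleaned = [
    {"participant": pseudonymize(r["student_id"], SECRET_KEY), "score": r["score"]}
    for r in records
]

for row in cleaned:
    print(row)
```

Because the same identifier always maps to the same pseudonym under one key, repeated observations of a participant can still be linked for analysis without exposing the original ID.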

3. Select Visualization Form

Choosing the right graph type is essential for effective data visualization. As a rule of thumb, use one chart per message. Different graph types suit different data types, purposes, and numbers of variables. Consider these tools to help you understand and choose graph types:

4. Select Visualization Elements

Beyond the graph itself, pay attention to the supporting visual elements, especially color, which is often the most significant element of any visual work. Some common principles for choosing colors:

  • Ensure color contrast meets accessibility standards and passes a color-blindness check, especially for visuals intended for the general public.
  • Use no more than 7 colors in a single graph.
  • Tools:
    • Color Brewer, a tool that helps you select accessible and effective color schemes based on your data
    • Colormind, color palette generator
    • Contrast Checker, a tool to verify readability with adequate contrast between content and background.
    • Coblis, a tool used to ensure your color choices remain accessible for people with color vision deficiencies (color blindness)
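
The color principles above can also be checked in a script. The following is a sketch, assuming the "accessibility standards" in question are the WCAG 2.x contrast definitions: it computes the WCAG contrast ratio between each palette color and the background and checks the seven-color limit. The palette, the helper names, and the 3:1 minimum (the WCAG threshold for graphical objects; normal-size text uses 4.5:1) are illustrative assumptions. It does not replace a color-blindness simulation such as Coblis.

```python
MAX_COLORS = 7  # limit suggested in the guidance above


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of an sRGB color given as '#rrggbb' (WCAG 2.x definition)."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Linearize each channel. Older WCAG text uses 0.03928 as the threshold,
    # which gives the same result for 8-bit colors.
    linear = [
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a: str, color_b: str) -> float:
    """WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(color_a), relative_luminance(color_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def low_contrast_colors(palette, background, minimum=3.0):
    """Return the palette colors whose contrast against the background is below minimum."""
    return [c for c in palette if contrast_ratio(c, background) < minimum]


# Hypothetical palette on a white background.
palette = ["#1b9e77", "#d95f02", "#7570b3", "#ffff99"]
background = "#ffffff"

within_limit = len(palette) <= MAX_COLORS
flagged = low_contrast_colors(palette, background)
print("Within color limit:", within_limit)
print("Low-contrast colors:", flagged)
```

Light colors such as pale yellow tend to be flagged on a white background; darkening them or adding an outline are typical remedies.
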

Consider other elements that help your audience better understand your message by asking yourself questions such as:
 
  • Title – Do I need a title for my graph? If so, is my title clear and descriptive for my audience?
  • Legend – Do I need a legend in my graph? If yes, where is the most effective place to position it?
  • Axis – Do I need an axis to represent my data? If so, are the scales and intervals consistent? Note that some graph types (e.g., maps, pie charts, donut charts) do not require axes.
  • Label – Do I need labels for my data points? If so, how should I design them? For example, orientation, number of decimals, or placement?
  • Source – Do I mention where the data comes from, and what citation format should I use? For example, is it research data from an experiment or survey, open data from publicly available datasets or databases, or data from governmental reports?

Recommended readings:

  • Data visualization tips offers best practices for managing data visualization projects from design choices to communication strategies.

5. Share and Receive Feedback

Audiences are the best source of feedback for improving your work. Consider running small-scale user experience tests with 2 or 3 members of your target audience, and evaluate whether they can understand your message solely through your graph. Pay close attention to their questions, moments of surprise, and reactions to specific elements, as these can reveal how effectively your visual communicates.

You do not need to follow every suggestion, but audience feedback can guide your decisions and highlight areas for improvement. In addition, sharing your work with experts (e.g., a Data and Visualization Librarian) or with other user groups (e.g., instructors and teaching assistants) can provide valuable insights and perspectives that further strengthen your visuals.

DKU Library Workshops

We will be offering the Data and Digital Workshop Series during the 25/26 academic year, with more topics and tools to be added over time. Check out Fall 2025 Library Workshops and sign up today!

Software

Microsoft Excel

Microsoft Excel is a tool for creating, editing, and managing spreadsheets. It also provides a simple, accessible way to produce basic data visualizations.

Learning Resources:

Tableau

Tableau is a tool for creating interactive, visual dashboards, making it excellent for data storytelling and presentation.

  • Tableau Public is a free tool that allows publishing visualizations online. The Desktop version can be downloaded here.

Learning Resources:

Power BI

Microsoft Power BI is an analytics platform designed for scalable, visual reporting, automated dashboards, and working with large datasets or multiple data sources.

  • DKU students, faculty, and staff currently have access to Power BI only through the web version in Microsoft Teams.

Gephi

Gephi is a leading free software tool for visualizing and analyzing complex networks. Network analysis (also known as social network analysis) is commonly used in domains including social networks, bibliometrics, epidemiology, bioinformatics, complex systems, and text analysis.

  • Gephi can be downloaded from here.

Cytoscape

Cytoscape is a free software platform for visualizing complex networks and integrating them with a wide range of attribute data.

  • Cytoscape can be downloaded from here.

DKU Support

For data-related support, contact the Data and Visualization Librarian, Siti Lei (siti.lei@dukekunshan.edu.cn).