Visio Chart

Charts & Graph For Data Visualization

Sankey diagrams are among the most effective tools for visualizing data flows. Data analysts and business owners often use them to represent the flow of data within a particular environment because they show both the direction and the size of each flow. At first glance, a Sankey diagram can appear complex and difficult to understand. However, this is far from the truth: it is one of the simplest data visualizations you can use to glean insights from your data.

What is a Sankey diagram?

A Sankey diagram is a tool for data visualization that demonstrates how data moves from one point to another. The origin of the data is referred to as the source node, and the destination is referred to as the target node. In a Sankey diagram, these two essential components are usually drawn as rectangles. Keeping this in mind simplifies the interpretation of a Sankey diagram.

A curved path, known as the link, represents the movement of data inside a Sankey diagram. The width of a link is directly proportional to the amount of data passing between the two points, so gauging the size of a flow is as simple as comparing link widths. Further, a Sankey diagram can illustrate the flow of virtually anything, as long as there is a clear understanding of the data being used and the overall objective.
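Sankey links are conventionally drawn with a thickness that scales with the flow value. As a rough illustration of that proportionality, the sketch below maps flow values to drawing widths. The function name `link_widths`, the pixel scale, and the sample values are all assumptions made for this example and do not come from any particular tool.

```python
def link_widths(values, max_width_px=40.0):
    """Scale each flow value to a drawing width.

    The largest flow gets max_width_px; every other flow is scaled
    linearly, so width stays proportional to value.
    """
    largest = max(values)
    if largest <= 0:
        raise ValueError("at least one flow must be positive")
    return [max_width_px * v / largest for v in values]


# Hypothetical flow amounts, in arbitrary units.
widths = link_widths([50, 25, 100])
```

A flow that is half as large as another ends up with half the width, which is what lets a reader compare quantities by eye.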

There are countless situations where a Sankey diagram may come in handy. Among many other things, it can be used to visualize energy management, product costs in a business, or even the flow of energy within a steam engine, which is the purpose the diagram was originally used for.

Essentially, when visualizing complex data sets and presenting data reports to non-technical audiences, a Sankey diagram is one of the most effective visualizations available, provided it is done correctly.

How is a Sankey diagram used?

Now that we know the ‘what?’ of a Sankey diagram, let us move on to the ‘how?’. When using a Sankey diagram, certain general steps must be followed:

Identify the system or process to be visualized: This can be anything from the flow of energy in a building to the flow of materials in a manufacturing process or the movement of goods and services in an economy.

Gather data: You’ll need to collect data on the system’s or process’s numerous inputs, outputs, and flows. This data may include the quantity of energy consumed in different portions of a building, the amounts of raw materials required at various stages of a manufacturing process, or the dollar values of an economy’s imports and exports.

Choose a Sankey diagram tool: There are numerous software tools available for creating Sankey diagrams, including Microsoft Excel, Google Sheets, and specialized data visualization software such as Tableau or D3.js.

Input your data: Enter your data into the Sankey diagram tool of your choice. Depending on the tool, you may need to format your data in a specific way.

Customize your Sankey diagram: Once you’ve input your data, you can personalize your Sankey diagram to match your specific needs. This includes choosing color schemes, adjusting the size and shape of the figure, and adding labels or annotations.

Interpret your results: After you’ve generated your Sankey diagram, you can use it to find patterns and linkages within the system or process you’re visualizing. The diagram may also be used to indicate areas where improvements or adjustments could be made to optimize the system or process.
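As a sketch of the data-entry step above: several tools (Plotly's Sankey trace is one example) expect a list of node labels plus parallel arrays of source indices, target indices, and values, rather than rows of names. The helper below converts rows into that shape. The function name, the dictionary keys, and the building energy figures are illustrative assumptions, not part of any specific tool's API.

```python
def to_node_link_arrays(flows):
    """Convert (source, target, value) rows into a label list plus index arrays."""
    labels = []
    index = {}
    for source, target, _ in flows:
        for name in (source, target):
            if name not in index:
                # Number nodes in order of first appearance.
                index[name] = len(labels)
                labels.append(name)
    return {
        "labels": labels,
        "source": [index[s] for s, _, _ in flows],
        "target": [index[t] for _, t, _ in flows],
        "value": [v for _, _, v in flows],
    }


# Hypothetical energy flows in a building (kWh); the numbers are made up.
flows = [
    ("Grid", "Heating", 120),
    ("Grid", "Lighting", 40),
    ("Solar", "Lighting", 15),
]
arrays = to_node_link_arrays(flows)
```

The resulting arrays can then be passed to whichever tool you chose, after renaming the keys to match what that tool expects.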

The data type used by a Sankey diagram

Many people struggle to understand the type of data needed to create a Sankey diagram, and this ultimately deters them from using it effectively. In reality, Sankey diagrams are typically used to depict weighted networks, that is, flows in which every connection carries a quantity.

Simply put, any data structure that records a quantity moving between entities can be drawn as a Sankey diagram, provided all the required data is available. The diagram’s nodes are then shown in two or more groups, each of which represents a different stage in the process.

Analysis and the generation of key insights become considerably simpler when the data is well characterized.
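To make the idea of nodes grouped into stages concrete, here is a minimal sketch that treats the flows as a weighted network and assigns each node to a stage. It assumes the flows contain no cycles, and it places each node one stage after its latest predecessor, which is only one of several possible layout rules. The function name and the sample energy data are hypothetical.

```python
from collections import defaultdict


def assign_stages(flows):
    """Return {node: stage}; stage 0 holds nodes with no incoming flow.

    Assumes the (source, target, value) rows contain no cycles.
    """
    incoming = defaultdict(set)
    nodes = set()
    for source, target, _ in flows:
        incoming[target].add(source)
        nodes.update((source, target))

    stages = {}

    def stage_of(node):
        if node not in stages:
            parents = incoming.get(node, ())
            # One stage after the latest predecessor, or 0 for a pure source.
            stages[node] = 1 + max(map(stage_of, parents)) if parents else 0
        return stages[node]

    for node in nodes:
        stage_of(node)
    return stages


# Hypothetical weighted network; the values are illustrative only.
stages = assign_stages([
    ("Coal", "Electricity", 60),
    ("Gas", "Electricity", 40),
    ("Electricity", "Homes", 70),
    ("Gas", "Homes", 20),
])
```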

Benefits of Sankey diagrams

Simple to understand: Sankey diagrams offer a highly visual and simple way to portray complicated systems or processes, so users can more easily comprehend how information, energy, or materials move through a system.

Effective communication: Sankey diagrams may be used to convey complicated information or concepts in a simple, straightforward, and engaging manner. They make it easier for users to interact with the data since they may be used to emphasize important insights or develop a narrative.

Identification of bottlenecks: Sankey diagrams can be used to locate bottlenecks and other areas of inefficiency in a system or process. Visualizing the flow of data, energy, or material makes it easier to pinpoint the parts of the system that could be optimized.

Identification of improvement opportunities: Sankey diagrams may also be used to indicate areas where there is potential for improvement, such as cutting waste or boosting efficiency. Here too, visualizing the flow of data, energy, or material makes it easier to identify where changes would achieve better results.

Comparison of alternative scenarios: Sankey diagrams may be used to compare various scenarios, including those that occur before and after a change is made to a system or process. As a result, assessing the change’s effects and determining its success or failure is made simpler.
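The scenario comparison mentioned above can also be done numerically before anything is drawn. The sketch below subtracts the "before" flows from the "after" flows link by link. The function name and the manufacturing figures are assumptions made up for illustration.

```python
def flow_changes(before, after):
    """Return {(source, target): after - before} for every link in either scenario.

    A link missing from one scenario is treated as a flow of 0.
    """
    keys = set(before) | set(after)
    return {k: after.get(k, 0) - before.get(k, 0) for k in sorted(keys)}


# Hypothetical material flows before and after a process change.
before = {("Raw material", "Product"): 80, ("Raw material", "Waste"): 20}
after = {("Raw material", "Product"): 90, ("Raw material", "Waste"): 10}
changes = flow_changes(before, after)
```

In this made-up case, the change moves 10 units from waste to product, which two side-by-side Sankey diagrams would show as a thinner waste link and a thicker product link.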

The components of a Sankey diagram

A Sankey diagram is composed of three major components:

  • Nodes
  • Flows
  • Labels

Nodes: Nodes are crucial junctures on a Sankey diagram. They depict the various entities that are being tracked or analyzed. Nodes, for instance, might represent various energy sources including coal, oil, gas, and renewables in an energy flow diagram.

Flows: Each flow in a Sankey diagram represents the movement or transfer of a certain amount (such as energy, materials, or money) between nodes. The width of each flow is proportional to the quantity transferred.

Labels: Labels provide further details about the nodes and flows, such as the entity’s name, the amount being moved, or the measurement units. They are typically positioned next to the nodes they describe.

Customized labels on a Sankey diagram

Labels in a Sankey diagram can be altered to include more details about the nodes, links, or flows. Some ways to alter the labels in a Sankey diagram are:

  1. Changing the font size and style: This can help emphasize important information or make the labels easier to read.
  2. Changing the color: Colors can be used to emphasize certain elements or to distinguish between different diagram components.
  3. Changing the label content: Label content can be modified to give more details about the nodes, links, or flows, for instance by adding numerical values, percentages, or units of measurement.
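As an illustration of the label-content idea, the following sketch builds a label that combines a name, a value, an optional unit, and a percentage of a total. The function name, the label layout, and the sample numbers are assumptions. Most tools accept a plain string as the label, but how you pass it in depends on the tool.

```python
def format_label(name, value, total, unit=""):
    """Build a label such as 'Heating: 120 kWh (60.0%)'."""
    share = 100 * value / total
    suffix = f" {unit}" if unit else ""
    # The ',' format spec adds thousands separators to the value.
    return f"{name}: {value:,}{suffix} ({share:.1f}%)"


# Hypothetical values, for illustration only.
heating = format_label("Heating", 120, 200, "kWh")
lighting = format_label("Lighting", 1500, 6000)
```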

Data Format on a Sankey Diagram

Different data formats may be used to produce Sankey diagrams, but the most common format uses three data columns from a dataset: one for the “input” column, one for the “output” column, and one for the values corresponding to each pairing. In this simple form of the chart, each input item connects to one or more output items, but the output items do not link to other output items.
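A minimal sketch of reading the three-column format, using only Python's standard `csv` module. The column names `input`, `output`, and `value`, along with the sample rows, are assumptions chosen to mirror the description above. The second helper checks the single-level property, namely that no item appears on both the input side and the output side.

```python
import csv
import io


def read_flows(csv_text):
    """Parse rows of input,output,value into (input, output, value) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["input"], row["output"], float(row["value"])) for row in reader]


def is_single_level(flows):
    """True when no item is used both as an input and as an output."""
    inputs = {source for source, _, _ in flows}
    outputs = {target for _, target, _ in flows}
    return inputs.isdisjoint(outputs)


# Hypothetical data; the column names are an assumption.
sample = """input,output,value
A,X,5
A,Y,3
B,X,2
"""
flows = read_flows(sample)
```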

There are additional formats that may be used to create multi-level Sankey diagrams. Because a multi-level Sankey diagram has more subcategories, its structure can be more intricate. Nonetheless, it enables a more comprehensive depiction of the flow of data or resources across many levels.

Sankey Diagram Examples – Simple and Multi-level

As described above, there are two main types of Sankey diagrams: simple diagrams, which have a single level, and multi-level diagrams, which have multiple levels.

For a clearer understanding, let’s start with an example of a basic Sankey diagram. (The example covers trade flows between major trading nations.)

Exports | Imports | Amount ($)

Let’s now move on to a multi-level Sankey diagram example. (The example covers monthly budgeting.)

Level 1 | Level 2  | Level 3       | Amount
Salary  | Expenses | Utility Bills | 25,000
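A multi-level row like the budget row above can be broken into ordinary source-to-target links, one per pair of adjacent levels, and links shared by several rows can be summed. The sketch below shows one way to do that. The function name is hypothetical, and the second row (Rent, 40,000) is made-up data added only to show the aggregation.

```python
from collections import defaultdict


def rows_to_links(rows):
    """Split multi-level rows into links between consecutive levels.

    Each row is (level_1, level_2, ..., amount). Amounts on links that
    appear in more than one row are added together.
    """
    totals = defaultdict(int)
    for *levels, amount in rows:
        for source, target in zip(levels, levels[1:]):
            totals[(source, target)] += amount
    return dict(totals)


rows = [
    ("Salary", "Expenses", "Utility Bills", 25000),  # row from the table above
    ("Salary", "Expenses", "Rent", 40000),           # hypothetical extra row
]
links = rows_to_links(rows)
```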

How Do I Create a Sankey Diagram Online?

In contrast to the 1800s, when Sankey diagrams were first produced, creating a personalized chart today is quick and easy.

Sankey diagrams may be created online using a variety of third-party tools. Tableau, Plotly, Infogram, Google Charts, and Flourish are a few popular platforms where Sankey charts can be created. It is as simple as entering your data and customizing the colors and labels to suit your needs. Once the diagram is complete, users have the option to export it in a variety of formats.


Can a Sankey diagram be made using Excel?

Yes. You can construct a Sankey diagram in Excel by using add-ins or third-party tools, such as FunFun and Datawrapper.

Are there any rules to keep in mind when making a Sankey chart?

When making a Sankey chart, one should bear in mind a few general guidelines:

  1. Consistent Flow Direction: Data should typically flow in a single direction, usually from left to right. This aids users in comprehending the diagram’s flow.
  2. Proportional Widths: The thickness of each flow line should be proportional to the quantity being transferred.
  3. Colors: Color can be used to distinguish between flows or to draw attention to certain chart nodes.
  4. Layout: Sankey charts should have a unified design and use consistent flow patterns. This makes the chart simpler to read.
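Some of these guidelines can be checked on the data before drawing. The sketch below flags non-positive values (which cannot be shown as a proportional width), self-loops, and cycles (which would force some flow to run against the main direction). The function name and the exact set of checks are assumptions. Real tools may enforce different or additional rules.

```python
def check_flows(flows):
    """Return a list of problems found in (source, target, value) rows."""
    problems = []
    graph = {}
    for source, target, value in flows:
        if value <= 0:
            problems.append(f"non-positive value on {source} -> {target}")
        if source == target:
            problems.append(f"self-loop on {source}")
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())

    # Depth-first search: reaching a node that is still being visited
    # means the flows loop back on themselves.
    state = {}  # node -> "visiting" or "done"

    def has_cycle(node):
        if state.get(node) == "visiting":
            return True
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        found = any(has_cycle(nxt) for nxt in graph[node])
        state[node] = "done"
        return found

    if any(has_cycle(node) for node in graph):
        problems.append("flows contain a cycle")
    return problems
```

An empty list means none of these particular problems were found. It does not guarantee that the chart will be easy to read.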

What does a Sankey diagram even depict?

Sankey diagrams illustrate how data travels from one set of nodes to another; the thickness of the lines shows how much data moves between each pair of nodes. Further, these nodes can represent any variety of data.

Are Sankey and alluvial diagrams the same?

Both Sankey and alluvial diagrams are forms of flow diagrams that depict how data flows. The two charts differ in a few ways:

  1. Flow of data: Sankey diagrams have a left-to-right data flow, unlike alluvial diagrams, where the direction of the flow may be altered and merged in many ways.
  2. Shape of node: Sankey charts use rectangular shapes to represent nodes, whereas alluvial diagrams use horizontal bars that are divided into smaller segments.
  3. Thickness of lines: In Sankey diagrams, the thickness of the lines represents the amount of data flowing between nodes. In alluvial diagrams, the thickness of the bars represents the number of observations that fall into each segment.
  4. Application: Sankey diagrams are typically used to portray continuous data types such as energy or traffic flows, while alluvial diagrams are commonly used to display categorical or discrete data types such as demographic data or survey results.
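To illustrate the observation counts that set bar thickness in an alluvial diagram, the sketch below tallies how many categorical records fall into each pair of categories. The function name, the field names, and the survey records are hypothetical.

```python
from collections import Counter


def count_pairs(records, left, right):
    """Count the observations sharing each (left, right) category pair."""
    return Counter((record[left], record[right]) for record in records)


# Hypothetical survey responses, for illustration only.
records = [
    {"age_group": "18-34", "answer": "Yes"},
    {"age_group": "18-34", "answer": "Yes"},
    {"age_group": "18-34", "answer": "No"},
    {"age_group": "35+", "answer": "No"},
]
counts = count_pairs(records, "age_group", "answer")
```

Each count is a number of observations rather than a measured quantity such as energy, which is the distinction drawn in the comparison above.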
