Mastering the Art of Sankey Charts: Creating Smooth Transitions with Rank and Sigmoid Functions
- Sheba Alice Prathab
- Sep 25, 2024

Sankey charts are a powerful tool for visualizing flows and relationships between different nodes, providing insights into how data moves from one point to another. A crucial aspect of creating effective Sankey charts is generating smooth curves between source and target nodes. This is achieved through calculated fields that leverage the concepts of Rank and Sigmoid functions. In this article, we will explore the components involved in creating these smooth transitions, emphasizing the importance of using cumulative values and percentages.
Why is Unioning Tables a Mandatory Step for Sankey Charts?
Unioning tables is essential in Sankey charts because it combines two sets of related data—such as source and target nodes—into a single dataset. This allows us to visualize the flow of values between these nodes effectively. Without unioning, the Sankey chart wouldn’t have a complete structure to visualize the relationship between data points.
Example: Sales Between Regions and Segments
Let's say we have two tables: one representing Regions and another representing Segments. The Region table contains sales data by geographical regions (e.g., North, South, East, West), while the Segment table contains sales by customer segments (e.g., Consumer, Corporate, Home Office).
Region Table:
Segment Table:
Without unioning these tables, you would have two separate datasets, making it impossible to visualize the sales flow between regions and segments in one Sankey chart. By unioning the Region and Segment tables, we create a combined dataset that tracks the flow of sales from each region to each customer segment.
Unioned Table:
This unioned dataset allows us to visualize the flow of sales from Regions to Segments in a single Sankey chart. For instance, the chart could show how much of the total sales in the North region went to the Consumer segment, thereby illustrating a complete flow of data.
By unioning the tables, we ensure that the flow of data between Regions and Segments is displayed in a continuous and accurate manner in the Sankey chart. Without it, the relationships between the regions and segments would be disjointed and unclear.
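To make the union step more concrete, here is a minimal sketch in Python. It only illustrates the idea of stacking two tables into one dataset. The sales figures, the `table_name` column, and the helper `union_tables` are made-up placeholders for illustration, not the data or the method used in this article. In a BI tool, the union would normally be configured in the data source rather than written as code.

```python
def union_tables(tables):
    """Stack rows from several named tables into one list of rows.

    `tables` maps a table name to a list of dict rows. Columns missing
    from a row are filled with None, and a hypothetical `table_name`
    column records which table each row came from.
    """
    # Collect every column name, preserving first-seen order.
    columns = []
    for rows in tables.values():
        for row in rows:
            for key in row:
                if key not in columns:
                    columns.append(key)

    unioned = []
    for name, rows in tables.items():
        for row in rows:
            combined = {col: row.get(col) for col in columns}
            combined["table_name"] = name
            unioned.append(combined)
    return unioned


# Placeholder figures for illustration only.
region_rows = [
    {"Region": "North", "Sales": 400},
    {"Region": "South", "Sales": 600},
]
segment_rows = [
    {"Segment": "Consumer", "Sales": 700},
    {"Segment": "Corporate", "Sales": 300},
]

unioned = union_tables({"Region": region_rows, "Segment": segment_rows})
for row in unioned:
    print(row)
```

Every row from both tables ends up in one dataset, and the `table_name` tag lets later calculations tell the source side from the target side.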
Key Components of Smooth Curve Generation
To create smooth curves in Sankey charts, several calculated fields play distinct roles. The following table outlines each field's purpose, contribution, and its position in the graph.
The Importance of Using Cumulative Values and Percentages
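Using cumulative (running-total) values for Rank1 and Rank2 stacks the flows along the Y-axis so that each one begins where the previous one ends, which makes it easier to see how much each node contributes to the overall flow. Expressing these ranks as percentages of the total puts every flow on the same 0 to 100% scale and gives clearer context for the data's significance.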
Example of Percentage Usage
Consider a scenario where Node A generates 100 units, Node B generates 200 units, and Node C generates 700 units. Instead of displaying these absolute values, presenting them as percentages of the total (1000 units) offers better insight:
Node A:
Value: 100 units
Percentage: 100 / 1000 × 100 = 10%
Node B:
Value: 200 units
Percentage: 200 / 1000 × 100 = 20%
Node C:
Value: 700 units
Percentage: 700 / 1000 × 100 = 70%
Using percentages facilitates relative comparisons, making it easier to identify trends and contributions across different nodes in the Sankey chart.
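The percentage calculation above, together with the running total that cumulative ranks rely on, can be sketched in a few lines of Python. The function name `percent_and_cumulative` is hypothetical, and the node values are the ones from the example. Treat it as an illustration of the arithmetic rather than the exact calculated fields used in the chart.

```python
def percent_and_cumulative(values):
    """Return (name, percent of total, cumulative percent) for each node."""
    total = sum(values.values())
    result = []
    running = 0.0
    for name, value in values.items():
        pct = value * 100 / total   # share of the total, in percent
        running += pct              # running total of the shares so far
        result.append((name, pct, running))
    return result


nodes = {"Node A": 100, "Node B": 200, "Node C": 700}
for name, pct, cum in percent_and_cumulative(nodes):
    print(f"{name}: {pct:.0f}% (cumulative {cum:.0f}%)")
```

For the three nodes in the example, the cumulative shares come out as 10%, 30%, and 100%, so the last node always ends at the top of the scale.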
Utilizing Rank1, Rank2, and Sigmoid
In a Sankey chart, Rank1, Rank2, and Sigmoid can be utilized independently on the Y-axis, but they are typically combined to generate smooth curves between the source and target nodes. Let’s break down their separate uses:
Rank1: This can be represented alone on the Y-axis to show the starting position of the source node.
Rank2: Similarly, Rank2 can indicate the position of the target node on the Y-axis.
Sigmoid: This function can create a smooth transition on its own on the Y-axis, resulting in a smooth "S-shaped" flow without considering the ranks.
While individual usage of Rank1, Rank2, and Sigmoid can achieve basic functionality, they do not create the visually appealing smooth curve that defines effective Sankey charts. This is why they are often combined into a calculated field to achieve that seamless flow.
Understanding the Curve Formula in a Sankey Chart
The formula used to create smooth transitions in a Sankey chart is:
Curve = [Rank1] + ([Rank2] − [Rank1]) × [Sigmoid]
Breaking Down the Formula
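Rank1: Represents the starting position of the source node on the Y-axis. For example, if Rank1 equals 2, the flow begins at position 2.
Rank2: Represents the ending position of the target node on the Y-axis. For example, if Rank2 equals 5, the flow ends at position 5.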
Sigmoid: The Sigmoid function smooths the transition between the source (Rank1) and the target (Rank2). It maps the T value (which typically ranges from -6 to +6) to a value between 0 and 1. When T = -6, the Sigmoid value is about 0.0025, so the curve starts almost exactly at Rank1 (the source); as T approaches +6, the value rises to about 0.9975 and the curve settles toward Rank2. The Sigmoid formula is:
Sigmoid(T) = 1 / (1 + e^(−T))
This function helps create the smooth "S" curve between the two nodes.
Combining Them: The final formula adjusts the Y-axis position of the curve based on the progression of T. Initially, the curve starts at Rank1 (when the Sigmoid value is near 0). As T increases, the curve gradually moves toward Rank2. The formula used to combine Rank1, Rank2, and Sigmoid is:
Curve = Rank1 + (Rank2 − Rank1) × Sigmoid(T)
Rank1 defines where the flow starts on the Y-axis.
(Rank2 - Rank1) determines how far the curve needs to move from the source to the target.
Sigmoid(T) controls how smoothly the curve transitions from Rank1 to Rank2, based on the value of T.
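The two formulas in this breakdown translate almost directly into code. The sketch below is written in Python purely for illustration. The function names `sigmoid` and `curve` are my own, and in a BI tool these would be calculated fields rather than functions.

```python
import math


def sigmoid(t):
    """Map any T to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-t))


def curve(rank1, rank2, t):
    """Y-axis position of the flow at a given T."""
    return rank1 + (rank2 - rank1) * sigmoid(t)


# The Sigmoid value at the two ends and the middle of the usual T range.
for t in (-6, 0, 6):
    print(f"T = {t:+d}  Sigmoid = {sigmoid(t):.4f}")
```

Note that the Sigmoid value never reaches exactly 0 or 1 at T = -6 and T = +6. It only comes very close, which is why this range is wide enough for the curve to look like it starts at Rank1 and ends at Rank2.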
Example:
Let’s say:
Rank1 = 2
Rank2 = 5
We evaluate Sigmoid(T) at T = 0 (which gives a value of 0.5)
Using the formula:
Curve = 2 + (5 − 2) × Sigmoid(0) = 2 + 3 × 0.5 = 2 + 1.5 = 3.5
At T = 0, the curve is halfway between Rank1 and Rank2, at position 3.5 on the Y-axis, representing the smooth transition between source and target. The Sigmoid function ensures this curve moves gradually from Rank1 to Rank2 as T progresses from -6 to +6.
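To see the whole path rather than just the midpoint, we can sample T across its range and compute the Y-axis position at each step. This Python sketch uses the example values Rank1 = 2 and Rank2 = 5. The helper `curve_points` and the choice of 13 evenly spaced samples are assumptions made for illustration, and a real chart would typically use more points for a smoother line.

```python
import math


def curve_points(rank1, rank2, t_min=-6.0, t_max=6.0, steps=13):
    """Return (T, Curve) pairs sampled evenly between t_min and t_max."""
    points = []
    for i in range(steps):
        t = t_min + (t_max - t_min) * i / (steps - 1)
        s = 1.0 / (1.0 + math.exp(-t))          # Sigmoid(T)
        points.append((t, rank1 + (rank2 - rank1) * s))
    return points


points = curve_points(2, 5)
for t, y in points:
    print(f"T = {t:+.0f}  Curve = {y:.3f}")
```

The printed positions rise steadily from just above 2 to just below 5, passing through 3.5 at T = 0. Plotting these pairs with T on the X-axis gives the S-shaped flow between the two nodes.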
The combination of Rank1, Rank2, and the Sigmoid function is essential for crafting the smooth curves characteristic of Sankey charts. The formula ensures a fluid transition from source to target, producing visually appealing flows that effectively illustrate relationships and transitions within the data. By understanding the contributions of each component and utilizing cumulative percentages, you can create Sankey charts that not only inform but also captivate your audience.
This understanding will sharpen your Sankey chart skills and improve the clarity and impact of the insights you communicate through your visualizations. If you're interested in learning more about Sankey charts, I invite you to explore my other blog posts. They provide practical insights and examples that can help you deepen your understanding and create your own beautiful visualizations!

