Is UNPIVOT the best way for converting columns into rows?

June 15th, 2020

To start with, let’s give definitions and highlight the difference between PIVOT and UNPIVOT operators in SQL Server.

The PIVOT and UNPIVOT relational operators are used to change a table-valued expression into another table and involve data rotation. Let’s start with the first operator.

We use PIVOT when there is a need to convert table rows into columns. It also enables us to perform aggregations, wherever required, for column values that are expected in the final output. Let’s take this table as an example:

If we pivot it by the first column ‘screen’, we will get the following result:

To achieve this result in SQL Server, you need to run the following script:

SELECT [avg_], [11], [12], [13], [14], [15] 
FROM (
  SELECT 'average price' AS 'avg_', screen, price FROM laptops) x
  PIVOT (AVG(price) FOR screen IN([11], [12], [13], [14], [15])
) pvt;

In order to reverse a PIVOT operator, that is, to convert data from column-level back to row-level and get the original table, you can use the UNPIVOT operator. Nevertheless, note that UNPIVOT is not a complete opposite of the PIVOT function. It is only possible if the pivoted table doesn’t contain aggregated data.

PIVOT aggregates data and can merge a bunch of rows into a single row. UNPIVOT doesn’t reproduce the initial table-valued expression result since rows have been merged. Apart from that, null values in the input of UNPIVOT disappear in the output. When the values disappear, it shows that there may have been original null values in the input before the PIVOT operation.

Speaking of the PIVOT operator, dbForge Studio for SQL Server provides a useful function named a pivot table. To clarify, this is a data analysis tool that converts large amounts of data into concise and informative summaries. It enables us to easily rearrange and pivot data to get the layout best for understanding the data relations and dependencies. The greatest advantage of this feature is that it simplifies the aggregation process and the statistical information count. Also, the obtained report can be printed out, exported to various document formats, and e-mailed in the required format.

Now, let’s talk more about different implementations of T-SQL UNPIVOT transformations.

Let’s assume we have an aggregated table that contains data about the results of the games played by each player. We have a task to convert columns into rows.

IF OBJECT_ID ('dbo.Players'IS NOT NULL
    DROP TABLE dbo.Players;
CREATE TABLE dbo.Players
(
      PlayerID INT
    , Win INT
    , Defeat INT
    , StandOff INT
    CONSTRAINT PK_Players PRIMARY KEY CLUSTERED (PlayerID) ON [PRIMARY]
);
INSERT INTO dbo.Players (PlayerID, Win, Defeat, StandOff)
VALUES
    (1, 7,  6,  9),
    (2, 12, 5,  0),
    (3, 3,  11, 1);

There are several ways to accomplish this task, so let’s take a look at the execution plans of each of the suggested below implementations. For this, we will use SQL Profiler available in dbForge Studio for SQL Server.

To receive execution plans automatically each time the query is executed, we need to switch to the profiling mode:

It’s also possible to get the query plan without starting its execution. For this, you need to run the Generate Execution Plan command.

Let’s Start!

SQL Query profiler

1. UNION ALL

Previously, SQL Server did not offer an efficient way to convert columns into rows. Because of that many users opted for multiple reading from the same table with a different set of columns united through the UNION ALL statements:

SELECT PlayerID, GameCount = Win, GameType = 'Win'
FROM dbo.Players
    UNION ALL
SELECT PlayerID, Defeat, 'Defeat'
FROM dbo.Players
    UNION ALL
SELECT PlayerID, StandOff, 'StandOff'
FROM dbo.Players

The crucial weakness of this practice is multiple data reading. This happens because UNION ALL will scan the rows once for every subquery, which considerably decreases the efficiency of the query execution.

It is obvious when we look at the execution plan of the following query:

2. UNPIVOT

One of the fastest ways to convert columns into rows is definitely to use the UNPIVOT operator, which was introduced in SQL Server in 2005. Let’s simplify the previous query with this SQL UNPIVOT syntax:

SELECT PlayerID, GameCount, GameType
FROM dbo.Players
UNPIVOT (
    GameCount FOR GameType IN (
        Win, Defeat, StandOff
    )
) unpvt

As a result of the query execution, we get the following execution plan:

3. VALUES

One more point to consider is the possibility to accomplish the given task, which is to convert columns into rows, with the VALUES statement.

The query which uses the VALUES statement will look like this:

SELECT t.*
FROM dbo.Players
CROSS APPLY (
    VALUES
          (PlayerID, Win,      'Win')
        , (PlayerID, Defeat,   'Defeat')
        , (PlayerID, StandOff, 'StandOff')
) t(PlayerID, GameCount, GameType)

Moreover, the execution plan will be simpler, comparing to UNPIVOT:

4. Dynamic SQL

Using the dynamic SQL allows creating a general-purpose query for any table on the condition that columns not included in the primary key have a compatible data-type between them:

DECLARE @table_name SYSNAME
SELECT @table_name = 'dbo.Players'

DECLARE @SQL NVARCHAR(MAX)
SELECT @SQL = '
SELECT *
FROM ' + @table_name + '
UNPIVOT (
	value FOR code IN (
		' + STUFF((
	SELECT ', [' + c.name + ']'
	FROM sys.columns c WITH(NOLOCK)
	LEFT JOIN (
		SELECT i.[object_id], i.column_id
		FROM sys.index_columns i WITH(NOLOCK)
		WHERE i.index_id = 1
	) i ON c.[object_id] = i.[object_id] AND c.column_id = i.column_id
	WHERE c.[object_id] = OBJECT_ID(@table_name)
		AND i.[object_id] IS NULL
    FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '') + '
	)
) unpiv'

PRINT @SQL
EXEC sys.sp_executesql @SQL

And the result is the following query:

SELECT *
FROM <table_name>
UNPIVOT (
	value FOR code IN (<unpivot_column>)
) unpiv

This method is not as fast because the automatic generation of the UNPIVOT query requires an additional reading from the system views and rows concatenation via the XML trick.

5. XML

A smarter way to execute dynamic UNPIVOT is by doing a little trick with XML:

SELECT
	  p.PlayerID
	, GameCount = t.c.value('.', 'INT')
	, GameType = t.c.value('local-name(.)', 'VARCHAR(10)')
FROM (
	SELECT
		  PlayerID
		, [XML] = (
				SELECT Win, Defeat, StandOff
				FOR XML RAW('f'), TYPE
			)
	FROM dbo.Players
) p
CROSS APPLY p.[XML].nodes('f/@*') t(c)

In the query above, for each row SQL Server has generated a specified XML:

 < f Column1="Value1" Column2="Value2" Column3="Value3" ... />

Afterwards, the name of the attribute and its value are parsed. In most cases, the use of XML results in a slower execution plan. In our case, that is the price we pay for it to be universal.

Let’s compare the results by executing the Compare Selected Results command:

We have to say that there is no obvious difference in the speed of execution of the UNPIVOT and VALUES queries. This is valid for the cases when a simple transformation of columns into rows takes place.

Let’s examine another task, where we need to find the most frequent outcome of the game for each player.

We’ll try to do this task with the help of the UNPIVOT statement:

SELECT
	  PlayerID
	, GameType = (
		SELECT TOP 1 GameType
		FROM dbo.Players
		UNPIVOT (
			GameCount FOR GameType IN (
				Win, Defeat, StandOff
			)
		) unpvt
		WHERE PlayerID = p.PlayerID
		ORDER BY GameCount DESC
	)
FROM dbo.Players p

In the execution plan it shows that the bottleneck is the multiple data reading and sorting, which is necessary for organizing the data rows:

It’s easy to get rid of the multiple data reading if we remember that columns from the external query block can be used:

SELECT
	  p.PlayerID
	, GameType = (
		SELECT TOP 1 GameType
		FROM (SELECT t = 1) t
		UNPIVOT (
			GameCount FOR GameType IN (
				Win, Defeat, StandOff
			)
		) unpvt
		ORDER BY GameCount DESC
	)
FROM dbo.Players p

Multiple reading of data was eliminated, but the most resource-consuming operation – sorting is still there:

And that is how the VALUES statement behaves during this task:

SELECT
	  t.PlayerID
	, GameType = (
			SELECT TOP 1 GameType
			FROM (
				VALUES
					  (Win,	 'Win')
					, (Defeat,	 'Defeat')
					, (StandOff, 'StandOff')
			) t (GameCount, GameType)
			ORDER BY GameCount DESC
		)
FROM dbo.Players t

As we’ve expected the plan was simplified, however sorting is still present:

Let’s try to avoid sorting by using the aggregation function:

SELECT
	  t.PlayerID
	, GameType = (
			SELECT TOP 1 GameType
			FROM (
				VALUES
					  (Win,	 'Win')
					, (Defeat,	 'Defeat')
					, (StandOff, 'StandOff')
			) t (GameCount, GameType)
			WHERE GameCount = (
				SELECT MAX(Value)
				FROM (
					VALUES (Win), (Defeat), (StandOff)
				) t(Value)
			)
		)
FROM dbo.Players t

Now the execution plan looks like this:

Hurrah! We have managed to get rid of sorting.

Conclusion

When we need to carry out a simple conversion of columns into rows in SQL Server, it is better to use UNPIVOT or VALUES structures.

If, after the conversion, the received data rows should be used for aggregation or sorting, then we had better use the VALUES structure, which, in most cases, results in more efficient execution plans.

For the tables where different structure types might occur, and the number of columns is unrestricted, it is recommended to use XML, which, unlike the dynamic SQL, can be used inside the table functions.

A step-by-step analysis of the execution plan with the help of SQL Profiler in dbForge Studio for SQL Server allows detecting bottlenecks in the query’s performance.

Studio for SQL Server download



                         

            

9 Responses to “Is UNPIVOT the best way for converting columns into rows?”

  1. kevin Says:

    Hi Sergey Syrovatchenko , how do I use you example to reverse the unpivot method/s to go back to pivot ?

  2. Raghvendra Says:

    Please help me with the below query, I want column_Name to be displayed in rows and and Table_Name and database are grouped as all the column for a table will be displayed in single row:

    IF OBJECT_ID(‘Tempdb.dbo.#temptbl’, ‘u’) IS NOT NULL
    DROP TABLE #temptbl

    Create Table #tempTbl
    (
    Database_Name nvarchar(100),
    Table_Name nvarchar(100),
    Column_Name nvarchar(100)
    )

    Insert Into #tempTbl
    Select DB_Name() , sys.schemas.NAME + ‘.’ + sys.tables.NAME, sys.columns.name FROM sys.tables
    INNER JOIN sys.schemas ON sys.tables.schema_id = sys.schemas.schema_id
    Inner Join sys.columns On sys.tables.object_id = sys.columns.object_id
    Go

    Select * from #tempTbl

  3. Ahmed Shahin Says:

    I’m having a Table called Test_Pivot which looks like
    F_SN F_Title F_Data
    1001 TRX RRN 301807460001.00
    1001 DATE TIME 18:07 ,30/04/2015
    1001 AUTH CODE
    1001 AMOUNT 0.22
    1001 SCHEME NAME SPAN
    1001 RESULT DECLINED
    1011 TRX RRN 301920050002.00
    1011 DATE TIME 19:20 ,30/04/2015
    1011 AUTH CODE
    1011 AMOUNT 0.01
    1011 SCHEME NAME MASTERCARD
    1011 RESULT DECLINED

    The output should look like this:

    TRX RRN DATE TIME AUTH CODE AMOUNT SCHEME NAME RESULT
    301807460001.00 18:07 ,30/04/2015 0.22 SPAN DECLINED
    301920050002.00 19:20 ,30/04/2015 0.01 MASTERCARD DECLINED

  4. Andrey Langovoy Says:

    Hi Ahmed,
    Please clarify your question!
    Thank you.

  5. Kai Says:

    Hi,

    How did you get the query profiler results shown in image 06? What tool did you use?

  6. Sergey Syrovatchenko Says:

    Thanks for the comment. This is a built-in feature in dbForge Studio profiler

  7. Kai Says:

    Can dbForge Studio profile monitor the query’s tempdb and memory usage?

  8. Sergey Syrovatchenko Says:

    Hi Kai, you mean spills into tempdb? If yes, in this case we have identical functionality as in SSMS. In future it’s planned to expand this functionality.

  9. André St-Onge Says:

    I tried 2 differents way with using UNPIVOT and CROSS APPLY with VALUES and my results on 41,894,370 of rows is UNPIVOT is better but not very much
    Using UNPIVOT 31:38 minutes
    Using CROSS APPLY with VALUES 33:26 minutes
    CROSS APPLY spent 3 minutes more than UNPIVOT
    My SQL is : Microsoft SQL Server Enterprise (64-bit) version : 11.0.5532.0

Leave a Comment