Database, SQL Server

Using New PIVOT Operator in SQL Server 2005

The PIVOT operator is new in SQL Server 2005, and it extends to developers a tool that alters row form of data into tabular form. This can be done without the need for the developer to write complex and cumbersome queries.

Let’s look at the CASE statement used to achieve, the data transformation required here.

CASE Statement Queries

SQL Server database designers and developers have been using the CASE statement to transform row level data to tabular form. Up until SQL Server 2005, the CASE statement was all that the developer had in achieving this business critical data transformation. A general transformation, termed as T-SQL, consisted of a series of CASE statements along with a chicken fix series of aggregated complex queries to retrieve the required data set. While the CASE statement did offer developers some degree of flexibility over the records returned, and the way they were returned, the queries written to fetch the desired result were often flawed, cumbersome and complex to write.

Let us have a take a look at an example. In this example we will be creating a SalesArchive table and will be loading data into this table. This is shown in the SQL code below:


      Sale_ID INT IDENTITY(1,1),      Product_Sold VARCHAR(30),
      Sale_Date SMALLDATETIME,
      Sale_Price MONEY
SET @k = 1
WHILE (@k <=100)
      INSERT INTO SalesArchive
      (Product_Sold, Sale_Date, Sale_Price)
      ('Coffee Maker', DATEADD(mm, @k, '3/11/1919'), DATEPART(ms, GETDATE()) + (@k + 57) )
      INSERT INTO SalesArchive
      (Product_Sold, Sale_Date, Sale_Price)
      ('Blender', DATEADD(mm, @k, '3/11/1927'), DATEPART(ms, GETDATE()) + (@k + 13) )
      INSERT INTO SalesArchive
      (Product_Sold, Sale_Date, Sale_Price)
      ('Toaster', DATEADD(mm, @k, '3/11/1908'), DATEPART(ms, GETDATE()) + (@k + 29) )
      SET @k = @k + 1

After we are done with the creation of the table and the insertion of data within the SalesArchive table, we would be constructing a query with the CASE statement. The goal of the query is to generate the sum of the sales achieved yearly for the individual products, Coffee Maker, Toaster, and Blender. The resultset will show each product having its sales data in a dedicated column. The SQL code is given below:

SELECT CAST(YEAR(Sale_Date) AS VARCHAR(4)) AS Sale_Date, SUM(CASE WHEN Product_Sold = 'Toaster' THEN Sale_Price END) AS Toaster, SUM(CASE WHEN Product_Sold = 'Blender' THEN Sale_Price END) AS Blender, SUM(CASE WHEN Product_Sold = 'Coffee Maker' THEN Sale_Price END) AS Coffee Maker FROM SalesArchive GROUP BY CAST(YEAR(Sale_Date) AS VARCHAR(4))

As you can see, the above code, although flexible and scalable is still complex and cumbersome to write. You are expected to write a series of CASE statements for each row that you want to transform in to column form, Where there are over 10 rows per query, the query becomes tedious and quite a pain. This is exactly where the PIVOT operator rescues developers in the new SQL SERVER 2005.

Using the PIVOT Operator

While using the PIVOT operator, you will not require the CASE statement and the group by clause. This is because the PIVOT operator employs the GROUP BY clause on a single row, by itself. The same resultset can be obtained as the above query with CASE statements with a PIVOT operator, easily and cleanly. Look at the SQL code below:

SELECT Sale_Date, Toaster, Blender, Coffee_Maker
(      SELECT CAST(YEAR(Sale_Date) AS VARCHAR(4)) AS Sale_Date, Sale_Price, Product_Sold  FROM SalesArchive
) AS rf
(         SUM(Sale_Price) FOR Product_Sold IN(Toaster, Blender, Coffee Maker)
) AS p

The crux of the query is nested inside the PIVOT operator block. The SELECT structure in the above query returns the year and products contained in the SalesArchive table. Remember, the SELECT statement preceding any PIVOT clause, is not allowed to specify which columns can be used within the PIVOT clause. As mentioned above, the PIVOT clause would group all columns that are not grouped. Hence, as shown in the above code, we may have to use a subquery to return the desired resultset. Another limitation with the use of PIVOT operator is that, most of the fields will have to be hard coded. While you can write complex, dynamic T-SQL queries to overcome this limitation, the resultant code would be more of a burden to process as compared to its CASE statement form.