Using SQL Server Integration Services (SSIS) Data Flow in SQL Server 2005

Published: 18 May 2007
By: Brian Mains

SQL Server Integration Services (SSIS) in SQL Server 2005 has been redone, and is clearly superior over SQL Server 2000. With all of the new capabilities and features in SSIS, it is possible to create very complex data transformation packages. This article will show you how.

Introduction

SQL Server 2005 redesigned the way you design data transformation packages. It now includes much more capabilities, and because of that, it features a new tool. It separates the control of the package from the data transformation portion, so you can do much more. For instance, you have the ability to read information from WMIor a web service. There is also the ability to iterate through a loop of data stored in a variable, or from a result set.

Using the Data Flow

The data flow is the execution of data from one source to another; however, unlike DTS, you can have more than one data flow within the package. To create a data flow, add a Data Flow Task to the Control Flow, which is the controlling mechanism for the package. Within each of the data flow tasks, one source maps to one destination; however, several of the transformations feature the ability to split or merge the data as needed.

Each source and destination is similar to Data Transformation Services, where there are database, flat file, Excel, etc. connections. These are available in the toolbox or more readily available, in the connection tray (similar to the component tray in .NET 1.1) by right-clicking on it:

Figure 1: The available connections within the data flow

The available connections within the data flow

There are many specific types of transformations which can applied to the data. For instance, you can split apart data into two fields, merge fields into one field, split out rows by condition, pivot/unpivot the data, or split the data based on a condition. There are many types of transformations as the data moves from the data source to the description. Here is a summation of some of the transformations:

  • Copy Column - This transformation copies the data in a column and creates a new column with it.
  • Script Component - This uses a VB script to transform the data somehow. You can use programmatic means to access the data, and set this script up as a data source, destination, or a transformation. When you open up the script component, there is a button stated as Design Script, which invokes a Visual Studio Editor where you can code the script with intellisense support.
  • Sort - This transformation sorts the data based on one or more columns setup to sort in a specified order.
  • Pivot and Unpivot - New to SQL Server 2005 is the way to pivot/unpivot data rather easily and they are supported as transformations. Pivoting data means you can make the columns of the result set based on a distinct result from the data.
  • Merge/Union All – Merging allows you to veer two inputs into one output. You can specify the input/output parameters that the transformation will map to. The Merge transformation is a little more restrictive than the Union All transformation.
  • Conditional Split – Conditional Split works the opposite way; based on values within the data, you can setup statements to split the data if a condition matches. For instance, you can split one result based on the expression “ListPrice > 100” and any data that matches that result is returned via a specific data flow path.
  • OLE DB Command – This transformation executes a SQL statement for every row in the input source.
  • Lookup – This transformation looks up the value of a field in a lookup data source/table.

Each transformation, connection, etc. comes with an editor of sorts. Most of them have a basic editor, providing the minimal amount of information that someone might need to configure. You can open this editor by clicking to the right area of the text, or right-clicking and selecting the Edit option. For instance, below is a copy of an editor:

Figure 2: Simple editor available in most tasks

Simple editor available in most tasks

You will see an Advanced Editor option as well when right-clicking a task. The advanced editor contains a property view of all the settings that can be defined for a given task. This editor allows you with more control over the task details. Below is a copy of the advanced editor for the conditional split task:

Figure 3: Simple editor available in most tasks

Simple editor available in most tasks

As you work through designing a flow, it is possible to come across invalid column references as you change the design of your package. SSIS will prompt you about how you want to resolve the problem with the conflict. Why this came up for me is column names toward the end of the package had changed at the top. Because of that, some references became outdated, so to speak.

Figure 4: Invalid Column Reference Options

Invalid Column Reference Options

You may also notice that for each task, there is a green arrow and a red arrow. Each arrow signals a good data flow path and a bad data flow path. The bad data flow path must be setup for each task; when you go into the task, there is a configure error output button (shown in the images above) that specifies what action to take when errored or truncated information is found. The default is to fail the component, but the other options are to redirect the row or ignore it completely. If choosing to redirect the row, any errored data will redirect through the red path. From this, you could output errored data to a text file, or transform it somehow in an attempt to get it into a correct state before merging it into the final result.

The following is the final package in execution mode. To view the data going in and out, it is possible to setup a data viewer that stops the execution of the data flow and views the data being transferred. This is done by right-clicking on the arrow and selecting Data Viewers. Below are the available Data Viewers:

Figure 5: Data Viewers list; note the variable formats

Data Viewers list; note the variable formats

Upon running the package, the data results are below. These results are applied after the sorting operation.

Figure 6: The resulting data after the sort operation

The resulting data after the sort operation

In addition, if you click on the Progress tab, it contains a step-by-step report of what is happening in the package, for more detailed information. The final package runs successfully and the data tally is shown in the package at each step through the transformation:

Figure 7: Final successful package

Final successful package

Conclusion

You’ve seen how to create a Data Flow in SSIS. This is only a portion of what SSIS can do. I’ve included the package that was created, which outputs data from the AdventureWorks database to text files. To use this example, you will need to install the AdventureWorks database, and may need to change the LocalHost.AdventureWorks connection.

Downloads

SSIS Package

About Brian Mains

Brian Mains is an application developer consultant with Computer Aid Inc. He formerly worked with the Department of Public Welfare. In both places of business, he developed both windows and web applications, small and large, using the latest .NET technologies. In addition, he had spent many hou...

View complete profile

Top Articles in this category

Using SQL Server Integration Services (SSIS) Control Flow in SQL Server 2005
SQL Server Integration Services (SSIS) in SQL Server 2005 has been redone, and is clearly superior over SQL Server 2000. With all of the new capabilities and features in SSIS, it is possible to create very complex transformation packages. This article will show you how to take advantage of the control flow features.

Introduction to SQL Server Encryption and Symmetric Key Encryption Tutorial
Pinal Dave talks about encryption in SQL Server 2005.

The SqlCacheDependency Class
Luke Stratman shows how and why to use the SqlCacheDependency class.

SQL Server Pre-Code Review Tips
A list of tips for enforcing coding standards in SQL.

Review: SQL Server Integration Services Using Visual Studio 2005 – A Beginners Guide
Review of the book “Beginners Guide to SQL Server Integration Services Using Visual Studio 2005”.

Top
 
 
 

Please login to rate or to leave a comment.

Product Spotlight