SQL Server integration services is a powerful data integration and workflow platform provided by Microsoft. It can be used to automate data extraction, transformation, and other ETL processes. You can use SSIS to export data from various data sources like flat files, database servers, spreadsheets, etc. Due to these capabilities, SQL Server integration services are widely used in building data warehousing, business intelligence, and database migration tasks.
The SSIS also supports data conversion, which helps maintain compatibility across various data sources. Data conversion transformation is an essential component to handle the mismatched data type between source and destination during the ETL (Extract, Transform, and Load) process, especially when you are transferring data from heterogeneous sources.
For example, you are exporting the data from the flat file source, which has the birthdate column stored as a string, and we want to export it to the SQL Server table with the same column, but the datatype is going to be DT_DATE. In such cases, we can use the SSIS data conversion transformation.
Table of contents
- What are data conversation tasks in SSIS
- How to perform Data Conversion tasks in SSIS
- Streamline data conversion with Devart SSIS components
- Conclusion
What are data conversation tasks in SSIS
The data conversion task is crucial in any ETL process, especially when importing or processing data from heterogeneous sources with varying data types. While loading data from source to destination, the data conversion converts the datatype of the source to ensure the compatibility of the data between source and destination.
Here are some reasons why data conversion is important:
- Different data sources have varying data types like VARCHAR in SQL Server and String data type in a flat file, so to maintain the data source compatibility between source and destination, we require data conversion transformation.
- Data conversion is needed to maintain the data quality and standardization. For example, we can use it to convert the date to a standard format.
- Sometimes, it is necessary to avoid runtime errors caused by incompatible data types during data transformation.
SQL Server integration services support two methods for data conversion. Let’s overview them in detail.
Implicit Conversion
The implicit conversion is automatically performed by the SSIS or the database engine used in ETL process. It occurs when the data type of source and destination are compatible. For example, you are transferring data from the INT column to BIGINT column. SQL Server will automatically convert the data and insert it in the destination.
Explicit Conversion
If the explicit conversion is not done automatically, you must use the CAST or CONVERT function in your SQL query to convert the data type. If you are doing explicit conversion in SSIS, you can use derived column transformation, or data conversion transformation. These transformations are most used and can be integrated easily in your SSIS ETL pipeline.
In this article, we will explain how to use the SSIS data conversion transformation to convert the data type and store it in appropriate columns.
How to perform Data Conversion tasks in SSIS
First, let’s understand how to perform data conversion using SSIS data conversion tool.
Here’s the use case we’re going to reproduce: we have a table named tblEmployees_Source, and this table contains details of the employees working in different departments of the company. The table contains two column names: HireDate and BirthDate. In the source table, both columns are of VARCHAR datatype. Both columns contain NULL and Blank values.
Let’s imagine that we want to export the data of tblEmployees_Source table to a new table with appropriate data types. The NULL and blank values of the HireDate column should be changed to 1980-01-01 and the BirthDate column has to be changed to 1900-01-01. Also, the destination table has BirthDate, and HireDate, which should be converted to the DATE data type.
The process can be accomplished using INSERT INTO .. SELECT * FROM statement, or we can create a SQL Server integration services package.
From this article, we will learn how to clean the data using the SQL Server integration services package.
For demonstration purposes, let’s create an SSIS project named Clean Employee Data. The project should have a data flow task with specific components we will overview.
Devart SQL Server source
First, let’s configure the connection for Devart SQL Server source. To do that, right-click Connection manager, and then navigate to Select New Connection. Here, select DevartSQLServer. Configure the connection parameters as shown below.
The Devart SQL Server source is used to connect to the SQL Server database. To configure it, drag and drop Devart SQL Server Source, rename it to SQL Dirty data, and then double-click Devart SQL Server Source editor. Enter the values of connection properties as shown in the screenshot below.
To verify the columns are populated correctly, right-click the Devart SQL Source and select Advanced Editor. In the Advanced editor for Devart SQL Source, select column mappings.
Click OK to save the changes and close the dialog box.
Now, let’s configure the derived column tool.
Derived column transformation
As per business requirements, we must replace the NULL and blank values of the HireDate column with “20.” Similarly, we are going to replace the NULL and blank values of the BirthDate column with “1900-01-01”. This can be achieved by using the derived column SSIS tool. The derived columns are used to create a new column or modify the existing column by applying the expressions to the data within the source columns. The derived columns provide a lot of expressions that can be used to perform various operations like string manipulations, mathematical operations, conditional logic, etc.
Check our article on how to write expressions in SSIS derived column to explore the derived columns in detail.
Now, returning to the column transformation, first, let’s drag and drop the derived column tool to the data flow task designer. After that, rename it to Replace Invalid Values.
Next, connect the Devart SQL Server source with the Derived column.
Double-click to configure it. In our case, we will define two expressions to handle the NULL and blank values on HireDate and BirthDate columns. The expressions are defined as follows:
ISNULL(HireDate) || HireDate == "NULL" || HireDate == "" ? "1980-01-01" : HireDate
ISNULL(BirthDate) || BirthDate == "NULL" || BirthDate == "" ? "1900-01-01" : BirthDate
Enter the expressions mentioned above in the Expressions column.
Provide appropriate names to both derived columns. Click OK to save the configuration and close the dialog box.
The next step is to convert the values of the HireDate and BirthDate columns.
Data Conversion Transformation
Now, we’re ready to use the data conversion SSIS tool to convert the data type of HireDate and BirthDate. We have created two new columns named CleanHireDate and CleanBirthDate using derived column transformation. Instead of HireDate and BirthDate, let’s convert the values of CleanHireDate and CleanBirthDate.
First, drag and drop the Data conversion tool and rename it to Change Data Type.
Connect the output of the derived column to the Data conversion transformation and double-click to configure the data conversion.
When the Data Conversation Transformation Editor opens, in the first section, you will see the list of the input columns. The input columns can be the output of the data source or other SSIS data transformation tool. From section one, you can select the input columns which datatype you want to change. When you select the input column, it will be populated in a grid view. Check the list of options you can set.
Option | Description |
---|---|
Input Column | It is an input column that you have selected. The field cannot be edited. |
Output Alias | It is a column alias that contains the converted value. It is a virtual column that can act as an output of the data conversion transformation. The output alias of CleanHireDate is “Converted CleanHireDate” and the output alias of CleanBirthdate is “Converted CleanBirthdate.” |
Data Type | Select the datatype to which you want to change your input column. At the beginning of article, we have somewhat explained datatype supported in SSIS. In our case, we have selected the date [DT_DATE] data type. |
Length, Precision and Scale | Specify the Length, Precision, and Scale of converted datatype. |
Here is the screenshot of the data conversion transformation.
Now, finally, let’s configure the Devart SQL Server Destination.
Devart SQL Server destination
Now that we are exporting the data to SQL Server table named tblEmployees, we are going to use the Devart SQL Server destination to connect and access the database. Note, that we have already configured the connection manager for the Devart SQL Server source and destination.
- Drag and drop the Devart SQL Server destination from the SSIS tool. In the Component Manager screen, select the table name from the grid.
- In the column mapping screen, configure the mapping as shown in the following image.
Once the configuration is completed, the SSIS package is going to look like in the following image.
- Execute the package in Visual Studio.
- Once execution is completed, verify that your data has been inserted properly.
As you can see, the birthdate column has valid values.
Streamline data conversion with Devart SSIS components
From this article, you’ve learned how to use the Devart SSIS SQL Server Destination to export the data from the Devart SQL Server Source. Devart SSIS Universal Bundle contains various data flow sources, data flow destinations, and data flow lookup components. These components support not only SQL Server but various other databases like Oracle, PostgreSQL, MySQL, SQLite, cloud data warehouses, and cloud applications.
The SSIS Universal Bundle can be used to perform data import and export, will let you perform data synchronization, data replication, migration, and backup. Download the SSIS Universal Bundle from Devart and get a 30-day trial for free!
Conclusion
Now you know how to handle data conversion transformation in SQL Server integration services and how to use it to create ETL packages. While doing data cleansing or migration, we face challenges when the datatype of source and destination are different. In this piece, we have explained implicit and explicit conversion and how it can be handled using the SSIS data conversion transformation. Moreover, now you know how to configure the derived column transformation, which is used to clean the data, and you understand data conversion transformation a bit better.
Keep reading our articles to learn the nuts and bolts of SSIS data transformation, and explore Devart SSIS Universal Bundle to level up your projects with simple yet powerful solutions for data connectivity.