Synchronizing data between a company database and a data warehouse

Posted by on April 15th, 2010

Recently we’ve received a question from one of our customers asking how to synchronize data between companies servers and a data warehouse server. Let’s see how easy it can be done with the help of  dbForge Data Compare for SQL Server, a professional tool for data comparison and synchronization.

What we have:

We have two companies ( they use ERP software), each having their own SQL Server database with one table. The companies tables with their customers’ data are saved in the data warehouse as follows:

  • The SQL table for company1 is Company1$Customer.
  • The SQL table for company2 is Company2$Customer.

What we need:

We need to create a data warehouse with all information stored in one set of SQL tables. For example, the created data warehouse should contain a Customer table that contains all records of Company1$Customer and Company2$Customer.  This table should contain an extra field with a company name to make the records unique. We need to synchronize only changed records.

How to complete the task:


We use the following code to create them:

CREATE DATABASE datawarehouse
GO
USE datawarehouse
GO
CREATE TABLE dw_Customer
(id_records   INT         NOT NULL,
 id_companies VARCHAR(10) NOT NULL,
 some_data    VARCHAR(100))
GO
ALTER TABLE dw_Customer ADD PRIMARY KEY (id_records,id_companies)
GO
CREATE DATABASE Company1
GO
USE Company1
GO
CREATE TABLE Customer
(id_records INT PRIMARY KEY,
 some_data  VARCHAR(100))
GO
INSERT  INTO Customer VALUES  ( 1000, 'some_data1_for_Company1')
INSERT  INTO Customer VALUES  ( 2000, 'some_data2_for_Company1')
GO
CREATE VIEW vCustomer
AS
SELECT id_records, 'Company1' AS id_companies, some_data
FROM Customer
GO
CREATE DATABASE Company2
GO
USE Company2
GO
CREATE TABLE Customer
(id_records INT PRIMARY KEY ,
 some_data  VARCHAR(100))
GO
INSERT  INTO Customer VALUES  ( 1000, 'some_data1_for_Company2')
INSERT  INTO Customer VALUES  ( 3000, 'some_data3_for_Company2')
GO
CREATE VIEW vCustomer WITH schemabinding
AS
SELECT id_records, 'Company2' AS id_companies, some_data
FROM dbo.Customer
GO
CREATE UNIQUE CLUSTERED INDEX v_indx
ON vCustomer (id_records, id_companies)
GO

We have created a view on the servers of Company1 and Company2 that will be used for data synchronization with a data warehouse database.

For example, create a non-indexed view in a database of the Company1 and an indexed view in a database of the Company2.

First, its necessary to migrate data from Company1 database to a datawarehouse database (i.e., from vCustomer view to dw_Customer table), then from Company2 database to datawarehouse database (i.e., from vCustomer view to dw_Customer table.)

1. Run dbForge Data Compare for SQL Server and select Company1 database as Source and datawarehouse database as Target. In addition, select Views check box to compare views, as well as tables.

Source_and_Target_compare_views

Choose Source and Target, Compare Views

2. Go to Mapping wizard page and press Map Objects button on the toolbar. In the opened dialog map vCustomer view with dw_Customer table.

Map view and table

Map view and table

3. In the Column Mapping window select id_records and id_companies column as a comparison key.

Set key column

Set key column

4. On Options wizard page unselect Only in Target check box (as we will only update and insert the data, but will not remove the existing data) and press Compare.

Exclude Only in Target records

Exclude Only in Target records

5. When comparison is completed, open the synchronization wizard to synchronize the databases.

Synchronize the data directly

Synchronize the data directly

Repeat steps 1-5 for Company2 database (skip step 3 as in this case we use the indexed view and dbForge Data Compare for SQL Server will automatically set the right comparison key). Press Synchronize.

As the result dw_Customer table has data migrated from Company1 and Company2 databases:

Result

Result

If due to some reasons, its impossible to create a view in Company1 and Company2 databases, you may do the following.

Suppose datawarehouse, Company1 and Company2 databases look like this:

CREATE DATABASE datawarehouse
GO
USE datawarehouse
GO
CREATE TABLE dw_Customer
(id_records   INT         NOT NULL,
 id_companies VARCHAR(10) NOT NULL,
 some_data    VARCHAR(100))
GO
ALTER TABLE [dw_Customer]
ADD PRIMARY KEY (id_records,id_companies)
GO
CREATE TABLE [Company1$Customer]
(id_records   INT         NOT NULL,
 id_companies VARCHAR(10) NOT NULL
     CONSTRAINT def_company1_name DEFAULT ('Company1'),
 some_data VARCHAR(100))
GO
ALTER TABLE [Company1$Customer]
ADD PRIMARY KEY (id_records,id_companies)
GO
CREATE TABLE [Company2$Customer]
(id_records   INT         NOT NULL,
 id_companies VARCHAR(10) NOT NULL
     CONSTRAINT def_company2_name DEFAULT ('Company2'),
 some_data    VARCHAR(100))
GO
ALTER TABLE [Company2$Customer]
ADD PRIMARY KEY (id_records,id_companies)
GO
CREATE DATABASE Company1
GO
USE Company1
GO
CREATE TABLE Customer
(id_records INT PRIMARY KEY,
 some_data  VARCHAR(100))
GO
INSERT INTO Customer VALUES (1000, 'some_data1_for_Company1')
INSERT INTO Customer VALUES (2000, 'some_data2_for_Company1')
GO
CREATE DATABASE Company2
GO
USE Company2
GO
CREATE TABLE Customer
(id_records INT PRIMARY KEY,
 some_data  VARCHAR(100))
GO
INSERT INTO Customer VALUES (1000, 'some_data1_for_Company2')
INSERT INTO Customer VALUES (3000, 'some_data3_for_Company2')
GO

In datawarehouse database, create temporary tables for each company (for data synchronization with each company), then merge data of these temporary tables with a table of datawarehouse database.

First, its necessary to migrate data from Company1 database to datawarehouse database (i.e., from Customer table to Company1$Customer one.), then from Company2 database to datawarehouse database (i.e., from Customer table to Company2$Customer one.)

1. Run dbForge Data Compare for SQL Server and select Company1 database as Source and datawarehouse database as Target.

2. Go to Mapping wizard page and press Map Objects button on the toolbar. In the opened dialog map Customer table to Company1$Customer table.

Map Company1$Customer table and dw_Customer table

Map Company1$Customer table and dw_Customer table

3. In the Column Mapping window select id_records column as a comparison key and press Compare.

Map column Customer and Company1$Customer tables

Map columns of Customer and Company1$Customer tables

4. When comparison is completed, open the synchronization wizard. Press Synchronize to synchronize the databases.

Repeat steps 1-4 for Company2 database to synchronize Customer table with Company2$Customer one.

5. Then select datawarehouse database as Source and Target.

Compare database datawarehouse with itself

Compare database datawarehouse with itself

6. Upmap all the tables in the Map Objects window and map the following tables manually: Company1$Customer (as Source) and dw_Customer (as Target).

Unmap all objects

Unmap all objects

Map Company1$Customer table and dw_Customer table

Map Company1$Customer table and dw_Customer table

7. On Options wizard page unselect Only in Target check box (as we will only update and insert the data, but will not remove the existing data) and press Compare.

8. When comparison is completed, open the synchronization wizard to synchronize the databases.

Repeat steps 5-8 to synchronize data of Company2$Customer table with dw_Customer one.

As the result dw_Customer table has data migrated from Company1 and Company2 databases:

Result

The result is the same as in the first case

What next:

We have described several ways to use dbForge Data Compare for SQL Server to complete such tasks.

Of course, if we had possibility to compare Custom Query, then in the first case we would not have to create views and would be able to compare the result of the query with the table. In the future versions of the product we plan to implement the functionality for comparing Custom Query with tables, views, or other Custom Query.

Also the process of server synchronization with the help of a comparison and synchronization tool can be automated and started in certain time. To do this, an application should support command line. In the second version of our product, we have added the ability to perform all operations using command line, without GUI.

Check the benefits yourself, download dbForge Data Compare for SQL Server now for free.

Leave a Reply


Devart Blog