Devart Blog

Table data verification for safe schema synchronization

Posted by on September 22nd, 2009

Preserving data in tables after synchronization is one of requirements for successful schema synchronization. dbForge Schema Compare for SQL Server offers two mechanisms of preserving tables data during synchronization, they are warning the users about data loss and restoring data in recreated objects.

But it is not all, dbForge Schema Compare for SQL Server goes on and offers table data verification during synchronization.

How it works


Table data verification is based on hash function calculation for the whole table data. A hash function is calculated before and after synchronization, the results are compared to define whether the table data is changed or not. Following this logic, the synchronization process includes the following stages in dbForge Schema Compare for SQL Server:

  1. Calculating hashes of data in tables (these are results of hash function calculations for tables data) and saving them in a temporary repository (that is a temporary table)
  2. Synchronizing of objects
  3. Calculating hashes of data in the tables and comparing them with the previously saved ones. If the values are different, the warnings about the failure of table data verification are generated.

As the hashing algorithm, dbForge Schema Compare for SQL Server uses SHA1.  It generates a hash of a longer length that whittles away possible collisions while using a multitude of large-scale columns.

A hash of data is not calculated for the following columns:

  1. Added columns as they have no data
  2. Dropped columns as any data of such columns will be lost
  3. Synchronized columns if data type should be changed to any incompatible one and data will be lost after synchronization
  4. Computed, IDENTITY, TIMESTAMP, and ROWGUID columns as their data is auto-calculated

To calculate a hash of table data, dbForge Schema Compare for SQL Server uses folding of hashes of the table rows. The order of sorting of processed rows is defined by a set of identifying columns in the table.

Sorting of processed rows is performed if the table meets the following requirements:

• A table has a unique or primary key that includes all the columns.
• The key remains the same after synchronization or the columns included into the key are not dropped or do not get any incompatible data type.

Attention: If rows sorting while hash calculation is not performed, the result of data verification can be false negative in some cases, for example, if the partitioning property of a table is changed during synchronization.

Table data verification is performed for any altered table after synchronization as well as for tables modified during dependency processing. Scripts for generating a hash of table data and table data verification after synchronization are integrated into a synchronization script. If data in the table does not coincide, the Error List window displays a warning that the table data verification failed and the user should check the table data manually.

Some examples of detecting data loss:

1. The Target database: the table product has the field comment of varchar(2000) data type. The table contains data with the length of more than 1000 symbols.

CREATE TABLE product (
id int PRIMARY KEY
...
comment varchar(2000)

The Source database: the table product contains the field comment of varchar(1000) data type.

CREATE TABLE product (
id int PRIMARY KEY
...
comment varchar(1000)
)

When the length of the comment field is changed to 1000 symbols, the data in the Target table will be partially lost. dbForge Schema Compare for SQL Server will detect this and display the following warning: “Data verification for table {product} failed. Check table data manually.” in the Error List window.


2. The Target database: a user-defined type longstring has been created. This type is used by the table product.
CREATE TYPE longstring AS varchar(2000)

CREATE TABLE product (
id int PRIMARY KEY
name longstring
comment longstring
)

The Source database:

CREATE TYPE longstring AS varchar(1000)
CREATE TABLE product (
id int PRIMARY KEY
name longstring
comment longstring
)

When the length of the user-defined type is changed to 1000 symbols, data in the Target database will be partially lost. dbForge Schema Compare for SQL Server will detect this and display the following warning: “Data verification for table {product} failed. Check table data manually.” in the Error List window.

Leave a Reply