Friday, March 29, 2024
HomeHow ToHow to generate SQL Linked List Data with Python Generator

How to generate SQL Linked List Data with Python Generator

In this post, we would like to show how to quickly generate table values that are based on the values of another column of same table. For this purpose, we will use the Python Generator being a part of dbForge Data Generator for SQL Server.

First, let’s create a test table using the following script:

CREATE TABLE [dbo].[TestLinkedListTable](
[id] [int] NULL,
[name] [nchar](100) NULL,
[previous_id] [int] NULL,
[next_id] [int] NULL
)
GO

Then, we run the Data Generation wizard, select the Python generator for the previous_id column and enter the following script:

import clr
clr.AddReference("System")
from System import DBNull

def main():
  i = 0
  while i != config["n_rows"]:
    if (i == 0):
	  yield DBNull.Value
    i = i + 1
    yield i

main()

previous_id-linked-list

For the next_id column, we also select the Python generator and enter the following script:

import clr
clr.AddReference("System")
from System import DBNull

def main():
  i = 2
  while 1 == 1:
   yield i
   i = i + 1
   if (i == config["n_rows"] + 1):
	  yield DBNull.Value
	  break
       
main()

next_id-linked-list

Voila! As a result of the above manipulations, we’ve got the records, where each record references to a next record using the next_id column:

result-linked-list

result2-linked-list

That was a good start, but let’s complicate the task a bit, and try generating several linked lists of various sizes, whereas each list must be related to a unique Parent. And furthermore, the value in each child must be computed from a value column in its Parent.

So again, let’s create two tables:

CREATE TABLE [dbo].[TestLinkedListTableChild](

[id] [int] NULL,

[name] [nchar](100) NULL,

[previous_id] [int] NULL,

[next_id] [int] NULL,

[valueX] [int] NULL,

[commonParent] [int] NULL

)

CREATE TABLE [dbo].[Parent](

[id] [int] NULL,

[name] [nchar](100) NULL,

[value] [int] NULL

)

Then, we run the Data Generation wizard and clear the Include NULL values option.

Let’s enter the following Python codes:

  1. For the previous_id column:
    import clr
    import random
    clr.AddReference("System")
    from System import DBNull
    
    clr.AddReference("System")
    from System import Random
    random = Random(1)
    
    def main():
      i = 0
    
      max_possible_count_in_chain = 5 
      
      while 1 == 1:
         items_count_in_chain = random.Next(1, max_possible_count_in_chain)
           
         while 1 == 1:
           if (items_count_in_chain == 0 or i == 0):
    	      yield DBNull.Value
    		 
           if (items_count_in_chain == 0):
    	     i+= 1
    	     break
           i+= 1
           yield i
           items_count_in_chain -= 1
    
    main()
    
  2. For the next_id column:
    import clr
    import random
    clr.AddReference("System")
    from System import DBNull
    
    clr.AddReference("System")
    from System import Random
    random = Random(1)
    
    def main():
      i = 1
      max_possible_count_in_chain = 5 
    
      while 1 == 1:
         items_count_in_chain = random.Next(1, max_possible_count_in_chain)
           
         while 1 == 1:
           if (items_count_in_chain == 0):
    	      yield DBNull.Value
    		 
           if (items_count_in_chain == 0):
    	     i+= 1
    	     break
           i+= 1
           yield i
           items_count_in_chain -= 1
    
    main()
    
  3. For the commonParent table:
    import clr
    import random
    clr.AddReference("System")
    from System import DBNull
    
    clr.AddReference("System")
    from System import Random
    random = Random(1)
    
    def main():
      i = 0
      max_possible_count_in_chain = 5 
        
      while 1 == 1:
         i +=1 
         items_count_in_chain = random.Next(1, max_possible_count_in_chain) + 1
    	 
         while 1 == 1:
            if (items_count_in_chain == 0):
               break
            yield i
            items_count_in_chain -= 1
    	   
    main()

After that, we generate a script and run it against the database.

To remove the records from the Parent table which didn’t have corresponding records in the TestLinkedListTableChild table, we run the following script:

delete parent
where not exists 
(select 1 from TestLinkedListTableChild tl where tl.commonParent = parent.id) 

Since the last chain couldn’t be not closed,
unclosed-chain-linked-list

let’s run the following script to remove the last broken record:

delete TestLinkedListTableChild where commonParent not in (
  select commonParent from TestLinkedListTableChild tl
    where exists (
    select commonParent from TestLinkedListTableChild tl_child
    where next_id is null and tl.commonParent = tl_child.commonParent
  ) and exists (
  select commonParent from TestLinkedListTableChild tl_child
  where previous_id is null and tl.commonParent = tl_child.commonParent
 )
)

Finally, we’ve got the desired result – totally valid chains:

result-2-linked-list

RELATED ARTICLES

Whitepaper

Social

Topics

Products