Problem with
SSIS – OLEDB Destination – When there is a Constraint in Table...
Few Months back, I was
working on a task with respect to OLEDB Source and Destination. I tried to
extract data from Heterogeneous Source and tried to dump the data into SQL
Server Database tables. Everything was going fine. One fine day client asked me
to add the Constraint, I.e. Composite Key for the table columns. I added the
same and ran the package. Package was successful and we deployed it in our
Production environment.
Few days back Client came
back and said few row [Almost 20 Million Rows' J ] where missing from the
box and they wanted to a postmortem on that. I was too curious about this what
is happening in the database after applying the Composite Keys. So it took a
sample data as shown below and went on doing an RND on.
I Created a table with few columns called as Source Customer -- sCustomer with 3 Columns with the data types as Varchar(10)
Create table sCustomer
(
CustomerID
Varchar(10),
CustomerName
Varchar(10),
CustomerAddress
Varchar(10)
)
then i Inserted data with some duplicate data into sCustomer
Insert sCustomer
SELECT 1 AS CustomerID, 'A' AS CustomerName, 'Chennai' AS CustomerAddress
UNION ALL
SELECT 1 AS CustomerID, 'A' AS CustomerName, 'Chennai' AS CustomerAddress
UNION ALL
SELECT 1 AS CustomerID, 'A' AS CustomerName, 'Chennai' AS CustomerAddress
UNION ALL
SELECT 2 AS CustomerID, 'A' AS CustomerName, 'Chennai' AS CustomerAddress
UNION ALL
SELECT 3 AS CustomerID, 'A' AS CustomerName, 'Chennai' AS CustomerAddress
UNION ALL
SELECT 4 AS CustomerID, 'B' AS CustomerName, 'Chennai' AS CustomerAddress
UNION ALL
SELECT 5 AS CustomerID, 'B' AS CustomerName, 'Chennai' AS CustomerAddress
UNION ALL
SELECT 6 AS CustomerID, 'F' AS CustomerName, 'Chennai' AS CustomerAddress
UNION ALL
SELECT 1 AS CustomerID, 'A' AS CustomerName, 'Chennai' AS CustomerAddress
UNION ALL
SELECT 2 AS CustomerID, 'A' AS CustomerName, 'Chennai' AS CustomerAddress
UNION ALL
SELECT 3 AS CustomerID, 'A' AS CustomerName, 'Chennai' AS CustomerAddress
UNION ALL
SELECT 4 AS CustomerID, 'A' AS CustomerName, 'Chennai' AS CustomerAddress
UNION ALL
SELECT 5 AS CustomerID, 'E' AS CustomerName, 'Chennai' AS CustomerAddress
UNION ALL
SELECT 6 AS CustomerID, 'E' AS CustomerName, 'Chennai' AS CustomerAddress
UNION ALL
SELECT 7 AS CustomerID, 'A' AS CustomerName, 'Chennai' AS CustomerAddress
After Inserting Data, the sCustomer table shows it has duplicate data
Using SSIS, Create the package with a simple Dataflow task containing OLEDB Source and OLEDB Destination
Open the OLEDB Source and add the respective table
Select the required Columns and Click OK. Create the required Destination table with the Composite Key as shown below
Double click the Destination and Set the Commit Size of data to 15. Since i have only 15 rows in my Source Customer table. I set it as 15. It depends on the Number of rows you have.
Connect the Error Output to Redirect Rows on Error as Shown
Run the Package again. Before running the package, truncate both the dCustomer and Error Log table.
After setting the Commit size to 1, the data ran into the table properly. It's because of the Transaction occurs as a batch, when one records fails, all the records are moved into Error table.
Solution 1 : Use the LOOK UP table, check for the conditions with AND Cases
E.g. CustomerID != CustomerID AND CustomerName != CustomerName then do the Insert based on NO MATCH
Solution 2 : What happens if it Truncate and Load
Instead of using the "FastLoad" Option present in the OLEDB Destination, we can simply use the "Table Or View" Option to load the Data while you apply Composite Key or Constraint.
After using the "Table Or View" Option in OLEDB Destination, the Process ran properly.
I would like to thanks my Colleagues Chinnasekar and Suseela of CIT for giving such as cool solution :-)
You guys rock....
Regards
Daniel J
Dataware and Data Mining Architect...
No comments:
Post a Comment