Sunday, October 27, 2013

Extracting Data From Multiple Excel sheet with Multiple Tags...

Extracting Data from Multiple Excel sheet with Multiple Tags...
 
Step 1:  In order to Extract data from Excel sheet with Multiple sheets, Follow these steps below. Create few samples with Excel with DimCustomer or DimEmployee from Adventures Works Database.

Step 2:  Open the BIDS with SSIS Project and add a empty Package. Drag and drop the Foreach loop Container and name it as Excel File Reader


Step 3:  Create 3 Variables as varDirectory, varExtension and varExcelPath for passing the values dynamically to read the file name and sheet name from the respective Folders.

Step 4: Add these Varibles into the Excel File Reader -- Foreach loop. Set the Enumerator to File Enumerator and click on the Expression below the Enumerator Setting in Collection tab as shown

Step 5: Click on the Variable Mappings tab, and set the output of Foreach loop to a Variable varExcelPath. varExcelPath is the combination of varDirectory and varExtension -- > varExcelPath = varDirectory + varExtension. E.g. varExcelPath = E:\EnglishOccupation\Manual.xls will be created dynamically. But for designing purpose, few data has to be placed in the Variables.
Step 6: Drag and Drop the next Foreach loop and set the Enumerator as Foreach ADO.NET Schema Rowset Enumerator

Step 7: Create a ADO.net Connection called Excelschema with respective data as shown below. Select .NET Providers for OLEDB\ Microsoft Office 12.0 Access Database Engine OLEDB Provider. Add the Excel sheet path to the Server or Filename.

 
Step 8: Click on All button of the Same Connection Manager and enter Excel 12.0 in Extended Properties

















 

Step 9: Click on Excelschema connection to see the Properties and click on the Expression, added the Variable @varExcelPath to the ServerName.


Step 10: Again click on the Inner Foreach loop to add the Connection string and the Output configuration

Step 11: click on the Variable Mapping and use the Variable varSheet with the Data sheet name Management - A$ created earlier and use the Variable in Variable Mapping by setting the Index as 2

Step 12: Add the Data Flow Task with Excel Source, Derived Column and OLEDB Source. Excel source is used to read the Data from Excel sheets and Derived Column is used to find the Sheet Name and Excel Name

Step 13: Now run the package and check the Database. Create a table as shown below in the SQL Server and assign the table to OLEDB Destination. Click on the Excel Connection and add the variable @varExcelPath
 
CREATE TABLE [dbo].[CustomerData]
(
    [CustomerKey] [nvarchar](255) NULL,
    [FirstName] [nvarchar](255) NULL,
    [LastName] [nvarchar](255) NULL,
    [Gender] [nvarchar](255) NULL,
    [EmailAddress] [nvarchar](255) NULL,
    [YearlyIncome] [nvarchar](255) NULL,
    [EnglishEducation] [nvarchar](255) NULL,
    [EnglishOccupation] [nvarchar](255) NULL,
    [FilePath] [nvarchar](4000) NULL,
    [SheetName] [nvarchar](4000) NULL
) ON [PRIMARY]

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Step 14: Now run the package and check the Database table.


 
So Mission accomplished.

Daniel  J.
Dataware and Data Mining Architect
+91 8939574948...


No comments: