In my experience, Data Source Views are not commonly used in SSIS packages. I can imagine a use for them. If I were the ETL architect leading a large team of developers on a data warehouse project, I might use the DSV to simplify the development efforts of members of my team. Using AdventureWorks as an example, I may have one team working on a star schema based around FactInternetSales and another team working on a star schema based around FactResellerSales. As I create my package templates for the team to use I can include the DSV to simplify their view of an otherwise complex dimensional model.
The MSDN article, Using Data Source Views in Packageslists the advantages of using DSVs in packages as follows:
- A data source view can be defined one time and then used by multiple data flow components.
- A data source view can be refreshed to reflect changes in its underlying data sources.
- A data source view caches the metadata from the data sources on which it is built.
- A data source view can focus the lists of objects that appear in the user interface of data flow components.
It is important to note that there are not any dependencies between a DSV and the packages that reference it. When a package uses a DSV, the definition of the view is added to package source code in the data flow component that uses it. If you delete all of the DSVs from an existing solution, all of the packages that used those DSVs will continue to function.
So here is how you use a DSV in an Integration Services package:
- Start with a blank package by loading BIDS and clicking File, New, Project, and selecting the Integration Service Project template and clicking Ok.
- Right click the Data Sources folder in the solution exploder and select new data source.
- Click the New button in the Data Source Wizard.
- Make sure the provider is the Native OLE DB\SQL Server Native Client 10.0 (assumes SQL Server 2008.)
- Select the server and the database name (localhost, AdventureWorksDW2008R2), Click Ok, and then Finish twice.
- Right click the Data Source Views folder and select New Data Source View. Click Next twice.
- Select FactInternetSales and click the right arrow to move it to the Included Objects pane.
- Click the Add Related Tables button. Your screen should look like this:
- Click Next, name your DSV InternetSales and then click Finish. Admire the pretty diagram.
- Now, double click the Package.dtsx file under the SSIS packages folder to open the blank package.
- Drag a Data Flow Task from the Toolbox onto the Control Flow.
- Double click the Data Flow Task
- Right click inside the Connection Managers section of the IDE and select New Connection From Data Source. Click Ok.
- Drag an OLE DB Source component onto the Data flow design canvas.
- Double click the OLE DB Source component.
- In the drop down for OLE DB Connection manager you will see the InternetSales DSV underneath the data source we defined.
- Select the InternetSales DSV and click Ok.
- Best practice is to never use Table or view in the Data access mode so before we change that to SQL Command, notice the Name of the table or view dropdown contains only the tables related to and including FactInternetSales. Now change it to SQL command.
- In the SQL command text box, type SELECT * FROM dbo.FactInternetSales WITH (NOLOCK);
- Click the Build Query button and then click Ok.
- Notice it changed my SQL command text to include the column names rather than SELECT * complying with best practice recommendations. I would normally delete any columns that I didn’t actually insert into my destination to save memory during execution and prevent spillage to disk when the buffers fill up.
I hope this post will help someone. If you have any questions please add a comment below.