SQL Server Integration Services (SSIS) Training Kit Microsoft BI CoE COURSE CONTENTS Module1: Introduction Overview of BI ETL Concept ETL Tools Different ETL Tool providers Module 2: SSIS Background DTS features DTS Limitation Difference between DTS , 2005, 2008, and R2 Module 3: SSIS Architecture SSIS Designer Runtime engine Data Flow engine Object Model Module 4: Typical Uses of Integration Services Merging Data from Heterogeneous Data Stores Populating Data Warehouses and Data Marts Cleaning and Standardizing Data Automating Administrative Functions and Data Loading |Microsoft BI CoE, Offering COURSE CONTENTS Module 5: Tools Business Intelligence Development Studio SQL Server Management Studio Import and Export Wizard Package Installation Wizard Package Configuration Wizard Module 7:Objects and Concepts Package Control Flow Containers For each Loop Container For Loop Container Sequence Container Task Host Container |Microsoft BI CoE, Offering COURSE CONTENTS Module 8: Data Flow Task Data Flow Task Module 9: Data Preparation Tasks File System Task FTP Task Web Service Task XML Task Module 10: Workflow Tasks Execute Package Task Execute Process Task Message Queue Task Send Mail Task WMI Data Reader Task WMI Event Watcher Task |Microsoft BI CoE, Offering COURSE CONTENTS Module 11: SQL Server Tasks Bulk Insert Task Execute SQL Task Transfer Database Task Transfer Error Messages Task Transfer Jobs Task Transfer Logins Task Transfer Master Stored Procedures Task Transfer SQL Server Objects Task Profiling Task Data Profiling Task Module 12: Scripting Tasks Script Task |Microsoft BI CoE, Offering COURSE CONTENTS Module 15:Precendence Constraints Configuring the Precedence Constraint Variables and Expressions System Variables User Defined Variables Configuring Variables Module 16:Package configurations Package Configuration Types XML configuration file Environment variable Registry entry Parent package variable SQL Server table |Microsoft BI CoE, Offering COURSE CONTENTS Module 17: Connections Configuring the Precedence Constraint Module 18: Data Flow Task Sources Transformations Destinations Module 19: Deployment and Administration Package configurations Deployment Deploying packages Executing packages Scheduling packages with SQL Server Agent |Microsoft BI CoE, Offering Overview of BI and Data Warehousing Business Intelligence • Business Intelligence covers different technologies for gathering, storing, analyzing and providing access to data. • BI is a decision support because it simplifies information discovery and analysis, making . Data Warehousing • DW contains historical data derived from transaction data. • Data warehouse is a relational database that is designed for query and analysis rather than for transaction processing • DW includes ETL,OLAP client analysis tools, and other application |Microsoft BI CoE, Offering Copyright: Satyam Computer Services ETL CONCEPT ETL stands for extraction, transformation and loading. ETL is a process that involves the following tasks: Extract • Extracting data from source operational or archive systems which are the primary source of data for the data warehouse Transform • Transforming the data - which may involve cleaning, filtering, validating and applying business rules Load • Loading the data into a data warehouse or any other database or application that houses data |Microsoft BI CoE, Offering Copyright: Satyam Computer Services ETL TOOLS & DIFFERENT PROVIDERS ETL Tools: Extract, Transform, Load tools are software packages that facilitate the performing of ETL tasks. At present the most popular and widely used ETL tools and applications on the market are: IBM Websphere DataStage Informatica Power center Oracle warehouse builder AB Initio Pentaho Data Integration (open source ETL) SAS ETL studio Cognos Decisionstream Business Objects Data Integrator (BODI) Microsoft SQL Server Integration Services (SSIS) |Microsoft BI CoE, Offering Copyright: Satyam Computer Services MODULE 2 |Microsoft BI CoE, Offering MICROSOFT’S ETL BACKGROUND Initially introduced as DTS (Data Transformation Services), it’s predecessor of SSIS. DTS is a set of objects that extract, transform, and load information to and/or from a database. DTS was an original component of the Microsoft SQL Server 2000. It is capable of transforming and loading data from heterogeneous sources, using OLE DB, ODBC, or files specified as text only, into any database that supports them. DTS packages are created in Enterprise Manager |Microsoft BI CoE, Offering DTS LIMITATIONS - a limited set of tasks and transformations. - The graphical designer combined the workflow and data flow on a single design surface and offered limited control flow options. - The SSIS has lot added features as per the current requirements as compared to DTS |Microsoft BI CoE, Offering Difference between DTS , 2005, 2008, and R2 DTS packages are created in Enterprise Manager. In SSIS, the packages are created using Business Intelligence Development Studio (BIDS). DTS, the designer consists of a single pane. And to the right we see two options: the Connections and the Workflow tasks. The Connections are used to make the connections to a wide variety of data sources and destinations SSIS, the designer is split into 4 design panes: Control Flow, Data Flow, Event Handlers, and Package Explorer. |Microsoft BI CoE, Offering SSIS ARCHITECTURE… ARCHITECTURE Integration Services (SSIS) consists of four key parts SSIS Designer Integration Services Data Flow Integration Services Runtime Integration Services Object Model |Microsoft BI CoE, Offering SSIS ARCHITECTURE… SSIS Designer: SSIS is a graphical tool that you can use to create and maintain Integration Services packages.. Solution Explorer Manage your SSIS project from here ToolBox Select pre-defined components from here Control Flow, Data Flow, Event Handlers and Package Explorer Tabs Properties Change settings for components Connection Managers Create/manage data source connections for your project |Microsoft BI CoE, Offering SSIS ARCHITECTURE… Runtime Engine: • The Integration Services runtime saves the layout of packages, runs packages, and provides support for logging, breakpoints, configuration, connections, and transactions. Data Flow Engine : • The Data Flow task encapsulates the data flow engine. • The data flow engine provides the in-memory buffers that move data from source to destination, and calls the sources that extract data from files and relational databases. • The data flow engine also manages the transformations that modify data, and the destinations that load data or make data available to other processes. Object Model: • The Integration Services object model includes managed application programming interfaces (API) for creating custom components for use in packages, or custom applications that create, load, run, and manage packages. |Microsoft BI CoE, Offering TYPICAL DATA SYSTEMS Merging Data from Heterogeneous Data Stores Populating Data Warehouses and Data Marts Cleaning and Standardizing Data Building Business Intelligence into a Data Transformation Process Automating Administrative Functions and Data Loading |Microsoft BI CoE, Offering TOOLS Business Intelligence Development Studio (BIDS) - Can perform the following tasks: BIDS creates packages that include complex control flow, data flow, eventdriven logic, and logging. Test and debug packages Create configurations Create a deployment utility that can install packages Save copies of packages to the SQL Server msdb database, the SSIS Package Store |Microsoft BI CoE, Offering TOOLS… SQL Server Management Studio - provides the Integration Services service to manage packages and monitor running packages. It performs following tasks - Create folders to organize packages in a way that is meaningful to your organization. - Run packages that are stored on the local computer by using the package execution utility. - Run the package execution utility to generate a command line to use when running the dtexec command prompt utility. - Import and export packages to and from the SQL Server msdb database, the SSIS Package Store and the file system |Microsoft BI CoE, Offering TOOLS…… Import and Export Wizard The SQL Server Import and Export Wizard provides the simplest method of copying data between data sources and of constructing basic packages |Microsoft BI CoE, Offering TOOLS….. Package Configuration Wizard Package configurations update the values of properties at run time. Package configurations provide the following benefits: Configurations move packages from a development environment to a production environment. Configurations are useful when you deploy packages to many different servers. Configurations make packages more flexible. |Microsoft BI CoE, Offering OBJECTS AND CONCEPTS Package Control Flow Elements Data Flow Elements Integration Services Connections Variables Event Handlers Log Providers |Microsoft BI CoE, Offering OBJECTS AND CONCEPTS… Package A package is an organized collection of connections, control flow elements, data flow elements, event handlers, variables, and configurations,. The package is the unit of work that is retrieved, executed, and saved. The following diagram shows a simple package that contains a control flow with a Data Flow task, which in turn contains a data flow. |Microsoft BI CoE, Offering CONTROL FLOW ELEMENTS Control flow consists of below components: Containers • Provides structures in packages. Tasks • Provides functionality the executables, Precedence • Connects containers, and tasks into constraints an ordered control flow. |Microsoft BI CoE, Offering CONTROL FLOW ELEMENTS… Containers • The Foreach Loop container: It defines a repeating control flow in a package. • It repeats the control flow for each member of a specified enumerator. • There are following enumerators: Foreach ADO Foreach ADO.NET Schema Foreach From Variable Foreach Item Foreach File Foreach Nodelist Foreach SMO |Microsoft BI CoE, Offering CONTROL FLOW ELEMENTS… The For Loop container: It defines a repeating control flow in a package In each repeat of the loop, the For Loop container evaluates an expression and repeats its workflow until the expression evaluates to False. The For Loop container uses the following elements to define the loop: An optional initialization expression that assigns values to the loop counters. An evaluation expression that contains the expression used to test whether the loop should stop or continue. An optional iteration expression that increments or decrements the loop counter |Microsoft BI CoE, Offering CONTROL FLOW ELEMENTS… The Sequence container : Sequence containers group the package into multiple separate control flows, each containing one or more tasks and containers that run within the overall package control flow. Benefits of using a Sequence container: Managing properties Disabling groups of on multiple tasks in Providing scope for tasks to focus package one location by setting variables that a group debugging on one properties on a of related tasks and subset of the package Sequence container containers use. instead of on the control flow. individual tasks. The task host container: It encapsulates a single task. This container extends the use of variables and event handlers to the task level. |Microsoft BI CoE, Offering TASKS Task: Tasks do the work in packages. Integration Services includes tasks for performing a variety of functions. The Data Flow task DFT defines and runs data flows that extract data, apply transformations, and load data. DFT encapsulates the data flow engine that moves data between sources and destinations At run time, the DFT builds an execution plan from the data flow, and the data flow engine executes the plan. A Data Flow task can include multiple data flows. |Microsoft BI CoE, Offering DATA FLOW ELEMENTS Data flow components Sources Transformations Destinations |Microsoft BI CoE, Offering DATA PREPARATION TASKS File System Task The File System task performs operations on files and directories in the file system. A File System Task can: Copy a directory Copy a file Create a directory Delete a directory Delete the contents of a directory Delete a file Move a directory Move a file Rename a file Change the attributes of a file |Microsoft BI CoE, Offering DATA PREPARATION TASKS…. FTP Task The FTP task downloads and uploads data files and manages directories on servers. FTP task can be for the following purposes: Creates a folder on the FTP server. Sends a file from the local computer to the FTP server. Saves a file from the FTP server to the local computer. Deletes a folder on the local computer. Deletes a folder on the FTP server. Deletes a file on the local computer. Deletes a file on the FTP server. |Microsoft BI CoE, Offering SHOW CONFIGURATIONS |Microsoft BI CoE, Offering DATA PREPARATION TASKS…. Web Service Task The Web Service task executes a Web service method. Web Service task can be use the for the following purposes: • Writing to a variable the values that a Web service method returns. • Writing to a file the values that a Web service method returns |Microsoft BI CoE, Offering DATA PREPARATION TASKS…. XML Task The XML task can retrieve XML documents, apply operations to the documents, merge multiple documents, or validate, compare, and save the updated documents to files and variables. XML Task uses: Dynamically Reformat modify XML an XML documents document. at run time. Select sections of an XML document Merge documents from many source |Microsoft BI CoE, Offering WORKFLOW TASKS Execute Package Task Execute Package Task can run other packages as part of a workflow. A package that runs other packages is generally referred to as the parent package, and the packages that a parent workflow runs are called child packages. The child package can be run in the process of the parent package, or it can be run in its own process. The Execute Package task can run packages stored in the SQL Server msdb database and packages stored in the file system. Can use the Execute Package task for the following purposes: • |Microsoft BI CoE, Offering WORKFLOW TASKS… Execute Package Task uses Breaking down complex package workflow. Reusing parts of packages. Grouping work units Controlling package security |Microsoft BI CoE, Offering WORKFLOW TASKS….. Execute Process Task The Execute Process task runs an application or batch file as part of a SQL Server Integration Services package workflow. Execute Process task can open any standard application. When the Execute Process task runs a command-line application, it provides input to the application through a variable Message Queue Task The Message Queue task allows you to use Message Queuing (also known as MSMQ) to send and receive messages between SQL Server Integration Services packages, or to send messages to an application queue that is processed by a custom application. |Microsoft BI CoE, Offering WORKFLOW TASKS….. Message Queue Uses Delaying task execution until other packages check in Sending data files to the computer that processes them Distributing files throughout your enterprise |Microsoft BI CoE, Offering WORKFLOW TASKS….. Send Mail Task The Send Mail Task sends email via Simple Mail Transfer Protocol (SMTP). The Send Mail Task is most commonly used as a notification tool. This allows to receive information about the package that can be passed into the mail task through variables. The Send Mail Task can be placed at the end of a Control Flow to send email on the successful The Send Mail Task can also be used to send files, because it has the ability to send attachments. |Microsoft BI CoE, Offering WORKFLOW TASKS….. Can configure the Send Mail task in the following ways: Provide the message text for the e-mail message. Provide a subject line for the e-mail message. Set the priority level of the message. Specify the recipients on the To, Cc, and Bcc lines. |Microsoft BI CoE, Offering SQL SERVER TASKS Bulk Insert Task The Bulk Insert task provides an efficient way to copy large amounts of data into a SQL Server table or view It can transfer data only from a text file into a SQL Server table or view. The destination must be a table or view in a SQL Server database. If the text file is located on the same computer as the SQL Server database into which data is inserted, the copy operation occurs at an even faster rate because the data is not moved over the network. |Microsoft BI CoE, Offering SQL SERVER TASKS…. Execute SQL Task The Execute SQL task runs SQL statements or stored procedures from a package The task can contain either a single SQL statement or multiple SQL statements that run sequentially. Execute SQL Task Uses: Truncate a table or view in preparation for inserting data. Create, alter, and drop database objects such as tables and views. Re-create fact and dimension tables before loading data into them. Run stored procedures. Save the rowset returned from a query into a variable. |Microsoft BI CoE, Offering SQL SERVER TASKS…. Connecting to a Data Source from the Execute SQL Task • The Execute SQL task can use different types of connection managers to connect to the data source. EXCEL OLE DB ODBC ADO ADO.NET SQLMOBILE |Microsoft BI CoE, Offering SQL SERVER TASKS Transfer Database Task The Transfer Database task transfers a SQL Server database between two instances of SQL Server. This task supports two modes of database transfer - namely DatabaseOnline and DatabaseOffline. The Transfer Database task can also be configured to permit overwriting a destination database that has the same name, replacing the destination database. |Microsoft BI CoE, Offering SHOW CONFIGURATION |Microsoft BI CoE, Offering SQL SERVER TASKS…. Transfer Error Messages Task The Transfer Error Messages task transfers one or more SQL Server user-defined error messages between instances of SQL Server. User-defined messages are messages with an identifier that is equal to or greater than 50000 The Transfer Error Messages task can be configured to transfer all error messages, or only the specified error messages An error message is defined as a duplicate error message if the identifier and the language are the same. The Transfer Error Messages task can be configured to handle existing error messages in the following ways: Overwrite existing jobs. Fail the task when duplicate jobs exist. Skip duplicate jobs. |Microsoft BI CoE, Offering SHOW CONFIGURATION |Microsoft BI CoE, Offering SQL SERVER TASKS…. Transfer Jobs Task The Transfer Jobs task transfers one or more SQL Server Agent jobs between instances of SQL Server. The Transfer Jobs task can be configured to transfer all jobs, or only specified jobs. You can also indicate whether the transferred jobs are enabled at the destination. The jobs to be transferred may already exist on the destination. The Transfer Jobs task can be configured to handle existing jobs in the following ways: • Overwrite existing jobs. Fail the task when duplicate jobs exist. Skip duplicate jobs. |Microsoft BI CoE, Offering SHOW CONFIGURATION |Microsoft BI CoE, Offering SQL SERVER TASKS…. Transfer Login Task The Transfer Logins task transfers one or more logins between instances of SQL Server. The Transfer Logins task supports a source and destination that is SQL Server 2000 or SQL Server.. The Transfer Logins task can be configured to transfer all logins, only specified logins, or all logins that have access to specified databases only. The logins to be transferred may already exist on the destination. The Transfer Logins task can be configured to handle existing logins in the following ways: • Overwrite existing jobs. Fail the task when duplicate jobs exist. Skip duplicate jobs. |Microsoft BI CoE, Offering SHOW CONFIGURATION |Microsoft BI CoE, Offering SQL SERVER TASKS…. Transfer Master Stored Procedures Task The Transfer Master Stored Procedures task transfers one or more user-defined stored procedures between master databases on instances The Transfer Master Stored Procedures task can be configured to transfer all stored procedures or only specified stored procedures. This task does not copy system stored procedures. Transfer SQL Server Objects Task The Transfer SQL Server Objects task transfers one or more types of objects in a SQL Server database between instances of SQL Server. |Microsoft BI CoE, Offering SQL SERVER TASKS…. Objects to Transfer Tables User-Defined Functions Views Stored Procedures Defaults User-Defined Data Types |Microsoft BI CoE, Offering DATA PROFILING TASK Data Profiling Task Data Profiling Task provides data profiling functionality inside the process of extracting, transforming, and loading data. To analyze data in a SQL Server database and, from the results of that analysis, generate XML reports that can be saved to a file or an SSIS variable. Task is an easy and effective method to profile target source systems (tables & views) prior to building ETL solutions consuming their data. |Microsoft BI CoE, Offering DATA PROFILING TASK The Data Profiling Task can compute eight different data profiles. Five profiles analyze individual columns Three profiles analyze multiple columns or relationships • • • • • Column Length Distribution Profile Column Null Ratio Profile Column Pattern Profile Column Statistics Profile Column Value Distribution Profile • Candidate Key Profile • Functional Dependency Profile • Value Inclusion Profile |Microsoft BI CoE, Offering DATA PROFILING TASK …. Prerequisites for a Valid Profile A profile is not valid unless you select tables and columns that are not empty, and the columns contain data types that are valid for the profile. Valid Data Types Some of the available profiles are meaningful only for certain data types. Valid Tables and Columns If the table or column is empty, the Data Profiling takes the following actions: Data Profiling task does not compute any profiles. Data Profiling task computes only the Column Null Ratio profile. The task does not compute the Column Length Distribution profile, Column Pattern profile, Column Statistics profile, or Column Value Distribution profile. |Microsoft BI CoE, Offering SCRIPTING TASKS Script Task The Script task provides code to perform functions that are not available in the built-in tasks and transformations that SQL Server Integration Services provides. This task can accomplish anything that can be done with any .NET programming. The Script task can also combine functions in one script instead of using multiple tasks and transformations. |Microsoft BI CoE, Offering SCRIPTING TASKS Script task uses: Access data by using other technologies that are not supported by built-in connection types. Create a package-specific performance counter. Identify whether specified files are empty or how many rows they contain, and then based on that information affect the control flow in a package |Microsoft BI CoE, Offering PRECEDENCE CONSTRAINTS Precedence constraints Precedence constraints link executables, containers, and tasks in packages in a control flow, and specify conditions that determine whether executables run. Precedence constraints supports three options to define workflow: Success Failure Completion |Microsoft BI CoE, Offering CONFIGURING THE PRECEDENCE CONSTRAINT Besides the regular Success and Failure constraints, can define precedence constraint Workflow by Expressions To add an expression, double-click the precedence constraint to open the Precedence Constraint Editor dialog box, as shown in below figure. |Microsoft BI CoE, Offering CONFIGURING THE PRECEDENCE CONSTRAINT.. To add an expression to a precedence constraint, the first step is to select Evaluation operation. And Evaluation operation has following options: Constraint: • The precedence constraint is evaluated solely on the option selected in the Value property. The precedence constraint will evaluate to true only if the precedence executable runs successfully. Expression: • The precedence constraint is evaluated based on the expression defined in the Expression text box. If the expression evaluates to true, the workflow continues and the constrained executable runs. Expression and Constraint: • The precedence constraint is evaluated based on both the Value property and the expression. Both must evaluate to true for the constrained executable to run. Expression or Constraint: • The precedence constraint is evaluated based on either the Value property or the expression. At least one of these properties must evaluate to true for the constrained executable to run. |Microsoft BI CoE, Offering CONFIGURING THE PRECEDENCE CONSTRAINT.. Define precedence constraint Workflow by Logical AND or Logical OR: Precedence Constraint Editor dialog box has following two options at the bottom: Logical AND • All precedence constraints that point to the constrained executable must evaluate to true in order for that executable to run. This is the default option. If it is selected, the arrow is solid. Logical OR: • Only one precedence constraint that points to the constrained executable must evaluate to true in order for that executable to run. If this option is selected, the arrow is dotted. |Microsoft BI CoE, Offering VARIABLES Variables store values that a SQL Server Integration Services package and its containers, tasks, and event handlers can use at run time. Can use variables in Integration Services packages for the following purposes: Updating properties of package elements at run time Including an in-memory lookup table Loading variables with data values and then using them to specify a search condition in a WHERE clause. Loading a variable with an integer and then using the value to control looping within a package control flow. Populating parameter values for Transact-SQL statements at run time. Building expressions that include variable values. |Microsoft BI CoE, Offering VARIABLES… Type of variable SSIS supports following two types of variables: System variables User-defined variables • System variables are defined by Integration Services and cannot create additional system variables. System variables contain useful information about a package, container, task, or event handler. • User-defined variables are defined by package developers, and You can create as many user-defined variables as a package requires. The names of user-defined and system variables are case sensitive. |Microsoft BI CoE, Offering VARIABLES To add a variable On the SSIS menu, click Variables or right click on control flow and select vaiables: |Microsoft BI CoE, Offering VARIABLES In the Variables window, click the Add Variable icon. The new variable is added to the list: |Microsoft BI CoE, Offering VARIABLES To delete a variable |Microsoft BI CoE, Offering EXPRESSIONS Property expression can add in most of a package’s objects, which includes the package itself, the containers and tasks in the control flow, connection managers, event handlers, log providers and variables. To add expression right click on the object and select property. Properties of variable: |Microsoft BI CoE, Offering EXPRESSIONS…. To add an expression to a property on a specific component, click the ellipses button to the right of the Expressions property in the Properties window. This launches the Property Expressions Editor: As per below figure we have created expression Getdate() for the variable which we have created. |Microsoft BI CoE, Offering EXPRESSIONS…. The Expression Builder dialog box provides several tools to create expressions: The upper-left window lists the system and user-defined variables that can be used in your expressions. The upper-right window lists the functions and operators that you can include in your expressions. The Expression window provides a workspace for creating your expressions. You can drag variables, functions, and operators from the two upper windows into the workspace. The Evaluate Expression button launches the logic necessary to evaluate the expression in the Expression window. |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS Packager Configuration allows to update the properties and objects of the package at run time. It helps to move the packages from Development environment to Production environment. Enabling Package Configurations On the SSIS menu, click Package Configurations. In the Package Configurations Organizer dialog box, select Enable Package Configurations, and then click Add. |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS Types of Package Configurations SSIS has following five types of Package Configuration: XML configuration file: With this type of configuration, the configuration value and the path of the property being configured are saved in a XML file. Package Configuration Wizard can create XML file. We need not to create manually. An XML configuration file has two parts. The header contains metadata about the file itself, like creator, the name and ID of the package that was used when creating the file, and the creation date and time. The configuration section is where the path to the properties being updated and the configuration values to be used are stored. Below is a sample of a XML configuration file: In the example, there is only one configuration entry, whose Path attribute points to the Value property of a variable declared at the package level called [User:: bSSISDBConfig]. And the value of the variable will be updated using the string contained in the ConfiguredValue element. |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS… Create XML configuration On the SSIS menu, click Package Configurations. In the Package Configurations Organizer dialog box, select Enable Package Configurations, and then click Add. Package Configuration Wizard will open click Next. On the Select Configuration Type page, verify that the Configuration type is set to XML configuration file and click Browse to select XML file location. |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS… Create XML configuration On the Select Properties to Export page, in the Objects pane, expand Variables, expand FilePath variable , expand Properties, and then select Value . On the Completing the Wizard page, type a configuration name for the configuration. This is the configuration name that is displayed in the Package Configuration Organizer dialog box. SSISTraining_config. SSIS_GlobalConfig.dtsConfig XML file will get created on selecetd file path. We can open that file in Notepad and can update variables value. Multiple packages can use the same XML configuration file. |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS… Environment Variable Configurations With this type of package configuration, create an environment variable for each package property that will need to update during run time. On the Select Configuration Type page, verify that the Configuration type is set to Environment Variable and select Environment Variable. On the Select Properties to Export page, in the Objects pane, expand Variables, expand FilePath variable , expand Properties, and then select Value . On the Completing the Wizard page, type a configuration name for the configuration. This is the configuration name that is displayed in the Package Configuration Organizer dialog box. |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS… On the Select Properties to Export page, in the Objects pane, expand Variables, expand FilePath variable , expand Properties, and then select Value . On the Completing the Wizard page, type a configuration name for the configuration. This is the configuration name that is displayed in the Package Configuration Organizer dialog box. Important points when using this type of package configurations: The environment variable(s) must exist on all machines where the package is going to be deployed. Can use either system or user environment variables, but user environment variables are not visible to other users. Multiple package configurations from multiple packages can reference the same system environment variable |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS… Registry Entry Configurations On the Select Configuration Type page of the configuration wizard select registry entry as the configuration type . And select valid registry key name that exists under the Windows registry HKEY_CURRENT_USER key. On the Select Properties to Export page, in the Objects pane, expand Variables, expand FilePath variable , expand Properties, and then select Value On the Completing the Wizard page, type a configuration name for the configuration. This is the configuration name that is displayed in the Package Configuration Organizer dialog box. |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS… Important points when using Registry Entry type of package configurations: The wizard does not create or modify the registry keys; the registry keys must be manually created and edited. The registry keys must exist, either directly or indirectly, under HKEY_CURRENT_USER, and the configuration value has to be store in an entry called ‘value’. Registry keys under HKEY_CURRENT_USER are visible only to the user that created them, hence, the package can apply the configurations only when it is executed using the credentials of that user. We have to use one registry key per configuration value. If we are configuring multiple properties, we shpuld consider using SQL Server or XML file configurations instead. |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS… Parent Package Variable Configurations When execute a package (the child) from another package (the parent) via the Execute Package task, we can use Parent Package Variable configurations in the child package to pass variable values from the parent. Regardless of its name, this configuration type has to be set up in the child package. Child package is unaware of the existence of the parent package, and the name of the variable that you enter is not validated when you create the configuration. Child package is unaware of the existence of the parent package, and the name of the variable that you enter is not validated when you create the configuration. |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS… Create Parent Package Variable Configurations On the Select Configuration Type page of the configuration wizard select Parent package variable as the configuration type. To select variable when using the direct method type the variable name exactly as it appears in the parent package. Alternatively, select an environment variable that contains the name of the parent package variable. The next page of the wizard allows to select the property to be updated, in the same way as when using registry entry or environment variable configurations. |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS… Important points when using parent package type of package configurations: By design, parent package variables are always applied last because this type of configuration may not suitable in every scenario. This type of configuration is applied only when the child package is executed via the Execute Package task. SSIS variable names are casesensitive. Always make sure that both spelling and casing of the variable name in the configuration entry are identical to the ones in the parent package. For example, “Myvariable” and “MyVariable” are considered different variables |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS… SQL Server Configurations This configuration type offers almost the same level of flexibility and functionality as XML configuration files. The table can be created in any database that is accessible by the package at execution time. This is the default structure of the table: CREATE TABLE [dbo].[SSIS Configurations] ( ConfigurationFilter NVARCHAR(255) NOT NULL, ConfiguredValue NVARCHAR(255) NULL, PackagePath NVARCHAR(255) NOT NULL, ConfiguredValueType NVARCHAR(20) NOT NULL ) |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS… These fields are used as follows: ConfigurationFilter • This field is used by SSIS to indentify a set of property/values pairs that are part of the same configuration entry in the Package Configurations Organizer. ConfigurationValue • It stores the value that is used to update the package property specified in PackagePath column. PackagePath • The path that point to the property being configured. ConfiguredValueType • The SSIS data type of the property being configured. |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS… Create SQL Server Configurations On the Select Configuration Type page of the configuration wizard select Parent package variable as the configuration type. Then choose method to provide the connection information to the configuration table. With the direct method, the connection information, configuration table and filter are stored inside of the package. The indirect method instead allows storing that information in an environment variable. The next page of the wizard allows to select the property to be updated, in the same way as when using XML configurations. |Microsoft BI CoE, Offering PACKAGE CONFIGURATIONS… Important points when using SQL Server type of package configurations: SQL Server configurations are very flexible, as can store multiple configuration properties/values in the same table. Multiple packages can use the same configuration table and filter. We can have a configuration filter for each entry in the configuration table, or we can use the same filter to group a set of entries. We can use existing database security and backup policies to protect the configuration table contents. A drawback of this configuration type is that it relies on an SSIS connection manager to get access to the configuration table, but does not have built-in support for updating its connection string. In other words, the connection string that points to the configuration table is hard-coded inside of the package, and we have to take extra steps in order to update it. The /CONN option of DTExec utility can be used to override the connection string used by the configuration only in SSIS 2008. This is because SQL Server 2008 Integration Services reloads configurations after applying DTExec command line options. |Microsoft BI CoE, Offering CONNECTIONS SSIS has below connections: ADO ADO.NET CACHE EXCEL FILE FLATFILE FTP HTTP MULTIFILE MULTIFLATFILE ODBC OLEDB SMOSever SMTP SQLMobile |Microsoft BI CoE, Offering CONNECTIONS To create connection right click on connection manager window and select connection that want to create: Connection Managers Create/manage data source connections for your project |Microsoft BI CoE, Offering DATA FLOW TASK The Data Flow task encapsulates the data flow engine that moves data between sources and destinations. A data flow consists of at least one data flow component, but it is typically a set of connected data flow components. At run time, the Data Flow task builds an execution plan from the data flow, and the data flow engine executes the plan. Create Data Flow Task Deploy & Administer Add Source Transformation Destination • Source data from required file systems • Data manipulation • Store data • In desired format Execute Packages Scheduling |Microsoft BI CoE, Offering DATA FLOW TASK To add data flow task in SSIS Designer drag Data Flow Task from tool box in control flow tab : Drag Data Flow Task |Microsoft BI CoE, Offering DATA FLOW TASK….. To add source, Transformation and destination to the Data Flow Task Double click on Data Flow Task or Click on Data Flow Tab in SSIS designer |Microsoft BI CoE, Offering Adding and Configuring Source The source data can be in any of the following formats ADO NET Source Excel Source Flat File Source OLE DB Source Raw File Source Script Component XML Source | Sector, Alliance, Offering Adding and Configuring Source ….. Example : Flat File Source Open the Data Flow designer, either by doubleclicking the Extract Sample Currency Data data flow task or by clicking the Data Flow tab. In the Toolbox, expand Data Flow Sources, and then drag a Flat File Source onto the design surface of the Data Flow tab In the Flat file connection manager box, type or select Sample Flat File Source Data Click Columns and verify that the names of the columns are correct. | Sector, Alliance, Offering Transformation The following transformations are possible in SSIS Aggregate Audit Cache Transform Character Map Conditional Split Copy Column Data Conversion Data Mining Query Derived Column Export Column Fuzzy Grouping Fuzzy Lookup Import Column Lookup Merge Merge Join Multicast OLE DB Command Row Count Row Sampling Script Component Slowly Changing Dimension Percentage Sampling Pivot Sort Term Extraction Term Lookup Union All Unpivot | Sector, Alliance, Offering Tranformation ….. Example : Lookup Transformations (Currency Key) In the Toolbox, expand Data Flow Transformations, and then drag Lookup onto the design surface of the Data Flow tab. Place Lookup directly below the Extract Sample Currency Data source. Click the Extract Sample Currency Data flat file source and drag the green arrow onto the newly added Lookup transformation to connect the two components. In the Available Input Columns panel, drag joining column to the Available Lookup Columns panel and drop it. On joining column of available lookup columns | Sector, Alliance, Offering Destinations ADO NET Destination Data Mining Model Training Destination DataReader Destination Dimension Processing Destination Excel Destination Flat File Destination OLE DB Destination Partition Processing Destination Raw File Destination Recordset Destination Script Component SQL Server Compact Edition Destination SQL Server Destination | Sector, Alliance, Offering Adding Destination ….. Example : OLE DB Destination In the Toolbox, expand Data Flow Destinations, and drag OLE DB Destination onto the design surface of the Data Flow tab. Place the OLE DB destination directly below the Lookup DateKey transformation. Click the Lookup Date Key transformation and drag the green arrow over to the newly added OLE DB Destination to connect the two components together. | Sector, Alliance, Offering Adding Destination ….. Example : OLE DB Destination (2) Click Mappings. Verify that the AverageRate, CurrencyKey, EndOfDayRate, and TimeKey input columns are mapped correctly to the destination columns. If samenamed columns are mapped, the mapping is correct. | Sector, Alliance, Offering Deployment Deployment Options: • Deploy to the file system • Deploy to the package store • Deploy to SQL Server Three ways to deploy our packages: • Create a deployment utility from our project • Use the DTUTIL command line tool • Use SQL Server Management Studio (SSMS) | Sector, Alliance, Offering Deployment …. SQL Server Integration Services (SSIS) Deployment Utility The deployment utility can be used to create an SSIS package installer. The deployment utility is a built-in feature in an Integration Services project. In this section we will review the steps to enable the deployment utility and create a deployment Launch BIDS by selecting SQL Server Business Intelligence Development Studio from the Microsoft SQL Server program group Click File, Open, Project / Solution Navigate to the TutorialSample-1 project in Solution Explorer Right click on the Tutorial-Sample-1 project in the Solution Explorer and select Build Click Deployment Utility in the Configuration Properties list Right click on the Tutorial-Sample-1 project and select Properties | Sector, Alliance, Offering Deployment …. Launch BIDS by selecting SQL Server Business Intelligence Development Studio from the Microsoft SQL Server program group | Sector, Alliance, Offering Deployment …. Click File, Open, Project / Solution | Sector, Alliance, Offering Deployment …. Navigate to the TutorialSample-1 project in Solution Explorer Right click on the Tutorial-Sample-1 project in the Solution Explorer and select Build Click Deployment Utility in the Configuration Properties list | Sector, Alliance, Offering Execute Package Three options available to execute an SSIS package: • DTEXEC command line utility • DTEXECUI windows application • SQL Server Agent | Sector, Alliance, Offering Additional Information Execute Package SQL Server includes the command line tool DTEXEC.EXE which can be used to execute an SSIS package. DTEXEC can be run from a Command Prompt or from a batch (.BAT) file. Navigate to the Tutorial-Sample-1 Type the following command to execute the CreateSalesForecastInput.dtsx package To see the complete list of command line options for DTEXEC type: DTEXEC /FILE CreateSalesForecastInput.dtsx DTEXEC /? | Sector, Alliance, Offering Execute Package Windows application to execute SSIS packages Navigate to the Tutorial-Sample-1 project folder Double-click on the SSIS package As a general rule you can simply click the Execute button to run your package. Fine tune your execution by clicking through the various screens and entering your own settings | Sector, Alliance, Offering Scheduling Open SSMS connect to the Database Engine drill down to the SQL Server Agent node in the Object Explorer Right click on the Jobs node and select New Job Go to the Steps page, click New, and fill in the dialog as shown | Sector, Alliance, Offering Scheduling ….. Set up the proxy is to create a credentials Navigate to Security then Credentials Right click to create a new credential as shown Navigate to SQL Server Agent then Proxies in SSMS Object Explorer Right click to create a new proxy as shown | Sector, Alliance, Offering Scheduling ….. Specify the credential and check SQL Server Integration Services Package. Now when you create or edit a SQL Server Agent job step, you can specify the proxy for the Run as setting as shown | Sector, Alliance, Offering Thank You | Sector, Alliance, Offering