SQL Server Integration Services: Difference between revisions

Content deleted Content added
Added architectural image
Features of the data flow task: [[File:SSIS workflow.png|thumb|An example of a SQL Server Integration Services (SSIS) workflow, by visually programming the steps needed to transform data from source to destination.|alt=Diagram with seven nodes. 1: OLE DB Source (Source Table), flows to 3: Sort Source Table, flows to 5: Merge Join. Another flow goes from 2: OLE DB Source (Dest Table), flows to 4: Sort Dest Table, flows to 5: Merge Join. From 5: Merge Join, the flow goes to 6: Conditional Spl
 
(430 intermediate revisions by more than 100 users not shown)
Line 1:
{{short description|Database migration software}}
'''SQL Server Integration Services''' ('''SSIS''') is a component of [[Microsoft SQL Server]] 2005. It replaces [[Data Transformation Services]], a feature of SQL Server since Version 7.0.
{{Infobox software
[[Image:http://msdn2.microsoft.com/en-us/library/ms141026.Local_-1605283248_max_sql_arch_dts.gif]]
| logo =
Integration Services provides a platform to build data integration and workflow applications. The primary use for SSIS is [[data warehouse|data warehousing]], as the product features a fast and flexible data extraction, transformation, and loading ([[Extract, transform, load|ETL]]) engine. The tool may also be used to automate maintenance of SQL Server databases.
| screenshot =
| developer = [[Microsoft]]
| latest_release_version =
| latest_release_date =
| latest_preview_version =
| latest_preview_date =
| operating_system = [[Microsoft Windows]]
| genre = ETL Tools
| license = [[Proprietary software|Proprietary]] [[commercial software]]
| website = {{URL|https://technet.microsoft.com/en-us/library/ms141026.aspx}}
}}
'''Microsoft SQL Server Integration Services''' ('''SSIS''') is a component of the [[Microsoft SQL Server]] database software that can be used to perform a broad range of [[data migration]] tasks.
 
SSIS is a platform for [[data integration]] and [[workflow application]]s. It features a [[data warehouse|data warehousing]] tool used for data [[Extract, transform, load|extraction, transformation, and loading (ETL)]]. The tool may also be used to automate maintenance of SQL Server databases and updates to multidimensional [[OLAP cube|cube data]].
==Features==
 
First released with Microsoft SQL Server 2005, SSIS replaced [[Data Transformation Services]], which had been a feature of SQL Server since Version 7.0. Unlike DTS, which was included in all versions, SSIS is only available in the "Standard", "Business Intelligence" and "Enterprise" editions.<ref>{{cite web | title = Features Supported by the Editions of SQL Server 2014 | url = http://msdn.microsoft.com/en-gb/library/cc645993.aspx#SSIS|website=msdn.microsoft.com | publisher = Microsoft Developer Network | access-date = 20 August 2014}}</ref> With Microsoft "Visual Studio Dev Essentials" it is now possible to use SSIS with Visual Studio 2017 free of cost so long as it is for development and learning purposes only.
Like DTS, SSIS features a [[Wizard (software)|wizard]] that lets the user readily create a package which moves data from a single data source to a destination with no transformations. This tool is called the '''Import/Export Wizard'''. The Wizard is approprite for use to quickly move data into or out of SQL Server from or to a variety of sources, including text files and other SQL Server instances.
 
==Features==
Developers tasked with creating or maintaining SSIS packages utilize a visual development tool similar to '''Microsoft Visual Studio''' called the SQL Server Business Intelligence Studio. It allows users to edit '''SSIS packages''' using a drag-and-drop user interface. A scripting environment in which to write programming code is also available in the tool. The package holds a variety of elements that define the workflow for the package. Upon package execution, the tool provides color-coded, real-time monitoring.
The SSIS Import/Export [[Wizard (software)|Wizard]] lets the user create packages that move data from a single data source to a destination with no transformations. The Wizard can quickly move data from a variety of source types to a variety of destination types, including text files and other SQL Server instances.
 
Developers tasked with creating or maintaining SSIS packages use a visual development tool based on [[Microsoft Visual Studio]] called the SQL Server Business Intelligence Development Studio (BIDS). It allows users to edit SSIS packages using a drag-and-drop user interface. A scripting environment for writing programming code is also available in the tool. A package holds a variety of elements that define a workflow. Upon package execution, the tool provides color-coded real-time monitoring. (Note: In more recent versions MS SQL Server, BIDS has been replaced with "SQL Server Data Tools - Business Intelligence" (SSDT-BI).<ref>{{Cite web|url=http://sqlmag.com/sql-server-2014/q-where-business-intelligence-development-studio-bids-sql-server-2014|title = IT Pro}}</ref><ref>{{Cite web|url=http://blogs.msdn.com/b/analysisservices/archive/2013/03/06/sql-server-data-tools-business-intelligence-for-visual-studio-2012-released-online.aspx|title=DevBlogs}}</ref>)
'''Connections'''. A connection includes the information necessary to connect to a particular data source. Tasks can reference the connection by its name, allowing the details of the connection to be changed or configured at runtime.
 
; Connections: A connection includes the information necessary to connect to a particular data source. Tasks can reference the connection by its name, allowing the details of the connection to be changed or configured at run time.
'''Tasks'''. A task is an atomic work unit that performs some action. There are a couple of dozen tasks that ship in the box, ranging from the file system task (which can copy or move files) to the data transformation task. The data taransformation task actually copies data; it implements the ETL features of the product.
; Event handlers: A workflow can be designed for a number of events in the different scopes where they might occur. In this way, tasks may be executed in response to happenings within the package &mdash; such as cleaning up after errors.
; Parameters (SQL Server 2012 Integration Services): Parameters allow you to assign values to properties within packages at the time of package execution. You can have project parameters and package parameters. In general, if you are deploying a package using the package deployment model, you should use configurations instead of parameters.
; Precedence constraints: Tasks are linked by precedence constraints. The precedence constraint preceding a particular task must be met before that task executes. The run time supports executing tasks in parallel, if their precedence constraints so allow. Constraints may otherwise allow different paths of execution depending on the success or failure of other tasks. Together with the tasks, precedence constraints comprise the workflow of the package.
; Tasks: A task is an atomic work unit that performs some action. There are a couple of dozen tasks that ship in the box, ranging from the file system task (that can copy or move files) to the data transformation task. The data transformation task actually copies data; it implements the [[Extract, transform, load|ETL]] features of the product
; Variables: Tasks may reference variables to store results, make decisions, or affect their configuration.
; Streamlined Data Integration: SSIS offers a visual interface and pre-built components to simplify the process of extracting data from various sources, transforming it, and loading it into target destinations. This reduces development time and effort compared to writing custom code. <ref>{{cite web | url = https://techunwrapped.com/ssis-816-definition/ | title = SSIS-816: Definition, History, and ETL Capabilities |work=Techunwrapped | date = 6 September 2024 | publisher = Techunwrapped.com}}</ref>
 
'''Precedence Constraints'''. Tasks are linked by precedence constraints. The precedence constraint preceding a particular task must be met before that task executes. The runtime supports executing tasks in parallel if their precedence constraints so allow. Constraints may otherwise allow different paths of execution depending on the success or failure of other tasks. Together with the tasks, precedence constraints comprise the '''workflow''' of the package.
 
A package may be saved to a file or to a store with a hierarchical namespace within a SQL Server instance. In either case, the package content is persisted in [[XML]].
'''Event Handlers'''. A workflow can be designed for a number of events in the different scopes where they might occur. In this way, tasks may be executed in respose to happenings within the package &mdash;such as cleaning up after errors.
 
Once completed, the designer also allows the user to start the package's execution. Once started, the package may be readily debugged or monitored.
'''Variables'''. Tasks may reference variables to store results or make decisions or affect their configuration.
 
A package may be saved to a file or to a store with a hierarchical namespace within a SQL Server instance. In either case, the package contents is persisted in [[XML]].
 
Once completed, the designer also allows the user to start package's execution. Once started, the package may be readily debugged or monitored.
 
===Features of the data flow task===
[[File:SSIS workflow.png|thumb|An example of a SQL Server Integration Services (SSIS) workflow, by visually programming the steps needed to transform data from source to destination.|alt=Diagram with seven nodes. 1: OLE DB Source (Source Table), flows to 3: Sort Source Table, flows to 5: Merge Join. Another flow goes from 2: OLE DB Source (Dest Table), flows to 4: Sort Dest Table, flows to 5: Merge Join. From 5: Merge Join, the flow goes to 6: Conditional Split, flows to 7: OLE DB Destination.]]
The Data Flow task, arguably the most important task in the product, features a different design surface than that of the workflow. Data flows are edited within the same design tool, of course. The user can draw data sources, transforms, and data destinations connecting them together in order to achieve the transfer and transforms they desire.
SSIS provides the following built-in transformations:<ref>{{cite web | title = Integration Services Transformations | url = http://msdn.microsoft.com/en-us/library/ms141713.aspx|work=Microsoft Developer Network | publisher = Microsoft | access-date = 22 March 2013}}</ref>
 
{{Div col|colwidth=18em}}
Data sources can connect to any number of source types using [[OLEDB]] or ADO. They generally execute a SQL statement (including a stored procedure) to retrieve rows, though there are exceptions. Most notably, a flat file data source allows reading from text files, and an XML adapter can source data from XML files. Similarly, data destinations write data to their target by executing a statement on its connection, or writing to a file.
* Aggregation
* Audit
* Cache transform<ref>{{cite web | url = http://msdn.microsoft.com/en-us/library/bb895264 | title = Cache Transform|work=Microsoft Developer Network | publisher = Microsoft | access-date = 22 March 2013}}</ref>
* Copy/Map
* Data conversion
* Data mining model training
* Data mining query
* Dimension processing
* [[Derived column]]
* Export and import column
* For loop container
* Foreach loop container
* ([[Fuzzy_logic|Fuzzy]]) lookup
* [[Fuzzy_logic|Fuzzy]] grouping
* OLE DB command
* Partition processing
* Percentage sampling<ref>{{cite web | url = http://msdn.microsoft.com/en-us/library/ms139864.aspx | title = Percentage Sampling Transformation|work=Microsoft Developer Network | publisher = Microsoft | access-date = 22 March 2013}}</ref>
* [[Pivot table|Pivot]]
* [[Pivot table|Unpivot]]
* Row count
* Row sampling
* Row sampling transformation
* Script component
* [[Slowly changing dimension]]
* [[Term extraction]]
* Term Lookup
{{div col end}}
 
The conditional split transformation is used to conditionally route rows to other transformation objects based on a particular condition. It is similar to the "<code>[[conditional (programming)|if … else]]</code>" construct in the C language.
Various transforms exist in the product, including a sort, aggregation, and lookup. A derived column transform evaluates an expression to compute the value for a new column. A script transform is available which allows more complicated transform to be written in [[VBScript]].
 
=={{Anchor|A-tools}}Other included tools==
Unlike DTS, SSIS can have more than one transform in its operation. Transforms, data sources, and data destinations can have multiple inputs and outputs. Most have error outputs so that rows which would cause the transform to fail can be optionally redirected for further or alternate proessing.
Aside from the Import/Export Wizard and the designer, the product includes a few other notable tools.
 
DTEXEC executes a package from the command line wherever it may be stored.<ref>{{cite web | url = http://msdn.microsoft.com/en-us/library/ms162810%28v=sql.105%29.aspx | title = dtexec Utility (SSIS Tool) | access-date = 2013-03-24 | publisher = Microsoft|work=Microsoft Developer Network}}</ref><ref>{{cite web | url = http://www.databasejournal.com/features/mssql/article.php/3897236/Executing-SSIS-Packages-Using-DTExec.htm | title = Executing SSIS Packages Using DTExec | access-date = 2013-03-24|date=2010-08-09 | publisher = Database Journal|author=Marcin Policht}}</ref> Before running the package, the tool may be instructed to apply configuration information, which will allow the same package to be reused with slightly different parameters, including different connection strings for its endpoints.
Any number of sources or destinations are supported in a data flow.
 
DTUTIL provides the ability to manage packages from the [[Windows command prompt|command prompt]].<ref>{{cite web | url = http://msdn.microsoft.com/en-us/library/ms162820%28v=sql.105%29.aspx | title = dtutil Utility (SSIS Tool) | access-date = 2013-03-24 | publisher = Microsoft|work=Microsoft Developer Network}}</ref><ref>{{cite web | url = http://www.databasejournal.com/features/mssql/article.php/3734096/Using-dtutil-to-copy-SSIS-packages-stored-in-SQL-Server.htm | title = Using dtutil to copy SSIS packages stored in SQL Server | access-date = 2013-03-24|date=2008-03-20 | publisher = Database Journal|author=Yan Pan}}</ref> The tool can copy or move a package from a file into the server store, or back out again. Among a few other sundry functions, it can be used to delete, rename, encrypt, or decrypt packages.
'''Transformations supported'''
 
The Bulk Copy Program (BCP), is a [[Console application|command-line tool]] used to import or export data against a [[Microsoft SQL Server]],<ref name="FIPADA">{{cite book|last1=Rankins|first1=Ray|last2=Bertucci|first2=Paul|last3=Jennsen|first3=Paul|date=December 2002|title=Microsoft SQL Server 2000 Unleashed|edition=2|publisher=Sams|___location=Indiana|isbn=9780672324673|oclc=474621100|pages=86–87}}</ref> or [[Sybase database]].<ref>{{cite book|url=http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc30191.1550/html/utility/X14951.htm|work=[[Sybase]]|accessdate=14 May 2021|title=Adaptive Server Enterprise 15.5|chapter=Chapter 7: Utility Commands Reference}}</ref>{{better source needed|date=May 2021}}
SSIS provides the following built-in transformations:
* Conditional Split
* Multicast
* Union-All, Merge, and Merge Join
* Sort
* Fuzzy Grouping
* Lookup and Fuzzy Lookup
* Percentage Sampling and Row Sampling
* Copy/Map, Data Conversion, and Derived Column
* Aggregation
* Data Mining Model Training, Data Mining Query, Partition Processing, and Dimension Processing
* Pivot and UnPivot
 
==Extensibility and programmability==
==Other included tools==
Users may write code to define their own connection objects, log providers, transforms, and tasks.<ref>{{cite web | title = Developing a Custom Data Flow Component | url = http://msdn.microsoft.com/en-us/library/ms136078.aspx|work=Microsoft Developer Network | publisher = Microsoft | access-date = 22 March 2013}}</ref><ref>{{cite web | title = Developing a Custom Task | url = http://msdn.microsoft.com/en-us/library/ms135965.aspx|work=Microsoft Developer Network | publisher = Microsoft | access-date = 22 March 2013}}</ref>
 
SSIS features a programmable object model that allows developers to write their own hosts for package execution. Such a host can respond to events, start and stop packages, and so on. The object model also allows developers to create, store, and load packages, as well as create, destroy, and modify any of the contained objects.
Aside from the Import/Export Wizard and the designer, the product includes a few other notable tools.
 
Within limits, SSIS packages can load and call [[Assembly (CLI)|CLI assembly]] [[Dynamic-link library|DLL]]s, providing access to virtually any kind of operation permissible by the .NET [[Common Language Runtime|CLR]].
DTEXEC executes a package from the command line wherever it may be stored. Before running the package, the tool may be instructed to apply configuration information, which will allow the same package to be reused with slightly different parameters, including different connection strings for its endpoints.
 
SSIS can be used on all SQL Server [[Microsoft SQL Server#SQL Server 2005|2005]], [[Microsoft SQL Server#SQL Server 2008|2008]], [[Microsoft SQL Server#SQL Server 2008 R2|2008 R2]], [[Microsoft SQL Server#SQL Server 2012|2012]], [[Microsoft SQL Server#SQL Server 2014|2014]] and [[Microsoft SQL Server#SQL Server 2016|2016]] editions except Express and Workgroup.
DTSUTIL provides the ability to manage packages, again from the [[command prompt]]. The tool can copy or move a package from a file into the server store, or back out again. Among a few other sundry functions, it can be used to delete, rename, encrypt, or decrypt packages.
 
==See also==
==Extensibility and Programabiltiy==
* [[Data Transformation Services]]
Users may write code to define their own connection objects, transforms, and tasks.
 
==References==
SSIS features a programmable object model which allows developers to write their own hosts for package execution. Such a host can respond to events, start and stop packages, and so on. The object model also allows developers to create, store, and load packages, as well as create, destroy, and modify any of the contained objects.
{{Reflist|2}}
 
==External links==
* [http://www.sqlismicrosoft.com/sqlserver/2008/en/us/integration.aspx SQL Server Integration Services unofficialofficial site]
* [http://msdn.microsoft.com/SQL/bi/integration/default.aspx SQL Server Integration Services official site]
 
[[Category:ETLExtract, transform, load tools]]
[[Category:Microsoft database software]]
[[Category:Microsoft server technology]]
[[Category:2005 software]]
[[Image:Example.jpg]]