Microsoft recently released three interesting white papers on SSIS. If you are an SSIS enthusiast and work with hybrid data, these three white papers are essential reference material.
I am listing them here together for quick reference. The abstracts are built from the content of the white papers.
When transferring data between a database and the cloud, the data is obviously in transit. This involves multiple phases, including pre-production testing, data loading, and data synchronization. Sound complex? SQL Server Integration Services (SSIS) is a tool for moving data in and out of Windows Azure SQL Database, either as part of an extract/transform/load (ETL) solution or as plain data movement when no transformations are needed. It is effective whether the data is all in the cloud, all in your on-site database, or a mix of the two.
There are already many processes involved in storing and moving data through the cloud. Not only are there many “moving pieces” involved in the transfer, but you also need to adapt your performance tuning knowledge to a system that is no longer completely self-contained and is instead a common resource shared by a larger pool of users. It is important to understand the best practices for cloud sources and hybrid data moves.
SQL Server Integration Services (SSIS) can be used effectively as a tool for moving data to and from Windows Azure SQL Database, as part of the total extract, transform, and load (ETL) solution and as part of the data movement solution. The Windows Azure (WA) platform poses several challenges for SSIS, but offers several solutions as well. Projects that move data between cloud and on-site storage involve many processes, whichever solution is chosen. SSIS can be used to move data between sources and destinations in the cloud, as well as in hybrid situations that combine cloud and on-site data. Because operating “in the cloud” can be very different from on-site database performance tuning, it can take new training to fully understand how best to use SSIS.
Remember when one gigabyte of data was an unheard-of amount? Now systems deal with terabytes and petabytes, but not quickly. Querying this much data can take hours, especially when it is stored unstructured in Hadoop. However, the same data can be stored in structured form in SQL Server and queried in seconds. Thus, there is a need for data transfer between Hadoop and SQL Server. SSIS, an ETL tool, can be used to automate Hadoop and non-Hadoop jobs and manage data transfers.
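To make the idea of moving unstructured data into a structured, queryable store a little more concrete, here is a minimal sketch in Python. It is not SSIS and it does not talk to Hadoop or SQL Server: the raw lines, the `sales` table, and the helper names `parse_line`, `load`, and `total_by_store` are all hypothetical, and an in-memory SQLite database stands in for SQL Server purely for illustration.

```python
import sqlite3

# Hypothetical raw records, standing in for unstructured output stored in Hadoop.
RAW_LINES = [
    "2013-01-05 store=12 amount=19.99",
    "2013-01-05 store=07 amount=5.00",
    "2013-01-06 store=12 amount=42.50",
]


def parse_line(line):
    """Transform one raw line into a structured (date, store, amount) row."""
    date, store_field, amount_field = line.split()
    store = int(store_field.split("=", 1)[1])
    amount = float(amount_field.split("=", 1)[1])
    return date, store, amount


def load(conn, lines):
    """Load the parsed rows into a structured table in a single batch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(sale_date TEXT, store INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [parse_line(line) for line in lines],
    )
    conn.commit()


def total_by_store(conn):
    """Query the structured copy: total sales amount per store."""
    cursor = conn.execute(
        "SELECT store, SUM(amount) FROM sales GROUP BY store ORDER BY store"
    )
    return cursor.fetchall()


# SQLite in memory is only a stand-in for the SQL Server destination.
conn = sqlite3.connect(":memory:")
load(conn, RAW_LINES)
totals = total_by_store(conn)
```

Once the rows are in a table, the aggregate is a single SQL query. An SSIS package would play the role of `parse_line` and `load` here, with the scheduling and error handling that a hand-written script lacks.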
Microsoft has coordinated with Hadoop to allow Hadoop to run on Windows Server and Azure, integrating with the rest of the Microsoft platform. This allows users to move data into or out of any Windows program, like Excel or Word. SSIS is another Windows tool that allows easy communication, in this example between Hadoop and SQL Server. This integration is sure to become a useful tool in any database administrator’s tool belt, and ought to be learned as early as possible.
Reference: Pinal Dave (http://blog.SQLAuthority.com)