Frequently an organization will find that the same resources are used in multiple ETL applications, for example, the same database, general purpose processing logic, or file system locations. Creating an easy way to reuse these resources across multiple applications would increase efficiency and reduce errors. Moreover, not every ETL developer has the same skill set, and it is likely that one developer will be more adept at writing code while another is more comfortable configuring database connections. Real productivity gains will come when these developers are able to work independently while still making their work available to others assigned to the same project. These are the benefits of a centralized version control system.
Of course, most version control systems could be used to store and serve files, but the real need is to store and serve entire ETL applications so that each developer’s ongoing work can immediately benefit from another developer’s completed work. In other words, the version control system needs to be tightly integrated with the tools used to develop the ETL application.
The following screen shot shows such a tool.
Desktop ETL tool that tightly integrates with a central version control system
Developers can checkout or commit entire projects or just a single artifact. Each artifact may be managed independently so if you need to go back to an earlier version of one artifact, changes you may have made to other artifacts are not lost. By being tightly integrated into the graphical environment used to create and edit the project artifacts, it is extremely easy and straight-forward to move your files to and from the version control system and there is no dependency on another vendor’s version control system. The built in version control system is optimized for managing the artifacts of ETL applications.
It is equally important that the version control system supports all of the actions one typically performs such as rollbacks, locking and unlocking of files, and the ability to resolve conflicts. Note that this particular ETL tool also has the capability to switch back and forth between multiple version control systems.
It also needs to be easy to determine the status of an artifact. Not just that it has been committed or modified, but when and by whom. Generally you must query the version control system for this information, but having it displayed within the development environment is more desirable.
Who’s ETL tool works in this fashion? Last month I mentioned the data integration solution offered by expressor software. The version control features I described in this post are all available in their just released expressor 3.1 Standard Edition through the integration of their expressor Studio development environment with a centralized metadata repository and version control system.
You can download their Studio application, which is free, or evaluate the full Standard Edition on your own hardware. It may be worth your time.
Reference: Pinal Dave (http://blog.SQLAuthority.com)