Wednesday, October 7, 2009

Transformations are NOT Tasks

This is another one of those “I thought I blogged this years ago” posts. I was having a conversation with a fellow MCT a few days ago, and he described SSIS data flow transformations as “tasks on the data flow tab.” I didn’t say anything then, but this phrase has been sticking in my mind ever since – not unlike a nasty splinter – and I think I need to get it written down before my brain lets me move on.

Why is this seemingly innocuous phrase causing me so much mental distress?

Because transformations are not tasks!

Let’s list a few characteristics of tasks in SSIS. Tasks:

  • Are used in the SSIS Control Flow.
  • Are Containers, with all of the properties (such as the ability to control transaction scope, or to be enabled or disabled) of containers.
  • Support property expressions.
  • Support breakpoints and debugging.
  • Can be individually configured for logging.
  • Are presented as boxes connected by arrows on the BIDS design surface.

The only characteristic that data flow transformations share with tasks is the last one: they’re presented as boxes, connected by arrows.

But the similarities end there. Transformations are metadata-bound functions that operate on data stored in the memory buffers managed by the data flow engine. They are subcomponents of the Data Flow task, which is just another task. It’s an important task, and it gets its own designer because of this, but it is just another task nonetheless.
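
To make the distinction concrete, here is a toy model in Python. It is a sketch only and not the SSIS object model: the names Task, DataFlowTask, and upper_case_name are invented for illustration, and a plain list of dictionaries stands in for the memory buffers that the data flow engine manages. The point is the shape. The task is the thing that can be disabled or configured for logging, while the transformation is just a function that the Data Flow task applies to rows in a buffer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]
Buffer = List[Row]  # hypothetical stand-in for a data flow memory buffer


@dataclass
class Task:
    """Toy control flow task: a container with its own properties."""
    name: str
    disabled: bool = False         # container-level property
    logging_enabled: bool = False  # configured per task

    def execute(self) -> str:
        # A disabled task is skipped as a whole unit.
        if self.disabled:
            return "skipped"
        self.run()
        return "succeeded"

    def run(self) -> None:
        pass


# In this sketch a transformation is only a function from buffer to buffer.
# It has no disabled flag and no logging setting of its own.
Transformation = Callable[[Buffer], Buffer]


def upper_case_name(buffer: Buffer) -> Buffer:
    """Hypothetical transformation bound to a 'Name' column."""
    return [{**row, "Name": str(row["Name"]).upper()} for row in buffer]


@dataclass
class DataFlowTask(Task):
    """Just another task; transformations are its subcomponents."""
    source: Buffer = field(default_factory=list)
    transformations: List[Transformation] = field(default_factory=list)
    output: Buffer = field(default_factory=list)

    def run(self) -> None:
        buffer = self.source
        for transform in self.transformations:
            buffer = transform(buffer)
        self.output = buffer


flow = DataFlowTask(
    name="Load customers",
    source=[{"Name": "ada"}],
    transformations=[upper_case_name],
)
status = flow.execute()
```

In this model you can disable flow, but there is nothing to disable on upper_case_name. That asymmetry is the whole argument in miniature.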

Why is this important to me? Some (ok, most) may argue that I’m just being pedantic. And perhaps they’re right, to some degree. But any technical field has its own terminology that allows its practitioners to communicate efficiently and precisely. Hearing an SSIS developer use “task” for “transformation” is kind of like hearing a surgeon use “abdomen” for “stomach” or a .NET developer use “component” for “assembly.” The terms are related, but their meanings are very different. In each case it’s likely that the speaker knows the difference and is simply not speaking carefully, but sometimes (particularly with the surgeon ;-) you just want to be sure.

So the next time someone tells you that he’s using “the Lookup Task” please send him here.

P.S. I should also be sure to mention that the MCT who used this offending phrase probably does not need this rant to let him know what’s going on. MCTs tend to have a solid grasp of the technologies they teach; you tend to crash and burn in front of the classroom if this is not the case. But I knew that I was going to have this stuck in my mind forever if I didn’t say something somewhere…

2 comments:

VASANT JAGTAP said...

Hi,
In DTS (SQL Server 2000), when I want to insert only 3 rows from an Excel file into SQL Server, I click on ‘Transform Data Task’, click on ‘Option’, then set First row to 1 and Last row to 3 under Data movement. Now I use SSIS (SQL Server 2008), but I am unable to find this type of setting in the Data Flow Task. Please help me: how can I do this?


I await your reply. Can anybody help me as soon as possible?


Vasant Jagtap
(vasant_jagtap1@yahoo.co.in)

Matthew Roche said...

@Sai - Your best bet is to post this question on the SSIS forums at MSDN: http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/threads

Good luck!