Wednesday, June 25, 2008

SSIS and SQL Server Agent - Choosing the Right Job Step Type

My "SSIS and SQL Server Agent" post is the most frequently accessed post on my blog, or at least it has been for the two months that I've actually been tracking site usage. But some recent posts on the SSIS Forums on MSDN have made me realize that it was probably long overdue for a follow-up focused choosing the correct job step type.

Microsoft has provided two different SQL Server Agent job step  types that you can use to execute SQL Server Integration Services Packages:

  1. The "SQL Server Integration Services Package" job step type
  2. The "Operating system (CmdExec)" job step type.

Now which one leaps out at you as being ideal for running SQL Server Integration Services packages?

Unfortunately, despite the name and the oh-so-friendly GUI through which you can specify all sorts of options, the SQL Server Integration Services Package job step type is not the one you want to use. There are quite a few reasons I believe this (and it's worth pointing out that while there are people who agree with me, there are some very talented SSIS people out there who do not) including:

  • Consistency: You can create a batch file that calls DTEXEC, test that execution outside of SQL Server Agent, and then copy and paste the DTEXEC command line into the SQL Server Agent job. This minimizes the number of environmental variables[1] that change between testing package execution outside of SQL Server Agent and testing package execution within a SQL Server Agent job.
  • Discoverability: I realize that this won't apply to everyone, but I personally find it much simpler to look at command line switches and options than to look at nine different tabs in a tabbed window. Looking at the command line I know that I'm seeing everything. Looking through the tabs, I'm pretty sure I'm going to miss something.
  • Troubleshooting and Error Output: This is the big one. When you run a package through the SQL Server Integration Services Package job step type, you're limited to the error logging that is already configured within the package itself, and this isn't always sufficient to diagnose and resolve problems with the job. But when you execute a package through DTEXEX and the the CmdExec job step type, you have more options that make tracking down and resolving problems much simpler.

So how do you do it? And do you really want to, if you're not one of those weird people who would rather have a command line than a GUI?

Well, the answer to the second question here is yes. Yes you do. Because even though you're using the CmdExec job step type, you still have a GUI to use. And here's how.

  1. Double-click on the package you want to run.[2] This will open up DTEXECUI, the Execute Package Utility.
  2. In DTEXECUI, specify all of the options that you need by using its nice friendly GUI.
  3. On the Command Line tab in DTEXECUI, highlight and copy the Command line text - these are the switches and options that need to be specified for DTEXEC to duplicate the options you selected in the GUI:

    Use DTEXECUI to build your command line
  4. In you SQL Server Agent job, add a new job step, and specify the job step type as CmdExec. In the Command text box, type DTEXEC and a space, and then paste the command line text that you copied from DTEXECUI earlier[3]:

    Paste the command line text into your job step
  5. On the advanced tab of the New Job Step dialog box, enter a file path and name in the Output File text box. (This is the big thing.) With this specified, the output from DTEXEC, which is exactly what you want when troubleshooting problems, will be written to a text file. You can also specify whether the file should be overwritten or appended to with each execution.

    Specify an output file for the job step

Now, when you execute your SQL Server Agent job, all DTEXEC output will be written to this text file. And if something goes wrong, you now have a single, easy to use location for tracking down and resolving the problems.

One of the most common things that the MVPs and other people who help out on the SSIS forums request is "the complete set of error messages and warnings from the package execution."[4] The reason we ask for this is that it's pretty much impossible to remotely troubleshoot someone else's problems without complete and explicit output. And the greatest thing about using the CmdExec job step type is that all of this information is now in one place. Copy and paste the errors and warnings, and help is (hopefully) on your way.

 

[1] Not Windows environment variables - I'm talking here about "things that are important to your deployment environment."

[2] If your package is stored in SQL Server and not on the file system, you can still use this basic technique, but you will have to browse to and select the package from within DTEXECUI, as opposed to double-clicking on the package on the file system.

[3] You can also specify the fully qualified path to DTEXEC.EXE - this is helpful if DTEXEC is not on the path, or if you're on a 64-bit server and want to explicitly select the 32- or 64-bit version of DTEXEC for this package.

[4] And we're not the only ones who ask for this. Your in-house support folks will probably ask for the same thing if you have a support team who works with SSIS.

7 comments:

Matt Masson said...

One thing to note about using the CmdExec approach is that your jobs will break when upgrading from 2005 to 2008, as the path to DTEXEC will change (<sql>\90\dts\binn to <sql>\100\dts\binn).

Matthew Roche said...

Matt - this is certainly correct if you are specifying the fully qualified path to DTEXEC.EXE. Thanks for the heads-up.

Rob said...

Hi Matthew,

Thanks for blogging, I have been enjoying your posts.

I want to point out that the SSIS step type doesn't have many of the limitations you are claiming.

The SSIS job step type has a "Command line" tab that will show you the command line options that you setup in the GUI. Alternatively you can forgo the GUI and manually enter the command line options.

As for saving the output to a file, the SSIS step type has the same output file option on the advanced page as the Operating system step type.

You can argue the consistency point if you need to run a mixture of 32bit and 64bit packages since on a 64bit server the SSIS step type will only use the 64bit runtime. Using the Operating System step type will let you use the same step type for both 32bit and 64bit SSIS packages.

As for using batch files, there is one major disadvantage for me. You have to connect to the file system on the server to see the command line options. If you use the SSIS step type, or include the actual command in an Operating System step you can see (and edit) the command line through management studio.

Matthew Roche said...

Hey Rob,

Thanks for the correction on the properties of the IS job step type. It has been quite a while (probably a few years at this point) since I used this one, so either they've added some functionality in SQL 2005 SP2 or (more likely) my memory is not serving me correctly.

I still believe that the CmdExec job step type gives you the best possible consistency when moving between environments (because DTEXEC has the same properties and works the same way in any environment) to eliminate variables when troubleshooting, but I do acknowledge that I am biased against GUIs in general. Use what works for you, and thanks for the feedback!

Stump said...

Matthew,

This is an old post...hopefully someone still checks it...

I have a situation where I'm using indirect configurations to point to an XML file for my config. I've purposely set one of my connections in my config file to not be correct to prove out what's happening:

If I run my package using the SSIS type, it fails because it can't connect to the db that I've messed the connection up. This is working as it should.

However, if I run the same package using the CmdExec type, the package runs successfully. As best I can tell, this must mean that it's completely ignoring my config file and pulling it's connection from the local connection object in my dtsx. This is of course, not desired...

Any thoughts on why running the package with the CmdExec is completely ignoring my Config file and pulling from local objects?

Matthew Roche said...

@Stump - Sorry for the delay in responding. I've been traveling a lot lately and am not checking comments as often as I would like.

The behavior you're describing is not at all what I would expect. I use indirect XML configurations all the time with the CmdExec job step type and the packages always use the configurations properly when I have the job defined correctly.

Based on what I can see (which is not a lot, to be honest ;-) about your scenario, I would look into the different SQL Server Agent subsystems and the Proxies that are configured for each. If you have different Credentials (accessed via the Proxies) configured for SSIS and CmdExec, the Windows accounts referenced by those Credentials may not have the same permissions, and may be seeing different values for the environment variables used by your indirect configurations. (How is that for a dense sentence?) if this is the case, and if the package cannot load the XML config file when executed by the CmdExec Proxy, then SSIS will log a Warning (note: it does not log an Error, which is a long-standing complaint of mine) and then use the value that is present in the SSIS package itself.

This sounds a bit complicated, but you should be able to check your package's log output to see if the warning exists.

Good luck!

Gavos said...

Your a champion! I have spent hours trying to figure out database user permissions for this, config files for that, all so so complicated just to run a job, and then at 1:45am in the morning, there you have the simple solution, thanks so so much, you have saved my life! Tip for those with 64 bit issues, just copy DTExec.exe in the x86 folder path to DTExec32.exe, then in the cmd line you can specify the 32 bit executable without digging all the way into the directory structutre.