Sunday, April 5, 2009

SSIS and the Package ProtectionLevel Property

This is another one of those “I could have sworn I blogged on this years ago” posts.[1] I was looking for something I’d already written that discussed the ProtectionLevel property of the SSIS Package, but search as I might I could not find it. So I guess it’s time to write it, eh?

(As a note for the lazy reader, the moral of the story is that you should use the DontSaveSensitive ProtectionLevel for your packages, and use package configurations to apply the sensitive property values to your packages. Feel free to read the next five pages, but it all boils down to this.)

First of all, let’s start off by reviewing the property itself. Take a look at the properties of any SSIS package and you’ll see it there:

ProtectionLevel00

Notice the description there at the bottom of the Properties window? “Specifies the protection mode and the protection method of the package.” Hey – that’s pretty self-explanatory, so maybe we don’t need a blog post after all. Right?

Maybe not.

Before we continue, why not take a look at what SQL Server Books Online (BOL) has to say about this property?

Ok, so you’ve read the docs. Now let’s review:

  • Component developers get to specify which properties of their components are considered sensitive.
  • For the built-in components, this includes the password property of connection managers, some system variables and the XML nodes in the underlying DTSX file that store this information.
  • SSIS users (that means us, the package developers) cannot control what is considered sensitive.

Why does this matter?

Well, SSIS is designed to be secure. One aspect of this is that any “sensitive” information that is part of your package (think about it – many packages include information like passwords in connection strings for connecting to databases and legacy systems where valuable data is stored) needs to be protected. And by protected, we mean never, ever stored in clear text.

That’s right, donkey – never, ever.[3]

But wait, you say – I’ve built many SSIS packages and I’ve never been prompted to encrypt any sensitive anything. How am I supposed to believe you?

Good question. Like most developers[4], the fine members of the SSIS product group wanted to avoid irritating their users.[5] Because of this, they wanted to make protecting sensitive data seamless, even transparent. So they did. Take another look at that screen shot of the Properties window above. Do you notice the EncryptSensitiveWithUserKey option? That’s the default.

This means that, by default, SSIS uses the Data Protection API (DPAPI) to encrypt any sensitive data with the user key of the user who saves the package. Every Windows user has a user key, so this is pretty straightforward. Then, when the user goes to open the package again, the encrypted data is automatically decrypted without the user needing to do anything. Pretty slick, eh?

Yep. Pretty slick. Especially for SSIS developers who live in a world where they never work in teams and never have their packages run in an unattended batch job that uses a service account.

Unfortunately, I don’t live in that world, and neither do many of the SSIS developers I know.

Ok, let’s be fair. That sounded awfully critical, but it wasn’t intended to be. I generally don’t like to criticize unless I can suggest a better alternative, and in this context I don’t see a more reasonable default behavior. It just works, and it lets developers be productive with SSIS without needing to worry about manually configuring security settings.

The criticism that I have is that people just don’t seem to understand this, and that causes problems. If one developer didn’t understand it, then it’s probably his fault. But if dozens or hundreds of people have this same problem (and they do – just take a look at the SSIS forums on MSDN to see how often questions related to the ProtectionLevel property come up) then the communication about the feature hasn’t been what it could be.

So what is the problem, and how does it show up? Well, consider one of these situations:

  • The package developer checks his package into source control[6] and another developer checks it out to make changes. The second developer can open the package, but gets an error saying “Failed to decrypt protected XML node "DTS:Password"” and the package will not execute.
  • The package developer deploys the package to the production server and the DBA schedules it to execute via a SQL Server Agent job. When the job is run, the job step fails with an error saying “Failed to decrypt protected XML node "DTS:Password"” and the package will not execute.

I won’t go on. You probably see the trend already. With the default settings, initial development is easy, but downstream tasks are more difficult. Only the user who saves the package can edit and execute it without error. This is probably not what we want.

What about the other options? Let’s consider:

  • EncryptAllWithUserKey – This option has all of the drawbacks of the default EncryptSensitiveuserKey option, but has the additional drawback of encrypting the whole darned package, so not only can no other user execute the package without re-entering the sensitive data, no other user can even open the package. Probably not what you want.
  • EncryptSensitiveWithPassword – This option replies on the package developer entering a password for the package, and then entering that password every time the package is opened or executed. Um… right. I don’t know about you, but my real-world SSIS projects tend to involve many dozens of packages. And I typically have a dozen or more open in Visual Studio at any given time. So picture me staggering to my desk on a blurry Monday morning, desperately clutching my mug of coffee, and launching Visual Studio. Visual Studio loves me, so it has remembered the packages I had open when I shut down for the weekend, and in a display of digital affection it re-opens them all.[7] Now picture the look on my face as I am prompted to enter the &^%$%@(* package $#^%#& password for every &@%$(^@ package as it opens. (And for that matter, picture the look on  the faces of the poor people from accounting down the hall as they run for the fire escape.) And to make matters worse, this option pretty much negates the security it’s designed to provide, because every developer on the team needs to know the package password, and at least one developer[8] will put that password on his white board or on a Post-It Note.
  • EncryptAllWithPassword – Just when you thought it was safe to go back into the Properties window, here comes an option which combines the worst of all worlds. Not only do you have to enter the &%#*& password every time you open the package, the whole package contents are also encrypted so if you forget the password[9] you’re pretty much SOL[10].

So what does that leave us? That leaves us with two options. The first is the one that I use and advocate: the DontSaveSensitive option.

This option, like the others we’ve seen so far, is pretty much self-describing. It protects the sensitive data by simply not saving it in the package file. What’s not to like? No password, no user keys, everyone can open the file without error. It’s a little slice of perfection in our imperfect world!

Maybe.

Maybe not.

The difficulty, of course, comes in when you actually want to execute the package. The package can’t run if it doesn’t have a valid connection string to connect to that legacy database server from which it extracts data, now can it? So where does the sensitive data get stored, if not in the package? The answer is that we, the package developer, need to store that sensitive data in package configurations. This could be XML config files, it could be a SQL Server database, or the Windows Registry – there are many options. And each one of them can be secured in its own way, through database permissions, NTFS permissions or Registry ACLs.

The problem that some developers have with this is that it doesn’t let them get down to doing what they want to do the most, which is build their packages. And what do we say to those developers?

We say “tough luck!”

We don’t have much sympathy for those developers, because they’re missing the big picture. Yes, it can slow things down up front to have to add configurations to your packages, but this saves a ton of time later on. Configurations are a required feature of packages that will need to get deployed to multiple environments (and unless your packages will only ever be run by you on your machine, then this means your packages) and by adding them right at the beginning of package development you will have a much greater chance of having a successful deployment than if you wait until you start deploying to start thinking about deployment.

So this is what I recommend that SSIS developers do when building their packages:

  • Set the ProtectionLevel property to DontSaveSensitive
  • Add a package configuration for each connection string or other sensitive property in the package.
  • Test everything and make sure that it works.
  • Save the package and use it as a template package moving forward.[11]

Ahhh… life is good. Life is so much simpler when you have prescriptive guidance.

But wait… What about that last option. What about ServerStorage? Should we ever use that one?

ServerStorage can only be used when deploying packages into a SQL Server database[12]. If you deploy to SQL, then you can rely on the database to protect your sensitive data. But as you probably know, this doesn’t apply to package development. Visual Studio can only work with DTSX files on the file system – it cannot open packages from or save packages to SQL Server. This means that while ServerStorage is a viable option for once package development is complete, but not during development. During development you want to use DontSaveSensitive, even if you are planning to deploy to SQL.

So there you have it. I’ve spent five pages[13] saying “use the DontSaveSensitive package ProtectionLevel option when you develop SSIS packages.” But even though it took longer to read, hopefully you’ll have a better idea of why you should choose this option than if I’d just come out and told you…

 

[1] Just as a side note, re-read the subject line. Doesn’t it sound like the name of a children’s story? It was all that I could do to not write this post as an AeSISSop’s Fable or modern cautionary tale[2].

[2] I also started putting it to the tune of the Gilligan’s Island theme before i caught myself. I haven’t been feeling well lately, you see…

[3] Say it in your best Shrek voice and it may make sense. Especially in the fable context.

[4] Read the rest of the sentence and then have a good laugh with me, ok?

[5] Wait, then wipe the tears from your eyes, and then start reading again.

[6] And we all use source control, right?

[7] Awww….

[8] You know who I’m talking about.

[9] And can’t find your way down the hall to Bob’s white board.

[10] That’s “Server Offline” to you.

[11] There will likely be other things that need to be set up in your template package too, but this is a good start.

[12] I personally prefer file system deployments and so do most of the SSIS professionals I know. Check out this post from Jamie Thompson where he discusses this topic, and make sure you also click on the link in the top of his post too.

[13] And if memory serves me, set a new record for the number of footnotes in a single blog post.

9 comments:

Saggi Neumann said...

Hi,

This post's just in time as one of my customers just asked me for SSIS deployment best practices.

As far as I know, when using windows authentication in connection strings, nothing is encrypted within the package and so the package can be easily deployed.

Do you know what else is considered sensitive other than passwords in connection strings? (I read the BOL article and "task generated XML nodes" and "variables marked as sensitive" aren't really descriptive.

The problem with using configurations is that the password is saved in clear text (or transparent encryption if you insist). Authorization (via permissions) and encryption are two complementing security facades...

If I understand correctly ServerStorage again only relies on authentication, so I guess you can't have both encryption (of some sort) and easy deployment...

Matthew Roche said...

Hi Saggi,

I have never seen a list of what properties are marked as sensitive. If you look at the SSIS API for building components you'll see that the component developer can flag any property they want as "sensitive" but from my experience this information is not included in the documentation for existing components. I'm sorry I can't help you there.

If you really need encryption, the one option I've seen is this: http://curionorg.blogspot.com/2007/05/encrypted-sql-server-ssis.html

I've never tried it personally, and don't know anyone who has, so proceed with caution...

M

Jeremy said...

Hey Matt,

I'm using the do not encrypt option as you recommended, having run into this clusterF myself in the past. I'm also using Windows Authentication running the dstx via a Windows 2003 Scheduled Task and a dedicated domain user account. In this case I really don't need a package configuration, right, because I'm not embedding passwords anywhere and I don't want to save any sensitive information?

Second question - even though I'm using the do not encrypt option at the package level, when experimenting with Package Configurations (which I'm hoping not to have to use in this case), I noticed that my Connection Manager for a OLE DB SQLNCLI.1 data destination has Property Attributes that specify ProtectionLevel, Type = Object, Level = 1. I thought level 0 was do not encrypt, and I thought that the Connection Managers would automatically inherit their ProtectionLevel settings from the package setting. I'm concerned that my Connection Manager may be trying to encrypt some information (like that pesky connection password that seems to be required in connection managers even though I set it to blank).

I say all this because my package runs successfully every 15 mins on the W2003 / SQL 2005 box against a target DB on a similar SQL box in a separate forest. Nearly every day now, the package fails with DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER, but then runs again successfully at the next 15 minute interval. Arrg. Intermittent issues are the worst.

Thanks for any help you can provide with these questions.

Jeremy said...

OH and that ProtectionLevel Property Attribute on the Connection Manager is greyed out. Remember I'm looking at it through the Package Configuration Wizard.

AND, I see nowhere to view or edit that ProtectionLevel Attribute when editing or viewing the properties of the Connection Manager itself when editing the package in BIDS (outside of the Package Configuration Wizard). Perhaps I'll look at the underlying XML. double arrg.

Matthew Roche said...

@Jeremy - ProtectionLevel is a property of the Package, not the connection manager. I'm honestly not sure what you're trying to do with the connection manager. I would strongly recommend NOT mucking about with it. Also, remember that the XML behind the DTSX file is explicitly unsupported. There's no ovious reason why you would want or need to go there, and if you do make changes to the XML, you can damage the package in ways that cannot be repaired.

Jeremy said...

OK I'll leave those alone, thanks for the info.

But I still need to know if I should be using a Package Configuration, or not. Clearly the package works without one, and since I'm using Windows Authentication I don't have any sensitive information to encrypt. I'm just wondering if using a Package Configuration is recommended anyways to eliminate connection issues such as what I've described.

Matthew Roche said...

@Jeremy - The primary point of package configurations is location independence. With configurations you can run your package in different environments (referencing different servers and databases and such) without needing to change your packages themselves. You just update the configurations and the packages remain untouched.

If you will be (for example) executing your package against a development server while you're building it, against a test server while you're testing it, and against a production server after you deploy it, then yes - a configuration will come in awfully handy even if you are using Windows integrated authentication.

Englestone said...

This is useful.

Cheers,

-- Lee

Adam said...

Hi Matthew

Great article. Informative and Entertaining. I've been using the same approach for a while. It has always puzzled me though, how is it more secure keeping a password in clear text in an XML config file than in clear text in a dtsx package?

In our scenario, where we are generating SSIS packages, it would be more desirable to have a 'keep but don't encrypt sensitive' option. We generate the package for each target environment.

Thanks - Adam Gilmore - Dimodelo Solutions - dimodelo.com