Saturday, December 12, 2009

An Open Letter to Dell

Dear Dell,

    I’ve been a Dell customer for well over ten years. I’ve owned and used close to 25 different Dell laptops and desktops, and for years I ran my consulting business on Dell PowerEdge servers. Today I have nearly 10 dell computers in my home office alone, and support an additional five or six that I have given as gifts to family members over the years. Your computers have generally been reasonably priced, and your support has been acceptable as well. Yes, I’ve had moments of dissatisfaction in the past, with issues related to order tracking and technical support, but all in all I have been satisfied in our long-running relationship.

    I’ve also used computers by any number of other vendors. Currently I own and use computers from Toshiba, Acer, Apple and Lenovo, and have owned computers from HP, Gateway and Compaq as well. But the majority of my computers over the years have been Dells, and when I go to purchase a new computer, Dell is the first place that I look.

    But all this has changed.

    On “black Friday” November 26 I ordered a computer (an Inspiron 546 Minitower) as a Christmas gift for my mother. Her current computer (another Dell desktop, I might add) is a few years old and is reaching the end of its useful live. Your “black Friday” ad was explicitly targeted at Christmas shoppers, and the confirmation email I received listed a delivery date over a week before Christmas. I had no need to worry.

    Then, yesterday, I received an email letting me know my “recent Dell Order has been delayed” and saying that my “Revised Est. Delivery Date” is now two weeks after Christmas. There was no explanation, no information, simply a banal “We apologize for the delay and appreciate your patience.”

    I posted a Facebook update expressing my displeasure: “is very upset with Dell's "Revised Estimated Delivery Date" of 4 January 2010 for an order placed in November (as part of a Dell holiday promotion) for a Christmas present.” Within half a day I had received nearly 20 comments (many more than I typically receive) including these highlights:

  • Hey...same thing happened to me. What is going on over there? Do they need to employ more elves?
  • Dell is foul....My employer has a deal with them: my new 3-week old laptop has failed twice, and Dell has no obligation to replace a lemon...just fix parts. Better still: this latest piece of crap was meant to replace my own personal laptop which after two years is also dead: another Dell.
  • Mac!
  • MY ORDER HAS BEEN CANCELLED!!!! I may have to reach through the phone and strangle someone!
  • OH..this is nice. My order has been changed to ship right after Christmas (not in Jan...poor Matthew!). But the monitor just arrrived at my house. It's a refurb AND it's the wrong model. I think that's strike 3. What on earth is going on over there?
    Matthew, you may have more adventures to look forward to!
  • I'm still on the phone being F$&@ed around and no closer to getting this resolved!
  • After many personal and professional experiences similar to this, Matthew, I've decided never to buy a Dell again. Cancel it, buy a Toshiba for less, it'll last just as long, and I bet it'll get there faster than Dell's original ship date.
  • Seems like the same here in Holland

    Dell, you have seriously messed up here, and have lost my business. I know that there are probably “logical explanations” for your failure and your disappointing so many customers at this time of the year. But you are a mature manufacturing and distributing company and if you received an unexpected order volume in response to your promotions, then your inability to deliver on your promises is your problem, and it is your responsibility to make it right to your customers. (As a side note, I continue to receive email promotions from Dell day after day – I wonder how you plan to fulfill these new orders when you are unable to fulfill your existing orders.) The Christmas holiday is a once-a-year opportunity for giving gifts to lived ones. You know this, and you exploit it in your promotions. But now, in my limited circle of friends, there are many people who will be receiving nothing on Christmas morning other than an “IOU” card and an excuse. I’m sure my mother will understand. But I’m equally sure that a friend’s 10 year old daughter will not.

    Dell, you have lost our trust. How do you plan on re-earning it? How do you plan to make right the wrongs that your failure here have brought to our family holiday?

    While I’m waiting for your response I’ll be out at the shops, looking for a computer. But definitely not a Dell.

Sincerely,

Matthew Roche

Wednesday, October 7, 2009

Transformations are NOT Tasks

This is another one of those “I thought I blogged this years ago” posts. I was having a conversation with a fellow MCT a few days ago, and he described SSIS data flow transformations as “tasks on the data flow tab.” I didn’t say anything then, but this phrase has been sticking in my mind ever since – not unlike a nasty splinter – and I think I need to get it written down before my brain lets me move on.

Why is this seemingly innocuous phrase causing me so much mental distress?

Because transformations are not tasks!

Let’s list a few characteristics of tasks in SSIS. Tasks:

  • Are used in the SSIS Control Flow.
  • Are Containers, with all of the properties (such as the ability to control transaction scope, or to be enabled or disabled) of containers.
  • Support property expressions.
  • Support breakpoints and debugging.
  • Can be individually configured for logging.
  • Are presented as boxes connected by arrows on the BIDS design surface.

The only characteristic that data flow transformations share with tasks is the last one: they’re presented as boxes, connected by arrows.

But the similarities end there. Transformations are metadata-bound functions that operate on data stored in the memory buffers managed by the data flow engine. They are subcomponents of the Data Flow task, which is just another task. It’s an important task, and it gets its own designer because of this, but it is just another task nonetheless.

Why is this important to me? Some (ok, most) may argue that I’m just being pedantic. And perhaps they’re right, to some degree. But any technical field has its own terminology that allows its practitioners to communicate efficiently and precisely. Hearing an SSIS developer use “task” for “transformation” is kind of like hearing a surgeon use “abdomen” for “stomach” or a .NET developer use “component” for “assembly.” The terms are related, but their meanings are very different. In each case it’s probably likely that the speaker knows the difference and is simply not speaking carefully, but sometimes (particularly with the surgeon ;-) you just want to be sure.

So the next time someone tells you that he’s using “the Lookup Task” please send him here.

P.S. I should also be sure to mention that the MCT who used this offending phrase probably does not need this rant to let him know what’s going on. MCTs tend to have a solid grasp of the technologies they teach; you tend to crash in burn in front of the classroom if this is not the case. But I knew that I was going to have this stuck in my mind forever if I didn’t say something somewhere…

Tuesday, September 29, 2009

SQL Server MVP Deep Dives

I’m not a SQL Server Most Valuable Professional anymore[1], but I was when this amazing project got underway, and I am proud to contribute as an author on SQL Server Integration Services. SQL Server MVP Deep Dives is a unique book. It has:

  • 53 authors
  • 59 chapters
  • Coverage of database design, development, administration and performance tuning
  • Coverage of the SQL Server BI suite: SSIS, SSAS and SSRS

But that’s not the coolest thing about this book. The coolest thing is where the money goes. All author royalties (that’s right – 100%) go to War Child International, a network of independent organizations, working across the world to help children affected by war.[2] So not only will this book help you become a better SQL Server practitioner, it will also help children around the world.

You can pre-order the book online today at the SQL Server MVP Deep Dives web site, and get immediate access to a pre-release PDF copy of the book. (Note: if you order before the end of September and use the code pop0928 at the checkout you will also save 50%!) Also, if you’re attending the PASS Summit in Seattle in early November, you can buy a copy at the conference bookstore and it signed by many of the authors. I’ll be there, and based on the email I’ve seen we’ll see a majority of the authors and editors should be there as well.

If you only buy one SQL Server book this year, why not make it this book? And since all proceeds go to benefit a great cause, why not buy two?

I’ll see you at PASS!

[1] You can’t be an MVP and a Microsoft employee at the same time, so when I joined Microsoft they took away my MVP status. Which was OK.

[2] Not, as some may think, a non-profit organization dedicated to helping aging Jethro Tull fans.

Monday, September 21, 2009

SSIS Package Templates: The Good, The Bad and The Ugly.

This is one of those “I could have sworn I blogged on this topic years ago” posts. I speak to this topic whenever I’m presenting on SSIS development best practices, and with my half-day mini-seminar in Stockholm last week[1] I was thinking of it – and speaking about it - again. Although SSIS is a great platform and provides pretty much all the tools you need to get the job done, sometimes it’s not always easy to know what tools to use and how to use them.

Which brings us, of course, to package templates. So before we continue, please read this Knowledge Base article on how to create a package template in BIDS.

Done? Good – let’s review, and paraphrase while we do it.

Package templates are pretty much the entire reuse story in SSIS.[2] You define the baseline package, including connection managers, log providers, configurations, variables and expressions, tasks, containers and precedence constraints and sometimes data flow components as well. When you create a new package you create a new copy of the template with this pre-defined starting point instead of starting with a blank package, so you can be sure that the package is just like the baseline you created.

In keeping with the theme established in the post title, this is the “good” side of SSIS package templates. They’re valuable, and when you’re building a large number of packages for the ETL subsystem of a DW/BI project, they’re invaluable, in that the common “plumbing” required by all packages can be defined once and used consistently everywhere. And this is particularly vital when working in a team environment, where different developers (each of whom inherently has his own style and own way of building packages) are working on the same project, which is typically the case on ETL projects of anything but the most trivial in scale.

Which brings us to the “bad” side of package templates, at least as implemented when following the guidance in the KB article referenced above. This guidance is firmly targeted at the lone developer; did you notice how all the instructions about creating a package and saving it in the Visual Studio install folder never mentioned on what computer you should be performing these steps? That’s because the article assumes the context is “your machine” – the lone developer’s machine or - at scale – on every developer’s machine.

Why is this bad, you ask? Surely it can’t be so much work to copy a DTSX file to the same folder on a few computers, can it?

Of course not. The “bad” side of following this approach comes not when creating the templates, but when the real world comes into play and the project continues to grow and evolve. In a perfect world, when the needs of the project are understood 100% before development begins, and the needs of the project never change after development begins, this approach will work just fine for team development. Your mileage may vary, but I’ve yet to work an SSIS project[3] like this.

In the real world, most successful BI projects are iterative and agile, just like most other successful development projects. If you look at SQL Server MVP John Welch or MVP and VSTS Ranger Ted Malone, their blogs focus strongly on agile BI. If you read Kimball, you will see an overarching theme of agile and iterative development. And if you’ve ever seen former SSIS program manager Michael Entin present on SSIS performance tuning[4] you’ve noticed this best practice under the heading of “Principles of Good Package Design”: “Follow Microsoft Development Guidelines - Iterative design, development & testing.”

Ok, so why did I just go off on that tangent and drop so many names? Regular readers will quickly note that my writing style practically revolves around tangents and will likely seize on this as the reason, but today it is to drive home a point: The real world is neither pretty nor neat, and people who know more about BI than I will ever forget[5] have gone out of their way to point this out.

Why is this bad, you ask yet again? Surely it can’t be so much work to copy a DTSX file to the same folder on a few computers, can it? And can you please just get to the point, Matthew?

Maybe. Picture this story:

A small team of five or six SSIS developers are working together on a project. The lead developer has created a package template for all developers on the team to use moving forward. He emails the DTSX file to each team member along with a reference to KB article 908018 for instructions on how to install and use the template.

A week later the lead developer identifies a problem with the package template.[6] He updates the template package and emails it out to his team with instructions to update their local machines with the new package right away.

Unfortunately, one of the team members is a “heads-down” kind of developer who turns off email during the day, and finishes a few more packages before he gets the message, even though he does save the new template package once he checks his email.

And another developer is on vacation when the email is sent, and by the time he gets back he has too many new messages in his inbox to ever get through, so he decides that if anything is really important then the sender will re-send it, and he just deletes anything that is unread to get a fresh start with no email backlog.[7] So he keeps on developing packages using the outdated template until the next update is sent out a few weeks later…

You see where this is going, right? This is gong to the “ugly” side of SSIS package templates. The ugly part comes in downstream when integration testing begins. The packages developed by the individual team members need to be deployed into a test environment and tested together, and of course at that point things start to fall apart. Everyone works extra hours and deadlines slip by while people try to track down problems they knew had been solved long ago. Although to be honest, the worst case scenario is that the test cases don’t find the problem at this point, and the ETL solution is deployed into production without the errors being resolved.

Unfortunately, I have seen this story play out in one way or another on several different projects over the years. The details are always different, but the ending is always the same: extra work, unhappy clients, and pain.

Now picture this story retold, with a few different details:

The lead developer works with his team to establish Team Foundation Server (TFS) source code control[8] before any development begins. He develops a set of packages that serve as “template examples” and checks them into TFS.

He also establishes standards with his team, so that each team member knows to Get Latest whenever starting work on a new package. This ensures that – no matter what happens – each developer always has the most recent versions of all packages with no need for “special case” tasks like updating a template.

The lead developer also instructs each development team member to create any new packages by right-clicking in Solution Explorer and selecting Add Existing Package, and then browsing to a current file from TFS.[9] This ensures that whenever a new package is created, it is being created as a new instance of a “rolling template” that is stored in TFS and as such the same version is always available for use by all developers.

Then, whenever problems are identified in existing packages, the “problem packages” are updated immediately, and the fixes are applied to any existing packages created from a now-outdated template.[10]

And whenever anyone goes on vacation, problems are found earlier, their impact is lessened, deadlines are met, and everyone lives happily ever after.

Ok, so the results are probably not so extreme, but in my experience on multiple projects this “rolling template” approach is the best practice way to approach SSIS package templates. Any attempt to utilize the Microsoft guidance in a team environment runs into problem after problem, and at the worst extreme I have seen a development team manually re-implement dozens of SSIS packages using a new template after churning for a full week while trying to reconcile and resolve the problems disseminated by using the “official” template approach.

And to be fair, the definition and implementation of policies and procedures can help make following the “official” approach less bad, but the starting point is inherently flawed for team development. If anyone were to suggest to me that following KB 908018 was the right way to approach package templates, I would have to politely[11] disagree.

What do you think? Even though I have been around the block a few times with SSIS, I don’t claim to know it all. I’d love to hear your thoughts on how you use package templates – especially if I took your name in vain earlier on - including what works and what does not work in your team environment.

 

[1] I never seem to find the time to do tourist stuff when traveling for work, but for some reason I just could not say no when I was invited to speak on SSIS best practices when in Sweden for a friend’s wedding. I have the world’s most understanding wife.

[2] I say “pretty much” because using small, modular packages and executing them via the Execute Package task or some other client, or using the SSIS .NET API to dynamically construct and execute packages are the other parts. In SQL Server 2008 and SQL Server 2008 there is not a lot to say about reuse in SSIS.

[3] Or projects using any other technology, for that matter. If you know of any projects like this and are looking for a consultant, I’m pretty sure I can work out a moonlighting arrangement with Microsoft. ;-)

[4] Or have seen me blatantly plagiarize his slide decks for my own talks.

[5] No Ted, this doesn’t include you. Sorry.

[6] I won’t go into the details of the problem because it isn’t really important to the story, but it could be anything along the lines of:

  a) A missing variable, which needed to be added.
  b) A variable defined at an incorrect scope, hiding a variable at the package scope when accessed from a Script task.
  c) A log provider needed to be added to support new project requirements.

You get the idea, right?

[7] Yes, people really do this. Yes, I’m thinking about doing it right now. I’ve been on vacation for a week and have been checking email at least once a day to respond to high priority items, but I still have nearly 700 unread messages in my inbox. ;-(

[8] Or Visual SourceSafe or pretty much any modern SCC system. But TFS has so many additional benefits, it’s hard to recommend anything else.

[9] Just to be clear, I’m not implying that there is any ability to browse TFS repositories in BIDS, but instead am saying that the developers all have the most recent packages from TFS and can browse to them on their local file systems.

[10] Odds are they use a tool like PacMan to batch update the existing packages. ;-)

[11] Or perhaps not so politely depending on the hour…

Friday, May 22, 2009

Coolest. Tool. Ever.

I’m struggling with how to best start this post. Here are a few possibilities:

  • I like to think that I’m a hard person to impress. But…
  • I swear at my computer many times a day, but usually in anger and almost never in delighted surprise.
  • Someone deserves a raise.
  • Holy crap!

All of them apply, so you can take your pick.

But what am I talking about?

I’m talking about the DocumentReflector tool that’s included with the Open XML Format SDK v2.0.

You see, I’ve been playing today with a few Office development tasks today for a work-related project. I’ve never looked too closely at working with the Office Open XML file format before this week; I’ve read white papers and such, but until you start writing your own code you never really “get” what an API is all about. After a few false starts and many searches, I was less than impressed.

But then I discovered DocumentReflector. This is a stand-alone SDK tool that will (wait for it) let you open an Office Open XML document, select a subset of the document or the entire document and it (here it comes) displays the C# code that will create it.

Holy crap indeed.

I’m sure I’ll be swearing at this amazing tool before the weekend is over (the new wears off pretty quickly for me) but today whoever built this app is officially my hero.

Edit: I almost forgot to add – there’s a great overview of using this tool on Erika Ehrli’s blog, here.

Now I can’t help but ask why we don’t have a tool like this for SSIS packages. Wouldn’t it be great to point a utility at a DTSX file and have it output the code required to build that package through the SSIS API? ;-)

Tuesday, May 19, 2009

Spatial Demo Goodness

If you’re someone who presents on SQL Server topics, you have probably run into something of a wall when it comes to getting interesting spatial data sets to demonstrate. The spatial data included with the SQL Server 2008 sample databases is functional, but not particularly complex or interesting. There is free spatial data available for many different sources online, but it tends to be difficult to find, in different formats, and annoyingly difficult to load into SQL Server.[1] And of course, the not-free spatial content out there tends to be really, really not-free, and while it may make sense to pay a premium price if you are developing premium software, but for demo purposes this is generally a non-starter.

Enter GeoNames.

GeoNames is an open source provider of spatial data. Essentially they have many disparate sources of free spatial content and have aggregated them into a single location, with many different access methods. They support web service access (and publish a nice set of client libraries too) which is nice for direct application integration, but to me the cool factor comes from the ability to download text dumps of the whole database or just the countries you want. Because then you can load the data into SQL Server 2008 and let the demo goodness begin.

And Ed Katibah, PM for the SQL Server spatial team, has posted instructions for loading GeoNames data into SQL Server 2008. It’s great to have these steps documented because there are quite a few of them, but hopefully you’ll only need to perform them once.

So if you have been waiting for great spatial data that’s available for free, wait no longer.

I should also point out that I became aware of this cool resource not based on my own hard work and research, but instead because of the excellent Simple Talk newsletter that Red Gate Software produces. And I should probably mention that the primary reason I blogged about it is that my friend and colleague, senior SQL Server trainer and all-around good guy Chris Randall has been working on building better spatial demo sets, and I’ve heard rumors that  he occasionally reads this blog. Hopefully this will save him (and you, and me) some work…

 

[1] Please keep in mind that I don’t claim to be a SQL Server spatial expert, so what is “annoyingly difficult” for me may be “exceptionally simple” for someone with more experience, but it likely to be “annoyingly difficult” for many people.

Thursday, April 9, 2009

More on Package Configurations

Douglas Laudenschlager from the SSIS documentation team has posted an excellent article to his unofficial SSIS blog on how package configurations are applied in SQL Server 2005 and SQL Server 2008 Integration services. Check it out here.

I don’t often post on my blog simply to reference someone else’s original work, but I wanted to do it today because:

  • Package configurations are a vitally important feature that are often misunderstood.
  • The manner in which configurations are applied has changed between SQL Server 2005 and SQL Server 2008.
  • These changes can cause changes in package behavior when executing packages with specific DTEXEC options

So if you are moving from SQL Server 2005 to SQL Server 2008, definitely be sure to read about these changes.