Thursday, August 30, 2007

Coming Soon to A Screen Near You

Ahh, the smell of bourbon and pigskin. Football has returned to the South. It is going to be a busy Fall season between gridiron action and the following teaching/speaking events on my calendar.

September 25 - Microsoft Business Intelligence Boot Camp
I will be conducting the program on SQL Server Integration Services for this Microsoft Partner event.

October 6 - Alabama .Net Code Camp V
A couple of abstracts have been submitted, stay tuned for details should I get an invitation. Even if I am NOT speaking, I will be there. Well worth it if you can get to Birmingham for a code-filled Saturday.

November 10 - SQL Saturday
I will be doing a once more refined edition of my Data Mining with SQL Server 2005 presentation, with an eye to new features in SQL Server 2008.

Monday, August 27, 2007

SQL Saturday!

I have been invited to present on SQL Server Data Mining at the SQL Saturday event, in Orlando, Florida on November 10, 2007. The event is sponsored by the Orlando Chapter of PASS and further details can be found at www.sqlsaturday.com.

Friday, August 24, 2007

SSIS Lookup Transforms and Varchar Natural Keys

A recent project required the use of inferred dimension members due to early arriving facts. My initial thought was to do something like the figure below. I would use the client natural key for an in-bound fact row, and first attempt to find it in the dimension using a Lookup transform and a cached reference set. If the dimension key was found, it is pipelined to the Union All and eventually inserted into the fact table. If the dimensional key was not located, the row was dispatched to the Add Inferred Client Member task.
Essentially, this task would execute a T-SQL stored procedure to find the dimension key in the table and insert an inferred member if it is not found. The primary benefit to this is being able to avoid the case sensitivity of the SSIS engine and instead have the matching logic be executed in SQL Server. The row is then dispatched to the 2nd Pass Lookup.

The 2nd Pass Lookup is a Lookup transform identical to the initial attempt to find a matching dimension member, except with reference set caching disabled. This allows newly added rows, inserted by the Add Inferred Client Member task, to be part of the dimension lookup. If for some reason the candidate row is not found at this stage (perhaps an error getting the inferred member inserted was not caught), the row has an ErrorReason column added via a Derived Column transform, and is sent into an error flow. All inbound rows that find a dimension surrogate key in the 2nd Pass Lookup are moved into the destination (ie Production) fact table.

Caveat SSIS Developer

In this case the natural key for this dimension was a varchar datatype, and my initial impression was the use of a Lookup transformation would make the lookup of dimensional keys during fact table load straightforward. Unfortunately, there is a difference in how SSIS and SQL server handle string comparisons. SSIS executes a case-sensitive lookup. A fact table candidate natural key of 'A1234' would not match a dimension key of 'a1234'. Some lookup 'failures' in the initial lookup were actually case sensitivity differences. The first-cut of this process actually was two lookups, with the Execute SQL task between them only to insert an inferred dimension member. As duplicate natural keys for inferred members were inserted, the case sensitivity issue became evident.

Wednesday, August 01, 2007

Speaking In "LA"

It was my pleasure to be the featured speaker at the Lower Alabama .Net User Group (LANUG), in Mobile, AL on July 31, 2007. The presentation was on the data mining features in SQL Server 2005 and can be downloaded as a PDF. I would like to add that Doug Greene and Matthew Hughes are doing a great job with a user group that just recently celebrated the beginning of their second year.

As the post-presentation discussion swerved into dimensional data warehousing, there was some discussion on doing a dimensional data modeling session at the next Alabama .Net Code Camp. If you are interested, please post a reply here or send me an email at bralston@comframe.com.