2008/07/16
Got TFS Hate?
2008/07/07
When good coders write bad code.
I ran into a side-effect of this mentality on my latest project. Recently, I was working on optimizing some SQL queries that were running slow. In the middle of plowing through some of the newly added functions I came across one that really seemed to stick out.
In a nutshell, the function was to take a case identifier and return a "hierarchical" start date so that if ActualEndDate exists, return ActualBeginDate. If ApprovedEndDate exists, return ApprovedBeginDate. If CertifiedEndDate exists, return CertifiedBeginDate. If RequestedEndDate exists, return RequestedBeginDate. If all these conditions are false, return NULL.
Here is some sample data:
CaseID | RequestedBegin | RequestedEnd | CertifiedBegin | CertifiedEnd | ApprovedBegin | ApprovedEnd | ActualBegin | ActualEnd |
1 | 7/1/2008 | 7/31/2008 | 7/1/2008 | 7/14/2008 | 7/1/2008 | 7/7/2008 | 7/3/2008 | 7/7/2008 |
2 | 8/1/2008 | 8/30/2008 | 8/3/2008 | 8/30/2008 | 8/4/2008 | 8/30/2008 | 8/10/2008 | NULL |
3 | 9/1/2008 | 9/30/2008 | 9/2/2008 | NULL | 9/3/2008 | NULL | 9/4/2008 | NULL |
And here are the calculated Hierarchical dates
CaseId | HierarchicalDate |
1 | 7/3/2008 |
2 | 8/4/2008 |
3 | 9/1/2008 |
So looking at the function that is used to return the scalar value, I see this:
7 CREATE FUNCTION [dbo].[getHierarchicalDate]
8 (
9 @CaseID bigint
10 )
11 RETURNS datetime
12 AS
13 BEGIN
14
15 declare @HierarchialDate datetime
16 declare @ActualEnd datetime
17 declare @ApprovedEnd datetime
18 declare @CertifiedEnd datetime
19 declare @RequestedEnd datetime
20
21 set @ActualEnd =
22 (select ActualEnd from cases where caseid = @caseID)
23 set @ApprovedEnd =
24 (select ApprovedEnd from cases where caseid = @caseID)
25 set @CertifiedEnd =
26 (select CertifiedEnd from cases where caseid = @caseID)
27 set @RequestedEnd =
28 (select RequestedEnd from cases where caseid = @caseID)
29 if @RequestedEnd is not null
30 if @CertifiedEnd is not null
31 if @ApprovedEnd is not null
32 if @ActualEnd is not null
33 set @HierarchialDate =
34 (select ActualBegin from cases
35 where caseid = @caseID)
36 else
37 set @HierarchialDate =
38 (select ApprovedBegin from cases
39 where caseid = @caseID)
40 else
41 set @HierarchialDate =
42 (select CertifiedBegin from cases
43 where caseid = @caseID)
44 else
45 set @HierarchialDate =
46 (select RequestedBegin from cases
47 where caseid = @caseID)
48
49 RETURN @HierarchialDate
50
51 END
Yup, 8 SELECT statements. And this was written by a developer who has been in architect roles. Someone respected. Someone who should know better. So with a little adjustment, the function become this:
7 CREATE FUNCTION [dbo].[getHierarchicalDate]
8 (
9 @CaseID bigint
10 )
11 RETURNS datetime
12 AS
13 BEGIN
14
15 declare @HierarchialDate datetime
16
17 select @HierarchialDate = CASE
18 WHEN ActualEnd IS NOT NULL and ActualBegin is not null
19 THEN ActualBegin
20 WHEN ApprovedEnd IS NOT NULL and ApprovedBegin is not null
21 THEN ApprovedBegin
22 WHEN CertifiedEnd IS NOT NULL and CertifiedBegin is not null
23 THEN CertifiedBegin
24 WHEN RequestedEnd IS NOT NULL and RequestedBegin is not null
25 THEN RequestedBegin
26 ELSE NULL END
27 from [Cases] where caseid = @CaseID
28
29 RETURN @HierarchialDate
30
31 END
So we're down to one SELECT statement. One table scan. All because nobody took a step back to ask "is there a better way". This is something that seems to prevalent in our industry. Nobody wants to admit they are over their head, that they don't know everything, that someone else might have a better solution. I don't have the answers, so I'd love to hear any feedback.