SQL Tips by Namwar Rizvi

May 31, 2007

Finding all mispelled string values quickly

Filed under: Information, Query, TSQL, string manipulation, tips — namwar @ 9:10 pm

During data cleansing of imported data from some legacy system, we find that some values are actually spelled incorrectly and therefore, they are not getting included in the result of a query. Finding all the misspelled versions of a given value is quite difficult if you have thousands or millions of records.
Fortunately, there is a quick and easy solution for it which is “SOUNDEX” function in TSQL. Please note that Soundex is not the guarnteed way of finding all the incorrect versions but it is one of the quickest and nearly 90% accurate way of it.
Soundex is an algorithm and it bases on the idea that similar sounding words will have a same alpha-numerical score calculated by this algorithm. So for example, if you have a column called “Color” and you have different variation of same color names like Red,Redd,Redh etc. then Soundex will assign the same score to all of them and you can easily find these variations by comparing the Soundex score. Following is the full working example to better understand this concept.

–Disable SQL Server intermediate messages
Set NoCount On

–Create test Table containing daily data
Declare @m_TestTable table (ItemId int, Color varchar(50))

–Insert some sample values
Insert into @m_TestTable values(1,‘Red’)
Insert into @m_TestTable values(2,‘Reddh’)
Insert into @m_TestTable values(3,‘Redd’)
Insert into @m_TestTable values(4,‘Blue’)
Insert into @m_TestTable values(5,‘Green’)
Insert into @m_TestTable values(6,‘Dark Red’)

—Select all those items which are Red
Select * from @m_TestTable Where Soundex(Color)=Soundex(‘Red’)

May 30, 2007

Over 100 SQL Server Samples and 5 Sample Databases Free

Filed under: Information, SQL Server 2005, tips — namwar @ 9:53 pm

Today I want to share with you a great resource to learn about SQL Server 2005. Just download the SQL Server Samples and Samples Databases from

http://www.microsoft.com/downloads/details.aspx?familyid=e719ecf7-9f46-4312-af89-6ad8702e4e6e&displaylang=en

It contains more than 100 Samples and five databases which are compatible with SQL Server 2005 SP2 and will give you excellent idea about nearly every type of task you want to perform on SQL Server 2005.

Enjoy!

May 29, 2007

Finding N most recent orders of each customer – Wonders of Rom_Number()

Filed under: New Features, Row_Number, SQL Server 2005, TSQL, tips — namwar @ 8:30 pm

Retrieving n most recent orders from each customer is a quite frequent requirement from management reports. Before SQL Server 2005, there were tricky and sometimes unmanageable TSQL solutions for this problem. In SQL Server 2005, you can use Row_Number() to get the n most recent orders. Following query uses AdventureWorks database to demonstrate the solution.

Use AdventureWorks
Go

Select CustomerID,OrderDate,SalesOrderNumber
From
(
Select Row_Number()
OVER (Partition by CustomerId Order by OrderDate Desc ) as RowNo,
CustomerID,OrderDate,SalesOrderNumber
from Sales.SalesOrderHeader
) A

Where A.RowNo < 3

How it works:
1. Firstly, the inner query sorts the order of each customer in descending order of OrderDate
2. It assigns a RowNumber to each order from latest to oldest for each customer.
3. RowNumber will become reset for each new customer.
4. After above steps, result will be returned to Outer query.
5. Outer Query just selects only those rows which have RowNumber less than n(here n is 3)

Possible Enhancements:

1. You can change hard coded 3 to a stored procedure parameter and can convert this query to stored procedure for any number of recent orders for each customers.
2. By changing the OVER clause with different combinations of sorting orders and columns, you can get n latest due orders etc. Possibilities are unlimited.

May 28, 2007

Retreive current user permissions in SQL Server 2005

Filed under: Information, New Features, Object Search, Query, SQL Server 2005, TSQL, tips — namwar @ 8:36 pm

If you want to retrieve the permissions available for the currently logged in user then you can use the function called fn_my_permissions in SQL Server 2005. Following are examples of its usage:

Use AdventureWorks
Go

—Return Caller’s permission on the current server
SELECT * FROM fn_my_permissions(NULL, ‘SERVER’);
Go

—Return Caller’s permission on the current database
SELECT * FROM fn_my_permissions(NULL, ‘DATABASE’);
Go

—Return Caller’s permission on the given object
SELECT * FROM fn_my_permissions(‘Sales.vIndividualCustomer’, ‘OBJECT’);
Go

For further details please refer to Technet help here

Use VARCHAR(MAX),NVARCHAR(MAX) instead of TEXT, NTEXT

Filed under: New Features, SQL Server 2005, TSQL, Tuning, tips — namwar @ 8:06 pm

Have you ever faced the problem of assigning more than 8000 characters to varchar data type or more than 4000 characters to nvarchar data type? If your answer is yes, then I can understand your frustration due to this small limit.
Since SQL Server uses 8KB page to store data to disk therefore, it does not allow you more than 8000 characters in varchar or 4000 (2 Bytes per Unicode character) in nvarchar.

But wait, don’t get disappointed, Microsoft came up with Varchar(MAX), NVarchar(MAX) and VarBinary(MAX) data types in SQL Server 2005 which allows you to save upto 2GB in a single variable. The best part is that It allows you to use these data types as stored procedure parameters, internal variables etc.

So stop using TEXT and NTEXT data types because they are just there for backward computability and will be deprecated in next versions of SQL Server.

May 26, 2007

Generate Results for missing rows

Filed under: Query, TSQL, Tuning, tips — namwar @ 11:36 pm

Sometimes we require to display values for all days of a month. The problem with this is if you have properly normalized database then there is a chance that you may be recording data for only those days which have any value or in simple words you are recording the data as date, value pair. If for example, there is no data for 17th day of the month then we will not record an empty row for it. It is a proper and correct approach but If we need a report to display the data for all days and we require 0 as a value for all missing days then we can use the following TSQL approach to generate the data for all days including missing days also:

–Disable SQL Server intermediate messages
Set NoCount On

–Create test Table containing daily data
Declare @m_TestTable table (DayId int, value int)

–Insert values for some days
Insert into @m_TestTable values(1,30)
Insert into @m_TestTable values(4,20)
Insert into @m_TestTable values(9,33)
Insert into @m_TestTable values(12,77)
Insert into @m_TestTable values(15,177)

—Create Numbers table and fill it with 1-30
Declare @m_Days table (id int)
Declare @m_index int
Select @m_index=1
While @m_index<31
Begin
Insert into @m_Days values(@m_index)
Select @m_index=@m_index+1

End

—Run the following query to return value of each day in a month.
Select B.Id,Isnull(A.Value,0) From @m_TestTable A Right Outer Join @m_Days B On A.DayId=B.id

Finding Nth Highest value — Wonders of Row_Number()

Filed under: New Features, Row_Number, SQL Server 2005, TSQL, tips — namwar @ 12:33 am

Finding Nth highest value is one of the most common requirements of Reports. For example, you may have a requirement of finding 5th Highest salaried employee or 3rd most expensive item etc. In SQL Server 2005, it is now very easy to find Nth item in the result because of the Row_Number function. Using Row_Number is also very efficient and does not impact your system performance. Following is the code to demonstrate the usage of Row_Number() function to find 5th highest value. You can replace 5 by any number to find Nth highest value.

–Disable intermediate SQL Server messages
Set NoCount On

—Create a table variable
Declare @m_TestTable table ( id int, value int)

–Insert 100 Random values
Declare @m_Index int
Select @m_Index=1
While @m_Index <101
Begin
Insert into @m_TestTable values (@m_Index,Ceiling(10000 * Rand()))
Select @m_Index=@m_Index+1

End

—Return 5th highest
Select id,Value from
(
Select id,value ,(Row_Number() Over (Order by value desc)) rowNum
from @m_TestTable

) A
Where A.rowNum=5

May 24, 2007

Finding gaps in values

Filed under: TSQL, Tuning, tips — namwar @ 7:19 pm
Sometimes you need to find gaps in values of the given integer column. For example, you need to find gaps in the id column of a given table so that you can use those values for new records to fill the gaps. Mostly it happens when we import data from some other system which does not guarantee the consecutive ids. Following is the TSQL query which returns all the gaps:

–Disable intermediate SQL Server messages
Set nocount on

–Create a table variable
Declare @m_TestTable table (id int)

—Insert some sample values
Insert into @m_TestTable values(0)
Insert into @m_TestTable values(5)
Insert into @m_TestTable values(6)
Insert into @m_TestTable values(7)
Insert into @m_TestTable values(10)
Insert into @m_TestTable values(15)

—Run actual query to display gaps
Select currentPoint+1 as StartIndex,nextPoint-1 as EndIndex
From
( –Create pairs of start and end points
Select id as currentPoint,
(
Select Min(id) from @m_TestTable as B
Where B.id>A.Id
) as nextPoint
From @m_TestTable as A

) as D
–Removing all incorrect pairs where Start is greater than End
Where nextPoint-currentPoint>1

May 23, 2007

Difference between EXCEPT and NOT IN Operator

Filed under: Uncategorized — namwar @ 7:53 pm

SQL Server 2005 has introduced a new operator called “EXCEPT”. It is a good shortcut for Distinct Not In queries. EXCEPT operator returns all distinct rows from left hand side table which does not exist in right hand side table. On the other hand “NOT IN” will return all rows from left hand side table which are not present in right hand side table but it will not remove duplicate rows from the result. Following code demonstrates the behavior of these two operators:

–Disbale intermediate informative messages
Set Nocount on
go

—Create two sample tables
Declare @m_TestTable table ( id int, name varchar(50) )
Declare @m_TestTable2 table ( id int, name varchar(50) )

–Fill first table with sample data
Insert into @m_TestTable values(1,‘BBC’)
Insert into @m_TestTable values(2,‘CNN’)
Insert into @m_TestTable values(2,‘CNN’)
Insert into @m_TestTable values(3,‘FoxNews’)
Insert into @m_TestTable values(3,‘FoxNews’)

–Fill second table with sample data
Insert into @m_TestTable2 values(1,‘BBC’)

–Run query with EXCEPT operator, Only distinct rows will return
Select id,name from @m_TestTable
except
Select
id,name from @m_TestTable2

–Run query with NOT IN operator, duplicate rows will exist in the result
Select distinct id,name from @m_TestTable
Where id not in
( Select id from @m_TestTable2 )

May 22, 2007

Insert Script Generator

Filed under: Insert Script Generator, SQL Server 2005, TSQL, tips — namwar @ 9:18 pm

Every SQL developer feels the need for INSERT statements generator script for a given table data. I have found a very good script to accomplish this task. With some minor modifications I have created the following stored procedure which takes table name as input and generates insert statements. Following is the stored procedure code:

Create Proc spInsertScriptGenerator
(
@p_tableName varchar(255)
)
as
/**************************************************************/
–Description : Stored procedure to generate insert statements
–Original Source : http://blogs.consultantsguild.com/index.php/mclerget/2005/02/07/dynamic_sql_insert_generator_unleashed_1
–Modified By : Namwar Rizvi
–Date : 22-May-2007

/**************************************************************/
Declare @tmp table
(
SQLText varchar(8000)
)
Declare @tmp2 table
(
Id int identity,
SQLText varchar(8000)

)
set nocount on
declare
@vsSQL varchar(8000),
@vsCols varchar(8000),
@vsTableName varchar(40)

declare csrTables cursor for
select name from sysobjects where type in (‘u’)and name =@p_tableName
order by name

open csrTables
fetch next from csrTables into @vsTableName
while (@@fetch_status = 0)

begin
select
@vsSQL = ,@vsCols =
select @vsSQL = @vsSQL +
CASE when sc.type in (39,47,61,111) then
””””’+’ + ‘isnull(rtrim(replace(‘+ sc.name + ‘,””””,””””””)),””)’ + ‘+”””,”+’
when sc.type = 35 then

””””’+’ + ‘isnull(rtrim(replace(substring(‘+ sc.name + ‘,1,1000),””””,””””””)),””)’ + ‘+”””,”+’
else
‘isnull(convert(varchar,’ + sc.name + ‘),”null”)+”,”+’
end
from
syscolumns sc where sc.id = object_id(@vsTableName)
order by ColID

select @vsCols = @vsCols + sc.name + ‘,’from syscolumns sc
where sc.id = object_id(@vsTableName) order by ColID

select @vsSQL = substring(@vsSQL,1,datalength(@vsSQL)-1)
select @vsCols = substring(@vsCols,1,datalength(@vsCols)-1)

insert @tmp
exec (’select ‘ + @vsSQL + ‘ from ‘ + @vsTableName)

update @tmp
set sqltext = ‘insert ‘ + @vsTableName + ‘(‘ + @vsCols + ‘) values(‘ + substring(sqltext,1,datalength(sqltext)-1) + ‘)’

insert @tmp2 select ‘DELETE from ‘ + @vsTableName
insert @tmp2 values (‘GO’)

if (select count(id) from syscolumns where id = object_id(@vsTableName) and ((status & 128) = 128) ) = 1
begin
insert
@tmp2 select ’set identity_insert ‘ + @vsTableName + ‘ on’
end

insert @tmp2 select * from @tmp

if (select count(id) from syscolumns where id = object_id(@vsTableName) and ((status & 128) = 128) ) = 1

begin
insert
@tmp2
select ’set identity_insert ‘ + @vsTableName + ‘ off’
end

insert @tmp2 values (‘GO’)
insert @tmp2

select ‘update statistics ‘ + @vsTableName
insert @tmp2 values (‘GO’)

delete @tmp
fetch next from csrTables into @vsTableName
end

close csrTables
deallocate csrTables

update @tmp2
set sqltext = substring(sqltext,1,charindex(‘,)’,sqltext)-1) + ‘,NULL)’
where not(charindex(‘,)’,sqltext) = 0)

update @tmp2
set sqltext = replace(sqltext, ‘,””’,‘,null’)
where not (charindex(‘,””’,sqltext) = 0)

update @tmp2
set sqltext = replace(sqltext, ‘(””’,‘,null’)
where not (charindex(‘(””’,sqltext) = 0)

set nocount off
select
sqltext from @tmp2 order by id
go

Next Page »

Blog at WordPress.com.