andrewlocatelliwoodcock

Thoughts on Software

Flexible Search using Stored Procedures: varying the WHERE clause at run-time

with 4 comments


Stored Procedures offer a number of advantages when developing databases, perhaps the chief among them being performance and security. I am loathe to lose these advantages but because they are pre-compiled blocks of SQL code, it can be challenging to develop Stored Procedures that can meet requirements for very flexible functionality such as search.

Over the years, I have seen a number of attempts to provide flexible search functionality. The worst of these involved building SQL Script on the client machine and then passing that script to the server for execution … SQL injection, anyone? Just as bad were stored procedures where the entire where clause was built on the client before being executed on the server: simply passing the where clause as a parameter to the stored procedure does not prevent SQL Injection attacks. Another method is to create a series of stored procedures, one for each combination of search terms. This is used in conjunction with business logic which decides which search stored procedure to run; this approach prevents SQL injection attacks (unless the stored procedure is spectacularly badly written) and also means that each search term has its own execution plan computed. The main disadvantage to this approach is that the number of stored procedures required increases exponentially with each additional search term. In these situations, the technique I prefer sacrifices some performance in order to retain security and improve flexibility.

The technique uses a single stored procedure with a single WHERE clause containing all search terms and uses nullable parameters and boolean logic to decide exactly which combination of search terms is executed.

An example of this technique in use:

CREATE PROCEDURE [dbo].[Search]
	@searchterm1  VARCHAR(255) = NULL,
	@searchterm2  INT = NULL,
	@searchterm3  DATETIME = NULL,
	@searchterm4  varchar(255) = NULL,
	@searchterm5  varchar(255) = NULL
AS
BEGIN

	SELECT
		*
	FROM
		MyTable (NOLOCK)
	WHERE
		(field1 = @searchterm1 OR @searchterm1 IS NULL) AND
		(field2 = @searchterm2 OR @searchterm2 IS NULL) AND
		(field3 = @searchterm3 OR @searchterm3 IS NULL) AND
		(field4 = @searchterm4 OR @searchterm4 IS NULL) AND
		(field5 = @searchterm5 OR @searchterm5 IS NULL)

END

So, what is it doing and how does it work?

SQL Server allows you to set default values for stored procedure parameters; if a default is provided for a parameter, then that parameter need not be explicitly set when the stored procedure is executed. By declaring the parameters as

	@searchterm1  VARCHAR(255) = NULL,

we are saying “If no value is explicitly set for parameter searchterm1, use NULL instead”.

So far, so good: if we don’t explicitly set the value of a particular parameter, it will be set to NULL automatically. The next part of the technique uses Boolean logic to allow us to build a flexible WHERE clause:

	WHERE
		(field1 = @searchterm1 OR @searchterm1 IS NULL) AND
		(field2 = @searchterm2 OR @searchterm2 IS NULL) AND
		(field3 = @searchterm3 OR @searchterm3 IS NULL) AND
		(field4 = @searchterm4 OR @searchterm4 IS NULL) AND
		(field5 = @searchterm5 OR @searchterm5 IS NULL)

The key to this is the bit between the brackets:

		field1 = @searchterm1 OR @searchterm1 IS NULL

this statement is saying “Where the value of field1 in table MyTable is either equal to the value of parameter searchterm1 or searchterm1 is NULL” or, more succinctly: “field1 must be equal to searchterm1 if, and only if, searchterm1 is not NULL”.

We can use any standard SQL comparison here so we could also say:

		field1 LIKE @searchterm1 OR @searchterm1 IS NULL

The key thing is to make sure you understand what this statement is doing: “field1 must be equal to searchterm1 if, and only if, searchterm1 is not NULL”. Because searchterm1 defaults to NULL if we don’t explicitly set a value when executing the stored procedure, this statement means that we will only consider the value of field1 if the value of searchterm1 was explicitly set. Let the import of this sink in for a moment …

The next stage is to use brackets and Boolean AND statements to combine multiple OR’d search term statements together. Brackets tell SQL Server to group the statements within the brackets as a single unit, whilst the AND statements combine the results of each statement within the brackets.

The use of AND means that each bracketed statement must return “true” in order for any results to be returned. Our bracketed statements each say “the value of fieldx must either be equal to the value of parameter searchtermx or searchtermx must be NULL”, so taken in combination

	WHERE
		(field1 = @searchterm1 OR @searchterm1 IS NULL) AND
		(field2 = @searchterm2 OR @searchterm2 IS NULL) AND
		(field3 = @searchterm3 OR @searchterm3 IS NULL) AND
		(field4 = @searchterm4 OR @searchterm4 IS NULL) AND
		(field5 = @searchterm5 OR @searchterm5 IS NULL)

means

	WHERE
		(the value of field1 is either equal to the value of parameter searchterm1 or searchterm1 is NULL) AND
		(the value of field2 is either equal to the value of parameter searchterm2 or searchterm2 is NULL) AND
		(the value of field3 is either equal to the value of parameter searchterm3 or searchterm3 is NULL) AND
		(the value of field4 is either equal to the value of parameter searchterm4 or searchterm4 is NULL) AND
		(the value of field5 is either equal to the value of parameter searchterm5 or searchterm5 is NULL)

and because any parameter not explicitly set defaults to NULL, we now have a WHERE clause in a stored procedure in which we can effectively vary the number of statements, so we can say both:

	WHERE
		(the value of field1 equals the value of parameter searchterm1) AND
		(the value of field2 equals the value of parameter searchterm2)

and

	WHERE

		(the value of field2 equals the value of parameter searchterm2)

or

	WHERE

		(the value of field2 equals the value of parameter searchterm2) AND

		(the value of field3 equals the value of parameter searchterm3) AND

		(the value of field4 equals the value of parameter searchterm4)

Simples.

Written by andrewlocatelliwoodcock

May 30, 2011 at 20:37

Posted in SQL Server, T-SQL

Tagged with , , ,

4 Responses

Subscribe to comments with RSS.

  1. Good article. In the event of a Null, is that part of the query executed and casuse a full table scan?
    field1 = @searchterm1 OR @searchterm1 IS NULL – or does the SQL engine optimise to remove this? Same answer for SQL Server and Oracle?

    Paul Smullen

    May 31, 2011 at 10:01

    • Hi Paul,

      that’s actually a pretty good question.

      If the value of field1 is NULL, whether that row is returned in the recordset depends on the value of the @searchterm1 parameter: if @searchterm1 was not explicitly set, or was explicitly set to NULL, then the second part of the statement (@searchterm1 IS NULL) is true and so the row is returned; if @searchterm1 is not NULL, then the first part of the statement is evaluated and will fail if field1 is NULL and the row will not be returned. The question now is: is the DBMS smart enough to realize that if the parameter is NULL then it does not need to check the value of the field?

      Looking at the execution plan in SQL Server 2008, this is the case as there is no difference in plan when reversing the order of the sub statements: the statements “field1 = @searchterm1 OR @searchterm1 IS NULL” and “@searchterm1 IS NULL OR field1 = @searchterm1” produce identical execution plans. So SQL Server 2008 at least is optimising the execution. I’m unable to test currently against Oracle and earlier versions of SQL Server but reversing the order of the search terms and watching the execution plan will reveal what’s going on.

      It is in fact much more important to make sure that each of the fields in your WHERE clause is covered by an index so that the query does not have to perform a table scan. Ideally I would use a single index that contains all fields in the WHERE clause because although the execution plan will not be recomputed on each execution there will be at least one index that is reasonably efficient for all combinations of parameters.

      andrewlocatelliwoodcock

      May 31, 2011 at 11:42

  2. […] wrote a post recently about creating flexible WHERE clauses in stored procedures to support, amongst other […]

  3. A very useful blog post for the developers and architects. Sharing this post with my professional network through my daily digest

    Kannan Subbiah

    April 14, 2013 at 04:56


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: