Are there any records there?

It’s been a while that I haven’t blogged, and my queue grows inversely proportional to the amount of time I have available for blogging, so let me do a short series of easy stuff, simply to take it off the list.

This is not about new features, crazy new tips and tricks or anything of the sort. It’s just a couple short lessons on performance and how to reduce your carbon footprint and make the planet last longer.

It’s about how to properly ask the database: are there any records there?

Out in the wild, you can find a bajillion instances of something like this:

IF SomeTable.COUNT > 0 THEN
  DoSomething();

Or a variation of it, that looks like this:

IF SomeTable.COUNT = 0 THEN
  DoSomethingElse();

It amazes me again and again how often people just routinely write this. The logic is seemingly simple: if there are (no) records in the table then do something.

You see, the problem is that when you ask for COUNT, that’s exactly what you get. To give you the count, SQL Server has to go down into the database and perform the actual counting of rows, typically by doing either a table scan or an index scan.

A scan involves at minimum physically accessing each record in its storage (but not necessarily reading its content). You can imagine that it can take a long time to process it.

This is what COUNT looks like to SQL:

SELECT COUNT(*) FROM …

How fast SQL Server will be at processing COUNT depends on a number of variables. Number of records, presence (or absence) of a matching index, index distribution, index fragmentation, logical order fragmentation, page density fragmentation, are a few of variables that can influence how fast SQL Server can physically perform the operation on the data storage level. Then there are hardware constraints, such as memory available to SQL Server or amount of storage memory, that can influence whether the data will be actually read from the disk or from memory. Then there are behavioral factors, like how busy the table is, and how much concurrency there is on it.

All in all, whenever you ask for COUNT, you make SQL Server (and quite possibly your users, too) hate you. I’ve personally seen instances where COUNT ran for over a minute.

Now, you may instinctively think that you can improve the speed of COUNT if you apply some filters. This is only true if you have an efficient index (which probably comes with performance penalties elsewhere). However, if you apply filters and SQL Server cannot (or decides not to) use an index, not only SQL Server will have to access each row, it will also have to retrieve it in its entirety. This can slow everything down by an order of magnitude. Bummer.

So, all this crazy work, just to tell you if there are *any* records in a table.

Logically speaking, to know if there are any elements of a set, you don’t necessarily need to count all its elements and then compare the result to zero. Why not simply trying to retrieve whichever one element from the set and see if you got anything or nothing.

And that’s precisely what ISEMPTY function does: it accesses the first matching row only. This is what ISEMPTY looks like on SQL Server:

SELECT TOP 1 NULL FROM

Now, SQL Server will still have to do a table or an index scan, and depending on your filters it may need to retrieve the row from the database to perform the filter matching, but once it successfully retrieves (and matches) one row, it terminates immediately without going to further rows.

This may still be slow, though. Depending on same factors COUNT depends on, you may be extremely unlucky that your one matching row is the very last row read from the database. However, this is extremely unlikely, and if this systematically happens, you can solve it with an index.

In any case, ISEMPTY is as efficient as it gets and you should always use ISEMPTY instead of COUNT when the only thing you need to know is if there are any records in the table, and you don’t really care about how many of them there are exactly.

Vjeko

Vjeko has been writing code for living since 1995, and he has shared his knowledge and experience in presentations, articles, blogs, and elsewhere since 2002. Hopelessly curious, passionate about technology, avid language learner no matter human or computer.

This Post Has 11 Comments

  1. Claudiu

    Hi, what about FINDSET or FINDFIRST? Is it a performance issue if I use them to check if I have any records?

    1. Vjeko

      They will certainly be slower than ISEMPTY.

  2. Dennis

    Hi, I have had some issues with performance on ISEMPTY but this only applies in very few scenarios using temperary tables.

    This issue happens when you have a temperary table that is quite big and you have a loop surrounding a temptable.ISEMPTY(And some other code of course), where you are filtering on records in a secondary key.

    Apparently SETCURRENTKEY is not applied when doing ISEMPTY and therefore in that scenario you will have better performance when using SETCURRENTKEY and FINDFIRST.

    1. Vjeko

      Dennis, you are talking temporary tables here. Temporary tables are in-memory, not on disk. They ignore all your keys and everything, and when you set filters on it, and then do any kind of operation that checks or retrieves data, they have to do full “scan”. If you had ISEMPTY on a temporary table with a filter that yields no results, ISEMPTY had to scan through all of rows to figure out if it was really empty. Again, temporary tables don’t benefit from any keys you define.

    2. Vjeko

      But of course. Temporary tables are in memory.

Leave a Reply