It’s so easy to make make mistakes with dates. Here is another one I saw recently, this one used when doing some checks on some billable data that had been captured:
select * from sometable where somedate between ‘6/1/2011’ and ‘7/1/2011’
Nine times out of ten when you see that statement what they meant was:
select * from sometable where somedate >= ‘6/1/2011’ and < ‘7/1/2011’
It’s a difference of almost nothing in time, but it can be a huge difference in data. Dates get stored as numbers, with the date to the left of the decimal and the time to the right of the decimal. Many only want to store the date and will remove the time (no longer needed if you have access to the separate date and time data types), but even if you don’t specify a time what you’re storing includes a time of zero – midnight.
Imagine if you look at the data and see:
- 6/1/2011 00:00:00
- 6/2/2011 00:00:00
- 6:30/2011 00:00:00
- 7/1/2011 00:00:00
- 7/1/2011 02:00:00
The first query above that uses the between will return the first four records. The second query only returns three. It’s not much difference in time, but it can be a big difference when you care about being precise! It’s worse than it appears because the first day of each month will get counted in two different periods if the the time is zero.
Mistakes like this go unnoticed for years because the logic that is applied may be wrong, but it’s applied consistently. It usually gets noticed when someone writes a new report and does the date range correctly, and then a client asks why is there a difference.
I wouldn’t call it a beginner error I (thought it often is) as much as I’d call it an expectation error. It seems like between should do what we want in that situation. Such are the things that bugs are made of.