I am currently working on a project where I am taking data from an old database. Unfortunately, many of the date fields in the database contain incomplete date values, simply because not all the data was available when it was entered (it was taken off paper, memory and other relatively unreliable storage media).
In the old database, these fields were stored as text. However, I wanted them to be searchable by date, so I was thinking about it and I have come up with a couple of possible ideas…
Firstly, you can store the centrepoint datetime and a float to represent a timespan in days of how accurate this is. Whilst I think it is possible to do an SQL query to query this (I haven’t actually tried it), it is a bit messy, and I think it would probably be a nightmare to write a Linq to SQL query or use another query framework.
Secondly, you can store the min and max datetime that it could be. This also allows you to do a little clever parsing when you retrieve the date from the database and for example, you can list a date which is in the database as 2009-01-01 to 2009-01-31 (in ISO date format) as “January 2009″.
Of course, both of these only work if the missing data is the least significant part (in mathematical terms). I mean, if you know the day and month, but not the year, neither of these schemes work. For that you’d probably need to use 3 nullable integer fields, but then searching on that would be a pain.