A while back I was doing some routine housekeeping and ran a script Kendal Van Dyke wrote to identify duplicate statistics. Not uncommon for those to occur, queries get written that need a stat and the system creates one, later someone does some tuning and adds an index which then creates another stat. The space used for the duplicates isn’t much, but it does add time and IO when you do regular statistics updates. For whatever reason I was digging in a bit and found that we had many system generated stats with multiple columns, when BOL says single column only. My first thought was that I was wrong, but I asked friends at other locations to check their systems and they found the same thing – unicorns! Our guess is that this behavior comes from SQL 2000 or earlier, maybe as far back as SQL 6.5, and more work to figure that out than seems worthwhile. Does it matter? Maybe. It’s not uncommon to clean up system stats by dropping them all, letting the engine add back the ones it needs if and when that happens. That’s fine for single column stats, but if you had queries that were working due to having multiple columns in the stat, you won’t get that back – you’d have to create it manually, and know what to create. My guess is that it’s not a huge risk, but one to keep in mind.
Here’s my mod of Kendal’s script. Let me know if you have any problems with it.
--Detect system stats that overlap a user created stat --Andy Warren 11/3/2014 WITH sysstats (tablename, statname, columns, schname) AS ( SELECT object_name(object_id) as TableName, name as statname, (select col_name(c.object_id, c.column_id) + ',' from sys.stats_columns c where c.object_id = s.object_id and c.stats_id = s.stats_id order by stats_column_id for xml path('')) as Columns, OBJECT_SCHEMA_NAME(object_id) as schname FROM sys.stats s WHERE s.auto_created = 1 and OBJECTPROPERTY(s.object_id, 'IsMSShipped') = 0 ), userstats (tablename, statname, columns, schname) AS ( SELECT object_name(object_id) as TableName, name as statname, (select col_name(c.object_id, c.column_id) + ',' from sys.stats_columns c where c.object_id = s.object_id and c.stats_id = s.stats_id order by stats_column_id for xml path('')) as Columns, OBJECT_SCHEMA_NAME(object_id) as schname FROM sys.stats s WHERE s.auto_created = 0 and OBJECTPROPERTY(s.object_id, 'IsMSShipped') = 0 ) select *, 'use [' + db_name() + '];if exists (select * from sys.stats where name = ''' + statname + ''') DROP STATISTICS [' + schname + '].[' + tablename + '].[' + statname + ']' as DropStatement from ( select a.tablename, a.statname, a.columns, (select top 1 statName + ' / ' + columns from userstats g where g.tablename = a.tablename and Left(g.columns, len(a.columns)) = a.columns) as Overlaps, schname from sysstats a ) o where o.Overlaps is not null