There are several scripts around to identify what allocates space in a database; some use sp_spaceused, some sys.allocation_units. I'm looking at a medium-large production database (4 TB range), several files, with most of the space according to sp_helpdb on PRIMARY
(1921318912 KB, 1239993344 KB, ...). SQL Server 2008 R2, Enterprise Edition.
Unallocated space is below 5-7% depending on the day.
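To sanity-check the file side, I've been looking at allocated vs. used space per file with something like the sketch below (size and SpaceUsed are in 8 KB pages, so dividing by 128 gives MB; FILEPROPERTY only reports on files of the current database):

    -- allocated vs internally used space per file, run inside the database in question
    SELECT  name                                           AS logical_name,
            type_desc,
            size / 128                                     AS allocated_mb,
            FILEPROPERTY(name, 'SpaceUsed') / 128          AS used_mb,
            (size - FILEPROPERTY(name, 'SpaceUsed')) / 128 AS free_in_file_mb
    FROM    sys.database_files;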
Something like the SQL below returns total space allocated by tables in the 1 TB range.
SELECT  SCHEMA_NAME(t.schema_id)        AS SchemaName,
        t.name                          AS TableName,
        i.name                          AS IndexName,
        SUM(p.rows)                     AS RowCounts,
        SUM(a.total_pages)              AS TotalPages,
        SUM(a.used_pages)               AS UsedPages,
        SUM(a.data_pages)               AS DataPages,
        (SUM(a.total_pages) * 8) / 1024 AS TotalSpaceMB,
        (SUM(a.used_pages)  * 8) / 1024 AS UsedSpaceMB,
        (SUM(a.data_pages)  * 8) / 1024 AS DataSpaceMB
FROM    sys.tables t
        INNER JOIN sys.indexes i
            ON t.object_id = i.object_id
        INNER JOIN sys.partitions p
            ON i.object_id = p.object_id
           AND i.index_id  = p.index_id
        INNER JOIN sys.allocation_units a
            ON p.partition_id = a.container_id
WHERE   t.name NOT LIKE 'dt%'           -- skip old database diagram tables
        AND i.object_id > 255           -- skip system objects
        AND i.index_id  <= 1            -- heap or clustered index only
GROUP BY t.name, i.object_id, i.index_id, i.name, SCHEMA_NAME(t.schema_id)
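(I realize the i.index_id <= 1 filter above counts only heaps and clustered indexes, so nonclustered index space is not in that total, and going through sys.tables skips internal objects. As a coarse cross-check, a rollup straight from sys.allocation_units should catch those too; a sketch below, with the usual caveat that these counters are the same ones DBCC UPDATEUSAGE corrects, so they can drift.)

    -- whole-database rollup by allocation unit type: includes nonclustered indexes,
    -- internal objects, LOB and row-overflow allocation units
    SELECT  a.type_desc,
            SUM(a.total_pages) * 8 / 1024 / 1024 AS total_gb,
            SUM(a.used_pages)  * 8 / 1024 / 1024 AS used_gb
    FROM    sys.allocation_units AS a
    GROUP BY a.type_desc;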
sp_spaceused totals are in a similar range.
This particular database is both an ETL target and source, quite active, and lots of data gets re-loaded. It is possible that sp_spaceused does not return accurate data; the question is how far off it can be.
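For reference, the database-level totals I'm comparing against come from the plain call; the second result set splits reserved space into data / index_size / unused, and "unused" (reserved but not used) is where I'd expect churn to show up first:

    -- database-level totals (result set 1: database_size, unallocated space;
    -- result set 2: reserved, data, index_size, unused)
    EXEC sp_spaceused;

    -- @updateusage corrects the counters first, but at database scope that is
    -- effectively a full DBCC UPDATEUSAGE run, so probably not during busy prod hours
    -- EXEC sp_spaceused @updateusage = N'TRUE';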
On one of the restored backups (close in size to live prod, not terribly old, a month give or take) DBCC UPDATEUSAGE did not find a lot to correct:
USED pages (In-row Data): changed from (1503867) to (1503841) pages.
RSVD pages (In-row Data): changed from (1503977) to (1503961) pages.
USED pages (In-row Data): changed from (2636729) to (2636707) pages.
RSVD pages (In-row Data): changed from (2636824) to (2636808) pages.
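If I ever run it against prod, I would probably scope it to one big table at a time rather than the whole database, something along these lines (the table name is just a placeholder):

    -- 0 = current database; add WITH COUNT_ROWS to also fix row counts (heavier on big tables)
    DBCC UPDATEUSAGE (0, 'dbo.SomeLargeFactTable');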
Running it database-wide in prod is not particularly a good idea: the db is busy, lots and lots of I/O, and there is no maintenance window to run it without affecting the production workflow. My guess is that actual prod would come back with something similar in terms of how inaccurate the counters are; the restored full backup is not that old. In production there is a lot of data churn on some tables, but the largest ones we know about have less churn (it is not a truncate-and-full-reload kind of thing; the diffs are loaded in the form of delete old data, insert new, with no actual UPDATE statements, and we are not talking about billions of records, though the largest tables are quite wide).
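With that delete-old/insert-new pattern on wide rows, I also wonder about page density: half-empty pages would not explain the gap between file size and table totals (the pages still count as used), but they do affect how much real data the allocated pages actually hold. A sketch I plan to run one table at a time (SAMPLED mode, placeholder table name):

    -- page density per index; low avg_page_space_used_in_percent = internally fragmented pages
    SELECT  OBJECT_NAME(ps.object_id)         AS table_name,
            ps.index_id,
            ps.index_type_desc,
            ps.page_count,
            ps.avg_page_space_used_in_percent
    FROM    sys.dm_db_index_physical_stats
                (DB_ID(), OBJECT_ID(N'dbo.SomeLargeFactTable'), NULL, NULL, 'SAMPLED') AS ps;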
I do not see blob/text columns in that database. My thinking was that if DBCC UPDATEUSAGE did not find much to correct, then my totals should not be that far off, and at least most of the allocated space should be accounted for, give or take. For now it's more of a "where did a TB or two go" kind of thing.
Does anyone have medium-large production databases with an active ETL kind of workload, and what do you use to find out about actual space allocation on them?
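One thing I'm considering for quick checks on the busy system is sys.dm_db_partition_stats, since it reads only metadata (no page scans), with the same potential staleness as the other counters. A minimal sketch:

    -- reserved/used pages per object and index, straight from metadata
    SELECT  OBJECT_SCHEMA_NAME(ps.object_id)        AS schema_name,
            OBJECT_NAME(ps.object_id)               AS table_name,
            ps.index_id,
            SUM(ps.reserved_page_count) * 8 / 1024  AS reserved_mb,
            SUM(ps.used_page_count)     * 8 / 1024  AS used_mb,
            SUM(ps.row_count)                       AS row_count
    FROM    sys.dm_db_partition_stats AS ps
    GROUP BY ps.object_id, ps.index_id
    ORDER BY reserved_mb DESC;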
I looked at sys.dm_db_index_physical_stats too; for the top table, for example, I get:
Msg 2591, Level 16, State 40, Line 11
Cannot find a row in the system catalog with the index ID 0 for table...
Basically it is a large datamart; the schema was generated by a tool a while ago, it has clustered primary keys all over (how that happened is a different story), varchar(40)s, etc., so it has some good challenges. Indexes are defragmented regularly (Ola Hallengren's scripts were implemented in prod).
thanks