When I do my world famous SQL sessions I always warn people to pregrow their databases, to embrace white space. As a guideline I tell people to grow their databases to the size they will be in a year. In order to do that, you have to know what your churn rate is. By "churn rate" I mean how much your content grows over a period of time. That period of time may be days or months. As an admin it's a good idea to know your churn rate for a number of reasons. In this post I'll talk about a couple of ways to determine your churn rate and what to do with that information once you have it.
How to determine your churn rate
Before you can make any sizing decisions about your farm, you need to have a good idea of how quickly it grows. There are a few ways to get that information. Being the cheapskate that I am, the two I demonstrate are free and use built-in tools. These are by no means the only ways to get this information.
Through SQL
Through SQL we can see the backups performed on each database, and how large they were. This gives us a great way to see how each database has grown. Remember, white space is not backed up, so your backups reflect the actual amount of data you have, not the size of your MDF file. Run the following query for each of your content databases (it runs against msdb, where SQL Server keeps its backup history) to see how they've grown over time:
select
    BackupDate = convert(varchar(10), backup_start_date, 111)
    ,SizeInMBs = floor(backup_size/1024/1024)
from msdb..backupset
where
    database_name = 'WSS_Content'
    and type = 'D'
order by
    backup_start_date desc
Change the database_name on line 6 to run it against a different database. When you run this query you will get a list of all of the backups run against that database, by date and size.
In my case the most recent backups of this database were 48 MB, while the ones before the 19th were all around 22 MB. I had uploaded 25 MB of content and then run a backup in Central Administration. Since Central Administration does an actual SQL backup, it shows up in this list. Backups made by third-party tools that don't do SQL database backups may not show up. If you would like, you can change the heading on line 3 to SizeInGBs and add another /1024 to the backup_size calculation if that makes more sense for the size of your databases. backup_size is in bytes, so you can modify it accordingly. That's one way to get your database growth rate.
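If you would rather have SQL do the churn math for you, here's a rough sketch that compares each full backup to the one before it. It uses cross apply, so it needs SQL Server 2005 or later:

select
    BackupDate = convert(varchar(10), b.backup_start_date, 111)
    ,SizeInMBs = floor(b.backup_size/1024/1024)
    -- growth since the previous full backup of the same database
    ,GrowthInMBs = floor((b.backup_size - prev.backup_size)/1024/1024)
from msdb..backupset b
cross apply
(
    -- find the most recent full backup before this one
    select top 1 p.backup_size
    from msdb..backupset p
    where p.database_name = b.database_name
        and p.type = 'D'
        and p.backup_start_date < b.backup_start_date
    order by p.backup_start_date desc
) prev
where
    b.database_name = 'WSS_Content'
    and b.type = 'D'
order by
    b.backup_start_date desc

Average the GrowthInMBs column over a few months of backups and you have your churn rate.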
Through SharePoint
You can also get your churn rate from the SharePoint object model. I like this method because it also lets you keep tabs on individual site collection growth, should you choose to. That lets you be proactive about site collections that are growing out of control, before their owners come to you because they are out of space. For this method I use the output of "stsadm -o enumsites". Fortunately for us enumsites not only lists out the site collections in your farm, it also gives us a ton of information about them. One of those tidbits is how much space each site collection is using. We can add that all up to see how much space our farm is taking, and compare it over time. To make this work you have to run "stsadm -o enumsites" periodically, and you have to massage the output a little. Don't worry, I'll spell it all out for you. :)
First we need to run "stsadm -o enumsites" to get the output. The -url parameter is the URL of your web application, not an individual site collection. You may recognize the format of the output: it looks suspiciously like XML (you can see a full sample from one of my test servers a little further down). Because of that we can do all kinds of fun stuff with it. Each site collection, like http://stockholm, gets its own Site element with a lot of information in it. For this blog post we are only really interested in the StorageUsedMB value. That is what we will add up to get our total usage. We can simply pipe that output to a file with something like "stsadm -o enumsites -url http://stockholm > usage.xml".

Copying this file to a document library is a good idea. It lets us keep many copies, which lets us see the growth over any period of time, and it makes each copy easy to view in a browser. To aid in the display of the XML data, we will use an XSL transform. The XSLT lets us format the data in the XML file, as well as do some simple things like create sums. To make things work more smoothly, we will prefix two lines to the XML file. These two lines define the type of XML file we have and which XSLT file to use. We could do it without these two lines, but it would be more work to view; you couldn't just click the XML file you want to view and have everything link up. Here are the two lines we add to every usage.xml file:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="usage.xsl" ?>
This tells your browser to use the file usage.xsl to render the XML file when you click it. I've saved these two lines to a file called header.txt, and I'll use the copy command to combine the header.txt file with the usage.tmp file created by STSADM. Here are the first couple of lines in action:
stsadm -o enumsites -url http://stockholm > usage.tmp
copy header.txt /b + usage.tmp /b usage.xml /b
This gives us a file that looks like this:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="usage.xsl" ?>
<Sites Count="6">
<Site Url="http://stockholm" Owner="STOCKHOLM\administrator" ContentDatabase="WSS_Content" StorageUsedMB="29.5" StorageWarningMB="480" StorageMaxMB="500" />
<Site Url="http://stockholm/dupes/spdtest" Owner="STOCKHOLM\administrator" ContentDatabase="WSS_Content_JohnRoss" StorageUsedMB="0.5" StorageWarningMB="0" StorageMaxMB="0" />
<Site Url="http://stockholm/lrcs" wner="STOCKHOLM\administrator" ContentDatabase="WSS_Content_JohnRoss" StorageUsedMB="15.5" StorageWarningMB="480" StorageMaxMB="500" />
<Site Url="http://stockholm/sites/Search" Owner="STOCKHOLM\administrator" ContentDatabase="WSS_Content" StorageUsedMB="0.2" StorageWarningMB="5" StorageMaxMB="10" />
<Site Url="http://stockholm/sites/spdtest" Owner="STOCKHOLM\administrator" ContentDatabase="WSS_Content_JohnRoss" StorageUsedMB="1" StorageWarningMB="0" StorageMaxMB="0" />
<Site Url="http://stockholm/ssp/admin" Owner="STOCKHOLM\administrator" ContentDatabase="WSS_Content" StorageUsedMB="0.4" StorageWarningMB="0" StorageMaxMB="0" />
</Sites>
Next I make a copy of usage.xml and name it usage-<today's date>.xml. That way we can always click usage.xml for the most up-to-date numbers, but we also have a history to look back on. The command looks like this:
copy usage.xml usage-%date:~12,2%%date:~4,2%%date:~7,2%.xml
That creates a file with the name usage-090322.xml for March 22nd, 2009. I made the date in the format of YYMMDD so that the files would sort well. One caveat: those %date% offsets assume the US short date format (like "Sun 03/22/2009"), so if your server's regional settings are different you will need to adjust them, or use the locale-independent version below.
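Here's a sketch of that locale-independent version. It asks WMI for the date instead of parsing %date%, so it works regardless of regional settings (WMIC ships with Windows XP and Server 2003 and later):

rem Get the date from WMI in a fixed YYYYMMDDhhmmss... format
rem (use %x instead of %%x if you type this at a prompt instead of putting it in a batch file)
for /f %%x in ('wmic os get LocalDateTime ^| findstr /b [0-9]') do set ldt=%%x
rem Grab six characters starting at position 2 to get YYMMDD
set stamp=%ldt:~2,6%
copy usage.xml usage-%stamp%.xml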
Next I copy the resulting files up to SharePoint. I use a tool called Davcopy to do that. Here are the two lines:
davcopy usage.xml http://stockholm/Shared%20Documents /o
davcopy usage-%date:~12,2%%date:~4,2%%date:~7,2%.xml http://stockholm/Shared%20Documents /o
Now that all of the files are up there, all we need to do is upload the XSLT so the XML files can be easily viewed. Here is the text of the XSLT file:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="html"/>
<xsl:template match="/">
<HTML>
<HEAD>
<TITLE>SharePoint Storage Usage Page</TITLE>
</HEAD>
<BODY>
<TABLE>
<TR>
<TH>URL</TH>
<TH>Content DB</TH>
<TH>Used space</TH>
</TR>
<xsl:for-each select="//Sites/Site">
<tr>
<td style="text-align:left">
<xsl:value-of select="@Url"/>
</td>
<td style="text-align:left">
<xsl:value-of select="@ContentDatabase"/>
</td>
<td style="text-align:right">
<xsl:value-of select="@StorageUsedMB"/>
</td>
</tr>
</xsl:for-each>
<tr>
<td colspan="3" style="text-align:right;font-weight:bold">
Total Space Used:
<xsl:text>&#160;</xsl:text>
<xsl:text>&#160;</xsl:text>
<xsl:text>&#160;</xsl:text>
<xsl:value-of select="sum(//@StorageUsedMB)"/>
</td>
</tr>
</TABLE>
</BODY>
</HTML>
</xsl:template>
</xsl:stylesheet>
You can see that the XSLT creates HTML based on the XML file. The sum() at the bottom is what gives us our total. When you click usage.xml in the document library, your browser renders it as an HTML table, one row per site collection, with the total usage at the bottom.
That is pretty bland, but you can use XSL to make things look better, even apply color coding if you would like. One idea would be to give site collections that are over their quota warning level a yellow background, like the snippet below.
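Here's a rough sketch of what that could look like. You would swap the tr inside the xsl:for-each for something like this; the test skips site collections with a StorageWarningMB of 0, since that means no quota is set:

<tr>
<!-- highlight site collections that are over their quota warning level -->
<xsl:if test="@StorageWarningMB &gt; 0 and @StorageUsedMB &gt; @StorageWarningMB">
<xsl:attribute name="style">background-color:yellow</xsl:attribute>
</xsl:if>
<td style="text-align:left">
<xsl:value-of select="@Url"/>
</td>
<td style="text-align:left">
<xsl:value-of select="@ContentDatabase"/>
</td>
<td style="text-align:right">
<xsl:value-of select="@StorageUsedMB"/>
</td>
</tr>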
That is it. You can run these commands manually, or put them all in a batch file and use the Task Scheduler to run it every evening; there's a sketch of that batch file below. I have attached all of the files referenced here, so download them and take a look at them. That will probably help it make more sense.
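To give you an idea, here's roughly what that batch file could look like, start to finish. The file name updateusage.cmd, the paths, and the URLs are placeholders from my test server, so point them at your own farm and document library:

rem updateusage.cmd - gather site collection usage and publish it to SharePoint
rem Dump the site collection list to a temp file
stsadm -o enumsites -url http://stockholm > usage.tmp
rem Glue the two-line XML header on so the browser knows to use usage.xsl
copy header.txt /b + usage.tmp /b usage.xml /b
rem Keep a dated copy for history (see the regional settings caveat above)
copy usage.xml usage-%date:~12,2%%date:~4,2%%date:~7,2%.xml
rem Push both files up to the document library
rem (the %% is doubled so the batch file doesn't eat the %20 in the URL)
davcopy usage.xml http://stockholm/Shared%%20Documents /o
davcopy usage-%date:~12,2%%date:~4,2%%date:~7,2%.xml http://stockholm/Shared%%20Documents /o

Then schedule it to run nightly with something like this (the /st time format varies a little between Windows versions, so check schtasks /create /? on your server):

schtasks /create /tn "SharePoint Usage" /tr c:\scripts\updateusage.cmd /sc daily /st 23:00:00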
What to do with this information
Regardless of how you get your information, you should be able to calculate your farm's churn rate. Once you have that you can start pregrowing your databases. Like I said above, try to give yourself one year's worth of growth in white space. For example, if a content database is 10 GB today and churns 2 GB a month, pregrow it to around 34 GB. This keeps your databases from having to resort to autogrowing. It also tells you ahead of time how much drive space you'll need in a year, which is handy for budgeting purposes. You'll know ahead of time whether you'll need new hardware, and you'll have good numbers to back up any requests you make at budget time.
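If you haven't pregrown a database before, this is the kind of T-SQL I mean. The numbers and the logical file name here are just examples; the logical name is usually the same as the database name, but check sys.master_files to be sure:

-- see the current size (size is in 8 KB pages, so divide by 128 for MBs)
select name, size/128 as SizeInMBs
from sys.master_files
where db_name(database_id) = 'WSS_Content'

-- pregrow the data file to a year's worth of growth (10 GB today + 12 x 2 GB)
alter database WSS_Content
modify file (name = 'WSS_Content', size = 34GB)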
Have fun, let me know what you think.
tk