I wanted to share a little bit more from my deck I delivered at Teched with Paul Learning. Had a great session, great attendance. What we got as a result was the need for more crossover information on SQL and SharePoint. I do recommend you start with this post on “Understanding the SharePoint SQL Databases” on SharePointForAll which I put together as post to better explain how the data is spread out and to help better understand the differences between the different SharePoint databases.
Performance and the various bottlenecks introduced by SharePoint leveraging SQL need to be better understood for us to optimize SQL for SharePoint. Beyond RAM which is often the first bottleneck for SQL systems to hit when running with SharePoint databases is disk I/O. It’s not the typical content database that has these challenges, but the temp and search db. The SharePoint databases are most often disk bound, meaning the bottleneck is Disk IO. You can easily optimize your databases for disk throughput requirements by understanding their read and write patterns and demands.
Above I’ve tried to break down the database disk demands into three tiers. High, medium, and low. Obviously there are operations which affect the performance of these databases at various times.
The search database is extremely write intensive. During a crawl of a collaboration environment you will find higher disk IO in a large SharePoint farm than even Exchange! As well, the temp db is extremely volatile. It is often the bottleneck to write performance for your databases. I recently stated that after RAM, the temp db is most often the bottleckneck and can do more for your large scale performance and planning than anything else (Of course there are other performance considerations like the pipes between the servers and NIC). Recently I explained in a post titled “SharePoint Performance and File Groups for Temp db, Search db, and Content Dbs” recommending for large environments looking for better peformance to split out the temp and search dbs with multiple NDF files (files and file groups) to optimize the write performance by allowing for better write throughput. SharePoint admins may not realize, but everything is written to the temp db first, and then written into the transaction logs and then into the content database.
Create as many files as needed to maximize disk bandwidth. Using multiple files reduces tempdb storage contention and yields significantly better scalability. However, do not create too many files because this can reduce performance and increase management overhead. As a general guideline, create one data file for each CPU on the server (accounting for any affinity mask settings) and then adjust the number of files up or down as necessary. Note that a dual-core CPU is considered to be two CPUs.
Make each data file the same size; this allows for optimal proportional-fill performance.
Put the tempdb database on a fast I/O subsystem. Use disk striping if there are many directly attached disks.
Put the tempdb database on disks that differ from those that are used by user databases.
As well, in the hot/high importance category, the transaction logs for those who aren’t database admins, are often misunderstood. The transaction logs need to be optimized despite whether your environment is in simple or full logging mode. I highly highly encourage that your transaction logs be on optimized RAID configurations such as 0+1. Stripped and mirrored. You need write performance for your transaction logs.
Backup (SQL & Tape) – serious write disk I/O performance hit and serious CPU hit. This stuff hurts. STSADM backup is one of the most inefficient commands you can run. Using SQL backup is a good start, and DPM or other snapshots technology and even SQL Litespeed or SQL 2008 backup with compression all help to lessen the performance hit. (Tips for improving SharePoint backup performance)
Profile Import – watch your SQL CPU spike and it hold onto the cycles for hours. You can attempt to be smarter about the Query in your LDAP query to your AD. That’s a good start.
Misc Timer Jobs – User Sync for large #s of Users – the more users the longer the timer jobs will run. Profile Synchronization – This job runs once every Hour and there is one per Web Application Quick Profile Synchronization – This job runs once every minute as performance permits and there is one per Web Application (MSDN.) More information on miscellaneous timer jobs…
STSADM Backup/Restore – Note the updated information on 12GB as a size where they recommend putting site collections in their own database? This is a recommendation around performance and scale. There is an article on monitoring for SQL blocking.
Heavy User Operation List Import/Write – Another scenario of users having power they don’t realize they have. In SQL there’s little difference between this and bulk user operations. Not a lot on the performance impact, but here’s an example of how to create a list based on a spreadsheet. You can imagine an underground project to import a ton of data from file shares or from Access databases. Totally under the radar. Not much you can do other than plan for peak, and encourage people to plan with you for these type of long running operations.