As you know, SharePoint stores the BLOBs (Binary Large Objects) associated with documents and attachments in a SQL Server database, specifically the AllDocStreams and AllDocVersions tables within a given content database. While this may seem like a good idea to some, it's hard to derive any benefit from the approach. One could try to argue that it simplifies backup/restore and disaster recovery scenarios, but that argument starts to fall apart quickly as the content databases grow. At that point you're left with incremental and full backup times that exceed available windows and an unnecessarily bloated database that will impact performance and create scalability challenges.
Maybe my view on all of this is a little tainted given my work experience prior to starting BlueThread with my business partner. I spent a lot of years architecting and deploying large-scale content management solutions on FileNet, IBM Content Manager, Documentum, and others. Not one of those solutions stored the BLOBs in a database; they stored them on the file system. The "content database" was nothing more than logical containers (folders) and metadata with a pointer to the content BLOB, wherever it was ultimately stored. This approach easily facilitated compression, encryption, HSM, and other capabilities that are lacking out-of-the-box with SharePoint.
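To make the "metadata plus a pointer" idea concrete, here is a minimal sketch in Python. It is not how FileNet, Documentum, SharePoint, or StoragePoint implement it; the table layout, the function names (`open_catalog`, `store_document`, `load_document`), and the hash-based file naming are all assumptions made up for illustration. The point is only that the database row holds the folder, the name, and a path, while the bytes live on the file system.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def open_catalog(db_path=":memory:"):
    """Create the metadata catalog: one row per document, no BLOB column."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        " id INTEGER PRIMARY KEY,"
        " folder TEXT NOT NULL,"
        " name TEXT NOT NULL,"
        " size INTEGER NOT NULL,"
        " blob_path TEXT NOT NULL)"  # pointer to the content, not the content
    )
    return conn


def store_document(conn, blob_root, folder, name, data):
    """Write the bytes to the file system and record only a pointer in the database."""
    digest = hashlib.sha256(data).hexdigest()
    # Hypothetical layout: fan files out into subdirectories by hash prefix.
    blob_path = Path(blob_root) / digest[:2] / digest
    blob_path.parent.mkdir(parents=True, exist_ok=True)
    blob_path.write_bytes(data)
    cur = conn.execute(
        "INSERT INTO documents (folder, name, size, blob_path) VALUES (?, ?, ?, ?)",
        (folder, name, len(data), str(blob_path)),
    )
    conn.commit()
    return cur.lastrowid


def load_document(conn, doc_id):
    """Follow the pointer stored in the catalog and read the bytes from disk."""
    row = conn.execute(
        "SELECT blob_path FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        raise KeyError(doc_id)
    return Path(row[0]).read_bytes()


if __name__ == "__main__":
    catalog = open_catalog()
    root = tempfile.mkdtemp()
    doc_id = store_document(catalog, root, "/contracts", "msa.pdf", b"example bytes")
    print(load_document(catalog, doc_id))
```

Because the catalog only ever sees a path, whatever sits behind that path (a compressed file, an encrypted file, a slower storage tier) can change without touching the metadata schema.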
For years there have been solutions out there that externalize SharePoint content, usually archived content. These solutions break SharePoint. They replace what was there with an HTML placeholder, making everything beyond opening a document from the UI unusable...you can't crawl and index the externalized content or access it through any API. You crawl or access -drumroll, please- the HTML placeholder.
There is hope, though. Microsoft released the External Binary Storage (EBS) interface with Service Pack 1 and will support SQL Server Remote BLOB Storage (RBS) for SharePoint 2010. Our product, StoragePoint, supports EBS today and will support both EBS and RBS for 2010. With StoragePoint installed you very quickly see why storing BLOBs in the database is/was a bad idea. Uploading and downloading documents is quicker, especially when performing bulk operations. Part of that (the easy part for most folks to grasp) is due to streaming BLOBs to and from the file system instead of chunking them in and out of the database. Part is also due to the fact that we're not doing any of that I/O on the SQL tier, freeing up SQL Server resources to execute queries and perform transactional I/O. The I/O is all being done on the Web tier, which is more easily and economically scaled out. Can I get a "Better, Faster, Cheaper"? Oh, forgot to mention that we're compressing (Zip/Deflate) and encrypting (256-bit AES) the content in the test results above and are still 50%+ faster. Turn the compression off and it jumps up to 100-200% faster...YMMV.
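For readers who want a feel for what "streaming to and from the file system" with Deflate compression looks like, here is a rough Python sketch. It is not StoragePoint code and it does not use the EBS or RBS interfaces; the function names `externalize` and `retrieve` and the 64 KB chunk size are assumptions chosen for illustration. The AES encryption step is left out because Python's standard library has no AES implementation.

```python
import io
import zlib

CHUNK_SIZE = 64 * 1024  # assumed chunk size, not a tuned or vendor value


def externalize(src, dst, chunk_size=CHUNK_SIZE):
    """Stream src to dst through Deflate; return the number of bytes written."""
    compressor = zlib.compressobj()
    written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        out = compressor.compress(chunk)
        dst.write(out)
        written += len(out)
    tail = compressor.flush()  # emit whatever the compressor is still buffering
    dst.write(tail)
    return written + len(tail)


def retrieve(src, dst, chunk_size=CHUNK_SIZE):
    """Stream a compressed BLOB back out; return the number of bytes restored."""
    decompressor = zlib.decompressobj()
    restored = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        out = decompressor.decompress(chunk)
        dst.write(out)
        restored += len(out)
    tail = decompressor.flush()
    dst.write(tail)
    return restored + len(tail)


if __name__ == "__main__":
    original = b"quarterly report " * 50_000
    store = io.BytesIO()  # stands in for a file opened on the BLOB store
    stored_bytes = externalize(io.BytesIO(original), store)
    store.seek(0)
    restored = io.BytesIO()
    retrieve(store, restored)
    print(len(original), stored_bytes, restored.getvalue() == original)
```

Neither function ever holds the whole document in memory, so the same loop works for a 10 KB memo or a multi-gigabyte video. In practice `src` and `dst` would be real file or network streams rather than `io.BytesIO` objects.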
The following is a 5-minute demo of installing and configuring StoragePoint along with externalizing a piece of content. There are no clicks or steps left out - it really only takes 5 minutes to install and configure.
You can go to StoragePoint.com and download a free 30-day trial of the software and see for yourself. Run it side by side with a comparable setup sans StoragePoint and prepare to be surprised, shocked, and/or left speechless.
Posted Jun 15 2009, 09:00 AM by Rob D'Oria