Wednesday, June 20, 2007

SharePoint and Large Files (> 1 GB) in a Document Library?

I have been repeatedly asked whether SharePoint can be used as an alternative to FTP or a network share. SharePoint does offer strong capabilities for document management, but it is by NO means an alternative to FTP or a network share.

As you probably know, SharePoint stores all documents in the database as BLOBs, which in itself is not a bad idea. I know you are probably thinking I am smoking something! But the truth is that BLOB performance is vastly improved in SQL Server 2005. Also, SharePoint 2007 does smart caching on the Web Front Ends (WFEs) for the most frequently used files (e.g. JavaScript, CSS and images).

However, because it's a web application and does not use special controls for most files (except for Office documents, which are downloaded via FPRPC over HTTP - discussed later), it suffers from all of the following limitations if you try to upload or download large files:

1) Your browser could timeout while uploading a large file

2) Your browser could run out of memory due to spooling

3) If lots of users are downloading or uploading large files, it could impact your web server performance

4) If your download fails for any of the above reasons, it will not resume by itself. You will need to start all over again...

So, what are your choices ?

1) Store the file on a network share or SAN device: One solution is to store the actual file on a fast network device and keep only the path in the document library (or create a simple list item and store the link). You need to make sure that only the process account has direct access to the network share, and you may also need to write a custom event handler to prevent unauthorised access.

This way you can take advantage of the SharePoint UI, Security, Target Audience and Search features, and also optimise the user experience and performance.

However, using this solution requires careful planning, coding and security testing.

2) Create a custom download ActiveX control or Windows app: If you are building an application for a controlled environment, you could also write a custom ActiveX control or a Windows application and use WCF / MTOM with SharePoint Web services to download files asynchronously, chunk by chunk (streaming).

In case of an error, you can resume the download from where it left off before the error occurred. You can also notify the server of a successful download using a custom hash.

Basically, you have a lot more choices for downloads with this approach than with the first one.

3) Wait for SQL Server 2008: Well, if you can wait, there is a rumor that SQL Server 2008 will have an alternative storage mechanism for BLOBs. You might get a choice to store BLOBs in a file after all. Does the name WinFS ring a bell?
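
To make the chunk-by-chunk idea from option 2 a bit more concrete, here is a rough sketch in Python (not the ActiveX / WCF client itself) of a download loop that resumes from whatever is already on disk and then computes a hash you could report back to the server. Everything named here is an assumption for illustration: fetch_chunk stands in for whatever web service call returns a byte range of the file, and download_resumable and sha256_of_file are made-up helper names, not SharePoint APIs.

```python
import hashlib
import os
from typing import Callable

# Hypothetical transport: fetch_chunk(offset, length) returns up to `length`
# bytes of the remote file starting at `offset`, and raises IOError when the
# connection drops. In a real client this would wrap the web service call.
FetchChunk = Callable[[int, int], bytes]


def download_resumable(fetch_chunk: FetchChunk, dest_path: str, total_size: int,
                       chunk_size: int = 1024 * 1024, max_retries: int = 3) -> int:
    """Download to dest_path chunk by chunk, resuming from any partial file."""
    # Resume point: whatever a previous (failed) attempt already wrote to disk.
    offset = os.path.getsize(dest_path) if os.path.exists(dest_path) else 0
    if offset > total_size:
        raise ValueError("partial file is larger than the expected size")

    failures = 0
    with open(dest_path, "ab") as out:  # append, so earlier chunks are kept
        while offset < total_size:
            length = min(chunk_size, total_size - offset)
            try:
                data = fetch_chunk(offset, length)
            except IOError:
                failures += 1
                if failures > max_retries:
                    raise
                continue  # retry the same offset; nothing already written is lost
            if not data:
                raise IOError("no data returned before the end of the file")
            out.write(data)
            out.flush()
            offset += len(data)
    return offset


def sha256_of_file(path: str, block_size: int = 65536) -> str:
    """Hash the finished file in small blocks so it never sits fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Only one chunk is ever held in memory, and a failed run can simply be started again because the loop picks up at the current file size. The SHA-256 here is just one possible choice for the "custom hash"; the client would send it to the server so both sides can confirm the download completed intact.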

1 comment:

Alex said...

In regards to #3, SQL Server 2008 has a new thing called FILESTREAM which can be used to store BLOBs in the file system instead of the database.