Wednesday, March 21, 2007

Custom Protocol Handler vs BDC

I get this question a lot: if I have the Enterprise edition of SharePoint 2007, I can index any repository with the Business Data Catalog (BDC), so why should I care about protocol handlers? Writing a protocol handler is a pain in the neck anyway; even experienced C++ developers find it hard.

So the simple answer is that the BDC was designed for knowledge workers (whom Microsoft considers pseudo-programmers) to extract business information from line-of-business applications using a declarative language (XML) as opposed to a regular programming language.

Since the target audience of the BDC is knowledge workers and not programmers, it's NOT designed to scale or to handle complex scenarios. It's very much a black-box design. You don't really know what happens behind closed doors (and you probably don't care if you are not a programmer).

This is what you don't get with the BDC:

  • You have no control over which IFilters are applied. Agreed, you don't really care if it is just a matter of text extraction from binary documents, but if you want to emit custom metadata or links, you need good control over how they are handled.
  • You cannot do any kind of custom throttling or optimization. Everything is a black box. SharePoint provides some throttling options per content source, but that's as far as you can go.
  • You have no way to do any network optimization, custom error handling, or debugging. If your document is not indexed, you will see it in the gatherer log, but only with a standard error message.
  • You cannot emit custom security ACLs for your documents/items through the BDC.
  • You cannot control incremental indexing, new items, or deleted items. You can certainly specify some of this in the BDC definition, but it is nothing compared to what a protocol handler lets you do.
  • Finally, putting "PH developer" on your resume is totally different from putting "BDC developer" :)
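
To make the incremental indexing point a bit more concrete, here is a minimal sketch of the kind of decision a custom crawler gets to make for itself: compare the snapshot from the last crawl with the current state of the repository and work out what to add, re-index, and delete. This is plain Python for illustration only, not the protocol handler API (real protocol handlers are COM components, typically written in C++). The `Item` type, the `modified` stamp, and `plan_incremental_crawl` are all hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One record in the source repository (hypothetical shape)."""
    item_id: str
    modified: int  # last-modified stamp reported by the source system


def plan_incremental_crawl(previous, current):
    """Compare the last crawl's snapshot with the repository's current state.

    Both arguments map item_id -> Item. Returns three sorted lists of ids:
    items to add, items to re-index, and items to remove from the index.
    """
    added = sorted(i for i in current if i not in previous)
    deleted = sorted(i for i in previous if i not in current)
    changed = sorted(
        i for i in current
        if i in previous and current[i].modified > previous[i].modified
    )
    return added, changed, deleted


# Snapshot remembered from the last crawl vs. what the repository holds now.
previous = {
    "a": Item("a", 1),
    "b": Item("b", 1),
    "c": Item("c", 1),
}
current = {
    "a": Item("a", 1),  # untouched, so it is skipped
    "b": Item("b", 2),  # modified since the last crawl
    "d": Item("d", 1),  # new item
}

added, changed, deleted = plan_incremental_crawl(previous, current)
print("add:", added)
print("re-index:", changed)
print("delete:", deleted)
```

With your own handler, you decide what counts as "changed" (a timestamp, a version number, a hash) and how deletions are detected, instead of accepting whatever the black box does.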

1 comment:

Christopher said...

As someone who has PH on their resume multiple times, I can testify to how hard it is. The BDC PH is too limited, not at all capable of indexing an enterprise system with millions of records. We have a flexible PH that can crawl any source and honor security. It can even combine data from multiple systems to create a more complete searchable record, such as a document in a DMS cross-referenced with client information from the CRM.
Christopher Even