Friday, September 12, 2008

Custom Parsers continued

It has been a while since I posted anything new about developing custom parsers.

The short of it is that it requires a substantial investment of time, and if at all possible I would stay away from it until the feature has matured. I suggest that you implement the same functionality using event handlers instead. Event handlers are well understood and don't require you to deploy a COM parser.

Since the parser is a COM object, it can't be deployed using a WSP solution. It requires an installer (MSI), because COM registration needs administrative rights to write to the registry.

The main problem we have run into is the "complete" lack of detailed API documentation.

I would suggest that you develop the parser in unmanaged code, or at least put a wrapper layer between the parser interfaces and your managed code. This layer shields you from the unfortunate decision to use LPCSTRs (ANSI strings) as input/output parameters of the parser.
The out parameters are especially troublesome: you can't use straightforward marshalling, because the memory is owned by the parser and becomes invalid every time you update a property.
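
To illustrate the ownership problem, here is a minimal portable C++ sketch. It is not the real parser interface: FakeParser and ReadPropertyCopy are hypothetical names made up for this example, and the fixed internal buffer only imitates memory that the parser owns and overwrites. The point is simply that the wrapper copies the string the moment it receives the pointer.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for a parser that hands out pointers into its own
// storage. The real parser interfaces are not reproduced here.
class FakeParser {
public:
    // Updating a property overwrites the storage that earlier pointers refer to.
    void SetProperty(const char* value) {
        assert(value != nullptr);
        std::strncpy(buffer_, value, sizeof(buffer_) - 1);
        buffer_[sizeof(buffer_) - 1] = '\0';
    }

    // Returns a pointer to parser-owned memory; the caller must not keep it.
    const char* GetProperty() const { return buffer_; }

private:
    char buffer_[64] = {0};
};

// Wrapper-layer helper (hypothetical): take an owned copy immediately, so the
// caller never depends on memory that the parser may overwrite later.
std::string ReadPropertyCopy(const FakeParser& parser) {
    const char* raw = parser.GetProperty();
    return raw != nullptr ? std::string(raw) : std::string();
}
```

With this in place, a caller that does `std::string title = ReadPropertyCopy(parser);` keeps a stable value, while a raw pointer taken from `GetProperty()` before a later `SetProperty()` call would see the new contents.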

I have no idea why MS didn't follow the de facto standard for memory ownership in COM and use BSTRs.

Your wrapper layer will return automation types to your managed code, so you can use regular marshalling.
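
As a rough sketch of the conversion step inside such a wrapper, the portable C++ below widens an ANSI string into an owned wide string. This is an assumption-laden stand-in: AnsiToWide is a hypothetical helper name, std::mbstowcs depends on the current C locale, and on Windows a real wrapper would more likely use MultiByteToWideChar and allocate the result as a BSTR with SysAllocStringLen.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical wrapper helper: turn a parser-owned ANSI string into an owned
// wide string. Portable stand-in for the BSTR conversion a Windows wrapper
// would perform; the result depends on the current C locale.
std::wstring AnsiToWide(const char* ansi) {
    if (ansi == nullptr) {
        return std::wstring();
    }

    // First call only measures how many wide characters are needed.
    const std::size_t needed = std::mbstowcs(nullptr, ansi, 0);
    if (needed == static_cast<std::size_t>(-1) || needed == 0) {
        return std::wstring();  // invalid multibyte sequence or empty input
    }

    // Second call converts into storage that the wrapper owns.
    std::wstring wide(needed, L'\0');
    const std::size_t written = std::mbstowcs(&wide[0], ansi, needed);
    assert(written == needed);
    (void)written;  // silence unused-variable warnings when asserts are disabled
    return wide;
}
```

Because the returned string owns its memory, it stays valid after the parser updates a property, and it is already in the wide-character form that automation types expect.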

You should also be aware that you can NOT write parsers for the most common image formats, such as jpg, gif, tiff and bmp.
This is because SharePoint handles these formats internally (I think to support picture libraries) and won't call your parser. You can't even "hook" the out-of-the-box parser the way you can with the Office parser, since the image parser is not an external COM object. It looks to me like it's an internal object.

Here are some resources that you can look into.

ZipParser a custom parser on CodePlex written in C++

A discussion thread on MSDN related to custom document parsers.

Good luck, and if you have more information related to custom parsers please share it here.

Thanks
/Jonas


