Government compliance guidelines and corporate retention policies are always changing, and rarely in a way that makes things easier for IT staff. Long-forgotten documents saved in obsolete formats can suddenly need to be formally archived or made accessible to staff, often with very little notice. Once that happens, the ‘archive’ directory on the old file server in the corner can quickly turn into a nightmare: “Does anyone know where the IBM DisplayWrite install disks went?”
Both the Muhimbi Converter for SharePoint and the Muhimbi Converter API and Server Platform natively convert many different document types, and more can easily be added through our support for 3rd party converters. Our Office format converters can handle many older file formats, but the truly obsolete ones are no longer supported: the last version of MS-Word that included a DisplayWrite converter was MS-Word 2000.
Adding 3rd party converters is a great way to support your legacy formats while still being able to use our Converter’s latest features. The sky is really the limit: you could create a workflow that converts, watermarks, copies metadata, and merges a brand new InfoPath form with old dBase database tables, and then, without any modification, do the same for an AutoCAD file and a DisplayWrite document!
If you’re using the Muhimbi PDF Converter Services, your own code does not require any modification to let users submit any of the listed file types through the same process. Once integrated with Muhimbi’s Converter, these formats can be manipulated like any other. Who would ever have thought that you’d be able to convert file formats that are 40 years old through a modern web services based API from Java, C#, Ruby and PHP?
As these formats are obsolete and not in high demand, we didn’t hold out much hope of finding a relatively modern converter for them. Luckily, we were wrong, and a bit of searching led us to Advanced Computer Innovations Inc. and their FileMerlin product. FileMerlin supports conversion of a dizzying array of file formats, and some of them deserve special attention.
Database Conversion
The Muhimbi Converter does not support conversion of Microsoft Access (or other database formats), as achieving the fidelity we demand from our own conversions would require considerable effort for something that is not in very high demand. FileMerlin provides good conversion of MS Access, dBase, FoxPro, and Microsoft Jet, along with other database formats. Trying to convert the look and feel of a database to a file obviously doesn’t work, but for extracting table data it does quite well.
Word Processor Conversion
By far the largest selection of conversion options is in the word processor arena. FileMerlin can convert formats such as the aforementioned DisplayWrite, as well as WordPerfect, WordStar, and Lotus Manuscript, among many others. Again, the conversion fidelity is not perfect, but it is good and, most importantly, you do not need to track down these old applications in order to open the files and save them in a more accessible format.
Microsoft Office Conversion
FileMerlin supports the conversion of the most common MS Office file formats (Word, Excel, PowerPoint). Since these conversions do not use MS Office in the background, as Muhimbi’s converters do, their fidelity is not as good, and newer Office formats can cause some problems as well. That being said, in an environment where installing MS Office to support conversions is not an option, this could be a good solution.
So, how do you get this working? The process is pretty much the same as for our other 3rd party converters. Starting with the assumption that the Converter for SharePoint or Converter Services has already been installed, the following steps need to be carried out:
- Download and install FileMerlin.
- Modify the ‘Muhimbi.DocumentConverter.Service.exe.config’ file as described here and add the following entry to the <MuhimbiDocumentConverters> section. This tells the Converter which file types can be converted to PDF. If you installed the 3rd party software in a different path, please update the content of the parameter attribute.
In the above example, we added the DisplayWrite (RFT) to PDF converter. In this next one, we will add another entry to allow MS Access conversion to HTML.
If you want to convert other formats as well, you will need to change the supportedExtensions and/or the supportedOutputFormats entries to include the file type extensions you want to convert. You will also need to change the sfrm and dfrm fields. FileMerlin supports auto detection of file types, but it is much more reliable to specify them manually and simply add a separate 3rd party converter entry for each type.
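Before deciding which converter entries to add, it can help to know which file extensions actually occur in the archive. The following is a minimal sketch in Python that uses only the standard library. It does not call Muhimbi or FileMerlin in any way, and the function names `count_extensions` and `report` are our own illustrative choices, not part of either product.

```python
from collections import Counter
from pathlib import Path


def count_extensions(root):
    """Count files under root (recursively) by lowercase extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Path.suffix includes the leading dot, so strip it.
            # Files without an extension are counted under "".
            counts[path.suffix.lower().lstrip(".")] += 1
    return counts


def report(counts):
    """Return one 'extension: count' line per extension, most common first."""
    return [f"{ext or '(none)'}: {n}" for ext, n in counts.most_common()]
```

Calling `report(count_extensions(...))` with the path to your archive directory gives a quick list of the extensions that are candidates for the supportedExtensions attribute. Bear in mind that an extension is only a hint: legacy files are often misnamed, so spot check a few files of each type before adding an entry.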
Normally, we’d add some before and after samples of something like a DisplayWrite file conversion, but the boiler for our DisplayWriter terminal is out of coal, so we’ll have to stick with something a bit more current: an MS-Word conversion to PDF. The image on the right is the document as displayed in MS-Word, and the one on the left is the PDF converted by FileMerlin as seen in a PDF viewer. You’ll notice that the fidelity is not as good as that of our native converter, but it should give a good idea of the conversions that can be expected from this tool.