My most and least favourite system that I have ever had to administer involved an embedded copy of OpenOffice.org (originally a Solaris SPARC binary of 3.1, later the Ubuntu 10.04 package of 3.2), started from a cgi-bin Perl script as needed, to convert MS Word .doc files to PDFs.
This is because MS Word .doc is such an utter shower (specification PDF, 563 pages) that it takes something approximately the size and complexity of MS Office to deal with it. And OOo is certainly that.
(We thankfully talked them out of making us install embedded MS Office instead, by offering to charge them the cost of a Windows sysadmin to run it. Never say no!)
The .doc files were end users’ edited versions of deeply defective source .rtf files generated from XML and XSLT by Apache FrontOffice, that wouldn’t open properly in anything else but MS Word. I wish I still had some, they’d be great bug fodder to submit to LibreOffice.
Even though this system was released in 2009, it never did manage to cope with MS Word 2007 .docx files.
It also kept said .doc files in a Subversion repo. Which was not operated using svn binaries, but twiddled by the relevant Java libs.
The cgi-bin Perl script was because the “generate PDF” functionality required a process fork, which hit a bug in Solaris SPARC Java 5 and 6 that tried to make a copy of the entire Tomcat process in physical RAM, which of course ran out of memory and failed. OOo won’t run if the user’s home directory isn’t writable, so on Ubuntu this required “sudo chown www-data /var/www“. Thankfully we weren’t using that as the webroot.
When I asked the (very good) senior developer about these design decisions, the pained expression on his face as he detailed why all these terrible ideas were the least-worst options was quite exquisite.
The purpose of a specification is to nail down the vague hopes and dreams of the business unit that’s just seen a fat, juicy market opportunity. In this case, they wanted a magical flying unicorn pony that ejaculated rainbows. The spec went into quite some detail about the desired properties of the wing feathers.
What they ended up with, of course, was a retired seaside donkey that had been tarred and feathered, a cornetto stuck on its head and fed food colouring and laxatives.
One part of the original specification of the system was to take updated versions of the source .rtf files and merge them with the user’s free-form .doc files such that the end users’ annotations would be preserved and stay correct. That is, the spec implicitly required the implementation of strong artificial intelligence. We told them that bit would have to be implemented later.
I quite delighted in deleting all traces of that system personally when, after five years’ soaking up money for two institutional customers ever, it was finally decommissioned.