sdb:ldap (Linux “named” accessing OpenLDAP) is not thread-safe

We’ve been searching for the cause of sporadic name resolution errors for quite some time now. “Sporadic” here means incorrect name resolution and dying named processes once or twice in a couple of weeks, nothing you’d track down easily. We weren’t able to reproduce the error, but we got lucky: we hit (unknowingly, at the time) another symptom of the same root cause and were finally able to track things down.

Some background

For years, we’ve been running our “named” processes against LDAP, using the so-called “SDB:LDAP” interface. The DNS data is stored in a hierarchical LDAP tree and is updated both manually and automatically by our systems management automation processes. With SDB, every DNS lookup leads to a lookup in the data back-end, so there’s no need to “export” data into named zone files once the data has changed in the back-end storage. The DNS environment is spread across various servers, and each name server process has an LDAP server close by, avoiding SPOFs. Distribution of the data is handled by OpenLDAP’s replication mechanisms.

We’re using the SDB-LDAP implementation, which was the only suitable one at that time. It comes as “contributed software” and is not as well maintained as the BIND software itself, which didn’t look like a big problem.

The symptoms

We’ve noticed spurious false replies to our DNS queries, perhaps once or twice a month. And those were noticed only when, e.g., ssh reported that the server key had changed while contacting a certain host. That was obviously a false alert, as the immediate retry to set up the ssh session worked flawlessly and the key file was unchanged on the server (as was the known hosts file on the client).

Sometimes, the named process dropped dead without obvious cause.

Lately, we’ve received quite a few syslog entries from the named processes stating that the connection to the LDAP server had to be re-established. Checking the LDAP server manually didn’t reveal any such problems.

The solution

We had taken a few glimpses at a more modern LDAP integration (bind-dlz) for our named, and had seen that some (unstable) updates were available for SDB:LDAP, too. The change log pointed out that multi-threading issues had been fixed, which got our attention: quite obviously, our named processes were running multi-threaded (we checked that via “ps”), so we turned that off (or rather limited it to a single thread, via the “-n 1” command line option for named).
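The “ps” check mentioned above can also be scripted. The following is a minimal sketch, assuming a Linux system where /proc/<pid>/status contains a “Threads:” line; the function names are made up for this illustration. Note that even with “-n 1” the reported total may stay above one, as named can keep helper threads besides its worker threads (that part is an assumption worth verifying on your own system).

```python
import re


def thread_count(status_text: str) -> int:
    """Extract the number of threads from the text of /proc/<pid>/status."""
    match = re.search(r"^Threads:\s+(\d+)\s*$", status_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'Threads:' field found")
    return int(match.group(1))


def read_status(pid: int) -> str:
    """Read the status file of a running process (Linux only)."""
    with open(f"/proc/{pid}/status", encoding="ascii", errors="replace") as handle:
        return handle.read()


if __name__ == "__main__":
    import sys

    # Usage: python threads.py <pid of named>
    print(thread_count(read_status(int(sys.argv[1]))))
```

Comparing the number before and after restarting named with “-n 1” should show whether the option took effect.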

Ever since, we’ve had no more spurious reconnects to the LDAP server and the named process looks stable. The observation period is still a bit short, but I’m sure that we won’t see any more false resolutions in the coming weeks, either.

Case closed.

PS: No, not actually closed. We will certainly take a really close look at the newer back-end “bind-dlz” and its LDAP approach. The first “glimpse” made us feel a bit uneasy, but I’ll let you know how things went as soon as we have finished our tests.

Posted in Uncategorized

Google versus OpenCms’ static export

Hello everyone out there,

here’s a quick one.

I’ve noticed that Google wouldn’t index the pages that were recently published by our OpenCms. We had supplied a proper sitemap file (auto-generated by OpenCms), but Google’s Webmaster Tools quite prominently reported an error concerning robots.txt: “Google couldn’t crawl your site because we were unable to access the robots.txt file.”

“Fetch as Google”, also part of the Webmaster Tools, would properly display the page, and the httpd access log clearly showed that Googlebot was accessing robots.txt regularly.

But: This is an OpenCms installation. We wouldn’t want to generate static pages upon each access, and robots.txt clearly is a static page. We had activated static export for that page, too, which turned out to be the root cause of all this trouble.

OpenCms will respond to requests for such pages with an HTTP 302 code (“moved temporarily”), pointing the requester to e.g. http://your.site/export/sites/your.site/robots.txt. Google follows that redirection, as the logs and “Fetch as Google” prove.

Unfortunately, Google handles this as a case of a non-accessible page, which, in the case of the “robots.txt” file, will hold off all scans. (Google’s decision itself, not crawling a site if the owner’s intent cannot be clearly determined, is IMO correct. But we’re delivering a syntactically correct file, via an unambiguous redirect, so in my opinion Google should accept that file, too.)
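To see what a crawler gets on its very first request, it helps to fetch robots.txt without following redirects. Here is a minimal sketch using Python’s standard http.client, which does not follow redirects on its own. The function names and the wording of the returned descriptions are assumptions made for this illustration, and the sketch only reports what the server sends; it says nothing about how any particular crawler treats the answer.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def describe_robots_response(status: int, location: Optional[str]) -> str:
    """Turn a status code and Location header into a short description."""
    if status == 200:
        return "served directly"
    if status in REDIRECT_CODES:
        return f"redirected ({status}) to {location}"
    return f"unexpected status {status}"


def fetch_robots_status(site_url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Request /robots.txt once and return (status, Location header)."""
    parts = urlsplit(site_url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        conn.request("GET", "/robots.txt")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


if __name__ == "__main__":
    # "http://your.site" is the placeholder host name used in the text above.
    print(describe_robots_response(*fetch_robots_status("http://your.site")))
```

With static export active for robots.txt, one would expect the “redirected (302)” description; with the export turned off, “served directly”.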

I’ve since changed the “export” property of our robots.txt to “false”, and now everything is back in order. At least from Google’s point of view.

Posted in OpenCms

Adding video content to OpenCms sites

A recent content update to an OpenCms site I run kept me quite busy, not only with writing, reading, re-reading and re-re-reading the new content, but also with two technical issues.

Where we came from

The server in question has been up for some years, running both a standard corporate site and one with multimedia content (as in “pictures and video”). The multimedia site is truly old-school, with embedded video players and with video files in “Windows Media” format and as RealMedia files… no MPEG4, no Ogg yet. We’re still kind of proud of it: for one, we have created our own publishing workflow that presents multi-format video automatically, and we’ve kept the videos out of the OpenCms database by adding a “symlink”-like coupling from OpenCms to the RFS. (That way, we have the best of both worlds: easy storage of large files in the host file system, with all the metadata, features and standard mechanisms inside OpenCms.)

Of course we’ve put OpenCms behind an Apache web server installation, both to be able to remove the all too obvious “/opencms/opencms” prefix from the URIs, and to act as a cache that reduces the workload of the OpenCms process by directly serving static content that was exported by OpenCms on the first request. Remember, we have to serve videos… nothing you would want to pass through OpenCms on every request.

And where we needed to get to

But now, the corporate site needed a serious facelift. Additionally, video clips were to be added to that site, too, using modern technology, namely HTML5’s “<video>” tags.

It was not only the site that was in need of an update: the OpenCms installation was from the v7 series, still suffering from the “ever-growing database” bug, and was therefore to be upgraded to the current version 8.0.4. A short test installation (we moved a clone of the OpenCms installation to a test server and did an upgrade there) made us believe that everything was feasible; even the v7 code still worked like a charm.

Upgrading OpenCms

So, after weeks of preparation, we upgraded the system in two steps. First, we upgraded the OpenCms installation on the production server. Unfortunately, that turned out to be completely different from how it went on the test server: the upgrade wizard would not properly update the OpenCms config files as it had done during the tests (well, there was one time it stumbled during the tests, but we had fixed that in advance on the production server). All in all it behaved so differently that it took us the better part of three hours to be up and running again.

We let the system run for a few days with the old sites, to see if we missed anything during the update, but things seemed fine.

Adding new MIME types to OpenCms

With the new site content, because of the video stuff, we had to handle new video types within the sites: MPEG4 and Ogg. As far as I can tell, that’s the minimum list of formats you’ll want to have your videos available in, which (together with some details on the tool chain to create the videos) is worth its own article. But in the OpenCms context, one of the consequences was that OpenCms didn’t include any MIME definitions for those two video file types. At least v7 didn’t, and as the types were added before the upgrade, I cannot tell if v8 has support for these types out of the box.

Adding this to OpenCms is a “piece of cake”: Have a look at the list of defined MIME types in opencms-vfs.xml (located in OpenCms’s WEB-INF/config directory) – if the required entries are there, fine. Else: Add them. As I already wrote, we had to add them:

<mimetypes>
  <mimetype extension=".ogv" type="video/ogg"/>
  <mimetype extension=".mp4" type="video/mp4"/>
</mimetypes>

After an OpenCms restart, everything looked fine. Accessing the not yet published site in OpenCms’s workplace and selecting the proper HTML page with the <video> tag played the video in both Firefox and Internet Explorer. Now that seemed easy!

Publishing the new content

Step two of our operation was publishing the new site content. Usually, this is so straightforward that I wouldn’t even dare to mention it in an article. But this time, as with the OpenCms update, some unexpected results were to be seen.

Publishing itself was actually easy, even though we had put a few obstacles in our own way. (Note to myself: Never create completely new site versions in sub-directories below the current site… I wish OpenCms would better handle the case where you delete a file or directory (like when creating a directory “/old” and moving all old content there) without publishing this, and then later re-create the resource for new content. OpenCms cannot do this currently. You have to publish the deletion before creating the new instance of the directory, which means that site guests will stumble over some probably very strange 404s… index.html comes to mind.)

Just as I had tested the new video stuff prior to publishing, I tested those pages right after publishing. Everything seemed to work nicely, as did the rest of the site.

Trouble’s on the way

But the next day, the problem reports started rolling in: Users reported that they couldn’t get the videos displayed, but received error messages instead.

One set of messages was easily explained: The browsers were too old to support the HTML <video> tag.

But even users with current versions of Internet Explorer, Firefox, Google Chrome and so on reported that displaying the videos did not work. Sometimes Firefox displayed a message in place of the video, sometimes only a grey “X” showed up. The message indicated that the video file could not be transferred or that the MIME type wasn’t supported.

First guess was some transfer problem, probably the export from OpenCms. But that worked fine. Displaying the video from the editor UI (“workplace”) worked, too. Using “wget” to fetch the video from the production server: Check.

But wait… what’s that? “wget” reported the MIME type of the video file as “text/plain”? Didn’t our OpenCms customization catch? Were our manually added entries lost during the upgrade? No, everything in place.

In the end, the culprit was identified as being… Linux! The Linux server hosting all this didn’t have the MIME entries for either MPEG4 or Ogg in /etc/mime.types (openSUSE 12.1 has at least an entry for the extension .ogg in there, but not for mp4), and that file is used by Apache’s httpd as its source of MIME information. This led to the following scenarios:

  • Using the OpenCms “workplace”, the files are always served by OpenCms directly, bypassing the httpd “caching”. OpenCms was configured correctly, so the files were handed out with the proper MIME type.
  • The test invocations after publishing were the first calls for the files, therefore the files were requested from OpenCms again, with copies going into the cache… MIME type ok again.
  • Later calls for the video files were served by httpd, directly accessing the files in the “export cache” file system – and as httpd didn’t have the proper MIME types, the files were sent as “text/plain” – boom.
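The difference between the first and the later requests can be observed from the outside by fetching the same URL more than once and comparing the Content-Type header each time, much like the “wget” check did. This is a minimal sketch with Python’s urllib; the function names are made up for this illustration, and urllib follows redirects by itself, so the reported type belongs to the final response.

```python
import urllib.request
from typing import List


def media_type(content_type_header: str) -> str:
    """Strip parameters such as '; charset=...' and normalise the case."""
    return content_type_header.split(";", 1)[0].strip().lower()


def served_media_types(url: str, attempts: int = 2, timeout: float = 10.0) -> List[str]:
    """Request the URL several times and collect the media type of each reply."""
    seen = []
    for _ in range(attempts):
        # Only the headers are needed; the body is not read before closing.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            seen.append(media_type(response.headers.get("Content-Type", "")))
    return seen


if __name__ == "__main__":
    import sys

    # Usage: python mediatype.py <URL of a published video file>
    print(served_media_types(sys.argv[1]))
```

In the situation described above, a result where the entries differ, or where a video URL comes back as “text/plain”, would point to the web server in front rather than to OpenCms.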

As a side note: It didn’t help at all that opening the file via Firefox’s video player context menu “Play video” opened the file successfully in an external video player; it rather added to the confusion. This works because the video file itself is perfectly fine. The MIME type isn’t handed to the external application, and the detection is done by looking at the file content.

Overall conclusion for everyone running OpenCms with “static export” and web server software in front: If you add custom MIME types to OpenCms, make sure your httpd supports these types as well.
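One way to check this up front is to compare the types added to OpenCms with the contents of the mime.types file that httpd reads. The following is a minimal sketch, assuming the usual mime.types layout (a MIME type followed by whitespace-separated extensions, with “#” starting a comment); the function names and the sample path are assumptions for illustration only.

```python
from typing import Dict


def extensions_in_mime_types(text: str) -> Dict[str, str]:
    """Map each extension listed in mime.types content to its MIME type."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        mime_type, *extensions = line.split()
        for extension in extensions:
            mapping[extension.lower()] = mime_type
    return mapping


def missing_types(text: str, required: Dict[str, str]) -> Dict[str, str]:
    """Return the required entries that are absent or mapped differently."""
    known = extensions_in_mime_types(text)
    return {
        extension: mime_type
        for extension, mime_type in required.items()
        if known.get(extension.lstrip(".").lower()) != mime_type
    }


if __name__ == "__main__":
    # The two types added to opencms-vfs.xml in this article.
    required = {".ogv": "video/ogg", ".mp4": "video/mp4"}
    with open("/etc/mime.types", encoding="utf-8", errors="replace") as handle:
        print(missing_types(handle.read(), required))
```

Anything the script reports as missing could then be added to the mime.types file, or declared in the httpd configuration with Apache’s AddType directive.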

Posted in Linux, OpenCms