Multiple IEEE 802.3ad (LACP) trunks with Linux

I had come to the conclusion that whenever something should be configurable, it already is with Linux. Unfortunately, that’s not true in every case, as I had to find out with a new network topology.

We’ve set up a new (small) site backbone consisting of two SMC8926EM switches, stacked via a 10 Gbit/s link. That way we could form a redundant collapsed backbone with LACP support. Both the end-user switches and many of the servers are connected to that stack with at least two 1 Gbit/s links, and the switch setup was mostly nice and easy.

We configured local keys per LACP trunk within the central switch and, for the uplinks, within the end-user switches. Almost like plug&play, piece of cake.

interface ethernet 1/24
[…]
lacp
lacp actor admin-key 11

 

interface ethernet 2/24
[…]
lacp
lacp actor admin-key 11

To avoid getting all links of the LACP-attached servers joined into a single trunk, we had to go a bit further: as we found no way to set the LACP key on the Linux side of the links, we had to rely on the switches’ ability to set that key on behalf of the servers.

interface ethernet 1/3
[…]
lacp
lacp actor admin-key 2
lacp partner admin-key 2

 

interface ethernet 2/3
[…]

lacp
lacp actor admin-key 2
lacp partner admin-key 2
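
The effect of those per-server keys can be sketched with a toy model. This is only an illustration of the behaviour described above, not the full 802.3ad selection logic (which also compares system IDs and other fields). The ports 1/4 and 2/4, the key 3 and the shared default key 1 are assumptions made up for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Port(NamedTuple):
    name: str         # switch port, e.g. "1/3"
    actor_key: int    # key configured for the switch side of the link
    partner_key: int  # key the switch assumes for the device at the other end


def group_into_trunks(ports):
    """Toy model: ports that share both keys end up in the same trunk."""
    trunks = defaultdict(list)
    for port in ports:
        trunks[(port.actor_key, port.partner_key)].append(port.name)
    return dict(trunks)


# Hypothetical default key shared by every server-facing port.
DEFAULT_KEY = 1

without_keys = [
    Port("1/3", DEFAULT_KEY, DEFAULT_KEY),
    Port("2/3", DEFAULT_KEY, DEFAULT_KEY),
    Port("1/4", DEFAULT_KEY, DEFAULT_KEY),
    Port("2/4", DEFAULT_KEY, DEFAULT_KEY),
]
with_keys = [
    Port("1/3", 2, 2), Port("2/3", 2, 2),  # first server, as configured above
    Port("1/4", 3, 3), Port("2/4", 3, 3),  # hypothetical second server
]

print(group_into_trunks(without_keys))  # all four ports in one trunk
print(group_into_trunks(with_keys))     # one trunk per server
```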

Why can’t we configure that key on the Linux end of the link? And how would we create two separate LACP trunks between two Linux servers without a switch in between?

Posted in Ethernet, Linux

DSS, NPIV and SCST – pitfalls, dug deep

We’ve been having trouble with our SAN server for years, up to and including severe corruption of disk contents. We’ve been using a pre-packaged software product called DSS, and the problem persisted up to the current release.

Maybe our setup isn’t like that of most other DSS users, and we’re pushing the software to higher limits – but all well within the defined functionality (which is, by the way, quite nice: I’d still recommend DSS for the general user, despite the problems that led to this article). But since the company that created DSS was unable to help us, we’ve taken things into our own hands. They tried, and I don’t want to leave the impression that their support is unresponsive, unfriendly or poor – it’s just that the company seems to be out of resources in the Fiber Channel area and resorted to simply re-packaging an open-source component there.

We’re running a set of SLES11 servers as a Xen cluster, using Fiber Channel storage as virtual disks. In preparation for a future feature of Xen (being able to use virtual FC adapters inside the VMs) we gave each VM a virtual HBA address, a feature already present in the Fiber Channel protocol and named NPIV. Simply speaking, we’re creating virtual Fiber Channel adapters on our Xen servers, one per VM. The disk resources of each VM are accessed via its virtual HBA, and since the virtual HBA gets created during startup of the VM (and destroyed after shutting down the VM), only the disks of active VMs are attached to a Xen server, leaving fewer chances to accidentally corrupt them by parallel access.
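
For illustration, a Linux host can create and remove such a vHBA through the `vport_create` and `vport_delete` files of the FC transport class in sysfs. Whether the Xen setup described here does it exactly this way is an assumption, and the host name and WWNs used with these helpers are hypothetical. A minimal sketch:

```python
from pathlib import Path

FC_HOST_ROOT = Path("/sys/class/fc_host")
HEX_DIGITS = set("0123456789abcdefABCDEF")


def vport_spec(wwpn, wwnn):
    """Return the 'wwpn:wwnn' string written to vport_create/vport_delete."""
    for value in (wwpn, wwnn):
        if len(value) != 16 or not set(value) <= HEX_DIGITS:
            raise ValueError(f"not a 16-digit hex WWN: {value!r}")
    return f"{wwpn.lower()}:{wwnn.lower()}"


def create_vport(host, wwpn, wwnn, root=FC_HOST_ROOT):
    """Create a vHBA on the physical HBA `host` (e.g. when a VM starts)."""
    (root / host / "vport_create").write_text(vport_spec(wwpn, wwnn))


def delete_vport(host, wwpn, wwnn, root=FC_HOST_ROOT):
    """Remove the vHBA again (e.g. after the VM has shut down)."""
    (root / host / "vport_delete").write_text(vport_spec(wwpn, wwnn))
```

Calling `create_vport("host5", "2101001b32a90002", "2100001b32a90002")` needs root privileges and an NPIV-capable adapter. The `root` parameter exists only so the helpers can be tried against a scratch directory.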

What happened is that we accidentally corrupted those disks simply by moving VMs between cluster nodes.

After a lot of digging we found out what happens and why: DSS uses the code of the SCST open-source project to provide iSCSI and Fiber Channel target support. And in the piece of code that provides the target services for our hardware setup (“qla2x00t”), there is a bug – a serious one. We have since provided a patch, which is included in the head of development but, for instance, not in the 2.2 release that was put out just a few days ago. The target milestone is said to be version 3.0, so be warned: if you hit the following scenario, you’re running into trouble.

From the SCST target’s point of view, NPIV adapters (“vHBA”) on any initiator are “slots in a table”, one table per real initiator. The initiator, in turn, simply keeps a corresponding table, and when you destroy one vHBA and create another, the new one inherits the “slot” of the former. Typically it has a WWPN different from that of the old virtual adapter, so it can be distinguished – but without the mentioned patch, the SCST target won’t bother!

So without the patched version, you’ll run into one of two nasty situations:

  1. You destroy vHBA A and create vHBA B on the same physical initiator, without creating vHBA A on another node. SCST still thinks it is talking to vHBA A and serves the virtual disks defined for the WWPN of A to the server – which is running the VM that expects the disk of vHBA B. You’ll get surprising results, just like you get when you boot up a physical machine with the wrong disk installed.
  2. You destroy vHBA A and create vHBA B on a physical initiator, and create vHBA A on a second node. This is a common case when you migrate a VM from node 1 to node 2 and then start another VM on node 1.
    What happens in addition to the first case is that the migrated VM on node 2 gets its own virtual disks – the same disk space that VM B on node 1 accesses. Two VMs live on one disk: yes, you’re in trouble. This is the *really* nasty case. And we’ve had more than a few of these, until we found out what’s going on.
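
The second case can be played through with a toy model of the slot table. This is not SCST code: the class, the slot numbering and the names `wwpn-A`, `disk-A` and so on are made up for illustration, and the "patched" behaviour is reduced to "re-check the WWPN on every login".

```python
class Target:
    """Toy model of a target's per-initiator vHBA slot table."""

    def __init__(self, disks_by_wwpn, patched):
        self.disks_by_wwpn = disks_by_wwpn  # WWPN -> disk exported to it
        self.patched = patched
        self.slots = {}  # (initiator, slot) -> WWPN the target believes in

    def login(self, initiator, slot, wwpn):
        key = (initiator, slot)
        if self.patched or key not in self.slots:
            self.slots[key] = wwpn
        # Unpatched: a reused slot silently keeps the stale WWPN.

    def disk_for(self, initiator, slot):
        return self.disks_by_wwpn[self.slots[(initiator, slot)]]


def migrate_then_start(patched):
    target = Target({"wwpn-A": "disk-A", "wwpn-B": "disk-B"}, patched)
    target.login("node1", 0, "wwpn-A")  # VM A starts on node 1
    target.login("node2", 0, "wwpn-A")  # VM A is migrated to node 2
    target.login("node1", 0, "wwpn-B")  # VM B starts on node 1, reusing slot 0
    return target.disk_for("node1", 0), target.disk_for("node2", 0)


print("unpatched:", migrate_then_start(patched=False))  # both nodes on disk-A
print("patched:  ", migrate_then_start(patched=True))   # node 1 on disk-B
```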

Unfortunately, the company providing DSS has not switched to the patched version to this day, so we’re on our own. Fortunately, we’re skilled enough to be on our own – and luckily our “we want open source software” attitude (and I’m not talking about the “free beer” notion) has paid off again. Had it been closed source, we’d have had no chance to even dig down to the root cause.

Posted in DSS, Fiber Channel, NPIV, SCST

WordPress multi-site (aka network) installation

Well, what’s more obvious for a new WordPress blog on IT systems woes than reporting on its installation?

The base installation itself went fairly smoothly, all done on an existing Linux server:

  • downloaded the latest WordPress zip file
  • unpacked the content to a fresh directory
  • created a new virtual host for the Apache web server pointing to the new directory
  • set up a new MySQL database, user and the corresponding grants
  • added a .htaccess file to the directory, offering some basic protection while the blog is under test
  • created the required DNS entry
  • reloaded the web server

…and off it went. I opened the installation URL and everything went smoothly.

Of course, I already knew that I’d go for the multi-blog setup (via sub-domains) and had read all about it in http://codex.wordpress.org/Create_A_Network, so I immediately tried to get that operational, too:

  • created the corresponding wildcard DNS entry
  • added the ServerAlias entry to the Apache vhost definition
  • edited $SITEROOT/wp-config.php to add WP_ALLOW_MULTISITE
  • ran the setup under “Tools” – “Network setup”
  • received the changes to .htaccess and wp-config.php and manually typed them in

But logging in to the site just resulted in a big fat message: “Error establishing a database connection”. Bang!
After re-running all of the above (with the same result… don’t we all love reproducible problems) and searching the net up & down, I spotted a post about database repair (called via http://…/wp-admin/maint/repair.php), which yielded messages about missing tables (like “Table ‘wordpress.wp_1_posts’ doesn’t exist”). Searching for that message, I finally came across a minor comment in a forum thread about mistaking “MULTISITE” for “WP_ALLOW_MULTISITE”… and yes, having had to make the changes to wp-config.php manually, I had made the same mistake. Once I added “define( ‘MULTISITE’, true );” as requested by the setup tool, everything was operational again.
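
A small check would have caught that mix-up right away. The following is a minimal sketch, not a PHP parser: it assumes simple `define( 'NAME', value );` lines with straight quotes and does not recognise commented-out code. The two sample configurations are made up for illustration.

```python
import re

# Matches define( 'NAME', value ) with single or double quotes around NAME.
DEFINE_RE = re.compile(
    r"""define\s*\(\s*['"](?P<name>\w+)['"]\s*,\s*(?P<value>[^)]+?)\s*\)""",
    re.IGNORECASE,
)


def defined_constants(php_source):
    """Return a dict of constant name -> raw value text (last define wins)."""
    return {m.group("name"): m.group("value") for m in DEFINE_RE.finditer(php_source)}


def multisite_enabled(php_source):
    """True only if MULTISITE itself is defined as true."""
    return defined_constants(php_source).get("MULTISITE", "").lower() == "true"


# WP_ALLOW_MULTISITE typed a second time where MULTISITE was meant.
mistyped = """
define( 'WP_ALLOW_MULTISITE', true );
define( 'WP_ALLOW_MULTISITE', true );
"""
corrected = """
define( 'WP_ALLOW_MULTISITE', true );
define( 'MULTISITE', true );
"""

print(multisite_enabled(mistyped))   # False
print(multisite_enabled(corrected))  # True
```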

Had I used cut&paste (which was not available to me where I was doing my work at the time), I’d have had it up & running in less than a quarter of an hour. Of course, that’s only for the initial installation and setting up blogs “out of the box” – but nevertheless this is amazingly easy and quick: A big “thank you” to all the developers working on WordPress.org!

Posted in WordPress