My former days with IBM have always left me with a soft spot for hardware, and even today I still find myself looking out for the older Itanium servers that were once made (and subsequently abandoned). With one of my Supermicro servers starting to have issues and an upcoming project that needs more hardware, I figured it would be a good idea to look through eBay and see what was available that could leverage some of the components I already have.
As with most of my hardware, eBay is my typical source of server equipment. Examining some of the listings revealed x3650 M4 servers (M/T 7915) that were going remarkably cheap and looked quite promising. For less than £200 you can get a 2U server with two high-frequency Xeon CPUs (v1 rather than v2), 32GB of ECC RAM, both PSUs, both PCIe expansion boards, hardware RAID, two dual-port 8Gb Fibre Channel HBAs, and a dual-port Emulex 10Gb network adapter. For simple workloads it's a nice piece of kit (even if the power bill is higher than a more modern server's). Fan noise would likely be an issue (more on that later), but overall it fits my requirements. As there was another server from the same seller for under £100 (slightly lower specification and no CPU/RAM), I figured this would be good even as a spare (I already have the parts it needs). One purchase later and two IBM servers had arrived.
Once the servers had arrived and were unboxed they both needed some prep work before they could be used. The listing had indicated that both of them may be dusty, and in fairness one of them definitely was. Cleaning them doesn't take long (thankfully they are well engineered from a layout perspective) and with all of the dust removed the components could be changed/fitted. The server with the existing CPU/RAM fitted I decided to leave as-is, given I only have a requirement for one of the servers at the moment and this one would likely be a backup. For the barebones system I fitted the v2 Xeon CPUs I had in storage, along with the 256GB of ECC RAM from an earlier server I had decommissioned. Thankfully the parts were still functional and the usual POST completed without any related issues. I also ordered the missing PCIe expansion riser (in case it's needed at a later date), which turned out to be cheaper than expected (I did test the slots to ensure they all worked).
In my former IBM days, updating firmware on System x servers was somewhat of a breeze, as UpdateXpress (run from the internal media) made it easy. Sadly it isn't what it once was, and more often than not it's easier to simply search for the M/T and download the latest versions directly. As with most systems there is a specific order that needs to be followed, and these systems are no different.
Flashing the core system requires updating the Integrated Management Module first, followed by the UEFI, then finally the DSA. These upgrades are performed via the IMM itself (which makes life significantly easier) and are pretty foolproof. If something does go wrong, you can change the jumpers on the motherboard to load the backup IMM/UEFI firmware, allowing you to easily recover. It's worth noting that for the UEFI upgrade to be regarded as successful you need to restart the system and boot an OS, otherwise it will continuously show as 'pending'.
The onboard SAS controller was also due an update, which thankfully was easily performed using a CentOS 7 Live CD. For all of the OS-level updates the 'glibc.i686' package needs to be installed (as the update files are 32-bit); once that is in place you simply run each package with the '-s' option and it does its thing. You can also download the latest of all of the updates for the system (and any built-in/add-in adapter) and run them all in this fashion, saving some time. The RAID controller was updated in the same fashion, bringing it to the latest level. It is possible to flash these controllers into IT mode, however for my upcoming use-cases this isn't required.
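For anyone following along, the OS-level update routine boils down to a few commands. This is a minimal sketch assuming a CentOS 7 Live environment with network access; the update filename below is a placeholder, not an exact IBM package name.

```shell
# The IBM update executables are 32-bit, so the 32-bit C library is needed:
sudo yum install -y glibc.i686

# Placeholder filename -- substitute the package downloaded for your M/T.
chmod +x ./ibm_fw_example_update.bin

# '-s' runs the update unattended; repeat for each downloaded package.
sudo ./ibm_fw_example_update.bin -s
```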
By far the biggest challenge with these systems was the Emulex 10Gb adapters, as trying to upgrade their firmware was remarkably problematic and frustrating. The first hurdle was finding a compatible update, as every update I downloaded from Fix Central either didn't find any adapters or complained about missing libraries. Digging into this further allowed me to resolve the library issue, only to find that none of the listed update packages actually recognised the card. Using lspci and digging through the related Fix Central documentation still didn't give me a clear view of the exact model/chipset these cards were, with confirmation only coming once ESXi was installed (at which point the chipset was detailed in a friendly way).
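If you're attempting the same identification step, the numeric PCI IDs are usually a more reliable handle than the marketing name. A quick sketch, assuming a Linux live environment with pciutils installed:

```shell
# List matching devices with their numeric [vendor:device] IDs appended.
# The trailing [xxxx:yyyy] pair is what firmware documentation typically
# keys off, rather than the product name printed on the card.
lspci -nn | grep -i emulex
```

The ID pair can then be checked against the supported-device tables in the firmware package README files before downloading anything.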
Some more searching led me to the HS23 Fix Central page, where a 4.6 version of the firmware was listed and documented as being compatible with the x3650 servers. Using this package still resulted in a 'no devices found' message, however this was again down to OS compatibility and issues with the bundled library. Using the firmware flasher tool from the CentOS repo (and replacing its default device mapping table) allowed an upgrade to 4.6 to take place (progress finally being made).
While searching for a more up-to-date version of the firmware, and now armed with knowledge of the exact chipset in use, I took to the vendor pages to see what I could find. Some more digging took me to the Emulex (now Broadcom) support portal, where version 11.2.1153.23 of the firmware was available, packaged as a bootable ISO (OneConnect Flash ISO Image x86) to avoid any challenges/complexities with using your own. This method worked first time and updated the adapters to the latest version; for anyone looking at updating their 10Gb firmware in the future I highly recommend this approach to save time and frustration!
For reference, these are the firmware versions (and filenames where applicable) that I have used:
- IMM: 8.41
- UEFI: 3.30
- DSA: 9.54
- MPT2SAS: 1.20.02
- MPT3SAS: 1.16.10
- M5100: e36
- MR1215: 24.12.0-0024
- MR5200: 24.21.0-0151
- SAS Expander: 61c6
- Emulex 10Gb NIC: 11.2.1153.23
- QLogic 8Gb HBA: 4.10.05c
- QLogic NIC: 7.13b.4.1c
Old hardware, even enterprise-grade, is not without quirks that trip you up from time to time. A good example: on one of the servers, if you try to disable any of the onboard devices (be it SAS, Ethernet, or anything else) the system is no longer able to POST and will hang with no indication of why. Originally I thought it was one of the other UEFI settings I was configuring, however after changing each setting one by one it became apparent that disabling any of the onboard devices simply breaks it. Updating the firmware didn't resolve this, neither did resetting the UEFI settings to factory defaults, nor did changing the boot order to remove said device prior to making the change.
Installing ESXi is usually a simple task that takes only a few minutes at the end of system setup, however with these servers that turned out not to be the case. While ESXi would install (at both the 6.7 and 7.0 levels), updating 7.0 to the latest version would result in a boot failure. Some digging later (thankfully I remembered there being an issue with the UEFI loader some time ago) revealed that this is still an issue with the latest release on some hardware and does require a workaround (see here). Thankfully, using the UEFI loader from the 7.0 GA version allows the hypervisor to boot when running the latest version of ESXi. As a bonus, the ESXi status page for the server also shows the IBM logo :-)
As these servers are designed to run in a datacentre, the noise from the main fans is somewhat unpleasant. Thankfully, there is something that can be done about it, providing you are careful with the system thermals. Replacing the internal fans is in theory possible (a wiring diagram for the 6-pin connector can be found online), however they are double-depth units and changing them out may do more harm than good. As an alternative, you can configure the system to run in 'acoustic' mode, whereby the fans will run at the lowest level that keeps the system within thermal tolerances. An important note here is that this doesn't account for add-on drives and add-on PCIe cards, so care must be taken to ensure you don't cook your add-ons. On my servers (as they are using PCIe NVMe add-ons) specific vent holes in the chassis have been taped over to increase the positive pressure within, and to ensure the airflow is forced out via the specific paths I want (PSU, 10Gb Ethernet adapter, PCIe NVMe cards).
To run the system in acoustic mode I use ipmitool with the following commands:
- ipmitool raw 0x3a 0x07 0x01 0x00 0x01
- ipmitool raw 0x3a 0x07 0x02 0x00 0x01
Note: These commands also work under ESXi providing you install the ipmitool vib
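Given the thermal caveat above, it's worth sanity-checking the sensor readings after switching modes. A quick sketch using standard ipmitool subcommands; note that the add-in cards generally aren't covered by these sensors, so they still need checking separately.

```shell
# List all temperature sensors reported by the management module.
ipmitool sdr type Temperature

# Confirm the fans have actually dropped to their lower speeds.
ipmitool sdr type Fan
```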
One aspect of the systems that is still somewhat of a frustration is the noise from the PSU fan, which sadly (as is common with most 40mm × 40mm fans) has a whine to it due to the bearings having worn during its life. As the PSU fan is PWM (albeit with a different wiring pin-out than standard PWM fans) I attempted to switch to a Noctua fan. Unfortunately, while the fan did spin, the PSU detected that something wasn't right (likely the lower rotational speed) and so would power itself off after running a self-test.
I've now ordered replacement Delta fans from China which hopefully are new (there is a good chance they will be refurbished and have the same issue) and hopefully work (as it's impossible to find an exact match for the IBM P/N). If these replacements don't work then I will make a baffle assembly for the rear of the PSU to help absorb some of the noise.
At present the servers are powered off waiting for a future project to start (more on that soon), but are wired up and ready to go. While they won't get any further firmware updates (not impossible, but unlikely), they will get hypervisor updates for as long as ESXi supports the older Xeon CPUs.