Wednesday, June 20, 2018

A horrible waste of time

Incredibly frustrating day. I got a Xeon Phi back from a buyer who bought it and had no idea what it was or how to use it. eBay has fairly awful seller protection: if the buyer wants to return something without paying for shipping, all they have to do is select "item wasn't as described", and the seller is instantly slammed with the shipping costs. It could come back in pieces, and they still have to issue a refund. Luckily, the Phi seemed fine, but I needed to test it. 12+ hours of hell later...

Turns out that the Intel MPSS that built fine on CentOS 7.4 does not build at all on CentOS 7.5. Someone posted a patch on the Intel forums, but I tried making the source code modifications and it didn't work. There goes 1.5 hours. Intel MPSS comes with RPMs for CentOS 7.3, so I thought I'd try that. I then tried creating a bootable CentOS 7.3 drive, and the real awfulness started.

I've made about 10 bootable CentOS drives over the past year and have never had this much trouble. All you have to do is pop a USB drive in and use the dd command to copy the ISO over. I usually had to do it twice due to some sort of bug, but fine. This time I could not get the installer to run on my workstation no matter what. I tried UEFI with and without basic graphics mode, legacy with and without basic graphics mode, and two different graphics cards. The screen always blanked out immediately after I selected the install option. I tried two different USB drives and probably wrote the ISO about 8 times. I tried overwriting the first X MB with zeros using dd. Nope, doesn't help. I tried booting the drive with my laptop: no problem, the installer starts right up. What the fuck. So I took everything nonessential out of my workstation and tried again: nothing.

Nuclear time. I used diskpart on my laptop's Windows install to delete the partition on one of the USB drives, clean it, then do a slow FAT32 format to make sure everything was wiped. While that was running, I pulled the CMOS battery out of the desktop, put it back in, and reset the BIOS settings to defaults. Since the workstation was out of commission (I had pulled the NVMe drive and the RAID array, which is now fucked, so it won't boot until I go into rescue mode and edit fstab), I downloaded the CentOS 7.3 ISO on Windows (25 minutes later...), checked the SHA-1 checksum (Windows has a built-in utility called certUtil), and used Rufus to put it on the now cleaned and formatted USB drive (which took an hour).

I tested it with my laptop, no problems, then tried the workstation again, but it still failed to load the installer. There must be something seriously wrong with my workstation. I tried reinstalling the BIOS, didn't help. Tried re-writing the drive again, nope. Tried using a known working Ubuntu drive: the first two times the boot hung, the third time the workstation reset after I selected install, and the fourth time the install finally started correctly...ok. Tried the CentOS drive again. Nope, failed. Tried the minimal ISO...still failed. I'm going to try the CentOS 7.4 DVD, which I have successfully used on this workstation before. If that doesn't work, I'm not sure what else to do. I've tried everything I can think of.
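For anyone who wants the checksum step without certUtil or sha1sum, here is a minimal Python sketch of the same check. It is not what I ran at the time. The function names are made up for illustration, and the published checksum has to come from the mirror's checksum file.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks so a
    multi-gigabyte ISO never has to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path, published_hex, algorithm="sha1"):
    """Compare a file's digest against the checksum published by the mirror.
    (Hypothetical helper; the published value is whatever the checksum file says.)"""
    return file_digest(path, algorithm) == published_hex.strip().lower()
```

Passing algorithm="sha256" works the same way if the mirror publishes SHA-256 sums instead. A matching checksum only says the download is good; it says nothing about whether the USB write worked.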

Maybe it's some sort of BIOS incompatibility. The BIOS I'm using is from Jan. 2018, while CentOS 7.3 was released in Dec. 2016. 7.4 is from Sept. 2017. I downloaded the 7.4 DVD ISO and used Rufus to put it on the drive. And it actually booted the installer! Holy shit. It must be some sort of BIOS incompatibility. I've never heard of such a thing, but that's the only thing it could be. However, the drive failed the installer's self test, so I re-did the Rufus dd thing. I've had to write drives twice before, so this wasn't surprising. However, that failed too. I think it has to be done with Linux dd twice to work right. So I booted into my laptop's Ubuntu and dd'd the ISO to the drive twice (there goes another 40 minutes). And this worked. No problems installing.
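Instead of waiting for the installer's self test to say a write went bad, the drive can be read back and compared against the ISO. This is a rough sketch of that idea, not something I used here. The function name is hypothetical, the device path is whatever the USB drive shows up as on your system, and reading a raw device on Linux normally needs root.

```python
def image_matches_device(iso_path, device_path, chunk_size=4 * 1024 * 1024):
    """Return True if the first len(ISO) bytes of the device equal the ISO.

    The drive is usually bigger than the image, so only the region the
    image occupies is compared; anything after it is ignored.
    """
    with open(iso_path, "rb") as iso, open(device_path, "rb") as dev:
        while True:
            expected = iso.read(chunk_size)
            if not expected:
                # Reached the end of the ISO without finding a mismatch.
                return True
            actual = dev.read(len(expected))
            if actual != expected:
                # Covers both differing bytes and a device shorter than the ISO.
                return False
```

One caveat: right after a write, the kernel may serve the read from its cache rather than from the drive, so unplugging and re-plugging the drive before checking is probably the safer way to do it.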

I can't find anything in the google-verse about a (modern) BIOS and a Linux operating system version being incompatible (apart from something really stupid, like an ARM OS on an x64 architecture). But that's what happened here.

Lessons learned:

  • CentOS 7.3 is NOT compatible with the ASUS Z10PE-D8 BIOS version 3501. It probably isn't compatible with the latest (3703) either. I didn't try any others, but it wouldn't surprise me if other OS/BIOS combinations are not compatible. OS/BIOS combinations close in date probably have the best chance of working.
  • Don't use Rufus on Windows for CentOS installers.


Also, eBay was nice enough to refund the return shipping I had to give the buyer after I explained the situation. Good customer support.
