Saturday, March 31, 2018

Wanhao i3, first good prints

After spending ~100 hours fixing and modding a used Wanhao i3 V2.1...

Finally something positive:

First good looking print

Settings: PLA, 60C bed, 215C extruder, CURA, 0.2mm layer height, 0.4mm nozzle, 0.4mm line width, 3mm brims, 2mm 60mm/s retraction, 40 mm/s print and infill speed, 0.4mm extrusion width, 0.8mm walls (2 perimeters), 100% infill, 15% infill overlap, 20% skin overlap, full fan after ~1mm. Excellent layer adhesion, great bridging, good corners, ok first layer adhesion. Still not getting amazing first layer adhesion to the bed with the PEI. Some initial underextrusion on brim (maybe needs more pre-print extrusion). Still has some blobs/strings, but not as bad. Not sure how to get rid of those without dropping temperature and getting poorer layer adhesion. Some ringing, but can't get rid of all of that with this printer...it's just not stiff enough. Overall pretty good though.
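As a rough sanity check on those settings, the extrusion rate can be estimated by treating the printed bead as a rectangle (line width x layer height x speed). This is only an approximation, and the 1.75mm filament diameter below is an assumption that isn't stated in the settings above.

```python
import math


def volumetric_flow_mm3_per_s(line_width_mm, layer_height_mm, speed_mm_per_s):
    """Approximate extrusion rate, treating the bead cross-section as a rectangle."""
    return line_width_mm * layer_height_mm * speed_mm_per_s


def filament_feed_mm_per_s(flow_mm3_per_s, filament_diameter_mm):
    """Filament feed speed needed to supply a given volumetric flow."""
    area_mm2 = math.pi * (filament_diameter_mm / 2.0) ** 2
    return flow_mm3_per_s / area_mm2


# Values from the settings above: 0.4mm line width, 0.2mm layers, 40 mm/s.
flow = volumetric_flow_mm3_per_s(0.4, 0.2, 40.0)

# ASSUMPTION: 1.75mm filament.
feed = filament_feed_mm_per_s(flow, 1.75)

print(f"flow: {flow:.1f} mm^3/s, filament feed: {feed:.2f} mm/s")
```

A flow this low suggests the blobs and strings are more likely a temperature/retraction issue than the hot end struggling to keep up.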

Next up, XYZ calibration cube:



This is by far the best one I've made. Similar settings to above, but 25% infill. No drooping on X or Y overhanging points, no overshoot on corners. The only major problem is the banding/lines. These match up with where the extruder rose up at the start of each layer. If I had aligned the Z axis moves, you wouldn't see the bands, but part of one side would be too wide. It looks like the printer is slightly over-extruding every time it starts a new layer, which is causing the first ~10-20mm of that line to be a little too wide. Not sure what's causing that. Maybe too hot, temperature fluctuations (I haven't retuned the extruder PID yet), or the cheapo filament I've been using. May also need to enable retraction at layer change, retract before outer wall, or Z seam alignment. There's also some (very) minor ringing, but I can't fix that without some major modifications. Excellent first layer and interlayer adhesion.

I repeated the above with the extruder temperature lowered to 210 C for the first ~13mm, then 205 C. The lower temperature helped slightly, but the bands were still there. No noticeable difference in interlayer adhesion strength at 205C, so 215C is likely too hot.
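One way to get a mid-print temperature change like this is to post-process the G-code. The sketch below is not necessarily how it was done for this print. It assumes the slicer writes `;LAYER:n` comments (Cura does, counting from 0), a uniform 0.2mm layer height, and firmware that accepts the standard `M104` command. The function name and sample lines are made up for illustration.

```python
def insert_temp_change(gcode_lines, layer_index, temp_c):
    """Return a copy of gcode_lines with an M104 added right after ';LAYER:<layer_index>'."""
    marker = f";LAYER:{layer_index}"
    out = []
    for line in gcode_lines:
        out.append(line)
        if line.strip() == marker:
            # M104 sets the hot end target without pausing the print.
            out.append(f"M104 S{temp_c} ; lower hot end temperature from this layer on")
    return out


layer_height_mm = 0.2    # from the print settings
switch_height_mm = 13.0  # change temperature once the print passes ~13mm

# Zero-based index of the first layer printed above the switch height.
layer_index = round(switch_height_mm / layer_height_mm)

# Hypothetical G-code fragment, just to show the effect.
sample = [";LAYER:64", "G1 X10 Y10 E1.0", ";LAYER:65", "G1 X20 Y10 E2.0"]
patched = insert_temp_change(sample, layer_index, 205)
print("\n".join(patched))
```

The same function can be run over a whole file by reading it with `open(path).read().splitlines()` and writing the result back out.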

3DBenchy with same settings, but with retract before outer wall enabled and 200 C extruder temperature:






This is actually the second one. The first one failed about 40% of the way through when the extruder gear's grub screw backed out AGAIN. I put it back in with more Loctite and as tight as I thought I could make it without stripping the socket. Anyways, this one worked pretty well. The horizontal banding is significantly reduced. Arches and bridges came out pretty well. Flag pole holder and smoke stack are intact. The super fine letters on the back of the tugboat mostly resolved. Dimensions are pretty close (height is dead on 48mm). There are a few layers where it looks like it was underextruding, particularly in the cabin, which is the only really concerning thing. This is probably either poor quality filament or the extruder gear again. I checked the extruder by extruding and retracting some...seems fine. There was a bit of stringing/blobbing. I like to run a little hot to get good layer adhesion, so some minor stringing is expected.

Wanhao i3: Thingiverse Things Reviewed

This post is a continuously updated list of Thingiverse Things I've made with my Wanhao i3 v2.1. It's split up into sections: 1. Things I've made that have 100% worked, 2. Things made that sort of work, 3. Things made that don't work, 4. Things I want to make, 5. Test articles. Most of these have comments in other posts.

Things that 100% work

  1. better filament guide for the Wanhao i3. I designed this.
  2. An SSD adapter for a Supermicro drive tray. I designed this.
  3. T-rex This came out awesome.
  4. MOSFET holder . Worked great. The small threads are modeled, which is pointless, but that doesn't affect use.
  5. Fucktopus. A classic
  6. Dachshund Cookie Cutter. I managed to get the best layer consistency I've ever had on this print. Had to get extra prime just right for the ear to come out.
  7. Celtic skull. I printed the candle holder remix of this. Came out great
  8. Mini barrel-o-monkeys. I just printed the monkeys
  9. toothbrush holder
  10. Nautilus gears
  11. Sanding blocks. Printed all of the stl's
  12. Totoro coat rack revised. I remixed this, but ended up redesigning most of it.
  13. Barbell cookie cutter. I designed this
  14. Cable corners. Had to try a couple times to get the scale right for my cables, but otherwise good.
  15. Batman cookie cutter
  16. Yin yang cookie cutter
  17. Cat cookie cutter
  18. Fidget cube
  19. Gyro bearing fidget thing. Had to use supports and brim for the balls, and it took some trimming to get the balls to come out well
  20. Spaghetti measure tool
  21. Family guy alien
  22. Elliptical gears

Things that sort of work

  1. Ball bearing spool holder. The Wanhao i3's spool holder is too short for most 1kg spools. This thing worked the best out of the 3 spool holders I printed. I had to sand about 0.2mm off the width of the bar part that inserts into the original spool holder (my original spool holder's slot measured 15.8mm at one end and a bit over 16mm at the other), but otherwise works great. The bearings (which I got off eBay) were a perfect fit.
  2. Y axis belt tensioner. 3D printer belts are a pain to tension manually. A tensioner generally uses a screw mechanism to make tensioning easier. This one worked ok. The little nubs that the belt loops around broke off, but I was able to superglue them back into place. The holes for the M4 screws were also about 1mm too far apart...had to dremel one of them. The ends also aren't in the same plane, which isn't ideal. Otherwise, it worked great. Belt, screw, and nut clearances were spot on. I printed the X-axis version of this too, which is almost identical, but haven't tried it yet.
  3. Bed spring cups. You have to scale them up to 110%, and I suggest only using them on the bottom. I had bad layer adhesion with them, but that was probably my printer.
  4. DiiiCooler. I had some issues with supports, but that was my printer's problem, and it's not suggested to use supports at all. I had to lower mine using washers as spacers because it wasn't quite low enough (was blowing on the nozzle more than the extruded filament). It's also really hard to see the print. It does seem to work pretty well, though.
  5. Filament guide. I think this is the best one on thingiverse as of my downloading it. Most of the others are very flexible because they're so thin. However, the notch for the metal angle brace for the spool holder is at the wrong angle, requiring you to take pliers/dremel the notch larger. It also holds the filament pretty far out away from the extruder.
  6. Jet engine. A few of the tolerances were off, and the with-support part versions tended to break chunks of blade off when you pulled the supports off. Assembly is a pain. Otherwise, it's a great model.
  7. Ethernet cable clip repair thingy. I don't think these work on all cables, but it seems to print well with low layer heights
  8. Female torso. I think my printer has trouble with all of the retractions, but I had a terrible time printing this. The farthest I got was about the shoulders before a nozzle clog. Ended up sticking the Trex head on top of it, haha.
  9. Marijuana leaf cookie cutter. Printed this for some friends. The wall thickness is exactly wrong for typical line widths, so it causes a lot of skipping. Must either use one line width with a big nozzle, or use two very thin perimeters with a smaller nozzle. 
  10. Dragon. I tried printing the one with tree supports: those supports are a huge pain to get to adhere to the bed. My nozzle kept knocking them off. I ended up having to print the head separately and glue it on.
  11. Bathtub tug boat. Had to do some surgery to get it to float upright
  12. Hand clamp. Had to print the "sliding hinges" with brim or they just wouldn't stick. The holding force isn't great, though it is made of plastic...

Things that don't work

  1. X-axis belt tensioner. This just didn't print well. I think the tolerances are too tight, particularly for the screw holes and nut slot. The teeth for holding the belt also don't come out well, though changing orientation helped a little with this. 
  2. Bad spool 1. This one's radius is too large, which causes the spools to rest on the two edges, which causes a lot of friction, preventing them from turning.
  3. Bad spool 2. The threads on this one don't come out well unless your printer is very well tuned. Otherwise, the threads come out too weak to use. This was with PLA. Might work better with ABS or PETG. 
  4. Dial gauge clip holder. This one had the wrong size slot and the hole in the wrong location. Use this thing, which has a CAD file you can modify. To be fair, it did "work", i.e. it held the gauge and clipped on the X-axis rods. But the problem with this is that the clipping and unclipping moves the Z axis slightly, and it's a pain to clip and unclip, which defeats the whole purpose of making the bed easier to level. There are some dial indicator holders that mount to the stepper motor, which might be a good option.

Things I want to make

  1. DiiiCooler. I did already make this, but I want to remake it with PETG so it has a higher temperature resistance.

Test/calibration articles


Wanhao i3, Part 2

Lots of updates on this...

I finished refurbishing the MK10 heater block. This required sanding all the crud off of it.

Mostly sanded and cleaned
 I went a bit crazy with the insulation. I covered all sides with 3mm of ceramic fiber insulation and "Koptan" (knock off Kapton, i.e. polyimide) tape. I also soldered and installed a new thermistor. I was having a lot of weird heating issues before. I'd have to run PLA at ~250C to prevent delaminating layers, but measuring with a thermocouple multimeter showed that the nozzle wasn't getting anywhere near that hot. Turning the cooling fan on would also lower the temperature significantly. So I decided to fix all possible reasons for this: new insulation, new thermistor, new cooler (DiiiCooler) aimed at extruded filament and not the nozzle/heat block.
Insulation, new thermistor

More insulation

While I'm talking about this, I later discovered that the fan cooling off the nozzle problem had gone away, but I still had to run PLA at 250C. This meant two things: 1. Insulation helps, 2. the old thermistor was probably fine. I replaced the nozzle and PTFE liner next (the old PTFE tube looked fine actually). The old nozzle was either nickel coated brass or SS, the new nozzle was definitely brass. This made a big difference. I could now print PLA at 215C and get similar layer adhesion strength. I think the old nozzle was SS, which has a much lower thermal conductivity, resulting in a higher temperature gradient from thermistor to nozzle. I also added some more insulation around the nozzle since this one was longer. Remember to re-zero your Z axis when you change nozzles! This heating mess took many hours to resolve. What a pain.

Brief review of Koptan tape: Koptan is the Chinese knock off of Kapton polyimide tape. Kapton has a fairly high heat resistance and is electrically insulative, making it suitable for heating and electronic applications. I've used it a lot in the past. Surprisingly, Koptan seems to work ok, though I don't think it has as high of a temperature rating. Well, it's probably rated the same, but won't actually work at the higher temperatures.

I printed a DiiiCooler before I took everything apart. I had an ABS CiiiCooler that came with the printer, but it was pretty warped and was too high in Z, which caused the air to mostly blow on the nozzle. I figured a PLA cooler would be fine as long as I insulated the hot end well. I wouldn't recommend printing it with external supports (definitely not internal)...the supports spaghetti'd on me and I was barely able to save the print with a bunch of superglue. Unfortunately, the DiiiCooler was also too high in Z (I did not print the "short" version, which is even shorter in Z). I used washers as spacers. Update: currently using 4 washers and some paper as shims, which results in the cooler being ~1mm above the part.

I ended up using fewer washers than this. The cooler was not sitting flat, so I had to crack the supports and reglue them slightly angled.

You should barely be able to see the nozzle peeking through. This means the air flow is aimed at the part and not the nozzle.
I still ended up needing to shim the cooler with paper to get it level to the bed.
While I had the hot end apart, I decided to do the mod where you rotate the extruder stepper motor. This lets the wires clear the top of the frame at the max Z height. Unfortunately, I discovered this:

Stripped thread...ugh
The previous owner must have stripped the threads out of that hole while trying to get the extruder lever screwed in. Turns out a M3 helicoil kit is more expensive than replacing the stepper. But since buying the exact same model is impossible, I'd have to go through stepper tuning (steps and current) again. Ugh. I decided to just switch out the Y-axis motor for the extruder motor. Turns out the screws holding the Y-axis motor are long enough to engage the threads at the back of the stripped hole, so it worked out.

I replaced the belt and added a 3D printed y axis tensioner. This ended up working ok, but not amazing. The little nubs that the belt ends wrap around kept breaking off and falling out...poor layer adhesion or something. I managed to superglue them in, and they held long enough to get the belts in, but then broke off again. I added zip ties to the belt ends to help keep them from pulling through. While I was doing this, I also aligned the y-axis belt (adjusted the idler and belt gear). I also added a zip tie around the frame and end of the bolt holding the y-axis idler pulley so it wouldn't be cantilevered. I also printed the X version of this tensioner (comes with the download), as well as another X axis tensioner. I couldn't get the second X axis tensioner to print well, but I was able to manually tighten the X axis belt enough that I decided I didn't need it or the first X axis tensioner. I did replace the screws holding the belt ends with longer ones and add a zip tie between them so the screws wouldn't be taking the whole belt tension load as cantilevered beams. I wasn't able to align the belt perfectly, though, so I'm going to wait to replace that belt until I can figure out a better solution.

I also replaced the y-axis carriage plate. I purchased it from Tehnologika in Slovakia for about 25 Euro with shipping. It's about 145g lighter than the stock plate (230.7 vs 375.5g). You need longer screws because it's 6mm thick (18x M4 12mm button caps, and 4x M3 25 or 30mm button caps); order these while you wait on the carriage plate to come. The longer screws only add about 5-6g, so you definitely come out ahead in terms of weight. It's also way stiffer than the original stamped steel carriage plate. It's made of two 3mm thick plastic core aluminum sandwich panels glued together. It has a coat of white paint on one side and gray on the other, though the paint job sucks and the paint flakes off easily. Their faster "airmail" shipping option is horribly slow, even to the UK. It took over 2 weeks to arrive in London, then it only took a day to get to me. Ironically, I had gotten fed up waiting and had just finished reinstalling the original carriage plate when it got here -_- ...ugh. So I took that back off and installed the new one. I tried using the spring cups I printed, but they kind of fell apart...poor layer adhesion. I then put the new glass+PEI build plate on, using 3-4 pieces of Koptan tape on each edge to hold it down. Then an endless string of problems began.

I could not level the bed at all with the glass build plate. I used the paper under the nozzle method. No matter what tightening/loosening pattern I used, the screws/springs on one diagonal would be fully tightened and the two on the other would be fully loose. This basically was pringle-ing the heated bed and trying to warp the glass. The glass was very flat...I felt suction trying to lift it off of flat surfaces, so it had nothing to do with that. I also checked this by laying a thick metal ruler on edge on it and looking for gaps. Using these methods, I also determined that the new carriage plate and heated bed were flat within 0.5mm, so I don't think they were the problem either. Geometrically, this doesn't make any sense...why would a perfectly flat piece of glass need to be heavily warped to be level? The most frustrating thing was that I was able to level the old bed (a 3mm thick plate of PEI plastic that was definitely not flat) better!!! It took ~5 incredibly frustrating hours for me to figure out what the cause was. It can't be the glass: that's flat. It can't be the carriage plate (I tried both the old and new one). It can't be the X axis level-ness because I leveled it by placing an object between the lower x-axis rod and the Z-axis stepper mounts on both sides. Besides, that could only be the problem if the left or right side were higher than the other, not the diagonals as was the problem here. This left only one thing: the y-axis rods. If one rod is angled (rotated about X) relative to the other, then you will get this diagonal mismatch behavior. This isn't because the rods are warping the bed. It's because the bed moves on the y-axis. If the y axis rods are tilted relative to each other, then the bed will tilt as it moves. You have to think about it some, but it finally clicked with me.
The worst part is that the bed can only be as level as these rods are co-planar, IF the bed is perfectly stiff...like a 3mm thick piece of glass, but very much unlike a 3mm thick piece of plastic. THAT is why I was able to "level" the old bed better. I was actually warping the plastic build plate (and heated bed) to compensate for the non-coplanar rods. I used a digital caliper's depth rod to try to measure the y-axis rods' Z distances to the frame. I would then loosen the metal bracket holding one end of a rod, adjust it up or down, then retighten the bracket. Then remeasure and repeat. I was able to get them within +/-0.3mm this way, but that means that the glass build plate can only be as level as that...which sucks. There really wasn't an easy way to use the calipers to do this. What I ended up doing was sliding the thick metal ruler under the rods but over the frame, then using pieces of paper as shims between the ruler and the rods. I'd then take the stack of paper pieces out and measure them with the calipers. Then I would adjust the rod heights and repeat. I was able to get them within 0.1mm this way. The pringle problem remained though. I then tried taking one of the screws out (back right), leaving 3 points defining the build plate's plane. This helped marginally, but the front left and back right corners were still about 0.05-0.1mm higher than the front right and back left. This makes sense because that's about as level as I could make the y-axis rods. I just can't figure out a better way to level them. What a pain in the ass. Update here.
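The diagonal mismatch can be put into a number. For four points at the corners of a rectangle (here, the two ends of each y-axis rod), the points lie in one plane only if the two diagonal sums of their heights are equal. The difference between those sums is the twist that no amount of bed screw adjustment can remove from a stiff plate. A minimal sketch, with made-up measurements and function name:

```python
def rod_twist_mm(left_front, left_rear, right_front, right_rear):
    """Out-of-plane error of the four rod-end heights (0 means the rods are coplanar)."""
    return (left_front + right_rear) - (right_front + left_rear)


# Hypothetical rod-to-frame heights in mm, measured at each rod end.
# Case 1: both rods slope the same way. That's a plain tilt, which bed screws can correct.
tilt_only = rod_twist_mm(20.0, 20.3, 20.0, 20.3)

# Case 2: only the left rod's rear end is high. The rods are no longer coplanar.
twisted = rod_twist_mm(20.0, 20.1, 20.0, 20.0)

print(f"tilt only: {tilt_only:+.2f} mm, twisted: {twisted:+.2f} mm")
```

A result of zero means the rods only define a tilted plane. Anything else shows up as one diagonal pair of corners sitting high, which matches the behavior described above.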

I decided to just go with that. For small parts in the center of the bed, it shouldn't matter that much. The new glass+thin PEI will be worth it, right? I immediately had first layer adhesion problems. I had no adhesion problems with the old build surface. What the ****. The print will start out seemingly fine, but then if the nozzle bumps a slightly raised portion of the part, the part pops right off. I can manually pop them off of the heated build surface without much effort at all. Tried heated bed settings from 60-70 C (PLA). I always clean the surface thoroughly with 99% rubbing alcohol.  I tried sanding the PEI with 3000, 1500, and 1000 grit sand paper...didn't seem to help much. The whole point of using PEI on glass was to avoid using adhesives, so I didn't want to go that route. I still haven't solved this problem. I'll just try using rafts for now. Maybe the PEI I bought isn't actually PEI...no good way to check that. I may abandon it and either try adhesives on the other side of the glass or ditch it all for an ultrabase plate.

Then I started having extruder problems, specifically random but infrequent under-extrusion, and no retraction.

Ignore tower that popped off...bad bed adhesion as mentioned above. The angled blob towers indicate poor retraction and overheating. At least I know the cooling is working well since the blobs solidify before they can droop.

The retraction command would be sent, and I could hear the motor move, but the filament wouldn't move. I tested this with just the extruder position menu option, as well. It would start to retract after about 5mm, which is odd. Why was there suddenly a 5mm deadband? I took the extruder apart. Turns out that the extruder gear's grub screw had backed out enough to let the gear rotate on the shaft, but not all the way around because the screw would catch on either side of the flat spot on the shaft. This was causing the filament retraction (and probably random under-extrusion) problems. I put it back in with Loctite.

During one of the many test prints, I tried pausing the print (under SD card menu) to clear a blob of plastic. With the current firmware, "pause" means move the X axis to 0 and y axis to 200mm (forward). Normally, this would be fine, but I had replaced the y-axis belt and left the end sticking out past the zip tie about 5mm too long. This causes it to jam in the y-axis idler pulley and skip steps. Resuming the print results in a y-layer shift. UGH. I had to flip the printer over, cut the zip tie, cut the belt shorter, and replace the zip tie. Take away: make sure your belts aren't too long.

I noticed that whenever I use the SD card menu (mount, print file, etc), that the extruder fan spins a little faster. I think this is related to the ground/power issues with the control board that ships with these printers. While no one has reported the fan thing, people have reported temperature fluctuation and reset issues. I may look into that some.

Near term to do: 
1. Fix first layer adhesion problem. Update: Actually, it seems this is ok. Anything with a very small contact area needs brims/raft, but most things seem to be sticking adequately.
2. Fix bed level problem
3. Reprint spring cups
4. Figure out weird power/grounding SD card thing.

I'm probably up to about 100 hours of work spent on this printer. It's all stubbornness now...definitely NOT monetarily worth it if you factor in time spent.

My impressions of 3D printing so far: 
  • Most 3D printers are NOT precision machines. 
  • They aren't designed to be squared and leveled easily, they aren't made from precision machined parts, most of them have some sort of fundamental design flaw (or more than one).
  • Stepper motors are not great substitutes for real servos
  • The cheap ones are just barely usable. 
  • It takes a fuck ton of work to get them to print anything, let alone print well. 
  • Don't buy a cheap 3D printer, especially used, unless your time is worth nothing, you really like fixing things, and/or you're a masochist
  • About half the things on thingiverse don't work, and about 75% have some sort of flaw. 
Not having to deal with these things is what you pay the extra $1000 for with a high end hobby printer, and $10000 more for with a professional printer.

Wednesday, March 21, 2018

3D Printer Wanhao i3, Part 1

I've always wanted a 3D printer, but they were always too expensive for me to pull the trigger on one. Some reasonably good Chinese ones exist now for a reasonable price, so I thought I'd try one of these. I did a lot of research and settled on the Wanhao i3 V2.1, which is a knock off of a Prusa i3, and also rebranded as the Maker Select v2 and the Cocoon. With tuning, you can supposedly make it print as well as $2000 printers, and it has a large community around it. The last thing is important because many of these Chinese 3D printers need a lot of modifications to get them working well. I wanted a build volume around that size. I also wanted to be able to use any filament, which ruled out a few companies. The Tevo Tarantula/Tornado, Anycubic i3 Mega, and Creality CR-10 were other close contenders. The Anet A8 seemed like too much work.



I purchased a used Wanhao i3 v2.1 3D printer for about $280 (UK eBay) shipped. I then spent many hours watching YouTube videos, browsing Thingiverse, and reading forums about what I would need to do to it. It already had some modifications done to it, but it had some other issues. The insulation is mostly worn off of the hot end and the carriage plate is super warped (common problem on these). The worn insulation makes me have to print at very high temperatures because the nozzle ends up being far colder than the thermistor. Either that or the thermistor is failing. I think it's also causing the layer adhesion and some other problems I've been having. Interestingly, I haven't had any issues with first layer adhesion, which seems to be a common problem. Overall though, I haven't been satisfied with the prints, but I did manage to successfully print a 3DBenchy and a few XYZ calibration cubes.

Anyways, here's the list of upgrades it has now. The first 5 came with the printer.
  • PEI sheet on original aluminum bed
  • Z frame braces
  • CiiiCooler shroud + blower fan
  • removed spring belt tensioners
  • Z axis stepper dampers
  • Re-squared everything
  • Attempted to align y axis belt
  • brace the cantilevered y belt pulley with a zip tie
  • Medium loctite most screws
  • Longer screws where zip ties are needed
  • oiled bearings
  • dial indicator + mount (cheap dial indicator, 3d printed mount)
  • Sunon MF50151VX-A99 fan for cooler. At least 2x flow rate of common ones for just slightly more.
  • Replaced PEI build plate with borosilicate (I know, not necessary, but it was cheaper and precut) glass + thin PEI sheet. Held down with blue painters tape on edges.
  • Added a second spring inside the main spring for the extruder arm. See below.
  • Turned motor sideways so wires clear top edge. Have to pull some more wire through guides.
  • Ball bearing spool holder
  • Fixed z steps
  • Calibrate extruder steps
  • extra screws kit
The dial indicator mount didn't end up working well. This is the kind that clips onto the x axis bars. I found that putting it on and removing it would move the Z axis slightly, which defeats the whole point. I'm going to try one of the stepper motor mounted ones.

I went through 3 different spool holders. The original is too short for most spools. This is the one that worked best for me, though I had to sand about 0.2mm off the width to get it to fit in the stock spool holder. 

The cheap 5015 blower fans commonly sold for 3D printer coolers suck. I spent a few hours looking for a better one that wasn't expensive. The Sunon MF50151VX-A99 has at least 2x the flow rate of the one that came with the printer's CiiiCooler (one of the mods that came with it). 

The original extruder idler didn't exert much force on the filament. With the extruder off and PLA filament inserted, you should be able to push it in/out and see visible teeth marks. Link. If you don't see teeth marks, then the behavior can be exactly like too low of a temperature. Also see this link. I ended up having a spring that happened to fit inside the old one, so I did just that. 
Spring inside spring
If you do this, I suggest getting an inside spring that's just a little shorter (~2mm) than the original spring so that the original spring can still compress fully. There's just barely enough travel on the lever for me to relieve pressure on the filament so I can pull it out. There are some modifications that you can print that allow you to precompress the original spring more, but this is way easier.

My z axis was going up a few microns too much each layer. I discovered this by measuring the Z height on a test 3DBenchy print. Turns out that these Wanhaos ship with a somewhat random z axis steps value in their firmware. You need to download Repetier host, install it, connect with a USB cable (make sure the FTDI drivers are installed first), shut off Repetier server (which will try to autoconnect and keep Repetier host from working), and connect to your printer. Then under the firmware settings (under a menu), you can change the value of Z steps. This should be 400 exactly.
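For reference, here is where a value like 400 can come from, and how a measured print height maps to a corrected setting. This is a sketch under assumptions not stated above: a 200 step/rev (1.8 degree) motor, 1/16 microstepping, and a lead screw that advances 8mm per revolution. The "current" firmware value and measured height in the second part are invented for the example.

```python
def steps_per_mm(full_steps_per_rev, microsteps, lead_mm):
    """Theoretical steps/mm for a lead-screw driven axis."""
    return full_steps_per_rev * microsteps / lead_mm


def corrected_steps_per_mm(current_steps, expected_mm, measured_mm):
    """Rescale a steps/mm value using a measured dimension from a test print."""
    return current_steps * expected_mm / measured_mm


# ASSUMED hardware: 200 steps/rev motor, 1/16 microstepping, 8mm screw lead.
z_steps = steps_per_mm(200, 16, 8.0)

# HYPOTHETICAL numbers: firmware set to 401.0 steps/mm, and a 3DBenchy
# that should be 48mm tall measures 48.12mm.
corrected = corrected_steps_per_mm(401.0, 48.0, 48.12)

print(z_steps, round(corrected, 2))
```

For a lead screw axis the theoretical value is usually the one to trust. The measured correction is mostly useful as a check that the firmware value is off in the first place.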

Removing the adhesive holding the old plastic build plate on was a pain in the ass. The best method I found was to heat the build plate up and use a razor to scrape most of it up. Then let the plate cool off and use acetone to get the rest.

Goo. Washcloth to protect heater from steel carriage plate


No goo

New glass plate with thin PEI sheet adhered to it
eBay has been a great place to buy screws in small quantities. I highly suggest it.

Next modifications:
  • Install the bed spring cups I printed to keep springs aligned.
  • Install the x and y axis belt tensioners I printed, along with new belts. Spend some time aligning the belts so they don't rub
  • New heat block insulation and thermistor. Going to replace ptfe tube while I have it taken apart.
  • Install DiiiCooler I printed. Used PLA, so might have to replace this with PETG eventually.
  • New y-axis carriage plate. I bought one from tehnologika. 
I'm currently sanding all the crud off of the heater block. I would be replacing the heater core/cartridge, too, but the one I bought was 6mm and the hole is 6.35mm (1/4"), and cartridges that size are difficult to find, so I'll keep the original for now.

Caked on gunk

There are a few other mods I'm considering, but I want to get it put back together with the new thermistor, cooler, and heater block insulation to see if that improved the print quality a lot.

Potential future mods:
  • Microswiss all metal hot end
  • New heater core
  • Z axis flex couplers
  • PETG cooler
  • Enclosure for ABS
  • Ultrabase build plate
This printer (and I think all Wanhao i3's) suffers from X axis wobble. By that, I mean if you hold the extruder carriage by the front and move it up or down, you can twist it about the X axis a few mm. My X axis rods are locked down by the way...they are not at all loose. This behavior is caused by two things. One: the Z axis bearing holders and bearings have some play. This is partially caused by the fact that the 45mm linear bearings are only held by their top ~5mm. Two: The Z axis smooth rods are not fixed in place. They can slide up and down, and they are not fully constrained radially (loose holes). Since the upper mount points are the same triangular metal blocks as used on the x and y axes, I think I can add a set screw to them, but it requires disassembling most of the printer. There's no obvious way to fix the lower ends of the Z axis smooth rods in place. I'm planning to design and print two things: 1. A pillow block/brace for the z axis bearings and metal brackets, 2. brackets to hold the lower end of the Z axis smooth rods. If you look at the Prusa i3, you can tell that they have a far superior Z axis carriage design...the smooth rods are fully constrained and the Z axis bearing bracket/carriage things are one solid piece with the bearings in full pillow blocks. They also have an X axis belt tensioner built in. *sigh...

My goal is to get really nice prints out of this thing. In summary, I've spent about $500 so far (including two rolls of filament and some spare parts I haven't used yet). I've also probably spent about 50 hours working on it, not including printing time. If you include the hours (at a rate of say, $10/hour), then I'm up to about $1000. A genuine prusa i3 mk3 kit is about $1000. So yeah...I've learned why more expensive printers are more expensive...they just work. If I could do it over again, I'd probably consider some of the more expensive printers.

I just found out about the folgertech 2020 i3, which looks easier to modify, though might need even more work than this one did. There are about 100 cheap options, and it seems like some people get lucky and get amazing prints out of them without much effort, but it takes others ages to get good prints. 

Homelab cabinet thermal management

My homelab's cabinet is a 12U APC NetShelter CX (actually a Kell Systems PSEM, but they're basically the same) soundproofed cabinet. It has a thermal rating of 800W with the stock fans, though my servers running anywhere near that would shut down due to excessive inlet air temperature. Then I sold those and bought a 1600W server.

Thus, I had to build a more powerful fan system for the cabinet. The original bracket was cut and bent sheet steel with a heavy coating of black paint. This screwed into a large slot in the side of the cabinet, which fed into a sound-baffled passage which exited out of the bottom of the cabinet.

Building the replacement involved milling, bending, and hammering a piece of sheet aluminum, selecting and installing fans, selecting and installing a 24V power supply, and selecting and installing a fan controller. Even though it has large fans, it protrudes into the cabinet less than the original.


Old and anemic

New and awesome

Turning vanes. An obstruction in the cabinet prevents the middle one from having a set.

Laser cut guide vane holders plus cover plate in case I decided to remove the middle fan.
Feature too small...cracked

Sealed with blue RTV
Installed. Hanging wires: I needed the fan controller for another project...

I went with 3x 120x120x38mm high flow rate axial fans I found on ebay for cheap. I added fins to redirect the flow on the underside of the bracket. These were made from laser cut ABS and hand bent super thin sheet aluminum.

I purchased a cheap Chinese PWM fan temperature controller from eBay. I taped it and the thermal probe to the inside top of the server cabinet, and it worked...but poorly. The temperature control chip's PWM ran at 30 Hz, which was way too low. It was also underdamped in the operating band, which caused the fan speed to oscillate wildly. Pretty sure it wasn't a noise issue. I tried everything I could think of, including constant temperature sources, adding thermal mass to the probe, and grounding/shielding wires, but none of it helped. I'd avoid these, though if you do get one, make sure you get the one with all of the FETs populated (higher current). The 2A or 3A ones have one missing, but the 4A one has all of them. The listing picture shows a 2A one, but I received a 4A one...classic Chinese eBay item.

12V-24V-DC-4A-PWM-PC-CPU-Fan-Temperature-Control-Thermostat-Speed-Controller
The eBay temperature controlled PWM fan controller I bought

Brief segue about fan control. There is a lot of confusing terminology out there regarding small DC fan control. 2 pin fans just have power and ground. These can be controlled either by varying voltage linearly, or by PWM'ing the power line. The former only works down to ~half the rated voltage for most fans or they don't have enough power to start. The latter requires a PWM fan controller (like above). 3 pin fans have an extra wire that outputs the tachometer readings. This is useful for measuring fan speed if the power source is constant, i.e. not PWM'd. If the power is PWM'd, then the sensor is, too, which usually messes up its readings. 4 pin fans have power, gnd, tach, and control wires. In addition to the two methods mentioned for 2 pin fans, these have a third option for control. Instead of PWM'ing the power wire, a low voltage/low current PWM signal can be sent to the control wire. The fan's internal electronics then handle the actual power PWM'ing. This has the added benefit of not screwing up the tach sensor readings because the voltage on the power wire is still consistent. Unfortunately, finding a cheap controller for these fans is difficult. There are some Chinese eBay temperature control ones that are meant for 4 pin fans, but I'm skeptical how well they work. Noctua makes a ~$20 manual pot one, but that's the only one I could find. Anyways, back to the fan bracket...

I ended up buying a cheap chinese manual pot PWM fan controller, which works better, though still has a very slight oscillation. The control band of the pot is also very small...maybe the first 3 degrees or so, the rest being full power.

The pot fan controller I bought...supposedly rated for 60V 20A. lol

Turns out that the 3rd fan wasn't doing anything...the flow rate didn't increase when I added it. The 2nd fan didn't add much either. Because the fans are in parallel, and the exhaust passage has a high static pressure drop, I think I'm on a portion of the system pressure vs. flow rate curve that would be better served by high pressure rather than high flow rate fans. I had never designed an exhaust fan system before, but I guess this is why bathroom vent fans are blowers and not axial. I may redo this design to use blowers if it gets too hot in the cabinet.
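The reasoning in the paragraph above can be sketched numerically. This is only a toy model, not a measurement of my cabinet: it assumes each fan has a straight-line pressure vs. flow curve, that the exhaust passage follows a quadratic system curve, and that identical fans in parallel add their flows at equal pressure. All of the numbers and names (Q_MAX, P_MAX, K, operating_flow) are made up for illustration.

```python
import math

def operating_flow(n_fans, q_max, p_max, k):
    """Flow where n identical parallel fans meet a quadratic system curve.

    Assumed fan curve (per fan):  P = p_max * (1 - Q / q_max)
    Assumed system curve:         P = k * Q**2
    Parallel fans add flow at equal pressure, so the combined fan curve is
    P = p_max * (1 - Q / (n_fans * q_max)).
    """
    b = p_max / (n_fans * q_max)
    # Positive root of k*Q^2 + b*Q - p_max = 0
    return (-b + math.sqrt(b * b + 4.0 * k * p_max)) / (2.0 * k)

# Hypothetical values in arbitrary units, chosen to mimic a restrictive passage
Q_MAX = 100.0   # free-air flow of one fan
P_MAX = 10.0    # stall (zero-flow) pressure of one fan
K = 0.004       # system resistance coefficient

fan_counts = (1, 2, 3)
flows = [operating_flow(n, Q_MAX, P_MAX, K) for n in fan_counts]
for n, q in zip(fan_counts, flows):
    print(f"{n} fan(s): flow = {q:.1f}")
```

In this model the total flow can never exceed sqrt(P_MAX / K) no matter how many fans are added in parallel, so each extra fan buys less than the one before. Raising P_MAX (a higher pressure fan or a blower) is what moves that ceiling.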

Update from Nov 2018: I discuss this in more detail in another post, but I tried measuring the passage pressure drop and found it was actually pretty low. I ended up switching to much quieter, same flow spec, lower pressure spec fans and the temperatures seem stable. Still need to figure out automatic control. I currently have to open the back of the cabinet every time I use the server to plug/unplug the fans, which then operate at full throttle.
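For the automatic control I still need to figure out, one possible shape is a clamped linear ramp from temperature to PWM duty cycle, with some smoothing so the fans don't hunt the way the eBay controller did. This is just a sketch: the temperature thresholds, the minimum duty, and the smoothing factor are all assumed values, and the function names are my own, not from any controller's firmware.

```python
def fan_duty(temp_c, t_min=30.0, t_max=45.0, duty_min=0.3, duty_max=1.0):
    """Map a temperature (deg C) to a PWM duty cycle between duty_min and duty_max.

    Below t_min the fans idle at duty_min, above t_max they run flat out,
    and in between the duty rises linearly. All defaults are assumptions.
    """
    if temp_c <= t_min:
        return duty_min
    if temp_c >= t_max:
        return duty_max
    frac = (temp_c - t_min) / (t_max - t_min)
    return duty_min + frac * (duty_max - duty_min)

def smooth(previous_duty, target_duty, alpha=0.1):
    """Exponential smoothing: move only a fraction alpha toward the target each step."""
    return previous_duty + alpha * (target_duty - previous_duty)

# Example: step the duty toward the target for a 40 C reading
duty = 0.3
for _ in range(5):
    duty = smooth(duty, fan_duty(40.0))
```

A smaller alpha damps oscillation at the cost of a slower response to a real temperature rise.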

Yes, I did all of this for what is essentially an exhaust fan system. I hadn't really made anything in over a year, so I was suffering withdrawal, so don't judge.

Building a small HPC cluster, part 1

It's amazing how detrimental small apartment living has been to me as a maker. I can't emphasize enough how important the lab spaces and tools available to me at MIT were, all within walking distance. FIT has a nice shop and labs, but they close at 5 and I had to drive 20 minutes to get there. Living in England has been even harder in this respect. Everything is compact, and the small apartment we had in FL looks large compared to what we have now. I also don't have access to any machine shops. There are a few makerspaces in a city about 20 minutes away, but they don't have any serious machine tools.

I could whine some more, but I won't. I ended up adapting my interests.

I built a 15 machine Ethernet Windows cluster for running STAR-CCM+, a commercial CFD program, for the numerical part of my MS research. I borrowed all the computers in the lab for that, haha. Since my research is now primarily computational, I naturally began learning more and more about computers, server hardware/software, and networking. After a few months of researching, I decided I wanted to build a small cluster. I jumped in with no idea how far down the hole went...

As part of the preliminary research into the world of server architecture, I asked myself, "Which processors have the best $/performance ratio for CFD?" Turns out this is an incredibly complicated question to answer. Up until this point, I didn't really have a choice which systems I ran on. MSFC used the Ames Pleiades supercomputer, which at the time had a mix of Sandy Bridge, Ivy Bridge, and Haswell nodes (and the Haswells were always taken). At KSC, I could only really use the much neglected "america" mini-cluster, relegated to serving interns and low priority projects...I think it might have had dual 8 core Ivy Bridge processors, though I'm not sure which ones. At FIT, I had the random mix of AMD Opteron, Nehalem Xeon, and random i7s that had been in our lab for generations of grad students. This was the first time I had a choice, and I wanted to make the "right" one.

I knew I wanted real server hardware, meaning AMD server CPUs or Intel Xeons. Increased reliability and ECC are the two main reasons. However, I decided to keep Intel i5/i7/i9 processors in the running just for the sake of comparison.

The "$" and "performance" parts of the magic ratio can be looked at separately. Cost includes initial purchasing costs, and costs of ownership, which for me includes maintenance and home electricity costs, but for a company might include service contracts and rack space rental. I didn't want a system so old that the parts were likely to fail or have it be more of a space heater than a server. Overall though, the "$" is fairly straightforward to estimate.

"Performance", particularly for CFD, is much much harder to estimate, though it's fairly simple to define. A system which can complete more iterations in the same amount of time of the exact same case with the same program compared to another system has the superior performance. The relative performance between programs and solvers used may vary with server architecture, but most of the finite volume Navier-Stokes CFD programs will have the same trends, i.e. what will be faster for one program will be faster for another program. Processors determine the whole server system, since the systems are built around the processors. Thus, I focused on processor selection, in particular, two specific aspects: core*GHz and memory bandwidth.

Now, you're probably wondering why I didn't just look at one of the popular CPU benchmarks, e.g. cinebench, and use those numbers for my selection process. It turns out these benchmarks are pretty much useless for CFD. The algorithms used in CFD have very different process utilization than most programs. First, they will use 100% of a core 100% of the time. This makes hyperthreading at best useless, and at worst, detrimental to performance. Luckily, that can be turned off. Second, above some number of cores, they will use 100% of the memory bandwidth almost 100% of the time. In fact, CFD core scaling is usually memory bandwidth limited. In other words, if you add more cores (even within the same cpu) without increasing memory bandwidth, the performance plateaus and stops increasing. After how many cores this happens is dependent on many many factors. Third, accurate CFD simulations require (almost exclusively) double precision floating point operations. To put it simply, these bog down cores. Modern CFD programs can take advantage of modern advanced instruction sets, like AVX and AVX 2.0, but these don't exist in older CPU architectures.
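The plateau described above can be illustrated with a very crude model: throughput follows core*GHz until the cores collectively demand more memory bandwidth than the CPU can supply, and then it stops growing. This is a sketch under loud assumptions, not a real predictor. The bandwidth demand per core*GHz (bw_per_core_ghz) is a made-up parameter that in reality depends on the solver, the mesh, and the architecture. The 51.2 GB/s figure is the theoretical peak of four channels of DDR3-1600 (4 x 12.8 GB/s).

```python
def relative_throughput(cores, ghz, mem_bw_gbs, bw_per_core_ghz=2.0):
    """Toy model: compute-bound until memory bandwidth saturates.

    cores * ghz        -> compute limit (arbitrary units)
    mem_bw_gbs         -> available memory bandwidth in GB/s
    bw_per_core_ghz    -> ASSUMED bandwidth demand (GB/s) per core*GHz
    """
    compute_limit = cores * ghz
    bandwidth_limit = mem_bw_gbs / bw_per_core_ghz
    return min(compute_limit, bandwidth_limit)

# Hypothetical sweep: same clock and memory, increasing core count
MEM_BW = 51.2  # GB/s, quad-channel DDR3-1600 theoretical peak
for n_cores in (4, 8, 12, 16):
    print(n_cores, relative_throughput(n_cores, 3.0, MEM_BW))
```

With these assumed numbers the curve flattens somewhere between 8 and 12 cores. The real break point would have to come from benchmarks.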

I needed to set a starting point for my search. I decided that anything older than Sandy Bridge (c. 2011) was too old to be worth it. This wasn't completely arbitrary, as AVX was first introduced with Sandy Bridge, and these see a significant inherent speed up over Nehalem and older processors.

I also decided that I needed a model, one that I could input the specifications of a CPU, and the relative performance would come out. Unfortunately, as hinted at above, I can't simply go based on number of cores*GHz/core (or FLOPs). Newer generations of processors are inherently faster, and not just due to memory bandwidth. I needed to determine the relative influences of core*GHz, memory bandwidth, and processor generation on overall performance. And I needed real benchmarks to do this.

There are some CFD related benchmarks. There are a few random ones for random cases scattered throughout the google-verse (I'm pretty sure I found all of them up to a few months ago). These aren't particularly useful; to do a relevant comparison between processors requires a consistent benchmark. I stumbled upon the SPECfp benchmarks, and 3-4 of them are actually simple CFD codes. Compare those to the other SPECfp benchmarks and you'll see some of the trends I mentioned earlier...CFD is "different". I wrote a python script that scraped all of the SPECfp benchmarks and sorted/averaged the relevant ones based on many different parameters. Unfortunately, I couldn't answer the memory bandwidth question because most of the benchmarks were done with the fastest memory rated for those CPUs. If I grouped the memory bandwidth in with the inherent generation performance increase, I was able to determine some relative performance differences between processor generations. For example, given a Haswell and a Sandy Bridge processor with the same core*GHz (or scaled to the same core*GHz), I was able to determine an average % performance increase to be expected of the Haswell. The error on this number was very high, though. I believe part of the problem is that the SPECfp CFD benchmarks were created many many years ago for much slower processors, and thus they don't scale well, particularly for the very high core count systems from the past few years.

It was at this point that this research endeavor kind of fell apart. There just isn't enough publicly available CFD benchmark data. I was, however, able to determine some basic trends (for CFD): 1. Each generation is inherently a little faster than the previous...this can be as low as ~3%, but as high as ~15-30%. 2. Memory bandwidth is almost equally important as core*GHz, 3. Higher (10+) core count CPUs aren't very useful because the memory bottleneck is reached before all of the cores can be utilized...better off with higher GHz, lower core count processors. I decided that was enough information to get started. I don't think I've exhausted all of my options for this project yet, so I'll probably come back to it in the future.

If you're a computer/server savvy person, you may have already guessed which generation the magic ratio would favor. Sandy bridge processors won hands down. Comparable ivy bridge processors are about 2x the price new, for maybe 5-20% performance increase. Haswell and Broadwell systems were significantly faster, maybe 30-60%, but their cost is 4-20 times more, not accounting for the absurd costs of DDR4 ram at this time. I didn't bother considering the newer skylakes. I haven't examined the i5/i7/i9's, or AMD's architectures in any meaningful way yet, but I probably will one day. AMD EPYC seems very promising...2-3x the memory bandwidth of Broadwell at way way lower cost.

I did not consider electricity usage, but I probably should have. Another way of looking at the % faster numbers is % less electricity of the CPU for the same computation. The power consumption of the rest of the server is probably about the same, but since the CPU takes up a large portion of that, it's probably important.
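As a small worked example of the "% faster vs. % less electricity" idea above: if the CPU draws the same power while running, a job that finishes faster uses energy in proportion to its runtime. The function name and the equal-power assumption are mine. Real CPUs from different generations will not draw identical power, so treat this as a rough sketch.

```python
def cpu_energy_saving(speedup_fraction):
    """Fraction of CPU energy saved on a fixed job, ASSUMING equal CPU power draw.

    A CPU that is speedup_fraction faster (0.30 = 30% faster) finishes in
    1 / (1 + speedup_fraction) of the time, so it uses that fraction of the energy.
    """
    return 1.0 - 1.0 / (1.0 + speedup_fraction)

for pct in (5, 20, 30, 60):
    saving = cpu_energy_saving(pct / 100.0)
    print(f"{pct}% faster -> about {saving * 100:.0f}% less CPU energy")
```

Note that the saving is always a bit smaller than the speedup percentage, since runtime scales as 1/(1 + speedup) rather than (1 - speedup).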

Anyways, since initial costs seemed more important than eventual electricity costs, I had my generation chosen. The E5-2690 is the fastest sandy bridge (8 cores at ~3.2GHz all core turbo), so I figured I'd aim for that. Most sandy bridge systems are upgradeable to ivy bridge, so I figured I could do that in the future when the prices come down.

So the first computer I bought was based on an i7-5960X. Haha, oops. I found an incredibly good deal on it, so I decided I could use it as my head node and graphics processing node. i7-5960X with liquid cooler, 32GB 2400MHz RAM, 256GB M.2 SSD, 3TB RAID 1 for storage, 1000W Super Flower Platinum PSU, dual GTX Titan GPUs. The GTX Titan, Titan Black, and Titan Z have the same chip as the Tesla K20/K40/K80. Incredible double precision floating point performance. The only downside is they don't have ECC RAM, but they were significantly cheaper than the Teslas when I bought them. The prices have shot up with the recent crypto mining craziness.

Desktop. I needed 12V from a molex connector for another project

The following is a summary of my first few months with server hardware.

Servers are loud. The noise doesn't matter in a data center, but it really matters in a homelab. I managed to score a ~$2000 APC NetShelter CX 12U soundproof server cabinet for ~$130. The thing is stupid heavy, and I had to take the caster wheels off to get it into my office. Unfortunately, my office has carpet, and this ~200kg behemoth doesn't roll on the carpet. I tried furniture sliders, but it's too heavy for them, so they don't slide. I ended up buying some plywood cut-offs from eBay a few months later to lay down. Here's how it looks now:

Now it rolls

It killed these furniture sliders



Now it was time to fill it up. Well, sort of...my time line is all sorts of messed up. Anyways...

I've spent the last few months buying and selling various servers and server hardware. I made a rule not to buy unless it was an absurdly good deal. I won't go through all of the details...rather dull anyways. I started with an HP DL380p G8 and DL360p G8, both with 2x E5-2690s and 64GB of DDR3 1600MHz (PC3-12800R) RAM. Unfortunately, something was messed up with the DL360p's onboard memory, so I had some serious problems updating firmware. Eventually figured it out. I also got a Sun QDR Infiniband switch, a Sun HCA for the desktop, and two HP Infiniband cards for the servers. I got lucky...it only took about a week to get QDR speeds working. I ran CentOS 7 on all three. Right about the time I got that system fully working, I bought a Supermicro 6027TR-HTRF for less than the price I paid for one of the HP servers. The 6027TR-HTRF contains 4x dual CPU Sandy/Ivy Bridge nodes, all in a 2U chassis. This particular one came with 8x E5-2650s, 256GB of PC3-12800R RAM, and 4x QLE7340 QDR Infiniband cards. I've never seen a deal that good since then. The seller had ~5 of them originally and they all sold in a few days. I would have bought more, but my cabinet's thermal rating was only 800W, and this new server would be 1600W by itself. I had to build a more powerful fan system for the cabinet because I had doubled the thermal load. See this post.

In the middle of all of this, I won an auction for 12x Xeon Phi untested Co-Processor pcie cards. More on that here.

So I sold all of the HP servers after switching the E5-2690's for E5-2650's (made a little money if accounting for processors). I also switched out the RAM since I had previously managed to get 16x 16GB PC3L-12800R sticks for way less than it was worth. I got 2x more E5-2690's by smart bidding on servers/cpus on eBay. So now I had a kick ass 4 node server.

I then ran into problems with the Infiniband system. Basically, while Infiniband is supposed to be a standard similar to USB, it turns out that it's really not followed closely enough by all of the manufacturers. Intel/QLogic does not usually play nice with Mellanox and its rebrands. This is partially due to differences in architecture and the way MPI is handled (Intel uses PSM, Mellanox the traditional offload processing and Verbs). See this post for more details on my quest for Infiniband.

And now we're up to the present! Currently, I have the I7 desktop, the Supermicro 4 node server, the Sun Infiniband switch, an unmanaged 1Gbe switch, and a big Riello UPS (which I got new for ~$100). Near future changes include the 4x HCAs for the Supermicro nodes, which should get the Infiniband system back up and running. I can't expand anymore because I'm at the power limit of the circuit in my office, so this should be pretty close to the final configuration. My plan is to run some CFD benchmarks on these once I get everything setup.

Front view of inside of cabinet

3D printer
Stuff cleaned up

I also just bought another 140 of the Xeon Phi coprocessors, but that's a story for another time, haha. 3D printer in above photos is related.




If you'd like to start your own homelab, the /r/homelab and /r/homelabsales subreddits are great places to start.

Xeon Phi Co-processor Testing, part 1

I won an auction for 12 71S1p xeon phi coprocessor cards.


These are awesome...basically a 61 core @ ~1.1GHz Linux server with on-card ultra high bandwidth memory in a double wide PCIe card package. They're very picky about motherboards and thermals though. The motherboard must have "above 4G decoding" or "large memory BAR support" or something like that. Most Supermicros do, and all of the ASUS WS boards seem to. Even then, it's not guaranteed all Phis will work. The "p" Phis are passively cooled, which means they're really meant for server applications. For a desktop, you can make your own cooling fan ducts for them, or buy or 3D print them.

I tested them with another DL380p G8 (with 2x E5-2690's I later harvested) I had purchased. I bought the complete GPU package (see this post) for it, but the DL380p G8 only supports the 5110p Phi, which has a lower wattage rating than the 71S1p. I hacked (shorted the sense pins) an extra ATX PSU to power them, with the cables coming out of the other PCI riser's slots so I could close the lid.

The testing and firmware update procedure was fairly straightforward once I figured it out. This can probably be adapted for your own system. Most of this follows the readme text file and user guide that comes with the MPSS software.
  1. Update DL380p's firmware
  2. Install CentOS 7
  3. Install a Phi
  4. Change bios settings to enable large BAR support (in advanced menu) and set fans to max.
  5. Disable SELinux (re-enable it after you're done testing the Phis).
  6. Log in as root and create an RSA key with ssh-keygen so you can use SSH later. You want to do this before configuring MPSS for the first time, otherwise you have to manually load the key (see readme text file).
  7. Download the MPSS software, readme, and user guide. If your firmware is older than that in the readme, try starting with an older MPSS. If you're using a kernel that isn't listed, then you can recompile the rpms using the instructions in the readme.
  8. Install MPSS (see the readme and user guide). I suggest rebooting.
On the host, from a terminal run: 
lspci | grep -i Co-processor
That will tell you which PCI slot it's in. Mine was 24:00.0, so I did:
lspci -s 24:00.0 -vv
If lspci doesn't recognize it, then there's a problem with your card (assuming your motherboard is compatible). A likely culprit is thermal overload, especially if you're trying to use a passive "P" card without a cooling system. I actually went back to bios and enabled maximum cooling to help with this. If you have a desktop, you'll need to construct a custom cooling system (see above). Another possibility is that the card isn't seated well. Try reseating it. When none of that worked, I gave up on the card. I'm sure there is more advanced troubleshooting you could do, but I just don't know how to do it. Intel tech support seems to be pretty good, so it might be worth asking them.

Next, type:
modprobe mic
This starts the mic process. If you have just installed or reinstalled MPSS, then you need to do:
micctrl --initdefaults
 Then:
micflash -getversion 
This must be 375 for the latest MPSS release. Mine were 390. Then:
micctrl -s
This should return "ready". I'm not sure what to do if it does not.

Run:
micinfo -group Board
This should return a bunch of information about your Phi, though not all of it will be available because MPSS isn't running yet. Next:
micflash -update -device all -smcbootloader
Then restart the host, and:
modprobe mic 
micflash -getversion 
This should show the new firmware version. Next, start MPSS:
systemctl start mpss
 Now you should be able to ssh into the Phi's filesystem:
ssh mic0
 If that didn't work, you need to see the readme section about ssh keys and loading them.

Now, from the host, run:
miccheck
This should show all passes. Then run:
micinfo
This will show a lot of information about your Phi. You can launch a monitoring gui with:
micsmc
That's it. If your Phi passed all of that, you should be able to install software on it. I haven't done this yet...that will be the topic of another post.

You should also go to /etc/sysconfig/network-scripts/micX and change "onboot" to "no". I can't remember the exact reason for this, but it's in my notes.



For my lot, after all was said and done, 8/12 were recognized by lspci and tested to be working. The lspci recognition was spotty, though...probably because these weren't 5110Ps. I managed to sell all of them for a profit. I kept one, the 7110P that was in the lot. While probably not useful for conventional CFD, it might be useful for something like OpenLB or anything super vectorizable that needs more oomph per core than a GPU can provide. They're also supposedly really good for mining some cryptocurrencies, though I haven't tried.

Homelab Infiniband part 1

Infiniband is a high speed network protocol. It uses special PCIe cards (called Host Channel Adapters, or HCAs), cables (copper or fiber optic, both passive and active), and switches. The Infiniband network requires a "subnet manager" running on one of the nodes, either a computer or a switch, to manage it. There are multiple versions/speeds. It started with "SDR" or "Single Data Rate", which was 4x links at 2.5Gbit/s = 10Gbit/s. Then DDR, "Double Data Rate", which has 4x 5Gbit/s links for 20Gbit/s. Then QDR ("Quad") for 40 Gbit/s, FDR10 (which is also 40 Gbit/s), FDR (56 Gbit/s), etc. The actual throughput is usually lower than that. For example, the theoretical throughput for QDR is 32 Gbit/s because of 8b/10b encoding overhead, though starting with FDR10 they switched to a more efficient encoding to bring it closer to link speed x number of links. The number of links is almost always 4x. There are various manufacturers, but the two main ones are Mellanox and Intel. Mellanox hardware is often rebranded as HP, IBM, and Sun. Intel purchased QLogic's Infiniband business, and there are various Intel/QLogic rebrands, too. While they're all supposed to be compatible, they really aren't. The Intel/QLogic cards use the PSM protocol for MPI, while the Mellanox cards use Verbs and offload the processing from the CPU to the card. There are advantages and disadvantages to both. While you can sometimes get Intel and Mellanox hardware to work together, it's rare and usually means something is not running efficiently. It's strongly suggested to stick with Mellanox and its rebrands, or Intel and its rebrands, only.
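The link speed vs. throughput numbers above come down to lanes x per-lane signaling rate x encoding efficiency. Here's a short sketch of that arithmetic. The per-lane rates and encodings in the table are the published Infiniband values as I understand them (8b/10b for SDR/DDR/QDR, 64b/66b for FDR10/FDR). The dictionary and function names are just for illustration, and this ignores protocol overhead above the encoding layer.

```python
# generation: (per-lane signaling rate in Gbit/s, payload bits, total bits per encoded block)
IB_GENERATIONS = {
    "SDR":   (2.5,     8, 10),
    "DDR":   (5.0,     8, 10),
    "QDR":   (10.0,    8, 10),
    "FDR10": (10.3125, 64, 66),
    "FDR":   (14.0625, 64, 66),
}

def data_rate_gbit(generation, lanes=4):
    """Theoretical data rate in Gbit/s after line-encoding overhead."""
    rate, payload_bits, total_bits = IB_GENERATIONS[generation]
    return lanes * rate * payload_bits / total_bits

for gen in IB_GENERATIONS:
    print(f"{gen}: {data_rate_gbit(gen):.1f} Gbit/s over 4 lanes")
```

This is why a "40 Gbit/s" QDR link tops out at 32 Gbit/s of data, while FDR10 actually delivers about 40.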

A note about firmware: The Mellanox HCAs use firmware that has to be periodically updated, and it's usually available for free. However, the rebranded ones have their own firmware built from the Mellanox firmware. If you can figure out which Mellanox card you have, then you can reflash it with the original Mellanox firmware. The Intel/QLogic cards do not require firmware. They are ASICs, and since the message processing isn't offloaded to the cards, they can be simpler. However, they require a special software library (which only works with specific versions of specific OS's) to function correctly.

A note about cables and connections: you have to be careful about connections when buying Infiniband hardware. The SDR and DDR hardware typically has CX4 connectors and cables. However, some DDR hardware has QSFP+ ports and cables, which are also used by QDR and FDR, and are not compatible with CX4. None of this is compatible with SFP+ or Ethernet cables. You can buy adapters, but they limit your speed. Your hardware will auto-negotiate to the lowest speed in the network. For example, if you buy 3x FDR cards, an FDR switch, and QDR cables, you will probably end up with a QDR speed network.

A note about Ethernet: a lot of homelabbers want a high speed network. The most cost effective way to do this at the time of this writing is actually with QDR infiniband hardware, not 10Gbe hardware. You can generally buy QDR HCAs and switches for less than equivalent 10Gbe hardware. The best thing to do then once you have your QDR network setup is to run IPoIB (IP over infiniband). That will give you much higher data throughput than 10Gbe but still act a lot like Ethernet.

This is a very brief summary. The wikipedia article is good, but there is a lot more information available elsewhere online.

Anyways, before I knew all of the above, I had purchased a random hodgepodge of Infiniband hardware from eBay. I had a 36 port internally managed Sun switch, a Mellanox IS50XX, two HP 544FLR-QSFP QDR ConnectX-3 HCAs, one Sun dual port QDR ConnectX-2 HCA, 4x QLE7340 single port QDR QLogic cards (came with Supermicro server), 5x 2M HP (Mellanox) QDR cables, and 2x QLogic QDR cables.

The first setup I tried that mostly just worked was the Sun switch with the Sun and HP HCAs. I updated the firmware on the HP HCAs since that was freely available. Sun's firmware updates are only available with a very expensive paid customer service contract. I could reflash the Sun HCA with Mellanox firmware, but turns out I didn't need to. Luckily the manuals are all available online. I plugged it all together, installed the "Infiniband Support" package in CentOS on all three nodes, and it just worked. Boom, 40 Gbit/s (according to ibstat). This meant the Sun switch's internal subnet manager was working. I wanted to be able to access it, though, just for fun. I didn't know the IP address of the Ethernet management port, but it has a USB serial port on it. I purchased two USB to serial converters and a null modem cable to go between them, connected it to my laptop, and tried to connect to it. Nothing...not a peep. I tried this every way I could think of. I even plugged the other USB end into another USB port just to make sure the cables were working (they were). It seems that either the USB converter wasn't really converting right, or the prior owner (oddly enough it identifies itself as a dhs.gov switch with ibswitches) locked it down. It turns out that these switches have no hardware reset. That sucks. I then tried something a little crazy: brute-force pinging all of the private IP addresses to see if I got a response. There are some parallelized Linux and Windows tools available to do this. It still took about a day...nothing. *sigh. Oh well, at least the subnet manager starts up every time and seems to be doing its job.
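To give a sense of the scale of that brute-force ping, here's a quick count of the private IPv4 space (the three RFC 1918 ranges) and the ping rate needed to cover it in a given time. The function names are made up for illustration, and this only does the arithmetic. It doesn't actually send any pings.

```python
import ipaddress

# The three RFC 1918 private IPv4 ranges
PRIVATE_RANGES = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]

def total_private_addresses():
    """Number of addresses across all private IPv4 ranges."""
    return sum(ipaddress.ip_network(r).num_addresses for r in PRIVATE_RANGES)

def required_rate(seconds):
    """Addresses per second needed to sweep the whole private space in `seconds`."""
    return total_private_addresses() / seconds

total = total_private_addresses()
one_day = 24 * 60 * 60
print(f"{total:,} private addresses")
print(f"~{required_rate(one_day):.0f} addresses/s to finish in one day")
```

Sweeping that many addresses in about a day only works with a parallelized tool. One ping at a time with a one second timeout would take months.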

It was around this time I bought the Supermicro server. This came with the 4x QLE7340s, which turned out to be nightmares. I also bought the Mellanox switch....also a nightmare.

I'll start with the Mellanox switch. I purchased an IBM rebrand of a Mellanox IS50XX from eBay. Turns out you really need to check to see if they come with the 36 port enable license (otherwise you're limited to 18 ports) and the FabricIT license (which runs the internal subnet manager). To save space, here are the mellanox community post and servethehome post on this mess. While I never figured out if the switch was DDR or QDR, I'm fairly certain IBM never sold any DDR switches, so it was probably some sort of configuration problem. I ended up re-selling it.

The QLE7340s: after about two weeks emailing back and forth with the Intel rep, I finally got the QLE7340s working back-to-back (plugged into each other) at QDR speeds. The problem was that the published pdf user guide for the required software stack is missing some prerequisites. Here's the user guide you need, and here are the commands you need to enter after installing CentOS 7.2 minimal. You must use 7.2 unless they've updated the software since this post. The readme text file will say which versions are supported.

yum install -y dmidecode tcl tcl-devel pciutils-devel binutils-devel tk libstdc++ libgfortran sysfsutils zlib-devel perl lsof tcsh glibc libstdc++-devel gcc-gfortran rpm-build glibc.i686 libtool bison flex gcc-c++
yum install -y http://vault.centos.org/7.2.1511/os/x86_64/Packages/kernel-devel-3.10.0-327.el7.x86_64.rpm

Then install the "OFED+ Host" software as in the pdf guide. The True Scale software is pay-for only, but you don't need it unless you have a huge Infiniband "fabric". That should get the QLE7340s working back-to-back. However, they refused to negotiate to more than 10 Gb/s with my Sun switch, or with the Mellanox IS5030 switch. After another month with the Intel rep, it turns out that the QLE7340s only support the Mellanox X series or newer switches and Intel/QLogic switches (all are still pretty expensive). I was lucky he helped me out so much...none of the companies officially support EoL hardware. I sold the QLE7340s and am going to buy 4x Sun HCAs identical to the one I currently have since I know they work with my switch. I test fit that one in the cramped high density nodes...Supermicro did an excellent job designing those by the way. It fits. Oh, I also sold the HP HCAs because I sold the HP servers. My timeline is kind of messed up, sorry.

Here is a table of Mellanox Infiniband HCA rebrands and corresponding Mellanox part numbers. This is useful for finding the correct Mellanox firmware for IBM, HP, and Sun/Oracle cards.

Part 2 will be about getting the full setup working.

HP DL380p G8 GPU Options Explained

I spent many many hours looking into GPU options for the HP DL380p G8. The documentation and part numbers are very confusing. The goal of this post is to save you a ton of time by explaining what each part number is and exactly what you should buy if you want to have GPUs in your DL380p G8. The DL380p G8 can hold a maximum of 2x ~225W GPUs or Xeon 5110P Phis (the only Phi officially supported...it will recognize others, but it will think they are 5110Ps). There should be about 300W available per GPU (225W from cable, 75W from PCI slot), but you might run into cooling issues. The total cost for a dual double wide GPU or Xeon Phi setup (minus the GPUs or Phis) can be as low as $100. See the following google spreadsheet for everything you need. Make sure you have the dual 1200W PSUs option if you do this. The dual 750W or 500W ones won't be powerful enough.

Slot1 and 2 GPU cages

Installation is fairly straightforward. You remove the old PCI risers, install the GPUs/Phis into the riser cages, and install the new risers. Then plug a cable into the riser and into the GPU, repeat for other GPU. Make sure you don't mix them up. Unlike the original risers, these are slot dependent.