Monday, May 28, 2018

Homelab Cluster: Software Part 1

This is going to be a mess, curt, and full of errors. Fucking blogger blanked out then autosaved my nice draft of detailed instructions, so I had to recreate all of this from memory. I'm going to be updating it in stages since that's the only way to prevent autosave from deleting everything. There will be many places where I note I can't remember something exactly.

I finished updating the BIOS and IPMI on all slave nodes. I downgraded all the Sun Infiniband cards to the 2010 firmware and installed them. I also built the new workstation that will serve as a headnode. Its first configuration had a SM X10DAi motherboard and a single E5-2667v3 QS. Its newest configuration will be discussed in the next post.

Since the slave and head nodes have different software requirements, I've split them up.

Headnode Software

OS and Initial Setup

Most NVMe drives and the X10DAi do not play nice together, so I ended up using a standard SATA SSD. Create a CentOS DVD USB installer. Install CentOS with development tools, infiniband support, a gui, etc. Create an admin user "cluster". Do yum update. You may have to do "sudo systemctl stop packagekit" to stop the auto-updater from locking yum. Reboot. Install the following packages:
zlib-devel libXext-devel libGLU-devel libXt-devel libXrender-devel libXinerama-devel libpng-devel libXrandr-devel libXi-devel libXft-devel libjpeg-turbo-devel libXcursor-devel readline-devel ncurses-devel python python-devel qt-devel qt-assistant mpfr-devel gmp-devel libibverbs-devel numactl numactl-devel boost boost-devel environment-modules ntp
Most of the above are from step 2 here (scroll down to CentOS 7.4 instructions). If you'll be using slurm, you need to install the following:

  • yum install epel-release
  • yum install munge munge-libs munge-devel rpm-build gcc openssl openssl-devel libssh2-devel pam-devel hwloc hwloc-devel lua lua-devel rrdtool-devel gtk2-devel man2html libibmad libibumad perl-Switch perl-ExtUtils-MakeMaker
Most of those will probably already be installed. If using an Nvidia graphics card, you may need to install CentOS in basic graphics mode or with another, simpler graphics card, then install the Nvidia driver before running yum update and rebooting. Note: if you install the Nvidia driver while a non-Nvidia graphics card is installed, you will need to swap in the Nvidia GPU between shutting down and rebooting, or the CentOS boot will hang. Uninstall the system cmake (sudo yum remove cmake). Name the headnode "headnode" using the hostnamectl command (hostnamectl set-hostname headnode).
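As a sketch, the post-install setup above boils down to the following command sequence (the package list is abbreviated here; the full list is given above):

```shell
# Stop PackageKit if it is holding the yum lock
sudo systemctl stop packagekit

# Update the system, then install the build/runtime dependencies listed above
sudo yum -y update
sudo yum -y install zlib-devel libXext-devel qt-devel environment-modules ntp  # ...plus the rest of the list above

# Remove the system cmake; a newer one is installed manually below
sudo yum -y remove cmake

# Name this machine "headnode"
sudo hostnamectl set-hostname headnode
```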

I'll be installing pretty much everything as root instead of user, so that changes some things, but I can't remember what exactly.

Install cmake:
  1. Download the latest source tarball and install script from their website.
  2. Move the tar and the .sh script to the /usr/local directory.
  3. Run the .sh install script.
  4. Keep hitting enter until you see the Kitware license text, then hit enter one more time.
  5. Type y to accept the license agreement.
  6. Type n when asked about the subdirectory, so cmake installs directly into /usr/local where the system can find it.
  7. cmake --version should return cmake's version.
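The steps above amount to something like this (the version number is an example; substitute whatever you downloaded):

```shell
cd /usr/local
# Run the self-extracting installer downloaded from cmake.org (example version)
sudo sh cmake-3.11.1-Linux-x86_64.sh
# ...page through the license with enter, type y to accept,
# then type n to the subdirectory prompt so it lands in /usr/local...
cmake --version
```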

OpenMPI

The OpenFOAM ThirdParty folder contains an old version of OpenMPI. It's better to use a newer system install of OpenMPI. Download the latest version's source and put it in the /opt folder. Instructions for building OpenMPI are here. Configure with "--prefix=/opt/openmpi-3.1.0 --with-verbs --with-slurm --with-pmi=/usr" for infiniband and slurm support. Note that slurm (and slurm's libpmi package) must be installed first if using pmi/pmi2; I detail this in part 3 of this series of posts. If not using slurm, you can leave off the "--with-slurm --with-pmi" configure options. It's probably a good idea to redirect the output of configure and install to files, e.g. >log.install 2>&1. Make sure there are no build errors.

Then add the openmpi bin directory to PATH, e.g. PATH=/opt/openmpi-3.1.0/bin:$PATH, and the lib directory to LD_LIBRARY_PATH, e.g. LD_LIBRARY_PATH=/opt/openmpi-3.1.0/lib:$LD_LIBRARY_PATH, in both root's and the user's .bashrc. (If you build in /usr/local, you don't have to do that, but then it's harder to maintain different versions of MPI.) mpirun --version should give the openmpi version. In part 3, I discuss how to use environment modules instead of adding the paths to the .bashrc. Download a test MPI hello world program. Compile and run it:
  • mpicc -o mpi_hello_world mpi_hello_world.c 
  • mpirun -n 16 -mca btl ^openib ./mpi_hello_world
You will get a warning about not using infiniband if you don't exclude openib.
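Putting the build steps above together, the sequence looks something like this (the -j value is an example; adjust for your core count):

```shell
cd /opt
tar xzf openmpi-3.1.0.tar.gz
cd openmpi-3.1.0

# Configure with infiniband (verbs) and slurm/pmi support; log everything
./configure --prefix=/opt/openmpi-3.1.0 --with-verbs --with-slurm --with-pmi=/usr > log.configure 2>&1
make -j 8 > log.make 2>&1
make install > log.install 2>&1

# Add to both root's and the user's ~/.bashrc:
export PATH=/opt/openmpi-3.1.0/bin:$PATH
export LD_LIBRARY_PATH=/opt/openmpi-3.1.0/lib:$LD_LIBRARY_PATH
```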

OpenMPI can use different core bindings and distribution methods. I tried a bunch of these and had two long paragraphs detailing them, but again, this got deleted. The conclusion was that the v3.1.0 defaults seem to work the best for OpenFOAM.
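For reference, if you want to experiment with bindings yourself, OpenMPI 3.x exposes them through the --map-by and --bind-to options (the values below are examples, not recommendations):

```shell
# Example: map ranks round-robin by socket, bind each rank to a core
mpirun -n 16 --map-by socket --bind-to core ./mpi_hello_world

# Print the binding each rank actually received
mpirun -n 16 --report-bindings ./mpi_hello_world
```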

OpenFOAM

These instructions loosely follow these links: wiki, install guide, build guide, system requirements.

The difference between OpenFOAM+, e.g. v1712, and OpenFOAM, e.g. v5, is not clear. They are maintained by different organizations, but they work together, and the two share most of their code. OpenFOAM+ is released every 6 months, while OpenFOAM is released more often, but both have development repositories where you can download dev versions. The last version I used was OpenFOAM v4, so I decided to try OpenFOAM+ this time. Both are distributed with Docker now as precompiled binaries, which is very convenient for single-machine installs. However, Docker does not work well on clusters, which is why I have to install from source.

Download and untar v1712 according to the instructions. I'm installing OpenFOAM in /opt, so you must modify the OpenFOAM/etc/bashrc file's install directory to be /opt (uncomment one line and comment out another). While in there, change WM_LABEL_SIZE to 64 and make sure the MPI type is SYSTEMOPENMPI.
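After editing, the relevant lines of OpenFOAM-v1712/etc/bashrc should look roughly like this (a sketch from memory; surrounding context omitted):

```shell
# In /opt/OpenFOAM-v1712/etc/bashrc:
FOAM_INST_DIR=/opt                 # uncommented; the default $HOME-based line commented out
export WM_LABEL_SIZE=64            # 64-bit labels
export WM_MPLIB=SYSTEMOPENMPI      # use the system OpenMPI installed above
```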

CentOS 7.5's system packages are all recent enough versions except for cmake, which was already taken care of above. CGAL is installed automatically by the ThirdParty folder, but you must modify the OpenFOAM/etc/config.sh file to set the boost library to boost-system so the ThirdParty boost isn't installed. The ThirdParty folder is missing METIS. This needs to be downloaded and unpacked in the ThirdParty folder, which is already set up to install it, so nothing further needs to be done (unless the version is different, in which case the config file needs to be changed). MESA is also missing, but it is only needed for the GPU-less slave nodes.

The shipped version of cfmesh does not build with OpenFOAM, so you must update it from the github repository. Go to OpenFOAM/modules and mv cfmesh /opt/oldcfmesh. The old directory must be moved to a directory outside the OpenFOAM tree, or Allwmake will find it and try to build it. Then clone the latest github version of the cfmesh repository to "cfmesh" in the same location, so you end up with a cfmesh folder containing the new files. This will build automatically and shouldn't have any errors. This wasn't well documented until I filed a bug report, even though it had been fixed back in January.
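The cfmesh swap amounts to the following, where &lt;cfmesh-repo-url&gt; stands in for the upstream repository address (not reproduced here; see the bug report):

```shell
cd /opt/OpenFOAM-v1712/modules
# Move the shipped cfmesh outside the OpenFOAM tree so Allwmake can't find it
mv cfmesh /opt/oldcfmesh
# Clone the current cfmesh in its place
git clone <cfmesh-repo-url> cfmesh
```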

You need to source the OpenFOAM bashrc in both root's and the user's .bashrc. See the wiki guide for how to set this up as a convenient alias. Example:
  • alias of1712='source /opt/OpenFOAM-v1712/etc/bashrc FOAMY_HEX_MESH=yes'
Source it before continuing the installation. You will probably see this warning: "No completion added for /opt/OpenFOAM-v1712/platforms/linux64GccDPInt640Opt/bin". It can be ignored, and should go away after re-sourcing the bashrc once OpenFOAM is built, or after a reboot.

Do not install the ThirdParty first. Follow these instructions, which are a mix of the wiki and official instructions:
  1. cd to the ThirdParty directory
  2. ./makeParaView -mpi -python -qmake $(which qmake-qt4) > log.makePV 2>&1
  3. check log for errors
  4. wmRefresh
  5. foam
  6. the above alias should change the working directory to the OpenFOAM directory. If it does not, something is wrong.
  7. foamSystemCheck
  8. export WM_NCOMPPROCS=8
  9. ./Allwmake > log.make 2>&1 (this should be the openfoam Allwmake, not the thirdparty folder one)
Check log for any errors.

After successful install:
  1. become user, source openfoam alias
  2. foamInstallationTest 
  3. mkdir -p $FOAM_RUN
  4. sudo chown -R cluster:cluster ~/OpenFOAM
  5. run 
  6. cp -r $FOAM_TUTORIALS/incompressible/simpleFoam/pitzDaily ./ 
  7. chown -R cluster pitzDaily
  8. cd pitzDaily 
  9. blockMesh 
  10. simpleFoam 
  11. paraFoam
  12. cp -r $FOAM_TUTORIALS/incompressible/simpleFoam/motorBike ./ 
  13. chown -R cluster motorBike
  14. cd motorBike 
  15. disable streamlines stuff in the controlDict
  16. ./Allrun
If the above works, then your OpenFOAM installation is working. The streamlines function syntax has changed, but the tutorials haven't been updated to match, so leaving it enabled causes errors.

Slave Node Software

The plan is to create an installation on a small SSD on one slave node, make sure it is fully working, then clone it for all of the other slave nodes. If you do this, make sure that you are using the smallest SSD you have: cloning from a smaller to a larger drive is easy, but cloning a larger drive to a smaller one is almost impossible. There is another way, which involves storing the configured slave node OS on the headnode, serving it to each of the slave nodes with PXE boot, and booting the OS into their RAM. This is better for small compute node OS installations, but ends up taking up a lot of RAM for larger installations. I'm going to use the cloned-SSD method for now, but I may try the network booting method later.

OS

Install the CentOS compute node type with Infiniband support, development tools, etc., but no gui. Create the partitions manually. Use ext4 and no logical volumes: XFS is less flexible, and logical volumes can be difficult to deal with for something this simple. Create a user "cluster". Make sure that the UID and GID of this user are the same on all nodes, including the headnode; check with "id (username)". You may have to set them. Do yum update. Reboot. Install the following packages:
zlib-devel libXext-devel libGLU-devel libXt-devel libXrender-devel libXinerama-devel libpng-devel libXrandr-devel libXi-devel libXft-devel libjpeg-turbo-devel libXcursor-devel readline-devel ncurses-devel python python-devel mpfr-devel gmp-devel libibverbs-devel numactl numactl-devel boost boost-devel environment-modules ntp
Most of the above are from step 2 here (scroll down to CentOS 7.4 instructions). This is similar to the headnode except without qt, which is only needed for ParaView, and ParaView will not be installed on the slave nodes. If you'll be using slurm, you need to install the following:

  • yum install epel-release
  • yum install munge munge-libs munge-devel rpm-build gcc openssl openssl-devel libssh2-devel pam-devel hwloc hwloc-devel lua lua-devel rrdtool-devel gtk2-devel man2html libibmad libibumad perl-Switch perl-ExtUtils-MakeMaker
Most of those will probably already be installed. Name the slave nodes sequentially in the nodeXXX format using the "hostnamectl set-hostname" command. I like to make the XXX the same as the last three digits of the static intranet IP or IPMI IP address I set.
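As a tiny sketch of that naming convention, the nodeXXX hostname can be derived from the last octet of a node's static IP (the IP here is an example):

```shell
ip=192.168.1.101
# Take the last octet and zero-pad it to three digits
printf 'node%03d\n' "${ip##*.}"
# then: sudo hostnamectl set-hostname node101
```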

Install cmake as with the headnode.

OpenMPI

Install OpenMPI like on the headnode.

OpenFOAM

Follow the headnode instructions until it mentions MESA, then come here.

MESA is a software graphics (OpenGL) library for Linux machines that do not have graphics hardware, like these slave nodes. Download MESA and unpack it in the ThirdParty folder. The rest of these instructions I can't remember clearly. The basic idea is to compile the VTK libraries and MESA so that the slave nodes can do things like write VTK files. There are some text files (build scripts and readmes?) in the ThirdParty folder that help. You need to create a symbolic link in the main ThirdParty folder to the VTK library in the ParaView folder. Then you might need to change the MESA and VTK versions in some config files and/or in some files in the ThirdParty folder. Then you need to build MESA with the ThirdParty MESA build script, and build VTK with the ThirdParty VTK build script. I think there is an example you can use for this. ParaView is not built on the slave nodes.
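From memory, the gist is something like the following sketch. The script names, version numbers, and link target are assumptions reconstructed from the ThirdParty folder's readmes, so check them against your tree before running anything:

```shell
cd $WM_THIRD_PARTY_DIR
# Unpack the MESA sources here (version is an example)
tar xf mesa-17.1.1.tar.gz

# Symlink the VTK sources bundled inside the ParaView tree into ThirdParty
# (names are examples; match your ParaView/VTK versions)
ln -s ParaView-5.4.1/VTK VTK-8.1.0

# Build MESA, then VTK against it, with the ThirdParty build scripts, logging output
./makeMesa mesa-17.1.1 > log.makeMesa 2>&1
./makeVTK VTK-8.1.0 > log.makeVTK 2>&1
```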

Follow the headnode instructions concerning cfmesh and the OpenFOAM bashrc. Follow the build instructions, except don't make ParaView. Follow the same test instructions, except don't run paraFoam (since ParaView isn't installed).

Benchmarks and conclusions

Someone on CFDOnline created a convenient benchmark for OpenFOAM based on the motorBike tutorial. I downloaded this and ran it on various node configurations. Remember to change the controlDict's streamlines stuff (mentioned above). I had nice, multi-paragraph, well-laid-out instructions, a presentation of the results, and analysis, but again, it got deleted by the fucking autosave. The main conclusion was that my results make sense compared to the prior results, and that the AMD EPYCs are awesome for CFD due to their high memory bandwidth.

I made sure to set all of the slave nodes' BIOS to performance, but I didn't see a setting for that with the X10DAi. It turns out it's hidden until you select custom in power management. Link. Doing the things in that link improved iterations/second by 22%.

Next Steps

Now that OpenMPI and OpenFOAM are working on individual nodes, the next steps will be getting them working over ethernet, then Infiniband.
