@@ -28,6 +28,7 @@ Things that could go wrong during an installation include:
2828* required dependencies that are not specified in the easyconfig file;
2929* failing shell commands;
3030* running out of available memory or disk space;
31+ * compiler errors and compiler crashes;
3132* a segmentation fault caused by a flipped bit triggered by a cosmic ray
3233 ([really, it happens!](https://blogs.oracle.com/linux/post/attack-of-the-cosmic-rays));
3334
@@ -239,7 +240,7 @@ checksums = ['d808eb5b1823c572cb45a97c95a3c5acb3d8e29aa47ec74e3ca1eb345787c17b']
239240
240241start_dir = 'src'
241242
242- # -fcommon is required to compile Subread 2.0.1 with GCC 10,
243+ # -fcommon is required to compile Subread 2.0.1 with GCC 10/11,
243244# which uses -fno-common by default (see https://www.gnu.org/software/gcc/gcc-10/porting_to.html)
244245buildopts = '-f Makefile.Linux CFLAGS="-fast -fcommon"'
245246
@@ -287,10 +288,60 @@ Remember though: *no peeking* before you tried to solve each step yourself!
287288
288289---
289290
290- ***Exercise T.1**** - Sources*
291+ ***Exercise T.1**** - Toolchain*
291292
292293Try to install the `subread.eb` easyconfig file and see what happens.
293294
295+ Take into account that we just want to get this software package installed,
296+ we don't care too much about details like the version of the dependencies or
297+ the toolchain here...
298+
299+
300+ ??? success "(click to show solution)"
301+
302+ The installation fails because the easyconfig specifies that `PrgEnv-gnu/21.10`
303+ should be used as toolchain:
304+
305+ ```
306+ $ eb subread.eb
307+ ...
308+ ERROR: Failed to process easyconfig /pfs/lustrep3/users/kurtlust/easybuild-tutorial/Troubleshooting/subread.eb: Toolchain PrgEnv-gnu not found,
309+ available toolchains: ...
310+ ...
311+ ```
312+
313+ `PrgEnv-gnu` is an HPE Cray PE module that may look like a toolchain - it certainly has
314+ the same function: provide compiler, MPI and basic math libraries - but it is not
315+ recognised as a toolchain by EasyBuild. EasyBuild prefers to manage its own modules so that it knows
316+ exactly what is in them. That is not the case with the `PrgEnv-*` modules from the Cray PE,
317+ as their content may differ between systems and the versions of the compilers etc. that
318+ are loaded depend on other modules that are loaded. Hence we created Cray-specific toolchains.
319+ You'll actually find two series of Cray toolchains in the list of available toolchains.
320+
321+ A more readable list of toolchains supported by EasyBuild can be generated using
322+
323+ ```shell
324+ eb --list-toolchains
325+ ```
326+
327+ The `CrayGNU`, `CrayIntel`, `CrayPGI` and `CrayCCE` toolchains are included with the EasyBuild distribution
328+ and were developed by CSCS for their systems using Environment Modules. These were not compatible
329+ with the initial releases of the Cray PE with Lmod modules, so new ones were developed, which we
330+ also built upon for the LUMI toolchains. Those are called `cpeCray`, `cpeGNU`, `cpeAOCC` and `cpeAMD`
331+ and are maintained by LUST and available via the LUMI repositories.
332+
333+ Note: Depending on how you use EasyBuild you may now first run into the problem of Exercise T.2 or
334+ first run into the problem covered by Exercise T.3.
335+
336+
337+ ---
338+
339+ ***Exercise T.2**** - Sources*
340+
341+ After fixing the problem with the name of the toolchain, try running `eb` again.
342+
343+ What's wrong now? How can you fix it quickly?
344+
294345Can you fix the problem you run into, perhaps without even changing
295346the easyconfig file?
296347
@@ -323,6 +374,9 @@ the easyconfig file?
323374 mv subread-2.0.1-source.tar.gz $EBU_USER_PREFIX/sources/s/Subread/
324375 ```
325376
377+ (assuming you have set `EBU_USER_PREFIX`, otherwise replace `$EBU_USER_PREFIX` with
378+ `$HOME/EasyBuild`).
379+
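The choice between the two prefixes can be sketched in a few lines of Python. This is only an illustration: the helper name `user_sources_dir` is hypothetical, and the directory layout (`sources/<first letter>/<name>`) is taken from the `mv` command above rather than from any EasyBuild API.

```python
import os


def user_sources_dir(name, environ=None):
    """Return the directory where the source file for `name` is expected.

    Uses $EBU_USER_PREFIX when it is set and falls back to $HOME/EasyBuild
    otherwise, mirroring the remark above (hypothetical helper).
    """
    environ = os.environ if environ is None else environ
    prefix = environ.get("EBU_USER_PREFIX") or os.path.join(
        environ.get("HOME", ""), "EasyBuild"
    )
    # Layout as in the mv command above: sources/<first letter>/<name>
    return os.path.join(prefix, "sources", name[0].lower(), name)


print(user_sources_dir("Subread"))
```

Passing an explicit mapping instead of relying on `os.environ` makes it easy to check both cases without touching the shell environment.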
326380 Or, we can change the easyconfig file to specify the location where
327381 the source file can be downloaded from:
328382 ```python
@@ -344,7 +398,7 @@ the easyconfig file?
344398
345399---
346400
347- ***Exercise T.2**** - Toolchain*
401+ ***Exercise T.3**** - Toolchain revisited*
348402
349403After fixing the problem with the missing source file, try the installation again.
350404
@@ -357,41 +411,18 @@ the toolchain here...
357411
358412??? success "(click to show solution)"
359413
360- The installation fails because the easyconfig specifies that GCC 8.5.0
414+ The installation fails because the easyconfig specifies that `cpeGNU/21.10`
361415 should be used as toolchain:
362416
363- ```
364- $ eb subread.eb
365- ...
366- ERROR: Failed to process easyconfig /pfs/lustrep3/users/kurtlust/easybuild-tutorial/Troubleshooting/subread.eb: Toolchain PrgEnv-gnu not found,
367- available toolchains: ...
368- ...
369- ```
370-
371- `PrgEnv-gnu` is an HPE Cray PE module that may look like a toolchain - it certainly has
372- the same function: provide compiler, MPI and basic math libraries - but it is not
373- recognised as a toolchain by EasyBuild. EasyBuild prefers to manage its own modules so that it knows
374- well what is in it which is not the case with the `PrgEnv-*` modules from the Cray PE
375- as the content may differe between systems and as the versions of the compilers etc. that
376- are loaded differ on other modules that are loaded. Hence we created Cray-specific toolchains.
377- You'll actually find two series of Cray toolchains in the list of available toolchains. The
378- `CrayGNU`, `CrayIntel`, `CrayPGI` and `CrayCCE` are included with the EasyBuild distribution
379- and where developed by CSCS for their systems using Environment Modules. These were not compatible
380- with the initial releases of the Cray PE with Lmod modules so new ones were developed on which we
381- also built for the LUMI toolchains. Those are called `cpeCray`, `cpeGNU`, `cpeAOCC` and `cpeAMD`
382- and are maintained by LUST and available via the LUMI repositories.
383-
384- Changing the toolchain name to `cpeGNU` is not enough to solve all problems though:
385-
386- ```
417+ ```shell
387418 $ eb subread.eb
388419 ...
389420 ERROR: Build of /pfs/lustrep3/users/kurtlust/easybuild-tutorial/Troubleshooting/subread.eb failed (err: 'build failed (first 300 chars):
390421 No module found for toolchain: cpeGNU/21.10')
391422 ...
392423 ```
393424
394- We don't have this `cpeGNU` version installed, but we do have GCC 21.12:
425+ We don't have this `cpeGNU` version installed, but we do have `cpeGNU/21.12`:
395426
396427 ```shell
397428 $ module avail cpeGNU/
@@ -407,11 +438,12 @@ the toolchain here...
407438 ```python
408439 toolchain = {'name': 'cpeGNU', 'version': '21.12'}
409440 ```
441+
410442---
411443
412- ***Exercise T.3**** - Build step*
444+ ***Exercise T.4**** - Build step*
413445
414- With the first two problems fixed, now we can actually try to build the software.
446+ With the first three problems fixed, now we can actually try to build the software.
415447
416448Can you fix the next problem you run into?
417449
@@ -489,9 +521,10 @@ Can you fix the next problem you run into?
489521 defaults in EasyBuild, settings in the EasyBuild configuration and settings in the
490522 easyconfig file that we shall discuss later).
491523
524+
492525---
493526
494- ***Exercise T.4**** - Sanity check*
527+ ***Exercise T.5**** - Sanity check*
495528
496529After fixing the compilation issue, you're really close to getting the installation working, we promise!
497530
@@ -529,7 +562,12 @@ Don't give up now, try one last time and fix the last problem that occurs...
529562
530563---
531564
532- In the end, you should be able to install Subread 2.0.1 with the cpeGNU 21.12 toolchain by fixing the problems with the ` subread.eb ` easyconfig file.
565+ ---
566+
567+ ***Exercise T.6**** - Post-install check of the log file*
568+
569+ In the end, you should be able to install Subread 2.0.1 with the cpeGNU 21.12 toolchain by
570+ fixing the problems with the `subread.eb` easyconfig file.
533571
534572Check your work by manually loading the module and checking the version
535573via the `featureCounts` command, which should look like this:
@@ -541,6 +579,182 @@ $ featureCounts -v
541579featureCounts v2.0.1
542580```
543581
582+ So all is well now, or is it?
583+
584+ Unfortunately we don't have a complete log file of the last build (at least if you only re-installed
585+ the module) as most of the steps were skipped in the last build.
586+
587+ Let's do the build again and check the full log file, just to be sure. But we'll first need to
588+ clean up a bit as EasyBuild doesn't like to build in a shell in which the modules which are
589+ used for the build are already loaded:
590+
591+ ``` shell
592+ module unload Subread cpeGNU
593+ ```
594+
595+ Now look at the output of an extended dry run and then rebuild to have a full log file so
596+ that we can inspect whether EasyBuild really did what we expected:
597+
598+ ``` shell
599+ eb subread.eb -x
600+ eb subread.eb -f
601+ ```
602+
603+ (The `-f` option on the last line forces a rebuild.)
604+
605+ Now go to the `$EBU_USER_PREFIX/SW/LUMI-21.12/L/Subread/2.0.1-cpeGNU-21.12/easybuild` directory
606+ (or `$HOME/EasyBuild/SW/LUMI-21.12/L/Subread/2.0.1-cpeGNU-21.12/easybuild`, depending on your configuration) and open
607+ the log file in your favourite editor. Search for the build step by searching for the string
608+ `INFO Starting build` and look carefully at how the program was actually built...
609+
610+ You'll very likely have to look at the solution to understand how to correct the
611+ problems, as that requires more advanced knowledge than we have at this point in
612+ the tutorial, but try to figure out what could be wrong first...
613+
614+ ??? hint "(Click for a hint)"
615+ Check the compiler that has been used and the compiler flags. Are these really
616+ what you would like to see and what you would expect from running ` eb subread.eb -x `
617+ as we did before?
618+
619+
620+ ??? success "(click to show solution)"
621+ According to the output of `eb subread.eb -x`, the build should be done using
622+ `cc` as the compiler, as that is the value assigned to the `CC` environment variable, which
623+ by convention points to the C compiler. Moreover, EasyBuild sets `CFLAGS` to
624+ `-O2 -ftree-vectorize -fno-math-errno`, and then the `make` command line adds
625+ `-fcommon` to those flags.
626+
627+ However, this is not what we see in the build log. It turns out that Subread
628+ is one of those horror packages that follows no established convention for
629+ build procedures.
630+
631+ One of the first lines we
632+ run into (yours may differ since this is a parallel build) is
633+
634+ ```
635+ gcc -mtune=core2 -O3 -DMAKE_FOR_EXON -D MAKE_STANDALONE -D SUBREAD_VERSION=\""2.0.1"\" -D_FILE_OFFSET_BITS=64 -fmessage-length=0 -ggdb -O2 -ftree-vectorize -fno-math-errno -fcommon -I/opt/cray/pe/libsci/21.08.1.2/GNU/9.1/x86_64/include -c -o core.o core.c
636+ ```
637+
638+ The flags that we added via `CFLAGS` are in there but only after some other flags.
639+ The build process didn't pick up our C compiler either! And, o horror, it even hard-codes
640+ a processor architecture! `-mtune=core2` only affects tuning and not the instruction set, so the code
641+ still runs on other x86_64 processors, but it is tuned for an old Intel architecture instead of the
642+ one we build for (well, the code may not benefit at all from newer vectorisation instructions, but
643+ at least the compiler might do a better job optimising).
644+ Scrolling down a bit we see some lines that generate executables from a single
645+ C file and a list of already generated object files, and there we don't even see our
646+ compiler flags at all!
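Reading through a long build log by eye is error-prone, so a small script can help flag suspicious compile commands. The following is a minimal sketch, not an EasyBuild feature: the function name `suspect_compile_lines`, the list of compiler names and the assumption that compile commands appear in the log as lines starting with the compiler name are all ours. The sample line is a shortened version of the `gcc` line quoted above.

```python
# Compiler driver names to look for at the start of a log line (assumption).
C_COMPILERS = ("cc", "gcc", "clang", "icc")


def suspect_compile_lines(log_text, expected_compiler, required_flags):
    """Return (line, problems) pairs for compile commands that look off."""
    findings = []
    for line in log_text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] not in C_COMPILERS:
            continue  # not a compile or link command
        problems = []
        if tokens[0] != expected_compiler:
            problems.append(f"compiler is {tokens[0]}, expected {expected_compiler}")
        missing = [flag for flag in required_flags if flag not in tokens]
        if missing:
            problems.append("missing flags: " + " ".join(missing))
        hardcoded = [t for t in tokens if t.startswith(("-mtune=", "-march="))]
        if hardcoded:
            problems.append("hard-coded architecture flags: " + " ".join(hardcoded))
        if problems:
            findings.append((line, problems))
    return findings


# Shortened version of the compile line from the build log shown above.
sample_log = (
    "gcc -mtune=core2 -O3 -DMAKE_FOR_EXON -D MAKE_STANDALONE "
    "-fmessage-length=0 -ggdb -O2 -ftree-vectorize -fno-math-errno "
    "-fcommon -c -o core.o core.c\n"
)
expected_flags = ["-O2", "-ftree-vectorize", "-fno-math-errno", "-fcommon"]
for line, problems in suspect_compile_lines(sample_log, "cc", expected_flags):
    print(line)
    for problem in problems:
        print("  ->", problem)
```

To use it on a real log, read the log file into a string and pass the compiler and flags reported by `eb subread.eb -x`.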
647+
648+ The problem is truly in the makefiles of Subread. We could now untar the source file
649+ that was saved by EasyBuild in a temporary work directory and inspect the sources, or we could
650+ retry the build and stop after the build step. Let's take the latter option. The command to
651+ do this is
652+
653+ ```
654+ eb subread.eb -f --stop build
655+ ```
656+
657+ We'll need to search for the build directory now as it is not printed when EasyBuild stops in
658+ a regular way.
659+
660+ ```
661+ pushd $EASYBUILD_BUILDPATH/Subread/2.0.1/cpeGNU-21.12
662+ cd subread-2.0.1-source
663+ cd src
664+ ```
665+
666+ The EasyConfig uses the makefile `Makefile.Linux` so let's check that one. Some of the crucial
667+ lines are:
668+
669+ ```
670+ CC_EXEC = gcc
671+ OPT_LEVEL = 3
672+
673+ CCFLAGS = -mtune=core2 ${MACOS} -O${OPT_LEVEL} -DMAKE_FOR_EXON -D MAKE_STANDALONE -D SUBREAD_VERSION=\"${SUBREAD_VERSION}\" -D_FILE_OFFSET_BITS=64 ${WARNING_LEVEL}
674+ CC = ${CC_EXEC} ${CCFLAGS} -fmessage-length=0 -ggdb
675+ ```
676+
677+ We see several problems at once:
678+
679+ - `CC` is defined in the Makefile as the compiler command plus all compiler options, so we do
680+ not want to redefine it on the `make` command line. It turns out we need to redefine `CC_EXEC`
681+ instead to use a different compiler.
682+ - `CCFLAGS` includes several options that should enter through `CFLAGS` and should not be imposed in
683+ a proper build process. The most dangerous one is `-mtune=core2`, but in general we also prefer to
684+ leave the choice of the optimisation level to EasyBuild unless there are good reasons to use
685+ a very specific optimisation level.
686+ - One may wonder why at least some of the compiles did pick up `CFLAGS` then. This is because these
687+ files were compiled using an implicit rule that used the `CC` command as defined in `Makefile.Linux`
688+ so with a lot of compiler flags already added to it and then adds `CFLAGS` as defined on the `make`
689+ command line generated by EasyBuild. Those compile commands that were generated from an explicit rule
690+ don't pick up `CFLAGS` though.
691+
692+ There are two ways to fix this in EasyBuild (besides teaching the developer of this software package how
693+ to write a proper Makefile following the usual conventions).
694+
695+ 1. The approach which is usually followed is to make a patch file for `Makefile.Linux` that changes the line
696+
697+ ```
698+ CCFLAGS = -mtune=core2 ${MACOS} -O${OPT_LEVEL} -DMAKE_FOR_EXON -D MAKE_STANDALONE -D SUBREAD_VERSION=\"${SUBREAD_VERSION}\" -D_FILE_OFFSET_BITS=64 ${WARNING_LEVEL}
699+ ```
700+
701+ to, e.g.,
702+
703+ ```
704+ CCFLAGS = ${CFLAGS} -DMAKE_FOR_EXON -D MAKE_STANDALONE -D SUBREAD_VERSION=\"${SUBREAD_VERSION}\" -D_FILE_OFFSET_BITS=64 ${WARNING_LEVEL}
705+ ```
706+
707+ combined with changing the `buildopts` line to also overwrite `CC_EXEC`:
708+
709+ ```
710+ buildopts = '-f Makefile.Linux CC_EXEC="$CC" CFLAGS="$CFLAGS -fcommon"'
711+ ```
712+
713+ (or you could also change the `CC_EXEC` line in `Makefile.Linux` with the same patch to use the `cc` command,
714+ but that would also make the patch file Cray-only.)
715+
716+ 2. The other option is to simply edit `Makefile.Linux` using `sed` to replace
717+ `-mtune=core2 ${MACOS} -O${OPT_LEVEL}` with
718+ `${CFLAGS}`. This can be done by executing a `sed` command before calling `make`.
719+ As we shall see later in this tutorial, this can be done with `prebuildopts`:
720+
721+ ```python
722+ prebuildopts = "sed -e 's/-mtune=core2 ${MACOS} -O${OPT_LEVEL}/${CFLAGS}/' -i Makefile.Linux && "
723+ ```
724+
725+ and as in the previous case we also still need to overwrite `CC_EXEC` with the
726+ correct compiler on the `make` command line:
727+
728+ ```
729+ buildopts = '-f Makefile.Linux CC_EXEC="$CC" CFLAGS="$CFLAGS -fcommon"'
730+ ```
731+
732+ Now check the output of `eb subread.eb -x` to see what will happen during the build phase.
733+
734+ Let's implement the second approach, then do a full rebuild:
735+
736+ ```shell
737+ eb subread.eb -f
738+ ```
739+
740+ and then open the log file (again in the `easybuild` subdirectory of the software installation
741+ directory) and check what happened now during the build step.
742+
743+ As we scroll through the output of the build step, we still see a few lines mentioning
744+ `gcc`... It turns out there is a second Makefile hidden in the subdirectory `longread-one` so we
745+ need to edit that one too... So following the second approach we can do this with
746+
747+ ```python
748+ prebuildopts = "sed -e 's/-mtune=core2 ${MACOS} -O${OPT_LEVEL}/${CFLAGS}/' -i Makefile.Linux && "
749+ prebuildopts += "sed -e 's/-mtune=core2 ${MACOS} -O${OPT_LEVEL}/${CFLAGS}/' -i longread-one/Makefile && "
750+ ```
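Before rebuilding, it can be reassuring to preview what the substitution does to the `CCFLAGS` line. The sketch below mimics the `sed` expression in plain Python; it is a stand-in for illustration only (the build itself still uses `sed`), the function name `patch_ccflags` is hypothetical, and the sample text is an abbreviated copy of the `Makefile.Linux` lines quoted earlier.

```python
# The literal text replaced by the sed expression, and its replacement.
OLD = "-mtune=core2 ${MACOS} -O${OPT_LEVEL}"
NEW = "${CFLAGS}"


def patch_ccflags(makefile_text):
    """Replace the first occurrence of OLD on every line, like sed's s/OLD/NEW/."""
    return "\n".join(
        line.replace(OLD, NEW, 1) for line in makefile_text.split("\n")
    )


# Abbreviated copy of the relevant lines from Makefile.Linux.
before = (
    "CC_EXEC = gcc\n"
    "OPT_LEVEL = 3\n"
    "CCFLAGS = -mtune=core2 ${MACOS} -O${OPT_LEVEL} -DMAKE_FOR_EXON -D MAKE_STANDALONE\n"
)
print(patch_ccflags(before))
```

Only the `CCFLAGS` line changes. `CC_EXEC` is left untouched, which is why it still has to be overridden on the `make` command line through `buildopts`.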
751+
752+ Now we can build once more and check the log file and finally we can be satisfied...
753+
754+ This exercise also shows how tedious developing an easyconfig can be. And it also shows mistakes that
755+ are sometimes overlooked in easyconfigs that come with EasyBuild.
756+
757+
544758---
545759
546760*[[next: Creating easyconfig files]](2_02_creating_easyconfig_files.md)*