The configuration has a frontal node, named irma-atlas, and 4 compute nodes.

Frontal node: irma-atlas:

* 64 cores on 4 sockets (AMD Opteron 6386 SE, 2.8 GHz),
* 512 GB of RAM,
* 2.4 TB of SSD storage (directory: /ssd),
* NFS mounts to access laboratory data (such as /home).

Compute nodes (x4):

* 24 cores on 2 sockets (Intel Xeon E5-2680 v3, 2.50 GHz), hyperthreaded,
* 256 GB of RAM,
* 1 TB scratch directory (/scratch).

The workload manager is [slurm](https://computing.llnl.gov/linux/slurm/).

#### Storage

On the frontal node irma-atlas, you have access to several storage areas:

* Your home directory: /home/<username>. This directory is meant only to store important files and has to be kept to a minimal size.
* The /data/<username> directory. If this directory does not exist, you must create it, so you don't mix your files with those of other users. This partition has a size of 50 TB, so you can store big data there: simulation results, compilation-related files, libraries, etc.
* The /ssd/<username> directory. If this directory does not exist, you must create it, so you don't mix your files with those of other users. This partition has a size of 2 TB and sits on SSDs for increased access speed. You can use it to store medium-sized data.
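For example, you can create your personal directories in one go (using `$USER` instead of typing your username):

```
mkdir -p /data/$USER /ssd/$USER
```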
#### Module

The `module` command is installed on the cluster. It allows you to use specific software on demand.

The directory for custom installation is `/data/software`.

To be able to use module, you first have to source the base config file:

```
source /etc/profile.d/modules.sh
```

This can be done, for example, in the config file of your shell (e.g. `~/.bashrc` for bash or `~/.zshrc` for zsh).

Here is a sample code that you can put in your `.bashrc` or `.zshrc` to get the module command and all the available modules on atlas:
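A minimal sketch of such a snippet (the exact `module use` path under `/data/software` is an assumption, adapt it to the actual layout):

```sh
# Get the module command if it is available on this machine
if [ -f /etc/profile.d/modules.sh ]; then
    source /etc/profile.d/modules.sh
    # Make the custom installations in /data/software visible to module
    # (the modulefiles subdirectory name is an assumption)
    module use /data/software/modulefiles
fi
```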
You can then:

* List the available modules with:

```
module avail
```

* List the currently loaded modules with:

```
module list
```

* Load a new module:

```
module load <modulename>
```

* Unload a module:

```
module unload <modulename>
```

When using module, there are two kinds of modules available:

* Single modules: modules that load the environment for a specific library
* Profile modules: meta-modules that load other modules
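Putting it together, a typical session could look like this (the module name is the Feel++ one used later on this page):

```
module avail                          # see what is installed
module load science/feelpp/nightly    # load one of the listed modules
module list                           # check that it is loaded
module unload science/feelpp/nightly  # unload it again
```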
In case you encounter problems, please ask the persons in charge, indicated when you load the corresponding module.

Feel++ is installed as a library available through the module system. This is the recommended solution if you do not need to modify the library.
* After the module configuration in your shell config file, add:

```sh
module load latest.testing.profile
module load science/feelpp/nightly
export CC=/usr/bin/clang-3.7
export CXX=/usr/bin/clang++-3.7
```
* The first line loads the meta-module that contains all the Feel++ dependencies.
* The second loads the library itself.
* The last two make you use the same compiler as the one the library was built with.
* Your root `CMakeLists.txt` has to be as follows, assuming you are trying to compile `yourCode.cpp`:
```sh
cmake_minimum_required(VERSION 2.8)
# A minimal sketch; Feel++ is assumed to ship a CMake package exposing
# FEELPP_LIBRARIES (check the Feel++ documentation for the exact names):
project(myApp)
find_package(Feel++ REQUIRED)
add_executable(yourCode yourCode.cpp)
target_link_libraries(yourCode ${FEELPP_LIBRARIES})
```
By default, OpenMPI will use the best network available, i.e. InfiniBand on the cluster.
However, if you want to use TCP, please refer to [tcp params](http://icl.cs.utk.edu/open-mpi/faq/?category=tcp#tcp-params).

If you want to use Ethernet instead of InfiniBand, just add the following option to the mpirun command:

```
mpirun -mca btl tcp,self -np X ...
```
There are two partitions on which you can submit jobs on irma-atlas (you can list them with `sinfo`).

You can access the nodes with ssh, as long as you have a job running on that node.

If you use `sbatch` and then ssh, you will be disconnected from the node when the job ends.

If you want to keep access to a node for a certain period of time, you can allocate a job and then connect to the node. To do so, you can use the `salloc` command, e.g.:
```
# Here you allocate a job with the following constraints:
# -t "02:00:00": the job will remain active for 2 hours
# -p K80: submit to the K80 partition
# -w irma-atlas4: run on the node irma-atlas4
# --exclusive: reserve the node exclusively for this job
salloc -t "02:00:00" -p K80 -w irma-atlas4 --exclusive
```
> **IMPORTANT NOTE:** Please be reasonable with your use of `--exclusive` and `-t "XX:YY:ZZ"`, as they could prevent other users from accessing the node. You can cancel a job with `scancel`.
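To release a node before the time limit, look up the job id and cancel it (the id below is an example):

```
squeue -u $USER   # list your running jobs and their ids
scancel 12345     # cancel the job with that id
```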
### Basic slurm script with an MPI application
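A minimal sketch of such a script (job name, resource numbers, and program are placeholders; the module lines mirror the Feel++ setup above):

```sh
#!/bin/bash
#SBATCH -J myjob                # job name (placeholder)
#SBATCH -N 2                    # number of nodes
#SBATCH --ntasks-per-node=24    # one MPI task per core on a compute node
#SBATCH -t 01:00:00             # time limit of one hour
#SBATCH -o myjob.%j.out         # output file (%j is the job id)

# Get the module command and load the environment
source /etc/profile.d/modules.sh
module load latest.testing.profile
module load science/feelpp/nightly

# slurm tells mpirun how many processes to start
mpirun ./yourCode
```

Submit it with `sbatch myscript.sh` and follow its state with `squeue`.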
Note: Be careful, you can't use `nohup` in your ssh session. Why? When you exit your ssh session, the processes started in it are killed.

Screen is a window manager that lets you create virtual terminals attached to several processes. (See the manual for full details.)
Run a program in the background:

```
screen      # open a shell in a virtual window
screen -d   # detach your virtual terminal
```

You can now exit from your ssh session.

To recover your virtual terminal, use:

```
screen -r
```
To list all available windows, type:

```
screen -ls
```
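If you have several detached sessions, naming them makes reattaching easier (the session name is just an example):

```
screen -S mysim   # start a session named "mysim"
screen -r mysim   # reattach to it by name
```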
There are many shortcuts you can use. Some of them are summarized here:

```
<ctrl+a> <?>  : Display command information
<ctrl+a> <:>  : Enter screen's command prompt
<ctrl+a> <X>  : Close the current region
<ctrl+a> <d>  : Detach from the current screen session
```
If you want to make your screen more user-friendly, you can customize it so that the bottom status line displays all the terminals opened in screen and the currently active one. There are some configuration examples at the following link: [.screenrc examples](https://bbs.archlinux.org/viewtopic.php?id=55618)

See the manual for other features.
### My code runs slower on a computing server than on my laptop. What is the problem?
You might experience a drop in performance when moving to a larger computer, for example from a laptop. The most common way to solve this is to export the following variable:
```
export OMP_NUM_THREADS=1
```
This is due to an underlying library using the maximum number of available threads to execute code, which causes the performance drop. With the variable above set, each process is allowed only one thread, restoring the expected performance.
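With Open MPI you can also pass the variable to all MPI processes directly on the command line (the program name and process count are placeholders):

```
mpirun -x OMP_NUM_THREADS=1 -np 24 ./yourCode
```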
Other factors that might harm performance:

* Slower hard drives
* Slower CPUs / Hyperthreading

### CMake
* You end up with errors similar to this one at the end of the cmake step:

```
CMake Warning at feel/CMakeLists.txt:59 (add_library):
  Cannot generate a safe runtime search path for target feelpp because files
```