
Experimental platforms

The ARC provides a few experimental platforms for computing research.  This guide provides user information regarding the following platforms:

  • phileas -- a SandyBridge-EP server with an Intel Phi 5110P coprocessor

 


Phileas

Phileas is an ARC experimental server featuring an Intel Phi coprocessor.  The server is a dual-socket SandyBridge-EP E5-2650 with a total of 16 physical cores (32 logical cores with Hyper-Threading) and 64 GB of memory, and runs Red Hat Enterprise Linux 6.4.  Phileas has its own file system; the common file system used on the production ARC clusters is not mounted.

The Intel Phi is a coprocessor which sits on a separate card and can be regarded as an independent Linux node to which users can ssh.  Phileas hosts a 5110P card, with 60 physical cores at 1.053 GHz, featuring 4 threads each (so a total of 240 logical cores) and 8 GB of memory.

The Phi coprocessor is best used as an accelerator for highly parallel applications.  It runs its own Linux operating system and can be used in several ways.  Users can

  • compile their C/C++ or Fortran OpenMP or MPI parallel applications to target the coprocessor and run them directly on the coprocessor,
  • run MPI applications using the host and the Phi coprocessor concurrently (with the MPI work load carefully balanced for optimal performance) and
  • run an application on the host and offload parts of the computation to the coprocessor using OpenCL or OpenMP.

In what follows, we give basic instructions for the first and second options, i.e. on launching OpenMP and MPI applications to run on the host and coprocessor.  Offloading parts of the work using the OpenMP 4.0 directives is supported by the Intel compilers version 14, which are available under the module "intel-compilers/2014".

Running OpenMP applications

Running OpenMP threaded applications on the Phi coprocessor is relatively straightforward.

Start by compiling the code on the host.  Load the module for the Intel compilers

module load intel-compilers/2013

Then, issue the command

icc -mmic -openmp -o myApp.mic myApp.c

which builds the target myApp.mic from the C source myApp.c.  Load the module for micnativeloadex with the command

module load mic-nativeloadex/2013

and launch the application on the host to run on the coprocessor with the command

micnativeloadex myApp.mic

Launching this way automatically uses all the threads available on the card (236 threads in the case of Phileas).  This behaviour can be changed via the environment variable OMP_NUM_THREADS, passed to the card using the command

micnativeloadex myApp.mic -e "OMP_NUM_THREADS=120"

If the application needs command line parameters, these can be passed using the -a "args" flag to micnativeloadex.  For further help, run

micnativeloadex --help
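
The thread counts used in this section can be worked out with a few lines of shell.  The following is only a sketch: the core and thread figures come from this guide, while the single reserved core is an assumption chosen because it reproduces the 236 threads quoted above.  The helper names usable_threads and launch_cmd are hypothetical, and the launch command is printed rather than executed.

```shell
#!/bin/bash
# Card figures from this guide: 60 cores with 4 hardware threads each.
PHI_CORES=60
PHI_THREADS_PER_CORE=4
# Assumption: one core is left to the card's system software, which
# reproduces the 236 threads quoted in this guide.
PHI_RESERVED_CORES=1

# usable_threads [PER_CORE]: thread count when PER_CORE hardware threads
# (default: all 4) are used on every usable core.
usable_threads() {
    local per_core=${1:-$PHI_THREADS_PER_CORE}
    echo $(( (PHI_CORES - PHI_RESERVED_CORES) * per_core ))
}

# launch_cmd BINARY THREADS: print, without running it, the launch command.
launch_cmd() {
    printf 'micnativeloadex %s -e "OMP_NUM_THREADS=%s"\n' "$1" "$2"
}

# Example: two threads per usable core.
launch_cmd myApp.mic "$(usable_threads 2)"
```

Varying the number of threads per core in this way is a simple starting point for finding the thread count that suits a given application.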

 

Running MPI applications

This section contains instructions on how to compile and run MPI applications on Phileas.

As MPI implementations rely on ssh to establish communication with hosts and launch remotely, users need an account on both the host and the Phi card in order to run MPI applications.  (This is in contrast with running OpenMP code on the card, which uses a different mechanism, and does not require an extra account.)

Using MPI, an application can run on the host, on the coprocessor, or on both at the same time.  Separate binaries have to be built from the same code, one targeting the host and one targeting the coprocessor.

First, compile the source myApp.c to target the coprocessor by loading the appropriate modules, compiling and uploading the binary to the coprocessor.  The sequence of commands is

module load mic-compilers/2013 mic-mpi/2013
mpicc -mmic -o myApp.mic myApp.c
scp myApp.mic mic0:

Second, the same source myApp.c is compiled to target the host.  This is done using the appropriate compilers and MPI library for the host.  Following the sequence of commands above (targeting the coprocessor), this can be achieved by the following

module swap mic-compilers/2013 intel-compilers/2013
module swap mic-mpi/2013 intel-mpi/2013
mpicc -o myApp.host myApp.c
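
The two build sequences above can be collected into one shell function.  This is a sketch rather than a supported script: the names run, build_both and DRY_RUN are hypothetical, and the function is assumed to be sourced into a login shell in which the module command is defined.  With DRY_RUN=1 the commands are only printed, which makes it possible to check the sequence before running it.

```shell
#!/bin/bash
# run CMD...: echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# build_both SRC: build SRC for the coprocessor, upload the binary,
# then build the same source for the host. Stops at the first failure.
build_both() {
    local src=$1
    local base=${src%.c}
    run module load mic-compilers/2013 mic-mpi/2013 &&
    run mpicc -mmic -o "$base.mic" "$src" &&
    run scp "$base.mic" mic0: &&
    run module swap mic-compilers/2013 intel-compilers/2013 &&
    run module swap mic-mpi/2013 intel-mpi/2013 &&
    run mpicc -o "$base.host" "$src"
}
```

For example, setting DRY_RUN=1 and calling build_both myApp.c prints the six commands listed above in order.  The function finishes with the host modules loaded, which is the state required for launching.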

Finally, provided the modules intel-compilers/2013 and intel-mpi/2013 are loaded, users can launch the application in three ways by issuing commands on the host:

  • to run the application on the host, do
mpirun -n 2 ./myApp.host
  • to run the application on the coprocessor, do
mpirun -n 2 -host mic0 $HOME/myApp.mic
  • to run the application on both the host and the coprocessor, do
mpirun -n 2 -host localhost ./myApp.host : -n 2 -host mic0 $HOME/myApp.mic

In addition to the above, there is a fourth possibility: launching the application on the coprocessor directly from the coprocessor.  This involves logging in to the coprocessor "Linux node" with ssh, and the commands are

ssh mic0
mpiexec -n 2 ./myApp.mic
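
The three launch commands issued from the host differ only in how the ranks are split between the host and the coprocessor.  The sketch below makes that split explicit; mpirun_cmd is a hypothetical helper that prints the command line instead of running it, and the binary names and the mic0 host name are the ones used in this guide.

```shell
#!/bin/bash
# mpirun_cmd HOST_RANKS MIC_RANKS: print, without running it, the mpirun
# line that places HOST_RANKS ranks on the host and MIC_RANKS ranks on
# the coprocessor. A count of 0 leaves that side out.
mpirun_cmd() {
    local host_ranks=$1 mic_ranks=$2
    # \$HOME is kept literal here, as in the commands shown in this guide.
    local mic_part="-n $mic_ranks -host mic0 \$HOME/myApp.mic"
    if [ "$host_ranks" -gt 0 ] && [ "$mic_ranks" -gt 0 ]; then
        echo "mpirun -n $host_ranks -host localhost ./myApp.host : $mic_part"
    elif [ "$host_ranks" -gt 0 ]; then
        echo "mpirun -n $host_ranks ./myApp.host"
    elif [ "$mic_ranks" -gt 0 ]; then
        echo "mpirun $mic_part"
    else
        echo "mpirun_cmd: need at least one rank" >&2
        return 1
    fi
}

# Example: 4 ranks on the host and 8 on the coprocessor.
mpirun_cmd 4 8
```

Changing the two rank counts is the simplest way to experiment with the balance of work between the host and the coprocessor.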

 

Debugging

There are three ways to debug code on the Xeon Phi:
  • Debug offloaded computation by doing the following
    • insert an infinite loop at the beginning of the offloaded computation;
    • start gdbserver on the coprocessor;
    • start debugging on the host and
    • when the code reaches the infinite loop on the coprocessor, attach remotely to the debugger server running on the coprocessor.
  • Debug natively on the Phi by running the debugger on the coprocessor.
  • Debug native code from the host by doing the following:
    • upload native code to the Phi
    • start gdbserver on the Phi
    • connect gdb (running on the host)  to gdbserver (running on Phi)
    • use gdb to debug the code.
The easiest way of debugging is the third.  Here is a breakdown of operations and specific commands necessary to debug Phi native code from the host:
  • as root, upload the necessary libraries to the coprocessor
    sudo scp /opt/intel/composerxe/composer_xe_2013.3.163/compiler/lib/mic/libiomp5.so root@mic0:/lib64
    sudo scp /usr/linux-k1om-4.7/linux-k1om/usr/bin/gdbserver root@mic0:/usr/bin
  • upload the native code (the executable is assumed to be program) to the coprocessor
    scp program user@mic0:~
  • start gdbserver on the coprocessor
    ssh -t mic0 gdbserver :2001 program
  • start the debugger on the host and begin debugging
    user@host$ /usr/linux-k1om-4.7/bin/x86_64-k1om-linux-gdb
    (gdb) target extended-remote mic0:2001
    (gdb) file /home/user/program
    (gdb) run
    ...
    
Notes:
  • Intel provides users with its own gdb debugger binary x86_64-k1om-linux-gdb for the host and gdbserver for the coprocessor;
  • argument :2001 specifies the port on which the server is listening;
  • libraries uploaded as an ordinary user (rather than as root into /lib64) must be placed in a directory listed in LD_LIBRARY_PATH on the coprocessor.
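
The last note can be illustrated with a short sketch.  It assumes the libraries were uploaded to a directory such as $HOME/lib on the card, which is a hypothetical location, and that the card is reachable as mic0, as in the commands of this guide.  The helper gdbserver_cmd is also hypothetical; it prints the command that starts gdbserver with the library path set, without running it.

```shell
#!/bin/bash
# gdbserver_cmd LIBDIR PORT PROGRAM: print, without running it, the command
# that starts gdbserver on mic0 with LD_LIBRARY_PATH set to LIBDIR.
# The remote command is single-quoted so that variables such as $HOME
# are expanded on the card and not on the host.
gdbserver_cmd() {
    local libdir=$1 port=$2 program=$3
    printf "ssh -t mic0 'LD_LIBRARY_PATH=%s gdbserver :%s %s'\n" \
        "$libdir" "$port" "$program"
}

# '$HOME/lib' is a hypothetical directory holding user-uploaded libraries.
gdbserver_cmd '$HOME/lib' 2001 program
```

The variable is set only for the gdbserver process and the program it starts, so other sessions on the card are unaffected.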