Quick Start Guide

This page contains the information you need to start using ARC systems. Advanced Research Computing (ARC) is a central IT resource, which provides a High Performance Computing (HPC) service to
all Oxford University researchers. At the heart of the ARC service are our HPC compute clusters and high-performance data storage.

Clusters

ARC operates two compute clusters - arcus-htc, our High Throughput cluster, and arcus-b, our parallel workloads system. The main differences between them are:

arcus-htc

  Description: Optimised for single-core jobs and SMP jobs up to one node in size. Also caters for jobs requiring resources other than CPU cores (e.g. GPUs).
  Node details: CPU: Sandy Bridge (16 cores). GPU: K40, K80, P100, V100. Novel architectures: KNL, Titan RTX GPUs.
  Min. job size: 1 core
  Notes: Nodes on arcus-htc are not allocated exclusively to jobs; jobs on arcus-htc are allocated the requested number of cores and share nodes with other jobs. The default number of cores allocated is 1. If you want to use a node in exclusive mode, add "--exclusive" to your sbatch command or the line "#SBATCH --exclusive" to your submit script.

arcus-b

  Description: Our largest compute cluster. Optimised for large parallel jobs spanning multiple nodes. Offers low-latency interconnect.
  Node details: CPU: Haswell (16 cores), Broadwell (20 cores). Higher-memory nodes (128 GB, 256 GB or 1.5 TB) available.
  Min. job size: 16 cores (1 node)
  Notes: Nodes are allocated exclusively to jobs. The minimum chargeable allocation is 16 cores (1 node).

arcus-htc can be accessed via oscgate.arc.ox.ac.uk. 

Operating system

All ARC systems run the Linux operating system, which is commonly used in HPC: arcus-htc runs CentOS 7 and arcus-b runs CentOS 6. We do not have any clusters running Windows or macOS, so your software must work on Linux if you want to use ARC systems. If you are unfamiliar with using Linux, please consider:

  • Finding introduction to Linux resources online (through Google/Bing/Yahoo etc).
  • Working through our brief Introduction to Linux course.
  • Attending our Introduction to ARC training course (this does not teach you how to use Linux, but the examples will help you gain a greater understanding).

Connecting to ARC

Access to ARC systems (including arcus-htc and arcus-b) is via Secure SHell (SSH). 

How do I log in to my account?

Remote access: once you have your account, you need to access ARC systems remotely. Linux and Mac users (and users of other Unix or Unix-like systems) should use ssh (secure shell) to connect to the ARC systems from a terminal:

arcus-b: ssh -X username@arcus-b.arc.ox.ac.uk

arcus-htc: ssh -X username@oscgate.arc.ox.ac.uk followed by ssh -X arcus-htc
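The two-step login to arcus-htc can also be done in a single command using OpenSSH's jump-host option (-J, available in OpenSSH 7.3 and later). The sketch below wraps it in a small shell function. The name arc_htc is a hypothetical helper, and it assumes the gateway permits connections to be forwarded in this way, which this guide does not confirm.

```shell
# Hypothetical helper: log in to arcus-htc through the oscgate gateway in one step.
# Usage: arc_htc <your ARC username>
arc_htc() {
    # -X enables X11 forwarding, as in the commands above.
    # -J makes ssh connect to the gateway first (needs OpenSSH 7.3 or later).
    ssh -X -J "$1@oscgate.arc.ox.ac.uk" "$1@arcus-htc"
}
```

If the one-step route does not work from your machine, the two separate ssh commands remain the documented way in.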

Windows

Windows users will need to download and install an application that allows them to use ssh - a great example is MobaXterm.

Download MobaXterm here.


Please note: access to ARC machines is only possible from within the University of Oxford network. If you are not on the University network, you should use the University VPN service to connect. If you are unable to use the VPN service, you may be able to register your static IP address with the ARC team (support@arc.ox.ac.uk) to enable access.

How do I copy data to/from ARC?

MacOS (terminal) and Linux users can use scp (secure copy) or secure file transfer protocol (sftp).

Windows users can use MobaXterm.
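As a sketch of the scp route, the helper functions below copy a file or directory to and from arcus-b. The function names, the "username" account and any paths you pass in are placeholders, not ARC conventions.

```shell
# Placeholder account name; replace with your own ARC username.
ARC_USER="username"
ARC_HOST="arcus-b.arc.ox.ac.uk"

# Copy a local file or directory to a path on arcus-b.
# Usage: arc_put <local path> <remote path>
arc_put() {
    scp -r "$1" "${ARC_USER}@${ARC_HOST}:$2"
}

# Copy a file or directory from arcus-b to a local path.
# Usage: arc_get <remote path> <local path>
arc_get() {
    scp -r "${ARC_USER}@${ARC_HOST}:$1" "$2"
}
```

Remote paths that do not start with "/" are taken relative to your home directory on the cluster. For an interactive session, "sftp username@arcus-b.arc.ox.ac.uk" opens a prompt where the put and get commands do the same job.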

How do I change my password?

User account management is performed on myaccount.arc.ox.ac.uk (use ssh to connect). Passwords can be changed by running the "passwd" command from a terminal:

user@myaccount:~$ passwd
Enter login(LDAP) password:
Enter new passwd:
Re-enter new passwd:

If you need to change other things in your account (e.g. email address), this is possible - please ask the ARC team by emailing support@arc.ox.ac.uk.

 

What is a job?

A job is a piece of work, together with the resources it needs (such as cores and run time), that you ask an ARC system to run on your behalf.

Batch vs interactive jobs

An interactive job starts when you log on to the system and ends when you log out. During the run, you interact with the system and/or your software in real time, much like you would with your PC or laptop.

A batch job is a predefined group of processing actions that require no interaction between you and the system. When you submit a batch job, the job is added to a work queue and executed at a later time, when the system has the resources to process the
job.

SLURM - Queueing system on arcus-b and arcus-htc

Job scheduling and workload management on both ARC clusters is handled by SLURM (Simple Linux Utility for Resource Management).

How do I submit a batch job?

You need to write a shell script that contains instructions to SLURM (lines beginning "#SBATCH"), followed by the shell commands to be run in the job.

Edit your script on the login node. At the prompt, use a standard editor such as nano:

Prompt> nano job1.run

arcus-b

As an example - to request two compute nodes, running 16 processes per node (using MPI), with a two hour wall time, enter the following:

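#!/bin/bash
# Resource requests: 2 nodes, 16 MPI processes per node, 2 hours of wall time
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --time=02:00:00
#SBATCH --job-name=myjob

# Set up the MPI environment on arcus-b, then launch the program
. enable_arcus-b_mpi.sh
mpirun myprogram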

arcus-htc

As an example for arcus-htc, to request a single core for 10 minutes, with one task on the node, enter the following:

#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --job-name=single_core
#SBATCH --ntasks-per-node=1
#SBATCH --partition=htc

module purge
module load testapp/1.0

# Calculate number of primes from 2 to 10000
prime 2 10000
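For an SMP job on arcus-htc that needs several cores on one node, a variation on the script above might look like the sketch below. The core count, the job name and the program my_threaded_program are placeholders. The --cpus-per-task option and the SLURM_CPUS_PER_TASK variable are standard SLURM, but whether arcus-htc needs further options (such as a memory request) is not covered here. The script is written to a file with a here-document so that it can be pasted straight into a terminal.

```shell
# Create the submission script smp_job.run in the current directory.
cat > smp_job.run <<'EOF'
#!/bin/bash
#SBATCH --time=00:30:00
#SBATCH --job-name=smp_example
#SBATCH --partition=htc
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
# Remove one '#' from the next line to request the node in exclusive mode:
##SBATCH --exclusive

# Match the thread count to the number of cores SLURM allocated (4 here).
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# Placeholder for your own multithreaded executable.
./my_threaded_program
EOF

# Check the script for shell syntax errors before submitting it with sbatch.
bash -n smp_job.run && echo "smp_job.run looks OK"
```

Once the check passes, the script is submitted in the usual way with "sbatch smp_job.run".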

 

Basic SLURM Commands

  • sbatch: submit a job script (a text file) to the queue, for example "sbatch job1.run"
  • squeue: monitor the queue
  • scancel: cancel a job (for example, if you made a mistake)
  • srun: submit a job for execution in real time
  • sinfo: report the state of partitions and nodes managed by SLURM
  • sacct: report job accounting information for active or completed jobs
  • salloc: allocate resources in real time
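A typical sequence using these commands might look like the sketch below: submit a script, check the queue, then look at the accounting record. The name submit_and_check is a hypothetical helper; the --parsable option makes sbatch print just the job ID so that it can be reused in the later commands.

```shell
# Hypothetical helper: submit a script and report on the resulting job.
# Usage: submit_and_check job1.run
submit_and_check() {
    # --parsable prints "jobid" or "jobid;cluster"; stop if submission fails.
    jobid=$(sbatch --parsable "$1") || return 1
    jobid=${jobid%%;*}    # keep only the job ID
    echo "Submitted job $jobid"

    squeue -j "$jobid"    # current state of the job in the queue
    sacct -j "$jobid"     # accounting information recorded for the job
    echo "To cancel: scancel $jobid"
}
```

Immediately after submission the accounting record may still be sparse; running "sacct -j" with the same job ID again later shows the final state.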

ARC Accounts

All jobs (regardless of whether they are free or charged) consume credits. Credits are usually consumed at the project level. Use the command "mybalance" to check your credit level.

A job state of "JobHeldAdmin" usually means that your project has run out of credit.

Further help

Manual (man) pages are available for many Linux commands, e.g. "man sbatch". Quit the man page by pressing the 'q' key.

For support email support@arc.ox.ac.uk.

Disk space

You have a $HOME area with a 15GB quota; this is the directory you are in when you log in. You also have a $DATA area, which shares a 5TB quota with your project colleagues, and a per-user $SCRATCH area for temporary data and work files. As a rule, we recommend that you store your data in $DATA but work from $SCRATCH: copy all required data (including submission scripts) to $SCRATCH and run your jobs from there. For more details on where you can store files, please see our Storage page.
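The recommended pattern (store in $DATA, work in $SCRATCH) could be scripted roughly as follows. The directory myproject, its input subdirectory and the program myprogram are hypothetical, and using a per-job subdirectory under $SCRATCH is a design choice rather than an ARC requirement.

```shell
# Create the submission script scratch_job.run in the current directory.
cat > scratch_job.run <<'EOF'
#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --job-name=scratch_example
#SBATCH --partition=htc

# Stop at the first error so a failed copy does not lead to a wasted run.
set -e

# Work in a per-job directory under $SCRATCH.
workdir="$SCRATCH/job_$SLURM_JOB_ID"
mkdir -p "$workdir"

# Stage the inputs from $DATA, then run from scratch.
cp -r "$DATA/myproject/input" "$workdir/"
cd "$workdir"
"$DATA/myproject/myprogram" input > output.log

# Copy the results back to $DATA for safekeeping.
cp output.log "$DATA/myproject/"
EOF

# Check the script for shell syntax errors before submitting it with sbatch.
bash -n scratch_job.run && echo "scratch_job.run looks OK"
```

Keeping each job in its own directory avoids two jobs overwriting each other's files in the shared per-user scratch area.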

Software

There are many software packages already installed; these are managed through the environment modules system. You will find advice on how to run some of the more popular applications under the 
applications & software section of our support pages. You can also build your own software in your home or data directories using one of the compilers provided (which are also available through the environment
modules system).
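As an illustration of the environment modules system, the sketch below defines a small helper that a job script could use to load its software. The name setup_modules is hypothetical, and testapp/1.0 is simply the example module name used elsewhere in this guide; "module avail" lists the names actually installed.

```shell
# Hypothetical helper: start from a clean environment and load the given modules.
# Usage: setup_modules testapp/1.0
setup_modules() {
    module purge        # clear anything inherited from the login shell
    module load "$@"    # load each requested module
    module list         # record what is now loaded in the job output
}
```

Calling the helper at the top of every submission script keeps the software environment the same from one run to the next.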

Running jobs

To do work on ARC's clusters, you will need to submit a job to the job scheduler; the login nodes are for preparing the work that you need to run and should not be used for computational work. The systems use SLURM as the job scheduler. You will find information on preparing your job submission script on our SLURM pages. When requesting resources in your job submission script, remember that the smallest unit you can request on arcus-b is a single node; jobs requiring 16 cores or fewer should be run on arcus-htc. Please contact us if you need help with getting more out of your requested resources.

Running jobs consume credits on ARC. Please read our Accounting System and our Requesting Credit pages for an explanation as to how credits are calculated and how
to request more.

Training

ARC run a number of training courses that can help you get more from the ARC facility, covering topics such as parallel programming. For a full list of courses, please see our Training page.