
FAQ

Registration and Access

Software and Applications

Jobs

General

 

Registration and Access


How do I become an ARC user?

The supercomputing facility is available to all Oxford University researchers.  You can become a user by registering with the ARC. Start with the registration page.


Where should a new ARC user begin?

Start with the Getting Started section of the ARC website.


What systems can I access as a user?

All registered users have access to all the ARC systems.


How much disk space do I have?

The default quota on /home is 15 gigabytes (GB) per user, and on /data it is 5 terabytes (TB) per project/group.  Larger quotas are available on request.


How do I log on to the ARC systems?

Linux and Mac users should use ssh to connect to the ARC.  For example, issuing either of the following commands

ssh -X bob@arcus.arc.ox.ac.uk
ssh -X bob@arcus-b.arc.ox.ac.uk

from a Linux or Mac terminal logs the user bob on to the named system (this could be arcus, arcus-b, arcus-gpu or caribou).  The user is prompted to provide the password.

We recommend using the -X flag, which allows users to run graphical applications remotely (such as the emacs editor or the arctool utility).  This option enables X11 forwarding, subject to security controls.
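To avoid typing the full host name and the -X flag each time, a host alias can be added to the OpenSSH client configuration. This is a minimal sketch, assuming the standard OpenSSH client; the alias name arcus-b is an arbitrary choice, and bob is the example user from the text.

```shell
# Add a host alias to the OpenSSH client configuration (only if it is not there yet).
ssh_config="${HOME}/.ssh/config"
mkdir -p "${HOME}/.ssh" && chmod 700 "${HOME}/.ssh"

if ! grep -qs '^Host arcus-b$' "$ssh_config"; then
    cat >> "$ssh_config" <<'EOF'
Host arcus-b
    HostName arcus-b.arc.ox.ac.uk
    User bob
    ForwardX11 yes
EOF
fi
chmod 600 "$ssh_config"
```

With this entry in place, `ssh arcus-b` should behave like `ssh -X bob@arcus-b.arc.ox.ac.uk`.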

Windows users should install an application called PuTTY, an open source client for ssh.  To connect to one of the ARC machines, PuTTY has to operate in the "ssh" mode.  To allow remote applications to display graphics through the link, the X11 tunnelling option has to be enabled.  (The best way to learn how to do this is to search the web for "putty x11 tunnelling".)  Additionally, Windows users must install an X11 server; Xming is arguably the best option.


How do I transfer files to/from the ARC systems?

Linux and Mac users should use scp for transferring files.  For example, the command

scp localfolder/myfile.txt bob@arcus-b.arc.ox.ac.uk:/path/to/remote/folder

copies the file myfile.txt from the local host to the directory /path/to/remote/folder on the ARC system.  Also,

scp bob@arcus-b.arc.ox.ac.uk:/path/to/remote/folder/myfile.txt localfolder/

copies myfile.txt from the remote directory to the local one.  In both cases, the target file is overwritten if it exists.  Study the scp Linux manual (man scp) to learn more; for example the -r (recursive) option is useful for transferring entire directories.

 

Linux and Mac users may also find rsync very useful, for instance for maintaining directory structures on their workstation in sync with copies on the ARC storage.  The command

rsync -avz localfolder bob@arcus-b.arc.ox.ac.uk:/path/to/remote/folder

copies the directory localfolder into the directory /path/to/remote/folder on the ARC system.  The files are transferred recursively in "archive" mode, which ensures that symbolic links, permissions, ownerships, etc. are preserved.  The example above also uses compression (-z) to reduce the size of the transferred data.  Unlike scp, rsync does not simply overwrite existing files but rather updates them, transferring only the differences between the two copies.
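Before a large transfer it can help to preview what rsync would do. The sketch below wraps the command from the text in two helper functions; the function names arc_preview and arc_push are hypothetical, and the user, host and remote path are the placeholder values used above.

```shell
# Placeholder destination taken from the example in the text.
remote="bob@arcus-b.arc.ox.ac.uk:/path/to/remote/folder"

# Preview: -n (--dry-run) lists the files rsync would send, without copying anything.
arc_preview() { rsync -avzn "$1" "$remote"; }

# Transfer for real once the preview looks right.
arc_push() { rsync -avz "$1" "$remote"; }

# arc_preview localfolder     # would create /path/to/remote/folder/localfolder
# arc_preview localfolder/    # trailing slash: sends the contents of localfolder instead
```

Note the trailing slash on the source: without it rsync creates the directory itself at the destination, with it only the contents are copied.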

Windows users can use pscp, usually bundled with PuTTY.  Alternatively, WinSCP is an easy-to-use, open source scp client for Windows.


How do I access the ARC systems from outside the University network?

There are two main ways to access the ARC systems from outside the University network:

  • By connecting to the University of Oxford virtual private network (VPN) and accessing the ARC systems as though users were on campus (see http://help.it.ox.ac.uk/network/vpn/index for more details).
  • By connecting to the ARC external access server, oscgate.  This server has a firewall in place and only permits access from known locations.  Thus, users need to provide the ARC team with the (static) IP addresses of the locations they need to connect from.  Each location should ideally be an easily identifiable institution.

The first method is preferred for any users who have University SSO passwords.  The second method is useful for users who are external collaborators.  Further information is available at http://www.arc.ox.ac.uk/content/oscgate.

 

Software and Applications


What software/applications are available on the ARC systems?

A table of some of the software available is given on the Software page.  To obtain a complete list of software available on any particular ARC system, use the module avail command.


What compilers are available on the ARC systems?

The Performance Compilers section of the Software page provides details of currently supported compilers.  ARC currently has Intel C/C++/Fortran compilers and Portland C/C++/Fortran/CUDA compilers.


Do you support application X?

Please check the availability of software using this table and/or the module utility.  If you would like us to support applications which are not yet available, let us know.  Users can also install applications themselves (in their home directories) if they prefer, and the ARC staff can assist with this if required.


Why do I get "command not found" when I try to run application X?

Applications must be loaded using the module utility.  For example, the command module load intel-compilers loads the Intel compilers, so that executables like icc and ifort are in the path.  Without loading the module, invoking the application leads to a shell error with the message "command not found".

If no version of the module is specified, as in module load intel-compilers, the default version is loaded.  You can also load a specific version of a module, for example module load intel-compilers/2012.
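A quick way to tell whether a module still needs loading is to test whether the executable is on the PATH. This is a minimal sketch; the helper name need_cmd is hypothetical, while icc and intel-compilers are the names used in the text.

```shell
# need_cmd NAME MODULE: report where NAME lives, or suggest the module to load.
need_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found; try: module load $2" >&2
        return 1
    fi
}

# Example check for the Intel C compiler; "|| true" keeps a script going either way.
need_cmd icc intel-compilers || true
```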


How do I run application X as a batch job?

In essence, to run an application, you have to

  • load the appropriate module,
  • prepare a job script for the job scheduler and
  • submit the job to the batch processing queue.

The ARC runs an open-source job scheduler called Torque Resource Manager.  Information on how to prepare a script, submit a job and use some of the advanced features is found in the Torque documentation.
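The three steps can be sketched as follows. This is an illustrative outline, not an official ARC template: the job name, the one-hour walltime and the executable ./my_app are assumptions, and the node or core request is deliberately left out because its syntax is site-specific (see the Torque documentation).

```shell
# Write a minimal job script; names and resource values are illustrative assumptions.
cat > submit.sh <<'EOF'
#!/bin/bash
#PBS -N example_job
#PBS -l walltime=01:00:00
#PBS -m be

# Start in the directory the job was submitted from.
cd "$PBS_O_WORKDIR" || exit 1

# Make the required application available (module name taken from the text).
module load intel-compilers

# ./my_app is a placeholder for your own executable.
./my_app
EOF

# Syntax check only; nothing is executed.
bash -n submit.sh && echo "submit.sh: syntax OK"

# Submit to the batch queue (on the cluster):
# qsub submit.sh
```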


How do I run an MPI application?

mpirun is the command used to start any MPI distributed application on a cluster.  mpirun is one of the utilities that are part of an MPI library and is responsible for running an application on distributed resources, i.e. launching MPI processes on remote hosts and controlling and managing the communication between them.  More information can be found on this page, which shows how to compile and run an MPI application.


How do I run an OpenMP application?

OpenMP threaded applications are launched by simply invoking the executable name, without any special launcher.  Depending on the application and the way it was programmed, the number of execution threads can be set in a number of ways.  Typically, the number of OpenMP threads is controlled via the environment variable OMP_NUM_THREADS, e.g. export OMP_NUM_THREADS=8 sets 8 execution threads.  Nevertheless, there are applications that take this number as a command line input or read it from an input file.
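In a script, it is often convenient to respect a thread count that is already set and fall back to a default otherwise. A minimal sketch, assuming the default of 8 from the example above and a placeholder executable name:

```shell
# Keep the thread count already set in the environment, else fall back to 8.
export OMP_NUM_THREADS="${OMP_NUM_THREADS:-8}"
echo "Running with $OMP_NUM_THREADS OpenMP threads"

# ./my_openmp_app    # placeholder executable; launched directly, no special launcher
```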


Does my code benefit from hyper-threading?

The proprietary hyper-threading technology (HT) from Intel can be used to boost the performance of most (but not all!) scientific applications.  For each physical processor core, the operating system addresses two virtual (or logical) cores through HT, so that it can schedule two concurrent processes or two threads on the same physical core.  Most applications benefit from this technology, although the gain varies from almost nothing to substantial (expecting an increase of about 10% is reasonable).  For instance, Gromacs can see an increase of up to 16% (depending on the problem run).  Gaussian, on the other hand, does not benefit from HT; on the contrary, its performance deteriorates slightly.  Users are advised to experiment with their applications to determine whether there is a benefit from HT.
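One simple experiment is to time the same run with one thread per physical core and then with one thread per logical core. The helper below is a rough sketch for OpenMP codes only; the name time_with_threads, the executable ./my_app and the core counts in the usage comments are all hypothetical, and the timing has one-second resolution.

```shell
# time_with_threads THREADS COMMAND [ARGS...]
# Runs COMMAND with OMP_NUM_THREADS=THREADS and prints the elapsed wall-clock time.
time_with_threads() {
    threads=$1
    shift
    start=$(date +%s)
    OMP_NUM_THREADS=$threads "$@"
    end=$(date +%s)
    echo "$threads threads: $((end - start)) s"
}

# Example comparison on a node assumed to have 16 physical cores (32 logical with HT):
# time_with_threads 16 ./my_app
# time_with_threads 32 ./my_app
```

If the second run is not noticeably faster, the application probably gains little from HT for that problem.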

 

Jobs


How do I submit, check the status, and/or delete a batch job?

These instructions apply to arcus; for the arcus-b cluster, consult the Slurm pages.

To submit a batch script called submit.sh, use the qsub command:

qsub submit.sh

To check the status of a batch job, use the command qstat.  If you know the job ID, e.g. 12345.sal, then use the command:

qstat 12345

To delete a batch job, you need to know the job ID and then use the qdel command:

qdel 12345
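Since qsub prints the full job ID on standard output, a script can capture it and derive the numeric part used in the commands above. A small sketch, using the example ID 12345.sal in place of a real submission so that it runs anywhere:

```shell
# On the cluster you would capture the ID at submission time:
#   jobid=$(qsub submit.sh)
jobid="12345.sal"        # example ID from the text
short_id=${jobid%%.*}    # drop everything from the first dot onwards
echo "$short_id"

# qstat "$short_id"      # check the status of the job
# qdel "$short_id"       # delete the job
```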


Why does my queued job not start?

Jobs may be queued for various reasons.  A job may be waiting for resources to become available, or you might have hit the limit on the maximum number of jobs that can be running on the system.  One way to determine why a job is queuing is to use the qstat -f command and look at the comment field of the output.  For example, if the job ID is 12345.sal:

qstat -f 12345

The comment field is normally towards the bottom of the qstat output.


Why do I get "Job rejected by all possible destinations"?

The Torque scheduler rejected the job on a particular queue because of an error in the Torque submission script.  For example, mis-spelling "select" or requesting "ppn=8" instead of "mpiprocs=8" causes Torque to reject the job with the above message.


Why do I get "Job exceeds queue and/or server resource limits"?

The Torque scheduler rejected the job on a particular queue because one of the resources requested in the submission script exceeded the resources allocated for that queue.  For example, if the maximum walltime for a job in a queue is 120 hours, the job will be rejected with the above message when 200 hours are requested for walltime.  The resources allocated for a queue can be checked using the qmgr command; for example

qmgr -c "print queue workq"

prints the settings for the queue workq.
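A walltime request can be checked against a known queue limit before submitting. The sketch below uses the 120-hour limit and 200-hour request from the example above; the helper name walltime_to_seconds is hypothetical.

```shell
# Convert an HH:MM:SS walltime string to seconds.
# awk is used so that fields such as 08 are not misread as octal numbers.
walltime_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

limit=$(walltime_to_seconds 120:00:00)      # queue maximum from the example
request=$(walltime_to_seconds 200:00:00)    # walltime asked for in the script

if [ "$request" -gt "$limit" ]; then
    echo "request exceeds the queue limit by $(( (request - limit) / 3600 )) hours"
fi
```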


Why does my job fail with the error "/bin/bash^M: bad interpreter: No such file or directory"?

You have edited your submission script using a Windows editor (such as Notepad).  Windows editors end each line with an extra carriage-return character (displayed as ^M), which is not needed under Linux and which causes the job script to fail.

Eliminate the extra characters by doing

dos2unix myScript.sh
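If dos2unix is not installed, the carriage returns can be detected and removed with standard tools. A minimal sketch; the file crlf_demo.sh is created here only to demonstrate the fix, so substitute the name of your own script.

```shell
# Create a small script with Windows (CRLF) line endings for demonstration.
script=crlf_demo.sh
printf '#!/bin/bash\r\necho hello\r\n' > "$script"

# A carriage return, used as the search pattern below.
cr=$(printf '\r')

if grep -q "$cr" "$script"; then
    echo "$script has Windows line endings; converting"
    # Delete every carriage return, then replace the original file.
    tr -d '\r' < "$script" > "$script.tmp" && mv "$script.tmp" "$script"
fi
```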


Why does my job fail/die after running for just a few seconds?

There is probably a problem with the job submission script.  The error output from the job should provide some information as to why the job failed.  If you require help with determining what the problem is, please contact the ARC support team and provide relevant details to help with diagnosis.  These should include the job ID and the batch script.  The time and date of submission and the ARC system the job was submitted to are also useful.


Why does my job fail/die after running for a few hours/days?

Possibly your job has run out of walltime.  Every job has a walltime limit that is specified in the submission script, given on the qsub command line, or picked up from the relevant default value.  See the next question for how to request an increase to the walltime of a running job.


How can I increase the walltime of a running job?

If you submit a job and find that it may not finish within the requested walltime, contact the ARC support team before the job reaches its limit and ask for the walltime to be increased.  Include the details of the job (the job ID and the ARC system it is running on).  An estimate of the additional walltime required is helpful.


How can I get an email notification when a job begins/finishes?

Include the "-m be" option in the job submission.  It can be specified at the beginning of the job submission script as a line of the form:

#PBS -m be

or included on the qsub command line as:

qsub -m be submit.sh

More details about Torque qsub options can be found in the qsub man page (man qsub).


How can I check the availability of free compute nodes?

On ARC clusters use the command qfree.

 

General


How do I acknowledge ARC in my publications?

Please see the acknowledgements section on the ARC Terms and Conditions page.


Will application X run on the ARC supercomputers faster than on my workstation?

We hope so.  If you wish to investigate whether you can achieve a speed-up for your application, then please contact us.


How many credits do I have left?

Use the mybalance command to find out how many credits you have left.


How do I change my password?

To change your account password, use ssh to connect to myaccount.arc.ox.ac.uk and run the passwd command:

$ passwd


 


I have forgotten my password. How do I reset my password?

If you have forgotten your password, or your password has expired and you can no longer access the ARC systems to change it yourself using the arctool, then you will need to contact us using the support email address.  We will then reset your password to a temporary value and send it to you by email.


I accidentally deleted files, how do I get them back?

Unfortunately ARC does not keep dedicated backups of its storage resources.  We recommend that users store data at sites other than the ARC, for example, on departmental resources.

On the new Panasas storage, daily snapshots are available.  Please see our storage management page under Services.