Home
Welcome to the synergy wiki! Here you'll find all the info you need about using the synergy compute cluster.
About
synergy is a joint effort between the Faculty of Veterinary Medicine, the Western Canadian Microbiome Centre and the Alberta Children's Hospital Research Institute. It is managed and hosted by the Cumming School of Medicine Centre for Health Genomics and Informatics. It currently consists of 16 general compute nodes, each with 28 cores (56 threads) and 256 GB of memory, attached to a 1.5 PB file system. Several more high-memory (1.2 to 3 TB) compute nodes will be added at a later date.
Usage
The cluster is available to all members of the above institutes. See the sidebar for login instructions and instructions for using the LSF batch job submission system. If you use synergy, there are a few important items that we ask our community to adhere to:
- Please don't run big jobs on the login node, which is to be used for file manipulation, writing scripts and other relatively small tasks.
- Most common bioinformatic tools can be installed with conda, so, with a few rare exceptions, you are responsible for installing the programs you need.
- The amount of disk space you have access to depends on your faculty/institute/PI. All users are required to actively manage their disk space, removing files as necessary. Avoid duplicating your files and use symlinks where you can. Old project files should be removed for archiving.
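As a rough illustration of the disk space advice above, here is a minimal Python sketch that totals the size of a directory tree and replaces a byte-identical copy of a file with a symlink to the original. The function names (`tree_size_bytes`, `replace_with_symlink`) and the file names in the demo are made up for this example and are not part of any synergy tooling. Keep in mind that a symlink breaks if the original file is later moved or deleted, so treat this as a starting point rather than a recommended workflow.

```python
import filecmp
import os
import tempfile
from pathlib import Path


def tree_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():
                total += path.stat().st_size
    return total


def replace_with_symlink(duplicate: Path, original: Path) -> bool:
    """Replace duplicate with a symlink to original if both hold identical bytes.

    Returns True if the replacement was made, False otherwise.
    """
    if duplicate.is_symlink() or os.path.samefile(duplicate, original):
        return False  # already a link, or literally the same file
    if not filecmp.cmp(duplicate, original, shallow=False):
        return False  # contents differ, so keep both files
    duplicate.unlink()
    duplicate.symlink_to(original.resolve())
    return True


if __name__ == "__main__":
    # Demo in a throwaway directory; the file names are hypothetical.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        original = root / "reads.fastq"
        copy = root / "reads_copy.fastq"
        original.write_bytes(b"ACGT" * 1000)
        copy.write_bytes(b"ACGT" * 1000)

        print("before:", tree_size_bytes(root), "bytes")
        replace_with_symlink(copy, original)
        print("after: ", tree_size_bytes(root), "bytes")
```

The comparison uses `shallow=False` so that files are only linked when their contents match byte for byte, not merely when their size and modification time agree.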
Support
Support is provided for technical issues. However, due to the wide variety of projects and users, it is not always possible to troubleshoot problems with specific software. Instead, we encourage all users to post any problems on the CalBug Slack synergy channel where, as a community of users, we can support each other.
Also, we highly encourage anyone who is interested to post tutorials and how-tos here on this wiki. These can be general computing tutorials or specific to different types of analysis.
Finally, if you're new to the command line, the best resource to get started is the Command Line course on DataCamp. A DataCamp subscription is very worthwhile and there is a wealth of good material. A few other resources to get you started are:
- R, Python and other data science related courses on Coursera and edX. There are some particularly good genomics courses on edX.
- Lesson material on Software Carpentry and Data Carpentry
- Mike Love's course notes on Introduction to Computational Biology
- There are loads of good blogs online. A few gooders to start with are Getting Genetics Done, Dave Tang's blog, Blue Collar Bioinformatics, The Genome Factory, and Living in an Ivory Basement. Not all of these are solely tutorial-based, but you can browse through the archives for good material.
If you find other very helpful material, please feel free to post it here.