OREGON STATE UNIVERSITY

Account Request Video

Welcome to the CGRB bioinformatics information video. This video is  
meant to give you a general layout of the infrastructure and provide  
some techniques to start doing bioinformatics. The CGRB maintains many 
services that utilize our cloud infrastructure.

The first service we will discuss is the website 
"http://shell.cgrb.oregonstate.edu". This is the main portal website 
for the bioinformatics infrastructure. This site will have links, 
documentation, account requests and other basic resources. Users should 
look to this site if they are seeing problems, as we will post messages 
on the announcements page.

One of the main links on the website is called ETA, the main 
bioinformatics access portal. This tool will allow users to interact 
with data, submit jobs, monitor jobs, create wrappers, request features 
and share data. All jobs submitted through ETA will automatically be 
submitted to the computational cloud. Inside ETA users will find 
"wrappers" or web forms to different programs providing simple ways to 
interact with your data and tools for analysis. Users can also use the 
request system to ask for a new program, wrapper, feature or bug fix.

Some users will need access to a command line for processing and 
development. We provide an SSH (secure shell) server as a command line 
access point to the bioinformatics infrastructure. This method 
represents the resource with the fewest limitations. SSH users will be 
asked to submit work using a command line tool called SGE_Batch. This 
tool can be used both interactively and non-interactively. When using 
the command line please read the instructions provided or ask questions.

We ask users of the infrastructure to apply one computational rule when 
processing data or testing new algorithms. This rule is called 
10-100-1000. This is a method for approaching work with large data sets 
and/or new algorithms that need to be tested. We ask that users test 
all aspects of a process before full jobs are launched into the cloud.

This boils down to creating test data files from your raw data files 
that contain 10, 100 and 1000 entries to use as test data. This will 
allow you to ensure that you have the correct inputs to the program, 
that you are using the options you want correctly and that you are 
getting proper output. This method will also let you know how much 
downstream data will be generated and how much space is needed to store 
that work. Finally, this method will also let you evaluate the time 
needed to accomplish the work. If the time needed to process data with a 
particular program increases exponentially as the input grows, then you 
may not want to use that tool, or you may want to filter the data before 
it is processed through that part of the pipeline. Users who are not 
testing programs and data using the 10-100-1000 rule may lose access to 
the infrastructure. Please ask questions or make a request for help if 
you are having problems testing your work. We have created a wrapper 
called "Generate Test Set" that can help you create test data files from 
FASTA or FASTQ formatted files.
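
As a rough illustration of the 10-100-1000 rule, the Python sketch below 
writes test files holding the first 10, 100 and 1000 entries of a FASTA 
or FASTQ file. It is not the "Generate Test Set" wrapper itself; the 
function names and the output naming scheme are assumptions made for 
this example, and FASTQ input is assumed to use four lines per record.

```python
from itertools import islice
from pathlib import Path


def read_records(path):
    """Yield one record (a list of lines) at a time from a FASTA or FASTQ file."""
    with open(path) as handle:
        first_char = handle.read(1)
        handle.seek(0)
        if first_char == "@":
            # FASTQ: assume the common layout of exactly four lines per record.
            while True:
                record = list(islice(handle, 4))
                if len(record) < 4:
                    break
                yield record
        elif first_char == ">":
            # FASTA: a record runs from one ">" header line to the next.
            record = []
            for line in handle:
                if line.startswith(">") and record:
                    yield record
                    record = []
                record.append(line)
            if record:
                yield record
        else:
            raise ValueError(f"{path} does not look like FASTA or FASTQ")


def write_test_sets(source, sizes=(10, 100, 1000)):
    """Write one test file per size, each holding the first N records of source."""
    source = Path(source)
    outputs = []
    for size in sizes:
        # Hypothetical naming scheme: reads.fasta -> reads.test10.fasta
        target = source.with_name(f"{source.stem}.test{size}{source.suffix}")
        with open(target, "w") as out:
            for record in islice(read_records(source), size):
                out.writelines(record)
        outputs.append(target)
    return outputs
```

Calling write_test_sets("reads.fasta") on a hypothetical input would 
produce reads.test10.fasta, reads.test100.fasta and reads.test1000.fasta 
next to it. Taking the first N entries is the simplest choice; if the 
start of your file is not representative of the whole, a random sample 
may make a better test set.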

In general there are 3 limiting reagents to any computer: number of 
processors, amount of memory, and storage space. If any one of these 
items is used up, the machine is at max load. It is important to know 
the amount of memory, CPU and storage you need prior to submitting work 
on the cloud. Please submit jobs with the needed resources so the 
correct machine will be reserved to do the work. Again, please ask 
questions or make a request for help if you are having problems testing 
your work.
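
One possible way to get a feel for these numbers is to time a test run 
and record its peak memory. The Python sketch below does this for any 
command on a Unix-like system; the function name and the returned keys 
are assumptions made for this example, and it is not part of the CGRB 
tools.

```python
import resource
import subprocess
import sys
import time


def measure(command):
    """Run a command and report its wall time, CPU time and peak memory."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(command, check=True)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    # CPU time is user time plus system time spent by the child process.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)

    # ru_maxrss is the largest peak among all child processes waited for so
    # far, so measure each command in a fresh Python session for a clean
    # number. It is reported in kilobytes on Linux and in bytes on macOS.
    scale = 1 if sys.platform == "darwin" else 1024
    return {
        "wall_seconds": wall,
        "cpu_seconds": cpu,
        "peak_memory_bytes": after.ru_maxrss * scale,
    }
```

Running this on your 10, 100 and 1000 entry test files and comparing the 
results gives a first estimate of how time and memory grow with input 
size before you request resources for the full job.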

Users use the infrastructure in many ways and the CGRB tries to 
facilitate moving in new directions. If you need new tools or would 
like to request new features for either the web interface (ETA) or the 
command line access (SSH), please fill out a request within the ETA 
system. You will find a request link near the top right of the page 
after you log in to the system. Please ask questions or make a request 
for help if you are having problems.

Thank you very much for watching this video.

CGRB Support