Martin Senger <martin.senger@kaust.edu.sa>
[also using a few slides from the presentations at the Galaxy Developers Conference 2011]

1. What Galaxy can do... (or could do)
2. Show me where I can try it for myself
3. What can we do to make our Galaxy better

...and what this is not
• a detailed tutorial on how to use Galaxy
• a way to convince you that I understand everything about Galaxy

• A web-based interface to command-line tools (of any kind) and their combinations (“workflows”)
• Galaxy performs analysis interactively through the web, on arbitrarily large datasets
• Galaxy remembers what it did: history
• Flexibility to include anybody’s command-line tools by writing wrappers, for which templates are available
• An environment for sharing tools (or their wrappers): the “Tool Shed” repository

• Locally stored data
  • user-specific
  • shared between users, e.g. genome builds
• Origin of data
  • uploaded from your computer
    • using a web interface
    • using an FTP server
  • fetched from external databases (“datasources”)
    • only those that are “aware” of Galaxy
    • internally: two ways to fetch data (async vs. sync)
    • you need to be familiar with these databases and their UIs

• Data have metadata
  • allowing data to be used only by those tools that recognize such data types
• Data have attributes
  • annotate data
  • convert data to a new format
  • change data type

• Automated set of steps – perhaps each time with different input data (of the same type)
• reproducibility (usable in publications)
• reusability (sharing workflows with others)
• created from scratch (using a workflow editor) or from your history

An example – a workflow editor

Thanks to: user would not have done this from the command line on our cluster

• http://main.g2.bx.psu.edu/screencast
• If we have time (6 min), click here:
  • Creating a workflow from your history

Where are all these galaxies?
• public servers
  • available immediately, free of charge
  • http://main.g2.bx.psu.edu/ and a few others, such as http://galaxy.nbic.nl/
  • usually limited resources
  • you cannot customize them to your special needs
• KAUST/CBRC Galaxy
  • http://galaxy.cbrc.kaust.edu.sa/
  • running on an internal cluster with limited resources
  • but we can do with it whatever we need to do
• Galaxy in the Amazon cloud (CloudMan)
  • when you do not have infrastructure in house
  • when you have particular resource needs (cores, memory...)
  • when you need customization
  • if you have a credit card
  • details in this presentation: http://wiki.g2.bx.psu.edu/GCC2011?action=AttachFile&do=get&target=CloudManGalaxyOnTheCloud.pdf

Galaxy also has a RESTful API for programmatic access (beta)

...we need to:
Image courtesy of http://mychinaconnection.com/english-proverb/there-is-no-free-lunc

• Data issues
  • add genome-wise data we (CBRC) need
  • add data usable for others (Core, students...)
• Tools
  • select the subset of tools we really need and test them fully
  • consider wrapping other tools (not yet available by default)
• Logistics
  • provide user-oriented courses
  • create a user group to share experience and to promote knowledge
  • monitor its stability and usage
• Hardware/sysadmin issues
  • install it on better hardware (in due time)
  • change the current queue priority (a chicken-and-egg problem)
  • add an FTP server

• Galaxy home page: http://galaxy.psu.edu/
• An overview presentation: http://wiki.g2.bx.psu.edu/GCC2011?action=AttachFile&do=get&target=IntroductionSession.pdf
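
The programmatic access mentioned earlier (the RESTful API, still beta) might be used roughly as sketched below. This is a minimal illustration and not an official client. It assumes that a Galaxy server exposes resources under an `api/` path and accepts a personal API key as a `key` query parameter. The helper names `build_api_url` and `list_histories` and the placeholder key are hypothetical. Check the API documentation of your own Galaxy instance before relying on any of this.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


def build_api_url(base_url, resource, api_key):
    """Build the URL of a Galaxy API resource such as 'histories'.

    Assumption: resources live under <base_url>/api/ and the API key
    is passed as the 'key' query parameter.
    """
    # Make sure the base ends with exactly one slash, so that urljoin
    # appends to the path instead of replacing its last segment.
    base = base_url.rstrip("/") + "/"
    return urljoin(base, "api/" + resource) + "?" + urlencode({"key": api_key})


def list_histories(base_url, api_key):
    """Fetch the histories visible to the owner of api_key (needs network access)."""
    with urlopen(build_api_url(base_url, "histories", api_key)) as response:
        return json.load(response)


# Only builds and prints the URL; no request is sent here.
print(build_api_url("http://galaxy.cbrc.kaust.edu.sa/", "histories", "YOUR_API_KEY"))
```

Calling `list_histories(...)` with a real key would send the request. Keeping URL construction in its own function lets it be checked without touching a server.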