e-Science in the cloud with CARMEN

Watson, P

doi:10.1109/PDCAT.2007.4420133

Newcastle University author(s): Professor Paul Watson

Abstract

Understanding how the brain works is a major scientific challenge which will benefit medicine, biology and computer science. It requires knowledge of how information is encoded, accessed, analysed, archived and decoded by networks of neurons. Globally, over 100,000 neuroscientists are working on this problem. However, the data that forms the basis for their work is rarely shared, even though it is difficult and expensive to produce. One of the main reasons is that vast amounts of data are produced in a variety of formats and are then described and curated locally. One consequence is a shortage of analysis techniques that can be applied across neuronal systems. Further, there is only limited interaction between research centres with complementary expertise.

The CARMEN project (www.carmen.org.uk) is addressing these challenges. It enables data sharing, integration and analysis supported by metadata. An expandable range of services is provided to extract value from raw and transformed data. The project's approach is to design and build a generic e-science platform in the cloud. This provides functionality to neuroscientists, who access it over the web. Scientists upload the data they generate into the system (called a CAIRN) and describe it with metadata.

Internally, the CAIRN is built as a set of Web Services. These include a workflow enactment service, which allows scientists to analyse data by running workflows that use the analysis services also held in the CAIRN. Users can browse the catalogue of existing workflows to select one that is appropriate for the task they are trying to accomplish. More expert users can build their own workflows from the available services. Even more sophisticated users can create their own services to use in workflows. A novel feature of the architecture is that the services are stored in a repository in the CAIRN and scheduled on a grid as required by the execution of workflows. This promotes the sharing of analysis services as well as data, and allows services to execute close to the data on which they operate. This is essential to avoid having to ship vast quantities (TBs) of data out of the CAIRN to the user's machine for analysis. Storing both the data and the services in the CAIRN also enables the reproducibility of analyses. This talk describes the design of the CAIRN and shows how it is used to support neuroinformatics. © 2007 IEEE.
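
The architecture described in the abstract (data uploaded with metadata, services held in a repository, and workflows enacted close to the data) can be illustrated with a minimal Python sketch. Every name here (Dataset, ServiceRepository, Cairn, enact, the node labels and the two toy services) is a hypothetical stand-in chosen for illustration; none of it is taken from the CARMEN implementation, which the abstract describes only as a set of Web Services.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration only; not the CARMEN API.
Service = Callable[[List[float]], List[float]]


@dataclass
class Dataset:
    name: str
    node: str                      # grid node assumed to hold the data
    values: List[float]
    metadata: Dict[str, str] = field(default_factory=dict)


class ServiceRepository:
    """Holds analysis services by name so they can be shared and reused."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}

    def register(self, name: str, service: Service) -> None:
        self._services[name] = service

    def fetch(self, name: str) -> Service:
        return self._services[name]


class Cairn:
    """Toy stand-in for a CAIRN: stores data and services, enacts workflows."""

    def __init__(self) -> None:
        self.repository = ServiceRepository()
        self.datasets: Dict[str, Dataset] = {}
        self.log: List[Tuple[str, str]] = []   # (service name, node) per step

    def upload(self, dataset: Dataset) -> None:
        self.datasets[dataset.name] = dataset

    def enact(self, workflow: List[str], dataset_name: str) -> List[float]:
        dataset = self.datasets[dataset_name]
        values = dataset.values
        for service_name in workflow:
            service = self.repository.fetch(service_name)
            # "Schedule" the service on the node that holds the data, so the
            # data itself never leaves the CAIRN; here this is only recorded.
            self.log.append((service_name, dataset.node))
            values = service(values)
        return values


cairn = Cairn()
cairn.repository.register("threshold", lambda xs: [x for x in xs if x > 0.5])
cairn.repository.register("count", lambda xs: [float(len(xs))])
cairn.upload(Dataset("recording-1", node="node-a", values=[0.2, 0.7, 0.9],
                     metadata={"format": "example"}))

result = cairn.enact(["threshold", "count"], "recording-1")
print(result)  # [2.0]
```

Because the workflow is just a list of service names resolved against the repository, the same workflow can be re-run later on the same stored data, which is the reproducibility property the abstract attributes to keeping both data and services in the CAIRN.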

Publication metadata

Author(s): Watson P

Publication type: Conference Proceedings (inc. Abstract)

Publication status: Published

Conference Name: Parallel and Distributed Computing, Applications and Technologies (PDCAT)

Year of Conference: 2007

Pages: 5

Publisher: IEEE

URL: http://dx.doi.org/10.1109/PDCAT.2007.4420133

DOI: 10.1109/PDCAT.2007.4420133

ISBN: 0769530494
