CCT Colloquium Series
Harnessing distributed computing resources with workflow systems.
Michael Wilde, Mathematics and Computer Science division, Argonne National Laboratory
Software Architect
Johnston Hall 338
February 23, 2007 - 03:00 pm
While national cyberinfrastructure facilities such as TeraGrid and the Open Science Grid offer vast computing and storage resources, individual researchers and engineers often find it challenging to effectively aggregate multiple computing sites and to utilize these resources in a distributed manner. And, once their computing problems are conquered, grid users are then faced with the challenge of tracking their work and the massive datasets they derive, communicating their methods to collaborators, and assessing the provenance of their results. A solution to both of these problems may be found in the evolving workflow systems that are shaping up to become the scripting languages of scientific computing in the coming decade. We will present some of the challenges faced by users of distributed computing facilities, explain various strategies applied by workflow systems to solve these problems, and describe case studies of grid workflows in a variety of disciplines. The talk will focus on an approach to grid workflow that can leverage large-scale distributed resources by applying a location-independent model of distributed computing and by integrating provenance tracking into the workflow execution model.
Speaker's Bio:
Michael Wilde is a software architect in the Mathematics and Computer Science division of Argonne National Laboratory, and a fellow of the Computation Institute of Argonne and the University of Chicago. Wilde served as project coordinator of the Grid Physics Network (GriPhyN) project, and serves as education, outreach and training coordinator of the Open Science Grid. He works with researchers in disciplines including physics, astronomy, computational biology, neuroscience, sociology, medicine and psychology to solve computational problems on the national grids using workflow tools.