Access to published material may be restricted.
  • G. Singh, M.-H. Su, K. Vahi, E. Deelman, B. Berriman, J. Good, D. S. Katz, and G. Mehta, "Workflow Task Clustering for Best Effort Systems with Pegasus," Proceedings of 15th Mardi Gras Conference, 2008.
    (This paper may be subject to copyright restrictions)

    Abstract: Many scientific workflows are composed of fine computational granularity tasks, yet they are composed of thousands of them and are data intensive in nature, thus requiring resources such as the TeraGrid to execute efficiently. In order to improve the performance of such applications, we often employ task clustering techniques to increase the computational granularity of workflow tasks. The goal is to minimize the completion time of the workflow by reducing the impact of queue wait times. In this paper, we examine the performance impact of the clustering techniques using the Pegasus workflow management system. Experiments performed using an astronomy workflow on the NCSA TeraGrid cluster show that clustering can achieve a significant reduction in the workflow completion time (up to 97%).

  • D. S. Katz, J. C. Jacob, P. P. Li, Y. Chao, G. Allen, "Data-Oriented Distributed Computing for Science: Reality and Possibilities," On the Move to Meaningful Internet Systems 2006: CoopIS, DOA, GADA, and ODBASE, Lecture Notes in Computer Science, v. 4276, pp. 1119-1124, 2006.
    (This paper may be subject to copyright restrictions)

    Abstract: As is becoming commonly known, there is an explosion happening in the amount of scientific data that is publicly available. One challenge is how to make productive use of this data. This talk will discuss some parallel and distributed computing projects, centered around virtual astronomy, but also including other scientific data-oriented realms. It will look at some specific projects from the past, including Montage, Grist, OurOcean, and SCOOP, and will discuss the distributed computing, Grid, and Web-service technologies that have successfully been used in these projects.

  • J. C. Jacob, D. S. Katz, G. B. Berriman, J. Good, A. C. Laity, E. Deelman, C. Kesselman, G. Singh, M.-H. Su, T. A. Prince, R. Williams, "Montage: A Grid Portal and Software Toolkit for Science-Grade Astronomical Image Mosaicking," International Journal of Computational Science and Engineering, in press.
    (This paper is not subject to copyright restrictions in the US)

    Abstract: Montage is a portable software toolkit for constructing custom, science-grade mosaics by composing multiple astronomical images. The mosaics constructed by Montage preserve the astrometry (position) and photometry (intensity) of the sources in the input images. The mosaic to be constructed is specified by the user in terms of a set of parameters, including dataset and wavelength to be used, location and size on the sky, coordinate system and projection, and spatial sampling rate. Many astronomical datasets are massive, and are stored in distributed archives that are, in most cases, remote with respect to the available computational resources. Montage can be run on both single- and multi-processor computers, including clusters and grids. Standard grid tools are used to run Montage in the case where the data or computers used to construct a mosaic are located remotely on the Internet. This paper describes the architecture, algorithms, and usage of Montage as both a software toolkit and as a grid portal. Timing results are provided to show how Montage performance scales with number of processors on a cluster computer. In addition, we compare the performance of two methods of running Montage in parallel on a grid.

  • G. B. Berriman, J. C. Good, A. C. Laity, J. C. Jacob, D. S. Katz, E. Deelman, G. Singh, M.-H. Su, and T. Prince, "The Design and Applications of Montage: An Astronomical Image Mosaic Engine," Proceedings of the 2006 Earth Science Technology Conference (ESTC-06), 2006.
    (This paper may be subject to copyright restrictions)

    Abstract: Montage is a portable toolkit for constructing custom, science-grade mosaics by composing multiple astronomical images. The mosaics constructed by Montage preserve the astrometry (position) and photometry (intensity) of the sources in the input images. The mosaic to be constructed is specified by the user in terms of a set of parameters, including dataset and wavelength to be used, location and size on the sky, coordinate system and projection, and spatial sampling rate. Many astronomical datasets are massive, and are stored in distributed archives that are, in most cases, remote with respect to the available computational resources. The paper describes scientific applications of Montage by NASA projects and researchers, who run the software on both single- and multi-processor computers, including clusters and grids. Standard grid tools are used to run Montage in the case where the data or computers used to construct a mosaic are located remotely on the Internet. This paper describes the architecture, algorithms, and performance of Montage as both a software toolkit and as a grid portal.

  • D. E. Bernholdt, B. A. Allan, R. Armstrong, F. Bertrand, K. Chiu, T. L. Dahlgren, K. Damevski, W. R. Elwasif, T. G. W. Epperly, M. Govindaraju, D. S. Katz, J. A. Kohl, M. Krishnan, G. Kumfert, J. W. Larson, S. Lefantzi, M. J. Lewis, A. D. Malony, L. C. McInnes, J. Nieplocha, B. Norris, S. G. Parker, J. Ray, S. Shende, T. L. Windus, and S. Zhou, "A Component Architecture for High-Performance Scientific Computing," International Journal of High Performance Computing Applications, v. 20(2), pp. 163-202, Summer 2006.
    (This paper may be subject to copyright restrictions)

    Abstract: The Common Component Architecture (CCA) provides a means for software developers to manage the complexity of large-scale scientific simulations and to move toward a plug-and-play environment for high-performance computing. In the scientific computing context, component models also promote collaboration using independently developed software, thereby allowing particular individuals or groups to focus on the aspects of greatest interest to them. The CCA supports parallel and distributed computing as well as local high-performance connections between components in a language-independent manner. The design places minimal requirements on components and thus facilitates the integration of existing code into the CCA environment. The CCA model imposes minimal overhead to minimize the impact on application performance. The focus on high performance distinguishes the CCA from most other component models. The CCA is being applied within an increasing range of disciplines, including combustion research, global climate simulation, and computational chemistry.

  • E. Deelman, G. Singh, M.-H. Su, J. Blythe, Y. Gil, C. Kesselman, G. Mehta, K. Vahi, G. B. Berriman, J. Good, A. Laity, J. C. Jacob, and D. S. Katz, "Pegasus: a Framework for Mapping Complex Scientific Workflows onto Distributed Systems," Scientific Programming, v.13(3), pp. 219-237, November 2005.
    (This paper is subject to copyright restrictions)

    Abstract: This paper describes the Pegasus framework that can be used to map complex scientific workflows onto distributed resources. Pegasus enables users to represent the workflows at an abstract level without needing to worry about the particulars of the target execution systems. The paper describes general issues in mapping applications and the functionality of Pegasus. We present the results of improving application performance through workflow restructuring.

  • D. S. Katz, N. Anagnostou, G. B. Berriman, E. Deelman, J. Good, J. C. Jacob, C. Kesselman, A. Laity, T. A. Prince, G. Singh, M.-H. Su, and R. Williams, "Astronomical Image Mosaicking on a Grid: Initial Experiences," Engineering the Grid - Status and Perspective, (Editors: B. Di Martino, J. Dongarra, A. Hoisie, L. Yang, and H. Zima,) American Scientific Publishers, 2006.
    (This chapter is not subject to copyright restrictions in the US)

    Abstract: This chapter discusses some grid experiences in solving the problem of generating large astronomical image mosaics by composing multiple small images, from the team that has developed Montage (http://montage.ipac.caltech.edu/). The problem of generating these mosaics is complex in that individual images must be projected into a common coordinate space, overlaps between images calculated, the images processed so that the backgrounds match, and images composed while using a variety of techniques to handle the presence of multiple pixels in the same output space. To accomplish these tasks, a suite of software tools called Montage has been developed. The modules in this suite can be run on a single processor computer using a simple shell script, and can additionally be run using a combination of parallel approaches. These include running MPI versions of some modules, and using standard grid tools. In the latter case, processing workflows are automatically generated, and appropriate data sources are located and transferred to a variety of parallel processing environments for execution. As a result, it is now possible to generate large-scale mosaics on-demand in timescales that support iterative, scientific exploration. In this chapter, we describe Montage, how it was modified to execute in the grid environment, the tools that were used to support its execution, as well as performance results.

  • D. S. Katz, G. B. Berriman, E. Deelman, J. Good, J. C. Jacob, C. Kesselman, A. C. Laity, T. A. Prince, G. Singh, and M.-H. Su, "A Comparison of Two Methods for Building Astronomical Image Mosaics on a Grid," Proceedings of 34th International Conference on Parallel Processing Workshops, pp. 85-94, 2005.
    (This paper is not subject to copyright restrictions in the US)

    Abstract: This paper compares two methods for running an application composed of a set of modules on a grid. The set of modules (collectively called Montage) generates large astronomical image mosaics by composing multiple small images. The workflow that describes a particular run of Montage can be expressed as a directed acyclic graph (DAG), or as a short sequence of parallel (MPI) and sequential programs. In the first case, Pegasus can be used to run the workflow. In the second case, a short shell script that calls each program can be run. In this paper, we discuss the Montage modules, the workflow run for a sample job, and the two methods of actually running the workflow. We examine the run time for each method and compare the portions that differ between the two methods.

  • G. Singh, E. Deelman, G. Mehta, K. Vahi, M.-H. Su, G. B. Berriman, J. Good, J. Jacob, D. S. Katz, A. Lazzarini, K. Blackburn, and S. Koranda, "The Pegasus Portal: Web Based Grid Computing," Proceedings of the 20th Annual ACM Symposium on Applied Computing (SAC 2005), pp. 680-686, 2005.
    (This paper is subject to copyright restrictions)

    Abstract: Pegasus is a planning framework for mapping abstract workflows for execution on the Grid. This paper presents the implementation of a web-based portal for submitting workflows to the Grid using Pegasus. The portal also includes components for generating abstract workflows based on a metadata description of the desired data products and application-specific services. We describe our experiences in using this portal for two Grid applications. A major contribution of our work is in introducing several components that can be useful for Grid portals and hence should be included in Grid portal development toolkits.

  • A. C. Laity, N. Anagnostou, G. B. Berriman, J. C. Good, J. C. Jacob, D. S. Katz, and T. Prince, "Montage: An Astronomical Image Mosaic Service for the NVO," Astronomical Data Analysis Software & Systems (ADASS) XIV, 2004.
    (This paper is subject to copyright restrictions)

    Abstract: Montage is a software system for generating astronomical image mosaics according to user-specified size, rotation, WCS-compliant projection and coordinate system, with background modeling and rectification capabilities. Its architecture has been described in the proceedings of ADASS XII and XIII. It has been designed as a toolkit, with independent modules for image reprojection, background rectification and coaddition, and will run on workstations, clusters and grids. The primary limitation of Montage thus far has been in the projection algorithm. It uses a spherical trigonometry approach that is general at the expense of speed. The reprojection algorithm has now been made 30 times faster for commonly used tangent plane to tangent plane reprojections that cover up to several square degrees, through modification of a custom algorithm first derived by the Spitzer Space Telescope. This focus session will describe this algorithm, demonstrate the generation of mosaics in real time, and describe applications of the software. In particular, we will highlight one case study which shows how Montage is supporting the generation of science-grade mosaics of images measured with the Infrared Array Camera aboard the Spitzer Space Telescope.

  • J. C. Jacob, R. Williams, J. Babu, S. G. Djorgovski, M. J. Graham, D. S. Katz, A. Mahabal, C. D. Miller, R. Nichol, D. E. Vanden Berk, and H. Walia, "Grist: Grid Data Mining for Astronomy," Astronomical Data Analysis Software & Systems (ADASS) XIV, 2004.
    (This paper is not subject to copyright restrictions in the US)

    Abstract: The Grist project is developing a grid-technology based system as a research environment for astronomy with massive and complex datasets. This knowledge extraction system will consist of a library of distributed grid services controlled by a workflow system, compliant with standards emerging from the grid computing, web services, and virtual observatory communities. This new technology is being used to find high redshift quasars, study peculiar variable objects, search for transients in real time, and fit SDSS QSO spectra to measure black hole masses. Grist services are also a component of the "hyperatlas" project to serve high-resolution multi-wavelength imagery over the Internet. In support of these science and outreach objectives, the Grist framework will provide the enabling fabric to tie together distributed grid services in the areas of data access, federation, mining, subsetting, source extraction, image mosaicking, statistics, and visualization.

  • G. B. Berriman, E. Deelman, J. Good, J. Jacob, D. S. Katz, C. Kesselman, A. Laity, T. A. Prince, G. Singh, M. Su, "Montage: A Grid Enabled Engine for Delivering Custom Science-Grade Mosaics on Demand," Proceedings of the SPIE Conference on Astronomical Telescopes and Instrumentation, SPIE, 2004.
    (This paper is subject to copyright restrictions)

    Abstract: This paper describes the design of a grid-enabled version of Montage, an astronomical image mosaic service, suitable for large scale processing of the sky. All the re-projection jobs can be added to a pool of tasks and performed by as many processors as are available, exploiting the parallelization inherent in the Montage architecture. We show how we can describe the Montage application in terms of an abstract workflow so that a planning tool such as Pegasus can derive an executable workflow that can be run in the Grid environment. The execution of the workflow is performed by the workflow manager DAGMan and the associated Condor-G. The grid processing will support tiling of images to a manageable size when the input images can no longer be held in memory. Montage will ultimately run operationally on the TeraGrid. We describe science applications of Montage, including its application to science product generation by Spitzer Legacy Program teams and large-scale, all-sky image processing projects.

  • D. S. Katz, A. Bergou, G. B. Berriman, G. L. Block, J. Collier, D. W. Curkendall, J. Good, L. Husman, J. C. Jacob, A. Laity, P. P. Li, C. Miller, T. Prince, H. Siegel, and R. Williams, "Accessing and Visualizing Scientific Spatiotemporal Data," Proceedings of the 16th International Conference on Scientific and Statistical Database Management, pp. 107-110, 2004.
    (This paper is not subject to copyright restrictions in the US)

    Abstract: This paper discusses work done by JPL's Parallel Applications Technologies Group in helping scientists access and visualize very large data sets through the use of multiple computing resources, such as parallel supercomputers, clusters, and grids. These tools do one or more of the following tasks: visualize local data sets for local users, visualize local data sets for remote users, and access and visualize remote data sets. The tools are used for various types of data, including remotely sensed image data, digital elevation models, astronomical surveys, etc. The paper attempts to pull some common elements out of these tools that may be useful for others who have to work with similarly large data sets.

  • J. C. Jacob, D. S. Katz, T. Prince, G. B. Berriman, J. C. Good, A. C. Laity, E. Deelman, G. Singh, and M.-H. Su, "The Montage Architecture for Grid-Enabled Science Processing of Large, Distributed Datasets," Proceedings of the 2004 Earth Science Technology Conference (ESTC-04), 2004.
    (This paper may be subject to copyright restrictions)

    Abstract: Montage is an Earth Science Technology Office (ESTO) Computational Technologies (CT) Round III Grand Challenge project that will deploy a portable, compute-intensive, custom astronomical image mosaicking service for the National Virtual Observatory (NVO). Although Montage is developing a compute- and data-intensive service for the astronomy community, we are also helping to address a problem that spans both Earth and space science: how to efficiently access and process multi-terabyte, distributed datasets. In both communities, the datasets are massive, and are stored in distributed archives that are, in most cases, remote with respect to the available computational resources. Therefore, use of state-of-the-art computational grid technologies is a key element of the Montage portal architecture. This paper describes the aspects of the Montage design that are applicable to both the Earth and space science communities.

  • E. Ciocca, I. Koren, Z. Koren, C. M. Krishna, and D. S. Katz, "Application-Level Fault Tolerance and Detection in the Orbital Thermal Imaging Spectrometer," Proceedings of the 2004 Pacific Rim International Symposium on Dependable Computing, pp. 43-48, 2004.
    (This paper is subject to copyright restrictions)

    Abstract: Systems that operate in extremely volatile environments, such as orbiting satellites, must be designed with a strong emphasis on fault tolerance. Rather than rely solely on the system hardware, it may be beneficial to entrust some of the fault handling to software at the application level, which can utilize semantic information and software communication channels to achieve fault tolerance with considerably less power and performance overhead. This paper details the implementation and evaluation of such a software-level approach, Application-Level Fault Tolerance and Detection (ALFTD), in the Orbital Thermal Imaging Spectrometer (OTIS).

  • J. W. Larson, B. Norris, E. T. Ong, D. E. Bernholdt, J. B. Drake, W. R. El Wasif, M. W. Ham, C. E. Rasmussen, G. Kumfert, D. S. Katz, S. Zhou, C. DeLuca, and N. S. Collins, "Components, The Common Component Architecture, and the Climate/Ocean/Weather Community," Proceedings of the 20th International Conference on Interactive Information and Processing Systems (IIPS) for Meteorology, Oceanography, and Hydrology, 84th American Meteorological Society Annual Meeting, 2004.
    (This paper is subject to copyright restrictions)

    Abstract: Earth system and environmental models present the scientist/programmer with multiple challenges in software design, development, and maintenance, overall system integration, and performance. We describe how work in the industrial sector of software engineering - namely component-based software engineering - can be brought to bear to address issues of software complexity. We explain how commercially developed component solutions are inadequate to address the performance needs of the Earth system modeling community. We describe a component-based approach called the Common Component Architecture that has as its goal the creation of a component paradigm that is compatible with the requirements of high-performance computing applications. We outline the relationship and ongoing collaboration between CCA and major Climate/Weather/Ocean community software projects. We present examples of work in progress that uses CCA, and discuss long-term plans for the CCA-climate/weather/ocean collaboration.

  • M. Turmon, R. Granat, D. S. Katz, and J. Z. Lou, "Tests and Tolerances for High-Performance Software-Implemented Fault Detection," IEEE Transactions on Computers, v.52(5), pp. 579-591, May 2003.

    (This paper is not subject to copyright restrictions in the US)

    Abstract: We describe and test a software approach to fault detection in common numerical algorithms. Such result checking or algorithm-based fault tolerance (ABFT) methods may be used, for example, to overcome single-event upsets in computational hardware or to detect errors in complex, high-efficiency implementations of the algorithms. Following earlier work, we use checksum methods to validate results returned by a numerical subroutine operating subject to unpredictable errors in data. We consider common matrix and Fourier algorithms which return results satisfying a necessary condition having a linear form: the checksum tests compliance with this condition. We discuss the theory and practice of setting numerical tolerances to separate errors caused by a fault from those inherent in finite-precision numerical calculations. We concentrate on comprehensively defining and evaluating tests having various accuracy/computation burden tradeoffs, and emphasize average-case algorithm behavior rather than using worst-case upper bounds on error.

  • D. S. Katz and R. R. Some, "NASA Advances Robotic Space Exploration," IEEE Computer, v. 36(1), pp. 52-61, January 2003.

    (This paper is not subject to copyright restrictions in the US)

    Abstract: NASA's successful exploration of space has uncovered vast amounts of new knowledge about the Earth, the solar system and its other planets, and the stellar spaces beyond. To continue gaining new knowledge has required - and will continue to require - new capabilities in onboard processing hardware, system software, and applications such as autonomy.

    For example, initial robotic space exploration missions functioned, for the most part, as large flying cameras. These instruments have evolved over time to include more sophisticated imaging radar, multispectral imagers, spectrometers, gravity wave detectors, a host of prepositioned sensors and, most recently, rovers.

  • D. S. Katz, E. R. Tisdale, and C. D. Norton, "The Common Component Architecture (CCA) Applied to Sequential and Parallel Computational Electromagnetic Applications," Recent Advances in Computational Science & Engineering: Proceedings of the International Conference on Scientific & Engineering Computation (IC-SEC) 2002, pp. 353-356, Imperial College Press, 2002.

    (This paper is not subject to copyright restrictions in the US)

    Abstract: The development of large-scale multi-disciplinary scientific applications for high-performance computers today involves managing the interaction between portions of the application developed by different groups. The CCA (Common Component Architecture) Forum is developing a component architecture specification to address high-performance scientific computing, emphasizing scalable (possibly-distributed) parallel computations. This paper presents an examination of the CCA software in sequential and parallel electromagnetics applications using unstructured adaptive mesh refinement (AMR). The CCA learning curve and the process for modifying Fortran 90 code (a driver routine and an AMR library) into two components are described. The performance of the original applications and the componentized versions are measured and shown to be comparable.

  • G. B. Berriman, D. Curkendall, J. Good, J. Jacob, D. S. Katz, M. Kong, S. Monkewitz, R. Moore, T. Prince, R. Williams, "An Architecture for Access to Compute Intensive Image Mosaic and Cross-Identification Services in the NVO," Proceedings of the SPIE Conference on Astronomical Telescopes and Instrumentation, SPIE, pp. 91-102, 2002.
    (This paper is subject to copyright restrictions)

    Abstract: The National Virtual Observatory (NVO) will provide on-demand access to data collections, data fusion services and compute intensive applications. The paper describes the development of a framework that will support two key aspects of these objectives: a compute engine that will deliver custom image mosaics, and a "request management system," based on an e-business applications server, for job processing, including monitoring, failover and status reporting. We will develop this request management system to support a diverse range of astronomical requests, including services scaled to operate on the emerging computational grid infrastructure. Data requests will be made through existing portals to demonstrate the system: the NASA/IPAC Extragalactic Database (NED); the On-Line Archive Science Information Services (OASIS) at the NASA/IPAC Infrared Science Archive (IRSA); the Virtual Sky service at Caltech's Center for Advanced Computing Research (CACR); and the yourSky mosaic server at the Jet Propulsion Laboratory (JPL).

  • T. Sterling, D. S. Katz, and L. Bergman, "High-Performance Computing Systems for Autonomous Spaceborne Missions," International Journal of High Performance Computing Applications, v. 15(3), pp. 282-296, Fall 2001.

    (This paper is not subject to copyright restrictions in the US)

    Abstract: Future generation space missions across the solar system to the planets, moons, asteroids, and comets may someday incorporate supercomputers both to expand the range of missions being conducted and to significantly reduce their cost. By performing science computation directly on the spacecraft itself, the amount of data required to be down-linked may be reduced by many orders of magnitude, thus greatly reducing the mass of the resources needed for communication while increasing the quality and quantity of the science achieved. By performing the mission planning in real time directly on the spacecraft, complex and highly responsive missions can be conducted out of range of direct human intervention and the cost of mission management can be reduced. Through highly replicated computing structures, continued operation can be maintained in the presence of faults by means of graceful degradation. Two classes of systems, reflecting very different strategies of computer system architecture, are actively being pursued by the NASA Jet Propulsion Laboratory to take advantage of the opportunity of embedded high performance computing on spacecraft for deep space missions. COTS Clusters may permit the direct application of commercial computing hardware in loosely coupled ensembles to benefit from the enormous investment of industry in mass market components. New Processor-in-Memory architectures combine multiple nodes on a single chip of processor-memory pairs exposing the full memory bandwidth. This paper examines the driving issues motivating the use of supercomputing for future deep space missions and describes two active research projects at NASA JPL that are pursuing both the COTS and PIM strategies for next generation spaceborne computing.

  • J. A. Gunnels, D. S. Katz, E. S. Quintana-Ortí, and R. A. van de Geijn, "Fault-Tolerant High-Performance Matrix Multiplication: Theory and Practice," Proceedings of International Conference on Dependable Systems and Networks, 2001.

    (This paper is not subject to copyright restrictions in the US)

    Abstract: In this paper, we extend the theory and practice regarding algorithmic fault-tolerant matrix-matrix multiplication, C = AB, in a number of ways. First, we propose low-overhead methods for detecting errors introduced not only in C but also in A and/or B. Second, we show that, theoretically, these methods will detect all errors as long as only one entry is corrupted. Third, we propose a low-overhead roll-back approach to correct errors once detected. Finally, we give a high-performance implementation of matrix-matrix multiplication that incorporates these error detection and correction methods. Empirical results demonstrate that these methods work well in practice while imposing an acceptable level of overhead relative to high-performance implementations without fault-tolerance.

  • D. S. Katz, and J. Kepner, "Embedded/Real-Time Systems," International Journal of High Performance Computing Applications (special issue: Cluster Computing White Paper), v. 15(2), pp. 186-190, Summer 2001.
    (This paper is not subject to copyright restrictions in the US; however, the full special issue is subject to copyright restrictions)

  • T. Cwik, D. S. Katz, and F. J. Villegas, "Integrated Design and Simulation for Millimeter-Wave Antenna Systems," Proceedings of the IEEE Aerospace Conference, 2001.

    (This paper is not subject to copyright restrictions in the US)

    Abstract: Several instruments operating in the microwave and millimeter-wave bands are to be developed over the next several years at either JPL or JPL in conjunction with various other companies and laboratories. The design and development of these instruments requires an environment that can produce a microwave or millimeter-wave optics design, and can assess the sensitivity of key design criteria (beamwidth, gain, sidelobe levels, etc.) to thermal and mechanical operating environments. An integrated design tool has been developed to carry out the design and analysis using software building blocks from the computer-aided design, thermal, structural and electromagnetic analysis fields. The capability to simultaneously assess the effects of design parameter variation resulting from thermal and structural loads can reduce design and validation cost and generally lead to more optimal designs, hence higher performing instruments.

    In this paper the development and application of MODTool (Millimeter-wave Optics Design), a design tool that efficiently integrates existing millimeter-wave optics design software with a solid body modeler and thermal/structural analysis packages, will be discussed. The design tool is also directly useful over other portions of the spectrum, though thermal or dynamical loads may have less influence on antenna patterns at the longer wavelengths. Under a common interface, interactions between the various components of a design can be efficiently evaluated and optimized. One key component is the use of physical optics analysis software for antenna pattern analysis. This software has been ported to various platforms including distributed memory parallel supercomputers to allow rapid turn-around for electrically large designs.

  • S. A. Curtis, M. Rilee, M. Bhat, and D. Katz, "Small Satellite Constellation Autonomy via on-board Supercomputers and Artificial Intelligence," International Astronautical Federation, 51st Congress, 2000.

    (This paper is not subject to copyright restrictions in the US)

    Abstract: Under NASA's Remote Exploration and Experimentation program, we have examined the role of fault tolerant on-board supercomputing in the context of the science requirements of nanospacecraft missions for both in situ and remote sensing scenarios relating to the study of Sun-Earth connections. The capability of on-board reduction of data allows a great decrease in data bandwidth requirements to Earth (factors of more than 1000 are achievable); this opens the pathway to much greater autonomy, since fewer ground contacts of shorter duration are needed. Further reductions are possible using the on-board supercomputers to evaluate the data, to make operational decisions based on this, and to directly evaluate spacecraft subsystems. The ultimate goal is to achieve absolute scalability wherein it is no more demanding on ground support to operate a constellation of more than 100 spacecraft than it presently is to operate one spacecraft. Specific applications of heuristic systems to these problems are discussed.

  • M. Turmon, R. Granat, and D. S. Katz, "Software Fault Tolerance for High-Performance Space Applications," Proceedings of International Conference on Dependable Systems and Networks, 2000.
    (This paper is not subject to copyright restrictions in the US)

    Abstract: We describe and test a software approach to overcoming radiation-induced errors in spaceborne applications running on commercial off-the-shelf components. The approach uses checksum methods to validate results returned by a numerical subroutine operating subject to unpredictable errors in data. We can treat subroutines that return results satisfying a necessary condition having a linear form: the checksum tests compliance with this condition. We discuss the theory and practice of setting numerical tolerances to separate errors caused by a fault from those inherent in finite-precision numerical calculations. We test both the general effectiveness of the linear fault tolerant schemes we propose, and the correct behavior of our parallel implementation of them.

  • F. Chen, L. Craymer, J. Deifik, A. J. Fogel, D. S. Katz, A. G. Silliman, Jr, R. R. Some, S. A. Upchurch, and K. Whisnant, "Demonstration of the Remote Exploration and Experimentation (REE) Fault-Tolerant Parallel-Processing Supercomputer for Spacecraft Onboard Scientific Data Processing," Proceedings of International Conference on Dependable Systems and Networks, 2000.
    (This paper is not subject to copyright restrictions in the US)

    Abstract: This paper is the written explanation for a demonstration of the REE Project's work to date. The demonstration is intended to simulate an REE system that might exist on a Mars Rover, consisting of multiple COTS processors, a COTS network, a COTS node-level operating system, REE middleware, and an REE application. The specific application performs texture processing of images. It was chosen as a building block of automated geological processing that will eventually be used for both navigation and data processing. Because the COTS hardware is not radiation hardened, SEU-induced soft errors will occur. These errors are simulated in the demonstration by use of a software-implemented fault-injector, and are injected at a rate much higher than is realistic for the sake of viewer interest. Both the application and the middleware contain mechanisms for both detection of and recovery from these faults, and these mechanisms are tested by this very high fault rate. The consequence of the REE system being able to tolerate this fault rate while continuing to process data is that the system will easily be able to handle the true fault rate.

  • J. Beahan, L. Edmonds, R. Ferraro, A. Johnston, D. S. Katz, and R. R. Some, "Detailed Radiation Fault Modeling of the Remote Exploration and Experimentation (REE) First Generation Testbed Architecture," Proceedings of the IEEE Aerospace Conference, 2000.

    (This paper is not subject to copyright restrictions in the US)

    Abstract: The goal of the NASA HPCC Remote Exploration and Experimentation (REE) Project is to transfer commercial supercomputing technology into space. The project will use state of the art, low-power, non-radiation-hardened, Commercial Off-The-Shelf (COTS) hardware chips and COTS software to the maximum extent possible, and will rely on Software-Implemented Fault Tolerance (SIFT) to provide the required levels of availability and reliability. In this paper, we outline the methodology used to develop a detailed radiation fault model for the REE Testbed architecture. The model addresses the effects of energetic protons and heavy ions which cause Single Event Upset (SEU) and Single Event Multiple Upset (SEMU) events in digital logic devices and which are expected to be the primary fault generation mechanism. Unlike previous modeling efforts, this model will address fault rates and types in computer subsystems at a sufficiently fine level of granularity (i.e., the register level) that specific software and operational errors can be derived. We present the current state of the model, model verification activities and results to date, and plans for applications.

  • Yi Chao, P. Peggy Li, Ping Wang, Daniel S. Katz, Benny N. Cheng, and Scott Whitman, "Ocean Modeling and Visualization on a Massively Parallel Computer," Industrial Strength Parallel Computing: Programming Massively Parallel Processing Systems (Editor: Alice Koniges), Morgan Kaufmann Publishers, Inc., 1999.

    (This chapter is not subject to copyright restrictions in the US)

  • T. Cwik, C. Zuffada, D. S. Katz, and J. Parker, "Radar Scattering and Antenna Modeling on Scalable High Performance Computers," Industrial Strength Parallel Computing: Programming Massively Parallel Processing Systems (Editor: Alice Koniges), Morgan Kaufmann Publishers, Inc., 1999.

    (This chapter is not subject to copyright restrictions in the US)

  • D. S. Katz and T. Cwik, "Computational Electromagnetics," High Performance Cluster Computing: Programming and Applications, v. 2 (Editor: Rajkumar Buyya), Prentice Hall, Inc., 1999.

    (This chapter is not subject to copyright restrictions in the US)

  • D. S. Katz, T. Cwik, and T. Sterling, "An Examination of the Performance of Two Electromagnetic Simulations on a Beowulf-Class Computer," High Performance Computing Systems and Applications (Editor: J. Schaeffer), pp. 207-216, Kluwer Academic Publishers, Norwell, MA, 1998.
    (This chapter is not subject to copyright restrictions in the US)

  • D. S. Katz, T. Cwik, B. H. Kwan, J. Z. Lou, P. L. Springer, T. L. Sterling, and P. Wang, "An Assessment of a Beowulf System for a Wide Class of Analysis and Design Software," Advances in Engineering Software, v. 26(3-6), pp. 451-461, July 1998.

    (This paper is not subject to copyright restrictions in the US)

    Abstract: This paper discusses Beowulf systems, focusing on Hyglac, the Beowulf system installed at the Jet Propulsion Laboratory. The purpose of the paper is to assess how a system of this type will perform while running a variety of scientific and engineering analysis and design software. The first part of the assessment contains a measurement of the communication performance of Hyglac, along with a discussion of factors which have the potential to limit system performance. The second part consists of performance measurements of six specific programs (analysis and design software), as well as discussion about these measurements. Finally, the measurements and discussion lead to the conclusion that Hyglac is suitable for running these types of codes (in a research/industrial environment such as at JPL), and that the primary factor for determining how a given code will perform is that code's ratio of communication to computation.

  • T. Cwik, J. Z. Lou, and D. S. Katz, "Scalable, Finite Element Analysis of Electromagnetic Scattering and Radiation," Advances in Engineering Software, v. 26(3-6), pp. 289-296, July 1998.

    (This paper is not subject to copyright restrictions in the US)

    Abstract: In this paper a method for simulating electromagnetic fields scattered from complex objects is reviewed; namely, an unstructured finite element code that does not use traditional mesh partitioning algorithms. The complete software package is implemented on the Cray T3D massively parallel processor using both Cray Adaptive FORTRAN (CRAFT) compiler constructs to simplify portions of the code that operate on the irregular data, and optimized message passing constructs on portions of the code that operate on regular data and require optimum machine performance. The above finite element solution package is then integrated into an error estimation and adaptive mesh refinement algorithm. An error for the fields over the mesh is estimated and used to drive an adaptive mesh refinement algorithm that refines the mesh where errors are above a given tolerance. This refined mesh is then used once again for a finite element solution of the fields. After estimating the error a second time, the mesh is again refined as needed. This process continues until the error is reduced to an allowable level.

  • T. Cwik, D. S. Katz, C. Zuffada, and V. Jamnejad, "The Application of Scalable Distributed Memory Computers to the Finite Element Modeling of Electromagnetic Scattering and Radiation," International Journal of Numerical Methods in Engineering, v. 41(4), pp. 759-776, February 28, 1998.

    (This paper is not subject to copyright restrictions in the US)

    Abstract: Large scale parallel computation can be an enabling resource in many areas of engineering and science if the parallel simulation algorithm attains an appreciable fraction of the machine peak performance, and if undue cost in porting the code or in developing the code for the parallel machine is not incurred. The issue of code parallelization is especially significant when considering unstructured mesh simulations. The unstructured mesh models considered in this paper result from a finite element simulation of electromagnetic fields scattered from geometrically complex objects (either penetrable or impenetrable). The unstructured mesh must be distributed among the processors, as must the resultant sparse system of linear equations. Since a distributed memory architecture does not allow direct access to the irregularly distributed unstructured mesh and sparse matrix data, partitioning algorithms not needed in the sequential software have traditionally been used to efficiently spread the data among the processors. This paper presents a new method for simulating electromagnetic fields scattered from complex objects; namely, an unstructured finite element code that does not use traditional mesh partitioning algorithms.

  • P. Wang, D. S. Katz, and Y. Chao, "Optimization of a Parallel Ocean General Circulation Model," (Best Paper Award Winner), SC97, 1997.
    Abstract: Global climate modeling is one of the grand challenges of computational science, and ocean modeling plays an important role in both understanding the current climatic conditions and predicting the future climate change. Three-dimensional time-dependent ocean general circulation models (OGCMs) require a large amount of memory and processing time to run realistic simulations. Recent advances in computing hardware have dramatically affected the prospect of studying the global climate. The significant computational resources of massively parallel supercomputers promise to make such studies feasible. In addition to using advanced hardware, designing and implementing a well-optimized parallel ocean code will significantly improve the computational performance and reduce the total research time to complete these studies.

    In our present work, we chose the most widely used OGCM code as our base code. This OGCM is based on the Parallel Ocean Program (POP) developed in FORTRAN 90 on the Los Alamos CM-2 Connection Machine by the Los Alamos ocean modeling research group. During the first half of 1994, the code was ported to the Cray T3D by Cray Research using SHMEM-based message passing. Since the code on the T3D was still time-consuming when large problems were encountered, improving the code performance was considered essential.

    We have developed several general strategies to optimize the ocean general circulation model on the Cray T3D. These strategies include memory optimization, effective use of arithmetic pipelines, and usage of optimized libraries. The optimized code runs 2 to 2.5 times faster than the original code, which gives significant performance improvements for modeling large-scale ocean flows. Many test runs for both the original and the optimized code have been carried out on the Cray T3D using various numbers of processors (1-256). Comparisons are made for a variety of real-world problems. A nearly linear scaling performance line is obtained for the optimized code, while the speedup data of the optimized code also shows excellent improvement over the original code.

    In addition to discussing the optimization of the code, we also address the issue of portability. Given the short life cycle of the massively parallel computer, usually on the order of three to five years, we emphasize the portability of the ocean model and the associated optimization routines across several computing platforms. Currently, the ocean modeling code has been ported successfully to the Hewlett Packard (HP)/Convex SPP-2000, and is readily portable to Cray T3E.

    This paper reports our efforts to optimize the parallel implementations of the oceanic model. So far, the work has focused on improving the load balancing and single node performance of the code on the Cray T3D. As a result, the atmosphere and ocean model components running side-by-side can achieve a performance level of slightly more than 10 GFLOPS on 512 processors of that machine. We have also developed a user-friendly coupling interface with atmospheric and biogeochemical models, in order to make the global climate modeling more complete and more realistic.

  • T. Cwik, D. S. Katz, and J. Patterson, "Scalable Solutions to Integral-Equation and Finite-Element Simulations," (invited paper for special issue on Advanced Numerical Techniques in Electromagnetics), IEEE Trans. Antennas Propagat., v. 45(3), pp. 544-555, March 1997.

    (This paper is not subject to copyright restrictions in the US)

    Abstract: When developing numerical methods, or applying them to the simulation and design of engineering components, it inevitably becomes necessary to examine the scaling of the method with a problem's electrical size. The scaling results from the original mathematical development - for example, a dense system of equations in the solution of integral equations - as well as the specific numerical implementation. Scaling of the numerical implementation depends upon many factors - for example, direct or iterative methods for solution of the linear system - as well as the computer architecture used in the simulation. In this paper, scalability will be divided into two components: scalability of the numerical algorithm specifically on parallel computer systems, and algorithm or sequential scalability. The sequential implementation and scaling is initially presented with the parallel implementation following. This progression is meant to illustrate the differences in using current parallel platforms and sequential machines, and the resulting savings. Time to solution (wall clock time) for differing problem sizes is the key parameter plotted or tabulated. Sequential and parallel scalability of time harmonic surface integral equation forms and the finite element solution to the partial differential equations are considered in detail.

  • C. E. Reuter, R. M. Joseph, E. T. Thiele, D. S. Katz, and A. Taflove, "Ultrawideband Absorbing Boundary Conditions for Termination of Waveguiding Structures in FD-TD Simulations," IEEE Microwave and Guided Wave Letters, v. 4(10), pp. 344-346, October 1994.

    (This paper is subject to copyright restrictions)

    Abstract: A new method for ultrawideband termination of waveguides in finite-difference time-domain (FD-TD) grids is presented. The Berenger perfectly matched layer (PML) absorbing boundary condition is applied to terminate both perfect electrically conducting (PEC) and dielectric waveguides in two dimensions. Reflections of less than -75 dB are obtained over the entire propagation regime. Evidence is presented that the PML ABC is effective even for the evanescent energy present below cutoff in PEC waveguides and the multimode propagation present in dielectric waveguides.

  • D. S. Katz, E. T. Thiele, and A. Taflove, "Validation and Extension to Three Dimensions of the Berenger PML Absorbing Boundary Condition for FD-TD Meshes," IEEE Microwave and Guided Wave Letters, v. 4(8), pp. 268-270, August 1994.

    (This paper is subject to copyright restrictions)

    Abstract: Berenger recently published a novel absorbing boundary condition (ABC) for FD-TD meshes in two dimensions, claiming orders-of-magnitude improved performance relative to any earlier technique. This approach, which he calls the "perfectly matched layer (PML) for the absorption of electromagnetic waves," creates a nonphysical absorber adjacent to the outer grid boundary that has a wave impedance independent of the angle of incidence and frequency of outgoing scattered waves.

    This paper verifies Berenger's strong claims for PML for 2-D FD-TD grids and extends and verifies PML for 3-D FD-TD grids. Indeed, PML is > 40 dB more accurate than second-order Mur, and PML works just as well in 3-D as it does in 2-D. It should have a major impact upon the entire FD-TD modeling community, leading to new possibilities for high-accuracy simulations especially for low-observable aerospace targets.

  • M. J. Piket-May, A. Taflove, W. C. Lin, D. S. Katz, V. Sathiaseelan, and B. B. Mittal, "Initial Results for Automated Computational Modeling of Patient-Specific Electromagnetic Hyperthermia," IEEE Trans. Biomedical Engineering, v. 39(3), pp. 226-237, March 1992.

    (This paper is subject to copyright restrictions)

    Abstract: Developments in finite-difference time-domain (FDTD) computational modeling of Maxwell's equations, super-computer technology, and computed tomography (CT) imagery open the possibility of accurate numerical simulation of electromagnetic (EM) wave interactions with specific, complex, biological tissue structures. One application of this technology is in the area of treatment planning for EM hyperthermia. In this paper, we report the first highly automated CT image segmentation and interpolation scheme applied to model patient-specific EM hyperthermia. This novel system is based on sophisticated tools from the artificial intelligence, computer vision, and computer graphics disciplines. It permits CT-based patient-specific hyperthermia models to be constructed without tedious manual contouring on digitizing pads or CRT screens. The system permits in principle near real-time assistance in hyperthermia treatment planning. We apply this system to interpret actual patient CT data, reconstructing a 3-D model of the human thigh from a collection of 29 serial CT images at 10 mm intervals. Then, using FD-TD, we obtain 2-D and 3-D models of EM hyperthermia of this thigh due to a waveguide applicator. We find that different results are obtained from the 2-D and 3-D models, and conclude that full 3-D tissue models are required for future clinical usage.

  • D. S. Katz, M. J. Piket-May, A. Taflove, and K. R. Umashankar, "FD-TD Analysis of Electromagnetic Wave Radiation from Systems Containing Horn Antennas," IEEE Trans. Antennas Propagat., v. 39(8), pp. 1203-1212, August 1991.

    (This paper is subject to copyright restrictions)

    Abstract: The application of the finite-difference time-domain (FDTD) method to various radiating structures is considered. These structures include two- and three-dimensional waveguides, flared horns, a two-dimensional parabolic reflector, and a two-dimensional hyperthermia application. Numerical results for the horns, waveguides, and parabolic reflectors are compared with results from method of moments (MM). The results for the hyperthermia application are shown as extensions of the previously validated models. This new application of the FDTD method is shown to be useful when other numerical or analytic methods cannot be applied.

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

For papers with copyright held by IEEE:
©20xx IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

For papers with copyright held by ACM:
ACM COPYRIGHT NOTICE. Copyright ©20xx by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept., ACM, Inc., fax +1 (212) 869-0481, or permissions@acm.org.

AMS, Elsevier, and SPIE allow authors of papers to post their own papers on a personal website.

For a copy of a paper with restricted access for personal use, please contact: d.katz@ieee.org
