Access to published material may be restricted.
- G. Singh, M.-H. Su, K. Vahi, E. Deelman, B. Berriman, J. Good,
D. S. Katz, and G. Mehta, "Workflow Task Clustering for Best Effort Systems with
Pegasus," Proceedings of 15th Mardi Gras Conference, 2008.
(This paper may be subject to copyright restrictions)
Abstract:
Many scientific workflows are composed of fine computational
granularity tasks, yet they are composed of thousands of them and
are data intensive in nature, thus requiring resources such as the
TeraGrid to execute efficiently. In order to improve the
performance of such applications, we often employ task clustering
techniques to increase the computational granularity of workflow
tasks. The goal is to minimize the completion time of the workflow
by reducing the impact of queue wait times. In this paper, we
examine the performance impact of the clustering techniques using
the Pegasus workflow management system. Experiments performed
using an astronomy workflow on the NCSA TeraGrid cluster show
that clustering can achieve a significant reduction in the workflow
completion time (up to 97%).
-
D. S. Katz, J. C. Jacob, P. P. Li, Y. Chao, G. Allen,
"Data-Oriented
Distributed Computing for Science: Reality and Possibilities,"
On the Move to Meaningful Internet Systems 2006: CoopIS, DOA, GADA, and ODBASE, Lecture Notes in Computer Science,
v. 4276, pp. 1119-1124, 2006.
(This paper may be subject to copyright restrictions)
Abstract:
As is becoming commonly known, there is an explosion happening in the
amount of scientific data that is publicly available. One challenge is
how to make productive use of this data. This talk will discuss some
parallel and distributed computing projects, centered around virtual
astronomy, but also including other scientific data-oriented realms.
It will look at some specific projects from the past, including Montage,
Grist, OurOcean, and SCOOP, and will discuss the distributed computing,
Grid, and Web-service technologies that have successfully been used in
these projects.
-
J. C. Jacob, D. S. Katz, G. B. Berriman, J. Good, A. C. Laity,
E. Deelman, C. Kesselman, G. Singh, M.-H. Su, T. A. Prince, R. Williams,
"Montage:
A Grid Portal and Software Toolkit for Science-Grade
Astronomical Image Mosaicking,"
International Journal of Computational Science and Engineering,
in press.
(This paper is not subject to copyright restrictions in the US)
Abstract:
Montage is a portable software toolkit for constructing custom,
science-grade mosaics by composing multiple astronomical images.
The mosaics constructed by Montage preserve the astrometry (position)
and photometry (intensity) of the sources in the input images.
The mosaic to be constructed is specified by the user in terms of a set
of parameters, including dataset and wavelength to be used, location
and size on the sky, coordinate system and projection, and spatial
sampling rate. Many astronomical datasets are massive, and are stored
in distributed archives that are, in most cases, remote with respect
to the available computational resources. Montage can be run on both
single- and multi-processor computers, including clusters and grids.
Standard grid tools are used to run Montage in the case where the
data or computers used to construct a mosaic are located remotely
on the Internet. This paper describes the architecture, algorithms,
and usage of Montage as both a software toolkit and as a grid portal.
Timing results are provided to show how Montage performance scales with
number of processors on a cluster computer. In addition, we compare
the performance of two methods of running Montage in parallel on a grid.
- G. B. Berriman, J. C. Good,
A. C. Laity, J. C. Jacob, D. S. Katz,
E. Deelman, G. Singh, M.-H. Su, and T. Prince,
"The
Design and Applications of Montage: An Astronomical Image Mosaic Engine,"
Proceedings of the 2006 Earth Science Technology Conference (ESTC-06),
2006.
(This paper may be subject to copyright restrictions)
Abstract:
Montage is a portable toolkit for constructing
custom, science-grade mosaics by composing multiple
astronomical images. The mosaics constructed by Montage
preserve the astrometry (position) and photometry (intensity)
of the sources in the input images. The mosaic to be
constructed is specified by the user in terms of a set of
parameters, including dataset and wavelength to be used,
location and size on the sky, coordinate system and
projection, and spatial sampling rate. Many astronomical
datasets are massive, and are stored in distributed archives
that are, in most cases, remote with respect to the available
computational resources. The paper describes scientific
applications of Montage by NASA projects and researchers,
who run the software on both single- and multi-processor
computers, including clusters and grids. Standard grid tools
are used to run Montage in the case where the data or
computers used to construct a mosaic are located remotely on
the Internet. This paper describes the architecture,
algorithms, and performance of Montage as both a software
toolkit and as a grid portal.
-
D. E. Bernholdt, B. A. Allan, R. Armstrong, F. Bertrand, K. Chiu,
T. L. Dahlgren, K. Damevski, W. R. Elwasif, T. G. W. Epperly, M. Govindaraju,
D. S. Katz, J. A. Kohl, M. Krishnan, G. Kumfert, J. W. Larson, S. Lefantzi,
M. J. Lewis, A. D. Malony, L. C. McInnes, J. Nieplocha, B. Norris, S. G. Parker,
J. Ray, S. Shende, T. L. Windus, and S. Zhou,
"A
Component Architecture for High-Performance Scientific Computing,"
International Journal of High Performance Computing Applications,
v. 20(2), pp. 163-202, Summer 2006.
(This paper may be subject to copyright restrictions)
Abstract:
The Common Component Architecture (CCA) provides a means for software
developers to manage the complexity of large-scale scientific
simulations and to move toward a plug-and-play environment for
high-performance computing. In the scientific computing context,
component models also promote collaboration using independently
developed software, thereby allowing particular individuals or groups
to focus on the aspects of greatest interest to them. The CCA supports
parallel and distributed computing as well as local high-performance
connections between components in a language-independent manner. The
design places minimal requirements on components and thus facilitates
the integration of existing code into the CCA environment. The CCA
model imposes minimal overhead to minimize the impact on application
performance. The focus on high performance distinguishes the CCA from
most other component models. The CCA is being applied within an
increasing range of disciplines, including combustion research, global
climate simulation, and computational chemistry.
- E. Deelman, G. Singh, M.-H. Su, J. Blythe, Y. Gil, C. Kesselman,
G. Mehta, K. Vahi, G. B. Berriman, J. Good, A. Laity, J. C. Jacob,
and D. S. Katz,
"Pegasus:
a Framework for Mapping Complex Scientific Workflows onto
Distributed Systems," Scientific Programming, v. 13(3), pp. 219-237,
November 2005.
(This paper is subject to copyright restrictions)
Abstract:
This paper describes the Pegasus framework that can be used to map complex scientific workflows onto distributed resources. Pegasus enables users to represent the workflows at an abstract level without needing to worry about the particulars of the target execution systems. The paper describes general issues in mapping applications and the functionality of Pegasus. We present the results of improving application performance through workflow restructuring.
-
D. S. Katz, N. Anagnostou, G. B. Berriman, E. Deelman,
J. Good, J. C. Jacob, C. Kesselman, A. Laity,
T. A. Prince, G. Singh, M.-H. Su, and R. Williams,
"Astronomical Image Mosaicking on a Grid: Initial Experiences,"
Engineering the Grid - Status and Perspective,
(Editors: B. Di Martino, J. Dongarra, A. Hoisie, L. Yang, and H. Zima),
American Scientific Publishers, 2006.
(This chapter is not subject to copyright restrictions in the US)
Abstract:
This chapter discusses some grid experiences in solving the problem of
generating large astronomical image mosaics by composing multiple
small images, from the team that has developed Montage
(http://montage.ipac.caltech.edu/). The problem of generating these
mosaics is complex in that individual images must be projected into a
common coordinate space, overlaps between images calculated, the
images processed so that the backgrounds match, and images composed
while using a variety of techniques to handle the presence of multiple
pixels in the same output space. To accomplish these tasks, a suite
of software tools called Montage has been developed. The modules in
this suite can be run on a single processor computer using a simple
shell script, and can additionally be run using a combination of
parallel approaches. These include running MPI versions of some
modules, and using standard grid tools. In the latter case,
processing workflows are automatically generated, and appropriate data
sources are located and transferred to a variety of parallel
processing environments for execution. As a result, it is now possible
to generate large-scale mosaics on-demand in timescales that support
iterative, scientific exploration. In this chapter, we describe
Montage, how it was modified to execute in the grid environment, the
tools that were used to support its execution, as well as performance
results.
- D. S. Katz, G. B. Berriman, E. Deelman, J. Good, J. C. Jacob, C. Kesselman,
A. C. Laity, T. A. Prince, G. Singh, and M.-H. Su,
"A
Comparison of Two Methods for Building Astronomical Image
Mosaics on a Grid,"
Proceedings of 34th International Conference on Parallel Processing Workshops,
pp. 85-94, 2005.
(This paper is not subject to copyright restrictions in the US)
Abstract:
This paper compares two methods for running an application composed
of a set of modules on a grid. The set of modules (collectively
called Montage) generates large astronomical image mosaics by
composing multiple small images. The workflow that describes a
particular run of Montage can be expressed as a directed acyclic
graph (DAG), or as a short sequence of parallel (MPI) and sequential
programs. In the first case, Pegasus can be used to run the workflow.
In the second case, a short shell script that calls each program
can be run. In this paper, we discuss the Montage modules, the
workflow run for a sample job, and the two methods of actually
running the workflow. We examine the run time for each method and
compare the portions that differ between the two methods.
- G. Singh, E. Deelman, G. Mehta, K. Vahi, M.-H. Su, G. B. Berriman, J. Good,
J. Jacob, D. S. Katz, A. Lazzarini, K. Blackburn, and S. Koranda,
"The
Pegasus Portal: Web Based Grid Computing,"
Proceedings of the 20th Annual ACM Symposium on Applied Computing
(SAC 2005), pp. 680-686, 2005.
(This paper is subject to copyright restrictions)
Abstract:
Pegasus is a planning framework for mapping abstract workflows for
execution on the Grid. This paper presents the implementation of a
web-based portal for submitting workflows to the Grid using Pegasus.
The portal also includes components for generating abstract workflows
based on a metadata description of the desired data products and
application-specific services. We describe our experiences in using
this portal for two Grid applications. A major contribution of our
work is in introducing several components that can be useful for Grid
portals and hence should be included in Grid portal development
toolkits.
- A. C. Laity, N. Anagnostou, G. B. Berriman, J. C. Good,
J. C. Jacob, D. S. Katz, and T. Prince,
"Montage:
An Astronomical Image Mosaic Service for the NVO,"
Astronomical Data Analysis Software & Systems (ADASS) XIV, 2004.
(This paper is subject to copyright restrictions)
Abstract:
Montage is a software system for generating astronomical image mosaics
according to user-specified size, rotation, WCS-compliant projection
and coordinate system, with background modeling and rectification
capabilities. Its architecture has been described in the proceedings
of ADASS XII and XIII. It has been designed as a toolkit, with
independent modules for image reprojection, background rectification
and coaddition, and will run on workstations, clusters and grids. The
primary limitation of Montage thus far has been in the projection
algorithm. It uses a spherical trigonometry approach that is general
at the expense of speed. The reprojection algorithm has now been made
30 times faster for commonly used tangent plane to tangent plane
reprojections that cover up to several square degrees, through
modification of a custom algorithm first derived by the Spitzer Space
Telescope. This focus session will describe this algorithm,
demonstrate the generation of mosaics in real time, and describe
applications of the software. In particular, we will highlight one
case study which shows how Montage is supporting the generation of
science-grade mosaics of images measured with the Infrared Array
Camera aboard the Spitzer Space Telescope.
- J. C. Jacob, R. Williams, J. Babu, S. G. Djorgovski, M. J. Graham,
D. S. Katz, A. Mahabal, C. D. Miller, R. Nichol, D. E. Vanden Berk,
and H. Walia,
"Grist:
Grid Data Mining for Astronomy,"
Astronomical Data Analysis Software & Systems (ADASS) XIV, 2004.
(This paper is not subject to copyright restrictions in the US)
Abstract:
The Grist project is developing a grid-technology based system as a
research environment for astronomy with massive and complex datasets.
This knowledge extraction system will consist of a library of
distributed grid services controlled by a workflow system, compliant
with standards emerging from the grid computing, web services, and
virtual observatory communities. This new technology is being used to
find high redshift quasars, study peculiar variable objects, search
for transients in real time, and fit SDSS QSO spectra to measure black
hole masses. Grist services are also a component of the "hyperatlas"
project to serve high-resolution multi-wavelength imagery over the
Internet. In support of these science and outreach objectives, the
Grist framework will provide the enabling fabric to tie together
distributed grid services in the areas of data access, federation,
mining, subsetting, source extraction, image mosaicking, statistics,
and visualization.
- G. B. Berriman, E. Deelman, J. Good, J. Jacob, D. S. Katz, C. Kesselman,
A. Laity, T. A. Prince, G. Singh, M. Su,
"Montage:
A Grid Enabled Engine for Delivering Custom Science-Grade Mosaics on
Demand,"
Proceedings of the SPIE Conference on Astronomical Telescopes and
Instrumentation, SPIE, 2004.
(This paper is subject to copyright restrictions)
Abstract:
This paper describes the design of a grid-enabled version of Montage,
an astronomical image mosaic service, suitable for large scale
processing of the sky. All the re-projection jobs can be added to a
pool of tasks and performed by as many processors as are available,
exploiting the parallelization inherent in the Montage architecture.
We show how we can describe the Montage application in terms of an
abstract workflow so that a planning tool such as Pegasus can derive
an executable workflow that can be run in the Grid environment. The
execution of the workflow is performed by the workflow manager DAGMan
and the associated Condor-G. The grid processing will support tiling
of images to a manageable size when the input images can no longer be
held in memory. Montage will ultimately run operationally on the
TeraGrid. We describe science applications of Montage, including its
application to science product generation by Spitzer Legacy Program
teams and large-scale, all-sky image processing projects.
- D. S. Katz, A. Bergou, G. B. Berriman, G. L. Block, J. Collier,
D. W. Curkendall, J. Good, L. Husman, J. C. Jacob, A. Laity, P. P. Li,
C. Miller, T. Prince, H. Siegel, and R. Williams,
"Accessing
and Visualizing Scientific Spatiotemporal Data,"
Proceedings of the 16th International Conference on Scientific and
Statistical Database Management, pp. 107-110, 2004.
(This paper is not subject to copyright restrictions in the US)
Abstract:
This paper discusses work done by JPL's Parallel Applications
Technologies Group in helping scientists access and visualize very
large data sets through the use of multiple computing resources, such
as parallel supercomputers, clusters, and grids. These tools do one
or more of the following tasks: visualize local data sets for local
users, visualize local data sets for remote users, and access and
visualize remote data sets. The tools are used for various types of
data, including remotely sensed image data, digital elevation models,
astronomical surveys, etc. The paper attempts to pull some common
elements out of these tools that may be useful for others who have to
work with similarly large data sets.
- J. C. Jacob, D. S. Katz, T. Prince, G. B. Berriman, J. C. Good,
A. C. Laity, E. Deelman, G. Singh, and M.-H. Su,
"The
Montage Architecture for Grid-Enabled
Science Processing of Large, Distributed Datasets,"
Proceedings of the 2004 Earth Science Technology Conference (ESTC-04),
2004.
(This paper may be subject to copyright restrictions)
Abstract:
Montage is an Earth Science Technology Office
(ESTO) Computational Technologies (CT) Round III Grand
Challenge project that will deploy a portable, compute-intensive,
custom astronomical image mosaicking service for the National
Virtual Observatory (NVO). Although Montage is developing a
compute- and data-intensive service for the astronomy
community, we are also helping to address a problem that spans
both Earth and space science: how to efficiently access and
process multi-terabyte, distributed datasets. In both
communities, the datasets are massive, and are stored in
distributed archives that are, in most cases, remote with respect
to the available computational resources. Therefore, use of state-
of-the-art computational grid technologies is a key element of
the Montage portal architecture. This paper describes the
aspects of the Montage design that are applicable to both the
Earth and space science communities.
- E. Ciocca, I. Koren, Z. Koren, C. M. Krishna, and D. S. Katz,
"Application-Level
Fault Tolerance and Detection in the Orbital Thermal Imaging Spectrometer,"
Proceedings of the 2004 Pacific Rim International Symposium on
Dependable Computing, pp. 43-48, 2004.
(This paper is subject to copyright restrictions)
Abstract:
Systems that operate in extremely volatile environments, such as
orbiting satellites, must be designed with a strong emphasis on fault
tolerance. Rather than rely solely on the system hardware, it may be
beneficial to entrust some of the fault handling to software at the
application level, which can utilize semantic information and
software communication channels to achieve fault tolerance with
considerably less power and performance overhead. This paper details
the implementation and evaluation of such a software-level approach,
Application-Level Fault Tolerance and Detection (ALFTD) into the
Orbital Thermal Imaging Spectrometer (OTIS).
- J. W. Larson, B. Norris, E. T. Ong, D. E. Bernholdt, J. B. Drake,
W. R. Elwasif, M. W. Ham, C. E. Rasmussen, G. Kumfert, D. S. Katz,
S. Zhou, C. DeLuca, and N. S. Collins,
"Components,
The Common Component Architecture, and the Climate/Ocean/Weather Community,"
Proceedings of the 20th International Conference on Interactive Information
and Processing Systems (IIPS) for Meteorology, Oceanography, and Hydrology,
84th American Meteorological Society Annual Meeting, 2004.
(This paper is subject to copyright restrictions)
Abstract:
Earth system and environmental models present the
scientist/programmer with multiple challenges in software design,
development, and maintenance, overall system integration, and
performance. We describe how work in the industrial sector of
software engineering - namely component-based software engineering -
can be brought to bear to address issues of software complexity. We
explain how commercially developed component solutions are inadequate
to address the performance needs of the Earth system modeling
community. We describe a component-based approach called the
Common Component Architecture that has as its goal the
creation of a component paradigm that is compatible with the
requirements of high-performance computing applications. We outline
the relationship and ongoing collaboration between CCA and major
Climate/Weather/Ocean community software projects. We present
examples of work in progress that uses CCA, and discuss long-term
plans for the CCA-climate/weather/ocean collaboration.
-
M. Turmon, R. Granat, D. S. Katz, and J. Z. Lou,
"Tests and Tolerances for
High-Performance Software-Implemented Fault Detection,"
IEEE Transactions on Computers, v. 52(5), pp. 579-591, May 2003.
(This paper is not subject to copyright restrictions in the US)
Abstract:
We describe and test a software approach to fault detection in common
numerical algorithms. Such result checking or algorithm-based fault
tolerance (ABFT) methods may be used, for example, to overcome
single-event upsets in computational hardware or to detect errors in
complex, high-efficiency implementations of the algorithms.
Following earlier work, we use checksum methods to validate results
returned by a numerical subroutine operating subject to unpredictable
errors in data. We consider common matrix and Fourier algorithms
which return results satisfying a necessary condition having a linear
form: the checksum tests compliance with this condition. We discuss
the theory and practice of setting numerical tolerances to separate
errors caused by a fault from those inherent in finite-precision
numerical calculations. We concentrate on comprehensively defining
and evaluating tests having various accuracy/computation burden
tradeoffs, and emphasize average-case algorithm behavior rather than
using worst-case upper bounds on error.
-
D. S. Katz and R. R. Some,
"NASA
Advances Robotic Space Exploration,"
IEEE Computer, v. 36(1), pp. 52-61, January 2003.
(This paper is not subject to copyright restrictions in the US)
Abstract: NASA's successful exploration of space has uncovered
vast amounts of new knowledge about the Earth, the solar system and its
other planets, and the stellar spaces beyond. To continue gaining new
knowledge has required - and will continue to require - new
capabilities in onboard processing hardware, system software, and
applications such as autonomy.
For example, initial robotic space exploration missions functioned, for
the most part, as large flying cameras. These instruments have evolved
over time to include more sophisticated imaging radar, multispectral
imagers, spectrometers, gravity wave detectors, a host of prepositioned
sensors and, most recently, rovers.
-
D. S. Katz, E. R. Tisdale, and C. D. Norton,
"The
Common Component Architecture (CCA) Applied to Sequential and Parallel
Computational Electromagnetic Applications,"
Recent Advances in Computational Science & Engineering: Proceedings of the
International Conference on Scientific & Engineering Computation
(IC-SEC) 2002, pp. 353-356, Imperial College Press, 2002.
(This paper is not subject to copyright restrictions in the US)
Abstract: The development of large-scale multi-disciplinary
scientific applications for high-performance computers today involves
managing the interaction between portions of the application developed
by different groups. The CCA (Common Component Architecture) Forum is
developing a component architecture specification to address
high-performance scientific computing, emphasizing scalable
(possibly-distributed) parallel computations. This paper presents an
examination of the CCA software in sequential and parallel
electromagnetics applications using unstructured adaptive mesh
refinement (AMR). The CCA learning curve and the process for modifying
Fortran 90 code (a driver routine and an AMR library) into two
components are described. The performance of the original applications
and the componentized versions are measured and shown to be
comparable.
- G. B. Berriman,
D. Curkendall, J. Good, J. Jacob, D. S. Katz, M. Kong, S. Monkewitz,
R. Moore, T. Prince, R. Williams,
"An
Architecture for Access to Compute Intensive Image Mosaic and
Cross-Identification Services in the NVO,"
Proceedings of the SPIE Conference on Astronomical Telescopes and
Instrumentation, SPIE, pp. 91-102, 2002.
(This paper is subject to copyright restrictions)
Abstract:
The National Virtual Observatory (NVO) will provide on-demand access
to data collections, data fusion services and compute intensive
applications. The paper describes the development of a framework
that will support two key aspects of these objectives: a compute
engine that will deliver custom image mosaics, and a "request
management system," based on an e-business applications server, for
job processing, including monitoring, failover and status reporting.
We will develop this request management system to support a diverse
range of astronomical requests, including services scaled to operate
on the emerging computational grid infrastructure. Data requests will
be made through existing portals to demonstrate the system: the
NASA/IPAC Extragalactic Database (NED), the On-Line Archive Science
Information Services (OASIS) at the NASA/IPAC Infrared Science
Archive (IRSA), the Virtual Sky service at Caltech's Center for
Advanced Computing Research (CACR), and the yourSky mosaic
server at the Jet Propulsion Laboratory (JPL).
-
T. Sterling, D. S. Katz, and L. Bergman,
"High-Performance Computing Systems for Autonomous Spaceborne Missions,"
International Journal of High Performance Computing Applications,
v. 15(3), pp. 282-296, Fall 2001.
(This paper is not subject to copyright restrictions in the US)
Abstract:
Future generation space missions across the solar system to the
planets, moons, asteroids, and comets may someday incorporate
supercomputers both to expand the range of missions being conducted
and to significantly reduce their cost. By performing science
computation directly on the spacecraft itself, the amount of data
required to be down-linked may be reduced by many orders of
magnitude, thus greatly reducing the mass of the resources needed for
communication while increasing the quality and quantity of the
science achieved. By performing the mission planning in real time
directly on the spacecraft, complex and highly responsive missions
can be conducted out of range of direct human intervention and the
cost of mission management can be reduced. Through highly replicated
computing structures, continued operation can be maintained in the
presence of faults by means of graceful degradation. Two classes of
systems, reflecting very different strategies of computer system
architecture, are actively being pursued by the NASA Jet Propulsion
Laboratory to take advantage of the opportunity of embedded high
performance computing on spacecraft for deep space missions. COTS
Clusters may permit the direct application of commercial computing
hardware in loosely coupled ensembles to benefit from the enormous
investment of industry in mass market components. New
Processor-in-Memory architectures combine multiple nodes on a single
chip of processor-memory pairs exposing the full memory bandwidth.
This paper examines the driving issues motivating the use of
supercomputing for future deep space missions and describes two
active research projects at NASA JPL that are pursuing both the COTS
and PIM strategies for next generation spaceborne computing.
-
J. A. Gunnels, D. S. Katz,
E. S. Quintana-Ortí, and R. A. van de Geijn,
"Fault-Tolerant High-Performance Matrix Multiplication: Theory and Practice,"
Proceedings of International Conference
on Dependable Systems and Networks, 2001.
(This paper is not subject to copyright restrictions in the US)
Abstract:
In this paper, we extend the theory and practice regarding
algorithmic fault-tolerant matrix-matrix multiplication, C = AB,
in a number of ways. First, we propose low-overhead methods
for detecting errors introduced not only in C but also in
A and/or B. Second, we show that, theoretically,
these methods will detect all errors as long as only one entry is
corrupted. Third, we propose a low-overhead roll-back approach to
correct errors once detected. Finally, we give a high-performance
implementation of matrix-matrix multiplication that incorporates
these error detection and correction methods. Empirical results
demonstrate that these methods work well in practice while imposing
an acceptable level of overhead relative to high-performance
implementations without fault-tolerance.
-
D. S. Katz and J. Kepner,
"Embedded/Real-Time Systems,"
International Journal of High Performance Computing Applications
(special issue:
Cluster Computing White Paper), v. 15(2), pp. 186-190,
Summer 2001.
(This paper is not subject to copyright restrictions in the US; however, the full special issue is subject to copyright restrictions)
-
T. Cwik, D. S. Katz, and F. J. Villegas,
"Integrated Design and Simulation for Millimeter-Wave
Antenna Systems,"
Proceedings of the IEEE Aerospace Conference, 2001.
(This paper is not subject to copyright restrictions in the US)
Abstract:
Several instruments operating in the microwave and millimeter-wave
bands are to be developed over the next several years at either JPL
or JPL in conjunction with various other companies and laboratories.
The design and development of these instruments requires an
environment that can produce a microwave or millimeter-wave optics
design, and can assess the sensitivity of key design criteria
(beamwidth, gain, sidelobe levels, etc.) to thermal and mechanical
operating environments. An integrated design tool has been developed
to carry out the design and analysis using software building blocks
from the computer-aided design, thermal, structural and
electromagnetic analysis fields. The capability to simultaneously
assess the effects of design parameter variation resulting from
thermal and structural loads can reduce design and validation cost
and generally lead to more optimal designs, hence higher performing
instruments.
In this paper the development and application of MODTool
(Millimeter-wave Optics Design), a design tool that efficiently
integrates existing millimeter-wave optics design software with a
solid body modeler and thermal/structural analysis packages, will be
discussed. The design tool is also directly useful over other
portions of the spectrum, though thermal or dynamical loads may have
less influence on antenna patterns at the longer wavelengths. Under a
common interface, interactions between the various components of a
design can be efficiently evaluated and optimized. One key component
is the use of physical optics analysis software for antenna pattern
analysis. This software has been ported to various platforms
including distributed memory parallel supercomputers to allow rapid
turn-around for electrically large designs.
-
S. A. Curtis, M. Rilee, M. Bhat, and D. Katz,
"Small
Satellite Constellation Autonomy via on-board Supercomputers and Artificial
Intelligence," International Astronautical Federation, 51st Congress,
2000.
(This paper is not subject to copyright restrictions in the US)
Abstract:
Under NASA's Remote Exploration and Experimentation program, we have
examined the role of fault tolerant on-board supercomputing in the
context of the science requirements of nanospacecraft missions for
both in situ and remote sensing scenarios relating to the study of
Sun-Earth connections. The capability of on-board reduction of data
allows a great decrease in data bandwidth requirements (factors of
more than 1000 are achievable) to Earth; this opens the pathway to
much greater autonomy owing to fewer ground contacts of shorter
duration needed. Further reductions are possible using the on-board
supercomputers to evaluate the data, to make operational decisions
based on this, and to directly evaluate spacecraft subsystems. The
ultimate goal is to achieve absolute scalability wherein it is no
more demanding on ground support to operate a constellation of more
than 100 spacecraft than it presently is to operate one spacecraft.
Specific applications of heuristic systems to these problems are
discussed.
-
M. Turmon, R. Granat, and D. S. Katz,
"Software
Fault Tolerance for High-Performance Space Applications,"
Proceedings of International Conference on Dependable Systems and Networks,
2000.
(This paper is not subject to copyright restrictions in the US)
Abstract:
We describe and test a software approach to overcoming
radiation-induced errors in spaceborne applications running on
commercial off-the-shelf components. The approach uses checksum
methods to validate results returned by a numerical subroutine
operating subject to unpredictable errors in data. We can treat
subroutines that return results satisfying a necessary condition
having a linear form: the checksum tests compliance with this
condition. We discuss the theory and practice of setting numerical
tolerances to separate errors caused by a fault from those inherent
in finite-precision numerical calculations. We test both the general
effectiveness of the linear fault tolerant schemes we propose, and
the correct behavior of our parallel implementation of them.
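As a rough illustration of the checksum idea described in this abstract (a sketch under stated assumptions, not the authors' implementation), consider a matrix multiply C = A B. Any correct result must satisfy the linear condition C w = A (B w) for a checksum vector w, so the cheap right-hand side can be used to validate C. The function names, the all-ones choice of w, and the relative tolerance `rel_tol` are assumptions made for this example; the tolerance stands in for the paper's discussion of separating fault-induced errors from ordinary finite-precision round-off.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matmul(A, B):
    """Plain matrix multiply; stands in for the numerical subroutine under test."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def checksum_ok(A, B, C, rel_tol=1e-9):
    """Test the necessary linear condition C w == A (B w).

    w is an all-ones checksum vector (an assumption for this sketch).
    rel_tol is a hypothetical tolerance that allows for round-off.
    """
    w = [1.0] * len(B[0])
    lhs = matvec(C, w)             # checksum of the returned result
    rhs = matvec(A, matvec(B, w))  # checksum computed independently of C
    scale = max(1.0, max(abs(x) for x in rhs))
    return all(abs(l - r) <= rel_tol * scale for l, r in zip(lhs, rhs))


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = matmul(A, B)
good = checksum_ok(A, B, C)

# Simulate a fault by corrupting one element of the result.
C_faulty = [row[:] for row in C]
C_faulty[0][1] += 1e-3
bad = checksum_ok(A, B, C_faulty)
```

The check passes for the clean result and fails for the corrupted one. A single checksum vector gives only a necessary condition, so some errors can cancel and go undetected.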
-
F. Chen, L. Craymer, J. Deifik, A. J. Fogel, D. S. Katz, A. G. Silliman, Jr,
R. R. Some, S. A. Upchurch, and K. Whisnant,
"Demonstration
of the Remote Exploration and Experimentation (REE) Fault-Tolerant
Parallel-Processing Supercomputer for Spacecraft Onboard Scientific Data
Processing,"
Proceedings of International Conference on Dependable Systems and Networks,
2000.
(This paper is not subject to copyright restrictions in the US)
Abstract:
This paper is the written explanation for a demonstration of the REE
Project's work to-date. The demonstration is intended to simulate an
REE system that might exist on a Mars Rover, consisting of multiple
COTS processors, a COTS network, a COTS node-level operating system,
REE middleware, and an REE application. The specific application
performs texture processing of images. It was chosen as a building
block of automated geological processing that will eventually be used
for both navigation and data processing. Because the COTS hardware is
not radiation hardened, SEU-induced soft errors will occur. These
errors are simulated in the demonstration by use of a
software-implemented fault-injector, and are injected at a rate much
higher than is realistic for the sake of viewer interest. Both the
application and the middleware contain mechanisms for both detection
of and recovery from these faults, and these mechanisms are tested by
this very high fault-rate. The consequence of the REE system being
able to tolerate this fault rate while continuing to process data is
that the system will easily be able to handle the true fault rate.
-
J. Beahan, L. Edmonds, R. Ferraro, A. Johnston, D. S. Katz, and R. R. Some,
"Detailed
Radiation Fault Modeling of the Remote Exploration and
Experimentation (REE) First Generation Testbed Architecture,"
Proceedings of the IEEE Aerospace Conference, 2000.
(This paper is not subject to copyright restrictions in the US)
Abstract:
The goal of the NASA HPCC Remote Exploration and Experimentation
(REE) Project is to transfer commercial supercomputing technology
into space. The project will use state of the art, low-power,
non-radiation-hardened, Commercial Off-The-Shelf (COTS) hardware
chips and COTS software to the maximum extent possible, and will rely
on Software-Implemented Fault Tolerance (SIFT) to provide the
required levels of availability and reliability. In this paper, we
outline the methodology used to develop a detailed radiation fault
model for the REE Testbed architecture. The model addresses the
effects of energetic protons and heavy ions which cause Single Event
Upset (SEU) and Single Event Multiple Upset (SEMU) events in digital
logic devices and which are expected to be the primary fault
generation mechanism. Unlike previous modeling efforts, this model
will address fault rates and types in computer subsystems at a
sufficiently fine level of granularity (i.e., the register level)
that specific software and operational errors can be derived. We
present the current state of the model, model verification activities
and results to date, and plans for applications.
-
Yi Chao, P. Peggy Li, Ping Wang, Daniel S. Katz, Benny N. Cheng, and Scott
Whitman,
"Ocean Modeling and Visualization on a Massively Parallel Computer,"
Industrial
Strength Parallel Computing: Programming Massively Parallel
Processing Systems,
(Editor: Alice Koniges,)
Morgan Kaufmann Publishers, Inc., 1999.
(This chapter is not subject to copyright restrictions in the US)
-
T. Cwik, C. Zuffada, D. S. Katz, and J. Parker,
"Radar Scattering and Antenna Modeling on Scalable High Performance Computers,"
Industrial
Strength Parallel Computing: Programming Massively Parallel
Processing Systems,
(Editor: Alice Koniges,)
Morgan Kaufmann Publishers, Inc., 1999.
(This chapter is not subject to copyright restrictions in the US)
-
D. S. Katz and T. Cwik,
"Computational Electromagnetics,"
High
Performance Cluster Computing: Programming and Applications, v. 2
(Editor: Rajkumar Buyya,)
Prentice Hall, Inc., 1999.
(This chapter is not subject to copyright restrictions in the US)
-
D. S. Katz, T. Cwik, and T. Sterling,
"An
Examination of the Performance of Two Electromagnetic Simulations on a
Beowulf-Class Computer,"
High
Performance Computing Systems and Applications,
(Editor: J. Schaeffer,) pp. 207-216, Kluwer Academic
Publishers, Norwell, MA, 1998.
(This chapter is not subject to copyright restrictions in the US)
-
D. S. Katz,
T. Cwik, B. H. Kwan, J. Z. Lou, P. L. Springer, T. L. Sterling, and P. Wang,
"An Assessment of a Beowulf System for a Wide Class
of Analysis and Design Software,"
Advances in Engineering Software, v. 26(3-6), pp. 451-461, July 1998.
(This paper is not subject to copyright restrictions in the US)
Abstract:
This paper discusses Beowulf systems, focusing on Hyglac, the Beowulf
system installed at the Jet Propulsion Laboratory. The purpose of
the paper is to assess how a system of this type will perform while
running a variety of scientific and engineering analysis and design
software. The first part of the assessment contains a measurement of
the communication performance of Hyglac, along with a discussion of
factors which have the potential to limit system performance. The
second part consists of performance measurements of six specific
programs (analysis and design software), as well as discussion about
these measurements. Finally, the measurements and discussion lead to
the conclusion that Hyglac is suitable for running these types of
codes (in a research/industrial environment such as at JPL,) and that
the primary factor for determining how a given code will perform is
that code's ratio of communication to computation.
-
T. Cwik, J. Z. Lou, and D. S. Katz,
"Scalable, Finite Element Analysis of Electromagnetic
Scattering and Radiation,"
Advances in Engineering Software, v. 26(3-6), pp. 289-296, July 1998.
(This paper is not subject to copyright restrictions in the US)
Abstract:
In this paper a method for simulating electromagnetic fields
scattered from complex objects is reviewed; namely, an unstructured
finite element code that does not use traditional mesh partitioning
algorithms. The complete software package is implemented on the Cray
T3D massively parallel processor using both Cray Adaptive FORTRAN
(CRAFT) compiler constructs to simplify portions of the code that
operate on the irregular data, and optimized message passing
constructs on portions of the code that operate on regular data and
require optimum machine performance. The above finite element
solution package is then integrated into an error estimation and
adaptive mesh refinement algorithm. An error for the fields over the
mesh is estimated and used to drive an adaptive mesh refinement
algorithm that refines the mesh where errors are above a given
tolerance. This refined mesh is then used once again for a finite
element solution of the fields. After estimating the error a second
time, the mesh is again refined as needed. This process continues
until the error is reduced to an allowable level.
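The estimate-and-refine cycle described in this abstract can be sketched on a much simpler problem. The example below is a 1-D stand-in, not the finite element package: the "solution" is a piecewise-linear interpolant of a known function, and the error estimate is the interpolation error at each interval midpoint. The function name `refine`, the midpoint estimator, and the `max_passes` cap are all assumptions made for illustration.

```python
import math


def refine(f, a, b, tol, max_passes=50):
    """Split intervals until the estimated error is at most tol everywhere.

    The error on each interval is estimated as the gap between f at the
    midpoint and the linear interpolant there (an assumed, simple estimator).
    """
    nodes = [a, b]
    for _ in range(max_passes):
        new_nodes = [nodes[0]]
        refined = False
        for left, right in zip(nodes, nodes[1:]):
            mid = 0.5 * (left + right)
            err = abs(f(mid) - 0.5 * (f(left) + f(right)))
            if err > tol:
                new_nodes.append(mid)  # refine only where the error is too large
                refined = True
            new_nodes.append(right)
        nodes = new_nodes
        if not refined:
            break  # every interval meets the tolerance
    return nodes


tol = 1e-3
nodes = refine(math.sin, 0.0, math.pi, tol)
```

Each pass mirrors one solve, estimate, refine step; the loop stops once no interval exceeds the tolerance.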
- T. Cwik, D. S. Katz, C. Zuffada, and V. Jamnejad,
"The Application of Scalable Distributed Memory
Computers to the Finite Element Modeling of Electromagnetic Scattering and
Radiation,"
International Journal for Numerical Methods in Engineering,
v. 41(4), pp. 759-776, February 28, 1998.
(This paper is not subject to copyright restrictions in the US)
Abstract:
Large scale parallel computation can be an enabling resource in many
areas of engineering and science if the parallel simulation algorithm
attains an appreciable fraction of the machine peak performance, and
if undue cost in porting the code or in developing the code for the
parallel machine is not incurred. The issue of code parallelization
is especially significant when considering unstructured mesh
simulations. The unstructured mesh models considered in this paper
result from a finite element simulation of electromagnetic fields
scattered from geometrically complex objects (either penetrable or
impenetrable.) The unstructured mesh must be distributed among the
processors, as must the resultant sparse system of linear equations.
Since a distributed memory architecture does not allow direct access
to the irregularly distributed unstructured mesh and sparse matrix
data, partitioning algorithms not needed in the sequential software
have traditionally been used to efficiently spread the data among the
processors. This paper presents a new method for simulating
electromagnetic fields scattered from complex objects; namely, an
unstructured finite element code that does not use traditional mesh
partitioning algorithms.
-
P. Wang, D. S. Katz, and Y. Chao,
"Optimization
of a Parallel Ocean General Circulation Model,"
(Best Paper Award Winner), SC97, 1997.
Abstract:
Global climate modeling is one of the grand challenges of computational
science, and ocean modeling plays an important role in both
understanding the current climatic conditions and predicting the future
climate change. Three-dimensional time-dependent ocean general
circulation models (OGCMs) require a large amount of memory and
processing time to run realistic simulations. Recent advances in
computing hardware have dramatically affected the prospect of studying
the global climate. The significant computational resources of massively
parallel supercomputers promise to make such studies feasible. In
addition to using advanced hardware, designing and implementing a
well-optimized parallel ocean code will significantly improve the
computational performance and reduce the total research time to complete
these studies.
In our present work, we chose the most widely used OGCM code as our base
code. This OGCM is based on the Parallel Ocean Program (POP) developed
in FORTRAN 90 on the Los Alamos CM-2 Connection Machine by the Los
Alamos ocean modeling research group. During the first half of 1994, the
code was ported to the Cray T3D by Cray Research using SHMEM-based
message passing. Since the code on the T3D was still time-consuming when
large problems were encountered, improving the code performance was
considered essential.
We have developed several general strategies to optimize the ocean
general circulation model on the Cray T3D. These strategies include
memory optimization, effective use of arithmetic pipelines, and usage of
optimized libraries. The optimized code runs 2 to 2.5 times faster than
the original code, which gives significant performance improvements for
modeling large-scale ocean flows. Many test runs for both the
original and the optimized code have been carried out on the Cray T3D
using various numbers of processors (1-256). Comparisons are made for a
variety of real-world problems. A nearly linear scaling performance line
is obtained for the optimized code, while the speed up data of the
optimized code also shows excellent improvement over the original code.
In addition to discussing the optimization of the code, we also address
the issue of portability. Given the short life cycle of the massively
parallel computer, usually on the order of three to five years, we
emphasize the portability of the ocean model and the associated
optimization routines across several computing platforms. Currently, the
ocean modeling code has been ported successfully to the Hewlett Packard
(HP)/Convex SPP-2000, and is readily portable to Cray T3E.
This paper reports our efforts to optimize the parallel implementations
of the oceanic model. So far, the work has focused on improving the load
balancing and single node performance of the code on the Cray T3D. As a
result, the atmosphere and ocean model components running side-by-side
can achieve a performance level of slightly more than 10 GFLOPS on 512
processors of that machine. We have also developed a user-friendly
coupling interface with atmospheric and biogeochemical models, in order
to make the global climate modeling more complete and more realistic.
-
T. Cwik, D. S. Katz, and J. Patterson,
"Scalable Solutions to Integral-Equation and
Finite-Element Simulations,"
(invited paper for special issue on Advanced Numerical Techniques in
Electromagnetics),
IEEE Trans. Antennas Propagat.,
v. 45(3), pp. 544-555, March 1997.
(This paper is not subject to copyright restrictions in the US)
Abstract:
When developing numerical methods, or applying them to the simulation
and design of engineering components, it inevitably becomes necessary
to examine the scaling of the method with a problem's electrical
size. The scaling results from the original mathematical development
- for example, a dense system of equations in the solution of
integral equations - as well as the specific numerical
implementation. Scaling of the numerical implementation depends upon
many factors - for example, direct or iterative methods for solution
of the linear system - as well as the computer architecture used in
the simulation. In this paper, scalability will be divided into two
components; scalability of the numerical algorithm specifically on
parallel computer systems, and algorithm or sequential scalability.
The sequential implementation and scaling is initially presented with
the parallel implementation following. This progression is meant to
illustrate the differences in using current parallel platforms and
sequential machines, and the resulting savings. Time to solution
(wall clock time) for differing problem sizes are the key parameters
plotted or tabulated. Sequential and parallel scalability of time
harmonic surface integral equation forms and the finite element
solution to the partial differential equations are considered in
detail.
-
C. E. Reuter, R. M. Joseph, E. T. Thiele, D. S. Katz, and A. Taflove,
"Ultrawideband Absorbing Boundary Conditions for
Termination of Waveguiding Structures in FD-TD Simulations,"
IEEE Microwave and Guided Wave Letters,
v. 4(10), pp. 344-346, October 1994.
(This paper is subject to copyright restrictions)
Abstract:
A new method for ultrawideband termination of waveguides in
finite-difference time-domain (FD-TD) grids is presented. The
Berenger perfectly matched layer (PML) absorbing boundary condition
is applied to terminate both perfect electrically conducting (PEC)
and dielectric waveguides in two dimensions. Reflections of less
than -75 dB are obtained over the entire propagation regime.
Evidence is presented that the PML ABC is effective even for the
evanescent energy present below cutoff in PEC waveguides and the
multimode propagation present in dielectric waveguides.
-
D. S. Katz, E. T. Thiele, and A. Taflove,
"Validation and Extension to Three Dimensions of the
Berenger PML Absorbing Boundary Condition for FD-TD Meshes,"
IEEE Microwave and Guided Wave Letters,
v. 4(8), pp. 268-270, August 1994.
(This paper is subject to copyright restrictions)
Abstract:
Berenger recently published a novel absorbing boundary condition
(ABC) for FD-TD meshes in two dimensions, claiming
orders-of-magnitude improved performance relative to any earlier
technique. This approach, which he calls the "perfectly matched
layer (PML) for the absorption of electromagnetic waves," creates a
nonphysical absorber adjacent to the outer grid boundary that has a
wave impedance independent of the angle of incidence and frequency
of outgoing scattered waves.
This paper verifies Berenger's strong claims for PML for 2-D FD-TD
grids and extends and verifies PML for 3-D FD-TD grids. Indeed, PML
is > 40 dB more accurate than second-order Mur, and PML works just
as well in 3-D as it does in 2-D. It should have a major impact upon
the entire FD-TD modeling community, leading to new possibilities
for high-accuracy simulations especially for low-observable
aerospace targets.
-
M. J. Piket-May, A. Taflove, W. C. Lin, D. S. Katz,
V. Sathiaseelan, and B. B. Mittal,
"Initial Results for Automated Computational Modeling of
Patient-Specific Electromagnetic Hyperthermia,"
IEEE Trans. Biomedical Engineering,
v. 39(3), pp. 226-237, March 1992.
(This paper is subject to copyright restrictions)
Abstract:
Developments in finite-difference time-domain (FDTD) computational
modeling of Maxwell's equations, super-computer technology, and
computed tomography (CT) imagery open the possibility of accurate
numerical simulation of electromagnetic (EM) wave interactions with
specific, complex, biological tissue structures. One application of
this technology is in the area of treatment planning for EM
hyperthermia. In this paper, we report the first highly automated
CT image segmentation and interpolation scheme applied to model
patient-specific EM hyperthermia. This novel system is based on
sophisticated tools from the artificial intelligence, computer
vision, and computer graphics disciplines. It permits CT-based
patient-specific hyperthermia models to be constructed without
tedious manual contouring on digitizing pads or CRT screens. The
system permits in principle near real-time assistance in
hyperthermia treatment planning. We apply this system to interpret
actual patient CT data, reconstructing a 3-D model of the human
thigh from a collection of 29 serial CT images at 10 mm intervals.
Then, using FD-TD, we obtain 2-D and 3-D models of EM hyperthermia
of this thigh due to a waveguide applicator. We find that different
results are obtained from the 2-D and 3-D models, and conclude that
full 3-D tissue models are required for future clinical usage.
-
D. S. Katz, M. J. Piket-May, A. Taflove, and K. R. Umashankar,
"FD-TD Analysis of Electromagnetic Wave Radiation from
Systems Containing Horn Antennas,"
IEEE Trans. Antennas Propagat.,
v. 39(8), pp. 1203-1212, August 1991.
(This paper is subject to copyright restrictions)
Abstract:
The application of the finite-difference time-domain (FDTD) method
to various radiating structures is considered. These structures
include two- and three-dimensional waveguides, flared horns, a
two-dimensional parabolic reflector, and a two-dimensional
hyperthermia application. Numerical results for the horns,
waveguides, and parabolic reflectors are compared with results from
method of moments (MM). The results for the hyperthermia
application are shown as extensions of the previously validated
models. This new application of the FDTD method is shown to be
useful when other numerical or analytic methods cannot be applied.
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
For papers with copyright held by IEEE:
©20xx IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
For papers with copyright held by ACM:
ACM COPYRIGHT NOTICE. Copyright ©20xx by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept., ACM, Inc., fax +1 (212) 869-0481, or permissions@acm.org.
AMS, Elsevier, and SPIE allow authors of papers to post their own papers on a personal website.
For a copy of a paper with restricted access for personal use, please contact:
d.katz@ieee.org