Proceedings from the Conference on —

# **High Speed Computing**

LANL • LLNL

The Art of High Speed Computing April 20–23, 1998









This report was prepared as an account of work sponsored by an agency of the United States Government. Neither The Regents of the University of California, the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by The Regents of the University of California, the United States Government, or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of The Regents of the University of California, the United States Government, or any agency thereof. Los Alamos National Laboratory strongly supports academic freedom and a researcher's right to publish; as an institution, however, the Laboratory does not endorse the viewpoint of a publication or guarantee its technical correctness.

UC-705 Issued: August 1998

Proceedings from the Conference on High Speed Computing The Art of High Speed Computing

Compiled by Kathleen P. Hirons Manuel Vigil Ralph Carlson

April 20–23, 1998



### **Table of Contents**

| Conference Program                                                                                          | vii   |
|-------------------------------------------------------------------------------------------------------------|-------|
| Abstract                                                                                                    | ix    |
| Keynote Address                                                                                             | 1     |
| Stockpile Stewardship and Management Program                                                                | 25    |
| Predictability, and the Challenge of Certifying a Stockpile Without Nuclear Testing                         | 41    |
| 100 TeraFLOPs and Beyond, an Industry View into the Future                                                  | 59    |
| ASCI Alliances Program.  Ann Hayes, LANL                                                                    | 105   |
| The Next Fifty Years of Computer Architecture                                                               | 123   |
| Full Wave Modeling of Signal and Power Interconnects for High Speed Digital Circuits                        | 133   |
| Simulating the Physical-Biological Factors Affecting Abundance of Calanus finmarchicus in the Gulf of Maine | 141   |
| The Next Generation Internet                                                                                | 143   |
| Petaflops Computing: Opportunities and Challenges                                                           | 151   |
| President's Information Technology Advisory Committee (PITAC): A Mid-Term Report                            | 165   |
| Processing-in-Memory: Past and Present                                                                      | 177   |
| High Volume Technology for HPC Systems                                                                      | 185   |





| High Performance Computing and the NCAR Procurement—Before, During, and After | 209 |
|-------------------------------------------------------------------------------|-----|
| Economies of Scale: HPC into the Next Millennium                              | 223 |
| The Other Side of Computing                                                   | 241 |
| Crystalline Computation                                                       | 249 |
| Quantum Computing  Emanuel Knill, LANL                                        | 257 |
| List of Attendees                                                             | 277 |



### **Conference Program**

### **Monday, April 20, 1998**

### **Keynote Session:**

Keynote Address: Billions and Billions Steve Wallach, CenterPoint Venture Partners

### Tuesday, April 21, 1998

### Session 1: The Stockpile Stewardship and Management Program

Stockpile Stewardship and Management Program *Larry Ferderber, LLNL* 

Predictability, and the Challenge of Certifying a Stockpile Without Nuclear Testing Ray Juzaitis, LANL

### Session 2: The Challenge of 100 TeraFLOP Computing

100 TeraFLOPs and Beyond, an Industry View into the Future

Panel Discussion: Moderator—John Morrison, LANL and Mark Seager, LLNL; Panelists—Tilak Agerwala, IBM; Greg Astfalk, Hewlett-Packard; Erik Hagersten, Sun Microsystems; Richard Kaufmann, Digital Equipment Corp.; Steve Oberlin, SGI/Cray.

### **Session 3: ASCI Alliance**

ASCI Alliances Program *Ann Hayes, LANL* 

### Session 4: Hardware Design

The Next Fifty Years of Computer Architecture Burton Smith, Tera Computer Company

### **Banquet**

Adversarial Inspection in Iraq: 1991 and Thereafter *Jay Davis, LLNL* 

### Wednesday, April 22, 1998

### **Session 5: Student Session**

Full Wave Modeling of Signal and Power Interconnects for High Speed Digital Circuits Gary Haussmann, University of Colorado

Simulating the Physical-Biological Factors Affecting Abundance of Calanus finmarchicus in the Gulf of Maine

Wendy Gentleman, Dartmouth College

### Session 6: News You Can Use

The Next Generation Internet *Bob Aiken*, *DOE* 

Petaflops Computing: Opportunities and Challenges *David Bailey, LBNL* 

President's Information Technology Advisory Committee (PITAC): A Mid-Term Report *David M. Cooper, LLNL* 

### Thursday, April 23, 1998

### **Session 7: Chip Technology for Large Scale Systems**

Processing-in-Memory: Past and Present

Ken Iobst, IDA/CCS

High Volume Technology for HPC Systems

Justin Rattner, Intel

### **Session 8: Reality Check**

High Performance Computing and the NCAR Procurement—Before, During, and After Bill Buzbee, NCAR; Jim Hack and Steve Hammond, National Center for Atmospheric Research

Economies of Scale: HPC in the Next Millennium

Gary Smaby, Smaby Group

### **Session 9: Future**

The Other Side of Computing William Trimmer, Belle Mead Research, Inc.

Crystalline Computation

Norm Margolus, MIT

Quantum Computing

Emanuel Knill, LANL


### **Abstract**

This document provides a written compilation of the presentations and viewgraphs from the 1998 Conference on High Speed Computing, "The Art of High Speed Computing," held at Gleneden Beach, Oregon, on April 20 through 23, 1998.



# BILLIONS & BILLIONS

STEVE WALLACH
CENTER POINT VENTURES
WALLACH@CENTERPOINTVP.COM

# ASPECTS OF BILLIONS

- Raised to the power (giga, tera, peta, exa)
- The inverse (nano, pico, femto, atto)
- In the computer industry they are closely related, from both a technology and an investment perspective
- US government policy (the ultimate venture capitalist) must be consistent with industry trends





# PRESENTATION OUTLINE

- Fundamental Laws- Physics
- Trends in Telecommunications
- Trends in Semi-conductors
- Trends in Computer Architecture
- Draw some conclusions
- US Government Policy


# FUNDAMENTAL LAWS



- C Speed of light
- Power Consumption
- Propagation Delay





# **POWER CONSUMPTION**

 $P \propto C * V^2 * F$ 

- C= capacitance
- V= voltage
- F= frequency
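The relation on this slide can be explored with a short sketch. It assumes the garbled operator is read as "proportional to", so that power scales as C times V squared times F. The function names and the example scale factors are illustrative assumptions, not part of the talk.

```python
def dynamic_power_watts(capacitance_farads, voltage_volts, frequency_hz):
    """Switching power C * V^2 * F, with the proportionality constant taken as 1."""
    return capacitance_farads * voltage_volts ** 2 * frequency_hz


def relative_power(voltage_scale, frequency_scale, capacitance_scale=1.0):
    """Power relative to a baseline when C, V, and F are each scaled by a factor."""
    return capacitance_scale * voltage_scale ** 2 * frequency_scale


if __name__ == "__main__":
    # Hypothetical example: halve the supply voltage and double the clock.
    print(relative_power(voltage_scale=0.5, frequency_scale=2.0))  # 0.5
```

Because voltage enters squared, halving V more than pays for doubling F under this model, which is one way to read the sub-volt supply assumed later in the talk.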


# PROPAGATION DELAY

• Lossless Line

Time = 
$$\sqrt{LC}$$

• Lossy Line

Time = 
$$L \cdot \sqrt{\varepsilon_r} / c_o$$

 $\varepsilon_r$  = Dielectric Constant

 $c_o =$ Speed of Light

In the lossy-line formula, $L$ is the line length rather than the inductance.
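A minimal numerical sketch of the two delay expressions follows. It assumes the second expression is read as line length times the square root of the dielectric constant, divided by the speed of light, which is the dimensionally consistent reading. The function names and the example values (a 30 cm trace, a dielectric constant of 4) are illustrative assumptions.

```python
import math

C0_M_PER_S = 299_792_458.0  # speed of light in vacuum


def lossless_delay_s(inductance_h, capacitance_f):
    """Delay sqrt(L * C) for a lossless line with total inductance L and capacitance C."""
    return math.sqrt(inductance_h * capacitance_f)


def dielectric_delay_s(length_m, eps_r):
    """Delay length * sqrt(eps_r) / c0 for a signal travelling through a dielectric."""
    return length_m * math.sqrt(eps_r) / C0_M_PER_S


if __name__ == "__main__":
    # Hypothetical 30 cm interconnect in a material with eps_r = 4.
    delay = dielectric_delay_s(length_m=0.30, eps_r=4.0)
    print(f"{delay * 1e9:.2f} ns")
```

At a 6 GHz clock one cycle lasts about 0.17 ns, so under these assumed values a signal needs roughly a dozen cycles to cross 30 cm, which is why the speed of light appears among the fundamental laws.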





# OTHER CONSTRAINTS

- Cost of Investment I (billions)
- Size of Market M (millions)
- L'Hospital's Rule of Profit
  - Profit = dM/dI
    - as I approaches infinity
    - as M approaches K (sometimes 0)
  - result is { 1 (success) | 0 (failure)}
- The government uses different rules


# **TELECOMMUNICATIONS**

- Advances in *PHOTONIC* (mainly *WDM*) technology.
- Terahertz (THz) requirements
- All optical networks (AON)
- Effect on digital computer architecture
- The next supercomputer topology
  - www.ll.mit.edu/aon/
  - Lemott et al., "Low-cost WDM," IEEE Summer Topicals, Montreal, Aug. 1997.





















# **SEMI-CONDUCTORS**

- Let's examine what is driving the *I* (investment) in our equation for success.
  - Information from 1997 SIA report (www.semichips.org)


# THE COST OF "FABS"



- 2 billion and climbing
- One per continent?
- Put on the moon?
- Only million-piece designs can be made?





# UNDERLYING REASONS



- 300 mm (12 inch) wafers
- Billions to replace 8 inch fabs.
- Good news: keeps costs of chips down


# UNDERLYING REASONS

SPEED / PERFORMANCE ISSUE: The Technical Problem

[Figure: sum of delays and interconnect delay, compared for Al & SiO<sub>2</sub> and for Cu & low-κ interconnect.]





### HOW WE GET THERE

Table 63. Modeling & Simulation Technology Requirements (Continued)

Rows of the table, with values listed left to right:

- Year of First Product Shipment: 1997, 1999, 2001, 2003, 2006, 2009, 2012
- Technology Generation: 250 nm, 180 nm, 150 nm, 130 nm, 100 nm, 70 nm, 50 nm
- Numerical Methods
  - Linear solvers, equations/minute: 150k, 250k, 250k, 2.5M, 5M, 5M
  - Parallel speedup: 4×, 6×
  - Grid reliability (ppb): 300, 180, 120, 90
  - MFLOPS\* required: 8000, 50, 80, 400, 1000, 4000, 8000
  - MC noise: NA, NA, NA, 0.05, 0.01
- Time needed for multi-tool initial problem setup; correct data analyses per improvement cycle: 10 weeks, 6 weeks, 4 weeks, 2 weeks, 2 weeks, 1 week, 1 week; 2 weeks, 1 week

Legend: Solutions Exist; Solutions Being Pursued; No Known Solution

\* MFLOPS: million floating point operations per second

\* Number of linear equations generated by discretizing an increasing number of PDEs over a typical device grid of 5000 nodes in 2-D and later 50000 nodes in 3-D.







Table 3. Performance of Packaged Chips

| YEAR OF FIRST PRODUCT SHIPMENT                                                          | 1997      | 1999      | 2001      | 2003      | 2006      | 2009     |
|-----------------------------------------------------------------------------------------|-----------|-----------|-----------|-----------|-----------|----------|
| Technology Generations<br>Dense Lines (DRAM Half-Pitch) (nm)                            | 250       | 180       | 150       | 130       | 100       | 70       |
| ISOLATED LINES (MPU GATES) (nm)                                                         | 200       | 140       | 120       | 100       | 70        | 50       |
| Number of Chip I/Os                                                                     |           |           |           |           |           |          |
| Chip-to-package (pads) high-performance                                                 | 1450      | 2000      | 2400      | 3000      | 4000      | 5400     |
| Chip-to-package (pads) cost-performance                                                 | 800       | 975       | 1195      | 1460      | 1970      | 2655     |
| Number of Package Pins/Balls                                                            |           |           |           |           |           |          |
| ASIC (high-performance)                                                                 | 1100      | 1500      | 1800      | 2200      | 3000      | 4100     |
| MPU/controller, cost-performance                                                        | 600       | 810       | 900       | 1100      | 1500      | 2000     |
| Cost-performance package cost (cents/pin)                                               | 1.40-2.80 | 1.25-2.50 | 1.15-2.30 | 1.05-2.05 | 0.90-1.75 | 0.75-1.5 |
| Chip Frequency (MHz)                                                                    |           |           |           |           |           |          |
| On-chip local clock, high-performance                                                   | 750       | 1250      | 1500      | 2100      | 3500      | 6000     |
| On-chip, across-chip clock, high-performance                                            | 750       | 1200      | 1400      | 1600      | 2000      | 2500     |
| On-chip, across-chip clock, cost-performance                                            | 400       | 600       | 700       | 800       | 1100      | 1400     |
| On-chip, across-chip clock, high-performance<br>ASIC                                    | 300       | 500       | 600       | 700       | 900       | 1200     |
| Chip-to-board (off-chip) speed,<br>high-performance<br>(Reduced-width, multiplexed bus) | 750       | 1200      | 1400      | 1600      | 2000      | 2500     |

Table 24. Product Critical Level Lithography Requirements

| Year of First Product Shipment<br>Technology Generation | 1997<br>250 nm | 1999<br>180 nm | 2001<br>150 nm | 2003<br>130 nm | 2006<br>100 nm | 2009<br>70 nm | 2012<br>50 nm |
|---------------------------------------------------------|----------------|----------------|----------------|----------------|----------------|---------------|---------------|
| Product Application                                     |                |                |                |                |                |               |               |
| DRAM (bits)                                             | 256M           | 1G             | _              | 4G             | 16G            | 64G           | 256G          |
| MPU (logic transistors/cm <sup>2</sup> )                | 4M             | 6M             | 10M            | 18M            | 39M            | 84M           | 180M          |
| ASIC (usable transistors/cm <sup>2</sup> )*             | 8M             | 14M            | 16M            | 24M            | 40M            | 64M           | 100M          |
| Minimum Feature Size (nm)**                             |                |                |                |                |                |               |               |
| Isolated lines (MPU Gates)                              | 200            | 140            | 120            | 100            | 70             | 50            | 35            |
| Dense lines (DRAM Half Pitch)                           | 250            | 180            | 150            | 130            | 100            | 70            | 50            |
| Contacts                                                | 280            | 200            | 170            | 140            | 110            | 80            | 60            |
| Development capability (minimum<br>feature size, nm)    | 140            | 120            | 100            | 70             | 50             | 35            | 25            |
| Gate CD control (nm, 3 sigma at post-etch)**            | 20             | 14             | 12             | 10             | 7              | 5             | 4             |
| Product overlay (nm, mean + 3<br>sigma)**               | 85             | 65             | 55             | 45             | 35             | 25            | 20            |





# **HOW WE USE IT**

• TRENDS IN COMPUTER ARCHITECTURE



GENERAL VIEWS
2 TO 4 YEARS
10 YEARS (USING SIA STUDY)


















# JAVA VS. C++

| FEATURE            | JAVA                                               | C++                           |
|--------------------|----------------------------------------------------|-------------------------------|
| Memory Management  | Garbage collected                                  | Explicit memory freeing       |
| Multi-threading    | Yes (Mesa-style)                                   | No                            |
| Inheritance Model  | Simpler (separate subtyping)                       | Complex                       |
| Exception handling | Supported                                          | Sporadic                      |
| Parametric types   | Does not have                                      | Has templates                 |
| Type casts         | Checked; thus easier to write protected subsystems | Unchecked (pointer ↔ integer) |





# **SO WHAT HAPPENS?**

- Fundamentally the following architecture evolves:
  - PIM (processor in memory) or System-on-a-chip
    - more memory bandwidth
    - lower latency
    - consistent with PC pricing and technology curves


# WHICH APPROACH?

- SIMD
- MIMD
- MULTI-THREADED
- SUPERSCALAR
- VLIW





# **INTERCONNECT TYPE**

### • SOFTWARE

- Message Passing
- Distributed Shared Memory (DSM)
- Cache Only (COMA)
- Object oriented
- Emulated DSM (e.g., TreadMarks)


# **INTERCONNECT TYPE**

### • HARDWARE

- Hierarchical: number of levels is a function of the number of CPUs.
- Physical combination of copper and photonic. Ultimately *WDM* will play an important role in external chip interconnects.





# **NEXT ARCHITECTURES**

- Short Term: 2 to 4 years, low-performance System-on-a-chip (SOAC)
- Long Term: 10 years (using SIA study)
  - High Performance
  - Supercomputing
- US Gov't R&D Policy









# ARCH. - LONG TERM - 2009

- THE SIA STUDY TEACHES US:
  - 64 Gbits of DRAM (8 Gbytes)
  - 8 Gbits of SRAM
  - 520 million MPU transistors
  - 70 nm lithography, 2.54 cm on a side
  - 6 GHz clock within VLIW/RISC core
  - 2.5 GHz across die
  - 2500 external signal pins





# ARCH. - LONG TERM - 2009 DESIGN ASSUMPTIONS

- 9 million transistor VLIW/RISC core with first-level cache
- 2nd-level cache rule of thumb: 1/4 to 1/2 Mbyte per 100 MFLOPS peak
- 96 Mbyte 2nd-level cache (6 instruction, 90 data)
- 170 watts
- 0.6 to 0.9 volt power supply
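The cache rule of thumb above can be checked against the 96 Mbyte figure with a short sketch. It assumes a peak rate of 24 GFLOPS for the core, the figure given on the pin-use slide that follows. The function name is an illustrative assumption.

```python
def l2_cache_range_mbytes(peak_mflops, low_per_100=0.25, high_per_100=0.5):
    """Range of 2nd-level cache sizes implied by 1/4 to 1/2 Mbyte per 100 MFLOPS peak."""
    units_of_100_mflops = peak_mflops / 100.0
    return (low_per_100 * units_of_100_mflops, high_per_100 * units_of_100_mflops)


if __name__ == "__main__":
    # Assumed peak rate: 24 GFLOPS = 24,000 MFLOPS.
    low, high = l2_cache_range_mbytes(24_000)
    print(low, high)          # 60.0 120.0
    print(low <= 96 <= high)  # True
```

Under that assumption the rule gives 60 to 120 Mbytes, so the 96 Mbyte design point sits inside the range.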


### **MAXIMUM PIN-USE**

[Figure: block diagram of the chip and its external connections.]

- External SMP (6/8 CPUs): 80 Gbytes/sec, 2.5 GHz, 256 data pins, coherence
- 2nd-level cache: 96 Mbytes, 64 bytes wide, 160 Gbytes/sec
- VLIW/RISC core: 24 GFLOPS, 6 GHz
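The bandwidth figures on this slide follow from width times clock rate, as the sketch below illustrates. It assumes one transfer per data pin per clock and that the 64-byte cache path also runs at the 2.5 GHz across-die clock; neither assumption is stated on the slide, and the function names are illustrative.

```python
def bus_bandwidth_gbytes_per_s(width_bits, clock_ghz):
    """Bandwidth of a bus moving width_bits per clock, in Gbytes/sec."""
    return width_bits / 8 * clock_ghz


def flops_per_cycle(peak_gflops, clock_ghz):
    """Floating point operations the core must complete per clock to reach its peak."""
    return peak_gflops / clock_ghz


if __name__ == "__main__":
    print(bus_bandwidth_gbytes_per_s(256, 2.5))     # 80.0  (256 data pins)
    print(bus_bandwidth_gbytes_per_s(64 * 8, 2.5))  # 160.0 (64 bytes wide)
    print(flops_per_cycle(24, 6))                   # 4.0
```

Read this way, the 24 GFLOPS peak corresponds to four floating point operations per 6 GHz cycle.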













# US GOVERNMENT POLICY

- Examine the Past
- Use Top500 LINPACK
- Observe Venture Capital Investments
- What should happen in the future































# **US GOVERNMENT POLICY**

- Provide seed money for high risk/reward work (DARPA, NSF, DoD, DOE)
- Further national defense initiatives
- Begin the trickle-down technology transfer: what starts out as a US Gov't special becomes COTS after 1 or 2 generations
- Keep the US the most advanced and competitive in the world
- www.hpcc.gov/talks/petaflops-24june97


# CONCLUDING

- Convergence of telecommunications/computing
  - everything is digital
  - everything requires high bandwidth
  - voice is a digital packet (IP switching)
  - digital TV (a TV with a computer or a computer with a TV?)
  - overall system topology mirrors an AON
- Commodity Teraflop Computing





### Stockpile Stewardship Program (U)

1998 Conference on High Speed Computing Gleneden Beach, Oregon April 20-23, 1998



Lawrence J. Ferderber
Deputy Associate Director for National Security
Lawrence Livermore National Laboratory

Lawrence Livermore National Laboratory, P.O. Box 808, Livermore, CA 94551

NS-98-031.1

# The President tasked DOE to help maintain the nuclear deterrent through the Stockpile Stewardship Program

"... I consider the maintenance of a safe and reliable nuclear stockpile to be a supreme national interest of the United States."

"I am assured by the Secretary of Energy and the Directors of our nuclear weapons labs that we can meet the challenge of maintaining our nuclear deterrent under a Comprehensive Test Ban Treaty through a Science-Based Stockpile Stewardship program without nuclear testing..."

"In order for this program to succeed, both the Administration and the Congress must provide sustained bipartisan support for the stockpile stewardship program over the next decade and beyond. I am committed to working with the Congress to ensure this support."

"As part of this arrangement, I am today directing the establishment of a new annual reporting and certification requirement that will ensure that our nuclear weapons remain safe and reliable under a comprehensive test ban."

- August 11, 1995





Today the stockpile is safe and reliable, but we already require a Stockpile Stewardship Program to keep it that way

- Today's stockpile has a good "pedigree" based on
  - Nuclear tests
  - An experienced workforce
  - State of the art design (then)
- But
  - The stockpile is aging beyond our experience
  - Refurbished components will be made by new processes, in new plants by new people
  - Our experienced workforce is retiring
  - We have no nuclear tests to verify the validity of our decisions
- We need a program that will:
  - Attract and train a new workforce
  - Be able to assess the effect of changes in the stockpile
  - Certify that refurbished components are functionally equivalent to the original ones


Like every other technological object, a nuclear weapon ages and sometimes we are surprised when we test it



- One-point safety
- Performance at cold temperatures
- Performance under aged conditions
- Marginal performance
- Degradation of various key materials
- Pit quality control

- Metal components cracking
- Yield-select problems
- HE degrading
- HE cracking
- Detonators corroding
- Detonator system redesign
- Metal components corroding





# The Stockpile Stewardship Program responds to these challenges via a few fundamental principles



- We have experimental data from nuclear tests which indicate that details matter: remanufactured components sometimes behave anomalously
- Current experimental and computational capabilities are not sufficient to rule out such anomalies in the future
- Without nuclear testing, we must take the conservative approach in proving our fixes are real fixes which do not introduce new problems
- We must also develop a strategy to deal with the "unknown anomalies" (e.g., Challenger O-ring), including residual design flaws that have not yet manifested themselves
- The stockpile will continue to age and we will be required to deal with changes to almost every component

The SSP approach is not without risk


# Simple remanufacture is not a credible solution for highly optimized and complex products like nuclear weapons

- This point is illustrated by the Polaris A3 motor rebuild
  - The U.S. production line was placed on standby in 1963
  - Procedures were carefully documented
  - Nineteen years later, in 1982, it was found that the "replica" rebuild of the rocket motors required extensive full scale testing to get it right (four of the flight tests failed)
  - A recall of retired personnel was necessary



Replicating nuclear weapons would be more difficult (impossible) than replicating rocket motors





# The SSP provides integrated capabilities to address DoD's near-term and longer-term issues



- Surveillance
  - to monitor, maintain and predict the condition of the stockpile
- Assessment & Certification
  - of the consequences of change
  - that modifications and maintenance do not degrade warhead safety and reliability
- Refurbishment
  - design and manufacture of refurbished components
- Tritium replacement




# Our surveillance program is being expanded to meet the needs of an aging stockpile



- How do weapons age?
- What are the most likely issues?
- How will these issues affect performance and safety?
- When do components need to be refurbished?



Assessment of disassembled components



Interstitial helium



Forensic surveillance techniques





#### The new complex must refurbish/replace components to counter age, performance, or safety degradation





Plutonium pits Los Alamos, New Mexico



Integrated plutonium processing



Assembly expertise

These new plants, people and processes must be certified to be functionally equivalent to those originally used


### SSP requires dramatic advances in computational capabilities





Accelerated Strategic Computing Initiative (ASCI)



3D turbulent mix simulation

ASCI Blue Pacific SST, LLNL 3.3 TeraFlops 2.5 Terabytes









The computational capability was sufficient to provide reasonable assurance that the test would function properly (a cost issue)


### The UNIVAC in 1953










## LLNL's first "nuclear" test was designed on the UNIVAC and slide rules





- LLNL was less than one year old
- The device was placed on a 300-foot tower and the physicists stood far away, observing with dark glasses
- Upon detonation, only a small cloud of dust appeared
- When the dust cleared, the tower was still standing

Nowhere to go but up ... and 50 years later, the ASCI program


## During design-test-build, our simulation codes normalized complex phenomena against test data





- Computers lacked speed and memory to run full problems
- Some nonlinear physical processes not understood
- Nuclear test data provided normalizing factors to make simulations accurate
- Normalization factors differed from system to system







\$10,000,000 (in 1996 dollars)


#### **LLNL Computer History**

[Figure: LLNL computers plotted by year, grouped as Serial Machines, Multiprocessor Machines, and Parallel Machines. Systems labeled include the UNIVAC I, IBM 704, IBM 709, IBM 7090, IBM 7094, IBM 7030, LARC, CDC 3600, CDC 6600, CDC 7600, CDC STAR, Cray 1, Cray X-MP, Cray Y-MP, Cray J-90, Cray T3D, Meiko CS-2, Digital 8400, IBM SP, and cluster systems.]





### The computational needs of the SSP span many time and length scales









#### Full-Scale Integrated Codes



- 3D simulations
- High resolution
- Improved algorithms
  - Accuracy
  - Efficiency
  - Scalability
- Applied mathematics
- Mesh generation
- Visualization
- Validation

#### Subgrid Models / Zonal Physics (the sub-grid physics the integrated codes rely on)





- Direct numerical simulations
- First principle approaches
- Predictive physics models
- Rigorous treatments of physical phenomena





### Our full scale, integrated codes must support tradeoffs between dimensionality, resolution, and detailed physics






## SSP structural analysis codes will develop mesh and boundary partitioning for a wide variety of integrated simulations





AT-400 shipping container drop test

50,000 elements 27 contact boundary conditions

code: ParaDyn



Spin plate metal forming application

4,800 elements 4 contact boundaries

code: ParaDyn



Human index finger under flexion

26,000 nodal points 21,000 8-node brick elements 15 sliding interfaces

code: Nike3D





#### Direct 3D numerical simulation code HYDRA provides detailed comparisons between NIF predictions and experiments





Experimental measurements of 3D multimode surface perturbations on an ablatively driven foil are directly compared to simulations






# Analysis and visualization of terascale data sets places severe demands on all aspects of the computing environment



data set generated (2.7 MB/s) >> 3MB raster image to desktop, fly-around at 10 fps (30 MB/s)
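The desktop figure above is a frame size multiplied by a frame rate, which the sketch below makes explicit. The 1024 by 1024 pixel, 3 bytes per pixel image is an assumption chosen because it comes to about 3 MB; the slide does not state the resolution. The function names are illustrative.

```python
def raster_image_bytes(width_px, height_px, bytes_per_pixel):
    """Size of one uncompressed raster frame in bytes."""
    return width_px * height_px * bytes_per_pixel


def stream_rate_mb_per_s(frame_mb, frames_per_second):
    """Sustained rate needed to deliver frames of frame_mb megabytes at the given rate."""
    return frame_mb * frames_per_second


if __name__ == "__main__":
    # Assumed frame: 1024 x 1024 pixels at 3 bytes per pixel (about 3 MB).
    print(raster_image_bytes(1024, 1024, 3))  # 3145728
    print(stream_rate_mb_per_s(3, 10))        # 30
```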







## Meeting the ASCI challenge requires partnerships and collaborations among the three laboratories, industrial partners, and universities





Achieving the 100-teraFLOPS milestone will require carefully integrated efforts to develop unprecedented computer platforms, high-fidelity physics codes, and a world-class computing environment.





## The ASCI machines are research partnerships with U.S. Industry





ASCI "Blue Pacific" Computer Lawrence Livermore National Laboratory IBM


### The SSP announced strategic academic alliances with five universities



- Stanford University
  - The Center for Integrated Turbulence Simulations
    - William C. Reynolds (wcr@thermo.stanford.edu)
- The University of Chicago
  - Astrophysical Thermonuclear Flashes
    - Robert Rosner (rrosner@oddjob.uchicago.edu)
- The University of Illinois at Urbana-Champaign
  - Center for Simulation of Advanced Rockets
    - Michael T. Heath (m-heath@uiuc.edu)
- The University of Utah
  - Center for Simulation of Accidental Fires and Explosions
    - David W. Pershing (David.Pershing@dean.eng.utah.edu)
- The California Institute of Technology
  - Facility for Simulating the Dynamic Response of Materials
    - Daniel I. Meiron (dim@ama.caltech.edu)







## ASCI is an essential part of the rapidly evolving Stockpile Stewardship Program

- ASCI provides leading-edge, high-end simulation capabilities to meet weapon certification requirements
- ASCI integrates the resources of national laboratories, computer manufacturers, and academic institutions
  - national labs focus on application codes and related applied science
  - computer manufacturers develop technology and systems for 100 TeraFlops
  - academic institutions research the basic science

The ASCI codes will need to be continually evaluated against experimental data in the relevant regimes


### We estimate it will take ten years to fully implement the SSP investment





Time is critical because, in the transition, we will need to rely on the judgment of a diminishing number of nuclear-test trained weapon designers






### Our future certification of the stockpile will rely on informed judgments



- Trained, knowledgeable people are required to assess and certify the stockpile
- A deeper understanding of the underlying science is required for practical weapon assessment capabilities
- New computational capabilities are needed to provide the integration formerly done with nuclear tests
- New experimental capabilities are required to provide detailed component level tests and validate the computation tools







Full implementation of SSP is required to sustain nuclear deterrence


### There are many risks inherent in the SSP





"The contractor did not accept the implications of tests early in the program that the design had a serious and unanticipated flaw. NASA did not accept the judgment of its engineers that the design was unacceptable and, as the problems grew in number and severity, NASA minimized them in management briefings and reports."

- Report of the Presidential Commission on the Challenger accident















































A broad class of nationally important problems requires predicting the response of complex systems outside the envelope of controlled experiments and direct, reliable observation:

- Science-based stewardship of the nuclear weapons stockpile
- Global climate predictions
- Nuclear reactor technology
- Virtual testing of aerospace, auto, and military technologies
- Natural disaster forecasting
- National infrastructure security

Applied Theoretical & Computational Physics Division

Los Alamos
































































































### Blending responsible conservatism with a commitment to the methodical development of true predictive capability

- Establish modern baseline for each enduring stockpile weapon system
- Draw a "box" in parameter regime about each system, based on quantified uncertainty
- Don't certify "outside the box"
- Enhance simulation fidelity (SBSS, ASCI)
- Use verification, validation, and stochastically-based "predictivity tools" to demonstrate an expanded range of design parameter space (go to second bullet)
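The "box" loop described on this slide can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the talk: the class `ParameterBox`, the function `can_certify`, the parameter names, the numbers, and in particular the choice to model the box as a baseline value plus or minus `k` standard deviations per parameter. The slide does not say how the box is actually constructed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParameterBox:
    """Axis-aligned box of validated parameter ranges (hypothetical structure)."""
    bounds: dict  # parameter name -> (low, high)

    @classmethod
    def from_baseline(cls, baseline, sigma, k=2.0):
        # Assumption: the box is baseline +/- k standard deviations per parameter.
        return cls({name: (baseline[name] - k * sigma[name],
                           baseline[name] + k * sigma[name])
                    for name in baseline})

    def contains(self, point):
        # A point is inside only if every parameter lies within its interval.
        return all(lo <= point[name] <= hi
                   for name, (lo, hi) in self.bounds.items())

    def expanded(self, name, low, high):
        # Widen one parameter's range, e.g. after new validation evidence.
        lo, hi = self.bounds[name]
        new_bounds = dict(self.bounds)
        new_bounds[name] = (min(lo, low), max(hi, high))
        return ParameterBox(new_bounds)


def can_certify(box, point):
    """Rule from the slide: don't certify outside the box."""
    return box.contains(point)


# Illustrative, made-up parameters and values.
baseline = {"temperature": 300.0, "density": 10.0}
sigma = {"temperature": 5.0, "density": 0.5}
box = ParameterBox.from_baseline(baseline, sigma, k=2.0)

inside = {"temperature": 305.0, "density": 10.4}
outside = {"temperature": 320.0, "density": 10.4}

print(can_certify(box, inside))    # True
print(can_certify(box, outside))   # False

# After validation work extends the demonstrated range, redraw the box.
wider = box.expanded("temperature", 285.0, 325.0)
print(can_certify(wider, outside))  # True
```

The last step mirrors the "go to second bullet" instruction: once verification and validation demonstrate a wider range, the box is redrawn and the same containment rule is applied again.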



