Computational Techniques for
Efficient Carbon Nanotube
Simulation
Ashok Srinivasan
Namas Chandra
Computer Science
Mechanical Engineering
Florida State University
Outline
• Background
• Parallel nanotube simulation
• Nanocomposites
• Parallelization
• Conclusions and future work
Background
• Uses of Carbon nanotubes
– Materials
– NEMS
– Transistors
– Displays
– Etc.
CNT properties
• Can span 23,000 miles without failing due to its own weight
• 100 times stronger than steel
• Many times stiffer than any known material
• Conducts heat better than diamond
• Can be a conductor or an insulator without any doping
• Lighter than a feather
Sequential computation
• Molecular dynamics, using Brenner’s
potential
– Short-range interactions
– Neighbors can change dynamically during
the course of the simulation
– Computational scheme
• Find force on each particle due to interactions
with “close” neighbors
• Update position and velocity of each atom
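A minimal sketch of this scheme in C++, assuming a generic short-range pair force and a plain explicit update in place of Brenner's potential and the predictor/corrector integrator mentioned later; the names (Atom, pair_force, the cutoff) are illustrative.

```cpp
// Minimal sketch of one MD step with a short-range neighbor list.
// The pair force is a placeholder, not Brenner's bond-order potential,
// and the explicit update stands in for the predictor/corrector.
#include <cmath>
#include <cstddef>
#include <vector>

struct Atom { double x[3], v[3], f[3], mass; };

// Placeholder short-range pair force magnitude with a 2.0 cutoff (illustrative).
double pair_force(double r) { return (r < 2.0) ? 1.0 / (r * r) : 0.0; }

void md_step(std::vector<Atom>& atoms,
             const std::vector<std::vector<int>>& neighbors,  // "close" neighbors of each atom
             double dt) {
    // 1. Find the force on each atom due to interactions with close neighbors.
    for (auto& a : atoms) a.f[0] = a.f[1] = a.f[2] = 0.0;
    for (std::size_t i = 0; i < atoms.size(); ++i) {
        for (int j : neighbors[i]) {
            double d[3], r2 = 0.0;
            for (int k = 0; k < 3; ++k) {
                d[k] = atoms[i].x[k] - atoms[j].x[k];
                r2 += d[k] * d[k];
            }
            double r = std::sqrt(r2);
            double fmag = pair_force(r);
            for (int k = 0; k < 3; ++k) atoms[i].f[k] += fmag * d[k] / r;
        }
    }
    // 2. Update the position and velocity of each atom.
    for (auto& a : atoms) {
        for (int k = 0; k < 3; ++k) {
            a.v[k] += dt * a.f[k] / a.mass;
            a.x[k] += dt * a.v[k];
        }
    }
}
```
Because neighbors can change dynamically, the neighbor list passed in here has to be rebuilt periodically, which is the "Neighbor list" component in the execution-time profile below.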
Force computations
• Pair interactions
• Bond angles
• Four body
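The sketch below only illustrates how contributions of these three kinds would be accumulated into a total energy. The harmonic and cosine forms, constants, and bonded lists are placeholders; Brenner's potential actually uses a more involved bond-order formulation.

```cpp
// Sketch of accumulating pair, bond-angle, and four-body (dihedral)
// contributions. The functional forms and constants are placeholders;
// Brenner's potential uses a bond-order formulation instead.
#include <algorithm>
#include <array>
#include <cmath>
#include <vector>

using Vec3 = std::array<double, 3>;

Vec3 sub(const Vec3& a, const Vec3& b) { return {a[0]-b[0], a[1]-b[1], a[2]-b[2]}; }
double dot(const Vec3& a, const Vec3& b) { return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]; }
Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a[1]*b[2]-a[2]*b[1], a[2]*b[0]-a[0]*b[2], a[0]*b[1]-a[1]*b[0]};
}
double norm(const Vec3& a) { return std::sqrt(dot(a, a)); }
double clamp1(double c) { return std::max(-1.0, std::min(1.0, c)); }

double total_energy(const std::vector<Vec3>& r,
                    const std::vector<std::array<int,2>>& bonds,      // pair interactions
                    const std::vector<std::array<int,3>>& angles,     // i-j-k, angle at j
                    const std::vector<std::array<int,4>>& dihedrals)  // i-j-k-l
{
    const double kb = 1.0, r0 = 1.42, ka = 1.0, th0 = 2.09, kd = 0.1;  // placeholder constants
    double e = 0.0;
    for (const auto& b : bonds) {                 // pair (two-body) terms
        double dr = norm(sub(r[b[0]], r[b[1]])) - r0;
        e += 0.5 * kb * dr * dr;
    }
    for (const auto& a : angles) {                // bond-angle (three-body) terms
        Vec3 u = sub(r[a[0]], r[a[1]]), v = sub(r[a[2]], r[a[1]]);
        double th = std::acos(clamp1(dot(u, v) / (norm(u) * norm(v))));
        e += 0.5 * ka * (th - th0) * (th - th0);
    }
    for (const auto& d : dihedrals) {             // four-body (torsional) terms
        Vec3 b1 = sub(r[d[1]], r[d[0]]), b2 = sub(r[d[2]], r[d[1]]), b3 = sub(r[d[3]], r[d[2]]);
        Vec3 n1 = cross(b1, b2), n2 = cross(b2, b3);
        double phi = std::acos(clamp1(dot(n1, n2) / (norm(n1) * norm(n2))));
        e += kd * (1.0 - std::cos(phi));
    }
    return e;
}
```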
Profile of execution time
• 1: Force
• 2: Neighbor list
• 3: Predictor/corrector
• 4: Thermostating
• 5: Miscellaneous
Profile for force computations
Parallel nanotube simulation
• Shared memory
• Message passing
• Load balancing
Shared memory parallelization
• Do each of the following loops in parallel
– For each atom
• Update forces due to atom i
• If neighboring atoms are owned by other threads, update an
auxiliary array
– For each thread
• Collect force terms for atoms it owns
– Srivastava et al., SC-97 and CSE 2001
• Simulated 10^5 to 10^7 atoms
• Speedup around 16 on 32 processors
• Include long-range forces too
Lexical decomposition
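A sketch of this two-loop structure with OpenMP, assuming a block (lexical) assignment of atoms to threads and one auxiliary force array per thread; the pair contribution is a placeholder, and this is not the actual code from Srivastava et al.

```cpp
#include <omp.h>
#include <algorithm>
#include <vector>

// fx: force array (x-component only, for brevity), already sized to n_atoms.
void compute_forces(int n_atoms,
                    const std::vector<std::vector<int>>& neighbors,
                    std::vector<double>& fx) {
    int max_threads = omp_get_max_threads();
    // One auxiliary force array per thread, for atoms owned by other threads.
    std::vector<std::vector<double>> aux(max_threads,
                                         std::vector<double>(n_atoms, 0.0));

    #pragma omp parallel
    {
        int nt = omp_get_num_threads(), tid = omp_get_thread_num();
        int chunk = (n_atoms + nt - 1) / nt;
        int lo = tid * chunk, hi = std::min(n_atoms, lo + chunk);  // atoms owned by tid

        for (int i = lo; i < hi; ++i) fx[i] = 0.0;  // zero owned forces

        // Loop 1: each thread updates forces due to the atoms it owns.
        for (int i = lo; i < hi; ++i) {
            for (int j : neighbors[i]) {
                double fij = 1.0;                      // placeholder pair contribution
                fx[i] += fij;                          // i is owned by this thread
                if (j >= lo && j < hi) fx[j] -= fij;   // j owned here: update directly
                else aux[tid][j] -= fij;               // j owned elsewhere: auxiliary array
            }
        }

        #pragma omp barrier
        // Loop 2: each thread collects the auxiliary force terms for atoms it owns.
        for (int i = lo; i < hi; ++i)
            for (int t = 0; t < nt; ++t)
                fx[i] += aux[t][i];
    }
}
```
Only the owning thread writes fx[i] directly, so the auxiliary arrays are what keep the first loop free of data races; the barrier separates the two loops.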
Message passing parallelization
• Decompose domain into cells
– Each cell contains its atoms
• Assign a set of adjacent cells to each processor
• Each processor computes values for its cells,
communicating with neighbors when their data is
needed
• Caglar and Griebel, World Scientific, 1999
– Simulated 10^8 atoms on up to 512 processors
– Linear speedup for 160,000 atoms on 64 processors
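A sketch of the communication structure only, assuming a one-dimensional row of cells along the tube with a contiguous block of cells per rank; the cell and atom bookkeeping is elided and the buffer sizes are arbitrary.

```cpp
// Each rank owns a block of cells and exchanges the atoms in its boundary
// cells with the two neighboring ranks before computing forces.
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Coordinates of atoms in this rank's leftmost and rightmost cells
    // (placeholder data; in a real code these come from the cell lists).
    std::vector<double> left_boundary(30, rank), right_boundary(30, rank);
    std::vector<double> from_left(30), from_right(30);

    int left  = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
    int right = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

    // Exchange boundary cells with neighbors (ghost atoms).
    MPI_Sendrecv(right_boundary.data(), 30, MPI_DOUBLE, right, 0,
                 from_left.data(),      30, MPI_DOUBLE, left,  0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(left_boundary.data(),  30, MPI_DOUBLE, left,  1,
                 from_right.data(),     30, MPI_DOUBLE, right, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    // ... compute forces for owned cells using local plus ghost atoms ...

    MPI_Finalize();
    return 0;
}
```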
Load balancing
Nanocomposites
• Matrix-nanotube
interface modeled
with springs
• An extra force term
computed for atoms
attached to springs
• Springs can break,
requiring substantial
increase in
computations in that
region
(Figure: spring connecting the nanotube to the polymer matrix)
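A sketch of such an extra force term, assuming a simple Hookean spring anchored in the matrix that fails beyond a fixed extension; the struct fields and constants are illustrative, not the actual interface model.

```cpp
#include <array>
#include <cmath>
#include <vector>

struct Spring {
    int atom;                      // nanotube atom attached to the matrix
    std::array<double, 3> anchor;  // attachment point in the polymer matrix
    double k;                      // stiffness
    double break_length;           // extension at which the spring fails
    bool broken = false;
};

void add_spring_forces(const std::vector<std::array<double,3>>& pos,
                       std::vector<std::array<double,3>>& force,
                       std::vector<Spring>& springs) {
    for (auto& s : springs) {
        if (s.broken) continue;
        std::array<double,3> d;
        double r2 = 0.0;
        for (int k = 0; k < 3; ++k) {
            d[k] = s.anchor[k] - pos[s.atom][k];
            r2 += d[k] * d[k];
        }
        if (std::sqrt(r2) > s.break_length) {
            // Breaking triggers substantially more computation in this region
            // (not shown here).
            s.broken = true;
            continue;
        }
        for (int k = 0; k < 3; ++k)
            force[s.atom][k] += s.k * d[k];   // pull the atom toward the anchor
    }
}
```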
Parallelization
• Distributed shared memory
– Balance the load
– Ensure locality of data
• Simple lexical approach will result in load imbalance
– Balanced lexical
• Adjust the domain size
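One way such an adjustment could be done is a prefix sum over estimated per-atom work, sketched below; the weights themselves (e.g. heavier near broken springs) are assumed to be supplied.

```cpp
// Sketch of a "balanced lexical" decomposition: atoms stay in lexical
// (index) order, but the boundaries between processors are shifted so the
// estimated work in each block is roughly equal.
#include <vector>

// Returns, for each processor, the index of the first atom it owns,
// plus a final sentinel equal to the number of atoms.
std::vector<int> balanced_lexical(const std::vector<double>& weight, int n_procs) {
    int n = static_cast<int>(weight.size());
    double total = 0.0;
    for (double w : weight) total += w;

    std::vector<int> start(n_procs + 1, n);
    start[0] = 0;
    double prefix = 0.0;
    int p = 1;
    for (int i = 0; i < n && p < n_procs; ++i) {
        prefix += weight[i];
        // Start the next processor's block once this block holds its fair share.
        if (prefix >= p * total / n_procs) start[p++] = i + 1;
    }
    return start;
}
```
With uniform weights this reduces to the plain lexical split; regions containing broken springs carry larger weights and therefore end up in smaller domains.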
Breadth first search
• We want to ensure locality too
• Balanced Breadth First Search
– Perform a breadth first search on the atom interaction
graph
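A sketch of one plausible reading of balanced BFS: order the atoms by a breadth-first search over the interaction graph, so each block stays connected and spatially compact, then cut the BFS order into blocks of roughly equal weight. This is an illustration, not the exact algorithm from the talk.

```cpp
#include <queue>
#include <vector>

// Returns a partition id for each atom.
std::vector<int> bfs_partition(const std::vector<std::vector<int>>& graph,
                               const std::vector<double>& weight,
                               int n_procs, int source) {
    int n = static_cast<int>(graph.size());
    std::vector<int> order;
    std::vector<bool> seen(n, false);
    std::queue<int> q;
    q.push(source);
    seen[source] = true;
    while (!q.empty()) {                       // standard BFS traversal
        int u = q.front(); q.pop();
        order.push_back(u);
        for (int v : graph[u])
            if (!seen[v]) { seen[v] = true; q.push(v); }
    }

    double total = 0.0;
    for (double w : weight) total += w;

    // Atoms not reached by the BFS (disconnected pieces) default to the last block.
    std::vector<int> part(n, n_procs - 1);
    double prefix = 0.0;
    int p = 0;
    for (int u : order) {                      // cut the BFS order into balanced blocks
        part[u] = p;
        prefix += weight[u];
        if (prefix >= (p + 1) * total / n_procs && p < n_procs - 1) ++p;
    }
    return part;
}
```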
Other approaches
• Use general purpose software
– Jostle
– Metis
– ParMetis
Experimental parameters
• Nanotube with 1000 atoms
• Spring probability: 0.05
• Probability of a spring breaking in an
iteration: 0.01
• Load increase factor due to spring
break: 200
• Disturbance region depth: 3
• Number of time steps: 100
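For reference, these settings could be collected into a configuration struct like the one below; the field names are illustrative, while the values are the ones listed above.

```cpp
// Experimental parameters from the slide, gathered for a simulation driver.
struct ExperimentConfig {
    int    n_atoms              = 1000;   // nanotube with 1000 atoms
    double spring_probability   = 0.05;   // chance an atom is attached to a spring
    double break_probability    = 0.01;   // chance a spring breaks in an iteration
    double load_increase_factor = 200.0;  // extra work when a spring breaks
    int    disturbance_depth    = 3;      // depth of the disturbed region
    int    n_time_steps         = 100;
};
```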
Load imbalance
Non-local interactions
Load balancing time
Variation of load with time
Conclusions and future work
• Neighbor search
• Parallelization
– Current approaches appear inadequate
– General purpose software is too slow
– Special purpose techniques appear promising
• Stochastic versions of certain current techniques possible
• Multi-scale simulation of nano-composites
– MD at nano-scale and FEM at larger scale