Transcript: Slide 1
Computational Biology
2008
Advisor: Dr. Alon Korngreen
Eitan Hasid
Assaf Ben-Zaken
Concepts
• Biological Background
• Computational Background
• Goals
• Methods
• GPU Abilities
• GPU Programming
• Conclusions
• The Future
Neurons
• Core components of the nervous system.
• Highly specialized for the processing and transmission of cellular signals.
• Communicate with one another via electrical and chemical synapses in a process called synaptic transmission.
• Each consists of a soma, an axon, and dendrites.
• Their behavior is examined by measuring voltage-gated conductances.
Compartmental Models
• Models used to investigate the physiology of complex neurons.
• Consist of distributions of voltage-gated conductances.
• Have a large number of loosely constrained parameters.
• A manual parameter search is almost impossible to conduct.
Genetic Algorithm
• Used to automate the parameter search in the model.
• Multiple recordings from a number of locations improved the GA's ability to constrain the model.
• A combined cost function was found to be the most effective.
• Problem: the GA was very slow, about 25 seconds per generation.
• Solution: better computation power.
Goals
• Create computer software that will perform calculations on the graphics card.
• Modify the existing algorithm to use the software wherever parallel computation can take place.
• Improve running time, thereby allowing the addition of even more parameters to the algorithm.
Methods
Graphics Accelerators
• As a tool for parallel computations.
Sh – High-Level Metaprogramming Language
• Communication with the graphics card.
C++
• Wrapping of the Sh program.
• The GA is also written in C++.
Graphics Accelerators
Performance
• Thanks in part to the video gaming and entertainment industries, today we have graphics cards that are extremely fast and programmable.
• FLOPS – floating-point operations per second.
• FLOPS is a measure of a computer's performance, especially in fields of scientific computing that make heavy use of floating-point calculations; it is similar to instructions per second.
• CPU – can perform about 12 gigaflops.
• GPU – an Nvidia 6800 can perform hundreds of gigaflops.
GPU Architecture
• In order to understand the parallel power, the architecture needs to be studied as well.
• The GPU holds a large number of processing units.
• Each unit can function as a small CPU.
• You can send multiple inputs and receive multiple outputs for the same operation.
• On today's advanced hardware you can even send a different operation to each unit.
GPGPU
• Stands for "General-Purpose computation on GPUs".
GPGPU Languages
Why do we need them?
• Make programming GPUs easier!
– Don’t need to know OpenGL, DirectX, or ATI/NV
extensions.
– Simplify common operations.
– Focus on the algorithm, not on the implementation.
Main considerations
• Cross-platform: Windows/Linux.
• Cross-vendor: Nvidia/ATI.
• Operations supported.
• Memory management.
• Program manipulation.
• Ease of learning & available documentation.
Sh
High-Level Metaprogramming Language
Operating Systems:
• Windows
• Linux
Graphics Cards:
• NVIDIA GeForce FX 5200 and up.
• ATI Radeon 9600 and up.
Stream Processing:
• Allows very customized operations to take place on the GPU; today's newer hardware allows even more.
Help and Support:
• We had to compromise on something…
• Sh is embedded in C++, and therefore its syntax is very easy to learn.
Analyzing the algorithm
• gprof – a built-in Linux tool for profiling programs.
• We used this tool to find the heaviest-running functions.
• The bottleneck functions were matrix multiplication and determinant calculation, which took about 57% of the running time.
• Next we created stream-processed functions to perform these calculations on the GPU.
Sh Sample Code
// Environment and namespace declaration
#include <sh/sh.hpp>
using namespace SH;
// Sh environment initialization
shInit();
// Sh program definition
// This program adds the vector (42.0, 42.0, 42.0) to an input 3-float vector
ShProgram prg = SH_BEGIN_PROGRAM("gpu:stream")
{
ShInputAttrib3f a;
ShOutputAttrib3f b;
b = a + ShAttrib3f(42.0, 42.0, 42.0);
} SH_END;
Sh Sample Code (continued)
// Setting up the input variable
float data[] = { 1.0, 0.5, -0.5 };
ShHostMemoryPtr mem_in = new ShHostMemory(sizeof(float) * 3, data, SH_FLOAT);
ShChannel<ShAttrib3f> in(mem_in, 1);
// Setting up the output variable
float outdata[3];
ShHostMemoryPtr mem_out = new ShHostMemory(sizeof(float) * 3, outdata, SH_FLOAT);
ShChannel<ShAttrib3f> out(mem_out, 1);
// Executing the program
out = prg << in;
Conclusions
• The parallel power is unquestionable.
• In conjunction with a good graphics card, Sh could be a possible solution for heavy-duty processing.
• Sh grants a developer the ability to develop software that runs on the GPU in real time that could not run on the CPU at interactive rates.
• We created a program that performs its calculations on the GPU.
• We still did not manage to create a program whose execution is faster than the CPU's, because of the tuples issue.
References
• Constraining Compartmental Models Using Multiple Voltage Recordings and
Genetic Algorithms
Naomi Keren, Noam Peled, and Alon Korngreen
• Principles of Neural Science
Eric R. Kandel.
• GPGPU Community Website.
http://www.gpgpu.org.
• Sh Website
http://www.libsh.org
• Metaprogramming GPUs with Sh
Michael McCool, Stefanus Du Toit
Thanks for listening