Database-driven Evolutionary Computation: SQL as a full

Download Report

Transcript Database-driven Evolutionary Computation: SQL as a full

Database-driven Evolutionary
Computation:
SQL as a full programming language
Brian Blaha
The Cost of Programming
• Years ago, hardware cost much more than
programmers
• Today, the situation is by-and-large reversed
Programming Languages
•
•
•
•
Assembly directly represents a program
C abstracts low-level instructions
C++ abstracts high-level operations
Matlab abstracts matrices, neural networks,
and other constructs
• Lisp abstracts memory management
• SQL abstracts logical relationships
Programming Efficiency
• Assembly is rock bottom
• C allows portability across platforms, but
maintaining old code can be difficult
• C++ provides common data structures but
legacy elements of C make its code obscure
• All of the above usually require a program,
rather than instructions, be written and
tested
Programming Efficiency
• Matlab code can be written and tested at the
same time, but the function libraries are
complex
• Lisp encourages hierarchical program
design, and its code can be tested while
being written
• SQL provides full database functionality
C++ vs Lisp
• Erann Gat compared the development time and ru
time of comparable C++ and Lisp programs
• Many programmers implemented a simple
algorithm in either C++ or Lisp
• The resulting development times and run times
serve to compare the languages
Metric
Fastest run time
Average run time
Average
development time
Code length
Programmer
experience
C++ vs Lisp
C++
Lisp
Lisp
Lisp
C++
Interpretation
• Lisp facilitates programming by simplifying
syntax and by allowing incremental testing
• Less skill is required to write fast Lisp
programs than is required to write fast C++
programs
Extension to SQL
• SQL has different syntactic overhead than
C++, and more than Lisp. On the other
hand, one SQL statement does the work of
many C++/Lisp statements
• SQL allows incremental testing, especially
when transactions and rollbacks are used
• Rules for streamlining database operations
tend to be simple
Unique Advantages
• SQL very naturally supports observation of
a currently executing algorithm
• SQL allows for asynchronous alteration of
an algorithm’s data
• Large sets of input data are usually in
databases in the first place
• SQL proficiency is more common than Lisp
proficiency
SQL vs C++
• A genetic algorithm is a complex algorithm,
rather than a simple number-crunching
algorithm like a Fourier transform
• Steps in a generation tend to involve
population-wide operations
C++ Version of an Individual
class Individual{
private:
double fitness;
Gene* genes;
size_t genes_count;
};
SQL Version of an Individual
create table individual (
ID serial
constraint individual\_ID_pk primary key,
fitness double precision,
rank integer,
population integer constraint
individual_population_nn not null
);
SQL vs C++
• Therefore, a simple genetic algorithm was
implemented in SQL and C++
• No external data was input during the
execution, as normally might be the case
SQL vs C++
Time per generation
C++
SQL
unoptimized
.004 seconds 4 seconds
SQL with data
in RAM
1 second
Mitigating Factors
• Lisp code can be compiled for greater
execution speed
• SQL code cannot be directly compiled, but
C++ classes and functions exist that allow
indirect compilation
• Normal SQL operation involves hard disk
activity which greatly slows algorithms