Why Stackless is Cool



Stackless Python:
programming the way Guido ~~prevented it~~ intended
Back to IPC9 developer's day
Why Stackless is Cool
• Microthreads
• Generators (now obsolete)
• Coroutines
Microthreads
• Very lightweight (can support thousands)
• Locks need not be OS resources
• Not for blocking I/O
• A comfortable model for people used to real threads
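The "thousands" claim is easy to illustrate even without Stackless. The sketch below uses plain CPython generators as cooperative microthreads (illustrative only; `worker` and `round_robin` are invented names, not the Stackless API). Ten thousand of these cost only heap memory, far cheaper than OS threads.

```python
# A minimal sketch of microthread cheapness using plain CPython generators
# as cooperative tasks (illustrative only -- this is not the Stackless API).
from collections import deque

def worker(ident, steps):
    """A tiny cooperative task: yields control back after every step."""
    for step in range(steps):
        yield (ident, step)

def round_robin(tasks):
    """Run all tasks cooperatively until each one finishes."""
    queue = deque(tasks)
    results = []
    while queue:
        task = queue.popleft()
        try:
            results.append(next(task))
            queue.append(task)   # task still alive: re-queue it
        except StopIteration:
            pass                 # task finished: drop it
    return results

# Ten thousand "microthreads", each doing two steps.
out = round_robin(worker(i, 2) for i in range(10000))
print(len(out))  # 20000 -- two steps from each of 10000 tasks
```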
Coroutines
Various ways to look at them
• Peer to peer subroutines
• Threads with voluntary swapping
• Generators on steroids (args in, args out)
What’s so cool about them
• Both sides get to “drive”
• Often can replace a state machine with something more intuitive[1]
[1] Especially where the state machine features complex state but relatively simple events (or few events per state).
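As a sketch of that claim, here is a line-assembly parser written as a plain CPython generator coroutine (no Stackless required; `line_assembler` is an invented name). The parser's state, the partial line, lives in a local variable, where an explicit state machine would need a state object and transition code.

```python
# Complex state (the partial line), simple events (incoming chunks):
# exactly the case footnote [1] describes.

def line_assembler():
    """Receive chunks of text via send(); yield any completed lines."""
    buffer = ""
    lines = []
    while True:
        chunk = yield lines      # wait for the next "event" (a chunk)
        lines = []
        buffer += chunk
        while "\n" in buffer:    # extract every complete line
            line, buffer = buffer.split("\n", 1)
            lines.append(line)

asm = line_assembler()
next(asm)                        # prime the coroutine to the first yield
print(asm.send("hel"))           # []
print(asm.send("lo\nwor"))       # ['hello']
print(asm.send("ld\n"))          # ['world']
```

Both sides "drive": the caller pushes chunks in at will, and the coroutine hands lines back, with no explicit state table on either side.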
Three Steps To Stacklessness
• Get Python data off the C stack
• Give each frame its own (Python) stack space
• Get rid of interpreter recursions
Result
• All frames are created equal
• Stack overflows become memory errors
• Pickling program state becomes conceivable (new: *has* been done)
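For contrast with "stack overflows become memory errors": in stock CPython, every Python frame also consumes C stack, so deep recursion raises RecursionError at a fixed limit rather than running until heap memory is exhausted. This sketch only demonstrates standard CPython behavior, not Stackless itself.

```python
import sys

sys.setrecursionlimit(5000)

def depth(n=0):
    """Recurse until the interpreter refuses, and report how deep we got."""
    try:
        return depth(n + 1)
    except RecursionError:
        return n

d = depth()
print(d)  # a few thousand at most -- bounded by the recursion limit, not RAM
```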
Getting rid of recursion is difficult
• Often there is “post” processing involved
• The C code (doing the recursing) may need its own “frame”
• Possible approaches
– Tail-optimized recursion
– Transformation to a loop
Either way, the “post” code needs to be separated from the “setup” code.
Ironic Note: This is exactly the kind of pain we seek to relieve the Python programmer of!
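The setup/post separation the slide describes can be sketched in Python rather than C: turning a recursive tree sum into a loop with an explicit stack forces exactly that split (illustrative code, not the actual interpreter change).

```python
# Recursive original: the "post" work is the + after both calls return.
def tree_sum_recursive(node):
    if isinstance(node, int):
        return node
    left, right = node
    return tree_sum_recursive(left) + tree_sum_recursive(right)

# Loop version: "setup" (descend) and "post" (combine) become separate
# entries on an explicit stack -- the pain the slide mentions.
def tree_sum_iterative(node):
    stack = [("setup", node)]
    results = []
    while stack:
        phase, item = stack.pop()
        if phase == "setup":
            if isinstance(item, int):
                results.append(item)
            else:
                left, right = item
                stack.append(("post", None))    # schedule the combine step
                stack.append(("setup", right))
                stack.append(("setup", left))
        else:  # "post": combine the two most recent partial results
            b = results.pop()
            a = results.pop()
            results.append(a + b)
    return results[0]

tree = ((1, 2), (3, (4, 5)))
print(tree_sum_iterative(tree))  # 15, same as tree_sum_recursive(tree)
```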
Stackless Reincarnate
Completely different approach:
• Nearly no changes to the Python core
• Platform dependent
• Few lines of assembly
• No longer fighting the Python implementation
• Orthogonal concepts
Platform Specific Code
__forceinline static int
slp_switch(void)
{
    int *stackref, stsizediff;
    __asm mov stackref, esp;
    SLP_SAVE_STATE(stackref, stsizediff);
    __asm {
        mov eax, stsizediff
        add esp, eax
        add ebp, eax
    }
    SLP_RESTORE_STATE();
}
Note: There are no arguments, in order to simplify the code
Support Macros (1/2)
#define SLP_SAVE_STATE(stackref, stsizediff) \
{ \
    PyThreadState *tstate = PyThreadState_GET(); \
    PyCStackObject **cstprev = tstate->slp_state.tmp.cstprev; \
    PyCStackObject *cst = tstate->slp_state.tmp.cst; \
    int stsizeb; \
    if (cstprev != NULL) { \
        if (slp_cstack_new(cstprev, stackref) == NULL) return -1; \
        stsizeb = (*cstprev)->ob_size * sizeof(int*); \
        memcpy((*cstprev)->stack, (*cstprev)->startaddr - (*cstprev)->ob_size, stsizeb); \
        (*cstprev)->frame = tstate->slp_state.tmp.fprev; \
    } \
    else \
        stsizeb = (cst->startaddr - stackref) * sizeof(int*); \
    if (cst == NULL) return 0; \
    stsizediff = stsizeb - (cst->ob_size * sizeof(int*)); \
Note: Arguments are passed via Threadstate for easy implementation
Support Macros (2/2)
#define SLP_RESTORE_STATE() \
    tstate = PyThreadState_GET(); \
    cst = tstate->slp_state.tmp.cst; \
    if (cst != NULL) \
        memcpy(cst->startaddr - cst->ob_size, &cst->stack, (cst->ob_size) * sizeof(int*)); \
    return 0; \
}
Stacklessness via Stack Slicing
• Pieces of the C stack are captured
• Recursion limited by heap memory only
• Stack pieces attached to frame objects
• “One-shot continuation”
Tasklets
• Tasklets are the building blocks
• Tasklets can be switched
• They behave like tiny threads
• They communicate via channels
Tasklet Creation
# a function that takes a channel as argument
def simplefunc(chan):
    chan.receive()

# a factory for some tasklets
def simpletest(func, n):
    c = stackless.channel()
    gen = stackless.taskoutlet(func)
    for i in range(n):
        gen(c).run()
    return c
Inside Tasklet Creation
• Create frame “before call”
– Abuse of generator flag
• Use “initial stub” as a blueprint
– slp_cstack_clone()
• Parameterize with a frame object
• Wrap into a tasklet object
• Ready to run
Channels
• Known from OCCAM, Limbo, Alef
• Channel.send(x)
– activates a waiting tasklet with data
– blocks if none is waiting
• y = Channel.receive()
– activates a waiting tasklet, returns data
– blocks if none is listening
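These rendezvous semantics can be sketched in plain Python: tasklets modeled as generators that yield requests to a tiny scheduler, with send and receive each blocking until a partner arrives. The names (`Channel`, `run`, the tuple protocol) are invented for illustration and are not the Stackless API.

```python
from collections import deque

class Channel:
    """Rendezvous channel: send and receive block until a partner arrives."""
    def __init__(self):
        self.senders = deque()    # blocked senders: (task, value)
        self.receivers = deque()  # blocked receivers: task

def run(tasks):
    """Round-robin scheduler; tasks yield ('send', chan, value),
    ('receive', chan), or ('log', message) requests."""
    ready = deque((t, None) for t in tasks)
    log = []
    while ready:
        task, value_in = ready.popleft()
        try:
            op = task.send(value_in)
        except StopIteration:
            continue                                # task finished
        if op[0] == "send":
            _, chan, value = op
            if chan.receivers:                      # a receiver is waiting
                ready.append((chan.receivers.popleft(), value))
                ready.append((task, None))
            else:
                chan.senders.append((task, value))  # block until a receiver
        elif op[0] == "receive":
            _, chan = op
            if chan.senders:                        # a sender is waiting
                sender, value = chan.senders.popleft()
                ready.append((task, value))
                ready.append((sender, None))
            else:
                chan.receivers.append(task)         # block until a sender
        elif op[0] == "log":
            log.append(op[1])
            ready.append((task, None))
    return log

def producer(chan):
    yield ("send", chan, 42)
    yield ("log", "producer resumed after send")

def consumer(chan):
    value = yield ("receive", chan)
    yield ("log", "received %d" % value)

c = Channel()
result = run([producer(c), consumer(c)])
print(result)  # ['received 42', 'producer resumed after send']
```

Note how the producer only resumes after its value is handed over, matching the blocking rules in the bullets above; the same output appears if the consumer is scheduled first.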
Planned Extensions
• Async I/O in a platform independent way
• Prioritized scheduling
• High speed tasklets with extra stacks
– Quick monitors which run between tasklets
• Stack compression
• Thread pickling
• More channel features
– Multiple wait on channel arrays
Thread pickling
• Has been implemented by TwinSun
– Unfortunately for old Stackless
• Analysis of the C stack necessary
– Per platform only
– Lots of work?
– Only a few contexts need stack analysis
• Show it!
Stackless Sponsors
• IronPort
– Email server with dramatic throughput
– Integrating their code with the new Stackless
– Async I/O
• CCP Games
– Massively Multiplayer Online Game EVE
– Porting their client code to new Stackless next week