A Technique to Call Icon from C under Icon Version 9 Carl Sturtivant, 2008/2/20 Confidential Draft #1 1. Summary. A new Icon function written in C with a special interface may be dynamically loaded from a shared object using the built-in function loadfunc [GT95]. We show how such a function may in turn call an Icon procedure using the technique described below, provided that the procedure call itself does not suspend, but only returns or fails. Note that this does not impose constraints of any kind upon other procedures executed as a consequence of calling the original procedure. In particular, the Icon procedure called from C may in turn lead to a call of another Icon function written in C calling Icon recursively. The technique described has been implemented and briefly tested with Icon 9.51(?). 2. Overview. If the body of an Icon function written in C is to call an Icon procedure that does not suspend and retrieve its return value, all without modifying iconx, then there are a number of hurdles to jump. The procedure descriptor, and those of its arguments must be pushed onto the Icon stack, and the interpreter induced to believe it needs to execute an icode instruction to invoke it, one that is not present in the icode it loaded. Once the procedure returns (or fails) the interpreter must be induced to return control to C just after the point where the attempt to call it occurred, rather than simply to go on to the next icode instruction. Then the result of the call needs to be popped off the Icon stack so that it is in the same state as before the call, since C does not normally modify the Icon stack. (Other details of the state of the interpreter will be restored by the mechanism whereby a procedure is called in Icon.) In all other respects, the main interpreter loop must continue to behave as before. These hurdles are insurmountable, so long as the code of the main interpreter loop is inviolate. The code of that loop as it is incorporated into iconx is inviolate, since a design goal is that the technique should work with the existing implementation. Therefore, we take a copy of that loop, and modify it to the ends above, and execute it only in order to call Icon. (The original interpreter continues to be used for all other purposes.) Dynamic linking allows the new interpreter loop to refer to all C globals and functions in iconx, and so nothing else need be copied, these things are merely referred to. In fact it takes very little modification of the copy to achieve these goals, and the result is a C function called icall to which the procedure and its arguments are passed to effect the call to Icoan. To simplify this interface, the arguments are passed as a single Icon list. The resulting function then has similar semantics as the binary "!" operator in Icon, (which we henceforth call 'apply' as it applies a procedure to its argument list) except that it may be called from C. 3. Implementation. The main interpreter loop written in RTL resides in the file src/runtime/interp.r in the Icon distribution. This was translated into the corresponding C file xinterp.c by the RTL translator rtt with the command 'rtt -x interp.r'. Now this C file is edited into a file compiled into a single C function called icall, taking two descriptors (a procedure and a list of arguments) and returning an integer code. The effect of calling icall is to apply the procedure to its arguments, and restore the state of the interpreter, leaving the result of the call just beyond the stack pointer for retrieval. The contents of xinterp.c consist of some global variables and a function interp containing the interpreter loop. The global variable declarations are all modified by prefixing them with 'extern', so that they now simply refer to those used by the interpreter loop inside iconx. The function interp that returns an integer signal and has two parameters: an integer fsig used when the interpreter is called recursively to simulate suspension, and cargp, a pointer into the Icon stack. The function interp is renamed icall. Examination of src/runtime/init.r indicates that the signal 0 is passed to interp when it is initially called non-recursively to start up iconx. So fsig is removed from the parameter list and made a local variable initialized to 0. Similarly cargp is made a local variable, and icall is given two parameters theProc and arglist used to pass that necessary for the call to Icon. Immediately after the initial declarations inside icall, the Icon stack pointer sp is used to initialize cargp to refer to the first descriptor on the stack beyond sp, which is assigned the procedure desciptor parameter of icall. The desciptor beyond that is assigned the argument list descriptor parameter, and the stack pointer augmented to refer to its last word. A new local variable result_sp is initialized to location on the stack of the last word of the procedure descriptor. This is used by the mechanism to return to C described below. Now the details of pushing the procedure descriptor and the argument list descriptor onto the stack are complete. The body of interp consists of some straight-line code followed by the interpreter loop, which contains some code to get the next icode instruction followed by a switch to jump to the correct code to execute it, all inside and endless loop. Just before the loop starts, an unconditional goto is inserted, jumping to a newly inserted label called aptly 'apply' which is placed just after the switch label (called Op_Apply in interp.r) which precedes the code to implement the icode 'apply' instruction, that implements the apply operator (binary "!") in Icon. This instruction expects to find a procedure descriptor and a list descriptor on the stack, and then causes the icode instructions of the procedure to be accordingly invoked. Now the details of calling the procedure are complete. What is left to insert is the mechanism to return to C. When the procedure that we called returns or fails, it will execute a 'pret' instruction or a 'pfail' instruction. However, these instructions may also be executed by Icon procedures called from the one we called. At the end of the code for 'pret' inside the switch in the interpreter is a 'break' to leave the switch and go round to get the next icode instruction. Just before that 'break' we can tell if our procedure call is the one returning by comparing the Icon stack pointer sp to the one we saved, result_sp, which our procedure call will have restored sp to when it overwrote the procedure descriptor with the result of the call. So if they are equal, we can clean up (decrement ilevel, move sp just before the former procedure descriptor) and return, finishing the call to icall. Now C can retrieve the result of the call just beyond the stack pointer. The 'pfail' code is similar, just before a jump to efail, which we do not execute since the context of our call is not an Icon expression. C can determine success or failure from the integer code returned. This completes the mechanism to return to C. 4. Conclusions Overall this mechanism depends upon few things, mainly upon the fact that when a procedure is called, the Icon stack below the part used for the call is not modified during the call. Our copy of the interpreter loop is identical to the original with the exception of the code added for the C return mechanism, which is only exceptionally executed. And the Icon procedure call mechanism itself will save and restore the interpreter state apart from the stack pointer which we abuse at the start and restore at the end. The compiled result with gcc was about 10 Kbyte. A simple test confirmed that call and return occur in the correct order, from Icon to C to Icon returning to C returning to Icon.