Diffstat (limited to 'deps/lightening/lightning.texi')
| -rw-r--r-- | deps/lightening/lightning.texi | 1760 | 
1 files changed, 0 insertions, 1760 deletions
diff --git a/deps/lightening/lightning.texi b/deps/lightening/lightning.texi deleted file mode 100644 index 88f397a..0000000 --- a/deps/lightening/lightning.texi +++ /dev/null @@ -1,1760 +0,0 @@ -\input texinfo.tex  @c -*- texinfo -*- -@c %**start of header (This is for running Texinfo on a region.) - -@setfilename lightning.info - -@set TITLE       Using @sc{gnu} @i{lightning} -@set TOPIC       installing and using - -@settitle @value{TITLE} - -@c --------------------------------------------------------------------- -@c Common macros -@c --------------------------------------------------------------------- - -@macro bulletize{a} -@item -\a\ -@end macro - -@macro rem{a} -@r{@i{\a\}} -@end macro - -@macro gnu{} -@sc{gnu} -@end macro - -@macro lightning{} -@gnu{} @i{lightning} -@end macro - -@c --------------------------------------------------------------------- -@c Macros for Texinfo 3.1/4.0 compatibility -@c --------------------------------------------------------------------- - -@c @hlink (macro), @url and @email are used instead of @uref for Texinfo 3.1 -@c compatibility -@macro hlink{url, link} -\link\ (\url\) -@end macro - -@c ifhtml can only be true in Texinfo 4.0, which has uref -@ifhtml -@unmacro hlink - -@macro hlink{url, link} -@uref{\url\, \link\} -@end macro - -@macro email{mail} -@uref{mailto:\mail\, , \mail\} -@end macro - -@macro url{url} -@uref{\url\} -@end macro -@end ifhtml - -@c --------------------------------------------------------------------- -@c References to the other half of the manual -@c --------------------------------------------------------------------- - -@macro usingref{node, name} -@ref{\node\, , \name\} -@end macro - -@c --------------------------------------------------------------------- -@c End of macro section -@c --------------------------------------------------------------------- - -@set UPDATED 18 June 2018 -@set UPDATED-MONTH June 2018 -@set EDITION 2.1.2 -@set VERSION 2.1.2 - -@ifnottex -@dircategory Software 
development -@direntry -* lightning: (lightning).       Library for dynamic code generation. -@end direntry -@end ifnottex - -@ifnottex -@node Top -@top @lightning{} - -@iftex -@macro comma -@verbatim{|,|} -@end macro -@end iftex - -@ifnottex -@macro comma -@verb{|,|} -@end macro -@end ifnottex - -This document describes @value{TOPIC} the @lightning{} library for -dynamic code generation. - -@menu -* Overview::                What GNU lightning is -* Installation::            Configuring and installing GNU lightning -* The instruction set::     The RISC instruction set used in GNU lightning -* GNU lightning examples::  GNU lightning's examples -* Reentrancy::              Re-entrant usage of GNU lightning -* Customizations::          Advanced code generation customizations -* Acknowledgements::        Acknowledgements for GNU lightning -@end menu -@end ifnottex - -@node Overview -@chapter Introduction to @lightning{} - -@iftex -This document describes @value{TOPIC} the @lightning{} library for -dynamic code generation. -@end iftex - -Dynamic code generation is the generation of machine code  -at runtime. It is typically used to strip a layer of interpretation  -by allowing compilation to occur at runtime.  One of the most -well-known applications of dynamic code generation is perhaps that -of interpreters that compile source code to an intermediate bytecode -form, which is then recompiled to machine code at run-time: this -approach effectively combines the portability of bytecode -representations with the speed of machine code.  Another common -application of dynamic code generation is in the field of hardware -simulators and binary emulators, which can use the same techniques -to translate simulated instructions to the instructions of the  -underlying machine. - -Yet other applications come to mind: for example, windowing -@dfn{bitblt} operations, matrix manipulations, and network packet -filters.  
Although very powerful and relatively well known within the -compiler community, dynamic code generation techniques are rarely -exploited to their full potential and, with the exception of the -two applications described above, have remained curiosities because -of their portability and functionality barriers: binary instructions -are generated, so programs using dynamic code generation must be -retargeted for each machine; in addition, coding a run-time code -generator is a tedious and error-prone task more than a difficult one. - -@lightning{} provides a portable, fast and easily retargetable dynamic -code generation system.  - -To be portable, @lightning{} abstracts over current architectures' -quirks and unorthogonalities.  The interface that it exposes is that -of a standardized RISC architecture loosely based on the SPARC and MIPS -chips.  There are a few general-purpose registers (six, not including -those used to receive and pass parameters between subroutines), and -arithmetic operations involve three operands---either three registers -or two registers and an arbitrarily sized immediate value. - -On one hand, this architecture is general enough that it is possible to -generate pretty efficient code even on CISC architectures such as the -Intel x86 or the Motorola 68k families.  On the other hand, it matches -real architectures closely enough that, most of the time, the -compiler's constant folding pass ends up generating code which -assembles machine instructions without further tests. 
- -@node Installation -@chapter Configuring and installing @lightning{} - -The first thing to do to use @lightning{} is to configure the -program, picking the set of macros to be used on the host -architecture; this configuration is automatically performed by -the @file{configure} shell script; to run it, merely type: -@example -     ./configure -@end example - -@lightning{} supports the @code{--enable-disassembler} option, which -enables linking to GNU binutils and optionally printing a human-readable -disassembly of the jit code. This option can be disabled by the -@code{--disable-disassembler} option. - -Another option that @file{configure} accepts is -@code{--enable-assertions}, which enables several consistency checks in -the run-time assemblers.  These are not usually needed, so you can -decide to simply forget about them; also remember that these consistency -checks tend to slow down your code generator. - -After you've configured @lightning{}, run @file{make} as usual. - -@lightning{} has an extensive set of tests to validate that it is working -correctly on the build host. To test it, run: -@example -    make check -@end example - -The next important step is: -@example -    make install -@end example - -This ends the process of installing @lightning{}. - -@node The instruction set -@chapter @lightning{}'s instruction set - -@lightning{}'s instruction set was designed by deriving instructions -that closely match those of most existing RISC architectures, or -that can be easily synthesized if absent.  Each instruction is composed -of: -@itemize @bullet -@item -an operation, like @code{sub} or @code{mul} - -@item -most times, a register/immediate flag (@code{r} or @code{i}) - -@item -an unsigned modifier (@code{u}), a type identifier or two, when applicable. -@end itemize - -Examples of legal mnemonics are @code{addr} (integer add, with three -register operands) and @code{muli} (integer multiply, with two -register operands and an immediate operand).  
Each instruction takes -two or three operands; in most cases, one of them can be an immediate -value instead of a register. - -Most @lightning{} integer operations are signed wordsize operations, -with the exception of operations that convert types, or load or store -values to/from memory. When applicable, the types and C types are as -follows: - -@example -     _c         @r{signed char} -     _uc        @r{unsigned char} -     _s         @r{short} -     _us        @r{unsigned short} -     _i         @r{int} -     _ui        @r{unsigned int} -     _l         @r{long} -     _f         @r{float} -     _d         @r{double} -@end example - -Most integer operations do not need a type modifier, and when loading or -storing values to memory there is an alias to the proper operation -using wordsize operands, that is, if omitted, the type is @r{int} on -32-bit architectures and @r{long} on 64-bit architectures.  Note -that lightning also expects @code{sizeof(void*)} to match the wordsize. - -When an unsigned operation result differs from the equivalent signed -operation, there is the @code{_u} modifier. - -There are at least seven integer registers, of which six are -general-purpose, while the last is used to contain the frame pointer -(@code{FP}).  The frame pointer can be used to allocate and access local -variables on the stack, using the @code{allocai} or @code{allocar} -instruction. - -Of the general-purpose registers, at least three are guaranteed to be -preserved across function calls (@code{V0}, @code{V1} and -@code{V2}) and at least three are not (@code{R0}, @code{R1} and -@code{R2}).  Six registers are not very much, but this -restriction was forced by the need to target CISC architectures -which, like the x86, are short of registers; anyway, backends can -specify the actual number of available registers with the macros -@code{JIT_R_NUM} (for caller-save registers) and @code{JIT_V_NUM} -(for callee-save registers). 
- -There are at least six floating-point registers, named @code{F0} to -@code{F5}.  These are usually caller-save and are separate from the integer -registers on the supported architectures; on Intel architectures, -in 32 bit mode if SSE2 is not available or use of X87 is forced, -the register stack is mapped to a flat register file.  As for the -integer registers, the macro @code{JIT_F_NUM} yields the number of -floating-point registers. - -The complete instruction set follows; as you can see, most non-memory -operations only take integers (either signed or unsigned) as operands; -this was done in order to reduce the instruction set, and because most -architectures only provide word and long word operations on registers. -There are instructions that allow operands to be extended to fit a larger -data type, both in a signed and in an unsigned way. - -@table @b -@item Binary ALU operations -These accept three operands; the last one can be an immediate. -@code{addx} operations must directly follow @code{addc}, and -@code{subx} must follow @code{subc}; otherwise, results are undefined. -Most, if not all, architectures do not support @r{float} or @r{double} -immediate operands; lightning emulates those operations by moving the -immediate to a temporary register and emitting the call with only -register operands. 
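The pairing of @code{addc} with @code{addx} can be modelled in plain C. The following is a minimal sketch and not @lightning{} code: the helper name `model_add_2words` and the 32-bit word size are assumptions made for illustration. The first addition plays the role of `addcr` (add and set carry) and the second the role of `addxr` (add with carry), which is why nothing that changes the carry may come between the two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an addcr/addxr pair on 32-bit words.
   Plain C for illustration only; it emits no machine code. */
static void model_add_2words(uint32_t *lo, uint32_t *hi,
                             uint32_t a_lo, uint32_t a_hi,
                             uint32_t b_lo, uint32_t b_hi)
{
    uint32_t sum;
    uint32_t carry;

    assert(lo && hi);

    /* addcr: O1 = O2 + O3, set carry */
    sum = a_lo + b_lo;            /* wraps modulo 2^32 */
    carry = (sum < a_lo) ? 1u : 0u;

    /* addxr: O1 = O2 + (O3 + carry) */
    *lo = sum;
    *hi = a_hi + (b_hi + carry);
}
```

Adding 1 to the two-word value (hi = 0, lo = 0xffffffff) with this model yields hi = 1 and lo = 0, the carry having travelled from the low word to the high word.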
-@example -addr         _f  _d  O1 = O2 + O3 -addi         _f  _d  O1 = O2 + O3 -addxr                O1 = O2 + (O3 + carry) -addxi                O1 = O2 + (O3 + carry) -addcr                O1 = O2 + O3, set carry -addci                O1 = O2 + O3, set carry -subr         _f  _d  O1 = O2 - O3 -subi         _f  _d  O1 = O2 - O3 -subxr                O1 = O2 - (O3 + carry) -subxi                O1 = O2 - (O3 + carry) -subcr                O1 = O2 - O3, set carry -subci                O1 = O2 - O3, set carry -rsbr         _f  _d  O1 = O3 - O2 -rsbi         _f  _d  O1 = O3 - O2 -mulr         _f  _d  O1 = O2 * O3 -muli         _f  _d  O1 = O2 * O3 -divr     _u  _f  _d  O1 = O2 / O3 -divi     _u  _f  _d  O1 = O2 / O3 -remr     _u          O1 = O2 % O3 -remi     _u          O1 = O2 % O3 -andr                 O1 = O2 & O3 -andi                 O1 = O2 & O3 -orr                  O1 = O2 | O3 -ori                  O1 = O2 | O3 -xorr                 O1 = O2 ^ O3 -xori                 O1 = O2 ^ O3 -lshr                 O1 = O2 << O3 -lshi                 O1 = O2 << O3 -rshr     _u          O1 = O2 >> O3@footnote{The sign bit is propagated unless using the @code{_u} modifier.} -rshi     _u          O1 = O2 >> O3@footnote{The sign bit is propagated unless using the @code{_u} modifier.} -@end example - -@item Four operand binary ALU operations -These accept two result registers, and two operands; the last one can -be an immediate. The first two arguments cannot be the same register. - -@code{qmul} stores the low word of the result in @code{O1} and the -high word in @code{O2}. For unsigned multiplication, @code{O2} zero -means there was no overflow. For signed multiplication, the absence of -overflow depends on the sign of the result, and can be detected by -checking that @code{O2} is zero or minus one. - -@code{qdiv} stores the quotient in @code{O1} and the remainder in -@code{O2}. It can be used as a quick way to check if a division is -exact, in which case the remainder is zero. 
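The two-result behaviour of @code{qmul} and @code{qdiv} can be sketched in plain C. This is an illustrative model under stated assumptions, not @lightning{} code: the `model_` names are hypothetical and a 32-bit word is assumed. For the signed overflow test the sketch compares the high word with the sign extension of the low word, which is a slightly stricter reading of "zero or minus one" than the text spells out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of qmulr on a 32-bit word:
   low word of the product in *lo, high word in *hi. */
static void model_qmulr(int32_t *lo, int32_t *hi, int32_t a, int32_t b)
{
    uint64_t wide = (uint64_t)((int64_t)a * (int64_t)b);

    assert(lo && hi);
    /* Converting an out-of-range value to int32_t is implementation-defined;
       mainstream compilers wrap modulo 2^32, which is what is assumed here. */
    *lo = (int32_t)(uint32_t)(wide & 0xffffffffu);
    *hi = (int32_t)(uint32_t)(wide >> 32);
}

/* A signed product fits in one word when the high word is the
   sign extension of the low word (0 or -1 accordingly). */
static int model_qmul_fits(int32_t lo, int32_t hi)
{
    return hi == (lo < 0 ? -1 : 0);
}

/* Hypothetical model of qdivr: quotient in *quot, remainder in *rem. */
static void model_qdivr(int32_t *quot, int32_t *rem, int32_t a, int32_t b)
{
    assert(quot && rem);
    assert(b != 0);                        /* division by zero is not modelled */
    assert(!(a == INT32_MIN && b == -1));  /* quotient would not fit */
    *quot = a / b;
    *rem = a % b;
}
```

With this model, 3 times -4 gives a high word of -1 and fits in one word, while 65536 times 65536 gives a high word of 1 and does not.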
- -@example -qmulr    _u       O1 O2 = O3 * O4 -qmuli    _u       O1 O2 = O3 * O4 -qdivr    _u       O1 O2 = O3 / O4 -qdivi    _u       O1 O2 = O3 / O4 -@end example - -@item Unary ALU operations -These accept two operands, both of which must be registers. -@example -negr         _f  _d  O1 = -O2 -comr                 O1 = ~O2 -@end example - -These unary ALU operations are only defined for float operands. -@example -absr         _f  _d  O1 = fabs(O2) -sqrtr                O1 = sqrt(O2) -@end example - -Besides requiring the @code{r} modifier, there are no unary operations -with an immediate operand. - -@item Compare instructions -These accept three operands; again, the last can be an immediate. -The last two operands are compared, and the first operand, that must be -an integer register, is set to either 0 or 1, according to whether the -given condition was met or not. - -The conditions given below are for the standard behavior of C, -where the ``unordered'' comparison result is mapped to false. 
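The effect of an ``unordered'' operand can be seen by writing a few of the conditions as C expressions. This is a minimal sketch in plain C: the `model_` function names are hypothetical and only mirror the right-hand sides listed in the table for the @code{_d} variants. When either operand is a NaN, the ordered comparison yields 0 while its @code{un} counterpart yields 1.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical C models of some double comparisons from the table. */
static int model_ltr_d(double a, double b)    { return a < b; }
static int model_unltr_d(double a, double b)  { return !(a >= b); }
static int model_uneqr_d(double a, double b)  { return !(a < b) && !(a > b); }
static int model_ordr_d(double a, double b)   { return (a == a) && (b == b); }
static int model_unordr_d(double a, double b) { return (a != a) || (b != b); }

/* Usage: a NaN operand makes every ordered test false. */
static void demo_unordered(void)
{
    const double q = NAN;   /* quiet NaN from <math.h> */

    assert(model_ltr_d(q, 1.0) == 0);     /* ordered: false */
    assert(model_unltr_d(q, 1.0) == 1);   /* unordered-or-less: true */
    assert(model_uneqr_d(q, 1.0) == 1);   /* unordered-or-equal: true */
    assert(model_ordr_d(q, 1.0) == 0);
    assert(model_unordr_d(q, 1.0) == 1);
}
```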
- -@example -ltr       _u  _f  _d  O1 =  (O2 <  O3) -lti       _u  _f  _d  O1 =  (O2 <  O3) -ler       _u  _f  _d  O1 =  (O2 <= O3) -lei       _u  _f  _d  O1 =  (O2 <= O3) -gtr       _u  _f  _d  O1 =  (O2 >  O3) -gti       _u  _f  _d  O1 =  (O2 >  O3) -ger       _u  _f  _d  O1 =  (O2 >= O3) -gei       _u  _f  _d  O1 =  (O2 >= O3) -eqr           _f  _d  O1 =  (O2 == O3) -eqi           _f  _d  O1 =  (O2 == O3) -ner           _f  _d  O1 =  (O2 != O3) -nei           _f  _d  O1 =  (O2 != O3) -unltr         _f  _d  O1 = !(O2 >= O3) -unler         _f  _d  O1 = !(O2 >  O3) -ungtr         _f  _d  O1 = !(O2 <= O3) -unger         _f  _d  O1 = !(O2 <  O3) -uneqr         _f  _d  O1 = !(O2 <  O3) && !(O2 >  O3) -ltgtr         _f  _d  O1 = !(O2 >= O3) || !(O2 <= O3) -ordr          _f  _d  O1 =  (O2 == O2) &&  (O3 == O3) -unordr        _f  _d  O1 =  (O2 != O2) ||  (O3 != O3) -@end example - -@item Transfer operations -These accept two operands; for @code{ext} both of them must be -registers, while @code{mov} accepts an immediate value as the second -operand. - -Unlike @code{movr} and @code{movi}, the other instructions are used -to truncate a wordsize operand to a smaller integer data type or to -convert float data types. You can also use @code{extr} to convert an -integer to a floating point value: the usual options are @code{extr_f} -and @code{extr_d}. - -@example -movr                                 _f  _d  O1 = O2 -movi                                 _f  _d  O1 = O2 -extr      _c  _uc  _s  _us  _i  _ui  _f  _d  O1 = O2 -truncr                               _f  _d  O1 = trunc(O2) -@end example - -In 64-bit architectures it may be required to use @code{truncr_f_i}, -@code{truncr_f_l}, @code{truncr_d_i} and @code{truncr_d_l} to match -the equivalent C code.  Only the @code{_i} modifier is available in -32-bit architectures. 
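What the integer @code{extr} variants and @code{truncr} compute can be modelled in plain C. This is a sketch under assumptions, not @lightning{} code: the `model_` names are hypothetical, a `long` stands in for a wordsize register, and `int` is assumed to be 32 bits wide. The signed variants keep the low bits and replicate the top one; the unsigned variants keep the low bits and clear the rest; the truncation rounds toward zero as a C cast does.

```c
#include <assert.h>

/* extr_c: keep the low 8 bits and sign-extend them. */
static long model_extr_c(long v)
{
    long low = v & 0xff;
    return (low ^ 0x80) - 0x80;
}

/* extr_uc: keep the low 8 bits and zero-extend them. */
static long model_extr_uc(long v)
{
    return v & 0xff;
}

/* extr_s: keep the low 16 bits and sign-extend them. */
static long model_extr_s(long v)
{
    long low = v & 0xffff;
    return (low ^ 0x8000) - 0x8000;
}

/* extr_us: keep the low 16 bits and zero-extend them. */
static long model_extr_us(long v)
{
    return v & 0xffff;
}

/* truncr_d_i: convert a double to int, rounding toward zero. */
static int model_truncr_d_i(double d)
{
    /* Out-of-range conversions are undefined in C, so they are excluded. */
    assert(d > -2147483649.0 && d < 2147483648.0);
    return (int)d;
}
```

For instance, the word 0x1ff becomes -1 under the `_c` model and 255 under the `_uc` model, and -2.7 truncates to -2.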
- -@example -truncr_f_i    = <int> O1 = <float> O2 -truncr_f_l    = <long>O1 = <float> O2 -truncr_d_i    = <int> O1 = <double>O2 -truncr_d_l    = <long>O1 = <double>O2 -@end example - -The float conversion operations are @emph{destination first, -source second}, but the order of the types is reversed.  This happens -for historical reasons. - -@example -extr_f_d    = <double>O1 = <float> O2 -extr_d_f    = <float> O1 = <double>O2 -@end example - -@item Network extensions -These accept two operands, both of which must be registers; these -two instructions actually perform the same task, yet they are -assigned to two mnemonics for the sake of convenience and -completeness.  As usual, the first operand is the destination and -the second is the source. -The @code{_ul} variant is only available in 64-bit architectures. -@example -htonr    _us _ui _ul @r{Host-to-network (big endian) order} -ntohr    _us _ui _ul @r{Network-to-host order } -@end example - -@item Load operations -@code{ld} accepts two operands while @code{ldx} accepts three; -in both cases, the last can be either a register or an immediate -value. Values are extended (with or without sign, according to -the data type specification) to fit a whole register. -The @code{_ui} and @code{_l} types are only available in 64-bit -architectures.  For convenience, there is a version without a -type modifier for integer or pointer operands that uses the -appropriate wordsize call. -@example -ldr     _c  _uc  _s  _us  _i  _ui  _l  _f  _d  O1 = *O2 -ldi     _c  _uc  _s  _us  _i  _ui  _l  _f  _d  O1 = *O2 -ldxr    _c  _uc  _s  _us  _i  _ui  _l  _f  _d  O1 = *(O2+O3) -ldxi    _c  _uc  _s  _us  _i  _ui  _l  _f  _d  O1 = *(O2+O3) -@end example - -@item Store operations -@code{st} accepts two operands while @code{stx} accepts three; in -both cases, the first can be either a register or an immediate -value. Values are sign-extended to fit a whole register. 
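How the type modifier affects an indexed store followed by an indexed load can be sketched in plain C. This is an illustrative model, not @lightning{} code: the `model_` names are hypothetical, a `long` stands in for a wordsize register, and offsets are in bytes. The argument order follows the operand order of the corresponding instructions (offset, base, value for the store; base and offset for the loads, with the destination as the return value).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* stxi_s model: *(offset + base) = low 16 bits of value. */
static void model_stxi_s(long offset, unsigned char *base, long value)
{
    uint16_t half = (uint16_t)(value & 0xffff);

    assert(base != NULL);
    memcpy(base + offset, &half, sizeof half);
}

/* ldxi_s model: load 16 bits and sign-extend to a full word. */
static long model_ldxi_s(const unsigned char *base, long offset)
{
    uint16_t half;

    assert(base != NULL);
    memcpy(&half, base + offset, sizeof half);
    return ((long)half ^ 0x8000) - 0x8000;
}

/* ldxi_us model: load 16 bits and zero-extend to a full word. */
static long model_ldxi_us(const unsigned char *base, long offset)
{
    uint16_t half;

    assert(base != NULL);
    memcpy(&half, base + offset, sizeof half);
    return (long)half;
}
```

Storing the word 0x18001 with the `_s` model keeps only 0x8001 in memory; reading it back gives -32767 with the signed load and 32769 with the unsigned one.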
-@example -str     _c  _uc  _s  _us  _i  _ui  _l  _f  _d  *O1 = O2 -sti     _c  _uc  _s  _us  _i  _ui  _l  _f  _d  *O1 = O2 -stxr    _c  _uc  _s  _us  _i  _ui  _l  _f  _d  *(O1+O2) = O3 -stxi    _c  _uc  _s  _us  _i  _ui  _l  _f  _d  *(O1+O2) = O3 -@end example -As for the load operations, the @code{_ui} and @code{_l} types are -only available in 64-bit architectures, and for convenience, there -is a version without a type modifier for integer or pointer operands -that uses the appropriate wordsize call. - -@item Argument management -These are: -@example -prepare     (not specified) -va_start    (not specified) -pushargr                                   _f  _d -pushargi                                   _f  _d -va_push     (not specified) -arg         _c  _uc  _s  _us  _i  _ui  _l  _f  _d -getarg      _c  _uc  _s  _us  _i  _ui  _l  _f  _d -va_arg                                         _d -putargr                                    _f  _d -putargi                                    _f  _d -ret         (not specified) -retr                                       _f  _d -reti                                       _f  _d -va_end      (not specified) -retval      _c  _uc  _s  _us  _i  _ui  _l  _f  _d -epilog      (not specified) -@end example -As with other operations that use a type modifier, the @code{_ui} and -@code{_l} types are only available in 64-bit architectures, but there -are operations without a type modifier that alias to the appropriate -integer operation with wordsize operands. - -@code{prepare}, @code{pusharg}, and @code{retval} are used by the caller, -while @code{arg}, @code{getarg} and @code{ret} are used by the callee. -A code snippet that wants to call another procedure and has to pass -arguments must, in order: use the @code{prepare} instruction and use -the @code{pushargr} or @code{pushargi} to push the arguments @strong{in -left to right order}; and use @code{finish} or @code{call} (explained below) -to perform the actual call. 
- -@code{va_start} returns a @code{C} compatible @code{va_list}. To fetch -arguments, use @code{va_arg} for integers and @code{va_arg_d} for doubles. -@code{va_push} is required when passing a @code{va_list} to another function, -because not all architectures expect it as a single pointer. A known case -is DEC Alpha, which requires it as a structure passed by value. - -@code{arg}, @code{getarg} and @code{putarg} are used by the callee. -@code{arg} is different from other instructions in that it does not -actually generate any code: instead, it is a function which returns -a value to be passed to @code{getarg} or @code{putarg}. @footnote{``Return -a value'' means that @lightning{} code that compiles these -instructions returns a value when expanded.} You should call -@code{arg} as soon as possible, before any function call or, more -easily, right after the @code{prolog} instruction -(which is treated later). - -@code{getarg} accepts a register argument and a value returned by -@code{arg}, and will move that argument to the register, extending -it (with or without sign, according to the data type specification) -to fit a whole register.  These instructions are more intimately -related to the usage of the @lightning{} instruction set in code -that generates other code, so they will be treated more -specifically in @ref{GNU lightning examples, , Generating code at -run-time}. - -@code{putarg} is a mix of @code{getarg} and @code{pusharg} in that -it accepts as first argument a register or immediate, and as -second argument a value returned by @code{arg}. It allows changing, -or restoring an argument to the current function, and is a -construct required to implement tail call optimization. Note that -arguments in registers are very cheap, but will be overwritten -at any moment, including on some operations, for example division, -that on several ports is implemented as a function call. 
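Why @code{putarg} is enough for tail call optimization can be seen from a plain C model of the idea. This is a sketch and not @lightning{} code: the function name `model_fact` is hypothetical, and the comments map each statement to the instruction that plays the same role in the factorial listing given later under tail call optimization. Rewriting the arguments in place and jumping back to the entry point turns the recursive call into a loop.

```c
#include <assert.h>

/* Hypothetical model of a tail call optimized factorial:
   ac is the accumulator argument, n is the factorial argument. */
static long model_fact(long ac, long n)
{
    assert(n <= 12);      /* keep the result within a 32-bit long */

    for (;;) {            /* tail call entry point */
        if (n <= 1)       /* blei: done if argument is one or less */
            return ac;    /* retr: return the accumulator */
        ac *= n;          /* mulr, then putargr: update the accumulator */
        n -= 1;           /* subi, then putargr: update the argument */
                          /* jmpi: jump back instead of calling again */
    }
}
```

Calling the model with an accumulator of 1 and an argument of 5 returns 120 without growing the stack.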
- -Finally, the @code{retval} instruction fetches the return value of a -called function in a register.  The @code{retval} instruction takes a -register argument and copies the return value of the previously called -function in that register.  A function with a return value should use -@code{retr} or @code{reti} to put the return value in the return register -before returning.  @xref{Fibonacci, the Fibonacci numbers}, for an example. - -@code{epilog} is an optional call, that marks the end of a function -body. It is automatically generated by @lightning{} if starting a new -function (what should be done after a @code{ret} call) or finishing -generating jit. -It is very important to note that the fact that @code{epilog} being -optional may cause a common mistake. Consider this: -@example -fun1: -    prolog -    ... -    ret -fun2: -    prolog -@end example -Because @code{epilog} is added when finding a new @code{prolog}, -this will cause the @code{fun2} label to actually be before the -return from @code{fun1}. Because @lightning{} will actually -understand it as: -@example -fun1: -    prolog -    ... -    ret -fun2: -    epilog -    prolog -@end example - -You should observe a few rules when using these macros.  First of -all, if calling a varargs function, you should use the @code{ellipsis} -call to mark the position of the ellipsis in the C prototype. - -You should not nest calls to @code{prepare} inside a -@code{prepare/finish} block.  Doing this will result in undefined -behavior. Note that for functions with zero arguments you can use -just @code{call}. - -@item Branch instructions -Like @code{arg}, these also return a value which, in this case, -is to be used to compile forward branches as explained in -@ref{Fibonacci, , Fibonacci numbers}.  They accept two operands to be -compared; of these, the last can be either a register or an immediate. 
-They are: -@example -bltr      _u  _f  _d  @r{if }(O2 <  O3)@r{ goto }O1 -blti      _u  _f  _d  @r{if }(O2 <  O3)@r{ goto }O1 -bler      _u  _f  _d  @r{if }(O2 <= O3)@r{ goto }O1 -blei      _u  _f  _d  @r{if }(O2 <= O3)@r{ goto }O1 -bgtr      _u  _f  _d  @r{if }(O2 >  O3)@r{ goto }O1 -bgti      _u  _f  _d  @r{if }(O2 >  O3)@r{ goto }O1 -bger      _u  _f  _d  @r{if }(O2 >= O3)@r{ goto }O1 -bgei      _u  _f  _d  @r{if }(O2 >= O3)@r{ goto }O1 -beqr          _f  _d  @r{if }(O2 == O3)@r{ goto }O1 -beqi          _f  _d  @r{if }(O2 == O3)@r{ goto }O1 -bner          _f  _d  @r{if }(O2 != O3)@r{ goto }O1 -bnei          _f  _d  @r{if }(O2 != O3)@r{ goto }O1 - -bunltr        _f  _d  @r{if }!(O2 >= O3)@r{ goto }O1 -bunler        _f  _d  @r{if }!(O2 >  O3)@r{ goto }O1 -bungtr        _f  _d  @r{if }!(O2 <= O3)@r{ goto }O1 -bunger        _f  _d  @r{if }!(O2 <  O3)@r{ goto }O1 -buneqr        _f  _d  @r{if }!(O2 <  O3) && !(O2 >  O3)@r{ goto }O1 -bltgtr        _f  _d  @r{if }!(O2 >= O3) || !(O2 <= O3)@r{ goto }O1 -bordr         _f  _d  @r{if } (O2 == O2) &&  (O3 == O3)@r{ goto }O1 -bunordr       _f  _d  @r{if } (O2 != O2) ||  (O3 != O3)@r{ goto }O1 - -bmsr                  @r{if }O2 &  O3@r{ goto }O1 -bmsi                  @r{if }O2 &  O3@r{ goto }O1 -bmcr                  @r{if }!(O2 & O3)@r{ goto }O1 -bmci                  @r{if }!(O2 & O3)@r{ goto }O1@footnote{These mnemonics mean, respectively, @dfn{branch if mask set} and @dfn{branch if mask cleared}.} -boaddr    _u          O2 += O3@r{, goto }O1@r{ if overflow} -boaddi    _u          O2 += O3@r{, goto }O1@r{ if overflow} -bxaddr    _u          O2 += O3@r{, goto }O1@r{ if no overflow} -bxaddi    _u          O2 += O3@r{, goto }O1@r{ if no overflow} -bosubr    _u          O2 -= O3@r{, goto }O1@r{ if overflow} -bosubi    _u          O2 -= O3@r{, goto }O1@r{ if overflow} -bxsubr    _u          O2 -= O3@r{, goto }O1@r{ if no overflow} -bxsubi    _u          O2 -= O3@r{, goto }O1@r{ if no overflow} -@end example - -@item Jump and 
return operations -These accept one argument except @code{ret} and @code{jmpi} which -have none; the difference between @code{finishi} and @code{calli} -is that the latter does not clean the stack from pushed parameters -(if any) and the former must @strong{always} follow a @code{prepare} -instruction. -@example -callr     (not specified)                @r{function call to register O1} -calli     (not specified)                @r{function call to immediate O1} -finishr   (not specified)                @r{function call to register O1} -finishi   (not specified)                @r{function call to immediate O1} -jmpr      (not specified)                @r{unconditional jump to register} -jmpi      (not specified)                @r{unconditional jump} -ret       (not specified)                @r{return from subroutine} -retr      _c _uc _s _us _i _ui _l _f _d -reti      _c _uc _s _us _i _ui _l _f _d -retval    _c _uc _s _us _i _ui _l _f _d  @r{move return value} -                                         @r{to register} -@end example - -Like branch instruction, @code{jmpi} also returns a value which is to -be used to compile forward branches. @xref{Fibonacci, , Fibonacci -numbers}. - -@item Labels -There are 3 @lightning{} instructions to create labels: -@example -label     (not specified)                @r{simple label} -forward   (not specified)                @r{forward label} -indirect  (not specified)                @r{special simple label} -@end example - -@code{label} is normally used as @code{patch_at} argument for backward -jumps. - -@example -        jit_node_t *jump, *label; -label = jit_label(); -        ... -        jump = jit_beqr(JIT_R0, JIT_R1); -        jit_patch_at(jump, label); -@end example - -@code{forward} is used to patch code generation before the actual -position of the label is known. - -@example -        jit_node_t *jump, *label; -label = jit_forward(); -        jump = jit_beqr(JIT_R0, JIT_R1); -        jit_patch_at(jump, label); -        ... 
-        jit_link(label); -@end example - -@code{indirect} is useful when creating jump tables, and tells -@lightning{} to not optimize out a label that is not the target of -any jump, because an indirect jump may land where it is defined. - -@example -        jit_node_t *jump, *label; -        ... -        jit_jmpr(JIT_R0);                @rem{/* may jump to label */} -        ... -label = jit_indirect(); -@end example - -@code{indirect} is a special case of @code{note} and @code{name} -because it is a valid argument to @code{address}. - -Note that the usual idiom to write the previous example is -@example -        jit_node_t *addr, *jump; -addr  = jit_movi(JIT_R0, 0);             @rem{/* immediate is ignored */} -        ... -        jit_jmpr(JIT_R0); -        ... -        jit_patch(addr);                 @rem{/* implicit label added */} -@end example - -that automatically binds the implicit label added by @code{patch} with -the @code{movi}, but on some special conditions it is required to create -an "unbound" label. - -@item Function prolog - -These macros are used to set up a function prolog.  The @code{allocai} -call accepts a single integer argument and returns an offset value -for stack storage access.  The @code{allocar} call accepts two register -arguments: the first is set to the offset for stack access, and the -second is the size in bytes. - -@example -prolog    (not specified)                @r{function prolog} -allocai   (not specified)                @r{reserve space on the stack} -allocar   (not specified)                @r{allocate space on the stack} -@end example - -@code{allocai} receives the number of bytes to allocate and returns -the offset from the frame pointer register @code{FP} to the base of -the area. - -@code{allocar} receives two register arguments.  The first is where -to store the offset from the frame pointer register @code{FP} to the -base of the area.  The second argument is the size in bytes.  
Note -that @code{allocar} is dynamic allocation, and special attention -should be taken when using it.  If called in a loop, every iteration -will allocate stack space.  Stack space is aligned from 8 to 64 bytes -depending on backend requirements, even if allocating only one byte. -It is advisable to not use it with @code{frame} and @code{tramp}; it -should work with @code{frame} with special care to call only once, -but is not supported if used in @code{tramp}, even if called only -once. - -As a small appetizer, here is a small function that adds 1 to the input -parameter (an @code{int}).  I'm using an assembly-like syntax here which -is a bit different from the one used when writing real subroutines with -@lightning{}; the real syntax will be introduced in @xref{GNU lightning -examples, , Generating code at run-time}. - -@example -incr: -     prolog -in = arg                     @rem{! We have an integer argument} -     getarg    R0, in        @rem{! Move it to R0} -     addi      R0, R0, 1     @rem{! Add 1} -     retr      R0            @rem{! And return the result} -@end example - -And here is another function which uses the @code{printf} function from -the standard C library to write a number in hexadecimal notation: - -@example -printhex: -     prolog -in = arg                     @rem{! Same as above} -     getarg    R0, in -     prepare                 @rem{! Begin call sequence for printf} -     pushargi  "%x"          @rem{! Push format string} -     ellipsis                @rem{! Varargs start here} -     pushargr  R0            @rem{! Push second argument} -     finishi   printf        @rem{! Call printf} -     ret                     @rem{! Return to caller} -@end example - -@item Trampolines, continuations and tail call optimization - -Frequently it is required to generate jit code that must jump to -code generated later, possibly from another @code{jit_context_t}. -These require compatible stack frames. 
- -@lightning{} provides two primitives from where trampolines, -continuations and tail call optimization can be implemented. - -@example -frame   (not specified)                  @r{create stack frame} -tramp   (not specified)                  @r{assume stack frame} -@end example - -@code{frame} receives an integer argument@footnote{It is not -automatically computed because it does not know about the -requirement of later generated code.} that defines the size in -bytes for the stack frame of the current, @code{C} callable, -jit function. To calculate this value, a good formula is maximum -number of arguments to any called native function times -eight@footnote{Times eight so that it works for double arguments. -And would not need conditionals for ports that pass arguments in -the stack.}, plus the sum of the arguments to any call to -@code{jit_allocai}. @lightning{} automatically adjusts this value -for any backend specific stack memory it may need, or any -alignment constraint. - -@code{frame} also instructs @lightning{} to save all callee -save registers in the prolog and reload in the epilog. - -@example -main:                        @rem{! jit entry point} -     prolog                  @rem{! function prolog} -     frame  256              @rem{! save all callee save registers and} -                             @rem{! reserve at least 256 bytes in stack} -main_loop: -     ... -     jmpi   handler          @rem{! jumps to external code} -     ... -     ret                     @rem{! return to the caller} -@end example - -@code{tramp} differs from @code{frame} only that a prolog and epilog -will not be generated. Note that @code{prolog} must still be used. -The code under @code{tramp} must be ready to be entered with a jump -at the prolog position, and instead of a return, it must end with -a non conditional jump. @code{tramp} exists solely for the fact -that it allows optimizing out prolog and epilog code that would -never be executed. 
- -@example -handler:                     @rem{! handler entry point} -     prolog                  @rem{! function prolog} -     tramp  256              @rem{! assumes all callee save registers} -                             @rem{! are saved and there is at least} -                             @rem{! 256 bytes in stack} -     ... -     jmpi   main_loop        @rem{! return to the main loop} -@end example - -@lightning{} only supports Tail Call Optimization using the -@code{tramp} construct. Any other way is not guaranteed to -work on all ports. - -An example of a simple (recursive) tail call optimization: - -@example -factorial:                   @rem{! Entry point of the factorial function} -     prolog -in = arg                     @rem{! Receive an integer argument} -     getarg R0, in           @rem{! Move argument to R0} -     prepare -         pushargi 1          @rem{! This is the accumulator} -         pushargr R0         @rem{! This is the argument} -     finishi fact            @rem{! Call the tail call optimized function} -     retval R0               @rem{! Fetch the result} -     retr R0                 @rem{! Return it} -     epilog                  @rem{! Epilog *before* label before prolog} - -fact:                        @rem{! Entry point of the helper function} -     prolog -     frame 16                @rem{! Reserve 16 bytes in the stack} -fact_entry:                  @rem{! This is the tail call entry point} -ac = arg                     @rem{! The accumulator is the first argument} -in = arg                     @rem{! The factorial argument} -     getarg R0, ac           @rem{! Move the accumulator to R0} -     getarg R1, in           @rem{! Move the argument to R1} -     blei fact_out, R1, 1    @rem{! Done if argument is one or less} -     mulr R0, R0, R1         @rem{! accumulator *= argument} -     putargr R0, ac          @rem{! Update the accumulator} -     subi R1, R1, 1          @rem{! 
argument -= 1} -     putargr R1, in          @rem{! Update the argument} -     jmpi fact_entry         @rem{! Tail Call Optimize it!} -fact_out: -     retr R0                 @rem{! Return the accumulator} -@end example - -@item Predicates -@example -forward_p      (not specified)           @r{forward label predicate} -indirect_p     (not specified)           @r{indirect label predicate} -target_p       (not specified)           @r{used label predicate} -arg_register_p (not specified)           @r{argument kind predicate} -callee_save_p  (not specified)           @r{callee save predicate} -pointer_p      (not specified)           @r{pointer predicate} -@end example - -@code{forward_p} expects a @code{jit_node_t*} argument, and -returns non zero if it is a forward label reference, that is, -a label returned by @code{forward}, that still needs a -@code{link} call. - -@code{indirect_p} expects a @code{jit_node_t*} argument, and returns -non zero if it is an indirect label reference, that is, a label that -was returned by @code{indirect}. - -@code{target_p} expects a @code{jit_node_t*} argument, that is any -kind of label, and will return non zero if there is at least one -jump or move referencing it. - -@code{arg_register_p} expects a @code{jit_node_t*} argument, that must -have been returned by @code{arg}, @code{arg_f} or @code{arg_d}, and -will return non zero if the argument lives in a register. This call -is useful to know the live range of register arguments, as those -are very fast to read and write, but have volatile values. - -@code{callee_save_p} expects a valid @code{JIT_Rn}, @code{JIT_Vn}, or -@code{JIT_Fn}, and will return non zero if the register is callee -save. This call is useful because on several ports, the @code{JIT_Rn} -and @code{JIT_Fn} registers are actually callee save, so there is no need -to save and load the values when making function calls.
- -@code{pointer_p} expects a pointer argument, and will return non -zero if the pointer is inside the generated jit code. It must be -called after @code{jit_emit} and before @code{jit_destroy_state}. -@end table - -@node GNU lightning examples -@chapter Generating code at run-time - -To use @lightning{}, you should include the @file{lightning.h} file that -is put in your include directory by the @samp{make install} command. - -Each of the instructions above translates to a macro or function call. -All you have to do is prepend @code{jit_} (lowercase) to opcode names -and @code{JIT_} (uppercase) to register names.  Of course, parameters -are to be put between parentheses. - -This small tutorial presents four examples: - -@iftex -@itemize @bullet -@item -The @code{incr} function found in @ref{The instruction set, , -@lightning{}'s instruction set}: - -@item -A simple function call to @code{printf} - -@item -An RPN calculator. - -@item -Fibonacci numbers -@end itemize -@end iftex -@ifnottex -@menu -* incr::             A function which increments a number by one -* printf::           A simple function call to printf -* RPN calculator::   A more complex example, an RPN calculator -* Fibonacci::        Calculating Fibonacci numbers -@end menu -@end ifnottex - -@node incr -@section A function which increments a number by one - -Let's see how to create and use the sample @code{incr} function created -in @ref{The instruction set, , @lightning{}'s instruction set}: - -@example -#include <stdio.h> -#include <lightning.h> - -static jit_state_t *_jit; - -typedef int (*pifi)(int);    @rem{/* Pointer to Int Function of Int */} - -int main(int argc, char *argv[]) -@{ -  jit_node_t  *in; -  pifi         incr; - -  init_jit(argv[0]); -  _jit = jit_new_state(); - -  jit_prolog();                    @rem{/* @t{     prolog             } */} -  in = jit_arg();                  @rem{/* @t{     in = arg           } */} -  jit_getarg(JIT_R0, in);          @rem{/* @t{     getarg R0          
} */} -  jit_addi(JIT_R0, JIT_R0, 1);     @rem{/* @t{     addi   R0@comma{} R0@comma{} 1   } */} -  jit_retr(JIT_R0);                @rem{/* @t{     retr   R0          } */} - -  incr = jit_emit(); -  jit_clear_state(); - -  @rem{/* call the generated code@comma{} passing 5 as an argument */} -  printf("%d + 1 = %d\n", 5, incr(5)); - -  jit_destroy_state(); -  finish_jit(); -  return 0; -@} -@end example - -Let's examine the code line by line (well, almost@dots{}): - -@table @t -@item #include <lightning.h> -You already know about this.  It defines all of @lightning{}'s macros. - -@item static jit_state_t *_jit; -You might wonder about what is @code{jit_state_t}.  It is a structure -that stores jit code generation information.  The name @code{_jit} is -special, because since multiple jit generators can run at the same -time, you must either @r{#define _jit my_jit_state} or name it -@code{_jit}. - -@item typedef int (*pifi)(int); -Just a handy typedef for a pointer to a function that takes an -@code{int} and returns another. - -@item jit_node_t  *in; -Declares a variable to hold an identifier for a function argument. It -is an opaque pointer, that will hold the return of a call to @code{arg} -and be used as argument to @code{getarg}. - -@item pifi         incr; -Declares a function pointer variable to a function that receives an -@code{int} and returns an @code{int}. - -@item init_jit(argv[0]); -You must call this function before creating a @code{jit_state_t} -object. This function does global state initialization, and may need -to detect CPU or Operating System features.  It receives a string -argument that is later used to read symbols from a shared object using -GNU binutils if disassembly was enabled at configure time. If no -disassembly will be performed a NULL pointer can be used as argument. - -@item _jit = jit_new_state(); -This call initializes a @lightning{} jit state. 
- -@item jit_prolog(); -Ok, so we start generating code for our beloved function@dots{} - -@item in = jit_arg(); -@itemx jit_getarg(JIT_R0, in); -We retrieve the first (and only) argument, an integer, and store it -into the general-purpose register @code{R0}. - -@item jit_addi(JIT_R0, JIT_R0, 1); -We add one to the content of the register. - -@item jit_retr(JIT_R0); -This instruction generates a standard function epilog that returns -the contents of the @code{R0} register. - -@item incr = jit_emit(); -This instruction is very important.  It actually translates the -@lightning{} macros used before to machine code, flushes the generated -code area out of the processor's instruction cache and returns a -pointer to the start of the code. - -@item jit_clear_state(); -This call cleans up any data not required for jit execution. Note -that it must be called after any call to @code{jit_print} or -@code{jit_address}, as this call destroys the @lightning{} -intermediate representation. - -@item printf("%d + 1 = %d", 5, incr(5)); -Calling our function is this simple---it is not distinguishable from -a normal C function call, the only difference being that @code{incr} -is a variable. - -@item jit_destroy_state(); -Releases all memory associated with the jit context. It should be -called once it is known that the jit code will no longer be called. - -@item finish_jit(); -This call cleans up any global state held by @lightning{}, and it is -advisable to call it once jit code will no longer be generated. -@end table - -@lightning{} abstracts two phases of dynamic code generation: selecting -instructions that map the standard representation, and emitting binary -code for these instructions.  The client program has the responsibility -of describing the code to be generated using the standard @lightning{} -instruction set.
- -Let's examine the code generated for @code{incr} on the SPARC and x86_64 -architectures (on the right is the code that an assembly-language -programmer would write): - -@table @b -@item SPARC -@example -      save  %sp, -112, %sp -      mov  %i0, %g2                 retl -      inc  %g2                      inc %o0 -      mov  %g2, %i0 -      restore  -      retl  -      nop  -@end example -In this case, @lightning{} introduces overhead to create a register -window (not knowing that the procedure is a leaf procedure) and to -move the argument to the general purpose register @code{R0} (which -maps to @code{%g2} on the SPARC). -@end table - -@table @b -@item x86_64 -@example -    sub   $0x30,%rsp -    mov   %rbp,(%rsp) -    mov   %rsp,%rbp -    sub   $0x18,%rsp -    mov   %rdi,%rax            mov %rdi, %rax -    add   $0x1,%rax            inc %rax -    mov   %rbp,%rsp -    mov   (%rsp),%rbp -    add   $0x30,%rsp -    retq                       retq -@end example -In this case, the main overhead is due to the function's prolog and -epilog, and stack alignment after reserving stack space for word -to/from float conversions or moving data from/to x87 to/from SSE. -Note that besides allocating space to save callee saved registers, -no registers are saved/restored because @lightning{} notices those -registers are not modified. There is currently no logic to detect -whether stack space is needed for type conversions, nor -proper leaf function detection, but these are subject to change -(FIXME).
-@end table - -@node printf -@section A simple function call to @code{printf} - -Again, here is the code for the example: - -@example -#include <stdio.h> -#include <lightning.h> - -static jit_state_t *_jit; - -typedef void (*pvfi)(int);      @rem{/* Pointer to Void Function of Int */} - -int main(int argc, char *argv[]) -@{ -  pvfi          myFunction;             @rem{/* ptr to generated code */} -  jit_node_t    *start, *end;           @rem{/* a couple of labels */} -  jit_node_t    *in;                    @rem{/* to get the argument */} - -  init_jit(argv[0]); -  _jit = jit_new_state(); - -  start = jit_note(__FILE__, __LINE__); -  jit_prolog(); -  in = jit_arg(); -  jit_getarg(JIT_R1, in); -  jit_prepare(); -  jit_pushargi((jit_word_t)"generated %d bytes\n"); -  jit_ellipsis(); -  jit_pushargr(JIT_R1); -  jit_finishi(printf); -  jit_ret(); -  jit_epilog(); -  end = jit_note(__FILE__, __LINE__); - -  myFunction = jit_emit(); - -  @rem{/* call the generated code@comma{} passing its size as argument */} -  myFunction((char*)jit_address(end) - (char*)jit_address(start)); -  jit_clear_state(); - -  jit_disassemble(); - -  jit_destroy_state(); -  finish_jit(); -  return 0; -@} -@end example - -The function shows how many bytes were generated.  Most of the code -is not very interesting, as it resembles very closely the program -presented in @ref{incr, , A function which increments a number by one}. - -For this reason, we're going to concentrate on just a few statements. - -@table @t -@item start = jit_note(__FILE__, __LINE__); -@itemx @r{@dots{}} -@itemx end = jit_note(__FILE__, __LINE__); -These two statements call the @code{jit_note} macro, which creates -a note in the jit code; arguments to @code{jit_note} usually are a -filename string and line number integer, but using NULL for the -string argument is perfectly valid if you only need to create a simple -marker in the code.
- -@item jit_ellipsis(); -@code{ellipsis} usually is only required if calling varargs functions -with double arguments, but it is a good practice to properly describe -the @r{@dots{}} in the call sequence. - -@item jit_pushargi((jit_word_t)"generated %d bytes\n"); -Note the use of the @code{(jit_word_t)} cast, that is used only -to avoid a compiler warning, due to using a pointer where a -wordsize integer type was expected. - -@item jit_prepare(); -@itemx @r{@dots{}} -@itemx jit_finishi(printf); -Once the arguments to @code{printf} have been pushed, which means -moving them to the stack or to argument registers, the @code{printf} -function is called and the stack cleaned.  Note how @lightning{} -abstracts the differences between different architectures and -ABIs -- the client program does not know how parameter passing -works on the host architecture. - -@item jit_epilog(); -Usually it is not required to call @code{epilog}, because it -is implicitly called when the end of a function is noticed. It is -called explicitly here so that the @code{end} note lies after the -function epilog; if @code{end} was set with a @code{note} call right -after the @code{ret}, the measured size would not include the epilog. - -@item myFunction((char*)jit_address(end) - (char*)jit_address(start)); -This calls the generated jit function, passing as argument the offset -difference between the @code{start} and @code{end} notes. The @code{address} -call must be done after the @code{emit} call; otherwise either a fatal error -will happen (if @lightning{} is built with assertions enabled) or an -undefined value will be returned. - -@item jit_clear_state(); -Note that @code{jit_clear_state} was called after executing jit in -this example. It was done because it must be called after any call -to @code{jit_address} or @code{jit_print}. - -@item jit_disassemble(); -@code{disassemble} will dump the generated code to standard output, -unless @lightning{} was built with the disassembler disabled, in which -case no output will be shown.
-@end table - -@node RPN calculator -@section A more complex example, an RPN calculator - -We create a small stack-based RPN calculator which applies a series -of operators to a given parameter and to other numeric operands. -Unlike previous examples, the code generator is fully parameterized -and is able to compile different formulas to different functions. -Here is the code for the expression compiler; a sample usage will -follow. - -Since @lightning{} does not provide push/pop instructions, this -example uses a stack-allocated area to store the data.  Such an -area can be allocated using the macro @code{allocai}, which -receives the number of bytes to allocate and returns the offset -from the frame pointer register @code{FP} to the base of the -area. - -Usually, you will use the @code{ldxi} and @code{stxi} instructions -to access stack-allocated variables.  However, it is possible to -use operations such as @code{add} to compute the address of the -variables, and pass the address around.
- -@example -#include <stdio.h> -#include <lightning.h> - -typedef int (*pifi)(int);       @rem{/* Pointer to Int Function of Int */} - -static jit_state_t *_jit; - -void stack_push(int reg, int *sp) -@{ -  jit_stxi_i (*sp, JIT_FP, reg); -  *sp += sizeof (int); -@} - -void stack_pop(int reg, int *sp) -@{ -  *sp -= sizeof (int); -  jit_ldxi_i (reg, JIT_FP, *sp); -@} - -jit_node_t *compile_rpn(char *expr) -@{ -  jit_node_t *in, *fn; -  int stack_base, stack_ptr; - -  fn = jit_note(NULL, 0); -  jit_prolog(); -  in = jit_arg(); -  stack_ptr = stack_base = jit_allocai (32 * sizeof (int)); - -  jit_getarg_i(JIT_R2, in); - -  while (*expr) @{ -    char buf[32]; -    int n; -    if (sscanf(expr, "%[0-9]%n", buf, &n)) @{ -      expr += n - 1; -      stack_push(JIT_R0, &stack_ptr); -      jit_movi(JIT_R0, atoi(buf)); -    @} else if (*expr == 'x') @{ -      stack_push(JIT_R0, &stack_ptr); -      jit_movr(JIT_R0, JIT_R2); -    @} else if (*expr == '+') @{ -      stack_pop(JIT_R1, &stack_ptr); -      jit_addr(JIT_R0, JIT_R1, JIT_R0); -    @} else if (*expr == '-') @{ -      stack_pop(JIT_R1, &stack_ptr); -      jit_subr(JIT_R0, JIT_R1, JIT_R0); -    @} else if (*expr == '*') @{ -      stack_pop(JIT_R1, &stack_ptr); -      jit_mulr(JIT_R0, JIT_R1, JIT_R0); -    @} else if (*expr == '/') @{ -      stack_pop(JIT_R1, &stack_ptr); -      jit_divr(JIT_R0, JIT_R1, JIT_R0); -    @} else @{ -      fprintf(stderr, "cannot compile: %s\n", expr); -      abort(); -    @} -    ++expr; -  @} -  jit_retr(JIT_R0); -  jit_epilog(); -  return fn; -@} -@end example - -The principle on which the calculator is based is easy: the stack top -is held in R0, while the remaining items of the stack are held in the -memory area that we allocate with @code{allocai}.  
Compiling a numeric -operand or the argument @code{x} pushes the old stack top onto the -stack and moves the operand into R0; compiling an operator pops the -second operand off the stack into R1, and compiles the operation so -that the result goes into R0, thus becoming the new stack top. - -This example allocates a fixed area for 32 @code{int}s.  This is not -a problem when the function is a leaf like in this case; in a full-blown -compiler you will want to analyze the input and determine the number -of needed stack slots---a very simple example of register allocation. -The area is then managed like a stack using @code{stack_push} and -@code{stack_pop}. - -Source code for the client (which lies in the same source file) follows: - -@example -int main(int argc, char *argv[]) -@{ -  jit_node_t *nc, *nf; -  pifi c2f, f2c; -  int i; - -  init_jit(argv[0]); -  _jit = jit_new_state(); - -  nc = compile_rpn("32x9*5/+"); -  nf = compile_rpn("x32-5*9/"); -  (void)jit_emit(); -  c2f = (pifi)jit_address(nc); -  f2c = (pifi)jit_address(nf); -  jit_clear_state(); - -  printf("\nC:"); -  for (i = 0; i <= 100; i += 10) printf("%3d ", i); -  printf("\nF:"); -  for (i = 0; i <= 100; i += 10) printf("%3d ", c2f(i)); -  printf("\n"); - -  printf("\nF:"); -  for (i = 32; i <= 212; i += 18) printf("%3d ", i); -  printf("\nC:"); -  for (i = 32; i <= 212; i += 18) printf("%3d ", f2c(i)); -  printf("\n"); - -  jit_destroy_state(); -  finish_jit(); -  return 0; -@} -@end example - -The client displays a conversion table between Celsius and Fahrenheit -degrees (both Celsius-to-Fahrenheit and Fahrenheit-to-Celsius). The -formulas are, @math{F(c) = c*9/5+32} and @math{C(f) = (f-32)*5/9}, -respectively. - -Providing the formula as an argument to @code{compile_rpn} effectively -parameterizes code generation, making it possible to use the same code -to compile different functions; this is what makes dynamic code -generation so powerful. 
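To check what the two compiled formulas should return, the same stack discipline can be run directly in plain C instead of being compiled. This is a sketch under stated assumptions: @code{eval_rpn}, @code{RPN_SLOTS} and the @code{err} parameter are hypothetical names that do not belong to @lightning{}, and unlike @code{compile_rpn} the sketch reports bad input through @code{err} rather than calling @code{abort}.

```c
#include <ctype.h>
#include <stdlib.h>

#define RPN_SLOTS 32   /* same fixed area as the allocai call above */

/* Hypothetical reference evaluator: r0 plays the role of JIT_R0 (the
   stack top) and stack[] the role of the allocai area.  On bad input
   it stores 1 in *err (when err is not NULL) and returns 0. */
int eval_rpn(const char *expr, int x, int *err)
{
  int stack[RPN_SLOTS];
  int sp = 0;
  int r0 = 0;
  int failed = 0;

  while (*expr && !failed) {
    if (isdigit((unsigned char)*expr) || *expr == 'x') {
      if (sp == RPN_SLOTS) { failed = 1; break; }
      stack[sp++] = r0;          /* push old top (a placeholder at first) */
      if (*expr == 'x') {
        r0 = x;
        ++expr;
      } else {
        char *end;
        r0 = (int)strtol(expr, &end, 10);
        expr = end;              /* skip the whole number */
      }
    } else {
      int r1;
      if (sp < 2) { failed = 1; break; }   /* no real second operand */
      r1 = stack[--sp];          /* pop the second operand */
      switch (*expr) {
      case '+': r0 = r1 + r0; break;
      case '-': r0 = r1 - r0; break;
      case '*': r0 = r1 * r0; break;
      case '/':
        if (r0 == 0) failed = 1; else r0 = r1 / r0;
        break;
      default:  failed = 1; break;
      }
      ++expr;
    }
  }
  if (err)
    *err = failed;
  return failed ? 0 : r0;
}
```

With integer arithmetic, @code{eval_rpn("32x9*5/+", 100, NULL)} gives 212 and @code{eval_rpn("x32-5*9/", 212, NULL)} gives 100, which is what @code{c2f(100)} and @code{f2c(212)} are expected to return.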
- -@node Fibonacci -@section Fibonacci numbers - -The code in this section calculates the Fibonacci sequence. That is -modeled by the recurrence relation: -@display -     f(0) = 0 -     f(1) = f(2) = 1 -     f(n) = f(n-1) + f(n-2) -@end display - -The purpose of this example is to introduce branches.  There are two -kinds of branches: backward branches and forward branches.  We'll -present the calculation in a recursive and iterative form; the -former only uses forward branches, while the latter uses both. - -@example -#include <stdio.h> -#include <lightning.h> - -static jit_state_t *_jit; - -typedef int (*pifi)(int);       @rem{/* Pointer to Int Function of Int */} - -int main(int argc, char *argv[]) -@{ -  pifi       fib; -  jit_node_t *label; -  jit_node_t *call; -  jit_node_t *in;                 @rem{/* offset of the argument */} -  jit_node_t *ref;                @rem{/* to patch the forward reference */} -  jit_node_t *zero;               @rem{/* to patch the forward reference */} - -  init_jit(argv[0]); -  _jit = jit_new_state(); - -  label = jit_label(); -        jit_prolog   (); -  in =  jit_arg      (); -        jit_getarg   (JIT_R0, in);              @rem{/* R0 = n */} - zero = jit_beqi     (JIT_R0, 0); -        jit_movr     (JIT_V0, JIT_R0);          /* V0 = R0 */ -        jit_movi     (JIT_R0, 1); -  ref = jit_blei     (JIT_V0, 2); -        jit_subi     (JIT_V1, JIT_V0, 1);       @rem{/* V1 = n-1 */} -        jit_subi     (JIT_V2, JIT_V0, 2);       @rem{/* V2 = n-2 */} -        jit_prepare(); -          jit_pushargr(JIT_V1); -        call = jit_finishi(NULL); -        jit_patch_at(call, label); -        jit_retval(JIT_V1);                     @rem{/* V1 = fib(n-1) */} -        jit_prepare(); -          jit_pushargr(JIT_V2); -        call = jit_finishi(NULL); -        jit_patch_at(call, label); -        jit_retval(JIT_R0);                     @rem{/* R0 = fib(n-2) */} -        jit_addr(JIT_R0, JIT_R0, JIT_V1);       @rem{/* R0 = R0 + V1 */} - -  
jit_patch(ref);                               @rem{/* patch jump */} -  jit_patch(zero);                              @rem{/* patch jump */} -        jit_retr(JIT_R0); - -  @rem{/* call the generated code@comma{} passing 32 as an argument */} -  fib = jit_emit(); -  jit_clear_state(); -  printf("fib(%d) = %d\n", 32, fib(32)); -  jit_destroy_state(); -  finish_jit(); -  return 0; -@} -@end example - -As said above, this is the first example of dynamically compiling -branches.  Branch instructions have two operands containing the -values to be compared, and return a @code{jit_node_t *} object -to be patched. - -Because the final address of a label is only known after calling @code{emit}, -it is required to call @code{patch} or @code{patch_at}, which -tells @lightning{} that the target to patch is actually a pointer to -a @code{jit_node_t *} object; otherwise, it would assume that it is -a pointer to a C function. Note that conditional branches do not -receive a label argument, so they must be patched. - -You need to call @code{patch_at} on the return value of @code{calli} -or @code{finishi} if it is actually referencing a label -in the jit code. No branch instruction receives a label -argument. Note that @code{movi} is a special case, and patching it -is usually done to get the final address of a label, usually to later -call @code{jmpr}.
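When testing generated code like this, it helps to have a plain C reference for the recurrence given at the start of this section. The sketch below is not part of @lightning{}; @code{fib_ref} is a hypothetical name, and treating negative input as zero is an assumption made only to keep the function total.

```c
/* Hypothetical reference, plain C:
   f(0) = 0, f(1) = f(2) = 1, f(n) = f(n-1) + f(n-2). */
int fib_ref(int n)
{
  int prev = 1;   /* f(i-1) */
  int cur = 1;    /* f(i)   */
  int i;

  if (n <= 0)
    return 0;     /* f(0); negative input is treated the same (assumption) */
  for (i = 2; i < n; i++) {   /* runs n-2 times when n > 2 */
    int next = prev + cur;
    prev = cur;
    cur = next;
  }
  return cur;
}
```

Here @code{fib_ref(32)} is 2178309 and @code{fib_ref(36)} is 14930352, both of which fit in an @code{int}; these are the values the generated functions should print for the same arguments.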
- -Now, here is the iterative version: - -@example -#include <stdio.h> -#include <lightning.h> - -static jit_state_t *_jit; - -typedef int (*pifi)(int);       @rem{/* Pointer to Int Function of Int */} - -int main(int argc, char *argv[]) -@{ -  pifi       fib; -  jit_node_t *in;               @rem{/* offset of the argument */} -  jit_node_t *ref;              @rem{/* to patch the forward reference */} -  jit_node_t *zero;             @rem{/* to patch the forward reference */} -  jit_node_t *jump;             @rem{/* jump to start of loop */} -  jit_node_t *loop;             @rem{/* start of the loop */} - -  init_jit(argv[0]); -  _jit = jit_new_state(); - -        jit_prolog   (); -  in =  jit_arg      (); -        jit_getarg   (JIT_R0, in);              @rem{/* R0 = n */} - zero = jit_beqi     (JIT_R0, 0); -        jit_movr     (JIT_R1, JIT_R0); -        jit_movi     (JIT_R0, 1); -  ref = jit_blei     (JIT_R1, 2); -        jit_subi     (JIT_R2, JIT_R1, 2);       @rem{/* R2 = n-2@comma{} the counter */} -        jit_movr     (JIT_R1, JIT_R0); - -  loop= jit_label(); -        jit_subi     (JIT_R2, JIT_R2, 1);       @rem{/* decr. counter */} -        jit_movr     (JIT_V0, JIT_R0);          /* V0 = R0 */ -        jit_addr     (JIT_R0, JIT_R0, JIT_R1);  /* R0 = R0 + R1 */ -        jit_movr     (JIT_R1, JIT_V0);          /* R1 = V0 */ -  jump= jit_bnei     (JIT_R2, 0);               /* if (R2) goto loop; */ -  jit_patch_at(jump, loop); - -  jit_patch(ref);                               @rem{/* patch forward jump */} -  jit_patch(zero);                              @rem{/* patch forward jump */} -        jit_retr     (JIT_R0); - -  @rem{/* call the generated code@comma{} passing 36 as an argument */} -  fib = jit_emit(); -  jit_clear_state(); -  printf("fib(%d) = %d\n", 36, fib(36)); -  jit_destroy_state(); -  finish_jit(); -  return 0; -@} -@end example - -This code calculates the recurrence relation using iteration (a -@code{for} loop in high-level languages).  
There are no function -calls anymore: instead, there is a backward jump (the @code{bnei} at -the end of the loop). - -Note that the program must remember the address for backward jumps; -for forward jumps it is only required to remember the jump code, -and call @code{patch} for the implicit label. - -@node Reentrancy -@chapter Re-entrant usage of @lightning{} - -@lightning{} uses the special @code{_jit} identifier. To be able -to use multiple jit generation states at the same -time, it is required to use code similar to: - -@example -    jit_state_t *lightning; -    #define _jit lightning -@end example - -This will cause the symbol that @code{_jit} is defined to to be passed as -the first argument to the underlying @lightning{} implementation, -that is usually a function with an @code{_} (underscore) prefix -and with an argument named @code{_jit}, in the pattern: - -@example -    static void _jit_mnemonic(jit_state_t *, jit_gpr_t, jit_gpr_t); -    #define jit_mnemonic(u, v) _jit_mnemonic(_jit, u, v); -@end example - -The reason for this is to use the same syntax as the initial lightning -implementation and to avoid requiring the user to keep adding an extra -argument to every call, as multiple jit states generating code in -parallel should be very uncommon. - -@node Registers -@chapter Accessing the whole register file - -As mentioned earlier in this chapter, all @lightning{} back-ends are -guaranteed to have at least six general-purpose integer registers and -six floating-point registers, but many back-ends will have more. - -To access the entire register files, you can use the -@code{JIT_R}, @code{JIT_V} and @code{JIT_F} macros.  They -accept a parameter that identifies the register number, which -must be strictly less than @code{JIT_R_NUM}, @code{JIT_V_NUM} -and @code{JIT_F_NUM} respectively; the number need not be -constant.  
Of course, expressions like @code{JIT_R0} and -@code{JIT_R(0)} denote the same register, and likewise for -integer callee-saved, or floating-point, registers. - -@node Customizations -@chapter Customizations - -Frequently it is desirable to have more control over how code is -generated or how memory is used during jit generation or execution. - -@section Memory functions -To aid in complete control of memory allocation and deallocation -@lightning{} provides wrappers that default to standard @code{malloc}, -@code{realloc} and @code{free}. These are loosely based on the -GNU GMP counterparts, with the difference that they use the same -prototypes as the system allocation functions, that is, no @code{size} -for @code{free} or @code{old_size} for @code{realloc}. - -@deftypefun void jit_set_memory_functions (@* void *(*@var{alloc_func_ptr}) (size_t), @* void *(*@var{realloc_func_ptr}) (void *, size_t), @* void (*@var{free_func_ptr}) (void *)) -@lightning{} guarantees that memory is only allocated or released -using these wrapped functions, but you must note that if lightning -was linked to GNU binutils, malloc will probably be called multiple -times from there when initializing the disassembler. - -Because @code{init_jit} may call memory functions, if you need to call -@code{jit_set_memory_functions}, it must be called before @code{init_jit}, -otherwise, when calling @code{finish_jit}, a pointer allocated with the -previous or default wrappers will be passed. -@end deftypefun - -@deftypefun void jit_get_memory_functions (@* void *(**@var{alloc_func_ptr}) (size_t), @* void *(**@var{realloc_func_ptr}) (void *, size_t), @* void (**@var{free_func_ptr}) (void *)) -Get the current memory allocation functions. Also, unlike the GNU GMP -counterpart, it is an error to pass @code{NULL} pointers as arguments.
-@end deftypefun - -@section Alternate code buffer -To instruct @lightning{} to use an alternate code buffer it is required -to call @code{jit_realize} before @code{jit_emit}, and then query states -and customize as appropriate. - -@deftypefun void jit_realize () -Must be called once, before @code{jit_emit}, to instruct @lightning{} -that no other @code{jit_xyz} call will be made. -@end deftypefun - -@deftypefun jit_pointer_t jit_get_code (jit_word_t *@var{code_size}) -Returns NULL or the previous value set with @code{jit_set_code}, and -sets the @var{code_size} argument to an appropriate value. -If @code{jit_get_code} is called before @code{jit_emit}, the -@var{code_size} argument is set to the expected number of bytes -required to generate code. -If @code{jit_get_code} is called after @code{jit_emit}, the -@var{code_size} argument is set to the exact number of bytes used -by the code. -@end deftypefun - -@deftypefun void jit_set_code (jit_pointer_t @var{code}, jit_word_t @var{size}) -Instructs @lightning{} to output to the @var{code} argument and -use @var{size} as a guard to not write to invalid memory. If during -@code{jit_emit} @lightning{} finds out that the code would not fit -in @var{size} bytes, it halts code emission and returns @code{NULL}.
-@end deftypefun - -A simple example of a loop using an alternate buffer is: - -@example -  jit_uint8_t   *code; -  int          (*func)(int);       @rem{/* function pointer */} -  jit_word_t     code_size; -  jit_word_t     real_code_size; -  @rem{...} -  jit_realize();                   @rem{/* ready to generate code */} -  jit_get_code(&code_size);        @rem{/* get expected code size */} -  code_size = (code_size + 4095) & -4096; -  do @{ -    code = mmap(NULL, code_size, PROT_EXEC | PROT_READ | PROT_WRITE, -                MAP_PRIVATE | MAP_ANON, -1, 0); -    jit_set_code(code, code_size); -    if ((func = jit_emit()) == NULL) @{ -      munmap(code, code_size); -      code_size += 4096; -    @} -  @} while (func == NULL); -  jit_get_code(&real_code_size);   @rem{/* query exact size of the code */} -@end example - -The first call to @code{jit_get_code} should return @code{NULL} and set -the @code{code_size} argument to the expected number of bytes required -to emit code. -The second call to @code{jit_get_code} is after a successful call to -@code{jit_emit}, and will return the value previously set with -@code{jit_set_code} and set the @code{real_code_size} argument to the -exact number of bytes used to emit the code. - -@section Alternate data buffer -Sometimes it may be desirable to customize how @lightning{} uses an -extra buffer for constants or debug annotation, or to prevent it from -doing so, usually when also using an alternate code buffer. - -@deftypefun jit_pointer_t jit_get_data (jit_word_t *@var{data_size}, jit_word_t *@var{note_size}) -Returns @code{NULL} or the previous value set with @code{jit_set_data}, -and sets the @var{data_size} argument to how many bytes are required -for the constants data buffer, and @var{note_size} to how many bytes -are required to store the debug note information.
-Note that it always preallocates one debug note entry even if -@code{jit_name} or @code{jit_note} are never called, but will return -zero in the @var{data_size} argument if no constant is required; -constants are only used for the @code{float} and @code{double} operations -that have an immediate argument, and not in all @lightning{} ports. -@end deftypefun - -@deftypefun void jit_set_data (jit_pointer_t @var{data}, jit_word_t @var{size}, jit_word_t @var{flags}) - -@var{data} can be NULL if disabling constants and annotations, otherwise, -a valid pointer must be passed. An assertion is done that the data will -fit in @var{size} bytes (but that is a noop if @lightning{} was built -with @code{-DNDEBUG}). - -@var{size} tells the space in bytes available in @var{data}. - -@var{flags} can be zero to just use the alternate data buffer, -or a composition of @code{JIT_DISABLE_DATA} and @code{JIT_DISABLE_NOTE}. - -@table @t -@item JIT_DISABLE_DATA -@cindex JIT_DISABLE_DATA -Instructs @lightning{} to not use a constant table, but to use an -alternate method to synthesize those, usually with a larger code -sequence using stack space to transfer the value from a GPR to a -FPR register. - -@item JIT_DISABLE_NOTE -@cindex JIT_DISABLE_NOTE -Instructs @lightning{} to not store file or function name, and -line numbers in the constant buffer.
-@end table -@end deftypefun - -A simple example of preventing the use of a data buffer is: - -@example -  @rem{...} -  jit_realize();                        @rem{/* ready to generate code */} -  jit_get_data(NULL, NULL); -  jit_set_data(NULL, 0, JIT_DISABLE_DATA | JIT_DISABLE_NOTE); -  @rem{...} -@end example - -Or to only use a data buffer, if required: - -@example -  jit_uint8_t   *data; -  jit_word_t     data_size; -  @rem{...} -  jit_realize();                        @rem{/* ready to generate code */} -  jit_get_data(&data_size, NULL); -  if (data_size) -    data = malloc(data_size); -  else -    data = NULL; -  jit_set_data(data, data_size, JIT_DISABLE_NOTE); -  @rem{...} -  if (data) -    free(data); -  @rem{...} -@end example - -@node Acknowledgements -@chapter Acknowledgements - -As far as I know, the first general-purpose portable dynamic code -generator is @sc{dcg}, by Dawson R.@: Engler and T.@: A.@: Proebsting. -Further work by Dawson R. Engler resulted in the @sc{vcode} system; -unlike @sc{dcg}, @sc{vcode} used no intermediate representation and -directly inspired @lightning{}. - -Thanks go to Ian Piumarta, who kindly agreed to release his own -program @sc{ccg} under the GNU General Public License, thereby allowing -@lightning{} to use the run-time assemblers he had written for @sc{ccg}. -@sc{ccg} provides a way of dynamically assembling programs written in the -underlying architecture's assembly language.  So it is not portable, -yet very interesting. - -I also thank Steve Byrne for writing GNU Smalltalk, since @lightning{} -was first developed as a tool to be used in GNU Smalltalk's dynamic -translator from bytecodes to native code. - -@c %**end of header (This is for running Texinfo on a region.) - -@c *********************************************************************** - -@bye
