diff options
Diffstat (limited to 'deps/lightening/lightning.texi')
-rw-r--r-- | deps/lightening/lightning.texi | 1760 |
1 files changed, 0 insertions, 1760 deletions
diff --git a/deps/lightening/lightning.texi b/deps/lightening/lightning.texi deleted file mode 100644 index 88f397a..0000000 --- a/deps/lightening/lightning.texi +++ /dev/null @@ -1,1760 +0,0 @@ -\input texinfo.tex @c -*- texinfo -*- -@c %**start of header (This is for running Texinfo on a region.) - -@setfilename lightning.info - -@set TITLE Using @sc{gnu} @i{lightning} -@set TOPIC installing and using - -@settitle @value{TITLE} - -@c --------------------------------------------------------------------- -@c Common macros -@c --------------------------------------------------------------------- - -@macro bulletize{a} -@item -\a\ -@end macro - -@macro rem{a} -@r{@i{\a\}} -@end macro - -@macro gnu{} -@sc{gnu} -@end macro - -@macro lightning{} -@gnu{} @i{lightning} -@end macro - -@c --------------------------------------------------------------------- -@c Macros for Texinfo 3.1/4.0 compatibility -@c --------------------------------------------------------------------- - -@c @hlink (macro), @url and @email are used instead of @uref for Texinfo 3.1 -@c compatibility -@macro hlink{url, link} -\link\ (\url\) -@end macro - -@c ifhtml can only be true in Texinfo 4.0, which has uref -@ifhtml -@unmacro hlink - -@macro hlink{url, link} -@uref{\url\, \link\} -@end macro - -@macro email{mail} -@uref{mailto:\mail\, , \mail\} -@end macro - -@macro url{url} -@uref{\url\} -@end macro -@end ifhtml - -@c --------------------------------------------------------------------- -@c References to the other half of the manual -@c --------------------------------------------------------------------- - -@macro usingref{node, name} -@ref{\node\, , \name\} -@end macro - -@c --------------------------------------------------------------------- -@c End of macro section -@c --------------------------------------------------------------------- - -@set UPDATED 18 June 2018 -@set UPDATED-MONTH June 2018 -@set EDITION 2.1.2 -@set VERSION 2.1.2 - -@ifnottex -@dircategory Software development -@direntry -* lightning: (lightning). Library for dynamic code generation. -@end direntry -@end ifnottex - -@ifnottex -@node Top -@top @lightning{} - -@iftex -@macro comma -@verbatim{|,|} -@end macro -@end iftex - -@ifnottex -@macro comma -@verb{|,|} -@end macro -@end ifnottex - -This document describes @value{TOPIC} the @lightning{} library for -dynamic code generation. - -@menu -* Overview:: What GNU lightning is -* Installation:: Configuring and installing GNU lightning -* The instruction set:: The RISC instruction set used in GNU lightning -* GNU lightning examples:: GNU lightning's examples -* Reentrancy:: Re-entrant usage of GNU lightning -* Customizations:: Advanced code generation customizations -* Acknowledgements:: Acknowledgements for GNU lightning -@end menu -@end ifnottex - -@node Overview -@chapter Introduction to @lightning{} - -@iftex -This document describes @value{TOPIC} the @lightning{} library for -dynamic code generation. -@end iftex - -Dynamic code generation is the generation of machine code -at runtime. It is typically used to strip a layer of interpretation -by allowing compilation to occur at runtime. One of the most -well-known applications of dynamic code generation is perhaps that -of interpreters that compile source code to an intermediate bytecode -form, which is then recompiled to machine code at run-time: this -approach effectively combines the portability of bytecode -representations with the speed of machine code. Another common -application of dynamic code generation is in the field of hardware -simulators and binary emulators, which can use the same techniques -to translate simulated instructions to the instructions of the -underlying machine. - -Yet other applications come to mind: for example, windowing -@dfn{bitblt} operations, matrix manipulations, and network packet -filters. Albeit very powerful and relatively well known within the -compiler community, dynamic code generation techniques are rarely -exploited to their full potential and, with the exception of the -two applications described above, have remained curiosities because -of their portability and functionality barriers: binary instructions -are generated, so programs using dynamic code generation must be -retargeted for each machine; in addition, coding a run-time code -generator is a tedious and error-prone task more than a difficult one. - -@lightning{} provides a portable, fast and easily retargetable dynamic -code generation system. - -To be portable, @lightning{} abstracts over current architectures' -quirks and unorthogonalities. The interface that it exposes to is that -of a standardized RISC architecture loosely based on the SPARC and MIPS -chips. There are a few general-purpose registers (six, not including -those used to receive and pass parameters between subroutines), and -arithmetic operations involve three operands---either three registers -or two registers and an arbitrarily sized immediate value. - -On one hand, this architecture is general enough that it is possible to -generate pretty efficient code even on CISC architectures such as the -Intel x86 or the Motorola 68k families. On the other hand, it matches -real architectures closely enough that, most of the time, the -compiler's constant folding pass ends up generating code which -assembles machine instructions without further tests. - -@node Installation -@chapter Configuring and installing @lightning{} - -The first thing to do to use @lightning{} is to configure the -program, picking the set of macros to be used on the host -architecture; this configuration is automatically performed by -the @file{configure} shell script; to run it, merely type: -@example - ./configure -@end example - -@lightning{} supports the @code{--enable-disassembler} option, that -enables linking to GNU binutils and optionally print human readable -disassembly of the jit code. This option can be disabled by the -@code{--disable-disassembler} option. - -Another option that @file{configure} accepts is -@code{--enable-assertions}, which enables several consistency checks in -the run-time assemblers. These are not usually needed, so you can -decide to simply forget about it; also remember that these consistency -checks tend to slow down your code generator. - -After you've configured @lightning{}, run @file{make} as usual. - -@lightning{} has an extensive set of tests to validate it is working -correctly in the build host. To test it run: -@example - make check -@end example - -The next important step is: -@example - make install -@end example - -This ends the process of installing @lightning{}. - -@node The instruction set -@chapter @lightning{}'s instruction set - -@lightning{}'s instruction set was designed by deriving instructions -that closely match those of most existing RISC architectures, or -that can be easily syntesized if absent. Each instruction is composed -of: -@itemize @bullet -@item -an operation, like @code{sub} or @code{mul} - -@item -most times, a register/immediate flag (@code{r} or @code{i}) - -@item -an unsigned modifier (@code{u}), a type identifier or two, when applicable. -@end itemize - -Examples of legal mnemonics are @code{addr} (integer add, with three -register operands) and @code{muli} (integer multiply, with two -register operands and an immediate operand). Each instruction takes -two or three operands; in most cases, one of them can be an immediate -value instead of a register. - -Most @lightning{} integer operations are signed wordsize operations, -with the exception of operations that convert types, or load or store -values to/from memory. When applicable, the types and C types are as -follow: - -@example - _c @r{signed char} - _uc @r{unsigned char} - _s @r{short} - _us @r{unsigned short} - _i @r{int} - _ui @r{unsigned int} - _l @r{long} - _f @r{float} - _d @r{double} -@end example - -Most integer operations do not need a type modifier, and when loading or -storing values to memory there is an alias to the proper operation -using wordsize operands, that is, if ommited, the type is @r{int} on -32-bit architectures and @r{long} on 64-bit architectures. Note -that lightning also expects @code{sizeof(void*)} to match the wordsize. - -When an unsigned operation result differs from the equivalent signed -operation, there is a the @code{_u} modifier. - -There are at least seven integer registers, of which six are -general-purpose, while the last is used to contain the frame pointer -(@code{FP}). The frame pointer can be used to allocate and access local -variables on the stack, using the @code{allocai} or @code{allocar} -instruction. - -Of the general-purpose registers, at least three are guaranteed to be -preserved across function calls (@code{V0}, @code{V1} and -@code{V2}) and at least three are not (@code{R0}, @code{R1} and -@code{R2}). Six registers are not very much, but this -restriction was forced by the need to target CISC architectures -which, like the x86, are poor of registers; anyway, backends can -specify the actual number of available registers with the calls -@code{JIT_R_NUM} (for caller-save registers) and @code{JIT_V_NUM} -(for callee-save registers). - -There are at least six floating-point registers, named @code{F0} to -@code{F5}. These are usually caller-save and are separate from the integer -registers on the supported architectures; on Intel architectures, -in 32 bit mode if SSE2 is not available or use of X87 is forced, -the register stack is mapped to a flat register file. As for the -integer registers, the macro @code{JIT_F_NUM} yields the number of -floating-point registers. - -The complete instruction set follows; as you can see, most non-memory -operations only take integers (either signed or unsigned) as operands; -this was done in order to reduce the instruction set, and because most -architectures only provide word and long word operations on registers. -There are instructions that allow operands to be extended to fit a larger -data type, both in a signed and in an unsigned way. - -@table @b -@item Binary ALU operations -These accept three operands; the last one can be an immediate. -@code{addx} operations must directly follow @code{addc}, and -@code{subx} must follow @code{subc}; otherwise, results are undefined. -Most, if not all, architectures do not support @r{float} or @r{double} -immediate operands; lightning emulates those operations by moving the -immediate to a temporary register and emiting the call with only -register operands. -@example -addr _f _d O1 = O2 + O3 -addi _f _d O1 = O2 + O3 -addxr O1 = O2 + (O3 + carry) -addxi O1 = O2 + (O3 + carry) -addcr O1 = O2 + O3, set carry -addci O1 = O2 + O3, set carry -subr _f _d O1 = O2 - O3 -subi _f _d O1 = O2 - O3 -subxr O1 = O2 - (O3 + carry) -subxi O1 = O2 - (O3 + carry) -subcr O1 = O2 - O3, set carry -subci O1 = O2 - O3, set carry -rsbr _f _d O1 = O3 - O1 -rsbi _f _d O1 = O3 - O1 -mulr _f _d O1 = O2 * O3 -muli _f _d O1 = O2 * O3 -divr _u _f _d O1 = O2 / O3 -divi _u _f _d O1 = O2 / O3 -remr _u O1 = O2 % O3 -remi _u O1 = O2 % O3 -andr O1 = O2 & O3 -andi O1 = O2 & O3 -orr O1 = O2 | O3 -ori O1 = O2 | O3 -xorr O1 = O2 ^ O3 -xori O1 = O2 ^ O3 -lshr O1 = O2 << O3 -lshi O1 = O2 << O3 -rshr _u O1 = O2 >> O3@footnote{The sign bit is propagated unless using the @code{_u} modifier.} -rshi _u O1 = O2 >> O3@footnote{The sign bit is propagated unless using the @code{_u} modifier.} -@end example - -@item Four operand binary ALU operations -These accept two result registers, and two operands; the last one can -be an immediate. The first two arguments cannot be the same register. - -@code{qmul} stores the low word of the result in @code{O1} and the -high word in @code{O2}. For unsigned multiplication, @code{O2} zero -means there was no overflow. For signed multiplication, no overflow -check is based on sign, and can be detected if @code{O2} is zero or -minus one. - -@code{qdiv} stores the quotient in @code{O1} and the remainder in -@code{O2}. It can be used as quick way to check if a division is -exact, in which case the remainder is zero. - -@example -qmulr _u O1 O2 = O3 * O4 -qmuli _u O1 O2 = O3 * O4 -qdivr _u O1 O2 = O3 / O4 -qdivi _u O1 O2 = O3 / O4 -@end example - -@item Unary ALU operations -These accept two operands, both of which must be registers. -@example -negr _f _d O1 = -O2 -comr O1 = ~O2 -@end example - -These unary ALU operations are only defined for float operands. -@example -absr _f _d O1 = fabs(O2) -sqrtr O1 = sqrt(O2) -@end example - -Besides requiring the @code{r} modifier, there are no unary operations -with an immediate operand. - -@item Compare instructions -These accept three operands; again, the last can be an immediate. -The last two operands are compared, and the first operand, that must be -an integer register, is set to either 0 or 1, according to whether the -given condition was met or not. - -The conditions given below are for the standard behavior of C, -where the ``unordered'' comparison result is mapped to false. - -@example -ltr _u _f _d O1 = (O2 < O3) -lti _u _f _d O1 = (O2 < O3) -ler _u _f _d O1 = (O2 <= O3) -lei _u _f _d O1 = (O2 <= O3) -gtr _u _f _d O1 = (O2 > O3) -gti _u _f _d O1 = (O2 > O3) -ger _u _f _d O1 = (O2 >= O3) -gei _u _f _d O1 = (O2 >= O3) -eqr _f _d O1 = (O2 == O3) -eqi _f _d O1 = (O2 == O3) -ner _f _d O1 = (O2 != O3) -nei _f _d O1 = (O2 != O3) -unltr _f _d O1 = !(O2 >= O3) -unler _f _d O1 = !(O2 > O3) -ungtr _f _d O1 = !(O2 <= O3) -unger _f _d O1 = !(O2 < O3) -uneqr _f _d O1 = !(O2 < O3) && !(O2 > O3) -ltgtr _f _d O1 = !(O2 >= O3) || !(O2 <= O3) -ordr _f _d O1 = (O2 == O2) && (O3 == O3) -unordr _f _d O1 = (O2 != O2) || (O3 != O3) -@end example - -@item Transfer operations -These accept two operands; for @code{ext} both of them must be -registers, while @code{mov} accepts an immediate value as the second -operand. - -Unlike @code{movr} and @code{movi}, the other instructions are used -to truncate a wordsize operand to a smaller integer data type or to -convert float data types. You can also use @code{extr} to convert an -integer to a floating point value: the usual options are @code{extr_f} -and @code{extr_d}. - -@example -movr _f _d O1 = O2 -movi _f _d O1 = O2 -extr _c _uc _s _us _i _ui _f _d O1 = O2 -truncr _f _d O1 = trunc(O2) -@end example - -In 64-bit architectures it may be required to use @code{truncr_f_i}, -@code{truncr_f_l}, @code{truncr_d_i} and @code{truncr_d_l} to match -the equivalent C code. Only the @code{_i} modifier is available in -32-bit architectures. - -@example -truncr_f_i = <int> O1 = <float> O2 -truncr_f_l = <long>O1 = <float> O2 -truncr_d_i = <int> O1 = <double>O2 -truncr_d_l = <long>O1 = <double>O2 -@end example - -The float conversion operations are @emph{destination first, -source second}, but the order of the types is reversed. This happens -for historical reasons. - -@example -extr_f_d = <double>O1 = <float> O2 -extr_d_f = <float> O1 = <double>O2 -@end example - -@item Network extensions -These accept two operands, both of which must be registers; these -two instructions actually perform the same task, yet they are -assigned to two mnemonics for the sake of convenience and -completeness. As usual, the first operand is the destination and -the second is the source. -The @code{_ul} variant is only available in 64-bit architectures. -@example -htonr _us _ui _ul @r{Host-to-network (big endian) order} -ntohr _us _ui _ul @r{Network-to-host order } -@end example - -@item Load operations -@code{ld} accepts two operands while @code{ldx} accepts three; -in both cases, the last can be either a register or an immediate -value. Values are extended (with or without sign, according to -the data type specification) to fit a whole register. -The @code{_ui} and @code{_l} types are only available in 64-bit -architectures. For convenience, there is a version without a -type modifier for integer or pointer operands that uses the -appropriate wordsize call. -@example -ldr _c _uc _s _us _i _ui _l _f _d O1 = *O2 -ldi _c _uc _s _us _i _ui _l _f _d O1 = *O2 -ldxr _c _uc _s _us _i _ui _l _f _d O1 = *(O2+O3) -ldxi _c _uc _s _us _i _ui _l _f _d O1 = *(O2+O3) -@end example - -@item Store operations -@code{st} accepts two operands while @code{stx} accepts three; in -both cases, the first can be either a register or an immediate -value. Values are sign-extended to fit a whole register. -@example -str _c _uc _s _us _i _ui _l _f _d *O1 = O2 -sti _c _uc _s _us _i _ui _l _f _d *O1 = O2 -stxr _c _uc _s _us _i _ui _l _f _d *(O1+O2) = O3 -stxi _c _uc _s _us _i _ui _l _f _d *(O1+O2) = O3 -@end example -As for the load operations, the @code{_ui} and @code{_l} types are -only available in 64-bit architectures, and for convenience, there -is a version without a type modifier for integer or pointer operands -that uses the appropriate wordsize call. - -@item Argument management -These are: -@example -prepare (not specified) -va_start (not specified) -pushargr _f _d -pushargi _f _d -va_push (not specified) -arg _c _uc _s _us _i _ui _l _f _d -getarg _c _uc _s _us _i _ui _l _f _d -va_arg _d -putargr _f _d -putargi _f _d -ret (not specified) -retr _f _d -reti _f _d -va_end (not specified) -retval _c _uc _s _us _i _ui _l _f _d -epilog (not specified) -@end example -As with other operations that use a type modifier, the @code{_ui} and -@code{_l} types are only available in 64-bit architectures, but there -are operations without a type modifier that alias to the appropriate -integer operation with wordsize operands. - -@code{prepare}, @code{pusharg}, and @code{retval} are used by the caller, -while @code{arg}, @code{getarg} and @code{ret} are used by the callee. -A code snippet that wants to call another procedure and has to pass -arguments must, in order: use the @code{prepare} instruction and use -the @code{pushargr} or @code{pushargi} to push the arguments @strong{in -left to right order}; and use @code{finish} or @code{call} (explained below) -to perform the actual call. - -@code{va_start} returns a @code{C} compatible @code{va_list}. To fetch -arguments, use @code{va_arg} for integers and @code{va_arg_d} for doubles. -@code{va_push} is required when passing a @code{va_list} to another function, -because not all architectures expect it as a single pointer. Known case -is DEC Alpha, that requires it as a structure passed by value. - -@code{arg}, @code{getarg} and @code{putarg} are used by the callee. -@code{arg} is different from other instruction in that it does not -actually generate any code: instead, it is a function which returns -a value to be passed to @code{getarg} or @code{putarg}. @footnote{``Return -a value'' means that @lightning{} code that compile these -instructions return a value when expanded.} You should call -@code{arg} as soon as possible, before any function call or, more -easily, right after the @code{prolog} instructions -(which is treated later). - -@code{getarg} accepts a register argument and a value returned by -@code{arg}, and will move that argument to the register, extending -it (with or without sign, according to the data type specification) -to fit a whole register. These instructions are more intimately -related to the usage of the @lightning{} instruction set in code -that generates other code, so they will be treated more -specifically in @ref{GNU lightning examples, , Generating code at -run-time}. - -@code{putarg} is a mix of @code{getarg} and @code{pusharg} in that -it accepts as first argument a register or immediate, and as -second argument a value returned by @code{arg}. It allows changing, -or restoring an argument to the current function, and is a -construct required to implement tail call optimization. Note that -arguments in registers are very cheap, but will be overwritten -at any moment, including on some operations, for example division, -that on several ports is implemented as a function call. - -Finally, the @code{retval} instruction fetches the return value of a -called function in a register. The @code{retval} instruction takes a -register argument and copies the return value of the previously called -function in that register. A function with a return value should use -@code{retr} or @code{reti} to put the return value in the return register -before returning. @xref{Fibonacci, the Fibonacci numbers}, for an example. - -@code{epilog} is an optional call, that marks the end of a function -body. It is automatically generated by @lightning{} if starting a new -function (what should be done after a @code{ret} call) or finishing -generating jit. -It is very important to note that the fact that @code{epilog} being -optional may cause a common mistake. Consider this: -@example -fun1: - prolog - ... - ret -fun2: - prolog -@end example -Because @code{epilog} is added when finding a new @code{prolog}, -this will cause the @code{fun2} label to actually be before the -return from @code{fun1}. Because @lightning{} will actually -understand it as: -@example -fun1: - prolog - ... - ret -fun2: - epilog - prolog -@end example - -You should observe a few rules when using these macros. First of -all, if calling a varargs function, you should use the @code{ellipsis} -call to mark the position of the ellipsis in the C prototype. - -You should not nest calls to @code{prepare} inside a -@code{prepare/finish} block. Doing this will result in undefined -behavior. Note that for functions with zero arguments you can use -just @code{call}. - -@item Branch instructions -Like @code{arg}, these also return a value which, in this case, -is to be used to compile forward branches as explained in -@ref{Fibonacci, , Fibonacci numbers}. They accept two operands to be -compared; of these, the last can be either a register or an immediate. -They are: -@example -bltr _u _f _d @r{if }(O2 < O3)@r{ goto }O1 -blti _u _f _d @r{if }(O2 < O3)@r{ goto }O1 -bler _u _f _d @r{if }(O2 <= O3)@r{ goto }O1 -blei _u _f _d @r{if }(O2 <= O3)@r{ goto }O1 -bgtr _u _f _d @r{if }(O2 > O3)@r{ goto }O1 -bgti _u _f _d @r{if }(O2 > O3)@r{ goto }O1 -bger _u _f _d @r{if }(O2 >= O3)@r{ goto }O1 -bgei _u _f _d @r{if }(O2 >= O3)@r{ goto }O1 -beqr _f _d @r{if }(O2 == O3)@r{ goto }O1 -beqi _f _d @r{if }(O2 == O3)@r{ goto }O1 -bner _f _d @r{if }(O2 != O3)@r{ goto }O1 -bnei _f _d @r{if }(O2 != O3)@r{ goto }O1 - -bunltr _f _d @r{if }!(O2 >= O3)@r{ goto }O1 -bunler _f _d @r{if }!(O2 > O3)@r{ goto }O1 -bungtr _f _d @r{if }!(O2 <= O3)@r{ goto }O1 -bunger _f _d @r{if }!(O2 < O3)@r{ goto }O1 -buneqr _f _d @r{if }!(O2 < O3) && !(O2 > O3)@r{ goto }O1 -bltgtr _f _d @r{if }!(O2 >= O3) || !(O2 <= O3)@r{ goto }O1 -bordr _f _d @r{if } (O2 == O2) && (O3 == O3)@r{ goto }O1 -bunordr _f _d @r{if }!(O2 != O2) || (O3 != O3)@r{ goto }O1 - -bmsr @r{if }O2 & O3@r{ goto }O1 -bmsi @r{if }O2 & O3@r{ goto }O1 -bmcr @r{if }!(O2 & O3)@r{ goto }O1 -bmci @r{if }!(O2 & O3)@r{ goto }O1@footnote{These mnemonics mean, respectively, @dfn{branch if mask set} and @dfn{branch if mask cleared}.} -boaddr _u O2 += O3@r{, goto }O1@r{ if overflow} -boaddi _u O2 += O3@r{, goto }O1@r{ if overflow} -bxaddr _u O2 += O3@r{, goto }O1@r{ if no overflow} -bxaddi _u O2 += O3@r{, goto }O1@r{ if no overflow} -bosubr _u O2 -= O3@r{, goto }O1@r{ if overflow} -bosubi _u O2 -= O3@r{, goto }O1@r{ if overflow} -bxsubr _u O2 -= O3@r{, goto }O1@r{ if no overflow} -bxsubi _u O2 -= O3@r{, goto }O1@r{ if no overflow} -@end example - -@item Jump and return operations -These accept one argument except @code{ret} and @code{jmpi} which -have none; the difference between @code{finishi} and @code{calli} -is that the latter does not clean the stack from pushed parameters -(if any) and the former must @strong{always} follow a @code{prepare} -instruction. -@example -callr (not specified) @r{function call to register O1} -calli (not specified) @r{function call to immediate O1} -finishr (not specified) @r{function call to register O1} -finishi (not specified) @r{function call to immediate O1} -jmpr (not specified) @r{unconditional jump to register} -jmpi (not specified) @r{unconditional jump} -ret (not specified) @r{return from subroutine} -retr _c _uc _s _us _i _ui _l _f _d -reti _c _uc _s _us _i _ui _l _f _d -retval _c _uc _s _us _i _ui _l _f _d @r{move return value} - @r{to register} -@end example - -Like branch instruction, @code{jmpi} also returns a value which is to -be used to compile forward branches. @xref{Fibonacci, , Fibonacci -numbers}. - -@item Labels -There are 3 @lightning{} instructions to create labels: -@example -label (not specified) @r{simple label} -forward (not specified) @r{forward label} -indirect (not specified) @r{special simple label} -@end example - -@code{label} is normally used as @code{patch_at} argument for backward -jumps. - -@example - jit_node_t *jump, *label; -label = jit_label(); - ... - jump = jit_beqr(JIT_R0, JIT_R1); - jit_patch_at(jump, label); -@end example - -@code{forward} is used to patch code generation before the actual -position of the label is known. - -@example - jit_node_t *jump, *label; -label = jit_forward(); - jump = jit_beqr(JIT_R0, JIT_R1); - jit_patch_at(jump, label); - ... - jit_link(label); -@end example - -@code{indirect} is useful when creating jump tables, and tells -@lightning{} to not optimize out a label that is not the target of -any jump, because an indirect jump may land where it is defined. - -@example - jit_node_t *jump, *label; - ... - jmpr(JIT_R0); @rem{/* may jump to label */} - ... -label = jit_indirect(); -@end example - -@code{indirect} is an special case of @code{note} and @code{name} -because it is a valid argument to @code{address}. - -Note that the usual idiom to write the previous example is -@example - jit_node_t *addr, *jump; -addr = jit_movi(JIT_R0, 0); @rem{/* immediate is ignored */} - ... - jmpr(JIT_R0); - ... - jit_patch(addr); @rem{/* implicit label added */} -@end example - -that automatically binds the implicit label added by @code{patch} with -the @code{movi}, but on some special conditions it is required to create -an "unbound" label. - -@item Function prolog - -These macros are used to set up a function prolog. The @code{allocai} -call accept a single integer argument and returns an offset value -for stack storage access. The @code{allocar} accepts two registers -arguments, the first is set to the offset for stack access, and the -second is the size in bytes argument. - -@example -prolog (not specified) @r{function prolog} -allocai (not specified) @r{reserve space on the stack} -allocar (not specified) @r{allocate space on the stack} -@end example - -@code{allocai} receives the number of bytes to allocate and returns -the offset from the frame pointer register @code{FP} to the base of -the area. - -@code{allocar} receives two register arguments. The first is where -to store the offset from the frame pointer register @code{FP} to the -base of the area. The second argument is the size in bytes. Note -that @code{allocar} is dynamic allocation, and special attention -should be taken when using it. If called in a loop, every iteration -will allocate stack space. Stack space is aligned from 8 to 64 bytes -depending on backend requirements, even if allocating only one byte. -It is advisable to not use it with @code{frame} and @code{tramp}; it -should work with @code{frame} with special care to call only once, -but is not supported if used in @code{tramp}, even if called only -once. - -As a small appetizer, here is a small function that adds 1 to the input -parameter (an @code{int}). I'm using an assembly-like syntax here which -is a bit different from the one used when writing real subroutines with -@lightning{}; the real syntax will be introduced in @xref{GNU lightning -examples, , Generating code at run-time}. - -@example -incr: - prolog -in = arg @rem{! We have an integer argument} - getarg R0, in @rem{! Move it to R0} - addi R0, R0, 1 @rem{! Add 1} - retr R0 @rem{! And return the result} -@end example - -And here is another function which uses the @code{printf} function from -the standard C library to write a number in hexadecimal notation: - -@example -printhex: - prolog -in = arg @rem{! Same as above} - getarg R0, in - prepare @rem{! Begin call sequence for printf} - pushargi "%x" @rem{! Push format string} - ellipsis @rem{! Varargs start here} - pushargr R0 @rem{! Push second argument} - finishi printf @rem{! Call printf} - ret @rem{! Return to caller} -@end example - -@item Trampolines, continuations and tail call optimization - -Frequently it is required to generate jit code that must jump to -code generated later, possibly from another @code{jit_context_t}. -These require compatible stack frames. - -@lightning{} provides two primitives from where trampolines, -continuations and tail call optimization can be implemented. - -@example -frame (not specified) @r{create stack frame} -tramp (not specified) @r{assume stack frame} -@end example - -@code{frame} receives an integer argument@footnote{It is not -automatically computed because it does not know about the -requirement of later generated code.} that defines the size in -bytes for the stack frame of the current, @code{C} callable, -jit function. To calculate this value, a good formula is maximum -number of arguments to any called native function times -eight@footnote{Times eight so that it works for double arguments. -And would not need conditionals for ports that pass arguments in -the stack.}, plus the sum of the arguments to any call to -@code{jit_allocai}. @lightning{} automatically adjusts this value -for any backend specific stack memory it may need, or any -alignment constraint. - -@code{frame} also instructs @lightning{} to save all callee -save registers in the prolog and reload in the epilog. - -@example -main: @rem{! jit entry point} - prolog @rem{! function prolog} - frame 256 @rem{! save all callee save registers and} - @rem{! reserve at least 256 bytes in stack} -main_loop: - ... - jmpi handler @rem{! jumps to external code} - ... - ret @rem{! return to the caller} -@end example - -@code{tramp} differs from @code{frame} only that a prolog and epilog -will not be generated. Note that @code{prolog} must still be used. -The code under @code{tramp} must be ready to be entered with a jump -at the prolog position, and instead of a return, it must end with -a non conditional jump. @code{tramp} exists solely for the fact -that it allows optimizing out prolog and epilog code that would -never be executed. - -@example -handler: @rem{! handler entry point} - prolog @rem{! function prolog} - tramp 256 @rem{! assumes all callee save registers} - @rem{! are saved and there is at least} - @rem{! 256 bytes in stack} - ... - jmpi main_loop @rem{! return to the main loop} -@end example - -@lightning{} only supports Tail Call Optimization using the -@code{tramp} construct. Any other way is not guaranteed to -work on all ports. - -An example of a simple (recursive) tail call optimization: - -@example -factorial: @rem{! Entry point of the factorial function} - prolog -in = arg @rem{! Receive an integer argument} - getarg R0, in @rem{! Move argument to RO} - prepare - pushargi 1 @rem{! This is the accumulator} - pushargr R0 @rem{! This is the argument} - finishi fact @rem{! Call the tail call optimized function} - retval R0 @rem{! Fetch the result} - retr R0 @rem{! Return it} - epilog @rem{! Epilog *before* label before prolog} - -fact: @rem{! Entry point of the helper function} - prolog - frame 16 @rem{! Reserve 16 bytes in the stack} -fact_entry: @rem{! This is the tail call entry point} -ac = arg @rem{! The accumulator is the first argument} -in = arg @rem{! The factorial argument} - getarg R0, ac @rem{! Move the accumulator to R0} - getarg R1, in @rem{! Move the argument to R1} - blei fact_out, R1, 1 @rem{! Done if argument is one or less} - mulr R0, R0, R1 @rem{! accumulator *= argument} - putargr R0, ac @rem{! Update the accumulator} - subi R1, R1, 1 @rem{! argument -= 1} - putargr R1, in @rem{! Update the argument} - jmpi fact_entry @rem{! Tail Call Optimize it!} -fact_out: - retr R0 @rem{! Return the accumulator} -@end example - -@item Predicates -@example -forward_p (not specified) @r{forward label predicate} -indirect_p (not specified) @r{indirect label predicate} -target_p (not specified) @r{used label predicate} -arg_register_p (not specified) @r{argument kind predicate} -callee_save_p (not specified) @r{callee save predicate} -pointer_p (not specified) @r{pointer predicate} -@end example - -@code{forward_p} expects a @code{jit_node_t*} argument, and -returns non zero if it is a forward label reference, that is, -a label returned by @code{forward}, that still needs a -@code{link} call. - -@code{indirect_p} expects a @code{jit_node_t*} argument, and returns -non zero if it is an indirect label reference, that is, a label that -was returned by @code{indirect}. - -@code{target_p} expects a @code{jit_node_t*} argument, that is any -kind of label, and will return non zero if there is at least one -jump or move referencing it. - -@code{arg_register_p} expects a @code{jit_node_t*} argument, that must -have been returned by @code{arg}, @code{arg_f} or @code{arg_d}, and -will return non zero if the argument lives in a register. This call -is useful to know the live range of register arguments, as those -are very fast to read and write, but have volatile values. - -@code{callee_save_p} exects a valid @code{JIT_Rn}, @code{JIT_Vn}, or -@code{JIT_Fn}, and will return non zero if the register is callee -save. This call is useful because on several ports, the @code{JIT_Rn} -and @code{JIT_Fn} registers are actually callee save; no need -to save and load the values when making function calls. - -@code{pointer_p} expects a pointer argument, and will return non -zero if the pointer is inside the generated jit code. Must be -called after @code{jit_emit} and before @code{jit_destroy_state}. -@end table - -@node GNU lightning examples -@chapter Generating code at run-time - -To use @lightning{}, you should include the @file{lightning.h} file that -is put in your include directory by the @samp{make install} command. - -Each of the instructions above translates to a macro or function call. -All you have to do is prepend @code{jit_} (lowercase) to opcode names -and @code{JIT_} (uppercase) to register names. Of course, parameters -are to be put between parentheses. - -This small tutorial presents three examples: - -@iftex -@itemize @bullet -@item -The @code{incr} function found in @ref{The instruction set, , -@lightning{}'s instruction set}: - -@item -A simple function call to @code{printf} - -@item -An RPN calculator. - -@item -Fibonacci numbers -@end itemize -@end iftex -@ifnottex -@menu -* incr:: A function which increments a number by one -* printf:: A simple function call to printf -* RPN calculator:: A more complex example, an RPN calculator -* Fibonacci:: Calculating Fibonacci numbers -@end menu -@end ifnottex - -@node incr -@section A function which increments a number by one - -Let's see how to create and use the sample @code{incr} function created -in @ref{The instruction set, , @lightning{}'s instruction set}: - -@example -#include <stdio.h> -#include <lightning.h> - -static jit_state_t *_jit; - -typedef int (*pifi)(int); @rem{/* Pointer to Int Function of Int */} - -int main(int argc, char *argv[]) -@{ - jit_node_t *in; - pifi incr; - - init_jit(argv[0]); - _jit = jit_new_state(); - - jit_prolog(); @rem{/* @t{ prolog } */} - in = jit_arg(); @rem{/* @t{ in = arg } */} - jit_getarg(JIT_R0, in); @rem{/* @t{ getarg R0 } */} - jit_addi(JIT_R0, JIT_R0, 1); @rem{/* @t{ addi R0@comma{} R0@comma{} 1 } */} - jit_retr(JIT_R0); @rem{/* @t{ retr R0 } */} - - incr = jit_emit(); - jit_clear_state(); - - @rem{/* call the generated code@comma{} passing 5 as an argument */} - printf("%d + 1 = %d\n", 5, incr(5)); - - jit_destroy_state(); - finish_jit(); - return 0; -@} -@end example - -Let's examine the code line by line (well, almost@dots{}): - -@table @t -@item #include <lightning.h> -You already know about this. It defines all of @lightning{}'s macros. - -@item static jit_state_t *_jit; -You might wonder about what is @code{jit_state_t}. It is a structure -that stores jit code generation information. The name @code{_jit} is -special, because since multiple jit generators can run at the same -time, you must either @r{#define _jit my_jit_state} or name it -@code{_jit}. - -@item typedef int (*pifi)(int); -Just a handy typedef for a pointer to a function that takes an -@code{int} and returns another. - -@item jit_node_t *in; -Declares a variable to hold an identifier for a function argument. It -is an opaque pointer, that will hold the return of a call to @code{arg} -and be used as argument to @code{getarg}. - -@item pifi incr; -Declares a function pointer variable to a function that receives an -@code{int} and returns an @code{int}. - -@item init_jit(argv[0]); -You must call this function before creating a @code{jit_state_t} -object. This function does global state initialization, and may need -to detect CPU or Operating System features. It receives a string -argument that is later used to read symbols from a shared object using -GNU binutils if disassembly was enabled at configure time. If no -disassembly will be performed a NULL pointer can be used as argument. - -@item _jit = jit_new_state(); -This call initializes a @lightning{} jit state. - -@item jit_prolog(); -Ok, so we start generating code for our beloved function@dots{} - -@item in = jit_arg(); -@itemx jit_getarg(JIT_R0, in); -We retrieve the first (and only) argument, an integer, and store it -into the general-purpose register @code{R0}. - -@item jit_addi(JIT_R0, JIT_R0, 1); -We add one to the content of the register. - -@item jit_retr(JIT_R0); -This instruction generates a standard function epilog that returns -the contents of the @code{R0} register. - -@item incr = jit_emit(); -This instruction is very important. It actually translates the -@lightning{} macros used before to machine code, flushes the generated -code area out of the processor's instruction cache and return a -pointer to the start of the code. - -@item jit_clear_state(); -This call cleanups any data not required for jit execution. Note -that it must be called after any call to @code{jit_print} or -@code{jit_address}, as this call destroy the @lightning{} -intermediate representation. - -@item printf("%d + 1 = %d", 5, incr(5)); -Calling our function is this simple---it is not distinguishable from -a normal C function call, the only difference being that @code{incr} -is a variable. - -@item jit_destroy_state(); -Releases all memory associated with the jit context. It should be -called after known the jit will no longer be called. - -@item finish_jit(); -This call cleanups any global state hold by @lightning{}, and is -advisable to call it once jit code will no longer be generated. -@end table - -@lightning{} abstracts two phases of dynamic code generation: selecting -instructions that map the standard representation, and emitting binary -code for these instructions. The client program has the responsibility -of describing the code to be generated using the standard @lightning{} -instruction set. - -Let's examine the code generated for @code{incr} on the SPARC and x86_64 -architecture (on the right is the code that an assembly-language -programmer would write): - -@table @b -@item SPARC -@example - save %sp, -112, %sp - mov %i0, %g2 retl - inc %g2 inc %o0 - mov %g2, %i0 - restore - retl - nop -@end example -In this case, @lightning{} introduces overhead to create a register -window (not knowing that the procedure is a leaf procedure) and to -move the argument to the general purpose register @code{R0} (which -maps to @code{%g2} on the SPARC). -@end table - -@table @b -@item x86_64 -@example - sub $0x30,%rsp - mov %rbp,(%rsp) - mov %rsp,%rbp - sub $0x18,%rsp - mov %rdi,%rax mov %rdi, %rax - add $0x1,%rax inc %rax - mov %rbp,%rsp - mov (%rsp),%rbp - add $0x30,%rsp - retq retq -@end example -In this case, the main overhead is due to the function's prolog and -epilog, and stack alignment after reserving stack space for word -to/from float conversions or moving data from/to x87 to/from SSE. -Note that besides allocating space to save callee saved registers, -no registers are saved/restored because @lightning{} notices those -registers are not modified. There is currently no logic to detect -if it needs to allocate stack space for type conversions neither -proper leaf function detection, but these are subject to change -(FIXME). -@end table - -@node printf -@section A simple function call to @code{printf} - -Again, here is the code for the example: - -@example -#include <stdio.h> -#include <lightning.h> - -static jit_state_t *_jit; - -typedef void (*pvfi)(int); @rem{/* Pointer to Void Function of Int */} - -int main(int argc, char *argv[]) -@{ - pvfi myFunction; @rem{/* ptr to generated code */} - jit_node_t *start, *end; @rem{/* a couple of labels */} - jit_node_t *in; @rem{/* to get the argument */} - - init_jit(argv[0]); - _jit = jit_new_state(); - - start = jit_note(__FILE__, __LINE__); - jit_prolog(); - in = jit_arg(); - jit_getarg(JIT_R1, in); - jit_pushargi((jit_word_t)"generated %d bytes\n"); - jit_ellipsis(); - jit_pushargr(JIT_R1); - jit_finishi(printf); - jit_ret(); - jit_epilog(); - end = jit_note(__FILE__, __LINE__); - - myFunction = jit_emit(); - - @rem{/* call the generated code@comma{} passing its size as argument */} - myFunction((char*)jit_address(end) - (char*)jit_address(start)); - jit_clear_state(); - - jit_disassemble(); - - jit_destroy_state(); - finish_jit(); - return 0; -@} -@end example - -The function shows how many bytes were generated. Most of the code -is not very interesting, as it resembles very closely the program -presented in @ref{incr, , A function which increments a number by one}. - -For this reason, we're going to concentrate on just a few statements. - -@table @t -@item start = jit_note(__FILE__, __LINE__); -@itemx @r{@dots{}} -@itemx end = jit_note(__FILE__, __LINE__); -These two instruction call the @code{jit_note} macro, which creates -a note in the jit code; arguments to @code{jit_note} usually are a -filename string and line number integer, but using NULL for the -string argument is perfectly valid if only need to create a simple -marker in the code. - -@item jit_ellipsis(); -@code{ellipsis} usually is only required if calling varargs functions -with double arguments, but it is a good practice to properly describe -the @r{@dots{}} in the call sequence. - -@item jit_pushargi((jit_word_t)"generated %d bytes\n"); -Note the use of the @code{(jit_word_t)} cast, that is used only -to avoid a compiler warning, due to using a pointer where a -wordsize integer type was expected. - -@item jit_prepare(); -@itemx @r{@dots{}} -@itemx jit_finishi(printf); -Once the arguments to @code{printf} have been pushed, what means -moving them to stack or register arguments, the @code{printf} -function is called and the stack cleaned. Note how @lightning{} -abstracts the differences between different architectures and -ABI's -- the client program does not know how parameter passing -works on the host architecture. - -@item jit_epilog(); -Usually it is not required to call @code{epilog}, but because it -is implicitly called when noticing the end of a function, if the -@code{end} variable was set with a @code{note} call after the -@code{ret}, it would not consider the function epilog. - -@item myFunction((char*)jit_address(end) - (char*)jit_address(start)); -This calls the generate jit function passing as argument the offset -difference from the @code{start} and @code{end} notes. The @code{address} -call must be done after the @code{emit} call or either a fatal error -will happen (if @lightning{} is built with assertions enable) or an -undefined value will be returned. - -@item jit_clear_state(); -Note that @code{jit_clear_state} was called after executing jit in -this example. It was done because it must be called after any call -to @code{jit_address} or @code{jit_print}. - -@item jit_disassemble(); -@code{disassemble} will dump the generated code to standard output, -unless @lightning{} was built with the disassembler disabled, in which -case no output will be shown. -@end table - -@node RPN calculator -@section A more complex example, an RPN calculator - -We create a small stack-based RPN calculator which applies a series -of operators to a given parameter and to other numeric operands. -Unlike previous examples, the code generator is fully parameterized -and is able to compile different formulas to different functions. -Here is the code for the expression compiler; a sample usage will -follow. - -Since @lightning{} does not provide push/pop instruction, this -example uses a stack-allocated area to store the data. Such an -area can be allocated using the macro @code{allocai}, which -receives the number of bytes to allocate and returns the offset -from the frame pointer register @code{FP} to the base of the -area. - -Usually, you will use the @code{ldxi} and @code{stxi} instruction -to access stack-allocated variables. However, it is possible to -use operations such as @code{add} to compute the address of the -variables, and pass the address around. - -@example -#include <stdio.h> -#include <lightning.h> - -typedef int (*pifi)(int); @rem{/* Pointer to Int Function of Int */} - -static jit_state_t *_jit; - -void stack_push(int reg, int *sp) -@{ - jit_stxi_i (*sp, JIT_FP, reg); - *sp += sizeof (int); -@} - -void stack_pop(int reg, int *sp) -@{ - *sp -= sizeof (int); - jit_ldxi_i (reg, JIT_FP, *sp); -@} - -jit_node_t *compile_rpn(char *expr) -@{ - jit_node_t *in, *fn; - int stack_base, stack_ptr; - - fn = jit_note(NULL, 0); - jit_prolog(); - in = jit_arg(); - stack_ptr = stack_base = jit_allocai (32 * sizeof (int)); - - jit_getarg_i(JIT_R2, in); - - while (*expr) @{ - char buf[32]; - int n; - if (sscanf(expr, "%[0-9]%n", buf, &n)) @{ - expr += n - 1; - stack_push(JIT_R0, &stack_ptr); - jit_movi(JIT_R0, atoi(buf)); - @} else if (*expr == 'x') @{ - stack_push(JIT_R0, &stack_ptr); - jit_movr(JIT_R0, JIT_R2); - @} else if (*expr == '+') @{ - stack_pop(JIT_R1, &stack_ptr); - jit_addr(JIT_R0, JIT_R1, JIT_R0); - @} else if (*expr == '-') @{ - stack_pop(JIT_R1, &stack_ptr); - jit_subr(JIT_R0, JIT_R1, JIT_R0); - @} else if (*expr == '*') @{ - stack_pop(JIT_R1, &stack_ptr); - jit_mulr(JIT_R0, JIT_R1, JIT_R0); - @} else if (*expr == '/') @{ - stack_pop(JIT_R1, &stack_ptr); - jit_divr(JIT_R0, JIT_R1, JIT_R0); - @} else @{ - fprintf(stderr, "cannot compile: %s\n", expr); - abort(); - @} - ++expr; - @} - jit_retr(JIT_R0); - jit_epilog(); - return fn; -@} -@end example - -The principle on which the calculator is based is easy: the stack top -is held in R0, while the remaining items of the stack are held in the -memory area that we allocate with @code{allocai}. Compiling a numeric -operand or the argument @code{x} pushes the old stack top onto the -stack and moves the operand into R0; compiling an operator pops the -second operand off the stack into R1, and compiles the operation so -that the result goes into R0, thus becoming the new stack top. - -This example allocates a fixed area for 32 @code{int}s. This is not -a problem when the function is a leaf like in this case; in a full-blown -compiler you will want to analyze the input and determine the number -of needed stack slots---a very simple example of register allocation. -The area is then managed like a stack using @code{stack_push} and -@code{stack_pop}. - -Source code for the client (which lies in the same source file) follows: - -@example -int main(int argc, char *argv[]) -@{ - jit_node_t *nc, *nf; - pifi c2f, f2c; - int i; - - init_jit(argv[0]); - _jit = jit_new_state(); - - nc = compile_rpn("32x9*5/+"); - nf = compile_rpn("x32-5*9/"); - (void)jit_emit(); - c2f = (pifi)jit_address(nc); - f2c = (pifi)jit_address(nf); - jit_clear_state(); - - printf("\nC:"); - for (i = 0; i <= 100; i += 10) printf("%3d ", i); - printf("\nF:"); - for (i = 0; i <= 100; i += 10) printf("%3d ", c2f(i)); - printf("\n"); - - printf("\nF:"); - for (i = 32; i <= 212; i += 18) printf("%3d ", i); - printf("\nC:"); - for (i = 32; i <= 212; i += 18) printf("%3d ", f2c(i)); - printf("\n"); - - jit_destroy_state(); - finish_jit(); - return 0; -@} -@end example - -The client displays a conversion table between Celsius and Fahrenheit -degrees (both Celsius-to-Fahrenheit and Fahrenheit-to-Celsius). The -formulas are, @math{F(c) = c*9/5+32} and @math{C(f) = (f-32)*5/9}, -respectively. - -Providing the formula as an argument to @code{compile_rpn} effectively -parameterizes code generation, making it possible to use the same code -to compile different functions; this is what makes dynamic code -generation so powerful. - -@node Fibonacci -@section Fibonacci numbers - -The code in this section calculates the Fibonacci sequence. That is -modeled by the recurrence relation: -@display - f(0) = 0 - f(1) = f(2) = 1 - f(n) = f(n-1) + f(n-2) -@end display - -The purpose of this example is to introduce branches. There are two -kind of branches: backward branches and forward branches. We'll -present the calculation in a recursive and iterative form; the -former only uses forward branches, while the latter uses both. - -@example -#include <stdio.h> -#include <lightning.h> - -static jit_state_t *_jit; - -typedef int (*pifi)(int); @rem{/* Pointer to Int Function of Int */} - -int main(int argc, char *argv[]) -@{ - pifi fib; - jit_node_t *label; - jit_node_t *call; - jit_node_t *in; @rem{/* offset of the argument */} - jit_node_t *ref; @rem{/* to patch the forward reference */} - jit_node_t *zero; @rem{/* to patch the forward reference */} - - init_jit(argv[0]); - _jit = jit_new_state(); - - label = jit_label(); - jit_prolog (); - in = jit_arg (); - jit_getarg (JIT_V0, in); @rem{/* R0 = n */} - zero = jit_beqi (JIT_R0, 0); - jit_movr (JIT_V0, JIT_R0); /* V0 = R0 */ - jit_movi (JIT_R0, 1); - ref = jit_blei (JIT_V0, 2); - jit_subi (JIT_V1, JIT_V0, 1); @rem{/* V1 = n-1 */} - jit_subi (JIT_V2, JIT_V0, 2); @rem{/* V2 = n-2 */} - jit_prepare(); - jit_pushargr(JIT_V1); - call = jit_finishi(NULL); - jit_patch_at(call, label); - jit_retval(JIT_V1); @rem{/* V1 = fib(n-1) */} - jit_prepare(); - jit_pushargr(JIT_V2); - call = jit_finishi(NULL); - jit_patch_at(call, label); - jit_retval(JIT_R0); @rem{/* R0 = fib(n-2) */} - jit_addr(JIT_R0, JIT_R0, JIT_V1); @rem{/* R0 = R0 + V1 */} - - jit_patch(ref); @rem{/* patch jump */} - jit_patch(zero); @rem{/* patch jump */} - jit_retr(JIT_R0); - - @rem{/* call the generated code@comma{} passing 32 as an argument */} - fib = jit_emit(); - jit_clear_state(); - printf("fib(%d) = %d\n", 32, fib(32)); - jit_destroy_state(); - finish_jit(); - return 0; -@} -@end example - -As said above, this is the first example of dynamically compiling -branches. Branch instructions have two operands containing the -values to be compared, and return a @code{jit_note_t *} object -to be patched. - -Because labels final address are only known after calling @code{emit}, -it is required to call @code{patch} or @code{patch_at}, what does -tell @lightning{} that the target to patch is actually a pointer to -a @code{jit_node_t *} object, otherwise, it would assume that is -a pointer to a C function. Note that conditional branches do not -receive a label argument, so they must be patched. - -You need to call @code{patch_at} on the return of value @code{calli}, -@code{finishi}, and @code{calli} if it is actually referencing a label -in the jit code. All branch instructions do not receive a label -argument. Note that @code{movi} is an special case, and patching it -is usually done to get the final address of a label, usually to later -call @code{jmpr}. - -Now, here is the iterative version: - -@example -#include <stdio.h> -#include <lightning.h> - -static jit_state_t *_jit; - -typedef int (*pifi)(int); @rem{/* Pointer to Int Function of Int */} - -int main(int argc, char *argv[]) -@{ - pifi fib; - jit_node_t *in; @rem{/* offset of the argument */} - jit_node_t *ref; @rem{/* to patch the forward reference */} - jit_node_t *zero; @rem{/* to patch the forward reference */} - jit_node_t *jump; @rem{/* jump to start of loop */} - jit_node_t *loop; @rem{/* start of the loop */} - - init_jit(argv[0]); - _jit = jit_new_state(); - - jit_prolog (); - in = jit_arg (); - jit_getarg (JIT_R0, in); @rem{/* R0 = n */} - zero = jit_beqi (JIT_R0, 0); - jit_movr (JIT_R1, JIT_R0); - jit_movi (JIT_R0, 1); - ref = jit_blti (JIT_R1, 2); - jit_subi (JIT_R2, JIT_R2, 2); - jit_movr (JIT_R1, JIT_R0); - - loop= jit_label(); - jit_subi (JIT_R2, JIT_R2, 1); @rem{/* decr. counter */} - jit_movr (JIT_V0, JIT_R0); /* V0 = R0 */ - jit_addr (JIT_R0, JIT_R0, JIT_R1); /* R0 = R0 + R1 */ - jit_movr (JIT_R1, JIT_V0); /* R1 = V0 */ - jump= jit_bnei (JIT_R2, 0); /* if (R2) goto loop; */ - jit_patch_at(jump, loop); - - jit_patch(ref); @rem{/* patch forward jump */} - jit_patch(zero); @rem{/* patch forward jump */} - jit_retr (JIT_R0); - - @rem{/* call the generated code@comma{} passing 36 as an argument */} - fib = jit_emit(); - jit_clear_state(); - printf("fib(%d) = %d\n", 36, fib(36)); - jit_destroy_state(); - finish_jit(); - return 0; -@} -@end example - -This code calculates the recurrence relation using iteration (a -@code{for} loop in high-level languages). There are no function -calls anymore: instead, there is a backward jump (the @code{bnei} at -the end of the loop). - -Note that the program must remember the address for backward jumps; -for forward jumps it is only required to remember the jump code, -and call @code{patch} for the implicit label. - -@node Reentrancy -@chapter Re-entrant usage of @lightning{} - -@lightning{} uses the special @code{_jit} identifier. To be able -to be able to use multiple jit generation states at the same -time, it is required to used code similar to: - -@example - struct jit_state lightning; - #define lightning _jit -@end example - -This will cause the symbol defined to @code{_jit} to be passed as -the first argument to the underlying @lightning{} implementation, -that is usually a function with an @code{_} (underscode) prefix -and with an argument named @code{_jit}, in the pattern: - -@example - static void _jit_mnemonic(jit_state_t *, jit_gpr_t, jit_gpr_t); - #define jit_mnemonic(u, v) _jit_mnemonic(_jit, u, v); -@end example - -The reason for this is to use the same syntax as the initial lightning -implementation and to avoid needing the user to keep adding an extra -argument to every call, as multiple jit states generating code in -paralell should be very uncommon. - -@section Registers -@chapter Accessing the whole register file - -As mentioned earlier in this chapter, all @lightning{} back-ends are -guaranteed to have at least six general-purpose integer registers and -six floating-point registers, but many back-ends will have more. - -To access the entire register files, you can use the -@code{JIT_R}, @code{JIT_V} and @code{JIT_F} macros. They -accept a parameter that identifies the register number, which -must be strictly less than @code{JIT_R_NUM}, @code{JIT_V_NUM} -and @code{JIT_F_NUM} respectively; the number need not be -constant. Of course, expressions like @code{JIT_R0} and -@code{JIT_R(0)} denote the same register, and likewise for -integer callee-saved, or floating-point, registers. - -@node Customizations -@chapter Customizations - -Frequently it is desirable to have more control over how code is -generated or how memory is used during jit generation or execution. - -@section Memory functions -To aid in complete control of memory allocation and deallocation -@lightning{} provides wrappers that default to standard @code{malloc}, -@code{realloc} and @code{free}. These are loosely based on the -GNU GMP counterparts, with the difference that they use the same -prototype of the system allocation functions, that is, no @code{size} -for @code{free} or @code{old_size} for @code{realloc}. - -@deftypefun void jit_set_memory_functions (@* void *(*@var{alloc_func_ptr}) (size_t), @* void *(*@var{realloc_func_ptr}) (void *, size_t), @* void (*@var{free_func_ptr}) (void *)) -@lightning{} guarantees that memory is only allocated or released -using these wrapped functions, but you must note that if lightning -was linked to GNU binutils, malloc is probably will be called multiple -times from there when initializing the disassembler. - -Because @code{init_jit} may call memory functions, if you need to call -@code{jit_set_memory_functions}, it must be called before @code{init_jit}, -otherwise, when calling @code{finish_jit}, a pointer allocated with the -previous or default wrappers will be passed. -@end deftypefun - -@deftypefun void jit_get_memory_functions (@* void *(**@var{alloc_func_ptr}) (size_t), @* void *(**@var{realloc_func_ptr}) (void *, size_t), @* void (**@var{free_func_ptr}) (void *)) -Get the current memory allocation function. Also, unlike the GNU GMP -counterpart, it is an error to pass @code{NULL} pointers as arguments. -@end deftypefun - -@section Alternate code buffer -To instruct @lightning{} to use an alternate code buffer it is required -to call @code{jit_realize} before @code{jit_emit}, and then query states -and customize as appropriate. - -@deftypefun void jit_realize () -Must be called once, before @code{jit_emit}, to instruct @lightning{} -that no other @code{jit_xyz} call will be made. -@end deftypefun - -@deftypefun jit_pointer_t jit_get_code (jit_word_t *@var{code_size}) -Returns NULL or the previous value set with @code{jit_set_code}, and -sets the @var{code_size} argument to an appropriate value. -If @code{jit_get_code} is called before @code{jit_emit}, the -@var{code_size} argument is set to the expected amount of bytes -required to generate code. -If @code{jit_get_code} is called after @code{jit_emit}, the -@var{code_size} argument is set to the exact amount of bytes used -by the code. -@end deftypefun - -@deftypefun void jit_set_code (jit_ponter_t @var{code}, jit_word_t @var{size}) -Instructs @lightning{} to output to the @var{code} argument and -use @var{size} as a guard to not write to invalid memory. If during -@code{jit_emit} @lightning{} finds out that the code would not fit -in @var{size} bytes, it halts code emit and returns @code{NULL}. -@end deftypefun - -A simple example of a loop using an alternate buffer is: - -@example - jit_uint8_t *code; - int *(func)(int); @rem{/* function pointer */} - jit_word_t code_size; - jit_word_t real_code_size; - @rem{...} - jit_realize(); @rem{/* ready to generate code */} - jit_get_code(&code_size); @rem{/* get expected code size */} - code_size = (code_size + 4095) & -4096; - do (;;) @{ - code = mmap(NULL, code_size, PROT_EXEC | PROT_READ | PROT_WRITE, - MAP_PRIVATE | MAP_ANON, -1, 0); - jit_set_code(code, code_size); - if ((func = jit_emit()) == NULL) @{ - munmap(code, code_size); - code_size += 4096; - @} - @} while (func == NULL); - jit_get_code(&real_code_size); @rem{/* query exact size of the code */} -@end example - -The first call to @code{jit_get_code} should return @code{NULL} and set -the @code{code_size} argument to the expected amount of bytes required -to emit code. -The second call to @code{jit_get_code} is after a successful call to -@code{jit_emit}, and will return the value previously set with -@code{jit_set_code} and set the @code{real_code_size} argument to the -exact amount of bytes used to emit the code. - -@section Alternate data buffer -Sometimes it may be desirable to customize how, or to prevent -@lightning{} from using an extra buffer for constants or debug -annotation. Usually when also using an alternate code buffer. - -@deftypefun jit_pointer_t jit_get_data (jit_word_t *@var{data_size}, jit_word_t *@var{note_size}) -Returns @code{NULL} or the previous value set with @code{jit_set_data}, -and sets the @var{data_size} argument to how many bytes are required -for the constants data buffer, and @var{note_size} to how many bytes -are required to store the debug note information. -Note that it always preallocate one debug note entry even if -@code{jit_name} or @code{jit_note} are never called, but will return -zero in the @var{data_size} argument if no constant is required; -constants are only used for the @code{float} and @code{double} operations -that have an immediate argument, and not in all @lightning{} ports. -@end deftypefun - -@deftypefun void jit_set_data (jit_pointer_t @var{data}, jit_word_t @var{size}, jit_word_t @var{flags}) - -@var{data} can be NULL if disabling constants and annotations, otherwise, -a valid pointer must be passed. An assertion is done that the data will -fit in @var{size} bytes (but that is a noop if @lightning{} was built -with @code{-DNDEBUG}). - -@var{size} tells the space in bytes available in @var{data}. - -@var{flags} can be zero to tell to just use the alternate data buffer, -or a composition of @code{JIT_DISABLE_DATA} and @code{JIT_DISABLE_NOTE} - -@table @t -@item JIT_DISABLE_DATA -@cindex JIT_DISABLE_DATA -Instructs @lightning{} to not use a constant table, but to use an -alternate method to synthesize those, usually with a larger code -sequence using stack space to transfer the value from a GPR to a -FPR register. - -@item JIT_DISABLE_NOTE -@cindex JIT_DISABLE_NOTE -Instructs @lightning{} to not store file or function name, and -line numbers in the constant buffer. -@end table -@end deftypefun - -A simple example of a preventing usage of a data buffer is: - -@example - @rem{...} - jit_realize(); @rem{/* ready to generate code */} - jit_get_data(NULL, NULL); - jit_set_data(NULL, 0, JIT_DISABLE_DATA | JIT_DISABLE_NOTE); - @rem{...} -@end example - -Or to only use a data buffer, if required: - -@example - jit_uint8_t *data; - jit_word_t data_size; - @rem{...} - jit_realize(); @rem{/* ready to generate code */} - jit_get_data(&data_size, NULL); - if (data_size) - data = malloc(data_size); - else - data = NULL; - jit_set_data(data, data_size, JIT_DISABLE_NOTE); - @rem{...} - if (data) - free(data); - @rem{...} -@end example - -@node Acknowledgements -@chapter Acknowledgements - -As far as I know, the first general-purpose portable dynamic code -generator is @sc{dcg}, by Dawson R.@: Engler and T.@: A.@: Proebsting. -Further work by Dawson R. Engler resulted in the @sc{vcode} system; -unlike @sc{dcg}, @sc{vcode} used no intermediate representation and -directly inspired @lightning{}. - -Thanks go to Ian Piumarta, who kindly accepted to release his own -program @sc{ccg} under the GNU General Public License, thereby allowing -@lightning{} to use the run-time assemblers he had wrote for @sc{ccg}. -@sc{ccg} provides a way of dynamically assemble programs written in the -underlying architecture's assembly language. So it is not portable, -yet very interesting. - -I also thank Steve Byrne for writing GNU Smalltalk, since @lightning{} -was first developed as a tool to be used in GNU Smalltalk's dynamic -translator from bytecodes to native code. - -@c %**end of header (This is for running Texinfo on a region.) - -@c *********************************************************************** - -@bye |