Using Compiled Code
Euler has a powerful programming language. But this language is restricted to numerical programming in the context of a matrix language. Like Matlab, and to a large extent even Python, it is the wrong choice when efficient code is needed.
To make faster algorithms available, Euler can load functions from a DLL (dynamic link library). These libraries can be written in any language that can produce a DLL. E.g., Peter Notebaert compiled the LPSOLVE program to a DLL for Euler.
Of course, this section is for experienced programmers. However, small C functions are not difficult to implement.
Tiny C Compiler
To make it easy for programmers to add C code to Euler, Euler includes the Tiny C Compiler (TCC) by Fabrice Bellard. This compiler does some optimizations, and the speed of its code is usually well within the range of optimizing compilers. It is very small, and even supports the Windows API.
The syntax is an extended C. The integer size is 32 bits, but there is support for a 64-bit long long type.
The 64-bit version of Euler includes the 64-bit version of Tiny C.
TinyC Functions
The easiest way to use C code in Euler is to write TinyC functions. These functions look like ordinary functions, but have the tinyc flag, and contain C code and macros in the body. When tinyc functions are defined, Euler generates a C file containing the C definition of these functions.
Here is an example.
>function tinyc agm (a, b) ...
$  ## computes the arithmetic geometric mean
$  ARG_DOUBLE(a); ARG_DOUBLE(b);
$  CHECK(a>0 && b>0,"Need positive real numbers.");
$  double a1,b1;
$  while (1)
$  {
$    a1=(a+b)/2; b1=sqrt(a*b);
$    if (fabs(a1-b1)/a1<1e-14) break;
$    a=a1; b=b1;
$    if (test_key()) ERROR("User Interrupt!");
$  }
$  new_real(a1);
$  endfunction
You find this example in the introduction to compiled code. Here is a short explanation of this code.
- ARG_DOUBLE(a) and ARG_DOUBLE(b) tell EMT to take the two parameters from the EMT stack and define them as C variables. These macros also check that the given arguments are indeed real values. Note that the number of parameters is already checked when the function is called.
- CHECK is another macro that does a check to assert the correctness of the parameters.
- test_key() tests if the user pressed any key.
- new_real(a1) puts the value of a1 onto the EMT stack. This is interpreted as the result of the function.
The function can be loaded like all Euler functions, but it is handled very differently. The following happens:
- The number of parameters is taken from the definition. The names do not matter. They are only for documentation.
- The comment lines for the function are taken from the definition.
- The rest of the code is a mixture of C macros and C code, which is used to generate the C file "agm.c".
- The function must end with "endfunction".
- The TinyC compiler is invoked to compile "agm.c" to "agm.dll".
- The function agm with two parameters is loaded from the DLL.
- A dummy comment function agm(a,b) is generated in the Euler memory for documentation, using the saved comment lines. Consequently, the status line, the help window, and the help command work as expected.
The file "agm.c" will contain the necessary C includes and the startup code. The C code provided in the TinyC function is only the body of the function agm() in this C file.
Note that all this takes place in the active directory, which is usually the directory of the notebook. You should save the notebook into a writable directory. Alternatively, you can make the Euler directory in the user home directory active and switch back afterwards with the following commands.
>cd(eulerhome());
>...
>cd(home());
Then the files "agm.c" and "agm.dll" will be in the directory "%USERPROFILE%\Euler". Usually, this will be "C:\Users\username\Euler".
Macros
To make it easy for the casual user, the generated C code includes the header file "dlldef.h", which contains declarations of functions in "dlldef.c" and some macros. Both files are located in the "dll" subdirectory of the installation of Euler. You can study the files. But you need to give yourself administrator rights to change them.
The macros starting "ARG_" declare variables, which are initialized with the arguments that are given to the function. You need exactly one declaration for each parameter. The number of the parameters is taken from the definition line. The names do not matter.
E.g., the macro
ARG_DOUBLE(x);
is expanded into the following code.
double x;
CHECK(argn<np,"Too many ARG macros");
x=getreal(hd[argn++]);
IFERROR("Need a real argument.");
Here, "CHECK" and "IFERROR" are two further macros, which store an error message and return an error, aborting the function. The function "getreal" is defined in "dlldef.c" and extracts the double value x from the Euler header, which is the way arguments are passed to DLL functions. "getreal" also checks whether the argument is indeed real. If it is not, it sets the global value "error=1", and the function is aborted. "argn" counts the arguments. It is an integer value initialized to 0 in front of the user code.
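The control flow of this expansion can be tried outside Euler with a small mock. The following is only a sketch: the types mock_header and mock_type, the function mock_getreal, the globals mock_error and mock_errmsg, and the MOCK_ macros are hypothetical stand-ins invented for this illustration. They are not the definitions found in "dlldef.h" and "dlldef.c".

```c
/* Hypothetical stand-ins for the Euler header and its type tag. */
typedef enum { T_REAL, T_STRING } mock_type;
typedef struct { mock_type type; double value; } mock_header;

int mock_error = 0;              /* plays the role of the global error flag */
const char *mock_errmsg = "";    /* last stored error message */

/* Returns the real value, or sets the error flag if the argument is not real. */
static double mock_getreal(const mock_header *h)
{
    if (h->type != T_REAL) { mock_error = 1; return 0.0; }
    return h->value;
}

/* Store a message and leave the calling function if the condition fails. */
#define MOCK_CHECK(cond, msg) \
    do { if (!(cond)) { mock_error = 1; mock_errmsg = (msg); return -1; } } while (0)

/* Store a message and leave the calling function if the error flag is set. */
#define MOCK_IFERROR(msg) \
    do { if (mock_error) { mock_errmsg = (msg); return -1; } } while (0)

/* Declares x and fills it from the next argument; argn advances by one. */
#define MOCK_ARG_DOUBLE(x) \
    double x; \
    MOCK_CHECK(argn < np, "Too many ARG macros"); \
    x = mock_getreal(hd[argn++]); \
    MOCK_IFERROR("Need a real argument.")

/* Adds two real arguments. Returns 0 and writes the sum to *out,
   or returns -1 with mock_errmsg set. */
int mock_sum(mock_header *hd[], int np, double *out)
{
    int argn = 0;        /* initialized in front of the user code */
    mock_error = 0;
    MOCK_ARG_DOUBLE(a);
    MOCK_ARG_DOUBLE(b);
    *out = a + b;
    return 0;
}
```

Calling mock_sum with two real headers yields the sum. A string header in either position, or a call with np=1, ends the function early with the corresponding message, which mirrors the one-macro-per-parameter rule.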
The main things to remember are
- to use exactly one ARG_ macro for each parameter,
- to remember that the variables are declared and initialized by the ARG_ macro.
The following ARG_ macros exist to date.
- ARG_DOUBLE(x) declaring the double variable x.
- ARG_DOUBLE_MATRIX(A,r,c) declaring double *A, and two integer variables for the number of rows and columns of A. This macro works for real scalar values too. Then r=c=1. If the matrix is not a vector you need to access the elements of A using the index computation A[i*c+j].
- ARG_COMPLEX_MATRIX(A,r,c) works the same way, but checks for a complex matrix. The complex values are stored in pairs, so that A[2*(i*c+j)] is the real part and A[2*(i*c+j)+1] the imaginary part. Use this for complex scalars too. Then r=c=1.
- ARG_INTERVAL_MATRIX(A,r,c) works like complex matrices. Use this for interval scalars too. Then r=c=1.
- ARG_STRING(s) declaring char *s for string variables.
- ARG_STRING_VECTOR(s) works like ARG_STRING, but checks for string vectors. These are a series of strings, with the last string followed by a 1.
- ARG_BINARY(p,size) for binary data (see below) declaring void *p and the size of the binary data field in bytes.
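The index computations for real and complex matrices described in this list can be checked with plain C arrays. This is a minimal sketch outside Euler. The helper names real_at, complex_re, complex_im and real_sum are made up for the illustration and are not part of "dlldef.c".

```c
/* Element (i,j) of a real matrix with c columns, stored row by row. */
double real_at(const double *A, int c, int i, int j)
{
    return A[i * c + j];
}

/* Real part of element (i,j) of a complex matrix with c columns,
   stored as consecutive (real, imaginary) pairs. */
double complex_re(const double *A, int c, int i, int j)
{
    return A[2 * (i * c + j)];
}

/* Imaginary part of the same element. */
double complex_im(const double *A, int c, int i, int j)
{
    return A[2 * (i * c + j) + 1];
}

/* Sum of all elements of a real r x c matrix. */
double real_sum(const double *A, int r, int c)
{
    double s = 0.0;
    int i, j;
    for (i = 0; i < r; i++)
        for (j = 0; j < c; j++)
            s += real_at(A, c, i, j);
    return s;
}
```

For a 2x3 matrix stored as {1,2,3,4,5,6}, real_at(A,3,1,0) is 4, the first element of the second row. A scalar is the special case r=c=1, where only A[0] exists.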
Note that arguments are passed by reference to DLL functions. You can change them, but you need to be careful not to write outside the memory area reserved for the arguments. Strings must be closed with a 0. Be sure to read the section about pitfalls below.
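Staying inside the reserved memory matters most for strings. The following sketch walks a string vector without reading past a given limit. It rests on an assumption about the layout: the strings follow each other, each closed with a 0, and a single byte with value 1 follows the last one. The function count_strings and its limit parameter are hypothetical and not part of the Euler API.

```c
#include <stddef.h>
#include <string.h>

/* Counts the strings in a buffer laid out as s0 0 s1 0 ... sk 0 1.
   ASSUMPTION: the end marker is one byte with value 1 directly after
   the 0 that closes the last string.
   limit is the number of bytes that may be read. Returns -1 if a string
   is not closed or no end marker is found within the limit. */
int count_strings(const char *s, size_t limit)
{
    size_t pos = 0;
    int count = 0;
    while (pos < limit) {
        const char *end;
        if (s[pos] == 1) return count;            /* end marker reached */
        end = memchr(s + pos, 0, limit - pos);    /* closing 0 of this string */
        if (end == NULL) return -1;               /* string is not closed */
        pos = (size_t)(end - s) + 1;              /* start of the next string */
        count++;
    }
    return -1;
}
```

The same pattern, a position that is compared against a known limit before every access, applies when writing into matrices or binary data.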
To store the result, you can simply call the "new_" functions in "dlldef.c". These functions return pointers to a header on the Euler stack. For vector results, you want to declare the result, and then change it. There are "RES_" macros for this.
- RES_DOUBLE_MATRIX(A,r,c) declares the variable double *A and initializes it with the beginning of the matrix with r rows and c columns on the Euler stack. Set matrix elements as described above in ARG_DOUBLE_MATRIX.
- RES_COMPLEX_MATRIX and RES_INTERVAL_MATRIX work the same way.
- RES_BINARY(p,size) declares void *p pointing to a binary area of size bytes on the Euler stack.
Make sure not to use the same variable name in "ARG_" and "RES_". This would confuse both the compiler and you.
For simple data, use the following functions. You may want to call IFERROR("Out of stack space") after these functions.
- new_real(x)
- new_complex(x,y)
- new_interval(x,y)
- new_string(s)
The compressed data type, the binary data and the string vectors are special, since they might grow during the computations. Study the example of the bitsets to learn how this can be achieved. In general, you have to adjust the size element of the header properly.
You must not add a return statement at the end of the TinyC function, since this is done automatically. If you need to end the function early, use "return newram". If you want to set an error and return, use the "ERROR" macro.
ERROR("Something strange happened. Aborting.");
You can find some examples of TinyC functions in this introduction. Of course, you can also study the dlldef.h and dlldef.c files in your installation directory.
Binary Data
Binary data are an area of memory on the Euler stack, which can be used by compiled functions for any purpose they wish. The size of the data in bytes is stored in front of the data. Functions cannot change this size, and they need to take care not to write outside the binary area, or Euler may crash.
Binary data can be stored to Euler variables. They will be printed as "binary data of size ...". They can be passed to functions like any other value.
To generate binary data, use an Euler header of type "s_binary". It is much easier to use the macro "RES_BINARY(p,size)", which does this for you and assigns the pointer void *p. If you receive a binary argument, you can use the macro "ARG_BINARY(p,n)" to get its data.
Work Space
You might need memory for computations besides the stack space of your function or the static space in your DLL. The use of malloc() is discouraged. You can use the function getram() to get a part of the EMT stack.
int *I = (int *)getram(Nboot*sizeof(int));
CHECK(I,"Out of heap space.");
...
header *out=new_real(x);
moveresults(out);
In this case, you have destroyed the proper order on the EMT stack. Thus you need to move your result to the place where EMT expects it to be, namely at the end of the stack at the time the C function was called.
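Why the result has to be moved can be seen in a miniature model of the stack. The sketch below is not the code in "dlldef.c". The byte array, the offset mock_newram and the functions mock_getram, mock_new_real, mock_moveresults, mock_result_at and mock_sum_indices are hypothetical and only imitate the behavior described above.

```c
#include <stddef.h>
#include <string.h>

#define STACK_BYTES 256

/* Hypothetical miniature stack: a byte area and the offset of its first free byte. */
static double storage[STACK_BYTES / sizeof(double)];   /* double keeps it aligned */
static char *const stack = (char *)storage;
size_t mock_newram = 0;

/* Reserves size bytes at the top of the stack, or returns NULL if they do not fit. */
void *mock_getram(size_t size)
{
    void *p;
    /* round up so that every block starts on a double boundary */
    size = (size + sizeof(double) - 1) / sizeof(double) * sizeof(double);
    if (size > STACK_BYTES - mock_newram) return NULL;
    p = stack + mock_newram;
    mock_newram += size;
    return p;
}

/* Puts a double at the current top of the stack. */
double *mock_new_real(double x)
{
    double *p = mock_getram(sizeof x);
    if (p != NULL) memcpy(p, &x, sizeof x);
    return p;
}

/* Moves the result down to offset start and ends the stack right behind it. */
void mock_moveresults(size_t start, const double *result)
{
    memmove(stack + start, result, sizeof *result);
    mock_newram = start + sizeof *result;
}

/* Reads the double stored at a byte offset of the stack. */
double mock_result_at(size_t offset)
{
    double x;
    memcpy(&x, stack + offset, sizeof x);
    return x;
}

/* Sums 0..n-1 using scratch space from the stack. Returns 0 on success, -1 on failure. */
int mock_sum_indices(int n)
{
    size_t start = mock_newram;    /* where the caller expects the result */
    double x = 0.0;
    double *out;
    int *I;
    int i;

    if (n < 0) return -1;
    I = mock_getram((size_t)n * sizeof(int));        /* scratch space */
    if (I == NULL) return -1;
    for (i = 0; i < n; i++) I[i] = i;
    for (i = 0; i < n; i++) x += I[i];

    out = mock_new_real(x);                          /* lands behind the scratch space */
    if (out == NULL) { mock_newram = start; return -1; }
    mock_moveresults(start, out);                    /* back to the expected place */
    return 0;
}
```

After mock_sum_indices(5), the result 10 sits at the offset where the stack ended before the call, and the scratch space is free again.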
Compilation
For larger projects, you want to write the C code yourself. This is necessary if
- you want more than one function in a DLL,
- you need to keep static things in the DLL between calls,
- you need subroutines of any kind,
- you want to define macros.
The C code must be stored in the current directory. This is usually the directory of the notebook. Of course, you need write access to this directory. The generated DLL will be stored in the same directory.
You can compile a C file to a DLL with this compiler right from the Euler notebook. Euler calls TCC with the necessary command line. Use
>tccompile "filename";
You must not add the .c extension. Compiler errors will print to the notebook window. TCC has a reasonable way to print errors.
If you press F10 at this line, the external editor will open with the C file loaded, if it is in the current directory. The Java editor je that comes with Euler has syntax highlighting for C.
You can compile the DLL yourself on the command line without using the tccompile command, or create a batch file, if you like. The command you need is the following.
EMT\tcc\tcc -shared -I EMT\dll -o F.dll F.c EMT\dll\dlldef.c
Here, EMT must be replaced by the installation directory of Euler, and F must be replaced by the file name. As you see, tcc needs to include the dll directory in Euler for the headers with -I, and it needs to include the compilation of dlldef.c in the same directory.
You can include more than one C file. You can also use the exec() command of Euler for the compilation.
>s=start();
>w=""; if win64() then w="-DWIN64 "; endif;
>p="-shared "+w+"-I "+s+"dll -o f.dll f.c g.c "+s+"dll\dlldef.c"
>closedll("f"); exec(s+"tcc\tcc.exe",p,home,>print)
>dll("f","f",1);
>f(5)
 25
The example also shows how to load the function from the DLL after compilation. See the dll() command for more details. Here is the content of the f.c file.
#include "dlldef.h"
#include <stdio.h>
#include <math.h>

double g (double x);

EXPORT char *f (header *hd[], int np, char *ramstart, char *ramend)
{
    int argn=0;
    start(ramstart,ramend);
    ARG_DOUBLE(x);
    new_real(g(x));
    return newram;
}
And here is the external definition of g in g.c
double g (double x)
{
    return x*x;
}
You can use Visual C++ to compile the files and create the DLL.
- Generate a new empty project.
- Edit the project and make it a DLL project.
- Change the output file, if you like.
- Define the macros VISUALCPP, _CRT_SECURE_NO_WARNINGS, and WIN64 (if needed).
- Import your C files, dlldef.c and dlldef.h (from the dll directory in Euler).
- Select the properties of the C files and compile as C++.
- Generate the project.
Euler DLL API
You need to obey certain rules in your code to extract the parameters from the Euler headers, and to return the results in a proper way. A good way to learn this is to study the examples in one of the following introductions.
For more advanced examples, you will have to study the files dlldef.h and dlldef.c to see how everything works.
Euler uses a stack on the heap to keep its data. You need to deliver your results in Euler stack elements on the stack. To make this easier, there are macros and functions in dlldef.c and dlldef.h. Both files are in the DLL directory of your Euler installation. You need to include dlldef.h and stdio.h. The file dlldef.h is well documented and explains how to create the headers of stack elements and access their data part.
Here is an example, which takes an integer n and computes an array containing the first n Fibonacci numbers.
#include "dlldef.h"
#include <stdio.h>

EXPORT char * fib (header *hd[], int np, char *ramstart, char *ramend)
{
    start(ramstart,ramend); // required !!!
    int n=getint(hd[0]); IFERROR("Need an integer for fib!");
    if (n<2) n=2;
    header *result=new_matrix(1,n); CHECK(result,"Stack overflow!");
    double *m=matrixof(result);
    m[0]=1; m[1]=1;
    int i;
    for (i=2; i<n; i++) m[i]=m[i-1]+m[i-2];
    return newram; // required !!!
}
The function start is absolutely necessary. It sets the global variables ram, newram and ramend for other functions in the API.
The parameters of Euler are passed in a vector of parameters. The number of parameters is in np. (It is possible to write functions with a variable number of parameters.) The example extracts the integer from the first stack element with getint, which is used here in the same way as getreal. You may wish to study dlldef.c for details on this.
After minimal error checking we create a matrix of size n (1 row, n columns), and fill it up with the Fibonacci numbers.
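The numerical core of this function can be tested without Euler. The sketch below repeats only the loop on a plain double array. The function name fill_fib is invented for the illustration, and the caller's array replaces the matrix that new_matrix creates on the stack.

```c
/* Fills m[0..n-1] with the Fibonacci numbers 1, 1, 2, 3, ...
   As in the DLL function, n<2 is raised to 2, so the caller must
   provide room for at least two doubles.
   Returns the number of elements written. */
int fill_fib(double *m, int n)
{
    int i;
    if (n < 2) n = 2;
    m[0] = 1;
    m[1] = 1;
    for (i = 2; i < n; i++)
        m[i] = m[i - 1] + m[i - 2];
    return n;
}
```

For n=40 the last element is 102334155, in agreement with the fib(40) output shown in the notebook session of this section. Numbers of this size are still represented exactly by a double.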
To return the result, we return the value newram. This variable is increased by any new_... command. Euler assumes that the results start from the old value of newram. Note that you can return multiple results just like in Euler functions. To return nothing, return 0 from the function. To return an error message, copy it to char *ram, and return this value. The variable ram keeps the old stack position.
sprintf(ram,"Error in variable %s",variablename);
return ram;
There is a macro ERROR("...") for this.
To load the function into Euler, use the command "dll(file,function,n)". It takes the name of the library, without the extension .dll, the name of the function, and the number of parameters (-1 for a flexible number of parameters). The library is searched in the Euler path.
>closedll("test");
>tccompile test;
>dll("test","fib",1);
>fib(40)
 [ 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811 514229 832040 1346269 2178309 3524578 5702887 9227465 14930352 24157817 39088169 63245986 102334155 ]
 Need a real number for fib!
You can use the stack as a working space. In this case, use the getram function. If you do this, newram will be increased, which would confuse Euler, since it expects the result there. So you need to move the result to the expected place with moveresults. See the example in the section on work space above.
Of course, you can also malloc memory. Don't forget to free the memory later. This is useful to hold data in formats other than those Euler provides. Of course, memory will be released when Euler quits.
Pitfalls
It is not easy to write proper C code, except for very simple functions. Moreover, the C code has to use the Euler API properly. Here are the most important errors you can make.
Not calling the start routine. The start routine initializes the ram pointers, and resets the error. Not calling it crashes the DLL.
No error checking for Euler parameters. Euler only knows the name of the function and the number of parameters. You need to check the type of the parameters properly. The ARG_ macros do this automatically. If you do not use them, use hd->type and the provided header types for this. Also check for proper array dimensions.
Returning the wrong value. Euler assumes that the results are at the pointer ramstart, and that the end of the results is returned by the function. If there is no result, 0 must be returned. If the function returns ram, Euler assumes an error string at ram, which it prints. If you need working space from the stack, either use the stack space after your result (use the getram(size) function and check for an error), or move the result to ramstart later with moveresults().
Not releasing memory. If memory is not released, your program will get slower and slower due to swapping. Any memory allocated with malloc must be released with free. Note that releasing released memory crashes programs. Of course, you can keep memory between function calls. But if you no longer need it, release it. You can export a clear() function to release all memory at the end. It will be called automatically, if it is present.
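A common way to keep memory between calls without leaking it or freeing it twice is a single static pointer that is reset after every free. The following is a sketch in plain C. The names cache, cache_len and get_cache are hypothetical, and the exact signature and EXPORT declaration that Euler expects for clear() are not shown here and should be taken from "dlldef.h".

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer that is kept alive between function calls. */
double *cache = NULL;
size_t cache_len = 0;

/* Returns a buffer with room for at least n doubles (n>0), growing the
   cache if needed. Returns NULL if the allocation fails. In that case
   the old buffer remains valid and unchanged. */
double *get_cache(size_t n)
{
    if (n > cache_len) {
        double *p = realloc(cache, n * sizeof *p);
        if (p == NULL) return NULL;
        cache = p;
        cache_len = n;
    }
    return cache;
}

/* Releases the cache. Resetting the pointer to NULL makes a second call
   harmless, since free(NULL) does nothing. */
void clear(void)
{
    free(cache);
    cache = NULL;
    cache_len = 0;
}
```

Because every access goes through get_cache, there is exactly one owner of the memory, and clear can be called at any time without leaving a dangling pointer behind.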