In this article I want to discuss a pattern of combining ac_evalform and runparallel that can both simplify the implementation and increase performance through parallel processing.
To keep the example simple, imagine we want to know the long names of all ADOs in our system. We could implement an FE script get_all_names.fe as follows:
function get_name(ado) {
    return [ado, name(ado)];
}
local list ados = ado_browse("symbol like '%'");
return map(get_name, ados);
And we could run it like this:
acdba@acbox:~$ time ac_evalform -f get_all_names.fe
C0.ISS.1,TERRAFINO INDUSTRIES COR-7.25% PFD
C0.ISS.2,TERRAFINO INDUSTRIES CORP
C0.ISS.3,TERRAFINO INDUSTRIES CORP-CL E
...
C0.ISS.9999,TERRAFINO INDUSTRIES-7.25% NOTES
real 0m9.126s
user 0m1.040s
sys 0m0.004s
acdba@acbox:~$
This took over 9 seconds. We can do better!
Writing Formula Engine code to process one ADO at a time
Now, look at the following snippet of FE code:
return [[$SYMBOL, name($SYMBOL)]];
It contains only the logic to process a single ADO, the one referenced by $SYMBOL. Note that there is no ado_browse statement in it. We save this snippet as get_name.fe, the file name used in the command further down.
Parallel processing using runparallel
Instead of calling ado_browse inside the script and then working through the results sequentially, we pull the ado_browse out and use runparallel to fork a number of parallel processes:
acdba@acbox:~$ time echo "ado_browse(\"symbol like '%'\");" | ac_evalform - | runparallel -l10 -n8 ac_evalform -f get_name.fe -l -
+++INFO+++ 20190406_14:08:26 runparallel: Starting: runparallel -l10 -n8 ac_evalform -f get_name.fe -l -
C0.ISS.1,TERRAFINO INDUSTRIES COR-7.25% PFD
C0.ISS.2,TERRAFINO INDUSTRIES CORP
C0.ISS.3,TERRAFINO INDUSTRIES CORP-CL E
...
C0.ISS.9999,TERRAFINO INDUSTRIES-7.25% NOTES
+++INFO+++ 20190406_14:08:28 runparallel: Finished runparallel
real 0m2.243s
user 0m0.344s
sys 0m0.028s
acdba@acbox:~$
This is much faster: about 2.2 seconds instead of 9.1, a speedup of roughly four times.
The important thing to mention here is the use of ac_evalform with the option -l -. This instructs ac_evalform to read from STDIN and execute the FE script once for each line of the input. Each time, the current line is available in $SYMBOL.
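To make the execution model concrete, here is a minimal Python sketch of the same idea: input lines are cut into batches, each batch is handed to a worker, and the worker runs the per-line logic once for every line. This is a conceptual analogy only, not how runparallel or ac_evalform are implemented. It uses threads instead of forked processes, process_line is a hypothetical stand-in for the FE script, and the batch_size and workers parameters are illustrative; they are not claimed to match the meaning of runparallel's -l and -n options.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batches(lines, size):
    """Yield consecutive lists of at most `size` lines."""
    it = iter(lines)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def process_line(symbol):
    # Hypothetical stand-in for the per-line FE script.
    # The real script would return [[$SYMBOL, name($SYMBOL)]].
    return [symbol, "NAME OF " + symbol]


def process_batch(batch):
    # One worker handles a whole batch, one line at a time.
    return [process_line(line) for line in batch]


def run_parallel(lines, batch_size=10, workers=8):
    # Fan the batches out to a pool of workers and flatten the results.
    # pool.map returns the batch results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(process_batch, batches(lines, batch_size))
        return [row for batch in results for row in batch]
```

The point of the design is that the per-line function knows nothing about where its input comes from or how many siblings are running, which is exactly what makes the FE snippet so short.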
In our example we worked with ADO IDs, but you are not limited to that. Input lines can, for example, contain an ADO ID together with a start and an end date, separated by commas. You would then have to parse the input accordingly:
function getSomeOtherData(ado, s, e) {
    return ...;
}
local list parts = split($SYMBOL, ',');
local string ado = parts[0];
local integer startDate = integer(parts[1]);
local integer endDate = integer(parts[2]);
return [[ado, name(ado), getSomeOtherData(ado, startDate, endDate)]];
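If you generate such input lines from another tool, it helps to pin down the line format in one place. The following Python sketch mirrors the parsing done in the FE snippet above, assuming a comma-separated line of the form "ado,start,end" with dates written as integers such as 20190101. The Request type, the parse_line helper and the date format are hypothetical and only serve the illustration.

```python
from typing import NamedTuple


class Request(NamedTuple):
    ado: str
    start_date: int
    end_date: int


def parse_line(line):
    """Split one input line into ADO ID, start date and end date."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != 3:
        # Fail loudly on malformed lines instead of guessing.
        raise ValueError("expected 'ado,start,end', got %r" % line)
    ado, start, end = parts
    # int() raises ValueError if a date field is not numeric.
    return Request(ado, int(start), int(end))
```

Rejecting malformed lines early keeps one bad input line from silently producing a wrong result in one of the parallel workers.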
I am sure you can find many other similar use cases.