A very useful script called runparallel lets you distribute input arriving on STDIN
across multiple parallel instances of a process. The output of these processes (if they produce any and
you are interested in it) is then merged into a single stream. The original order is preserved,
just as if the input had been processed sequentially by a single process.
Let’s look at an example. Imagine we have a custom script get_name.sh that reads ADO IDs on STDIN
and returns each ADO ID together with its ADO longname. It does this by looping through the ADO IDs
and calling some Formula Engine code to get the longname:
acdba@acbox:~$ echo C0.ISS.1 | ./get_name.sh
C0.ISS.1,TERRAFINO INDUSTRIES COR-7.25% PFD
acdba@acbox:~$
For a single ADO this is quick, and even for 100 the wait is tolerable. But it starts to take much longer once we run it for a few thousand ADOs.
This is where runparallel can help:
acdba@acbox:~$ echo "select symbol from fundmstr order by 1" | ac_bl -qs - | runparallel -l10 -n8 ./get_name.sh
+++INFO+++ 20190405_22:12:08 runparallel: Starting: runparallel -l10 -n8 ./get_name.sh
C0.ISS.1,TERRAFINO INDUSTRIES COR-7.25% PFD
C0.ISS.2,TERRAFINO INDUSTRIES-7.25% NOTES
...
C0.ISS.99998,TERRAFINO INDUSTRIES CORP
C0.ISS.99999,TERRAFINO INDUSTRIES CORP-CL E
+++INFO+++ 20190405_22:12:09 runparallel: Finished runparallel
acdba@acbox:~$
The echo ... | ac_bl -qs - part to the left of runparallel returns all ADO IDs in our system. We then use
-l10 -n8 to instruct runparallel to divide the input into batches of 10 and run 8 instances
of get_name.sh in parallel.
So, rather than a single process collecting all the ADO longnames, we now have 8, and we expect this to be much faster.
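To make the batching and the order-preserving merge more concrete, here is a minimal bash sketch of the general idea. It is not the real runparallel implementation, which is not shown in this article; the function name run_in_batches and its argument layout are made up for illustration, and it assumes bash 4.3 or later (for wait -n). The first argument plays the role the text above attributes to -l10, the second the role of -n8.

```shell
#!/usr/bin/env bash
# Conceptual sketch only: NOT the real runparallel.
# Hypothetical usage: run_in_batches LINES_PER_BATCH MAX_JOBS COMMAND [ARGS...]
run_in_batches() {
    local lines_per_batch=$1 max_jobs=$2
    shift 2
    local workdir batch out running=0
    workdir=$(mktemp -d) || return 1

    # Cut STDIN into batch files in.aaaaa, in.aaaab, ...
    # The suffixes sort in the same order as the input lines.
    split -l "$lines_per_batch" -a 5 - "$workdir/in."

    for batch in "$workdir"/in.*; do
        [ -e "$batch" ] || continue          # STDIN was empty
        if [ "$running" -ge "$max_jobs" ]; then
            wait -n                          # block until one worker exits
            running=$((running - 1))
        fi
        # Each worker reads one batch and writes to its own output file.
        "$@" < "$batch" > "$workdir/out.${batch##*.}" &
        running=$((running + 1))
    done
    wait                                     # let the remaining workers finish

    # Emit the outputs in batch order, whatever order the workers finished in.
    for out in "$workdir"/out.*; do
        if [ -e "$out" ]; then cat "$out"; fi
    done
    rm -rf "$workdir"
}
```

With this sketch, something like `... | run_in_batches 10 8 ./get_name.sh` would mimic the call above. Writing each batch's output to its own file and concatenating the files afterwards is what keeps the result in input order; the price is that nothing is printed until every worker has finished.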
By the way, runparallel is also very useful in combination with
this approach for effectively applying an FE function to one ADO at a time.