1. Overview
This manual provides an index of important concepts internal to Turbine.
2. Quick start
The fastest way to get a quick overview of provided features is to just run:
./setup.sh
./configure --with-adlb=<PATH/TO/ADLB> \
            --with-mpi=<PATH/TO/MPICH> \
            --with-tcl=<PATH/TO/TCL>
make package
Note:
PATH/TO/ADLB : Points to source tree
PATH/TO/MPICH : Points to installed location
PATH/TO/TCL : Points to installed location
Then, just take a look at the test code and output to see how things work.
3. Tcl packaging
Turbine consists of two libraries, ADLB and Turbine, packaged as Tcl
extensions, and several Tcl script libraries. All of this is packaged
with Tcl conventions in lib. Cf. lib/make-package.tcl and lib/module.mk.in.
To bring these extensions and libraries into a Tcl script, we use:
package require turbine 0.1
This command refers to environment variable TCLLIBPATH, which we set in bin/turbine.
4. MPI process modes
A Turbine program is a Tcl script launched as an SPMD program by mpiexec. In general, the idea is to run:
mpiexec -l -n ${N} tclsh something.tcl
In our case, we provide a helper script. So in the test cases, we run
bin/turbine -l -n ${N} test/something.tcl
The Turbine MPI environment is set by the mpiexec -n number and the inputs to turbine::init. As a result, each MPI process will become a Turbine Worker or ADLB Server.
- Turbine Worker
  Runs on the lowest MPI ranks. Rank 0 calls the user rules procedure, starting the program. Work from this procedure may be distributed to other workers.
- ADLB Server
  Performs ADLB services, including task queues, data storage, and data-dependent task release. Enters ADLB_Server() and does not exit until the run is complete. Cf. src/tcl/adlb/tcl-adlb.c::ADLB_Server_Cmd(). Runs on the highest MPI ranks.
In Tcl, the mode is stored in turbine::mode and is either WORKER or SERVER.
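The rank layout described above can be sketched in plain Tcl, without Turbine or MPI. This is only an illustration: mode_for_rank is a hypothetical helper, not part of the Turbine API, and the check that at least one worker and one server exist is an assumption of this sketch.

```tcl
# Hypothetical helper: not part of the Turbine API.
# size    = total number of MPI processes (the mpiexec -n value)
# servers = number of ADLB servers passed to turbine::init
proc mode_for_rank { rank size servers } {
    # Assumption for this sketch: at least one server and one worker
    if { $servers < 1 || $servers >= $size } {
        error "need at least one server and one worker"
    }
    set workers [ expr { $size - $servers } ]
    # Workers occupy the lowest ranks, servers the highest
    if { $rank < $workers } {
        return WORKER
    }
    return SERVER
}

# With 4 processes and 1 server: ranks 0-2 are workers, rank 3 is the server
foreach rank { 0 1 2 3 } {
    puts "rank $rank: [ mode_for_rank $rank 4 1 ]"
}
```

With -n 4 and one server, this prints WORKER for ranks 0 through 2 and SERVER for rank 3.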
5. Software structure
The Turbine API is a Tcl API. Some of the features are defined in Tcl, some are hand-coded Tcl extensions, and some are SWIG-generated Tcl extensions.
- All Tcl source is in lib
- Turbine C code, e.g., for caches and the worker loop, is in src/turbine
- Tcl extensions are in src/tcl
  - src/tcl/turbine wraps up Turbine C code for Tcl
  - src/tcl/adlb is the Tcl extension for the ADLB code in the ADLB package. This includes the ADLB data calls
  - src/tcl/blob is a SWIG-generated module for advanced blob functionality
  - src/tcl/mpe is the MPE library for Turbine
6. Turbine concepts
- TD
  A Turbine datum. Represented in Tcl by a 64-bit TD number. A TD may be open (unset) or closed (set). TD IDs are represented in the log as <ID>. The types are:
  - void
  - integer
  - float
  - string
  - blob
  - container
- Rules
  - The ADLB/Turbine data dependency engine makes progress by evaluating Turbine rules.
  - A rule has an input TD list, a TD/subscript list, a rule type, an action, and optional arguments.
  - The action is a simple Tcl string that is eval'd by a possibly different Tcl process. This allows actions to be load balanced by ADLB.
  - Rule types are:
    - CONTROL: put the action into ADLB for evaluation elsewhere
    - WORK: put the action into ADLB for evaluation by a worker
    - LOCAL: send the task to the local worker (deprecated)
  - When rules are evaluated, they produce in-memory records called transforms.
  - When the transform is ready, it is released to the appropriate ADLB task queue to be retrieved by a worker.
  - The function body targeted by the action can contain arbitrary Tcl code, look up data from the given TDs, launch external processes via Tcl exec, store TDs, and issue more rule statements.
- Container
  Element from which Turbine data structures are created. May be used to create associative arrays, structs, and other data structures. Represented by a TD. A TD plus a subscript results in another TD.
  Container operations are represented in the log as, e.g., <4>["k"]=<8>, indicating that container TD 4 with subscript "k" resulted in TD 8.
- Subscribe
  Transforms (TRs) are stored in the ADLB servers. To make progress, the TRs are activated when their input data is ready. Thus, the servers subscribe to data stored in ADLB and are notified when data is ready.
7. Symbols
This describes the symbols available to the Turbine programmer. These features are required when writing STC or constructing Swift/T extensions.
7.1. Turbine core
The core Turbine features are as follows.
7.1.1. Program structure
Turbine code is Tcl code. For example:
> cat hello.tcl
puts HELLO
> turbine -n 3 hello.tcl
HELLO
HELLO
HELLO
The following code is found in nearly every Turbine program:
package require turbine 0.1
turbine::defaults
turbine::init $servers
turbine::start rules
turbine::finalize
It loads the Turbine Tcl package, loads defaults and environment settings, initializes Turbine, starts progress, and finalizes.
The rules proc contains the initial calls to get the program running. It is only executed by the worker with rank 0.
Other code may be placed in functions.
7.1.2. Startup/shutdown
- defaults
  Sets variable servers in the caller’s scope. ADLB_SERVERS is stored in servers; defaults to 1
- init servers
  Initializes Turbine. Initializes ADLB
- finalize
  Shuts down and reports unused rules
7.2. ADLB layer
Turbine uses ADLB to distribute tasks and locate data.
All Turbine variables are stored in a customized data store built into ADLB. This required the construction of additional ADLB API calls.
The following ADLB features are available to Turbine. Usually, they are used internally by the Turbine features and are not called directly by the user script.
tcl-adlb.c
- adlb::SUCCESS
  Variable represents ADLB_SUCCESS.
- adlb::ANY
  Variable represents "any", which is -1 in ADLB.
- adlb::init servers types
  Start ADLB with the given number of servers and work types.
- adlb::finalize
  Stop ADLB.
- adlb::put reserve_rank work_type work_unit
  Submit a work unit as a string of given integer type. Sent to given rank, which may be adlb::ANY.
- adlb::get req_type answer_rank
  Get a work unit as a string of given integer type, which may be adlb::ANY. ADLB answer rank stored in answer_rank.
- adlb::create id data
  Instantiate the given data but do not close it. Data may be:
  - string:
  - integer:
  - container:<type> where type is the type of the container keys.
  - file:<name> where name is the file name.
- adlb::store id data
  Store the TD.
- adlb::retrieve id
  Retrieve the TD.
- adlb::insert id subscript member
  Store TD member at the given subscript in container id.
- adlb::lookup id subscript
  Obtain the TD for the given subscript in container id.
- adlb::unique
  Return a unique TD.
7.2.1. Internals
adlb.c
- ADLB_Subscribe(id)
  Called only by Turbine rule processing. Request that the calling rank be notified when TD id is closed.
7.3. Data
7.3.1. Data allocation
Data must be allocated before it may be used as the input to a rule.
data.tcl
- allocate [<name>] <type> → td
  Creates and returns a unique TD. The TD is actually stored on some ADLB server; the user does not know which one. If name is given, logs a message based on name.
- allocate_container [<name>] <subscript type> → td
  Creates and returns a unique TD that is a container with the given subscript type: "integer" or "string"
Data storage/retrieval allows you to store Tcl values in Turbine and retrieve Turbine TDs as Tcl values.
data.tcl
- store_integer td value
- retrieve_integer td → value
- store_string td value
- retrieve_string td → value
- store_float td value
- retrieve_float td → value
- store_void td
- store_blob td [ list pointer length ]
- retrieve_blob td → [ list pointer length ]
Once you have the values in Tcl, you can perform arbitrary operations and store results back into Turbine.
You can think of Turbine as a load/store architecture, where the Turbine data store is main memory and the local Tcl operations and values are the CPU and its registers.
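The load/store analogy can be sketched as a toy model in plain Tcl, with a dict standing in for the Turbine data store. The toy_store and toy_retrieve procs are hypothetical and are not the Turbine API; treating a second store to the same TD as an error is an assumption of this sketch.

```tcl
# Toy model only: a Tcl dict stands in for the Turbine data store.
# The toy_* procs are hypothetical and are not the Turbine API.
set store [ dict create ]

proc toy_store { td value } {
    global store
    # Assumption: a closed TD cannot be stored again
    if { [ dict exists $store $td ] } {
        error "TD <$td> is already closed"
    }
    dict set store $td $value
}

proc toy_retrieve { td } {
    global store
    if { ! [ dict exists $store $td ] } {
        error "TD <$td> is still open"
    }
    return [ dict get $store $td ]
}

# "Load" two values, compute in plain Tcl, "store" the result
toy_store 1 20
toy_store 2 22
set sum [ expr { [ toy_retrieve 1 ] + [ toy_retrieve 2 ] } ]
toy_store 3 $sum
puts "<3> = [ toy_retrieve 3 ]"
```

The arithmetic happens on ordinary Tcl values; only the first and last steps touch the (toy) store, which is the pattern the real store_* and retrieve_* calls follow.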
void type variables may be used to represent pure dataflow, e.g., Swift external variables. Internally, these are just an integer.
blob values in Turbine/Tcl are a [ list pointer length ], where the pointer is stored as a Tcl integer and the length is the byte length.
- Note that to pass these pointers to SWIG interfaces you have to cast them to void*, double*, etc. Tools are provided by the Turbine blobutils package to do this.
- The pointer points to a locally allocated copy of the blob data. This must be freed with adlb::blob_free. Auto-wrapped STC functions will automatically insert this instruction.
7.3.3. Literals
There is a convenience function to set up literal data.
functions.tcl
set x [ literal integer 3 ]
or
literal x integer 3
Now x is a closed TD of type integer with value 3.
7.4. Progress (rules)
Data-dependent progress is controlled by Turbine rules.
A Turbine rule statement contains:
rule input_list action options...
- input_list
  A space-separated list (Tcl list) of TDs. When these are closed, the action is eval'd.
- action
  A string of Tcl code for execution once all inputs are closed. Essentially, when all the inputs are closed, Turbine will make the action ready for execution, based on the type.
7.4.1. Options
All options are optional.
rule input_list action name "myfunction" type $turbine::WORK target 4 parallelism 2
- name
  An arbitrary string name used for debugging and logging. Turbine will make up a default name
- type
  LOCAL, CONTROL, or WORK. Default is CONTROL
- parallelism
  Number of processes to use for an MPI parallel task. Default is 1.
- target
  Send action to this MPI rank. Default is any available process based on type ($adlb::RANK_ANY)
7.4.2. Semantics
The rule statement semantics are as follows, with respect to the Tcl thread of execution.
- I can pause here
- I have an action I would like to perform at some point in the future
- I can restart myself given the action string
- Do not restart me until the given inputs are closed
- When my action completes, my outputs will be closed
- For CONTROL or WORK, you can execute my action on a different node (I will be able to find my data (and call stack) in the global store)
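These semantics can be illustrated with a toy, single-process model in plain Tcl. It is only a sketch: the toy_* procs and the closed and waiting variables are hypothetical, nothing here is the Turbine API, and the real engine distributes actions through ADLB rather than evaluating them locally.

```tcl
# Toy model only: the toy_* procs and variables are hypothetical,
# not the Turbine API. Everything runs in one Tcl process.
set closed  [ dict create ]   ;# TD -> value, for closed TDs
set waiting [ list ]          ;# pending rules: { inputs action } pairs

# Register an action that must wait for its inputs
proc toy_rule { inputs action } {
    global waiting
    lappend waiting [ list $inputs $action ]
    toy_progress
}

# Close a TD, which may release waiting rules
proc toy_close { td value } {
    global closed
    dict set closed $td $value
    toy_progress
}

# Release every waiting rule whose inputs are all closed
proc toy_progress { } {
    global closed waiting
    set ready [ list ]
    set still [ list ]
    foreach r $waiting {
        lassign $r inputs action
        set ok 1
        foreach td $inputs {
            if { ! [ dict exists $closed $td ] } { set ok 0 }
        }
        if { $ok } { lappend ready $action } else { lappend still $r }
    }
    set waiting $still
    foreach action $ready { uplevel #0 $action }
}

# Body function: reads closed inputs, closes its output
proc add_body { a b out } {
    global closed
    toy_close $out [ expr { [ dict get $closed $a ] + [ dict get $closed $b ] } ]
}

toy_rule [ list 1 2 ] "add_body 1 2 3"
toy_rule [ list 3 ] "puts {<3> is closed}"
toy_close 1 20
toy_close 2 22
```

Neither rule runs when it is registered. Closing TD 2 releases the first action, whose output in turn releases the second, so the message is printed only at the end.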
7.4.3. Naming
The name "rule" was chosen because this is somewhat like a Makefile rule, and the analogy was intended to be helpful.
7.4.4. Rationale
A Turbine rule is not just a control structure; it is also data. It has an identifier and debug token, is stored in data structures, and is loggable, debuggable, etc. The arbitrary action string provides a lot of flexibility in how the statement may be used (by the code generator).
7.5. Functions
A good way to manage progress is to define Tcl functions (procs) for use in the execution string.
To implement a Swift function, we often have three Tcl functions. Consider Swift function f():
- The "rule" function: conventionally called f. This is called to register the function call with the ADLB/Turbine dataflow engine
  - The rule statement stores the action until the inputs are ready
- The "body" function: conventionally called f_body. This is called when the inputs are ready. The body function retrieves data, computes, and stores data
- The "impl" function: conventionally called f_impl. The impl acts on values, not addresses. This is convenient because sometimes STC can optimize addresses and operate on values. This saves on calls to the ADLB data API, which uses messaging and is expensive. You do not need an impl function if you just want to perform the computation in the body function
# x, y and z are string TDs. x and y may be unset
proc f { stack z x y } {
    rule [ list $x $y ] "f_body no_stack $x $y $z" \
        name f-$x-$y type $turbine::LOCAL
}
# x, y and z are string TDs. x and y are now set
proc f_body { stack x y z } {
    set s1 [ retrieve_string $x ]
    set s2 [ retrieve_string $y ]
    set s3 [ f_impl $s1 $s2 ]
    store_string $z $s3
}
# x and y are string values
proc f_impl { x y } {
    # compute_something is a placeholder for the real computation
    return [ compute_something $x $y ]
}
# Calling code:
set x [ allocate x string ]
set y [ allocate y string ]
set z [ allocate z string ]
store_string $x "sample1"
store_string $y "sample2"
f no_stack $z $x $y
The previous example could have used the literal function but it is an opportunity to show things in full detail.
Implementation reference: the Turbine tests and any STC-generated code.
7.6. Operations
These are the operations available in Turbine.
7.6.1. Arithmetic
All arithmetic functions operate on TDs and are of the form:
op stack outputs inputs
The impl versions operate on values and are of the form:
op_impl inputs -> outputs
arith.tcl
[Table of arithmetic functions with columns Integer and Float]
7.6.2. Statistics
Statistics functions are in stats.tcl. These make straightforward use of the Turbine API and Tcl arithmetic capabilities.
7.6.3. String manipulation
String functions are in string.tcl. These make straightforward use of the Turbine API and Tcl string capabilities.
7.7. Containers
A container is a TD that allows one to insert and retrieve the TDs it contains. It is used to represent associative arrays, structs, and stack frames.
Lookups are performed on "subscripts", which are serialized, hashable representations of the keys. Each container has a subscript type that represents the type of the keys: this allows for Swift loop variables to be automatically defined. The values stored are "members", which are strings; they typically represent TDs. Because members are plain strings, arbitrary data may also be stored in a container as an optimization.
Rules may wait on the whole container TD just like any other TD. TDs that are members of a container are not special. They are simply linked into the container data structure.
tcl-adlb.c
- allocate_container td type
  Initialize a TD as a container with the given subscript type, which may be integer or string. The members in the container may be of any type
- container_typeof td → type
  Get the subscript type of the container as a Tcl string. Use typeof to get the type of a member.
- adlb::enumerate td subscripts|members|dict|count count|all offset
  - subscripts: Return list of subscript strings
  - members: Return list of member TDs
  - dict: Return Tcl dict mapping subscripts to TDs
  - count: Return integer count of container elements
  - count, all, offset: Return all entries or just count, starting from offset
- container_list td → list
  Obtain all subscripts in the container as a big Tcl list (convenience wrapper around enumerate)
- container_size td → count
  Obtain the number of entries in the container (convenience wrapper around enumerate)
- container_reference c i r
  Make r a reference for c[i]. Thus, when c[i] is inserted, r is closed by the system. r is a copy of c[i]; thus, r must be of the same type as c[i].
data.tcl
- container_insert container_td subscript member
  Link member into the container at the given subscript. member is typically a TD, allowing for linked data.
- container_lookup container_td subscript → member
  Look up the member corresponding to the subscript in the given container
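The subscript-to-member mapping and the enumerate modes can be pictured with a toy model in plain Tcl, where a dict stands in for one container TD. This is a sketch only: toy_enumerate is a hypothetical proc, the TD numbers are made up, and the real calls go through the ADLB data store.

```tcl
# Toy model only: a Tcl dict stands in for one container TD.
# TD numbers and the toy_enumerate proc are made up for illustration.
set c [ dict create ]
dict set c "k" 8      ;# comparable to the log entry <4>["k"]=<8>
dict set c "m" 9

proc toy_enumerate { container what } {
    switch -- $what {
        subscripts { return [ dict keys $container ] }
        members    { return [ dict values $container ] }
        dict       { return $container }
        count      { return [ dict size $container ] }
        default    { error "unknown mode: $what" }
    }
}

puts [ toy_enumerate $c subscripts ]   ;# k m
puts [ toy_enumerate $c members ]      ;# 8 9
puts [ toy_enumerate $c count ]        ;# 2
puts [ dict get $c "k" ]               ;# lookup of subscript "k": 8
```

Tcl dicts keep insertion order, so the subscripts and members come back in the order they were inserted in this toy; no such ordering is claimed for the real container.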
7.7.1. Advanced container operations
These are used to support the full set of possible Swift/T array operations.
Currently, these entries list the existing name, the proposed name, and a proposed shorthand notation (PSN).
(A[i]) is used to express a reference on A[i].
container.tcl
- container_create_nested container subscript type
  c_v_create (CVC)
  Creates subdatum when index is a value.
  Swift/T example:
  (A[i])[j] = f();
- struct_create_nested struct subscript type
  struct_create (SC)
  Creates subdatum in struct.
  Swift/T example:
  s.f[i] = f();
- f_container_create_nested container subscript type
  c_f_create (CFC)
  Creates subdatum when index is a future.
  Swift/T example:
  (A[i])[j] = f();
- container_f_insert container subscript td
  c_f_insert (CFI)
  When subscript is set, insert td at container[subscript].
  Swift/T example:
  A[i] = j;
- container_deref_insert container subscript reference
  c_v_insert_r (CVIR)
  Swift/T example:
  A[3] = (B[j]);
- container_f_deref_insert container subscript reference
  c_f_insert_r (CFIR)
  When subscript and reference are closed, insert the TD stored in reference into container[subscript].
  Swift/T example:
  A[i] = (B[j]);
- container_f_get_integer container subscript → td
  c_f_retrieve_integer (CFRI)
  When container[subscript] is inserted, store a copy of that integer result in td.
  Swift/T example:
  j = A[i];
- f_dereference_integer/float/string/blob reference td
  When reference is closed, copy its value into td.
  dereference_retrieve_integer (DRI)
  dereference_retrieve_float (DRF)
  Swift/T example:
  j = (A[i]);
- f_reference container subscript → reference
  c_f_lookup (CFL)
  Swift/T example:
  f(A[i]);
- f_cref_create_nested container_reference subscript type → reference
  cr_v_create (CRVC)
  Swift/T example:
  A[i][3] = f();
- cref_create_nested container_reference subscript type → reference
  cr_f_create (CRFC)
  Swift/T example:
  (A[i])[j] = f();
- f_cref_lookup_literal container_reference integer td td_type
  cr_v_lookup (CRVL)
  Swift/T example:
  j = (A[i])[3];
- f_cref_lookup container_reference subscript td td_type
  cr_f_lookup (CRFL)
  Swift/T example:
  k = (A[i])[j];
- cref_insert container_reference subscript td
  cr_v_insert (CRVI)
  Swift/T example:
  (A[i])[3] = k;
- f_cref_insert container_reference subscript td
  cr_f_insert (CRFI)
  Swift/T example:
  (A[i])[j] = k;
- cref_deref_insert container_reference subscript td_reference outer_container
  cr_f_insert_r (CRFIR)
  When container_reference and td_reference are set, insert td at container[subscript].
  Swift/T example:
  (A[i])[j] = (B[k]);
functions.tcl
- range container start end
  Fill and close given container with integer subscripts that map to TDs that are integers from start to end
7.7.2. Blobs
Blobs (Binary Large OBjects) may be used to represent byte data (pointer+length). This allows the Turbine data store to hold native data from C/C++/Fortran.
When blobs are retrieved from ADLB, they are stored in a local cache. These entries should be freed before returning control to Turbine.
In Tcl, the blob is a [ list pointer length ] where pointer and length are integers. pointer is the real pointer to the blob’s data; it may be passed into a C function as void*. length is the size in bytes.
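Handling the pointer/length pair is ordinary Tcl list manipulation. The sketch below uses made-up numbers in place of a real blob (in Turbine the pair would come from retrieve_blob), and the 8-byte size of a C double is an assumption that holds on common platforms.

```tcl
# Made-up values standing in for a real blob: in Turbine the pointer
# would come from retrieve_blob, not from a literal.
set blob [ list 140213987654656 1024 ]

# lindex takes the list first, then the index
set pointer [ lindex $blob 0 ]
set length  [ lindex $blob 1 ]

# lassign unpacks both fields in one step
lassign $blob pointer2 length2

# Assumption: 8-byte C doubles, so the blob holds length/8 elements
set count [ expr { $length / 8 } ]
puts "pointer=$pointer length=$length doubles=$count"
```

The argument order of lindex (list, then index) is easy to reverse by mistake; lassign avoids the issue when both fields are needed.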
blob.tcl
- blob_from_string
  Convert a Tcl string into a blob. String will be NULL-terminated.
- string_from_blob
  Convert a blob into a string. String must be NULL-terminated
- blob_from_floats
  Convert a container of floats into a blob, which is actually a C array of doubles
- floats_from_blob
  Convert a blob into a container of floats
- blob_size_async
  Obtain the size of a blob in bytes
tcl-adlb.c
- retrieve_blob td → [ list pointer length ]
  Retrieve a blob from ADLB and store in the local cache. The user must free this from cache. Returns the pointer and length in a Tcl list.
- blob_free td
  Free the blob from the local cache.
- store_blob td pointer length
  Store blob in ADLB
blob.c
Usage
This illustrates what can go in a typical Swift/T leaf function.
It assumes blobs id1 and id2 have been created.
# Retrieve input blob
set L1 [ adlb::retrieve_blob $id1 ]
set pointer1 [ lindex $L1 0 ]
set length1 [ lindex $L1 1 ]
# Call C function
set L2 [ user::compute $pointer1 $length1 ]
# C function returned pointer and length in L2
set pointer2 [ lindex $L2 0 ]
set length2 [ lindex $L2 1 ]
# Store C function result
turbine::store_blob $id2 [ list $pointer2 $length2 ]
# Free from local cache
adlb::blob_free $id1
7.8. I/O
Turbine I/O capabilities.
functions.tcl
- trace
  Simply outputs the values of the given TDs without formatting.
io.tcl
- printf
  As printf() in C. The format string is handled with the Tcl format command.
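Because the formatting is delegated to the Tcl format command, the usual C-style conversions apply. The following plain-Tcl sketch shows format on its own, outside Turbine; the message text is made up for illustration.

```tcl
# Plain Tcl: format applies C-style conversions to ordinary values
set line [ format "%s finished %d tasks in %.2f s" "worker" 3 1.5 ]
puts $line
# prints: worker finished 3 tasks in 1.50 s
```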
files.tcl
TODO: files.tcl
7.9. Void
Operations for void variables
functions.tcl
- make_void
  Create and close a void TD
- zero
  Convert a void to the integer 0.
7.10. Updateables
updateable.tcl
TODO: updateables
7.11. Assertions
assert.tcl
Assertion functions are in assert.tcl. These make straightforward use of the Turbine API and Tcl capabilities. When they fail, they bring the whole Turbine execution down.
7.12. Logging
tcl-turbine.c
- log
  Simply report the given string to stdout with a timestamp. This may be disabled by setting environment variable TURBINE_LOG=0.
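The described behavior can be approximated in plain Tcl as follows. This is a sketch, not the actual implementation in tcl-turbine.c: toy_log and toy_log_line are hypothetical names, and the timestamp format is an assumption made for illustration.

```tcl
# Hypothetical stand-ins for illustration; not the Turbine log command.
# Returns the formatted line, or an empty string when logging is disabled.
proc toy_log_line { msg } {
    global env
    if { [ info exists env(TURBINE_LOG) ] && $env(TURBINE_LOG) eq "0" } {
        return ""
    }
    # Timestamp format is an assumption made for this sketch
    set stamp [ clock format [ clock seconds ] -format "%Y-%m-%d %H:%M:%S" ]
    return "$stamp $msg"
}

# Write the line to stdout unless logging is disabled
proc toy_log { msg } {
    set line [ toy_log_line $msg ]
    if { $line ne "" } {
        puts $line
    }
}

toy_log "allocated TD <4>"
```

Running the script with TURBINE_LOG=0 in the environment produces no output.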
7.13. MPE
MPE is the primary way to obtain profiling and debugging information from Turbine/ADLB. CPU profiling information can also be obtained without recompilation as described in the CPU profiling section below. MPE log entries are automatically created by ADLB if enabled at configure time. One additional MPE function is available from Turbine:
- metadata
  Simply insert the given string into the log.
The MPE log will contain solo events with the "metadata" event type.
It is safe to call this function even if MPE is not configured; it will simply be a no-op.
7.14. System
System functions are in sys.tcl. These make straightforward use of the Turbine API and Tcl capabilities. See the Swift/T documentation for a sense of the purpose of these features.
7.15. Blob manipulation
TODO
8. CPU Profiling
It is possible to obtain information about CPU usage in Turbine by using the Google perftools CPU profiler. This profiler is non-intrusive: it doesn’t require recompilation, only that the application is compiled with debugging symbols (the default). The profiler is a sampling profiler, which means that it periodically snapshots the program’s stack. This is good for finding out where your program spends its time, but will not provide information on the number of times a function is called, or the duration of an individual function call.
The tools are available at http://code.google.com/p/gperftools/, and may be available as an operating system package (e.g. gperftools in Ubuntu). Once installed, you can enable the profiler with the CPUPROFILE and LD_PRELOAD environment variables. E.g. if using MPICH, which automatically passes environment variables to MPI processes, the following is sufficient:
export LD_PRELOAD=/usr/lib/libprofiler.so
export CPUPROFILE=./turbine.prof
turbine -n8 program.tcl
This will output profiling information files with the ./turbine.prof prefix and the process ID appended. Once you have the profiles, you can view the information in various formats, including text and graphical.
pprof --text `which tclsh8.5` turbine.prof_12345 > turbine.prof_12345.txt
pprof --pdf `which tclsh8.5` turbine.prof_12345 > turbine.prof_12345.pdf
Note: on Ubuntu, pprof is renamed to google-pprof.