:stem: asciimath

The Swift Runtime
-----------------

Swift is a deterministic dataflow language. The lexical ordering of statements is generally irrelevant. What is relevant are the *dependencies* between data.

Values as Futures
~~~~~~~~~~~~~~~~~

Each <> in Swift is a *future*. A future *wraps* a *concrete value* and has two possible states:

Open:: (or unbound). This is the default initial state of a value. The concrete value is absent and cannot yet be used in a concrete operation. It will be available at a later time.

Closed:: (or bound). This is a state in which a concrete value is available and can be used in an operation.

If a value is open at some time, it can be closed only at a later time. It is not possible for a value to become open after it was closed. Value literals are represented with futures that are closed when a program starts execution.

[[runtime:order-of-operations]]
Order of Operations
~~~~~~~~~~~~~~~~~~~

Independent operations in Swift are all executed in parallel. However, operations can depend on values which must be closed before the respective operations can be executed. Consider the following example:

[listing, swift]
----
int a = 1;
int b = 2;
int c = f(a) + g(b);
----

The following operations can be identified:

* three assignments, for variables +a+, +b+ and +c+
* two function invocations: +f(a)+ and +g(b)+
* an addition operation

All these operations are started in parallel as soon as Swift starts running the program. The assignments to variables +a+ and +b+ can continue immediately since they depend only on integer literals, which are closed by default. The invocations of +f+ and +g+ can then continue. The addition has to wait for the results from the invocations of both +f+ and +g+. When those results are available, the addition can be performed and the resulting value can finally be assigned to +c+.

Types of Operations
~~~~~~~~~~~~~~~~~~~

Technically speaking, many things can be considered ``operations''.
However, it is worth emphasizing some of them due to the particular way in which they are executed.

Assignments:: The <> waits for the right hand side to be closed, copies the concrete value from the right hand side to the left hand side, and finally closes the left hand side.

TIP: See also: <>

Application Functions:: An application instance will only run after all of its actual parameters are closed. After an application invocation completes, all the actual return parameters are closed by Swift.

Function Invocations:: When a non-application function is invoked, it does not necessarily wait for all of its actual parameters to be closed. The details depend on the actual definition of the function. Consider the following example:
+
[listing, swift]
----
(int result) product(int f1, int f2) {
    if (f1 == 0) {
        result = 0;
    }
    else {
        result = f1 * f2;
    }
}

int r1 = product(0, x);
int r2 = product(2, x);
----
+
Two cases exist:
+
. In the first invocation of the +product+ function, the parameter +f1+ is zero. The first branch of the +if+ condition is taken, which does not include an operation that waits for the value of +f2+ to be closed.
. In the second invocation, the value of +f1+ is non-zero. The second branch of the +if+ is taken. This branch contains an operator that has the values of both +f1+ and +f2+ as parameters. In this case the +product+ function needs to wait for +f2+ to be closed before producing a result.
+
For functions defined in the Swift <>, it is implied that the implementation will always wait for the value of the parameters to be closed. Any exceptions to this rule are explicitly documented.

Operators:: There is no semantic difference between operator invocations and library function invocations, so the above rule applies. In particular, the current implementation does not implement short-circuit evaluation for boolean operators.

[[runtime:arrays-and-iterations]]
Arrays and Iterations
~~~~~~~~~~~~~~~~~~~~~

Arrays in Swift are sparse.
This means that array sizes, in general, can only be fully determined at run-time. Because of this, the array structure itself has the properties of a future. Consider the following example:

[listing, swift]
----
int[] a, b;

a[0] = 1;

iterate i {
    int value = someComplexFunction(i);
    a[i + 1] = value;
} until (value > 100);

foreach x, i in a {
    b[i] = f(x);
}
----

Since +iterate+ and +foreach+ run in parallel, the +foreach+ will start before +a+ is fully constructed. We assume that the +iterate+ condition eventually becomes +true+. In order for the program to terminate and function deterministically, the +foreach+ must eventually terminate and it must not do so before all the items have been added to +a+. Arrays must therefore have two states:

Open array:: In this state it is unknown whether more items will be added to an array or not.

Closed array:: Both the size of the array and the indices corresponding to all the items in the array are known.

A +foreach+ statement will start, in parallel, an iteration for each item available in an array. As more items are added to the array, +foreach+ will start the corresponding iterations. When the array becomes closed and all iterations complete, +foreach+ will complete.

When are Arrays Closed
^^^^^^^^^^^^^^^^^^^^^^

The determination of when an array can be closed is made based on analyzing all source code locations in which that array is written to. In the last example above, all writing operations to +a+ are done within the +iterate+ statement. The Swift compiler will generate code to close +a+ as soon as the +iterate+ statement completes.

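
The two array states and the completion rule for +foreach+ can be pictured with a small model. The following Python sketch is only an illustration and not how Swift is implemented: the `FutureArray` class and the `foreach` helper are hypothetical names invented here, and iterations run synchronously instead of in parallel.

```python
class FutureArray:
    """Toy model of a Swift array: sparse, and open until it is closed."""

    def __init__(self):
        self.items = {}        # index -> value; indices may have gaps
        self.closed = False
        self._listeners = []   # callbacks notified about every item

    def subscribe(self, on_item):
        # Deliver the items that are already available, then later ones.
        for index, value in list(self.items.items()):
            on_item(index, value)
        self._listeners.append(on_item)

    def put(self, index, value):
        if self.closed:
            raise RuntimeError("array is closed; no more items can be added")
        self.items[index] = value
        for on_item in self._listeners:
            on_item(index, value)

    def close(self):
        self.closed = True


def foreach(array, body):
    """Run body(value, index) for each item as it becomes available.

    Returns a function that reports whether the foreach has completed.
    Iterations finish immediately in this synchronous model, so the only
    remaining condition is that the array has been closed.
    """
    array.subscribe(lambda index, value: body(value, index))
    return lambda: array.closed


a, b = FutureArray(), FutureArray()
a.put(0, 1)
done = foreach(a, lambda x, i: b.put(i, x * 10))  # starts while a is open
a.put(1, 5)            # a newly added item triggers a new iteration
still_open = done()    # more items could still arrive at this point
a.close()              # e.g. once the iterate statement has completed
```

In this model, `done()` only becomes true after `a.close()`, which mirrors the rule that +foreach+ cannot complete while its array is still open.
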

Writing to an Array that is Being Iterated on
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Swift allows writing to an array that is being iterated on using the +foreach+ statement:

[listing, swift]
----
int[] a;

a[0] = 1;

foreach x, i in a {
    int value = someComplexFunction(i);
    if (value <= 100) {
        a[i + 1] = value;
    }
}
----

Swift handles array closing slightly differently in the case of ``self referencing'' +foreach+ statements. The +a+ array will be closed when both of the following conditions are true:

. All writing operations to +a+ outside of the +foreach+ have completed
. Iterations for all available items in +a+ have completed

This can be used to implement everything that can be implemented using the +iterate+ statement, possibly more clearly. Users are encouraged to use +foreach+ instead of +iterate+.

Circular Dependencies
~~~~~~~~~~~~~~~~~~~~~

Circular dependencies are situations in which two or more values form a dependency cycle through operations applied on them. The simplest case is that of two mutually dependent values:

[listing, swift]
----
int a, b;

a = f(b);
b = f(a);
----

The variable +a+ cannot be closed until +b+ is closed, which in turn cannot be closed until +a+ is closed. Situations like this will cause a Swift program to stop progressing. This is detected by the Swift run-time and results in a run-time error. While the above situation can theoretically be detected at compile-time, this cannot be done in all cases. Consider the following example:

[listing, swift]
----
int[] a;

a[0] = 1;
a[1] = a[getIndex(1)];
a[2] = a[getIndex(2)];
----

It is possible, but not certain, that at run-time the above code will result in:

[listing, swift]
----
int[] a;

a[0] = 1;
a[1] = a[2];
a[2] = a[1];
----

This can only be determined by knowing the values returned by the +getIndex()+ function, which in general can only be done at run-time.

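
One way to picture run-time detection is as a search for values that can never be closed. The Python sketch below is only an illustration under assumed names (`find_stuck_values` and the `waits_on` mapping are hypothetical); this section does not describe how the Swift run-time actually detects that a program has stopped progressing.

```python
def find_stuck_values(waits_on):
    """Return the names of values that can never be closed.

    waits_on maps each value name to the set of value names that must be
    closed before that value can be closed.
    """
    closed = set()
    progressed = True
    while progressed:
        progressed = False
        for name, dependencies in waits_on.items():
            if name not in closed and dependencies <= closed:
                closed.add(name)   # all inputs are closed, so this closes
                progressed = True
    # Whatever is still open once progress stops is part of, or blocked
    # by, a dependency cycle.
    return sorted(set(waits_on) - closed)


# getIndex(1) returned 2 and getIndex(2) returned 1: a[1] and a[2] form a cycle.
cyclic = find_stuck_values({"a[0]": set(), "a[1]": {"a[2]"}, "a[2]": {"a[1]"}})

# getIndex(1) returned 0 and getIndex(2) returned 1: every value can be closed.
acyclic = find_stuck_values({"a[0]": set(), "a[1]": {"a[0]"}, "a[2]": {"a[1]"}})
```

The same source program yields a non-empty result in the first case and an empty one in the second, which is why the check cannot in general be made at compile-time.
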

Error Handling
~~~~~~~~~~~~~~

TODO

[[runtime:file-mapping]]
File Mapping
~~~~~~~~~~~~

File mapping is the process through which values stored in Swift variables of <> are associated with physical files. This process is implemented by *mappers* which are specified using <>. A non-composite mapped value can be associated with a file using the <> or its short form:

[listing, swift]
----
file f <"file.dat">;
----

The above code associates the variable +f+ with a local file named +file.dat+ which is assumed to be in the directory from which Swift is invoked. It is possible to use more complex path names or even *URLs* to associate a Swift variable with files that do not necessarily reside in the current directory:

[listing, swift]
----
file f1 <"/tmp/file.dat">;
file f2 <"http://example.org/index.html">;
----

For a list of all supported remote file access methods, please see <>.

Values of composite types containing file-valued data can be mapped in bulk using one of the additional <> provided by Swift. For example, the <> can be used to glob files in a directory and map them to an entire array:

[listing, swift]
----
file[] a ;
----

[[runtime:file-mapping:implicit-mapping]]
Implicit Mapping
^^^^^^^^^^^^^^^^

A mapped type value that is not mapped explicitly is *implicitly mapped*. Specifically, Swift will map it to a deterministic but opaque temporary file.

Input or Output
^^^^^^^^^^^^^^^

Swift distinguishes between input and output mapped data based on whether explicit assignments are present in the program. If a variable is assigned to in a Swift program, Swift considers that variable to be an output. Otherwise, Swift marks it as an input. When a variable is marked as an input, Swift requires that the corresponding files be present, unless the variable is an array that can be empty. An input mapped variable is considered to be implicitly assigned.
Example:

[listing, swift]
----
file f <"input.dat">; <1>
file g <"output.dat">; <2>

g = cat(f);
----
<1> Variable +f+ is not explicitly assigned to. It is therefore an input variable, and Swift implicitly assigns a file value representing the +input.dat+ file to it. It is an error for +input.dat+ not to exist as a file.
<2> Variable +g+ is explicitly assigned to. It is therefore an output and it will be created by Swift during the program's execution.

It does not make sense for certain mappers to be used for output (such as the <>), since their operation depends on having a set of physical files present.

[[runtime:assignment-of-mapped-values]]
Assignment of Mapped Values
^^^^^^^^^^^^^^^^^^^^^^^^^^^

When a mapped value is assigned to a mapped type variable, and the destination variable is not <>, Swift guarantees that the file that the destination variable is mapped to will exist. This is generally done by copying the file. However, Swift may optimize this when possible, in particular by creating symbolic links on systems that support them.

[[runtime:application-functions-execution]]
Application Functions Execution
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Applications in Swift are generally executed on remote computing resources. In the Swift language, application functions are specified in a way that is independent of where and how applications are run, which allows application instances to be scheduled efficiently based on resource availability. When an application function is invoked, the following steps are taken:

. Swift waits for all the actual parameters to the application function to be closed
. *Site selection*: a remote resource is selected from all available resources based on whether they contain the given application, load, and other run-time information (see Section on <>).
. A file sandbox is created, where the application input and output files will be stored
.
All files corresponding to mapped values in the actual parameters are copied to the application sandbox ("stage-in")
. The application is executed remotely using one of the available <>
. All files corresponding to mapped values in the actual return parameters are copied back from the application sandbox, and the actual return parameters are closed ("stage-out")
. The application sandbox is deleted

The application execution is performed indirectly. A small script, called "wrapper" (or +_swiftwrap+), is used to implement the following functions:

- set up the details of the application sandbox, such as directory structures where input and output files go
- perform basic tests of the environment and try to generate user-friendly error messages if something is wrong
- copy or link files from a resource-local swift cache to the application sandbox and back
- optionally clean up the application sandbox
- record the status of the application execution
- log various application timing information

There are three ways in which the above operations can be performed and they will be explained in detail in the following sections: *swift staging*, *provider staging*, and *wrapper staging*.

Swift Staging
^^^^^^^^^^^^^

Swift staging is historically the first mechanism used by Swift to deal with remote application files. In swift staging mode (selected through the `staging: "swift"` <> option), for each job, the Swift runtime selects a site as described in <>, and, using that site, it performs the following operations in order:

. Using the <> specified in the configuration file, it creates a *shared directory* in the location specified by the `workDirectory` site property that will mirror the local directory structure with respect to all files used by applications running on that site. This directory is only created once per site per swift program execution, and it is initially empty.
It is guaranteed that two invocations of Swift programs will have different shared directories on a given site.
. In the shared directory, Swift creates the directory structure needed by the job's input and output files. For example, if an application uses the following input:
+
[listing, swift]
----
file f <"data/f.dat">;
----
+
then Swift will create a directory named `data` inside the site shared directory.

. Using the filesystem provider, Swift copies all the application's input files into their respective directories inside the shared directory. Files that are already in the shared directory are skipped.
. The application is invoked through `_swiftwrap`. The essential steps taken by `_swiftwrap` in Swift staging mode are as follows:
.. Create a sandbox directory either inside the shared directory, or in a temporary location if the `scratch` property is specified for the site
.. For each of the application's input files, either copy the file from the shared directory into the sandbox directory or create a symbolic link inside the sandbox directory to the file in the shared directory. The choice of whether to copy or link is determined by the existence of the `scratch` property. If the property is defined for the site, then the files will be copied. The copying process preserves the directory structure.
.. Run the application
.. If the application returns a zero exit code indicating success, then:
... Check that all the output files were created by the application and fail if not
... Move the output files from the sandbox directory back to the shared directory, preserving the directory structure
... Remove the job directory
.. Exit signalling success or failure; the exact method of signalling depends on the value of the `statusMode` property. If set to `provider`, then `_swiftwrap` exits with an exit code equal to the exit code of the application.
If set to `files`, then `_swiftwrap` creates either an empty success file or a failure file containing the application exit code
. Transfer and check the status files for the job status if `statusMode` is `files`
. Copy all application output files back to the machine that Swift is running on

For example, consider the following Swift program:

[listing, swift]
----
app (file outf) cat(file inf) {
    "/bin/cat" filename(inf) stdout = filename(outf);
}

file inf <"inputs/a.dat">;
file outf <"outputs/b.dat">;

outf = cat(inf);
----

and the following configuration:

[listing, swiftconf]
----
site.cluster {
    execution {
        type: "GRAM"
        url: "login.cluster.example.org"
        jobManager: "PBS"
    }

    filesystem {
        type: "GSIFTP"
        url: "login.cluster.example.org"
    }

    statusMode: "provider"
    staging: "swift"
    workDirectory: "/homes/johndoe/swiftwork"
}

sites: [cluster]
----

Swift would execute the `cat` application as follows:

image:swift-staging.svg[]

A few observations are in order:

- `statusMode: "files"` was historically used to deal with execution providers that did not have a mechanism for reporting the application exit code. The most notable example is the early implementation of the GRAM protocol from the Globus Toolkit version 2. It is unlikely to be needed.
- when running on compute clusters, the shared directory, and therefore the work directory, needs to reside on a shared filesystem that is accessible from the compute nodes.
- transfers and other file operations are all controlled directly by Swift and are governed by <> settings, such as `fileTransfersThrottle` and `fileOperationsThrottle`. Throttling limits the number of concurrent operations and is useful to increase stability without affecting performance.
- use of a `scratch` option pointing to a compute-node local filesystem can yield better performance if the work directory resides on a shared filesystem that has high latencies and high throughput.
The performance improvement also depends on how the application accesses its input files and is more noticeable if the application uses a random-access pattern on its input files rather than sequential reads.
- while Swift staging is mostly superseded by provider staging through Coasters, there are still legitimate reasons to use Swift staging, such as running MPI applications through GRAM and a local resource manager (e.g. PBS).

Provider Staging
^^^^^^^^^^^^^^^^

With provider staging, Swift delegates the task of transferring files to the compute nodes to the execution provider. The full functionality needed to manage all the file operations required by a job is currently only supported by the <> execution provider, so provider staging requires the use of Coasters.

Since the provider takes care of shipping files to the compute nodes, there is no strict need for a shared filesystem on the cluster except as needed by <>. Files on the Swift side, as well as the job sandboxes, can reside on directly-attached disks. This can improve performance with respect to shared filesystems that would have non-trivial latencies.

Provider staging is enabled by specifying one of the `local`, `service-local`, `shared-fs` or `direct` choices for the <> site configuration property. The meaning of the various choices is as follows:

`local`:: files are assumed to reside on the machine where Swift is running. Coasters will copy the files using the Coaster Service as a proxy. This allows files to be copied even if the compute nodes cannot reach networks outside of the cluster they belong to.

`service-local`:: files are assumed to reside on the machine where Swift is running and the Coaster Service is assumed to be running on the same machine.

`shared-fs`:: files are assumed to reside on a shared filesystem that is accessible by the compute nodes and are copied using standard POSIX copying routines.
`direct`:: this is an experimental mode similar to `shared-fs` in which no copying to a sandbox is actually done. Instead, applications are passed the absolute paths to the files involved.

Using the same example application as in the case of Swift staging, an example set of operations performed by Swift with `staging: local` is shown below:

image:provider-staging.svg[]

Wrapper Staging
^^^^^^^^^^^^^^^

Wrapper staging is an experimental feature that allows `_swiftwrap` to perform the necessary file staging operations. It generally requires a shared filesystem accessible from the compute nodes.

[[runtime:data-flow-and-staging]]
Data flow and staging
~~~~~~~~~~~~~~~~~~~~~

Overview
^^^^^^^^

The execution components involved in a Swift workflow are the client, the swift service and the workers. The client is the program that executes the workflow described in a swift script and is invoked by the swift command. The service may be started separately or automatically by the swift client and is responsible for provisioning resources from clouds, clusters or HPC systems. The workers are launched by the swift-service and are responsible for controlling the execution of the user's application on the compute nodes.

Different clusters, HPC systems, and cloud vendors may have shared file-systems, varying network characteristics and local disks available, which can be utilized differently to marshal data efficiently within a workflow. Data flow refers to this movement of data within a workflow. On distributed systems with varying levels of shared resources, the Swift client and service coordinate the flow of data among the worker nodes such that the data required for computation is available to the worker prior to the execution of the user's application, and ensure that the computed results are captured once tasks run to completion.

There are six different staging methods that are supported by Swift.
They are:

* Local
* Direct
* Wrapper
* Swift
* Shared-fs
* Service-local

These staging methods are explained in detail in the following sections.

Legend for the following sections:

image:figs/legend.png["Legend"]

Staging method : local
^^^^^^^^^^^^^^^^^^^^^^

Data flow in local staging:

image:figs/local.png["Local staging"]

Summary
+++++++

The local staging method is designed for shared-nothing architectures such as clusters and clouds with no shared file-systems. The data originates on the node where the client runs and all data transfers are done explicitly over the network. This method avoids using a shared filesystem to transfer the files, since in many cases the shared filesystem is shared among multiple users, which adds congestion, and it may also be unsuitable for certain file access patterns. The client and service need not be on the same machine, which allows a client running on a local workstation to channel data through a service on the headnode of a cluster (say Cluster1) to compute nodes provisioned from Cluster1. This is the default file staging method as it works on all computational resources. Since all the data is transferred via the swift-service, the network bandwidth of the service could bottleneck the data flow. Similarly, if the swift client is running remotely, the network links between the client and the service could potentially become a bottleneck for large volumes of data.

When to use this mode
+++++++++++++++++++++

The data volumes that need to be transferred between the client and each worker are not more than hundreds of MB. As data sizes approach GBs of data per task, other transport mechanisms such as Globus transfers are worth considering.

When each task either consumes or generates a large number of small files, shared-filesystem based copies can be very slow. The local staging method is an ideal candidate for this scenario.
However, when there are a large number of files involved, the filesystem of the system on which the client is executing could become a bottleneck. Using a faster non-disk filesystem when available generally improves performance significantly.

Example configs
+++++++++++++++

[listing,swiftconf]
-----
sites: midway

site.midway {
    execution {
        type: "coaster"
        URL: "swift.rcc.uchicago.edu"
        jobManager: "ssh-cl:slurm" # Client connects remotely to the login node.
        options {
            nodeGranularity: 1
            maxNodesPerJob: 1
            jobQueue: "sandyb"
            maxJobs: 1
            tasksPerNode: 1
            maxJobTime: "00:08:20"
        }
    }
    staging: "local"
    workDirectory: "/tmp/"${env.USER}
    app.date {
        executable: "/bin/date"
        maxWallTime: "00:05:00"
    }
}
-----

Performance
+++++++++++

All data flow in this staging method is over the network links from the client node and service node and, as a result, the network capacity of the client node is a potential limiting factor for large data volumes.

When several small files are involved, or with sufficiently large files, the filesystem on the client node can become a bottleneck.

There are performance limitations to the staging/transport mechanism that swift uses, which could limit transfer throughputs. [TODO: Data to support this would be very nice]

Notes:
++++++

When running using local coasters (local instead of ssh-cl), the client and service run on the same node. In this case, the network links are between the service and workers.

Staging method : Direct
^^^^^^^^^^^^^^^^^^^^^^^

Data flow with Direct staging:

image:figs/direct.png["Direct staging"]

Data flow with Direct staging and a scratch directory:

image:figs/direct_with_scratch.png["Direct staging with scratch directory"]

Summary
+++++++

The direct staging mode is designed for computational resources with high-performance shared filesystems. This mode requires that a shared filesystem such as NFS, Lustre, or even FUSE-mounted-S3 is mounted across the nodes where the client, service, and the workers are executing.
Instead of Swift managing network transfers, the network transfers are implicitly managed by the shared-filesystem. The apps run in sandbox directories created under the workDirectory, but the tasks themselves will receive absolute paths for the input and output files. For applications that are IO bound, writing directly to the shared-filesystem can adversely affect the shared filesystem performance. To avoid this there is an option to specify a “scratch” folder on a local disk on the compute nodes.

When to use this mode
+++++++++++++++++++++

Large volumes of data are either consumed or generated by the application and a High Performance shared-filesystem is available across the nodes. On systems which have shared-filesystems, with I/O bandwidth that exceeds the network links between the headnode and the worker nodes, using the network to transfer data to the compute nodes could be sub-optimal.

When a high-performance shared filesystem is available, such as the case on many supercomputing systems, there is sufficient I/O bandwidth to support several applications reading and writing to the filesystem in parallel.

Another scenario is when the shared-filesystem is sensitive to creation and deletion of small files and directories. The swift workers create a sandbox directory for each task, which is (3 : TODO:confirm this with Mihael) levels deep. Using the direct mode with the workDirectory on a local disk (say /tmp) could avoid the overhead from swift's mechanisms for sandboxing tasks.

Example configs
+++++++++++++++

The following is an example for the direct staging mode.

* Staging method is set to “direct”.
* workDirectory may be set to the shared filesystem or a local filesystem.

In this case, Swift assumes that file variables point at files on the shared filesystem. The apps which are executed on the workers resolve the file variables to absolute paths to the input and output files on the shared-filesystem.


[listing,swiftconf]
-----
sites: midway

site.midway {
    execution {
        type: "coaster"
        URL: "swift.rcc.uchicago.edu"
        jobManager: "local:slurm"
        options {
            nodeGranularity: 1
            maxNodesPerJob: 1
            jobQueue: "sandyb"
            maxJobs: 1
            tasksPerNode: 1
            maxJobTime: "00:08:20"
        }
    }
    staging: direct
    workDirectory: "/tmp/"${env.USER}"/swiftwork"
    app.bash {
        executable: "/bin/bash"
        maxWallTime: "00:05:00"
    }
}
-----

The following is an example for the direct staging mode with a scratch directory.

* Staging method is set to “direct”
* workDirectory may be set to the shared filesystem or a local filesystem.
* Scratch is set to a directory on the local disks of the workers.

Since the staging method is set to “direct”, Swift will assume that files are on a shared file-system. In the context of the user application, the file variables will resolve to absolute paths of the input/output files in the scratch directory. Before the workers start the execution of user tasks, they will copy the input files from the shared-filesystem to the scratch directory, and after execution they will copy the output files from the scratch directory back to the shared-filesystem.

[listing,swiftconf]
-----
sites: midway

site.midway {
    execution {
        type: "coaster"
        URL: "swift.rcc.uchicago.edu"
        jobManager: "local:slurm"
        options {
            nodeGranularity: 1
            maxNodesPerJob: 1
            jobQueue: "sandyb"
            maxJobs: 1
            tasksPerNode: 1
            maxJobTime: "00:08:20"
        }
    }
    staging: direct
    workDirectory: "/tmp/"${env.USER}"/swiftwork"
    scratch: "/scratch/local/"${env.USER}"/work/"
    app.bash {
        executable: "/bin/bash"
        maxWallTime: "00:05:00"
    }
}

TCPPortRange: "50000,51000"
lazyErrors: false
executionRetries: 0
keepSiteDir: true
providerStagingPinSwiftFiles: false
alwaysTransferWrapperLog: true
-----

Notes:
++++++

TODO : Details of the filename behavior in apps and within swiftscript body.
When this configuration is used, the worker copies the input files from the shared-filesystem to the scratch directory, and the user application will get the path to the file on scratch when `filename()` and its shorthand `@` are used in the app definition. When used outside of app definitions, `filename()` and `@` will point at the files on the shared-filesystem.

Performance
+++++++++++

"Direct" is theoretically the optimal way to use the shared-filesystem. There are no unnecessary copies, and only the application that requires a file accesses its data.

If the data access pattern of the application involves random seeks or the creation of several small intermediate files, the "scratch" option allows you to offload sub-optimal file access patterns to a local disk or memory. This avoids costly accesses on the shared-filesystem and, indirectly, on the network.

Staging method : Swift
^^^^^^^^^^^^^^^^^^^^^^

Data flow with staging method Swift:

image:figs/swift.png["Swift staging"]

Summary
+++++++

Swift staging involves the client accessing files through a supported method such as ssh or local-filesystem access, and making the inputs available to the workers through a work directory on a shared filesystem. This staging method uses an intermediate staging location on a shared FS, so each file is, in addition to being read from its initial location, written to and read from the shared FS, both of which are overhead. The only advantage is that you don't need coasters to use it and it is supported on a large number of computational resources.

This is the default staging mechanism used if no staging method is defined in the swift config file.

When to use this mode
+++++++++++++++++++++

1. You can access data using one of the supported methods like:
   * local filesystem access
   * ssh - Use scp to access files
   * GSIFTP
   * GridFTP
2. A shared FS is present that works well for your data access patterns.
3. You want to use a non-coaster execution provider.

GSIFTP and GridFTP are not actively tested, and are not guaranteed to work.

Example configs
+++++++++++++++

[listing,swiftconf]
-----
sites: midway

site.midway {
    execution {
        type: "coaster"
        URL: "swift.rcc.uchicago.edu"
        jobManager: "local:slurm"
        options {
            nodeGranularity: 1
            maxNodesPerJob: 1
            jobQueue: "sandyb"
            maxJobs: 1
            tasksPerNode: 1
            maxJobTime: "00:08:20"
        }
    }
    filesystem {
        type: "local"
        URL: "localhost"
    }
    staging: "swift"
    workDirectory: "/scratch/midway/"${env.USER}"/swiftwork"
    app.bash {
        executable: "/bin/bash"
        maxWallTime: "00:05:00"
    }
}

TCPPortRange: "50000,51000"
lazyErrors: false
executionRetries: 0
keepSiteDir: true
providerStagingPinSwiftFiles: false
alwaysTransferWrapperLog: true
-----

Performance
+++++++++++

The Swift staging method uses an intermediate staging location on a shared FS: each file is, in addition to being read from its initial location, written to and read from the shared FS, both of which are overhead. The only advantage is that you don't need coasters to use it and it is supported in a lot of cases.

Staging method : Wrapper
^^^^^^^^^^^^^^^^^^^^^^^^

image:figs/wrapper.png["Wrapper staging"]

Summary
+++++++

The wrapper staging method relies on a wrapper script used to stage files to and from the swift workers. Currently the wrapper staging method supports fetching files over HTTP and to and from the client filesystem. The wrapper staging method provides a flexible interface to add support for third party transfer mechanisms to the swift worker.

TODO: Is *guc* supported ?

When to use this mode
+++++++++++++++++++++

The repository for the input or output data can be accessed over a supported transfer mechanism.

Alternatively, the data can be accessed only by an exotic transfer mechanism, which could be incorporated into the supported methods for wrapper staging.


Example configs
+++++++++++++++

[listing,swiftconf]
-----
sites: midway

site.midway {
    execution {
        type: "coaster"
        jobManager: "local:local"
        URL: "localhost"
    }
    filesystem {
        type: local
    }
    staging: "wrapper"
    scratch: "/tmp/"${env.USER}"/swift-scratch"
    workDirectory: "swiftwork"
    app.ALL {
        executable: "*"
    }
}

wrapperStagingLocalServer: "file://"
-----

[[runtime:site-selection]]
Site Selection
~~~~~~~~~~~~~~

TODO