1. Introduction
This is an introductory tutorial describing the use of Swift and its programming language SwiftScript. It is intended to introduce new users to the basics of Swift. It is structured as a series of simple exercises/examples which you can try for yourself as you read along.
For information on getting an installation of Swift running, consult the Swift Quickstart Guide. We advise you to install the latest stable release of Swift. Return to this document when you have successfully run the test SwiftScript program mentioned there.
There is also a Swift User Guide, which contains more detailed reference material on the topics covered in this manual. All of the programs included in this tutorial can be found in your Swift distribution in the examples/tutorial directory.
2. Hello World
The first example program, hello.swift, outputs a hello world message into a file called hello.txt.
To run hello.swift, change directories to the location of the script and run the swift command as follows.
Make sure the bin directory of your Swift installation is in your PATH.
$ cd examples/tutorial
$ swift hello.swift
Swift svn swift-r3334 (swift modified locally) cog-r2752

RunID: 20100526-1925-8zjupq1b
Progress:
Final status:  Finished successfully:1
$ cat hello.txt
Hello, world!
The basic structure of this program is a type definition, an application procedure definition, a variable definition and then a call to the procedure.
First we define a new type, called messagefile. In this example, we will use this messagefile type for our output message.
type messagefile;
All data in SwiftScript must be typed, whether it is stored in memory or on disk. This example defines a very simple type. Later on we will see more complex type examples.
app (messagefile t) greeting() {
    echo "Hello, world!" stdout=@filename(t);
}
Next we define a procedure called greeting. This procedure will write out the "hello world" message to a file. To achieve this, it executes the unix utility echo with a parameter "Hello, world!" and directs the standard output into the output file.
The actual file to use is specified by the return parameter, t.
messagefile outfile <"hello.txt">;
Here we define a variable called outfile. The type of this variable is messagefile, and we specify that the contents of this variable will be stored on disk in a file called hello.txt.
outfile = greeting();
Now we call the greeting procedure, with its output going to the outfile variable and therefore to hello.txt on disk.
Over the following exercises, we’ll extend this simple hello world program to demonstrate various features of Swift.
3. Language features
3.1. Parameters
Procedures can have parameters. Input parameters specify inputs to the procedure and output parameters specify outputs. Our hello world greeting procedure already uses an output parameter, t, which indicates where the greeting output will go. In this section, we modify the previous script to add an input parameter to the greeting procedure.
We have modified the signature of the greeting procedure to indicate that it takes a single parameter, s, of type string.
We have modified the invocation of the echo utility so that it takes the value of s as a parameter, instead of the string literal "Hello, world!".
We have modified the output file definition to point to a different file on disk.
We have modified the invocation of greeting so that a greeting string is supplied.
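The four changes listed above can be sketched as follows. This is a minimal reconstruction from the description, not necessarily the parameter.swift shipped with your distribution; the greeting text "hello world" is an assumption chosen for illustration.

```swiftscript
type messagefile;

// greeting now takes one input parameter, s, of type string
app (messagefile t) greeting (string s) {
    // echo receives the value of s instead of a string literal
    echo s stdout=@filename(t);
}

// the output is mapped to a different file on disk
messagefile outfile <"parameter.hello.txt">;

// the greeting text is supplied at the call site (assumed text)
outfile = greeting("hello world");
```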
The code for this section can be found in parameter.swift. It can be invoked using the swift command, with output appearing in parameter.hello.txt:
$ swift parameter.swift
Now that we can choose our greeting text, we can call the same procedure with different parameters to generate several output files with different greetings. The code is in manyparam.swift and can be run as before using the swift command.
Note that we can intermingle definitions of variables with invocations of procedures.
When this program runs, there should be three new files in the working directory (manyparam.english.txt, manyparam.francais.txt and manyparam.nihongo.txt) each containing a greeting in a different language.
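A sketch of how such a program might look is given below. The three output file names come from the text above; the variable names and the greeting texts are assumptions made for illustration.

```swiftscript
type messagefile;

app (messagefile t) greeting (string s) {
    echo s stdout=@filename(t);
}

// variable definitions and procedure calls may be intermingled
messagefile english <"manyparam.english.txt">;
english = greeting("hello");

messagefile french <"manyparam.francais.txt">;
french = greeting("bonjour");

messagefile japanese <"manyparam.nihongo.txt">;
japanese = greeting("konnichiwa");
```

The three calls do not depend on each other, so Swift is free to run them in any order.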
Parameters can also be passed by name rather than by position, and a parameter can be given a default value (see section 5.1).
3.2. Adding another application
Now we’ll define a new application procedure. The procedure we define will convert all the text in the input file to upper case.
To do this, we’ll use the unix tr (translate) utility. Here is an example of using tr on the unix command line, not using Swift:
$ echo hello | tr '[a-z]' '[A-Z]'
HELLO
There are two main steps: updating the transformation catalog, and updating the application block.
The transformation catalog lists where application executables are located on remote sites. We need to modify the transformation catalog to define a logical transformation for the tr utility. The transformation catalog can be found in etc/tc.data. There are already several entries specifying where executables can be found. Add a new line to the file, specifying where tr can be found (usually in /usr/bin/tr but it may differ on your system), like this:
localhost tr /usr/bin/tr INSTALLED INTEL32::LINUX null
For now, ignore all of the fields except the second and the third. The second field tr specifies a logical application name and the third specifies the location of the application executable.
Now that we have defined where to find tr, we can use it in SwiftScript.
We can define a new procedure, capitalise, which calls tr.
app (messagefile o) capitalise(messagefile i) {
    tr "[a-z]" "[A-Z]" stdin=@filename(i) stdout=@filename(o);
}
We can call capitalise like this:
messagefile final <"capitalise.2.txt">;
hellofile = greeting("hello from Swift");
final = capitalise(hellofile);
Here is the full program based on this exercise:
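A sketch assembled from the fragments in this section might look like this. The declaration of hellofile does not appear in the fragments above; mapping it to hello.txt is an assumption that follows the file name mentioned in the next section.

```swiftscript
type messagefile;

app (messagefile t) greeting (string s) {
    echo s stdout=@filename(t);
}

// reads the input file on stdin and writes the upper-case text to the output file
app (messagefile o) capitalise(messagefile i) {
    tr "[a-z]" "[A-Z]" stdin=@filename(i) stdout=@filename(o);
}

// intermediate file (assumed mapping)
messagefile hellofile <"hello.txt">;
messagefile final <"capitalise.2.txt">;

hellofile = greeting("hello from Swift");
final = capitalise(hellofile);
```

Because capitalise takes hellofile as its input, Swift runs it only after greeting has produced that file.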
Next, run swift and verify the output is correct.
$ swift capitalise.swift
...
$ cat capitalise.2.txt
HELLO FROM SWIFT
3.3. Anonymous files
In the previous section, the file hello.txt is used only to store an intermediate result. We don’t really care about which name is used for the file, and we can let Swift choose the name.
To do that, omit the mapping entirely when declaring hellofile:
messagefile hellofile;
Swift will choose a filename, which in the present version will be in a subdirectory called _concurrent.
3.4. Datatypes
All data in variables and files has a data type. So far, we’ve seen two types:
- string - this is a built-in type for storing strings of text in memory, much like in other programming languages
- messagefile - this is a user-defined type used to mark disk-resident files as containing messages
SwiftScript also has the built-in types boolean, int and float, which function much like their counterparts in other programming languages.
It is also possible to create user defined types with more structure, for example:
type details {
    string name;
    int pies;
}
Each element of the structured type can be accessed using a . like this:
person.name = "John";
The following complete program, types.swift, outputs a greeting using a user-defined structure type to hold parameters for the message:
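A minimal sketch of such a program follows. The details type and the assignment to person.name are taken from the fragments above; the wording of the message, the value of pies and the output file name are assumptions made for illustration.

```swiftscript
type messagefile;

type details {
    string name;
    int pies;
}

// the structure is passed as a single parameter and its fields are read with "."
app (messagefile t) greeting (details d) {
    echo "Hello. Your name is" d.name "and you have eaten" d.pies "pies." stdout=@filename(t);
}

details person;

person.name = "John";
person.pies = 3;   // assumed value

messagefile outfile <"types.pies.txt">;   // assumed file name

outfile = greeting(person);
```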
Structured types can also be composed of marker types for files. See the later section on mappers for more information about this.
3.5. Arrays
We can define arrays using the [] suffix in a variable declaration:
string words[] = ["how","are","you"];
The program arrays.swift declares an array of strings and passes it to greeting.
Observe three things: the type of the parameter to greeting is now an array of strings, string s[], instead of a single string, string s; elements of the array can be referenced numerically, for example s[0]; and the array is initialised using an array literal, ["how","are","you"].
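A sketch consistent with this description is shown below. The array literal and the output mapping to arrays.txt appear elsewhere in this tutorial; passing exactly three elements to echo is an assumption that matches the three-word literal.

```swiftscript
type messagefile;

// the parameter is an array of strings; elements are indexed from 0
app (messagefile t) greeting (string s[]) {
    echo s[0] s[1] s[2] stdout=@filename(t);
}

messagefile outfile <"arrays.txt">;

// the array is initialised with an array literal
string words[] = ["how","are","you"];

outfile = greeting(words);
```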
3.6. Mappers
A significant difference between SwiftScript and other languages is that data can be referred to on disk through variables in a very similar fashion to data in memory. For example, in the above examples we have seen a variable definition like this:
messagefile outfile <"arrays.txt">;
This means that outfile is a dataset variable, which is mapped to a file on disk called arrays.txt. This variable can be assigned to using = in a similar fashion to an in-memory variable. We can say that outfile is mapped onto the disk file arrays.txt by a mapper.
There are various ways of mapping in SwiftScript. Two forms of mapping, simple named mapping and anonymous mapping, have already been seen in this tutorial. Later exercises will introduce more forms.
In simple named mapping, the name of the file that a variable is mapped to is explicitly listed.
messagefile outfile <"hello.txt">;
This is useful when you want to explicitly name input and output files for your program. An example of this can be seen with outfile in the hello world exercise.
With anonymous mapping, no name is specified in the source code. A name is automatically generated for the file. This is useful for intermediate files that are only referenced through SwiftScript. A variable declaration is mapped anonymously by omitting any mapper definition.
messagefile hellofile;
Later exercises will introduce other ways of mapping from disk files to SwiftScript variables.
3.6.1. The Regexp Mapper
In this exercise, we introduce the regexp mapper. This mapper transforms a string expression using a regular expression, and uses the result of that transformation as the filename to map.
regexp.swift demonstrates the use of this by placing output into a file that is based on the name of the input file. Our input file is mapped to the inputfile variable using the simple named mapper, then we use the regular expression mapper to map the output file. We then use the countwords() procedure to count the words in the input file and store the result in the output file. In order for the countwords() procedure to work correctly, add the wc utility (usually found in /usr/bin/wc) to tc.data.
The following program replaces the suffix of the input file (regexp_mapper.words.txt) with a new suffix (.count) to create regexp_mapper.words.count.
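A sketch of such a program is given below, assuming the regexp_mapper parameters source, match and transform, and assuming that @inputfile is accepted as shorthand for @filename(inputfile). The countfile type name and the variable c are illustrative choices.

```swiftscript
type messagefile;
type countfile;

// wc -w counts the words in the input file; wc must be listed in tc.data
app (countfile t) countwords (messagefile f) {
    wc "-w" @filename(f) stdout=@filename(t);
}

messagefile inputfile <"regexp_mapper.words.txt">;

// match captures everything before the trailing "txt";
// transform reuses that capture (\\1) and appends "count"
countfile c <regexp_mapper;
             source=@inputfile,
             match="(.*)txt",
             transform="\\1count">;

c = countwords(inputfile);
```

With the input name regexp_mapper.words.txt, the captured group is "regexp_mapper.words.", so the mapped output name becomes regexp_mapper.words.count.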
3.6.2. fixed_array_mapper
The fixed array mapper maps a list of files into an array. Each element of the array is mapped into one file in the specified directory. See fixed_array_mapper.swift below.
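The listing is not reproduced here. A minimal sketch might look like the following, assuming the mapper takes a files parameter holding a space-separated list of names; the file names themselves are hypothetical.

```swiftscript
type messagefile;
type countfile;

app (countfile t) countwords (messagefile f) {
    wc "-w" @filename(f) stdout=@filename(t);
}

// hypothetical file names, one per array element
string inputNames = "fixed_array_mapper.1.txt fixed_array_mapper.2.txt fixed_array_mapper.3.txt";
string outputNames = "fixed_array_mapper.1.count fixed_array_mapper.2.count fixed_array_mapper.3.count";

// element 0 maps to the first name in the list, element 1 to the second, and so on
messagefile inputfiles[] <fixed_array_mapper; files=inputNames>;
countfile outputfiles[] <fixed_array_mapper; files=outputNames>;

outputfiles[0] = countwords(inputfiles[0]);
outputfiles[1] = countwords(inputfiles[1]);
outputfiles[2] = countwords(inputfiles[2]);
```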
3.7. foreach
SwiftScript provides a control structure, foreach, to operate on each element of an array in parallel.
In this example, we will run the previous word counting example over each file in an array without having to explicitly list the array elements. The source code for this example is in foreach.swift. This program uses three input files: foreach.1.txt, foreach.2.txt, and foreach.3.txt. After you have run the workflow, you should see that there are three output files: foreach.1.count, foreach.2.count and foreach.3.count, each containing the word count for the corresponding input file. We combine the use of the fixed_array_mapper and the regexp_mapper.
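A sketch of how the two mappers might be combined inside foreach is shown below. The input file names come from the text above; the mapper parameters (files, source, match, transform) and the variable names are assumptions.

```swiftscript
type messagefile;
type countfile;

app (countfile t) countwords (messagefile f) {
    wc "-w" @filename(f) stdout=@filename(t);
}

string inputNames = "foreach.1.txt foreach.2.txt foreach.3.txt";

messagefile inputfiles[] <fixed_array_mapper; files=inputNames>;

// each iteration is independent, so the three counts may run in parallel
foreach f in inputfiles {
    // derive the output name from the input name: foreach.N.txt -> foreach.N.count
    countfile c <regexp_mapper;
                 source=@f,
                 match="(.*)txt",
                 transform="\\1count">;
    c = countwords(f);
}
```

Declaring c inside the loop body gives each iteration its own output variable, so no output array has to be listed by hand.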
3.8. If
Decisions can be made using if, like this:
if(morning) {
    outfile = greeting("good morning");
} else {
    outfile = greeting("good afternoon");
}
if.swift contains a simple example of this. Run if.swift and see that it outputs "good morning". Changing the morning variable from true to false will cause the program to output "good afternoon". Here are the contents of the full script:
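A minimal sketch built around the fragment above might look like this; the output file name if.txt is an assumption.

```swiftscript
type messagefile;

app (messagefile t) greeting (string s) {
    echo s stdout=@filename(t);
}

messagefile outfile <"if.txt">;   // assumed file name

// change true to false to get "good afternoon" instead
boolean morning = true;

if(morning) {
    outfile = greeting("good morning");
} else {
    outfile = greeting("good afternoon");
}
```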
3.9. Sequential iteration
A serial execution of instructions can be carried out using the sequential iteration construct. Iterate expressions allow a block of code to be evaluated repeatedly, with an integer parameter sweeping upwards from 0 until a termination condition holds.
The general form is:
iterate var {
    statements;
} until (terminationExpression);
The following example demonstrates one simple application. We will use iterate to set the value of i from 0 to 4. We will then use i as an index to sequentially print the values of an array.
Here’s the program:
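The listing is not reproduced here. The sketch below is a reconstruction that would produce the "Letter N is:" lines of the sample output; the array name alphabet and the use of the tracef built-in with %i and %s format specifiers are assumptions.

```swiftscript
string alphabet[];
alphabet[0] = "a";
alphabet[1] = "b";
alphabet[2] = "c";
alphabet[3] = "d";
alphabet[4] = "e";

// i starts at 0 and increases by one on each pass;
// the loop stops once the termination condition holds
iterate i {
    tracef("Letter %i is: %s\n", i, alphabet[i]);
} until (i == 5);
```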
You should see a result similar to this:
Swift trunk swift-r5746 cog-r3370

RunID: 20120417-0820-g1q1m8b3
Progress:  time: Tue, 17 Apr 2012 08:20:22 -0500
Letter 0 is: a
Letter 1 is: b
Letter 2 is: c
Letter 3 is: d
Letter 4 is: e
Final status: Tue, 17 Apr 2012 08:20:22 -0500
4. Runtime features
4.1. Visualizing the workflow as a graph
When running a workflow, it is possible to generate a provenance graph at the same time:
$ swift -pgraph graph.dot first.swift
$ dot -ograph.png -Tpng graph.dot
graph.png can then be viewed using your favourite image viewer. The dot application is part of the Graphviz project. More information can be found at http://www.graphviz.org.
4.2. Running on a remote site
As configured by default, all jobs are run locally. In the previous examples, we’ve invoked echo and tr executables from our SwiftScript program. These have been run on the local system (the same computer on which you ran swift). We can also make our computations run on a remote resource. For more information on running Swift on a remote site please see the Site Configuration Guide.
4.3. Starting and restarting
Now we’re going to try out the restart capabilities of Swift. We will make a workflow that will deliberately fail, and then we will fix the problem so that Swift can continue with the workflow.
First we have the program in working form, restart.swift.
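The listing is not reproduced here. The sketch below is a reconstruction consistent with the four jobs named later in this section (touch, echo, broken and a final echo) and with the "process" argument shown in the failure output; the type, procedure and variable names are assumptions.

```swiftscript
type file;

// job 1: create an empty starting file
app (file f) touch() {
    touch @filename(f);
}

// job 2: the first echo
app (file f) processL(file inp) {
    echo "processL" stdout=@filename(f);
}

// job 3: the logical application "broken", resolved through tc.data
app (file f) processR(file inp) {
    broken "process" stdout=@filename(f);
}

// job 4: the final echo, which needs the outputs of jobs 2 and 3
app (file f) join(file left, file right) {
    echo "join" @filename(left) @filename(right) stdout=@filename(f);
}

file start = touch();
file left = processL(start);
file right = processR(start);
file joined = join(left, right);
```

Because join depends on the output of processR, a failure in broken prevents the final echo from running, which is the behaviour the rest of this section relies on.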
We must define some transformation catalog entries:
localhost touch /usr/bin/touch INSTALLED INTEL32::LINUX null
localhost broken /bin/true INSTALLED INTEL32::LINUX null
Now we can run the program:
$ swift restart.swift
Swift 0.9 swift-r2860 cog-r2388

RunID: 20100526-1119-3kgzzi15
Progress:
Final status:  Finished successfully:4
Four jobs run: touch, echo, broken and a final echo. Note that broken is not actually broken yet.
Now we will break the broken job and see what happens. Replace the definition in tc.data for broken with this:
localhost broken /bin/false INSTALLED INTEL32::LINUX null
Now when we run the workflow, the broken task fails:
$ swift restart.swift
Swift 0.9 swift-r2860 cog-r2388

RunID: 20100526-1121-tssdcljg
Progress:
Progress:  Stage in:1  Finished successfully:2
Execution failed:
Exception in broken:
Arguments: [process]
Host: localhost
Directory: restart-20100526-1121-tssdcljg/jobs/1/broken-1i6ufisj
stderr.txt:
stdout.txt:
From the output we can see that touch and the first echo completed, but then broken failed and so swift did not attempt to execute the final echo.
There will be a restart log whose name contains the RunID:
$ ls *20100526-1121-tssdcljg*rlog
restart-20100526-1121-tssdcljg.0.rlog
This restart log contains enough information for swift to know which parts of the workflow were executed successfully.
We can try to rerun it immediately, like this:
$ swift -resume restart-20100526-1121-tssdcljg.0.rlog restart.swift
Swift 0.9 swift-r2860 cog-r2388

RunID: 20100526-1125-7yx0zi6d
Progress:
Execution failed:
Exception in broken:
Arguments: [process]
Host: localhost
Directory: restart-20100526-1125-7yx0zi6d/jobs/m/broken-msn1gisj
stderr.txt:
stdout.txt:
Caused by: Exit code 1
Swift tried to resume the workflow by executing "broken" again. It did not try to run the touch or first echo jobs, because the restart log says that they do not need to be executed again.
The broken job failed again, leaving the original restart log in place.
Now we will fix the problem with "broken" by restoring the original tc.data line that works.
Remove the existing "broken" line and replace it with the successful tc.data entry above:
localhost broken /bin/true INSTALLED INTEL32::LINUX null
Now run again:
$ swift -resume restart-20100526-1121-tssdcljg.0.rlog restart.swift
Swift 0.9 swift-r2860 cog-r2388

RunID: 20100526-1128-a2gfuxhg
Progress:
Final status:  Initializing:2  Finished successfully:2
Swift tries to run "broken" again. This time it works, and so Swift continues on to execute the final piece of the workflow as if nothing had ever gone wrong.
5. Bits
5.1. Named and optional parameters
Parameters do not have to be passed by position. They can also be passed by name, and a parameter can be given a default value that is used when the caller omits it.
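A minimal sketch of both features is shown below. The default text "hello", the greeting "bonjour" and the output file names are assumptions made for illustration.

```swiftscript
type messagefile;

// s has a default value, so callers may leave it out
app (messagefile t) greeting (string s="hello") {
    echo s stdout=@filename(t);
}

messagefile english <"english2.txt">;    // assumed file name
messagefile french <"francais2.txt">;    // assumed file name

// no argument given: the default value of s is used
english = greeting();

// argument passed by name rather than by position
french = greeting(s="bonjour");
```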