1. Overview
This guide explains what is required to run Swift on various system types, with details for specific installations on which Swift is currently used. For a given system type, most instructions should apply to any system of that type. However, details such as queue names or file system locations will have to be customized by the user.
2. Prerequisites
This guide assumes that you have already downloaded and installed Swift. It assumes that Swift is in your PATH and that you have a working version of Sun Java 1.5+. For more information on downloading and installing Swift, see the Swift Quickstart Guide.
3. Beagle (Cray XE6)
Beagle is a Cray XE6 supercomputer at UChicago. It employs a batch-oriented computational model wherein a PBS scheduler accepts users' jobs and queues them in the queueing system for execution.
This computational model requires a user to manually prepare submit files, track job submissions, handle checkpointing, manage input/output data, and deal with exceptional conditions. Running Swift on Beagle accomplishes these tasks with minimal manual intervention and maximal opportunistic computation time on Beagle queues. The following sections discuss the specifics of running Swift on Beagle. More detailed information about Swift and its workings can be found on the Swift documentation page here:
http://swiftlang.org/docs/index.php
More information on Beagle can be found on UChicago Beagle website here:
http://beagle.ci.uchicago.edu
3.1. Requesting Access
If you do not already have a Computation Institute (CI) account, you can request one at https://www.ci.uchicago.edu/accounts/. This page will give you a list of resources you can request access to.
If you already have an existing CI account, but do not have access to Beagle, send an email to support@ci.uchicago.edu to request access.
3.2. Connecting to a login node
Once you have an account, you should be able to access a Beagle login node with the following command:
ssh yourusername@login.beagle.ci.uchicago.edu
3.3. Getting Started with Swift
Follow the steps outlined below to get started with Swift on Beagle:
step 1. Load the Swift and sun-java modules on Beagle as follows: module load swift sun-java
step 2. Create and change to a directory where your Swift-related work will live (say, mkdir swift-lab, followed by cd swift-lab).
step 3. To get started with a simple example that runs /bin/cat to read an input file data.txt and write it to an output file f.nnn.out, start by writing a simple Swift source script as follows:
type file;

/* App definition */
app (file o) cat (file i)
{
  cat @i stdout=@o;
}

file out[]<simple_mapper; location="outdir", prefix="f.", suffix=".out">;
file data<"data.txt">;

/* App invocation: n times */
foreach j in [1:@toint(@arg("n","1"))] {
  out[j] = cat(data);
}
step 4. Next, create a sites file. An example sites file (sites.xml) is shown below:
<config> <pool handle="pbs"> <execution provider="coaster" jobmanager="local:pbs"/> <!-- replace with your project --> <profile namespace="globus" key="project">CI-CCR000013</profile> <profile namespace="globus" key="providerAttributes"> pbs.aprun;pbs.mpp;depth=24</profile> <profile namespace="globus" key="jobsPerNode">24</profile> <profile namespace="globus" key="maxTime">1000</profile> <profile namespace="globus" key="slots">1</profile> <profile namespace="globus" key="nodeGranularity">1</profile> <profile namespace="globus" key="maxNodes">1</profile> <profile namespace="karajan" key="jobThrottle">.63</profile> <profile namespace="karajan" key="initialScore">10000</profile> <filesystem provider="local"/> <!-- replace this with your home on lustre --> <workdirectory >/lustre/beagle/ketan/swift.workdir</workdirectory> </pool> </config>
step 5. Next, create the Swift config and tc files. The config file (cf) is as follows:
wrapperlog.always.transfer=true
sitedir.keep=true
execution.retries=1
lazy.errors=true
use.provider.staging=true
provider.staging.pin.swiftfiles=false
foreach.max.threads=100
provenance.log=false
The tc file (tc) is as follows:
pbs cat /bin/cat null null null
More about config and tc file options can be found in the Swift documentation.
step 6. Run the example using the following command line:
swift -config cf -tc.file tc -sites.file sites.xml catsn.swift -n=1
You can change the value of -n to run that many concurrent cat invocations.
step 7. Swift will show a status message of "done" after the job has completed its run in the queue. Check the output in the generated outdir directory (ls outdir).
Swift 0.93RC5 swift-r5285 cog-r3322

RunID: 20111218-0246-6ai8g7f0
Progress:  time: Sun, 18 Dec 2011 02:46:33 +0000
Progress:  time: Sun, 18 Dec 2011 02:46:42 +0000  Active:1
Final status:  time: Sun, 18 Dec 2011 02:46:43 +0000  Finished successfully:1
3.4. Larger Runs on Beagle
A key factor in scaling up Swift runs on Beagle is setting the sites.xml parameters appropriately. The following sites.xml parameters must be set to match the scale intended for a large run:
- maxTime: The expected walltime for completion of your run, in seconds.
- slots: The maximum number of PBS jobs/blocks that the coaster scheduler will have running at any given time. On Beagle, this number determines how many qsubs Swift will submit for your run. Typical values range between 40 and 60 for large runs.
- nodeGranularity: The number of nodes per job. It restricts the number of nodes in a job to a multiple of this value. The total number of workers will then be a multiple of jobsPerNode * nodeGranularity. For Beagle, jobsPerNode is 24, corresponding to its 24 cores per node.
- maxNodes: The maximum number of nodes a job will pack into its qsub. This parameter determines the largest single job that your run will submit.
- jobThrottle: A factor that determines the number of tasks dispatched simultaneously. The intended number of simultaneous tasks should match the number of cores targeted. The number of tasks is calculated from the jobThrottle factor as follows (a worked example follows the sites.xml below):

Number of Tasks = (jobThrottle x 100) + 1
The following is an example sites.xml for a 50-slot run with each slot occupying 4 nodes (thus, a 200-node run):
<config> <pool handle="pbs"> <execution provider="coaster" jobmanager="local:pbs"/> <profile namespace="globus" key="project">CI-CCR000013</profile> <profile namespace="globus" key="ppn">24:cray:pack</profile> <!-- For swift 0.93 <profile namespace="globus" key="ppn">pbs.aprun;pbs.mpp;depth=24</profile> --> <profile namespace="globus" key="jobsPerNode">24</profile> <profile namespace="globus" key="maxTime">50000</profile> <profile namespace="globus" key="slots">50</profile> <profile namespace="globus" key="nodeGranularity">4</profile> <profile namespace="globus" key="maxNodes">4</profile> <profile namespace="karajan" key="jobThrottle">48.00</profile> <profile namespace="karajan" key="initialScore">10000</profile> <filesystem provider="local"/> <workdirectory >/lustre/beagle/ketan/swift.workdir</workdirectory> </pool> </config>
3.5. Troubleshooting
In this section we discuss some common issues encountered when using Swift on Beagle, along with their remedies. These issues can originate from Swift, from Beagle's configuration and state, or from user configuration, among other factors.
- Command not found: Swift is installed on Beagle as a module. If you see the following error message:
If 'swift' is not a typo you can run the following command to lookup the package that contains the binary:
    command-not-found swift
-bash: swift: command not found
The most likely cause is that the module is not loaded. Load the Swift module as follows:
$ module load swift sun-java
Swift version swift-0.93 loaded
sun-java version jdk1.7.0_02 loaded
- Failed to transfer wrapperlog for job cat-nmobtbkk and/or Job failed with an exit code of 254: Check the <workdirectory> element in the sites.xml file.
<workdirectory>/home/ketan/swift.workdir</workdirectory>
It is likely set to a path where the compute nodes cannot write, e.g., your /home directory. The remedy is to set your workdirectory to a path under /lustre, where Swift can write from the compute nodes.
<workdirectory>/lustre/beagle/ketan/swift.workdir</workdirectory>
4. Amazon EC2 Cloud
Please refer to the Swift Tutorial for Cloud and Ad hoc Resources.
5. Fusion (x86 cluster)
Fusion is a 320-node computing cluster operated by Argonne's Laboratory Computing Resource Center (LCRC). The primary goal of the LCRC is to facilitate mid-range computing in all of the scientific programs of Argonne and the University of Chicago.
This section will walk you through running a simple Swift script on Fusion.
5.1. Requesting Access
If you do not already have a Fusion account, you can request one at https://accounts.lcrc.anl.gov/request.php. Email support@lcrc.anl.gov for additional help.
5.2. Projects
In order to run a job on a Fusion compute node, you must first be associated with a project.
Each project has one or more Primary Investigators, or PIs. These PIs are responsible for adding and removing users to a project. Contact the PI of your project to be added.
More information on this process can be found at http://www.lcrc.anl.gov/info/Projects.
5.3. SSH Keys
Before accessing Fusion, be sure to have your SSH keys configured correctly. SSH keys are required to access fusion. You should see information about this when you request your account. Check ssh FAQ or email support@lcrc.anl.gov for additional help.
5.4. Connecting to a login node
Once your keys are configured, you should be able to access a Fusion login node with the following command:
ssh <yourusername>@fusion.lcrc.anl.gov
5.5. Creating sites.xml
Swift uses various configuration files to determine how to run an application. This section will provide a working configuration file which you can copy and paste to get running quickly. The sites.xml file tells Swift how to submit jobs, where working directories are located, and various other configuration information. More information on sites.xml can be found in the Swift documentation.
The first step is to paste the text below into a file named sites.xml.
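The sites.xml text referenced above is not reproduced in this copy of the guide. The following is a minimal sketch modeled on the coaster examples elsewhere in this guide; it assumes Fusion is driven through the local:pbs coaster job manager and that a Fusion node provides 8 cores (adjust jobsPerNode, queue, and walltime for your runs). The _WORK_ placeholder is customized in the next step.

<config>
  <pool handle="fusion">
    <execution provider="coaster" jobmanager="local:pbs"/>
    <profile namespace="globus" key="queue">shared</profile>
    <!-- Assumed core count per Fusion node; adjust if needed -->
    <profile namespace="globus" key="jobsPerNode">8</profile>
    <profile namespace="globus" key="maxTime">3600</profile>
    <profile namespace="globus" key="nodeGranularity">1</profile>
    <profile namespace="globus" key="maxNodes">1</profile>
    <profile namespace="karajan" key="jobThrottle">.07</profile>
    <profile namespace="karajan" key="initialScore">10000</profile>
    <filesystem provider="local"/>
    <workdirectory>_WORK_</workdirectory>
  </pool>
</config>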
This file requires one customization. Create a directory called swiftwork, then modify _WORK_ in sites.xml to point to this new directory. For example:
<workdirectory>/tmp/swiftwork</workdirectory>
5.6. Creating tc
The tc configuration file gives information about the applications that will be called by Swift. More information about the format of tc can be found in the Swift user guide.
Paste the following example into a file named tc
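The tc text referenced above is also not reproduced in this copy of the guide. A minimal one-line sketch, following the format used in the Beagle section and assuming the pool handle "fusion" from the sites.xml sketch above, would be:

fusion cat /bin/cat null null null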
5.7. Copy a Swift Script
Within the Swift directory is an examples directory which contains several introductory Swift scripts. The example we will use in this section is called catsn.swift. The script copies an input file's contents to another file using the Unix cat utility. Copy this script and its input file into the same directory where your sites.xml and tc files are located.
$ cp ~/swift/examples/misc/catsn.swift .
$ cp ~/swift/examples/misc/data.txt .
The location of your swift directory may vary depending on how you installed it. Change this to the examples/misc directory of your installation as needed.
5.8. Run Swift
Finally, run the script
$ swift -sites.file sites.xml -tc.file tc catsn.swift
You should see 10 new text files get created, named catsn*.out. If you see these files, then you have successfully run Swift on Fusion!
5.9. Queues
Fusion has two queues: shared and batch. The shared queue has a maximum walltime of 1 hour and is limited to 4 nodes. The batch queue is for all other jobs.
Edit the queue option in your sites.xml file to control which queue Swift submits to. For example:
<profile namespace="globus" key="queue">batch</profile>
More information on Fusion queues can be found at http://www.lcrc.anl.gov/info/BatchJobs.
5.10. More Help
The best place for additional help is the Swift user mailing list. You can subscribe to this list at swift-user. When submitting information, send your sites.xml file, your tc.data, and any Swift log files that were created during your attempt.
6. Futuregrid Cloud
Please refer to the Swift Tutorial for Cloud and Ad hoc Resources.
7. Geyser and Caldera (x86 clusters)
The Geyser and Caldera clusters are specialized data analysis and visualization resources at NCAR (National Center for Atmospheric Research).
More information about these clusters can be found at http://www2.cisl.ucar.edu/resources/geyser_caldera.
7.1. Example sites.xml
The following sites.xml has the basic definitions for how to run a job using the LSF scheduler.
<config> <pool handle="geyser"> <filesystem provider="local"/> <execution provider="coaster" jobmanager="local:lsf"/> <profile namespace="globus" key="jobsPerNode">1</profile> <profile namespace="globus" key="maxTime">3600</profile> <profile namespace="globus" key="maxwalltime">00:05</profile> <profile namespace="globus" key="lowOverallocation">100</profile> <profile namespace="globus" key="highOverallocation">100</profile> <profile namespace="globus" key="nodeGranularity">1</profile> <profile namespace="globus" key="maxNodes">1</profile> <profile namespace="globus" key="project">P93300606</profile> <profile namespace="globus" key="queue">small</profile> <profile namespace="karajan" key="jobThrottle">4</profile> <profile namespace="karajan" key="initialScore">10000</profile> <workdirectory>/glade/scratch/davkelly</workdirectory> </pool> </config>
The values for workdirectory, queue, and project will likely need to be adjusted based on the requirements of your project.
8. Grids: Open Science Grid and TeraGrid
Please refer to the Swift Tutorial for OSGConnect.
9. Intrepid (Blue Gene/P)
Intrepid is an IBM Blue Gene/P (BG/P) supercomputer located at the Argonne Leadership Computing Facility. More information on Intrepid can be found at http://www.alcf.anl.gov. Surveyor and Challenger are similar, smaller machines.
9.1. Requesting Access
9.2. SSH Keys
Accessing Intrepid via SSH can be done with any SSH software package. Before logging in, you will need to generate an SSH public key and send it to support@alcf.anl.gov for verification and installation.
9.3. Cryptocard
This security token uses one-time passwords for controlled access to the BG/P login systems.
9.4. Connecting to a login node
When you gain access to Intrepid, you should receive a cryptocard and a temporary PIN. You must have a working cryptocard, know your PIN, and have your SSH key in place before you may login.
You can connect to Intrepid with the following command:
ssh yourusername@intrepid.alcf.anl.gov
You will be presented with a password prompt. The first part of your password is your PIN. Enter your PIN, press the Cryptocard button, and then enter the password your cryptocard generates. If this is the first time you are logging in, you will be prompted to change your PIN.
9.5. Downloading and building Swift
The most recent versions of Swift can be found at http://swiftlang.org/downloads/index.php. Follow the instructions provided on that site to download and build Swift.
9.6. Adding Swift to your PATH
Once you have installed Swift, add the Swift binary to your PATH so you can easily run it from any directory.
In your home directory, edit the file ".bashrc".
If you have installed Swift via a source repository, add the following line at the bottom of .bashrc.
export PATH=$PATH:$HOME/cog/modules/swift/dist/swift-svn/bin
If you have installed Swift via a binary package, add this line:
export PATH=$PATH:$HOME/swift-<version>/bin
Replace <version> with the actual name of the swift directory in the example above.
9.7. What You Need To Know Before Running Swift
Note that on Intrepid, the compute nodes cannot create or write to the /home filesystem. Consequently, for Swift to work correctly between the login node and the compute nodes, Swift must write all internal and intermediate files to /intrepid-fs0, which is writable by the compute nodes. To accomplish this, export the environment variable SWIFT_USERHOME as follows:
export SWIFT_USERHOME=/intrepid-fs0/users/`whoami`/scratch
Before you can create a Swift configuration file, there are some things you will need to know.
9.7.1. Swift Work Directory
The Swift work directory is a directory which Swift uses for processing work. This directory needs to be writable. Common options for this are:
/home/username/swiftwork
/home/username/work
/tmp/swift.work
9.7.2. Which project(s) are you a member of?
Intrepid requires that you are a member of a project. You can determine this by running the following command:
$ projects
HTCScienceApps
If you are not a member of a project, you must first request access to a project. More information on this process can be found at https://wiki.alcf.anl.gov/index.php/Discretionary_Allocations
9.7.3. Determine your Queue
Intrepid has several different queues you can submit jobs to depending on the type of work you will be doing. The command "qstat -q" will print the most up to date list of this information.
User Queue | Queue           | Nodes       | Time (hours) | User Maxrun | Project Maxrun
-----------|-----------------|-------------|--------------|-------------|---------------
prod-devel | prod-devel      | 64-512      | 0-1          | 5           | 20
prod       | prod-short      | 512-4096    | 0-6          | 5           | 20
prod       | prod-long       | 512-4096    | 6-12         | 5           | 20
prod       | prod-capability | 4097-32768  | 0-12         | 2           | 20
prod       | prod-24k        | 16385-24576 | 0-12         | 2           | 20
prod       | prod-bigrun     | 32769-40960 | 0-12         | 2           | 20
prod       | backfill        | 512-8192    | 0-6          | 5           | 10
9.8. Generating Configuration Files
Now that you know what queue to use, your project, and your work directory, it is time to set up Swift. Swift uses a configuration file called sites.xml to determine how it should run. There are two methods you can use for creating this file. You can manually edit the configuration file, or generate it with a utility called gensites.
9.8.1. Manually Editing sites.xml
Below is the template that is used by Swift’s test suite for running on Intrepid.
Copy and paste this template, replace the values, and call it sites.xml.
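The test-suite template itself is not reproduced in this copy of the guide. The sketch below only illustrates its general shape; it assumes the coaster provider drives Intrepid's Cobalt scheduler (local:cobalt), uses profile keys that appear elsewhere in this guide plus internalHostname for the callback address, and picks BG/P-flavored values (4 cores per node, 64-node granularity). Consult the Swift distribution for the authoritative template.

<config>
  <pool handle="intrepid">
    <execution provider="coaster" jobmanager="local:cobalt"/>
    <profile namespace="globus" key="internalHostname">_HOST_</profile>
    <profile namespace="globus" key="project">_PROJECT_</profile>
    <profile namespace="globus" key="queue">_QUEUE_</profile>
    <profile namespace="globus" key="jobsPerNode">4</profile>
    <profile namespace="globus" key="maxTime">3600</profile>
    <profile namespace="globus" key="nodeGranularity">64</profile>
    <profile namespace="globus" key="maxNodes">64</profile>
    <profile namespace="karajan" key="jobThrottle">2.55</profile>
    <profile namespace="karajan" key="initialScore">10000</profile>
    <filesystem provider="local"/>
    <workdirectory>_WORK_</workdirectory>
  </pool>
</config>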
The values to note here are the ones that are listed between underscores. In the example above, they are _HOST_, _PROJECT_, _QUEUE_, and _WORK_.
- HOST: The IP address on which Swift runs and to which workers must connect. To obtain this, run ifconfig and select the IP address that starts with 172.
- PROJECT: The project to use.
- QUEUE: The queue to use.
- WORK: The Swift work directory.
9.9. Manually Editing tc.data
Below is the tc.data file used by Swift's test suite. Copy this text and save it as tc.data.
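The test-suite tc.data is likewise not reproduced here. A minimal sketch, following the one-line format used in the Beagle section and assuming the pool handle "intrepid" from the sites.xml sketch above:

intrepid cat /bin/cat null null null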
9.10. Catsn.swift
The swift script we will run is called catsn.swift. It simply cats a file and saves the result. This is a nice simple test to ensure jobs are running correctly. Create a file called data.txt which contains some simple input - a "hello world" will do the trick.
type file;

app (file o) cat (file i)
{
  cat @i stdout=@o;
}

string t = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
string char[] = @strsplit(t, "");

file out[]<simple_mapper; location=".", prefix="catsn.", suffix=".out">;

foreach j in [1:@toInt(@arg("n","10"))] {
  file data<"data.txt">;
  out[j] = cat(data);
}
9.11. Running Swift
Now that everything is in place, run Swift with the following command:
swift -sites.file sites.xml -tc.file tc.data catsn.swift -n=10
You should see several new files being created, called catsn.0001.out, catsn.0002.out, etc. Each of these files should contain the contents of what you placed into data.txt. If this happens, your job has run successfully!
9.12. More Help
The best place for additional help is the Swift user mailing list. You can subscribe to this list. When submitting information, send your sites.xml file, your tc.data, and any Swift log files that were created during your attempt.
10. MCS Compute Servers (x86 workstations)
This section describes how to use the general-use compute servers of the MCS division of Argonne National Laboratory. Swift is available as a soft package on MCS machines. Add +swift to your .soft file and run the resoft command:
$ resoft
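A minimal sketch of the whole sequence is shown below; the placement of the +swift line within .soft can matter on some setups, so treat this as an illustration rather than the official MCS procedure.

# Add the Swift soft package to your softenv configuration
echo "+swift" >> $HOME/.soft
# Re-read .soft so the change takes effect in the current shell
resoft
# Verify that swift is now on your PATH
which swift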
10.1. Create a coaster-service.conf
To begin, copy the text below and paste it into your Swift distribution’s etc directory. Name the file coaster-service.conf.
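The coaster-service.conf text referenced above is not reproduced in this copy of the guide. The sketch below is modeled on the SSH-worker example in the Persistent Coasters section later in this guide; the hostnames, username, paths, and worker counts are placeholders to replace with real MCS machine names and your own directories.

export WORKER_MODE=ssh
export WORKER_USERNAME=yourusername
# Placeholder host names: replace with the MCS machines you want workers on
export WORKER_HOSTS="machine1.mcs.anl.gov machine2.mcs.anl.gov"
export WORKER_LOCATION=/tmp
export WORKER_LOG_DIR=/tmp
# Address of the machine running the coaster service
export IPADDR=your.ip.address.here
export JOBSPERNODE=8
export JOBTHROTTLE=0.0799
export WORK=$HOME/swiftwork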
10.2. Starting the Coaster Service
Change directories to the location you would like to run a Swift script and start the coaster service with this command:
start-coaster-service
This will create a configuration file that Swift needs called sites.xml.
Any existing sites.xml files in this directory will be overwritten. Be sure to make a copy of any custom configuration files you may have.
10.3. Run Swift
Next, run Swift. If you do not have a particular script in mind, you can test Swift by using a Swift script in the examples/ directory.
Run the following command to run the script:
swift -sites.file sites.xml -tc.file tc.data yourscript.swift
10.4. Stopping the Coaster Service
The coaster service will run indefinitely. The stop-coaster-service script will terminate the coaster service.
$ stop-coaster-service
This will kill the coaster service and kill the worker scripts on remote systems.
11. Midway (x86 cluster)
Midway is a cluster maintained by the Research Computing Center at the University of Chicago. Midway uses Slurm to handle job scheduling. For more details about Midway, and to request an account, visit http://rcc.uchicago.edu.
11.1. Connecting to a login node
Once you have access to Midway, connect to a Midway login node.
$ ssh userid@midway.rcc.uchicago.edu
11.2. Loading Swift
Swift is available on Midway as a module. To load Swift, run:
$ module load swift
11.3. Example sites.xml
Below is an example that uses two of the queues available on Midway, sandyb and westmere. Be sure to adjust walltime, work directory, and other options as needed.
<config> <pool handle="midway-sandyb"> <execution provider="coaster" jobmanager="local:slurm"/> <profile namespace="globus" key="jobsPerNode">16</profile> <profile namespace="globus" key="maxWalltime">00:05:00</profile> <profile namespace="globus" key="highOverAllocation">100</profile> <profile namespace="globus" key="lowOverAllocation">100</profile> <profile namespace="globus" key="queue">sandyb</profile> <profile namespace="karajan" key="initialScore">10000</profile> <filesystem provider="local"/> <workdirectory>/scratch/midway/{env.USER}/work</workdirectory> </pool> <pool handle="midway-westmere"> <execution provider="coaster" jobmanager="local:slurm"/> <profile namespace="globus" key="jobsPerNode">12</profile> <profile namespace="globus" key="maxWalltime">00:05:00</profile> <profile namespace="globus" key="highOverAllocation">100</profile> <profile namespace="globus" key="lowOverAllocation">100</profile> <profile namespace="globus" key="queue">westmere</profile> <profile namespace="karajan" key="initialScore">10000</profile> <filesystem provider="local"/> <workdirectory>/scratch/midway/{env.USER}/work</workdirectory> </pool> </config>
11.4. Example sites.xml for use with MPI
Below is an example sites.xml that is suitable for running with MPI. Jobtype must be set to single. The value you set for ppn will determine the number of cores/slots your application uses per node. The value of count will set the number of nodes to request. The example below requests 2 nodes with 12 slots per node.
<config> <pool handle="midway-westmere"> <execution provider="coaster" jobmanager="local:slurm"/> <profile namespace="globus" key="jobsPerNode">1</profile> <profile namespace="globus" key="ppn">12</profile> <profile namespace="globus" key="maxWalltime">_WALLTIME_</profile> <profile namespace="globus" key="highOverAllocation">100</profile> <profile namespace="globus" key="lowOverAllocation">100</profile> <profile namespace="globus" key="queue">westmere</profile> <profile namespace="karajan" key="initialScore">10000</profile> <profile namespace="globus" key="jobtype">single</profile> <profile namespace="globus" key="count">2</profile> <filesystem provider="local"/> <workdirectory>/scratch/midway/{env.USER}/work</workdirectory> </pool> </config>
11.5. Defining non-standard Slurm options
A Slurm submit script has many settings and options. Swift knows about many of the basic Slurm settings, like how to define a project or a queue, but it does not know about every setting. Swift therefore provides a simple way to pass your own settings through to the Slurm submit script.
The general way to do this is:
<profile namespace="globus" key="slurm.setting">value</profile>
Here is one specific example. Slurm can notify users via email when a job is done. To enable this, the Slurm submit script that Swift generates needs a line containing "--mail-type=END". The following profile entry adds that line.
<profile namespace="globus" key="slurm.mail-type">END</profile>
Any valid Slurm setting can be set in a similar way (see the sbatch man page for a list of all settings).
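For instance, to also direct those notification emails to a specific address (the address below is a placeholder), the same pass-through mechanism gives:

<profile namespace="globus" key="slurm.mail-user">you@example.edu</profile>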
11.6. Various tips for running MPI jobs
- You'll need to load an MPI module. Run "module load openmpi" to add it to your PATH.
- The app that Swift runs should be a wrapper script that invokes your MPI application by running "mpiexec /path/to/yourMPIApp", as in the sketch below.
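For example, a minimal wrapper script might look like the following (the module name and application path are placeholders):

#!/bin/bash
# Load the MPI environment on the compute node
module load openmpi
# Launch the actual MPI application, passing through any arguments Swift supplies
mpiexec /path/to/yourMPIApp "$@"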
12. Persistent Coasters
Coasters is a protocol that Swift uses for scheduling jobs and transferring data. In most configurations, coasters are used automatically when you run Swift. With persistent coasters, the coaster server runs outside of Swift.
This section describes a utility called start-coaster-service that allows you to configure persistent coasters.
12.1. Example 1: Starting workers locally
Below is the simplest example, where the coaster service is started, and workers are launched locally on the same machine.
First, create a file called coaster-service.conf with the configuration below.
export WORKER_MODE=local
export IPADDR=127.0.0.1
export JOBSPERNODE=1
export JOBTHROTTLE=0.0099
export WORK=$HOME/swiftwork
To start the coaster service and worker, run the command "start-coaster-service". Then run Swift with the newly generated sites.xml file.
$ start-coaster-service
Start-coaster-service...
Configuration: coaster-service.conf
Service address: 127.0.0.1
Starting coaster-service
Service port: 51099
Local port: 41764
Generating sites.xml
Starting worker on local machine

$ swift -sites.file sites.xml -tc.file tc.data hostsnsleep.swift
Swift trunk swift-r7153 cog-r3810
RunID: 20131014-1807-q6h89eq3
Progress:  time: Mon, 14 Oct 2013 18:07:13 +0000
Passive queue processor initialized. Callback URI is http://128.135.112.73:41764
Progress:  time: Mon, 14 Oct 2013 18:07:14 +0000  Active:1
Final status: Mon, 14 Oct 2013 18:07:15 +0000  Finished successfully:1
You can then run swift multiple times using the same coaster service. When you are finished and would like to shut down the coaster, run stop-coaster-service.
$ stop-coaster-service
Stop-coaster-service...
Configuration: coaster-service.conf
Ending coaster processes..
Killing process 8579
Done
When you define your apps/tc file, use the site name "persistent-coasters".
12.2. Example 2: Starting workers remotely via SSH
The start-coaster-service script can start workers on multiple remote machines. To do this, there are two main settings you need to define in your coaster-service.conf. The first is to set WORKER_MODE=ssh, and the second is to set WORKER_HOSTS to the list of machines where workers should be started.
export WORKER_MODE=ssh
export WORKER_USERNAME=yourusername
export WORKER_HOSTS="host1.example.edu host2.example.edu"
export WORKER_LOCATION="/homes/davidk/logs"
export IPADDR=swift.rcc.uchicago.edu
export JOBSPERNODE=1
export JOBTHROTTLE=0.0099
export WORK=/homes/davidk/swiftwork
If there is no shared filesystem available between the remote machines and the local machine, you will need to enable coaster provider staging to transport files for you. Below is an example Swift configuration file to enable it:
wrapperlog.always.transfer=false
sitedir.keep=false
execution.retries=0
lazy.errors=false
status.mode=provider
use.provider.staging=true
provider.staging.pin.swiftfiles=false
use.wrapper.staging=false
Run start-coaster-service to start the coaster service and the workers. When you run Swift, reference the cf file to enable provider staging.
$ start-coaster-service
Start-coaster-service...
Configuration: coaster-service.conf
Service address: swift.rcc.uchicago.edu
Starting coaster-service
Service port: 41714
Local port: 41685
Generating sites.xml
Starting worker on host1.example.edu
Starting worker on host2.example.edu

$ swift -sites.file sites.xml -tc.file tc.data -config cf hostsnsleep.swift
Swift trunk swift-r7153 cog-r3810
RunID: 20131014-1844-7flhik67
Progress:  time: Mon, 14 Oct 2013 18:44:43 +0000
Passive queue processor initialized. Callback URI is http://128.135.112.73:41685
Progress:  time: Mon, 14 Oct 2013 18:44:44 +0000  Selecting site:4  Finished successfully:4
Final status: Mon, 14 Oct 2013 18:44:45 +0000  Finished successfully:10
This requires that you are able to connect to the remote systems without being prompted for a password/passphrase. This is usually done with SSH keys. Please refer to SSH documentation for more information.
12.3. Example 3: Starting workers remotely via SSH, with multihop
This example is for a situation where you want to start a worker on nodes that you can’t connect to directly. If you have to first connect to a login/gateway machine before you can ssh to your worker machine, this configuration is for you.
The coaster-service.conf and cf files are the same as in Example 2.
Assume that node.host.edu is the machine where you want to start your worker, and that gateway.host.edu is the machine where you must log into first. Add the following to your $HOME/.ssh/config file:
Host node.host.edu
    Hostname node.host.edu
    ProxyCommand ssh -A username@gateway.host.edu nc %h %p 2> /dev/null
    ForwardAgent yes
    User username
This will allow you to SSH directly to node.host.edu. You can now add node.host.edu to WORKER_HOSTS.
12.4. Example 4: Starting workers remotely via SSH, with tunneling
The coaster workers need to be able to make a connection back to the coaster service. If you are running Swift on a machine behind a firewall where the workers cannot connect, you can use SSH reverse tunneling to allow this connection to happen.
To enable this, add the following line to your coaster-service.conf:
export SSH_TUNNELING=yes
12.5. Example 5: Starting workers remotely via SSH, hostnames in a file
The variable WORKER_HOSTS defines the list of hostnames where workers will be started. To set this to be the contents of a file, you can set WORKER_HOSTS as follows:
export WORKER_HOSTS=$( cat /path/to/hostlist.txt )
12.6. Example 6: Starting workers via a scheduler
To start workers via some other script, such as a scheduler submit script, export WORKER_MODE=scheduler. Once the coaster service has been initialized, start-coaster-service will run whatever user-defined command is specified in $SCHEDULER_COMMAND.
The contents of SCHEDULER_COMMAND will vary greatly based on your needs and the system you are running on. However, all SCHEDULER_COMMANDs will need to run the same command exactly once on each worker node:
$WORKER $WORKERURL logname $WORKER_LOG_DIR
Here is an example coaster-service.conf that runs on a campus cluster using the Slurm scheduler:
export WORKER_MODE=scheduler
export WORKER_LOG_DIR=/scratch/midway/$USER
export IPADDR=10.50.181.1
export JOBSPERNODE=1
export JOBTHROTTLE=0.0099
export WORK=$HOME/swiftwork
export SCHEDULER_COMMAND="sbatch start-workers.submit"
The SCHEDULER_COMMAND in this case submits a Slurm job script and starts the workers via the following commands:
#!/bin/bash

#SBATCH --job-name=start-workers
#SBATCH --output=start-workers.stdout
#SBATCH --error=start-workers.stderr
#SBATCH --nodes=1
#SBATCH --partition=westmere
#SBATCH --time=00:10:00
#SBATCH --ntasks-per-node=12
#SBATCH --exclusive

$WORKER $WORKERURL logname $WORKER_LOG_DIR
12.7. List of all coaster-service.conf settings
Below is a list of all settings that start-coaster-service knows about, along with a brief description of what it does.
The settings are defined in terms of bash variables. Below is an example of the format used:
export WORKER_LOGGING_LEVEL=DEBUG
Below is a list of the options that coaster-service.conf recognizes and what they do.
12.7.1. IPADDR
Defines the IP address where the coaster service is running. Workers need to know this address in order to connect back to the service. Example: export IPADDR=192.168.2.12
12.7.2. LOCAL_PORT
Defines a static local port number. If undefined, this is generated randomly. Example: export LOCAL_PORT=50100
12.7.3. LOG
LOG sets the name of the local log file to be generated. This log file captures the standard output and standard error of the coaster-service and the other commands that start-coaster-service runs. This file can get large at times. To disable it, set "export LOG=/dev/null". Default value: start-coaster-service.log
12.7.4. SCHEDULER_COMMAND
In scheduler mode, this defines the command that start-coaster-service will run to start workers via the scheduler. Example: export SCHEDULER_COMMAND="qsub start-workers.submit".
12.7.5. SERVICE_PORT
Sets the coaster service port number. If undefined, this is generated randomly. Example: export SERVICE_PORT=50200
12.7.6. SSH_TUNNELING
When the machine you are running Swift on is behind a firewall that is blocking workers from connecting back to it, add "export SSH_TUNNELING=yes". This will set up a reverse tunnel to allow incoming connections. Default value: no.
12.7.7. WORKER_HOSTS
WORKER_HOSTS should contain the list of hostnames that start-coaster-service will connect to in order to start workers. This is only used when WORKER_MODE is ssh. Example: export WORKER_HOSTS="host1 host2 host3".
12.7.8. WORKER_LOCATION
In ssh mode, defines the directory on remote systems where the worker script will be copied to. Example: export WORKER_LOCATION=/tmp
12.7.9. WORKER_LOG_DIR
In ssh mode, defines the directory on the remote systems where worker logs will go. Example: export WORKER_LOG_DIR=/home/john/logs
12.7.10. WORKER_LOGGING_LEVEL
Defines the logging level of the worker script. Values can be "NONE", "TRACE", "DEBUG", "INFO", "WARN", or "ERROR". Example: export WORKER_LOGGING_LEVEL=NONE.
12.7.11. WORKER_USERNAME
In ssh mode, defines the username to use when connecting to each host defined in WORKER_HOSTS.
13. SSH
This section describes how to use the SSH provider to connect to remote sites and to handle data transfer.
13.1. Generate a unique SSH key
It is recommended that you create a new SSH key exclusively for this purpose. In order to avoid being prompted for passwords/passphrases, your SSH passphrase will be stored in a read-protected file. Run this command on the machine where you will be running Swift:
ssh-keygen -t dsa -f $HOME/.ssh/id_dsa-swift
You will be prompted to create a passphrase. This will create two files: $HOME/.ssh/id_dsa-swift and $HOME/.ssh/id_dsa-swift.pub.
13.2. Add your public key to the remote host
On the remote host where you will be running, edit or create the file $HOME/.ssh/authorized_keys. Paste the contents of the newly created $HOME/.ssh/id_dsa-swift.pub from the previous step at the end of that file.
13.3. Verify your new key works
From the host where you will be running Swift, run the following command to verify your keys are working:
$ ssh -o IdentitiesOnly=true -i $HOME/.ssh/id_dsa-swift user@login.remotehost.edu
You should be prompted for your new passphrase and be able to connect.
13.4. Create auth.defaults
Create a file called $HOME/.ssh/auth.defaults on the host where you are running Swift. Use the following commands to create this file:
$ touch $HOME/.ssh/auth.defaults
$ chmod 600 $HOME/.ssh/auth.defaults
Next, edit $HOME/.ssh/auth.defaults and add the following lines:
login.remotehost.edu.type=key
login.remotehost.edu.username=your_remote_username
login.remotehost.edu.key=/your/home/.ssh/id_dsa-swift
login.remotehost.edu.passphrase=your_passphrase
Replace login.remotehost.edu with the hostname you want to use, replace the values of "your_remote_username" and "your_passphrase", and set the correct path to the private key you generated.
13.5. Create a sites.xml file
Here is an example sites.xml file that will allow you to connect and transfer data to a remote host:
<config> <pool handle="remotehost"> <execution provider="coaster" jobmanager="ssh:local" url="login.remotehost.edu"/> <filesystem provider="ssh" url="login.remotehost.edu"/> <profile namespace="karajan" key="jobThrottle">0</profile> <profile namespace="karajan" key="initialScore">10000</profile> <workdirectory>/path/to/remote/workdirectory</workdirectory> </pool> </config>
This example will run work directly on login.remotehost.edu. In many cases you will not want to do this. You'll likely want to modify your sites.xml to use a remote scheduler, by setting jobmanager to ssh:pbs or ssh:slurm, for example. This usually also requires setting things like queues and walltimes. This example is provided for simplicity and testing.
13.6. Setting your properties
Since you want to transfer data via SSH, verify that you are not using any other file transfer mechanism. Make sure you have the following Swift properties defined in your configuration file:
use.provider.staging=false
use.wrapper.staging=false
14. SSH-CL
This section describes how to use the SSH command line provider (ssh-cl) to connect to remote sites.
14.1. Verify you can connect to the remote site
The first step of this process is to verify that you can connect to a remote site without being prompted for a password or passphrase.
$ ssh my.site.com
Typically to make this work you will need to add your SSH public key to the $HOME/.ssh/authorized_keys file on the remote system.
This SSH connection must work without specifying a username on the command line. If your username differs on your local machine and the remote machine, you will need to add an entry like this to your local $HOME/.ssh/config:
Host my.site.com
    Hostname my.site.com
    User myusername
14.2. Create a sites.xml file
Once you have verified that you can connect without prompt to the remote machine, you will need to create a sites.xml file that contains the host information. The example below will assume there is no scheduler on the remote system - it simply connects to the remote machine and runs work there.
<config> <pool handle="mysite"> <execution provider="coaster" jobmanager="ssh-cl:local" url="my.site.com"/> <profile namespace="globus" key="jobsPerNode">1</profile> <profile namespace="globus" key="lowOverAllocation">100</profile> <profile namespace="globus" key="highOverAllocation">100</profile> <profile namespace="karajan" key="jobThrottle">1</profile> <profile namespace="karajan" key="initialScore">10000</profile> <workdirectory>/home/username/work</workdirectory> </pool> </config>
This requires that the remote site can connect back to the machine where you are running Swift. If a firewall is configured to block incoming connections, this will not work correctly.
14.3. Enabling coaster provider staging
If there is a shared filesystem between the two machines, you can set your work directory to a location on that filesystem and skip this step. Otherwise, you will need to enable coaster provider staging.
To do this, add the following line to your "cf" file:
use.provider.staging=true
Then, to run Swift:
swift -sites.file sites.xml -tc.file tc.data -config cf script.swift
15. Stampede (x86 cluster)
Stampede is a cluster managed by the Texas Advanced Computing Center (TACC). It is a part of the XSEDE project. For information about how to request an account or project, how to log in, and how to manage SSH keys, see the Stampede User Guide.
15.1. Downloading and building Swift
The most recent versions of Swift can be found at the Swift downloads page. Follow the instructions provided on that site to download and build Swift.
15.2. Overview of How to Run
You will need to do the following steps in order to run:
- Connect to a system that has the Globus myproxy-init command. This will be the system where Swift runs and from where Swift submits jobs to Stampede.
- Obtain a grid certificate.
- Run Swift with configuration files that define how to start remote jobs on Stampede via GRAM.
15.3. Verify System Requirements and Environment
The system where you run Swift needs to have the myproxy-init tool installed. Ideally it should also have globus-job-run for testing purposes.
Swift uses two environment variables in order for remote job execution to work. They are $GLOBUS_HOSTNAME and $GLOBUS_TCP_PORT_RANGE.
GLOBUS_HOSTNAME should contain the full hostname of the system where you are running. It may also contain the IP address of the machine.
GLOBUS_TCP_PORT_RANGE defines a range of ports to which a remote system may connect. You will likely need this defined if you are behind a firewall with somewhat restricted access.
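For example (the hostname and port range below are placeholders; choose ports that your firewall actually allows):

export GLOBUS_HOSTNAME=swift.example.edu
export GLOBUS_TCP_PORT_RANGE=50000,51000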
15.4. Obtain a Grid Certificate
Once you have verified that you have everything you need on the submit host, you can obtain an XSEDE grid certificate with the following command:
$ myproxy-logon -l username -s myproxy.teragrid.org
15.5. Create sites.xml file
You may use the following example as a guide to run on Stampede. You will likely need to make a few modifications, as described below.
<config> <pool handle="stampede"> <execution provider="coaster" jobmanager="gt2:gt2:slurm" url="login5.stampede.tacc.utexas.edu:2119/jobmanager-slurm"/> <filesystem provider="gsiftp" url="gsiftp://gridftp.stampede.tacc.utexas.edu:2811"/> <profile namespace="globus" key="jobsPerNode">16</profile> <profile namespace="globus" key="ppn">16</profile> <profile namespace="globus" key="maxTime">3600</profile> <profile namespace="globus" key="maxwalltime">00:05:00</profile> <profile namespace="globus" key="lowOverallocation">100</profile> <profile namespace="globus" key="highOverallocation">100</profile> <profile namespace="globus" key="queue">normal</profile> <profile namespace="globus" key="nodeGranularity">1</profile> <profile namespace="globus" key="maxNodes">1</profile> <profile namespace="globus" key="project">yourproject</profile> <profile namespace="karajan" key="jobThrottle">.3199</profile> <profile namespace="karajan" key="initialScore">10000</profile> <workdirectory>/scratch/01503/yourusername</workdirectory> </pool> </config>
You will need to change the XSEDE project name to match the project that has been allocated to you. In most cases you'll want to set the work directory to your Stampede scratch directory; on Stampede this is defined in the environment variable $SCRATCH.
15.6. Running Swift
You may now run your Swift script exactly as you have before.
$ swift -sites.file sites.xml -tc.file tc -config cf myscript.swift
15.7. Debugging
If you are having problems getting this working correctly, there are a few places where you may look to help you debug. Since this configuration is slightly more complicated, there are several log files produced.
- The standard Swift log, created in your current working directory on the machine from which you are running. This will be named something along the lines of myscript-<datestring>.log.
- The bootstrap logs. These are located on Stampede in your home directory and have the name coaster-bootstrap-<datestring>.log.
- The coaster log. This is located on Stampede in $HOME/.globus/coasters/coasters.log.
- The GRAM log. This is located on Stampede in $HOME/gram-<datestring>.log.
For help in getting this configuration working, feel free to contact the Swift support team at swift-support@ci.uchicago.edu.
16. UC3 / OSGConnect (x86 cluster)
Please refer to the Swift Tutorial for OSGConnect.