Example: Distributed Pi.

This example estimates the value of pi using the Monte Carlo method.

How to run it:

1) Start the slave script on each of your slave machines:

> ruby slave_script.rb

2) Run the master script on your designated master machine:

> ruby master.rb --slaves <comma-separated list of slaves> -i <number of iterations>

example:
> ruby master.rb --slaves curly,moe,larry -i 1000000


Creating an example TaskMaster application:
     
     A TaskMaster application has four components:
       - A master script which runs on your designated master machine.
       - A slave script which runs on each of your slave machines (and
         it can also run on your master machine).
       - A user defined task class that defines your application's task
         in a 'run' method.
       - A user defined reporter class that defines how results are 
         collected from tasks and how they are reported.

     For an example of using the TaskMaster framework, let's look at an
     application that calculates an estimate of pi using the Monte Carlo
     method.  PiTask is shown in Listing 3 (PiTask.rb).
.################## listing 3 #########################
.# PiTask.rb
.# Estimates pi using the Monte Carlo method
.# more iterations should produce more accurate results
.#######################################################
.require "TaskMaster"
.class PiTask
.   
.   def run(iterations)
.     hits = 0
.     puts "iterations = #{iterations} iterations.class = #{iterations.class}"
.     iterations.times { |i|
.       #pick a random x,y
.       x = rand
.       y = rand
.       dist = Math.sqrt(x*x + y*y)
.       if dist <= 1.0
.         hits += 1
.       end
.     }
.     @my_pi = 4*(hits.to_f/iterations.to_f)
.     puts "PiTask::run finished -> estimated pi = #@my_pi"
.   end
.
.   def harvest
.     return @my_pi
.   end
.
.end

    PiTask defines a 'run' method which takes one argument, 'iterations'.
    On each iteration it picks a random point (x, y), with x and y both
    between 0 and 1, and calculates the point's distance from the origin
    (0,0).  If the distance is less than or equal to 1, the point falls
    inside the quarter of the unit circle that lies in that square, and
    the 'hits' variable is incremented.  Because the quarter circle has
    area pi/4 and the square has area 1, the fraction hits/iterations
    approaches pi/4 as the number of iterations grows, so pi can be
    estimated with the formula: pi_est = 4 * (hits / iterations).  After
    calculating the estimate, 'run' assigns it to an instance variable
    (@my_pi).
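
    The same calculation can be tried on a single machine before any
    slaves are involved.  The sketch below is not part of TaskMaster;
    'estimate_pi' and its 'rng' parameter are names chosen here for
    illustration.  It compares the squared distance against 1, which
    gives the same hits as taking the square root first.

```ruby
# Single-process sketch of the Monte Carlo estimate; no TaskMaster involved.
# estimate_pi and the rng parameter are illustrative names, not library API.
def estimate_pi(iterations, rng = Random.new)
  hits = 0
  iterations.times do
    # Pick a random point in the unit square.
    x = rng.rand
    y = rng.rand
    # The point is inside the quarter circle when x^2 + y^2 <= 1.
    hits += 1 if x * x + y * y <= 1.0
  end
  4.0 * hits / iterations
end

puts estimate_pi(100_000)
```

    Passing a seeded generator, for example Random.new(42), makes a run
    repeatable, which can help when comparing results between machines.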

    The PiTask class also defines a 'harvest' method.  A task's 'harvest'
    method is called from the master side when the Distributor finds that
    the task has completed.  In this case it simply returns the estimate
    stored in the instance variable @my_pi.

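    Listing 4 shows the master script (master.rb).  In this listing the
    slave list and the iteration count are set directly in the script
    rather than read from the --slaves and -i options.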

.################### listing 4 #######################
.# master.rb - run on your master machine
.#####################################################
.require "TaskMaster"
.
.iterations = 1000000
.slaveList = ['frodo','sam','merry','pippin']
.class PiReporter
.  def initialize
.    @estimates = []
.  end
.
.  def report(pi_est)
.    @estimates << pi_est
.  end
.
.  def final_report
.    sum = 0
.    @estimates.each { |est|
.      sum += est
.    }
.    print "The average pi estimate was: "
.    print "#{sum/(@estimates.length)}\n"
.  end
.end
.startTime = Time.now
.filesToRequire = ["PiTask.rb"]
.filesToRequire.each { |file| require file }
.distrib = TaskMaster::Distributor.new(slaveList)
.distrib.remote_require(filesToRequire)
.reporter = PiReporter.new
.distrib.reporter = reporter
.(distrib.availableSlaves).each {
.  distrib.send_task(PiTask.new(),(iterations/distrib.availableSlaves.length))
.}
.distrib.wait
.reporter.final_report
.endTime = Time.now
.puts "Total time: #{endTime-startTime} seconds"

    After setting up the list of slaves, the PiReporter class is defined.
    PiReporter's 'report' method will receive the value returned by a
    PiTask's 'harvest' method.  In this case the 'report' method just
    appends the estimate coming from a slave to a list.
    PiReporter's 'final_report' method will be called after all of the 
    slave machines have completed their estimates; it calculates the average
    of the estimates and prints the results.

    After setting a start time (so we can calculate how long the program
    takes to run) and require'ing the PiTask.rb file, the Distributor is
    instantiated with the slaveList.  Distributor's 'remote_require' method
    is then called with a list containing the string "PiTask.rb".  This
    sends the contents of that file to each of the slave machines, which
    gives them the definition of the PiTask class so that they can
    receive and act upon the PiTask objects that will be sent to them.

    A PiReporter is then instantiated and assigned to the 'reporter'
    variable, and the Distributor is told to use it
    (distrib.reporter = reporter).

    Now comes the time to divide up the work between the available
    slave machines.  We send one task to each available slave machine,
    giving it the total number of iterations (in this case 1000000)
    divided by the number of slave machines.  In this example there are
    four slaves, so each one will run 250000 iterations.  (The division
    is an integer division, so any remainder is dropped when the total
    does not divide evenly.)  After all the tasks are sent, we wait for
    them to complete (distrib.wait) and then call reporter.final_report
    to print the average over all machines.
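
    Listing 4 divides with '/', which on integers discards any remainder:
    with three slaves, only 999999 of the 1000000 iterations would run.
    If that matters for your application, the split can be computed so
    that the chunks add up to the total.  The helper below is a sketch;
    'split_iterations' is a hypothetical name and is not part of
    TaskMaster.

```ruby
# split_iterations is a hypothetical helper, not part of TaskMaster.
# It returns one chunk size per slave; the first `remainder` slaves
# get one extra iteration so the chunks add up to the total.
def split_iterations(total, slave_count)
  base, remainder = total.divmod(slave_count)
  Array.new(slave_count) { |i| i < remainder ? base + 1 : base }
end

p split_iterations(1_000_000, 4)  # => [250000, 250000, 250000, 250000]
p split_iterations(1_000_000, 3)  # => [333334, 333333, 333333]
```

    Each element of the returned array could then be passed as the
    argument of one send_task call.  Note that PiReporter takes a plain
    average, which treats every chunk as equal in size.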

    If you really wanted to speed this up, you would probably write the
    computationally intensive part of PiTask's 'run' method in C, build
    it as a Ruby extension in a shared library, and call the C function
    from your 'run' method.  (Note: there is an example of this included
    with TaskMaster.)

    Listing 5 shows the slave script (slave.rb), which should be running on
    your slave machines before you start the master.rb script shown in
    Listing 4.
    
.################## listing 5 #####################
.# slave.rb - run on your slave machines
.require "socket"       # provides Socket.gethostname
.require "TaskMaster"
.slaveObj = TaskMaster::Slave.new()
.puts "Slave #{Socket::gethostname} started"
.slaveObj.wait

    
    This example illustrates the main points you need to be aware of when
    creating your own distributed applications with TaskMaster:
    1) You need to define a task class that defines a 'run' method 
    which performs some task and saves the results in instance variables.
    You also need to define a 'harvest' method which returns the results 
    to the master machine.
    2) You need to define a reporter class which defines a 'report' method 
    that will receive the results from your task's 'harvest' method.
    Your reporter class can optionally define methods for generating
    reports based on results gathered from finished tasks.
    3) You need to create the master-side script which instantiates a
    Distributor object with a list of slave machines.
    4) You might need to create the slave-side script, but in most
    cases you could use the example in Listing 5 unchanged.
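
    Points 1 and 2 can be tried out locally before any slaves are set up.
    The sketch below uses a made-up task that sums a range of integers;
    'RangeSumTask', 'RangeSumReporter' and the loop that stands in for the
    Distributor are all hypothetical and are not part of TaskMaster.  The
    loop only mimics the order of calls described above (run, then
    harvest, then report); it says nothing about how the Distributor
    works internally.

```ruby
# Hypothetical task: sums the integers in a range. Not part of TaskMaster.
class RangeSumTask
  # Does the work and keeps the result in an instance variable.
  def run(range)
    @total = range.sum
  end

  # Returns the stored result, as PiTask#harvest does.
  def harvest
    @total
  end
end

# Hypothetical reporter: collects partial sums and combines them.
class RangeSumReporter
  attr_reader :partials

  def initialize
    @partials = []
  end

  # Receives the value returned by a task's harvest method.
  def report(partial)
    @partials << partial
  end

  def grand_total
    @partials.sum
  end
end

# Local stand-in for the Distributor: runs each task in this process,
# then passes the harvested result to the reporter.
reporter = RangeSumReporter.new
[(1..50), (51..100)].each do |chunk|
  task = RangeSumTask.new
  task.run(chunk)
  reporter.report(task.harvest)
end

puts "Grand total: #{reporter.grand_total}"  # prints "Grand total: 5050"
```

    Once the task and reporter behave as expected locally, the loop can
    be replaced by a master script along the lines of Listing 4.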



