[Saga-devel] saga-projects SVN commit 839: /papers/clouds/

Mon Jan 12 15:02:40 CST 2009

User: sjha
Date: 2009/01/12 03:02 PM

Modified:
 /papers/clouds/
  saga_cloud_interop.tex

Log:
 added outline / structure to the experiments
   --  four types.

File Changes:

Directory: /papers/clouds/
==========================

File [modified]: saga_cloud_interop.tex
Delta lines: +23 -0
===================================================================

--- papers/clouds/saga_cloud_interop.tex	2009-01-12 19:46:36 UTC (rev 838)
+++ papers/clouds/saga_cloud_interop.tex	2009-01-12 21:02:33 UTC (rev 839)
@@ -545,6 +545,9 @@
   between SAGA All-Pairs using Advert Service versus using HBase or
   Bigtable as distributed data-store, but due to space constraints we
   will report results of the All-Pairs experiments elsewhere.}  :
+
+
+In an earlier paper, we essentially did the following:
 \begin{enumerate}
 \item Both \sagamapreduce workers
   (compute) and data-distribution are local. Number of workers vary
@@ -561,7 +564,27 @@
 \item {\bf NEEDS MODIFICATION}
 \end{enumerate}
 
+In this paper, we do the following:
+\begin{enumerate}
+\item For Clouds, the default assumption should be that the VMs are
+  distributed with respect to each other. It should also be assumed
+  that some data is locally distributed (with respect to a VM).
+  The number of workers varies from 1 to 10, and the data-set size
+  varies from 1 to 10GB. We compare the performance of \sagamapreduce
+  when running exclusively in a Cloud (both Amazon and GumboCloud) to
+  its performance in Grids. Here we assume that the number of workers
+  per VM is 1, which is treated as the base case.
+\item We then vary the number of workers per VM, such that the ratio
+  of VMs to workers is 1:2; we repeat with the ratio at 1:4, that is,
+  the number of workers per VM is 4.
+\item We then distribute the same number of workers across Grids and
+  Clouds (assuming the base case for Clouds).
+\item The workers (compute) are distributed, but GridFTP is used for
+  transfer. This corresponds to the case where workers are able to
+  communicate directly with each other.
+\end{enumerate}
 
+
 {\bf SAGA-MapReduce on Grids:} We begin with the observation that the
 efficiency of \sagamapreduce is pretty close to 1, actually better
 than 1 -- like any good (data) parallel applications should be.  For