[Saga-devel] saga-projects SVN commit 866: /papers/clouds/
sjha at cct.lsu.edu
Sat Jan 24 12:46:38 CST 2009
User: sjha
Date: 2009/01/24 12:46 PM
Modified:
/papers/clouds/
saga_cloud_interop.tex
Log:
Pending commits before I have to go offline to attend
another darned meeting.
Sorry, I lost track of what the changes are/were; mostly
smallish.
File Changes:
Directory: /papers/clouds/
==========================
File [modified]: saga_cloud_interop.tex
Delta lines: +218 -220
===================================================================
--- papers/clouds/saga_cloud_interop.tex 2009-01-24 17:41:05 UTC (rev 865)
+++ papers/clouds/saga_cloud_interop.tex 2009-01-24 18:46:33 UTC (rev 866)
@@ -274,6 +274,20 @@
\end{itemize}
+\subsection{Clouds: An Emerging Distributed Infrastructure}
+{\textcolor{blue} {KS}}
+
+In our opinion, the primary distinguishing feature of Grids and
+Clouds is...
+
+\subsection{Amazon EC2}
+
+\subsection{Eucalyptus}
+
+
+GumboCloud, ECP, etc.
+
+
\section{SAGA} {\textcolor{blue} {SJ}}
@@ -327,21 +341,218 @@
Forward reference the section on the role of adaptors..
+\subsection{SAGA: An Interface to Clouds and Grids}{\bf AM}
-\section{Clouds: An Emerging Distributed Infrastructure}
-{\textcolor{blue} {KS}}
+\section{Interfacing SAGA to Clouds: The Role of Adaptors}
-In our opinion the primary distinguishing feature of Grids and
-Clouds is...
+As alluded to, there is a proliferation of Clouds and Cloud-like
+systems, but it is important to remember that ``what constitutes or
+does not constitute a Cloud'' is not universally agreed upon. There
+are, however, several aspects and attributes of Cloud systems that
+are generally accepted~\cite{buyya_hpcc}...
+% Here we will by necessity
+% limit our discussion to two type of distributed file-systems (HDFS and
+% KFS) and two types of distributed structured-data store (Bigtable and
+% HBase). We have developed SAGA adaptors for these, have used
+% \sagamapreduce (and All-Pairs) seamlessly on these infrastructure.
-\subsection{Amazon EC2:}
+% {\it HDFS and KFS: } HDFS is a distributed parallel fault tolerant
+% application that handles the details of spreading data across multiple
+% machines in a traditional hierarchical file organization. Implemented
+% in Java, HDFS is designed to run on commodity hardware while providing
+% scalability and optimizations for large files. The FS works by having
+% one or two namenodes (masters) and many rack-aware datanodes (slaves).
+% All data requests go through the namenode that uses block operations
+% on each data node to properly assemble the data for the requesting
+% application. The goal of replication and rack-awareness is to improve
+% reliability and data retrieval time based on locality. In data
+% intensive applications, these qualities are essential. KFS (also
+% called CloudStore) is an open-source high-performance distributed FS
+% implemented in C++, with many of the same design features as HDFS.
-\subsection{Eucalyptus}
+% There exist many other implementations of both distributed FS (such as
+% Sector) and of distributed data-store (such as Cassandra and
+% Hybertable); for the most part they are variants on the same theme
+% technically, but with different language and performance criteria
+% optimizations. Hypertable is an open-source implementation of
+% Bigtable; Cassandra is a Bigtable clone but eschews an explicit
+% coordinator (Bigtable's Chubby, HBase's HMaster, Hypertable's
+% Hyperspace) for a P2P/DHT approach for data distribution and location
+% and for availability. In the near future we will be providing
+% adaptors for Sector\footnote{http://sector.sourceforge.net/} and
+% Cassandra\footnote{http://code.google.com/p/the-cassandra-project/}.
+% And although Fig.~\ref{saga_figure} explicitly maps out different
+% functional areas for which SAGA adaptors exist, there can be multiple
+% adaptors (for different systems) that implement that functionality;
+% the SAGA run-time dynamically loads the correct adaptor, thus
+% providing both an effective abstraction layer as well as an
+% interesting means of providing interoperability between different
+% Cloud-like infrastructure. As testimony to the power of SAGA, the
+% ability to create the relevant adaptors in a lightweight fashion and
+% thus extend applications to different systems with minimal overhead is
+% an important design feature and a significant requirement so as to be
+% an effective programming abstraction layer.
+\subsection{Cloud Adaptors: Design and Implementation}
-GumboCloud, ECP etc
+\jhanote{The aim of this section is to discuss how SAGA on Clouds
+ differs from SAGA for Grids. Everything from i) job submission ii)
+ file transfer...}
+{\bf SAGA-MapReduce on Clouds: } Thanks to the low overhead of
+developing adaptors, SAGA has been deployed on three Cloud systems --
+Amazon EC2, Nimbus~\cite{nimbus} and Eucalyptus~\cite{eucalyptus} (we
+have a local installation of Eucalyptus, referred to as GumboCloud).
+On EC2, we created a custom virtual machine (VM) image with SAGA
+preinstalled. For Eucalyptus and Nimbus, a bootstrapping script
+equips a standard VM instance with SAGA and its prerequisites (mainly
+Boost). To us, a mixed approach seems most favourable: the bulk of
+the software installation is done statically via a custom VM image,
+while software configuration and application deployment are performed
+dynamically during VM startup.
+
+There are several aspects to Cloud interoperability. A simple form of
+interoperability -- closer, in fact, to interchangeability -- is that
+an application can use any of the three Cloud systems without any
+changes to the application itself: it merely needs to instantiate a
+different set of security credentials for the respective runtime
+environment, i.e., the Cloud in question. SAGA provides this level of
+interoperability quite trivially, thanks to its adaptors.
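+
+For illustration, Fig.~\ref{ctxswitch} sketches what such a credential
+switch might look like at the application level: a
+\texttt{saga::context} of the appropriate type is attached to a
+\texttt{saga::session}, and all SAGA objects created with that session
+use its credentials. The context type and attribute names shown are
+indicative only and depend on the locally deployed adaptors.
+
+\begin{figure}[!ht]
+\upp
+ \begin{center}
+ \begin{mycode}[label=Sketch of credential selection via SAGA contexts]
+ { // select credentials for the target system
+   saga::session s;
+
+   // attach an EC2-style credential; the context type and
+   // attribute name are placeholders for the adaptor-defined keys
+   saga::context c ("ec2");
+   c.set_attribute ("UserCert", "/home/user/.ec2/cert.pem");
+   s.add_context (c);
+
+   // a Grid run would instead attach, e.g., an x509 context;
+   // the application code is otherwise unchanged
+
+   // all objects created with this session use its credentials
+   saga::job::service js (s);
+ }
+ \end{mycode}
+ \caption{\label{ctxswitch}Sketch of switching between Cloud and Grid
+ credentials by attaching different contexts to a SAGA session
+ (context type and attribute names are indicative).}
+ \end{center}
+\upp
+\end{figure}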
+
+By almost trivial extension, SAGA also provides Grid-Cloud
+interoperability, as shown in Figs.~\ref{gramjob} and~\ref{vmjob},
+where exactly the same interface and functional calls lead to job
+submission on Grids or on Clouds. Although syntactically identical,
+the semantics of the calls and the back-end management are somewhat
+different. For example, for Grids a \texttt{job\_service} instance
+represents a live job-submission endpoint, whilst for Clouds it
+represents a VM instance created on the fly. It takes SAGA about 45
+seconds to instantiate a VM on Eucalyptus, and about 90 seconds on
+EC2. Once instantiated, it takes about 1 second to assign a job to a
+VM on either Eucalyptus or EC2. Whether the VM lifetime is tied to the
+lifetime of the \texttt{job\_service} object is a configurable option.
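+
+Which adaptor services a given \texttt{job\_service} request is
+decided by the SAGA run-time, typically based on the contact URL
+supplied when the \texttt{job\_service} is constructed.
+Fig.~\ref{urlswitch} sketches this idea; the URL schemes shown are
+placeholders, and the exact form accepted depends on the deployed
+adaptors.
+
+\begin{figure}[!ht]
+\upp
+ \begin{center}
+ \begin{mycode}[label=Same SAGA code for Grid and Cloud back-ends]
+ { // the same application code targets a Grid or a Cloud;
+   // only the contact URL differs (schemes are placeholders)
+   saga::url rm ("gram://gridhost.example.org/jobmanager-pbs");
+   // for a Cloud back-end one would instead pass, e.g.,
+   // saga::url rm ("ec2://");  // adaptor starts a VM on demand
+
+   saga::job::service js (rm);
+
+   saga::job::description jd;
+   jd.set_attribute ("Executable", "/tmp/my_prog");
+
+   saga::job::job j = js.create_job (jd);
+   j.run ();
+   j.wait ();
+ } // for the Cloud case the VM may be shut down here,
+   // depending on configuration
+ \end{mycode}
+ \caption{\label{urlswitch}Sketch of selecting a Grid or Cloud
+ back-end via the contact URL passed to the \texttt{job\_service}
+ (URL schemes are indicative).}
+ \end{center}
+\upp
+\end{figure}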
+
+We have also deployed \sagamapreduce on Cloud platforms. It is
+important to stress that the \sagamapreduce code did not undergo any
+changes whatsoever; only the run-time system and the deployment
+architecture change. For example, when running \sagamapreduce on EC2,
+the master process resides on one VM, while the workers reside on
+different VMs. Depending on the available adaptors, the master and
+workers can either perform local I/O on a global/distributed file
+system, or remote I/O on a remote, non-shared file system. In our
+current implementation, the VMs hosting the master and the workers
+share the same ssh credentials and a shared file system (using
+sshfs/FUSE). Application deployment and configuration (as discussed
+above) are also performed via that sshfs mount. Due to space
+limitations we do not discuss the performance of \sagamapreduce for
+different data-set sizes and varying numbers of workers.
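+
+The deployment described above can be expressed directly in SAGA: the
+master creates one \texttt{job\_service} per VM endpoint and submits a
+worker to each, as sketched in Fig.~\ref{vmworkers}. The endpoint URLs
+and the worker executable path in the sketch are placeholders.
+
+\begin{figure}[!ht]
+\upp
+ \begin{center}
+ \begin{mycode}[label=Master spawning one worker per VM]
+ { // master-side sketch; assumes <string> and <vector> are included
+   std::vector<std::string> vms;
+   vms.push_back ("ssh://vm-1.example.org");  // placeholder endpoints
+   vms.push_back ("ssh://vm-2.example.org");
+
+   std::vector<saga::job::job> workers;
+
+   for (std::size_t i = 0; i < vms.size (); ++i)
+   {
+     saga::job::service js (saga::url (vms[i]));
+
+     saga::job::description jd;
+     jd.set_attribute ("Executable", "/tmp/mapreduce_worker");
+
+     saga::job::job j = js.create_job (jd);
+     j.run ();                 // worker starts on the remote VM
+     workers.push_back (j);
+   }
+
+   for (std::size_t i = 0; i < workers.size (); ++i)
+     workers[i].wait ();       // wait for all workers to finish
+ }
+ \end{mycode}
+ \caption{\label{vmworkers}Sketch of the master spawning one
+ \sagamapreduce worker per VM (endpoint URLs and worker path are
+ placeholders).}
+ \end{center}
+\upp
+\end{figure}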
+
+\begin{figure}[!ht]
+\upp
+ \begin{center}
+ \begin{mycode}[label=SAGA Job Launch via GRAM gatekeeper]
+ { // contact a GRAM gatekeeper
+   saga::job::service js;
+   saga::job::description jd;
+   jd.set_attribute ("Executable", "/tmp/my_prog");
+   // translate job description to RSL
+   // submit RSL to gatekeeper, and obtain job handle
+   saga::job::job j = js.create_job (jd);
+   j.run ();
+   // watch handle until job is finished
+   j.wait ();
+ } // break contact to GRAM
+ \end{mycode}
+ \caption{\label{gramjob}Job launch via GRAM}
+ \end{center}
+\upp
+\end{figure}
+
+\begin{figure}[!ht]
+\upp
+ \begin{center}
+ \begin{mycode}[label=SAGA create a VM instance on a Cloud]
+ { // create a VM instance on Eucalyptus/Nimbus/EC2
+   saga::job::service js;
+   saga::job::description jd;
+   jd.set_attribute ("Executable", "/tmp/my_prog");
+   // translate job description to an ssh command
+   // run the ssh command on the VM
+   saga::job::job j = js.create_job (jd);
+   j.run ();
+   // watch command until done
+   j.wait ();
+ } // shut down VM instance
+ \end{mycode}
+ \caption{\label{vmjob}Job launch via VM}
+ \end{center}
+\upp
+\end{figure}
+
+%{\bf SAGA-MapReduce on Clouds and Grids:}
+\begin{figure}[t]
+ % \includegraphics[width=0.4\textwidth]{MapReduce_local_executiontime.png}
+ \caption{Plots showing how \tc varies with the number of workers
+ employed, for different data-set sizes. For example, although
+ $t_{pp}$ increases with larger data-set sizes, the workload per
+ worker decreases as the number of workers increases, leading to an
+ overall reduction in $T_c$. The advantages of a greater number of
+ workers are manifest for larger data-sets.}
+\label{grids1}
+\end{figure}
+
+% {\bf SAGA-MapReduce on Cloud-like infrastructure: } Accounting for the
+% fact that time for chunking is not included, Yahoo's MapReduce takes a
+% factor of 2 less time than \sagamapreduce
+% (Fig.~\ref{mapreduce_timing_FS}). This is not surprising, as
+% \sagamapreduce implementations have not been optimized, e.g.,
+% \sagamapreduce is not multi-threaded.
+% \begin{figure}[t]
+% \upp
+% \centering
+% % \includegraphics[width=0.40\textwidth]{mapreduce_timing_FS.pdf}
+% \caption{\tc for \sagamapreduce using one worker (local to
+% the master) for different configurations. The label
+% ``Hadoop'' represents Yahoo's MapReduce implementation;
+% \tc for Hadoop is without chunking, which takes
+% several hundred sec for larger data-sets. The ``SAGA
+% MapReduce + Local FS'' corresponds to the use of the local
+% FS on Linux clusters, while the label ``SAGA + HDFS''
+% corresponds to the use of HDFS on the clusters. Due to
+% simplicity, of the Local FS, its performance beats
+% distributed FS when used in local mode.}
+% % It is interesting to note that as the data-set sizes get
+% % larger, HDFS starts outperforming local FS. We attribute
+% % this to the use of caching and other advanced features in
+% % HDFS which prove to be useful, even though it is not being
+% % used in a distributed fashion. scenarios considered are
+% % (i) all infrastructure is local and thus SAGA's local
+% % adapters are invoked, (ii) local job adaptors are used,
+% % but the hadoop file-system (HDFS) is used, (iii) Yahoo's
+% % mapreduce.
+% % \label{saga_mapreduce_1worker.png}
+% \label{mapreduce_timing_FS}
+% \upp
+% \end{figure}
+% Experiment 5 (Table~\ref{exp4and5}) provides insight into performance
+% figure when the same number of workers are available, but are either
+% all localized, or are split evenly between two similar but distributed
+% machines. It shows that to get lowest $T_c$, it is often required to
+% both distribute the compute and lower the workload per worker; just
+% lowering the workload per worker is not good enough as there is still
+% a point of serialization (usually local I/O). % It shows that when
+% % workload per worker gets to a certain point, it is beneficial to
+% % distribute the workers, as the machine I/0 becomes the bottleneck.
+% When coupled with the advantages of a distributed FS, the ability to
+% both distribute compute and data provides additional performance
+% advantage, as shown by the values of $T_c$ for both distributed
+% compute and DFS cases in Table~\ref{exp4and5}.
+
+
+
+
+
\section{SAGA-based MapReduce}
In this paper we will demonstrate the use of SAGA in implementing well
@@ -567,220 +778,7 @@
% fragment to each one in the base. This is done starting at every
% point possible on the base.
-\section{Interfacing SAGA to Cloud-like Infrastructure: The role of
- Adaptors}
-As alluded to, there is a proliferation of Clouds and Cloud-like
-systems, but it is important to remember that ``what constitutes or
-does not constitute a Cloud'' is not universally agreed upon. However
-there are several aspects and attributes of Cloud systems that are
-generally agreed upon~\cite{buyya_hpcc}...
-
-% Here we will by necessity
-% limit our discussion to two type of distributed file-systems (HDFS and
-% KFS) and two types of distributed structured-data store (Bigtable and
-% HBase). We have developed SAGA adaptors for these, have used
-% \sagamapreduce (and All-Pairs) seamlessly on these infrastructure.
-
-% {\it HDFS and KFS: } HDFS is a distributed parallel fault tolerant
-% application that handles the details of spreading data across multiple
-% machines in a traditional hierarchical file organization. Implemented
-% in Java, HDFS is designed to run on commodity hardware while providing
-% scalability and optimizations for large files. The FS works by having
-% one or two namenodes (masters) and many rack-aware datanodes (slaves).
-% All data requests go through the namenode that uses block operations
-% on each data node to properly assemble the data for the requesting
-% application. The goal of replication and rack-awareness is to improve
-% reliability and data retrieval time based on locality. In data
-% intensive applications, these qualities are essential. KFS (also
-% called CloudStore) is an open-source high-performance distributed FS
-% implemented in C++, with many of the same design features as HDFS.
-
-% There exist many other implementations of both distributed FS (such as
-% Sector) and of distributed data-store (such as Cassandra and
-% Hybertable); for the most part they are variants on the same theme
-% technically, but with different language and performance criteria
-% optimizations. Hypertable is an open-source implementation of
-% Bigtable; Cassandra is a Bigtable clone but eschews an explicit
-% coordinator (Bigtable's Chubby, HBase's HMaster, Hypertable's
-% Hyperspace) for a P2P/DHT approach for data distribution and location
-% and for availability. In the near future we will be providing
-% adaptors for Sector\footnote{http://sector.sourceforge.net/} and
-% Cassandra\footnote{http://code.google.com/p/the-cassandra-project/}.
-% And although Fig.~\ref{saga_figure} explicitly maps out different
-% functional areas for which SAGA adaptors exist, there can be multiple
-% adaptors (for different systems) that implement that functionality;
-% the SAGA run-time dynamically loads the correct adaptor, thus
-% providing both an effective abstraction layer as well as an
-% interesting means of providing interoperability between different
-% Cloud-like infrastructure. As testimony to the power of SAGA, the
-% ability to create the relevant adaptors in a lightweight fashion and
-% thus extend applications to different systems with minimal overhead is
-% an important design feature and a significant requirement so as to be
-% an effective programming abstraction layer.
-
-\subsection{Clouds Adaptors: Design and Implementation}
-
-
-
-\section{SAGA: An interface to Clouds and Grids}{\bf AM}
-
-
-\jhanote{The aim of this section is to discuss how SAGA on Clouds
- differs from SAGA for Grids. Everything from i) job submission ii)
- file transfer...}
-
-
-{\bf SAGA-MapReduce on Clouds: } Thanks to the low overhead of
-developing adaptors, SAGA has been deployed on three Cloud Systems --
-Amazon, Nimbus~\cite{nimbus} and Eucalyptus~\cite{eucalyptus} (we have
-a local installation of Eucalyptus, referred to as GumboCloud). On
-EC2, we created custom virtual machine (VM) image with preinstalled
-SAGA. For Eucalyptus and Nimbus, a boot strapping script equips a
-standard VM instance with SAGA, and SAGA's prerequisites (mainly
-boost). To us, a mixed approach seemed most favourable, where the
-bulk software installation is statically done via a custom VM image,
-but software configuration and application deployment are done
-dynamically during VM startup.
-
-There are several aspects to Cloud Interoperability. A simple form of
-interoperability -- more akin to inter-changeable -- is that any
-application can use either of the three Clouds systems without any
-changes to the application: the application simply needs to
-instantiate a different set of security credentials for the respective
-runtime environment, aka cloud. Interestingly, SAGA provides this level of
-interoperability quite trivially thanks to the adaptors.
-
-By almost trivial extension, SAGA also provides Grid-Cloud
-interoperability, as shown in Fig.~\ref{gramjob} and ~\ref{vmjob},
-where exactly the same interface and functional calls lead to job
-submission on Grids or on Clouds. Although syntactically identical,
-the semantics of the calls and back-end management are somewhat
-different. For example, for Grids, a \texttt{job\_service} instance
-represents a live job submission endpoint, whilst for Clouds it
-represents a VM instance created on the fly. It takes SAGA about 45
-seconds to instantiate a VM on Eucalyptus, and about 90 seconds on
-EC2. Once instantiated, it takes about 1 second to assign a job to a
-VM on Eucalyptus, or EC2. It is a configurable option to tie the VM
-lifetime to the \texttt{job\_service} object lifetime, or not.
-
-We have also deployed \sagamapreduce to work on Cloud platforms. It
-is critical to mention that the \sagamapreduce code did not undergo
-any changes whatsoever. The change lies in the run-time system and
-deployment architecture. For example, when running \sagamapreduce on
-EC2, the master process resides on one VM, while workers reside on
-different VMs. Depending on the available adaptors, Master and Worker
-can either perform local I/O on a global/distributed file system, or
-remote I/O on a remote, non-shared file systems. In our current
-implementation, the VMs hosting the master and workers share the same
-ssh credentials and a shared file-system (using sshfs/FUSE).
-Application deployment and configuration (as discussed above) are also
-performed via that sshfs. Due to space limitations we will not
-discuss the performance data of \sagamapreduce with different data-set
-sizes and varying worker numbers.
-
-\begin{figure}[!ht]
-\upp
- \begin{center}
- \begin{mycode}[label=SAGA Job Launch via GRAM gatekeeper]
- { // contact a GRAM gatekeeper
- saga::job::service js;
- saga::job::description jd;
- jd.set_attribute (``Executable'', ``/tmp/my_prog'');
- // translate job description to RSL
- // submit RSL to gatekeeper, and obtain job handle
- saga:job::job j = js.create_job (jd);
- j.run ():
- // watch handle until job is finished
- j.wait ();
- } // break contact to GRAM
- \end{mycode}
- \caption{\label{gramjob}Job launch via Gram }
- \end{center}
-\upp
-\end{figure}
-
-\begin{figure}[!ht]
-\upp
- \begin{center}
- \begin{mycode}[label=SAGA create a VM instance on a Cloud]
- {// create a VM instance on Eucalyptus/Nimbus/EC2
- saga::job::service js;
- saga::job::description jd;
- jd.set_attribute (``Executable'', ``/tmp/my_prog'');
- // translate job description to ssh command
- // run the ssh command on the VM
- saga:job::job j = js.create_job (jd);
- j.run ():
- // watch command until done
- j.wait ();
- } // shut down VM instance
- \end{mycode}
- \caption{\label{vmjob} Job launch via VM}
- \end{center}
-\upp
-\end{figure}
-
-{\bf SAGA-MapReduce on Clouds and Grids:}
-\begin{figure}[t]
- % \includegraphics[width=0.4\textwidth]{MapReduce_local_executiontime.png}
- \caption{Plots showing how the \tc for different data-set sizes
- varies with the number of workers employed. For example, with
- larger data-set sizes although $t_{pp}$ increases, as the number
- of workers increases the workload per worker decreases, thus
- leading to an overall reduction in $T_c$. The advantages of a
- greater number of workers is manifest for larger data-sets.}
-\label{grids1}
-\end{figure}
-
-% {\bf SAGA-MapReduce on Cloud-like infrastructure: } Accounting for the
-% fact that time for chunking is not included, Yahoo's MapReduce takes a
-% factor of 2 less time than \sagamapreduce
-% (Fig.~\ref{mapreduce_timing_FS}). This is not surprising, as
-% \sagamapreduce implementations have not been optimized, e.g.,
-% \sagamapreduce is not multi-threaded.
-% \begin{figure}[t]
-% \upp
-% \centering
-% % \includegraphics[width=0.40\textwidth]{mapreduce_timing_FS.pdf}
-% \caption{\tc for \sagamapreduce using one worker (local to
-% the master) for different configurations. The label
-% ``Hadoop'' represents Yahoo's MapReduce implementation;
-% \tc for Hadoop is without chunking, which takes
-% several hundred sec for larger data-sets. The ``SAGA
-% MapReduce + Local FS'' corresponds to the use of the local
-% FS on Linux clusters, while the label ``SAGA + HDFS''
-% corresponds to the use of HDFS on the clusters. Due to
-% simplicity, of the Local FS, its performance beats
-% distributed FS when used in local mode.}
-% % It is interesting to note that as the data-set sizes get
-% % larger, HDFS starts outperforming local FS. We attribute
-% % this to the use of caching and other advanced features in
-% % HDFS which prove to be useful, even though it is not being
-% % used in a distributed fashion. scenarios considered are
-% % (i) all infrastructure is local and thus SAGA's local
-% % adapters are invoked, (ii) local job adaptors are used,
-% % but the hadoop file-system (HDFS) is used, (iii) Yahoo's
-% % mapreduce.
-% % \label{saga_mapreduce_1worker.png}
-% \label{mapreduce_timing_FS}
-% \upp
-% \end{figure}
-% Experiment 5 (Table~\ref{exp4and5}) provides insight into performance
-% figure when the same number of workers are available, but are either
-% all localized, or are split evenly between two similar but distributed
-% machines. It shows that to get lowest $T_c$, it is often required to
-% both distribute the compute and lower the workload per worker; just
-% lowering the workload per worker is not good enough as there is still
-% a point of serialization (usually local I/O). % It shows that when
-% % workload per worker gets to a certain point, it is beneficial to
-% % distribute the workers, as the machine I/0 becomes the bottleneck.
-% When coupled with the advantages of a distributed FS, the ability to
-% both distribute compute and data provides additional performance
-% advantage, as shown by the values of $T_c$ for both distributed
-% compute and DFS cases in Table~\ref{exp4and5}.
-
-
 \section{Demonstrating Cloud-Grid Interoperability}
In an earlier paper, we had essentially done the following: