Monday, July 9, 2007

Grid computing - A definition

As I mentioned in a previous post, one of the areas I have been working on lately is Grid computing. But what is Grid computing? As a colleague at work once said - and I believe he was not the first I heard that thought from - being able to state a definition of something is a big step toward understanding it. I will try to give such a definition in the following lines.

So, here it goes: A Grid is a set of resources that can be used as if they were a single, powerful resource. It does not matter where exactly the power being used comes from, only that it is available as part of that one big resource. Note that I use the word resource and not computer; I do that because with this technology other elements, such as sensors, may be shared. In a Grid, the physical location of the resources does not matter, nor does it matter whether those resources are heterogeneous.

The last two aspects I mentioned seem to be what distinguishes a Grid from a Cluster. In a Cluster, usually all resources are homogeneous, e.g. all have the same operating system. This is not a restriction in a Grid. Also, a Cluster is usually placed as a whole inside one administrative domain, i.e. managed by a single authority (I would like to get further explanation on this), whereas a Grid can expand across different administrative domains. In practice, what all this means is that a set of computers from anywhere in the world, belonging to different people, and running different operating systems, can have their power combined into a Grid structure - and you cannot have that with a Cluster. In addition, in a Cluster environment there is typically a master, whereas in a Grid environment all resources are peers.

I was just re-reading a definition from GridCafé (link some lines below) and noted that it states that a Grid is aimed at sharing computing power and data storage capacity. This makes me wonder whether anything else could be shared. It also mentions that a Grid is mainly directed at sharing over the Internet, though I think uses could be found on any internet (lower case i).

In order for a Grid to work, middleware is used. Middleware (yet another definition I will attempt to give here) is a layer of software placed above the operating system level, but below the application level, that makes something heterogeneous seem homogeneous and provides services to make use of it.
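To make that definition a bit more concrete, here is a minimal Python sketch of the idea, not of any real middleware. All names in it (Resource, LinuxNode, WindowsNode, Middleware, submit) are made up for illustration, and the "nodes" only return a string describing what they would run. The point is just that the application talks to one uniform interface and never sees which kind of resource did the work.

```python
from abc import ABC, abstractmethod


class Resource(ABC):
    """Uniform interface that the hypothetical middleware shows to applications."""

    @abstractmethod
    def run(self, command: str) -> str:
        """Run a command on this resource and return a description of it."""


class LinuxNode(Resource):
    def run(self, command: str) -> str:
        # A real adapter would hand the command to the node's shell;
        # here we only describe what would be executed.
        return f"[linux] sh -c '{command}'"


class WindowsNode(Resource):
    def run(self, command: str) -> str:
        # Same interface, different platform-specific details underneath.
        return f"[windows] cmd /c {command}"


class Middleware:
    """Sits between the application and the heterogeneous resources."""

    def __init__(self, resources):
        self._resources = list(resources)
        self._next = 0

    def submit(self, command: str) -> str:
        # Pick resources in round-robin order; the caller never chooses one.
        resource = self._resources[self._next % len(self._resources)]
        self._next += 1
        return resource.run(command)


if __name__ == "__main__":
    grid = Middleware([LinuxNode(), WindowsNode()])
    for _ in range(3):
        print(grid.submit("hostname"))
```

The round-robin choice is only a placeholder for whatever scheduling a real middleware would do; what matters for the definition is that submit() looks the same whichever resource ends up running the command.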

The best-known middleware available for grids is the Globus Toolkit (from now on, Globus). Globus is a toolkit that provides a set of services for common tasks in a Grid environment: execution management (e.g. job submission), data management (e.g. file transfer), security-related tasks, and information services (e.g. resource monitoring and discovery). It exposes those services as APIs that can be used from higher-level applications.

Grid computing - for the sake of formality - could be defined as the use of grids.

There is a great place for Grid newbies called GridCafé; it can be accessed at http://gridcafe.web.cern.ch/gridcafe/

A last remark, which would deserve its own post: what can a Grid be used for? Well, I was also reading about this in one of the IBM Redpapers at http://www.redbooks.ibm.com/ called "Fundamentals of Grid Computing" (specific link is http://www.redbooks.ibm.com/abstracts/redp3613.html). I will mention only one aspect here: resources are usually dimensioned for the most demanding case, separately for each organizational unit - think of the units as silos. But that capacity is not used at its peak all the time in every organizational unit. What if you could take advantage of idle capacity in one place when you are running out of resources in another? Well, your overall need for resources (in total) would decrease.
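The arithmetic behind that argument can be sketched in a few lines of Python. The demand figures below are invented purely for illustration (three hypothetical units, four time slots, arbitrary units of capacity), and the function names are my own. With silos, each unit must be sized for its own peak; with pooling, you only need enough for the peak of the combined demand.

```python
def silo_capacity(demand_by_unit):
    """Capacity needed when every unit is sized for its own peak."""
    return sum(max(series) for series in demand_by_unit.values())


def pooled_capacity(demand_by_unit):
    """Capacity needed when all units share one pool of resources."""
    # Add up the demand of all units in each time slot, then take the peak.
    totals = [sum(slot) for slot in zip(*demand_by_unit.values())]
    return max(totals)


# Made-up demand per time slot for three organizational units.
demand = {
    "unit_a": [8, 2, 1, 3],
    "unit_b": [1, 7, 2, 2],
    "unit_c": [2, 1, 9, 3],
}

if __name__ == "__main__":
    print("silos: ", silo_capacity(demand))    # 8 + 7 + 9 = 24
    print("pooled:", pooled_capacity(demand))  # peak of [11, 10, 12, 8] = 12
```

The saving only shows up because the units peak at different times in this example; if all of them peaked in the same time slot, the pooled figure would equal the sum of the silo figures.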
