Grid Computing
Most companies today are using precious little of the computing power available to them through the machines and software they already own. PCs, servers and mainframes all sit idle much of the time, while the people who operate them are away from the office or the plant. And as a recent IBM Corp. study points out, this is a significant problem for at least three reasons. First, companies are continually being asked to do more with less, but they cannot seem to break the cycle of increasing infrastructure needs and costs. Second, there is considerable value locked up in infrastructure that companies would like to release in the hope that it might change the way they do business. And third, there is continual pressure on IT functions to work through a backlog of projects and to help deploy new business capabilities (Desau, 2003).
Fortunately, there is a solution to this underutilization of computing infrastructure. At present, it is fairly easy to achieve 60% to 70% utilization on a mainframe, yet most companies use only 15% to 20% of the computing resources across their entire infrastructure. With the emerging practice of grid computing, companies could attain 90% utilization in the near future.
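To see why the blended figure is so low, consider a simplified, hypothetical fleet in which a well-run mainframe sits alongside hundreds of lightly used servers and desktops. The machine counts and utilization rates in the short Python calculation below are illustrative, not drawn from the IBM study:

# Illustrative arithmetic only: the machine counts and utilization rates are
# invented, and each machine is treated as one unit of capacity for simplicity.
fleet = [
    ("mainframe",    2, 0.65),   # well-utilized central systems
    ("server",     200, 0.30),
    ("desktop PC", 800, 0.10),   # idle during nights, weekends and meetings
]

total_capacity = sum(count for _, count, _ in fleet)
used_capacity  = sum(count * utilization for _, count, utilization in fleet)

print(f"blended utilization: {used_capacity / total_capacity:.0%}")
# Even with a busy mainframe, the lightly used servers and desktops dominate,
# pulling the blended figure down toward the 15% to 20% range cited above.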
Grid computing is a collection of distributed computing resources (memory, processing and communications technology) available over a network that appears, to an end user, as one large virtual computing system. It dynamically links far-flung computers and computing resources over the public Internet or a virtual private network on an as-needed basis. In essence, it provides computing power on demand, much like a utility.
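A rough sketch conveys the idea of the single virtual computer. In the toy Python example below, a broker presents a pool of machines as one resource and hands each incoming task to whichever node has spare capacity. All names, capacities and tasks are invented; a real grid adds discovery, security, data movement and fault handling:

# A toy illustration of the "one large virtual computer" idea.
class Node:
    def __init__(self, name, free_cpus):
        self.name = name
        self.free_cpus = free_cpus

class GridBroker:
    """Presents a pool of machines to the caller as a single resource."""
    def __init__(self, nodes):
        self.nodes = nodes

    def submit(self, task_name, cpus_needed):
        # Pick any node with enough spare capacity: computing on demand, as needed.
        for node in self.nodes:
            if node.free_cpus >= cpus_needed:
                node.free_cpus -= cpus_needed
                return f"{task_name} -> {node.name}"
        return f"{task_name} queued (no capacity free)"

grid = GridBroker([Node("hq-mainframe", 4), Node("lab-pc-017", 1), Node("branch-server", 2)])
for task, cpus in [("risk-model", 2), ("render-job", 1), ("sim-run", 4)]:
    print(grid.submit(task, cpus))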
At the technology level, grid computing is closely related to peer-to-peer (P2P) technology. A few years ago, P2P was seen as a way for users to share files directly (using Napster, for example); today it enables different types of computers and devices to communicate directly with each other, without a server in the middle. P2P will become a fundamental part of how distributed computing (another name for grid computing) evolves across the Internet and how enterprises build distributed systems internally (Fontana, 2002). In April 2002, the Global Grid Forum, a group founded by academics and researchers to establish standards for grid computing, merged with the P2P Working Group, a larger collaboration of universities and corporations. Together, they hope to marry the forum’s work on harnessing servers on a grid with P2P’s ability to connect desktops in the same fashion.
Grid computing has two main benefits: increased utilization of existing resources and increased computing power. It enables companies to share computing resources more effectively, internally at first but increasingly with partners and even through public grids. This sharing of resources will allow companies to boost their computational power while cutting costs. Grid computing will thus redefine how companies pay for and manage IT. Many approaches to grid computing are possible, but IBM believes that a utility model — pay for what you use — will be a significant component of computing within five to 10 years. That model will enable companies to move from making one large purchase of technology to making many smaller purchases.
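A back-of-the-envelope comparison suggests why the utility model appeals to companies whose demand is spiky. The figures in the Python snippet below are purely hypothetical: owning enough capacity for the busiest month means paying for it every month, while paying per use tracks actual consumption even at a higher unit rate:

# Hypothetical figures for illustration only.
peak_demand_hours   = 10_000   # CPU-hours needed in the busiest month
owned_cost_per_hour = 0.50     # amortized hourly cost of capacity bought up front
grid_rate_per_hour  = 0.70     # higher unit price, but no idle capacity to carry

monthly_demand = [2_000, 3_000, 2_500, 10_000, 2_200, 2_800]   # CPU-hours used

# Option 1: buy enough capacity to cover the peak and carry that cost every month.
owned_total = owned_cost_per_hour * peak_demand_hours * len(monthly_demand)

# Option 2: utility model -- pay only for the hours actually consumed.
utility_total = grid_rate_per_hour * sum(monthly_demand)

print(f"own the peak capacity: {owned_total:,.0f}")
print(f"pay for what you use:  {utility_total:,.0f}")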
New Strategic Possibilities
While the use of grid computing in business is not unheard of, most of its applications thus far have been for academic and scientific purposes — such as forecasting weather, modeling nuclear explosions and analyzing seismic data. Business applications have been limited by the lack of necessary software, but some companies have been pioneers. Pratt & Whitney’s aerospace engineering division uses 5,000 workstations in three cities to run complex computations that simulate airflow through jet engines (Ricadela, 2002). Bristol-Myers Squibb Co. uses the spare computing power of more than 400 PCs to analyze the efficacy of potential drug compounds (Ricadela, 2002). Charles Schwab Corp. distributes research-and-development workloads among its servers and has announced it will soon use grid computing to give customers much faster turnaround on transactions (Greenemeier, 2003). While distributed computing can be quite useful within a single company, its full power will be realized when it transcends company boundaries.
Researchers are just beginning to identify business applications that will be appropriate for grid computing. Some suggest that general-purpose applications of grids will include video streaming, large file transmission and shared data access (Marsan, 2003). Others believe that grid computing is best suited to large applications that involve little data transfer and do not require shared memory (Ricadela, 2002). Still others think that applications will be “delayered” into business logic, data and application components — with the pieces deployed on the parts of the network where they make the most sense (Margulius, 2002). It is clear that the nature of applications and programming will change as a result of grid computing (Bielski, 2002). Ultimately, the vision is to enable “coordinated resource sharing and problem-solving in dynamic, multi-institutional virtual organizations” (Foster, 2002).
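The kind of workload described above, many independent tasks, each with a small input and a small result, is easy to picture in miniature. In the Python sketch below, the scoring function and candidate list are invented placeholders and a local process pool stands in for grid nodes, but the shape of the work is the same as in a compound-screening application:

# Sketch of an embarrassingly parallel workload: small inputs, small outputs,
# no shared memory between tasks.
from concurrent.futures import ProcessPoolExecutor  # local stand-in for grid nodes

def score_compound(compound_id):
    # Placeholder for an expensive, self-contained computation
    # (e.g., estimating how well one drug candidate binds a target).
    return compound_id, (compound_id * 2654435761) % 1000 / 1000.0

if __name__ == "__main__":
    compounds = range(1, 11)   # in practice, hundreds of thousands of candidates
    # Each task ships a tiny input (an ID) and returns a tiny result (a score),
    # so the work spreads cleanly across machines that share no memory.
    with ProcessPoolExecutor() as pool:
        results = dict(pool.map(score_compound, compounds))
    best = max(results, key=results.get)
    print(f"best candidate: {best} (score {results[best]:.3f})")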
Two enterprises that have given considerable thought to the deployment of grid computing in a modern organization are IBM Corp. and the U.S. Department of Defense (DoD). Each has come to a similar conclusion: that real-time networks, made possible by grid computing and new peripheral devices (such as radio-frequency identification, or RFID, tags and mobile devices) will create new strategic possibilities. Such networks will change how organizations sense and respond to their environment, their customers, their partners and their competitors. To take advantage of the possibilities, a company’s sensing and responding capabilities must be well integrated, not implemented on separate platforms.
The IBM approach to these new possibilities is focused on customers. IBM believes that today’s companies are on the cusp of providing what it calls “e-business on demand” (Reuters, 2003). This type of e-business will be a matter of real-time responsiveness, customized solutions, resilience in the face of threats and opportunities, and focus on core competencies, with other functions outsourced to strategic partners. Although few companies have these capabilities at present, about 3% of IBM’s largest customers do.
The DoD approach is focused on operations. The government agency believes that because of the growth of standardized communication protocols, network devices and high-speed data access, it is now possible to collect, create, distribute and exploit information in an extremely heterogeneous global-computing environment. The payoff comes from information-intensive interactions between large numbers of heterogeneous computational nodes on the network. Value is derived from the content, quality and timeliness of the information moving across the network. In the business world, Wal-Mart Stores Inc. is a classic example of how the shift to network-centric operations enabled the company to obtain superior information, which, in turn, led to competitive advantage.
Making Grid Computing Possible
While the vision of grid computing is promising, the challenges of managing a grid infrastructure are daunting. Some of the difficulties include the heterogeneity of devices on the grid; the need to operate in geographically dispersed and complex environments; the unpredictability of system performance and behavior over time; and the existence of multiple administrative domains. Usage-sensitive pricing, service-level agreements and network-security issues introduce other serious concerns.
To enable the dynamic sharing of resources and the development of general-purpose services and tools, the grid-computing vision requires protocols, interfaces and policies that are standard, open and general in purpose. The Globus Project, funded by the National Science Foundation and the DoD, has been working on these problems since 1992. It is developing the Globus Toolkit, which has emerged as the de facto standard for connecting computers, databases and instruments. The toolkit is made up of open-architecture, open-source software designed to make it easier to build grid-based applications. Tools are being developed in five major areas: resource management (uniform, scalable mechanisms for naming, locating and allocating computational and communications resources in distributed systems); application-development environments (the integration of grid services into existing frameworks, environments and languages); data management and access; information services; and security.
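The toolkit's own interfaces are beyond the scope of this article, but the resource-management problem it standardizes can be sketched in a few lines of Python. In the illustration below, the resource names, attributes and functions are invented and are not the Globus API: a job describes its requirements, an information service locates matching machines across administrative domains, and one of them is allocated:

# Invented names throughout; an illustration of the resource-management problem,
# not the Globus Toolkit itself.
resources = [
    {"name": "grid://hq.example.com/cluster-a",  "os": "linux",   "cpus": 64, "domain": "hq"},
    {"name": "grid://lab.example.com/blade-3",   "os": "linux",   "cpus": 16, "domain": "lab"},
    {"name": "grid://partner.example.com/win-7", "os": "windows", "cpus": 8,  "domain": "partner"},
]

def locate(requirements):
    """Return the names of resources that satisfy a job's requirements."""
    return [r["name"] for r in resources
            if r["os"] == requirements["os"] and r["cpus"] >= requirements["cpus"]]

def allocate(job, requirements):
    matches = locate(requirements)
    if not matches:
        raise RuntimeError(f"no resource satisfies {requirements}")
    # A real toolkit would also authenticate the caller, honor each domain's
    # policies and stage the job's data to the chosen machine.
    return f"{job} allocated to {matches[0]}"

print(allocate("airflow-simulation", {"os": "linux", "cpus": 32}))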
Version three of the toolkit includes a software layer called the Open Grid Services Architecture (backed by the Global Grid Forum). This layer lets applications call upon grid services to find computers, balance workloads and transfer jobs using the standard Web Services Description Language (WSDL). In August 2002, IBM ported the Globus Toolkit to its four computing platforms. Hewlett-Packard Co. and Microsoft Corp. have also begun integrating the Globus Toolkit into their products, notably Windows (Ricadela, 2002).
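The value of describing grid services in a standard language is that a client can discover a service's operations and invoke them without knowing what machine or platform answers. The Python fragment below is an invented, greatly simplified stand-in for that idea; a real client would read the operation names and argument types from the service's WSDL document rather than from a Python object:

# Simplified illustration only: grid capabilities exposed as named operations
# behind a uniform interface.
class GridService:
    """Stands in for a service whose operations would be described in WSDL."""
    def find_computers(self, min_cpus):
        return [name for name, cpus in {"cluster-a": 64, "blade-3": 16}.items()
                if cpus >= min_cpus]

    def submit_job(self, job, target):
        return f"{job} submitted to {target}"

def invoke(service, operation, **arguments):
    # A real client would resolve the operation from the service description;
    # here we simply look it up on the object by name.
    return getattr(service, operation)(**arguments)

service = GridService()
nodes = invoke(service, "find_computers", min_cpus=32)
print(invoke(service, "submit_job", job="nightly-batch", target=nodes[0]))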
While companies may not yet be ready to move to grid computing, they can prepare for it. The new technical architecture is based on two critical principles: open standards and self-organization. Open standards are essential to future computing architectures because they make networking beyond organizational boundaries possible. In the future, companies should seek out nonproprietary technology in order to take advantage of such standards. They should also note that the internal integration achieved with technologies such as enterprise-resource planning (ERP) and customer-relationship management (CRM) will not help organizations evolve in the future. Hardwired technologies will actually ossify existing organizational structures, preventing the flexibility companies will need to operate in a network-centric environment.
Companies must also prepare for technologies that are self-organizing, self-optimizing and self-changing. While these features have not yet been fully developed in most technologies, companies should be aware of the principles of self-organization and incorporate them as much as possible into their selection criteria and general computing-architecture principles (Huang, 2003). CIOs should explore opportunities to improve the utilization of their organizations’ IT resources through the use of grid computing for particular applications.
Technical and Organizational Challenges
Grid computing will change the economics of IT utilization, making new types of applications and services more cost-effective. It will also enable a variety of different technologies to be seamlessly interconnected to create vast networks capable of gathering huge amounts of data. Organizations that develop the capabilities to collect, digest and act on that information will achieve quantum leaps in flexibility and agility in responding to customers, risks and competitive threats. Nevertheless, there are many challenges that must be overcome before the potential of this new architecture can be achieved.
To take advantage of the sense-and-respond capabilities enabled by grid computing, leaders will have to make considerable changes in how they and their organizations operate. For example, centralized command-and-control structures will no longer be effective. Networks will enable front-line workers to organize and synchronize their activities from the bottom up. Top managers will therefore have to relinquish some degree of decision-making control in favor of an accelerated ability to respond. The new environment will also have implications for the role of the CIO (Desau, 2003). As on-demand computing becomes more available and less complex, some CIOs will find their jobs reduced in scope and will become IT directors. Others, however, will increasingly take on strategic responsibilities. The IT function will also change from being a cost center to a “transformation center” focused on freeing up value. Fortunately, companies have some time to explore how best to achieve the changes implicit in the new architecture — technically, strategically and behaviorally — before they become imperative.