Concurrency and Computation: Practice and Experience

Published Papers for 2001


Concurrency and Computation: Practice and Experience, Volume 13, 2001

Article ID: CPE544

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Editorial
Article Title Special Issue: High Performance Agent Systems
Volume ID 13
Issue ID 1
Date Jan 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.544
Article ID CPE544
Author Name(s) Omer F. Rana 1, David Kotz 2
Author Email(s)
Affiliation(s) 1 2
Keyword(s)
Abstract
No abstract

Article ID: CPE545

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Dynamic configuration of access control for mobile components in FarGo
Volume ID 13
Issue ID 1
Date Jan 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.545
Article ID CPE545
Author Name(s) Yoad Gidron 1, Israel Ben-Shaul 2, Ophir Holder 3, Yariv Aridor 4
Author Email(s) yariv@il.ibm.com 4
Affiliation(s) Department of Electrical Engineering, Technion-Israel Institute of Technology, Haifa, 32000, Israel 1 2 3 IBM Research Laboratory in Haifa, MATAM, 31905 Haifa, Israel 4
Keyword(s) mobile components, dynamic layout, access control, FarGo,
Abstract
Component mobility is an important enabling technology for the design of wide area pervasive applications, but it introduces new challenges in the critical aspect of access control. In particular, when mobility is used for dynamic relocation of distributed components, access from both remote and local mobile components needs to be uniformly controlled. The dynamic determination of execution location, possibly crossing multiple administrative authorities, requires dynamic establishment and enforcement of access control. The deployment over widely heterogeneous hosts and devices requires integration of access control with dynamic probing of resource availability so as to influence the relocation process. This paper presents a model for dynamic specification and enforcement of access control in the context of dynamically relocatable components, and an implementation in the Java-based FarGo framework. The specification follows a negotiation-based protocol that enables dynamic matching of available and required resources by providers and consumers, respectively. Enforcement is provided through a capability-based secure component reference architecture, which uniformly applies to both local and remote references, and through instance-level, as opposed to type-level (supported in Java), access control. Finally, access control is integrated into the programming model in a non-intrusive fashion, by separating the encoding of access control from the encoding of the logic of the application. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE546

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Performance investigation of an on-line auction system
Volume ID 13
Issue ID 1
Date Jan 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.546
Article ID CPE546
Author Name(s) Jane Hillston 1, Leïla Kloul 2
Author Email(s) jeh@dcs.ed.ac.uk 1
Affiliation(s) LFCS, University of Edinburgh, Kings Buildings, Edinburgh EH9 3JZ, U.K. 1; PRiSM, Université de Versailles, 45, Av. des Etats-Unis, 78035 Versailles Cedex, France 2
Keyword(s) active networks, agent systems, electronic commerce, load balancing, stochastic process algebra,
Abstract
The standard design of on-line auction systems places most of the computational load on the server and its adjacent links, resulting in a bottleneck in the system. In this paper, we investigate the impact, in terms of the performance of the server and its adjacent links, of introducing active nodes into the network. The performance study of the system is done using the stochastic process algebra formalism PEPA. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE547

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Fault-tolerant holonic manufacturing systems
Volume ID 13
Issue ID 1
Date Jan 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.547
Article ID CPE547
Author Name(s) Martyn Fletcher 1, S. Misbah Deen 2
Author Email(s) deen@cs.keele.ac.uk 2
Affiliation(s) Data and Knowledge Engineering Group, Department of Computer Science, Keele University, Keele, Staffordshire ST5 5BG, U.K. 1 2
Keyword(s) fault-tolerance, agent-based manufacturing, holonic manufacturing,
Abstract
This paper presents a model of fault-tolerant holonic manufacturing systems (HMS) where each holon's activities are controlled by an intelligent software agent. Multiple agents schedule actions, resolve conflicts and manage information to produce, transport, assemble, inspect and store customized products. Our model provides robustness and distribution transparency across a shop-floor where unpredictable failures occur with machines, control software and communication networks. Each autonomous holon is composed of a hierarchy of large-grain functional components where interaction is carried out by user-defined cooperation strategies. These strategies enable holons to coordinate their behaviour through exchanging messages and sensing/actuating of their shared environment. Therefore, holonic agents can select suitable rescheduling and recovery mechanisms to tolerate faults and keep the manufacturing system working. We also propose how the IEC 1499 standard (Function Block Architecture) for distributed control systems could be used to implement our model. The model presented here is a crystallization of some abstract concepts from a generic cooperating agent system, with suitable extensions to meet the criteria of the ongoing HMS project. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE548

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Which paradigm should I use? An analytical comparison of the client-server, remote evaluation and mobile agent paradigms
Volume ID 13
Issue ID 1
Date Jan 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.548
Article ID CPE548
Author Name(s) Antonio Puliafito 1, Salvatore Riccobene 2, Marco Scarpa 3
Author Email(s) apulia@ingegneria.unime.it 1
Affiliation(s) Dipartimento di Matematica - Contrada Papardo, S. Speroni, 98166 Messina, Italy 1; Dipartimento di Matematica - V. le A. Doria 6, 95125, Catania, Italy 2; Istituto di Informatica e Telecomunicazioni - V. le A. Doria 6, 95125, Catania, Italy 3
Keyword(s) client-server, remote evaluation, mobile agents, performance analysis, Petri nets,
Abstract
In this paper we study the actual convenience of using the agent programming paradigm for accessing distributed services. We point out the benefits of this communication paradigm by providing an analytical study of its basic features in comparison with the client-server approach and remote evaluation. The aim of the paper is to show how Petri net analysis techniques can be used to decide whether to adopt the traditional client/server, remote evaluation or mobile agent paradigm when designing a particular application. We present several models of non-Markovian Petri nets, which have been solved through the WebSPN tool, and we provide a close comparison between the mobile agent, client-server and remote evaluation communication paradigms. The results that we have obtained show that agents must not always be considered the only solution to any communication issue, since in several cases their use may even prove a drawback. We also focus our attention on providing some practical remarks, which can help the developer select the communication paradigm that best suits the features of the application to be developed. Copyright © 2001 John Wiley & Sons, Ltd.
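As a much simpler illustration of the trade-off such an analysis quantifies, the sketch below compares bytes moved under the two paradigms with a back-of-envelope byte-count model. This is not the paper's non-Markovian Petri net approach, and all sizes are hypothetical.

```python
def client_server_traffic(n_queries, req_bytes, resp_bytes):
    """Bytes crossing the network for n request/response interactions
    (illustrative model, not the paper's WebSPN Petri net analysis)."""
    return n_queries * (req_bytes + resp_bytes)

def mobile_agent_traffic(code_bytes, state_bytes, result_bytes):
    """One agent migration to the server plus the trip home with results."""
    return (code_bytes + state_bytes) + (state_bytes + result_bytes)

# Agents pay a one-off migration cost, so they win only when they replace
# many chatty remote interactions (all numbers are hypothetical):
cs = client_server_traffic(100, 200, 400)    # 100 round trips
ma = mobile_agent_traffic(20000, 1000, 400)  # one migration, one return
```

Consistent with the paper's conclusion, reversing the assumptions (few interactions, large agent code) makes the agent the worse choice under this toy model.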

Article ID: CPE582

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Editorial
Article Title Editorial: Concurrency and Computation: Practice and Experience
Volume ID 13
Issue ID 1
Date Jan 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.582
Article ID CPE582
Author Name(s) Geoffrey C. Fox 1, Anthony J. G. Hey 2
Author Email(s)
Affiliation(s) 1 2
Keyword(s)
Abstract
No abstract

Article ID: CPE539

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title A parallel Viterbi decoding algorithm
Volume ID 13
Issue ID 2
Date Feb 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.539
Article ID CPE539
Author Name(s) J. S. Reeve 1
Author Email(s) jsr@ecs.soton.ac.uk 1
Affiliation(s) Department of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, U.K. 1
Keyword(s) trellis decoding, Viterbi decoding, BCH codes,
Abstract
In this paper we express the Viterbi algorithm as a matrix-vector reduction in which multiplication is replaced by addition and addition by minimization. The resulting algorithm is then readily parallelized in a form suitable for implementation on a systolic processor array. We describe the algorithm for Bose-Chaudhuri-Hocquenghem (BCH) codes which have a task graph with its valence restricted to four inputs and four outputs. The method is also applicable to convolution codes, but the complexity of the task graph increases with the number of input bits for these codes. Results for BCH codes are given for two general purpose parallel machines, an IBM SP2 and a Meiko CS2. Copyright © 2001 John Wiley & Sons, Ltd.
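The matrix-vector reduction described above, with multiplication replaced by addition and addition by minimization, can be sketched as a single trellis stage in the min-plus ("tropical") semiring. The two-state trellis and branch costs below are illustrative assumptions, not the paper's BCH setup.

```python
import numpy as np

def minplus_step(metrics, branch_costs):
    """One Viterbi trellis stage as a min-plus matrix-vector product:
    ordinary multiply -> addition, ordinary add -> minimization."""
    # metrics: current path metric for each state (length S)
    # branch_costs[i, j]: cost of the transition from state j to state i
    # (np.inf marks transitions absent from the trellis)
    return np.min(branch_costs + metrics[None, :], axis=1)

# Toy 2-state example (hypothetical costs, not a BCH code trellis)
costs = np.array([[1.0, 4.0],
                  [np.inf, 2.0]])
m = minplus_step(np.array([0.0, 3.0]), costs)
```

In the parallel formulation, each state's minimization is independent, which is what makes the stage amenable to a systolic processor array.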

Article ID: CPE549

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Emmerald: a fast matrix-matrix multiply using Intel's SSE instructions
Volume ID 13
Issue ID 2
Date Feb 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.549
Article ID CPE549
Author Name(s) Douglas Aberdeen 1, Jonathan Baxter 2
Author Email(s) daa@csl.anu.edu.au 1
Affiliation(s) Research School of Information Sciences and Engineering, Australian National University, Canberra, Australia 1 2
Keyword(s) GEMM, SIMD, SSE, matrix multiply, deep memory hierarchy,
Abstract
Generalized matrix-matrix multiplication forms the kernel of many mathematical algorithms, hence a faster matrix-matrix multiply immediately benefits these algorithms. In this paper we implement efficient matrix multiplication for large matrices using the Intel Pentium single instruction multiple data (SIMD) floating point architecture. The main difficulty with the Pentium and other commodity processors is the need to utilize the cache hierarchy efficiently, particularly given the growing gap between main-memory and CPU clock speeds. We give a detailed description of the register allocation and Level 1 and Level 2 cache blocking strategies that yield the best performance for the Pentium III family. Our results demonstrate performance averaging 2.09 times faster than the leading public domain matrix-matrix multiply routines, and comparable performance with Intel's SIMD small matrix-matrix multiply routines. Copyright © 2001 John Wiley & Sons, Ltd.
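The cache-blocking idea can be sketched in a few lines: compute C = A·B tile by tile so that each tile pair is reused while it is resident in a fast cache level. This is only a minimal illustration of the blocking pattern; the block size is an arbitrary assumption, and the paper's actual kernel also covers register allocation and SSE instruction selection.

```python
import numpy as np

def blocked_matmul(A, B, bs=32):
    """Cache-blocked GEMM sketch: accumulate C = A @ B in bs x bs tiles
    so each tile of A and B stays cache-resident while it is reused."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m))
    for i in range(0, n, bs):
        for j in range(0, m, bs):
            for p in range(0, k, bs):
                # one tile-pair contribution to the (i, j) tile of C
                C[i:i+bs, j:j+bs] += A[i:i+bs, p:p+bs] @ B[p:p+bs, j:j+bs]
    return C
```

The loop order and tile shapes are precisely the tuning parameters that a production kernel chooses per cache level.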

Article ID: CPE550

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Experiences from integrating algorithmic and systemic load balancing strategies
Volume ID 13
Issue ID 2
Date Feb 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.550
Article ID CPE550
Author Name(s) Ioana Banicescu 1, Sheikh Ghafoor 2, Vijay Velusamy 3, Samuel H. Russ 4, Mark Bilderback 5
Author Email(s) ioana@erc.msstate.edu 1
Affiliation(s) Mississippi State University, Department of Computer Science, NSF Engineering Research Center for Computational Field Simulation, MS 39762, U.S.A. 1 2 3 4 5
Keyword(s) load balancing, dynamic scheduling, resource management, parallel applications, distributed runtime environment,
Abstract
Load balancing increases the efficient use of existing resources for parallel and distributed applications. At a coarse level of granularity, advances in runtime systems for parallel programs have been proposed in order to control available resources as efficiently as possible by utilizing idle resources and using task migration. Simultaneously, at a finer granularity level, advances in algorithmic strategies for dynamically balancing computational loads by data redistribution have been proposed in order to respond to variations in processor performance during the execution of a given parallel application. Combining strategies from each level of granularity can result in a system which delivers advantages of both. The resulting integration is systemic in nature and transfers the responsibility of efficient resource utilization from the application programmer to the runtime system. This paper presents the design and implementation of a system that combines an algorithmic fine-grained data parallel load balancing strategy with a systemic coarse-grained task-parallel load balancing strategy, and reports on recent experimental results of running a computationally intensive scientific application under this integrated system. The experimental results indicate that a distributed runtime environment which combines both task and data migration can provide performance advantages with little overhead. It also presents proposals for performance enhancements of the implementation, as well as future explorations for effective resource management. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE551

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Analysis and measurement of the effect of kernel locks in SMP systems
Volume ID 13
Issue ID 2
Date Feb 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.551
Article ID CPE551
Author Name(s) Akihiro Kaieda 1, Yasuichi Nakayama 2, Atsuhiro Tanaka 3, Takashi Horikawa 4, Toshiyasu Kurasugi 5, Issei Kino 6
Author Email(s) yasu@cs.uec.ac.jp 2
Affiliation(s) Department of Computer Science, The University of Electro-Communications, Chofu, Tokyo 182-8585, Japan 1 2 NEC Corporation, Kawasaki 216-8555, Japan 3 4 5 6
Keyword(s) SMP systems, operating systems, parallel programs, performance evaluation, kernel lock,
Abstract
This article reports the use of case studies to evaluate the performance degradation caused by kernel-level locks. We define the lock ratio as the ratio of the execution time spent in critical sections to the total execution time of a parallel program. The kernel-level lock ratio determines how effectively programs run on symmetric multiprocessor (SMP) systems. We have measured the lock ratios and the performance of three types of parallel programs on SMP systems running Linux 2.0: matrix multiplication, parallel make, and WWW server programs. Experimental results show that the higher the lock ratio of parallel programs, the worse their performance becomes. Copyright © 2001 John Wiley & Sons, Ltd.
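The lock ratio defined above is a simple quotient; the sketch below computes it and pairs it with an Amdahl-style speedup bound that treats the locked fraction as fully serialized. The bound is an illustrative model of why a high lock ratio hurts SMP performance, not the paper's measurement methodology, and the timings are hypothetical.

```python
def lock_ratio(t_critical, t_total):
    """The paper's lock ratio: time spent in kernel critical sections
    divided by the total execution time of the parallel program."""
    return t_critical / t_total

def amdahl_bound(f, p):
    """Upper bound on speedup over p processors if the fraction f of the
    run is serialized by the kernel lock (illustrative Amdahl model)."""
    return 1.0 / (f + (1.0 - f) / p)

f = lock_ratio(0.3, 2.0)    # 0.3 s under the kernel lock in a 2.0 s run
bound = amdahl_bound(f, 4)  # best-case speedup on a 4-way SMP
```

A lock ratio of 0.15 already caps a 4-way SMP below 3x speedup under this model, matching the paper's qualitative finding that performance degrades as the ratio grows.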

Article ID: CPE552

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Parallel solvers for discrete-time algebraic Riccati equations
Volume ID 13
Issue ID 2
Date Feb 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.552
Article ID CPE552
Author Name(s) Rafael Mayo 1, Enrique S. Quintana-Ortí 2, Gregorio Quintana-Ortí 3, Vicente Hernández 4
Author Email(s) quintana@inf.uji.es 2
Affiliation(s) Depto. de Informática, Universidad Jaume I, 12080-Castellón, Spain 1 2 3 Depto. de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, 46071-Valencia, Spain 4
Keyword(s) algebraic Riccati equations, Stein equations, linear control systems, parallel algorithms and architectures,
Abstract
We investigate the numerical solution of discrete-time algebraic Riccati equations on a parallel distributed architecture. Our solvers obtain an initial solution of the Riccati equation via the disc function method, and then refine this solution using Newton's method. The Smith iteration is employed to solve the Stein equation that arises at each step of Newton's method. The numerical experiments on an Intel Pentium-II cluster, connected via a Myrinet switch, report the performance and scalability of the new algorithms. Copyright © 2001 John Wiley & Sons, Ltd.
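The Smith iteration mentioned above can be sketched for a Stein equation of the form X = AᵀXA + Q: the squared variant doubles the number of series terms accumulated per step by repeatedly squaring A. This is a minimal serial sketch of the inner solver (assuming the spectral radius of A is below one), not the paper's distributed implementation.

```python
import numpy as np

def smith_stein(A, Q, iters=30):
    """Squared Smith iteration for the Stein equation X = A.T @ X @ A + Q.
    Accumulates the series X = sum_j (A.T)^j Q A^j, doubling the number of
    summed terms each step by squaring A. Converges when rho(A) < 1."""
    X, Ak = Q.copy(), A.copy()
    for _ in range(iters):
        X = X + Ak.T @ X @ Ak  # fold in the next block of series terms
        Ak = Ak @ Ak           # square A: each step doubles the terms summed
    return X
```

In the paper's setting this solve is the kernel invoked at every Newton step, so its cost dominates the refinement phase.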

Article ID: CPE554

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title A parallel Navier-Stokes solver for the rotating flow problem
Volume ID 13
Issue ID 3
Date March 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.554
Article ID CPE554
Author Name(s) Rudnei Dias da Cunha 1, Álvaro Luiz de Bortoli 2
Author Email(s) rudnei@mat.ufrgs.br 1
Affiliation(s) Postgraduate Programme on Applied Mathematics, Institute of Mathematics, Federal University of Rio Grande do Sul, Brazil 1 2
Keyword(s) parallel computing, message-passing, computational fluid dynamics,
Abstract
In this paper, we investigate the parallel solution of rotating internal flow problems, using the Navier-Stokes equations as proposed by Speziale and Thangam (1983) and Speziale (1985). A Runge-Kutta time-stepping scheme was applied to the equations and both sequential and message-passing implementations were developed, the latter using MPI, and were tested on a four-processor SGI Origin200 distributed, global shared memory parallel computer and on a 32-processor IBM 9076 SP/2 distributed memory parallel computer. The results show that our approach to parallelizing the sequential implementation requires little effort whilst providing good results even for medium-sized problems. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE563

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Using knowledge-based systems for research on parallelizing compilers
Volume ID 13
Issue ID 3
Date March 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.563
Article ID CPE563
Author Name(s) Chao-Tung Yang 1, Shian-Shyong Tseng 2, Yun-Woei Fann 3, Ting-Ku Tsai 4, Ming-Huei Hsieh 5, Cheng-Tien Wu 6
Author Email(s) ctyang@nspo.gov.tw 1
Affiliation(s) Ground System Section, National Space Program Office, 8F, No. 9 Prosperity 1st Road, Science-Based Industrial Park, Hsinchu, Taiwan 300, ROC 1; Department of Computer & Information Science, National Chiao Tung University, Hsinchu, Taiwan 300, ROC 2 3 4 5 6
Keyword(s) parallelizing compiler, knowledge-based system, loop parallelization, multithreaded OS, program restructuring,
Abstract
The main function of parallelizing compilers is to analyze sequential programs, in particular the loop structure, to detect hidden parallelism and automatically restructure sequential programs into parallel subtasks that are executed on a multiprocessor. This article describes the design and implementation of an efficient parallelizing compiler to parallelize loops and achieve high speedup rates on multiprocessor systems. It is well known that the execution efficiency of a loop can be enhanced if the loop is executed in parallel or partially parallel, such as in a DOALL or DOACROSS loop. This article also reviews a practical parallel loop detector (PPD), implemented in our PFPC, for finding the parallelism in loops. The PPD can extract the potential DOALL and DOACROSS loops in a program by verifying array subscripts. In addition, a new model using a knowledge-based approach is proposed to exploit more loop parallelism. The knowledge-based approach integrates existing loop transformations and loop scheduling algorithms to make good use of their ability to extract loop parallelism. Two rule-based systems, called KPLT and IPLS, are then developed using repertory grid analysis and attribute-ordering tables, respectively, to construct the knowledge bases. These systems can choose an appropriate transform and loop schedule, and then apply the resulting methods to perform loop parallelization and obtain a high speedup rate. For example, the IPLS system can choose an appropriate loop schedule for running on multiprocessor systems. Finally, a runtime technique based on the inspector/executor scheme is proposed for finding available parallelism in loops. Our inspector can determine the wavefronts of a loop with any complex indirect array-indexing pattern by building a DEF-USE table. The inspector is fully parallel, without any synchronization.
Experimental results show that the new method can resolve complex data dependence patterns that no previous research could. One of the ultimate goals is to construct a high-performance and portable FORTRAN parallelizing compiler for shared-memory multiprocessors. We believe that our research may provide more insight into the development of high-performance parallelizing compilers. Copyright © 2001 John Wiley & Sons, Ltd.
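The inspector's wavefront computation can be sketched as follows. This is a conservative serial toy version that treats any repeated access to an array element as a dependence; the paper's DEF-USE table distinguishes definitions from uses, and its inspector itself runs in parallel.

```python
def wavefronts(writes, reads):
    """Inspector sketch for a loop executing a[writes[i]] = f(a[reads[i]]):
    assign each iteration i to a wavefront so that iterations in the same
    wavefront touch disjoint elements and can safely run in parallel.
    Conservative: any repeated access counts as a dependence."""
    last_touch = {}  # array element -> wavefront of its most recent access
    wf = []
    for w, r in zip(writes, reads):
        level = max(last_touch.get(w, 0), last_touch.get(r, 0)) + 1
        wf.append(level)
        last_touch[w] = level
        last_touch[r] = level
    return wf
```

Independent iterations all land in wavefront 1, while a chained indirection pattern such as a[i+1] = f(a[i]) serializes into one wavefront per iteration.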

Article ID: CPE564

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Redistribution strategies for portable parallel FFT: a case study
Volume ID 13
Issue ID 3
Date March 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.564
Article ID CPE564
Author Name(s) Anshu Dubey 1, Daniele Tessera 2
Author Email(s) dubey@tagore.uchicago.edu 1
Affiliation(s) Astronomy & Astrophysics, University of Chicago, 5640 S. Ellis Ave., Chicago, IL 60637, U.S.A. 1; Dipartimento di Informatica e Sistemistica, Università degli Studi di Pavia, I-27100 Pavia, Italy 2
Keyword(s) FFT, distributed transpose, MPI, communication policy,
Abstract
The best approach to parallelize multidimensional FFT algorithms has long been under debate. Distributed transposes are widely used, but they also vary in communication policies and hence performance. In this work we analyze the impact of different redistribution strategies on the performance of parallel FFT, on various machine architectures. We found that some redistribution strategies were consistently superior, while some others were unexpectedly inferior. An in-depth investigation into the reasons for this behavior is included in this work. Copyright © 2001 John Wiley & Sons, Ltd.
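The transpose-based decomposition discussed above follows a fixed pattern: 1-D FFTs along locally contiguous rows, a transpose (which becomes the all-to-all redistribution whose policy the paper studies), then 1-D FFTs along the other dimension. A minimal serial sketch of that pattern for the 2-D case:

```python
import numpy as np

def fft2_via_transpose(a):
    """Transpose-based 2-D FFT: row FFTs, a transpose (the step that turns
    into an all-to-all redistribution on a distributed-memory machine),
    row FFTs again, then a transpose back to the original layout."""
    step1 = np.fft.fft(a, axis=1)        # local 1-D FFTs along rows
    step2 = np.fft.fft(step1.T, axis=1)  # transpose, then row FFTs again
    return step2.T                       # restore the original data layout
```

Because both FFT phases are purely local, the communication policy of the transpose is the only tunable that differs between implementations, which is why redistribution strategy dominates parallel FFT performance.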

Article ID: CPE565

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title A distributed implementation of a virtual machine for Java
Volume ID 13
Issue ID 3
Date March 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.565
Article ID CPE565
Author Name(s) Yariv Aridor 1, Michael Factor 2, Avi Teperman 3
Author Email(s) teperman@il.ibm.com 3
Affiliation(s) IBM Haifa Research Laboratory, Matam, Advanced Technology Center, Haifa 31905, Israel 1 2 3
Keyword(s) cluster machine, Java, objects, Java Virtual Machine, single system image,
Abstract
The cluster virtual machine (VM) for Java provides a single system image of a traditional Java Virtual Machine (JVM) while executing in a distributed fashion on the nodes of a cluster. The cluster VM for Java virtualizes the cluster, supporting any pure Java application without requiring that the application be tailored specifically for it. The aim of our cluster VM is to obtain improved scalability for a class of Java Server Applications by distributing the application's work among the cluster's computing resources. The implementation of the cluster VM for Java is based on a novel object model which distinguishes between an application's view of an object (e.g. every object is a unique data structure) and its implementation (e.g. objects may have consistent replications on different nodes). This enables us to exploit knowledge on the use of individual objects to improve performance (e.g. using object replications to increase locality of access to objects). We have already completed a prototype that runs pure Java applications on a cluster of NT workstations connected by a Myrinet fast switch. The prototype provides a single system image to applications, distributing the application's threads and objects over the cluster. We used the cluster VM to run, without change, a real Java Server Application of over 10 Kloc (kilo lines of source code) and achieved high scalability for it on a cluster. We also achieved linear speedup for another application with a large number of independent threads. This paper discusses the architecture and implementation of the cluster VM. It focuses on achieving a single system image for a traditional JVM on a cluster while describing, in short, how we aim to obtain scalability. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE556

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Object-oriented analysis and design of the Message Passing Interface
Volume ID 13
Issue ID 4
Date Apr 10 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.556
Article ID CPE556
Author Name(s) Anthony Skjellum 1, Diane G. Wooley 2, Ziyang Lu 3, Michael Wolf 4, Purushotham V. Bangalore 5, Andrew Lumsdaine 6, Jeffrey M. Squyres 7, Brian McCandless 8
Author Email(s) tony@cs.msstate.edu 1
Affiliation(s) Engineering Research Center and Department of Computer Science, Mississippi State University, Mississippi State, MS 39762, U.S.A. 1 2 3 4 5 Laboratory for Scientific Computing, Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, U.S.A. 6 7 8
Keyword(s) message passing, parallel programming, object-oriented application programmer interface, design patterns, MPI-1, MPI-2,
Abstract
The major contribution of this paper is the application of modern analysis techniques to the important Message Passing Interface standard, work done in order to obtain information useful in designing both application programmer interfaces for object-oriented languages and message passing systems. Recognition of ‘Design Patterns’ within MPI is an important discernment of this work. A further contribution is a comparative discussion of the design and evolution of three actual object-oriented designs for the Message Passing Interface (MPI-1) application programmer interface (API), two of which have influenced the standardization of C++ explicit parallel programming with MPI-2, and which strongly indicate the value of a priori object-oriented design and analysis of such APIs. Knowledge of design patterns is assumed herein. Discussion provided here includes systems developed at Mississippi State University (MPI++), the University of Notre Dame (OOMPI), and the merger of these systems that results in a standard binding within the MPI-2 standard. Commentary concerning additional opportunities for further object-oriented analysis and design of message passing systems and APIs, such as MPI-2 and MPI/RT, is provided in conclusion. The connection of modern software design and engineering principles to high-performance computing programming approaches is a further important contribution of this work. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE570

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Object Systems Paper
Article Title Strong types for coordinating active objects
Volume ID 13
Issue ID 4
Date Apr 10 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.570
Article ID CPE570
Author Name(s) Franz Puntigam 1
Author Email(s) franz@complang.tuwien.ac.at 1
Affiliation(s) Technische Universität Wien, Institut für Computersprachen, Argentinierstraße 8, A-1040 Vienna, Austria 1
Keyword(s) type systems, concurrent programming languages, active objects,
Abstract
An object type is usually regarded as a contract between an object and each of its clients. However, in concurrent (and sometimes also in sequential) systems it is more useful to regard a type as a contract between an object and the set of all clients: when the object acts as a shared resource, the clients must be coordinated before sending messages to the object. Process types of active objects specify constraints on the ordering of messages. Static type checking ensures proper coordination of clients so that objects receive messages only in acceptable orderings. As presented in this article, the process type system is based on a simple computation model where active objects communicate via asynchronous message passing. Copyright © 2001 John Wiley & Sons, Ltd.
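The idea of a type constraining the ordering of messages can be sketched as a small automaton over message names. This is only an illustration of the concept with hypothetical states and messages; crucially, the paper's process type system enforces these orderings by static type checking, whereas the sketch checks at run time.

```python
# Sketch: a process type as an automaton over message names. A client may
# only send a message that the current type state accepts (illustrative
# run-time check; the paper's system verifies this statically).
class ProcessType:
    def __init__(self, transitions, state):
        self.transitions = transitions  # {state: {message: next_state}}
        self.state = state

    def send(self, message):
        allowed = self.transitions[self.state]
        if message not in allowed:
            raise TypeError(f"{message!r} not acceptable in state {self.state!r}")
        self.state = allowed[message]

# A hypothetical file-like active object: open must precede read; close ends use.
ftype = ProcessType({"closed": {"open": "opened"},
                     "opened": {"read": "opened", "close": "closed"}},
                    "closed")
ftype.send("open")
ftype.send("read")
ftype.send("close")
```

Viewing the type as a contract with the set of all clients means every client's sends must jointly respect this automaton, which is what coordinates concurrent clients of a shared object.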

Article ID: CPE555

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Algorithmic support for model transformation in object-oriented software development
Volume ID 13
Issue ID 5
Date Apr 25 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.555
Article ID CPE555
Author Name(s) Siegfried Schönberger 1, Rudolf K. Keller 2, Ismail Khriss 3
Author Email(s) schoenbe@iro.umontreal.ca 1
Affiliation(s) Département d'informatique et de recherche opérationelle, Université de Montréal, C.P. 6128, succursale Centre-ville, Montréal, Québec H3C 3JZ, Canada 1 2 3
Keyword(s) object-oriented analysis and design, model transformation, transformation algorithm, Unified Modeling Language (UML), collaboration diagram, state transition diagram,
Abstract
Current methods for object-oriented software development provide notation for the specification of models, yet do not sufficiently relate the different model types to each other, nor do they provide support for transformations from one model type to another. This makes transformations a manual activity, which increases the risk of inconsistencies among models and may lead to a loss of information. We have developed and implemented an algorithm supporting one of the transitions from analysis to design, the transformation of scenario models into behavior models. This algorithm supports the Unified Modeling Language (UML), mapping the UML's collaboration diagrams into state transition diagrams. We believe that CASE tools implementing such algorithms will be highly beneficial in object-oriented software development. In this paper, we provide an overview of our algorithm and discuss all its major steps. The algorithm is detailed in semi-formal English and illustrated with a number of examples. Furthermore, the algorithm is assessed from different perspectives, such as scope and role in the overall development process, issues in the design of the algorithm, complexity, implementation and experimentation, and related work. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE569

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title A new parallel domain decomposition method for the adaptive finite element solution of elliptic partial differential equations
Volume ID 13
Issue ID 5
Date Apr 25 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.569
Article ID CPE569
Author Name(s) Randolph E. Bank 1, Peter K. Jimack 2
Author Email(s)
Affiliation(s) Department of Mathematics, University of California at San Diego, La Jolla, CA 92093, U.S.A. 1; School of Computing, University of Leeds, Leeds LS2 9JT, U.K. 2
Keyword(s) partial differential equations, parallel computing, domain decomposition, mesh adaptivity, finite element method,
Abstract
We present a new domain decomposition algorithm for the parallel finite element solution of elliptic partial differential equations. As with most parallel domain decomposition methods each processor is assigned one or more subdomains and an iteration is devised which allows the processors to solve their own subproblem(s) concurrently. The novel feature of this algorithm however is that each of these subproblems is defined over the entire domain, although the vast majority of the degrees of freedom for each subproblem are associated with a single subdomain (owned by the corresponding processor). This ensures that a global mechanism is contained within each of the subproblems tackled and so no separate coarse grid solve is required in order to achieve rapid convergence of the overall iteration. Furthermore, by following the paradigm introduced in [15], it is demonstrated that this domain decomposition solver may be coupled easily with a conventional mesh refinement code, thus allowing the accuracy, reliability and efficiency of mesh adaptivity to be utilized in a well load-balanced manner. Finally, numerical evidence is presented which suggests that this technique has significant potential, both in terms of the rapid convergence properties and the efficiency of the parallel implementation. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE571

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Modeling and testing object-oriented distributed systems with linear-time temporal logic
Volume ID 13
Issue ID 5
Date Apr 25 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.571
Article ID CPE571
Author Name(s) F. Dietrich1X. Logean2J.-P. Hubaux3
Author Email(s) falk@sslab.sony.co.jp1
Affiliation(s) Institute for Computer Communications and Applications (ICA), Swiss Federal Institute of Technology, CH-1015 Lausanne, Switzerland 1 2 3
Keyword(s) formal model, event-based behavioral abstraction, temporal logic,
Abstract
We present a framework for constructing formal models of object-oriented distributed systems and a property language to express behavioral constraints in such models. Most of the existing models have their origin in specific mathematical notations and/or concepts. In contrast, we have developed our model such that it accounts for a large set of phenomena associated with industrial implementations of object-oriented distributed systems. The model that we propose, while closer to industrial concerns and practice, still has the powerful features of formal approaches. It also offers the possibility to automatically check at service run-time that the final service implementation has not violated and is not violating properties expressed at the abstraction level of our model. In our model, which relies on event-based behavioral abstraction, we use linear-time temporal logic as the underlying formalism for the specification of properties. We introduce two novel operators which are especially useful for object-oriented systems and which provide a number of advantages over the well-known temporal logic operators. A recent decision of one of our industrial partners to adopt our proposal into one of their development platforms can be seen as strong evidence of the relevance of our work and as a promising step towards a better understanding between the academic formal methods community and industry. Copyright © 2001 John Wiley & Sons, Ltd.
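The event-based checking the abstract describes can be illustrated with a minimal sketch: linear-time temporal operators evaluated over the finite trace of events observed so far at run-time. The operators and names below are generic illustrations of the standard "always"/"eventually" operators, not the paper's two novel ones.

```java
import java.util.List;
import java.util.function.Predicate;

public class LtlDemo {
    // "always p": property p holds at every event of the trace seen so far.
    public static <E> boolean always(List<E> trace, Predicate<E> p) {
        return trace.stream().allMatch(p);
    }

    // "eventually p": property p holds at some event of the trace.
    public static <E> boolean eventually(List<E> trace, Predicate<E> p) {
        return trace.stream().anyMatch(p);
    }

    public static void main(String[] args) {
        // A finite trace of observed service events.
        List<String> trace = List.of("open", "read", "close");
        System.out.println(always(trace, e -> !e.equals("error"))); // no violation observed
        System.out.println(eventually(trace, "close"::equals));     // liveness witnessed
    }
}
```

A run-time checker in this style only ever sees a finite prefix of the execution, which is why the paper distinguishes "has not violated" from "is not violating" a property.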

Article ID: CPE583

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Feature-oriented programming: A new way of object composition [A preliminary version of this article appeared under the title ‘Feature-Oriented Programming: A Fresh Look at Objects’ at ECOOP 1997.]
Volume ID 13
Issue ID 6
Date May 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.583
Article ID CPE583
Author Name(s) Christian Prehofer1
Author Email(s) Christian.Prehofer@mch.siemens.de1
Affiliation(s) TU München, 80290 Munich, Germany 1
Keyword(s) feature model, Java, object composition,
Abstract
We propose a new model for flexible composition of objects from a set of features. Features are services of an object and are similar to classes in object-oriented languages. In many cases, features have to be adapted in the presence of other features, which is also called the feature interaction problem. We introduce explicit interaction handlers which can adapt features to other features by overriding methods. When features are composed, the appropriate interaction handling is added in a way which generalizes inheritance and aggregation. For a set of features, an exponential number of different feature combinations is possible, based on a quadratic number of interaction resolutions. We present the feature model as an extension of Java and give two translations to Java, one via inheritance and the other via aggregation. We show that the feature model interacts nicely with several common language extensions such as type parameters, exceptions, and higher-order functions. Copyright © 2001 John Wiley & Sons, Ltd.
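The composition mechanism can be sketched via the inheritance translation the abstract mentions: a feature is adapted to another feature by an interaction handler that overrides a method. This is a hypothetical two-feature example with invented names, not the paper's Java extension.

```java
// Base feature: a simple counter service.
class Counter {
    protected int value = 0;
    public void inc() { value++; }
    public int value() { return value; }
}

// Second feature (an upper bound) composed onto Counter via inheritance.
// Its overriding inc() plays the role of an interaction handler: it adapts
// the Counter feature to the presence of the Bound feature.
class Bound extends Counter {
    protected final int max;
    Bound(int max) { this.max = max; }
    @Override public void inc() { if (value < max) super.inc(); }
}

public class FeatureDemo {
    public static int run() {
        Bound c = new Bound(2);
        c.inc(); c.inc(); c.inc(); // third increment suppressed by the handler
        return c.value();
    }
    public static void main(String[] args) { System.out.println(run()); }
}
```

With n such features, any subset can be stacked in this way, which is the source of the exponentially many combinations resolved by quadratically many pairwise interaction handlers.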

Article ID: CPE584

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Effective multicast programming in large scale distributed systems
Volume ID 13
Issue ID 6
Date May 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.584
Article ID CPE584
Author Name(s) Patrick Th. Eugster1Romain Boichat2Rachid Guerraoui3Joe Sventek4
Author Email(s) Patrick.Eugster@epfl.ch1
Affiliation(s) Swiss Federal Institute of Technology, Lausanne 1 2 3 Agilent Laboratories Scotland, Edinburgh, U.K. 4
Keyword(s) concurrency, scalability, reliability, multicast, membership, partitions,
Abstract
Many distributed applications have a strong requirement for efficient dissemination of large amounts of information to widely spread consumers in large networks. These include applications in e-commerce and telecommunication. Publish/subscribe is considered one of the most important interaction styles with which to model communication on a large scale. Producers publish information on a topic and consumers subscribe to the topics they wish to be informed of. The decoupling of producers and consumers in time, space, and flow makes the publish/subscribe paradigm very attractive for large scale distribution, especially in environments like the Internet. This paper describes the architecture and implementation of DACE (Distributed Asynchronous Computing Environment), a framework for publish/subscribe communication based on an object-oriented programming abstraction in the form of Distributed Asynchronous Collection (DAC). DACs capture the variants of publish/subscribe, without blurring their respective advantages. The architecture we present is tolerant of network partitions and crash failures. The underlying model is based on the notion of Topic Membership: a weak membership for the parties involved in a topic. We show how Topic Membership enables the realization of a robust and efficient reliable multicast on a large scale. The protocol ensures that, inside a topic, even a subscriber who is temporarily partitioned away eventually receives a published message. Copyright © 2001 John Wiley & Sons, Ltd.
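The publish/subscribe decoupling the abstract builds on can be sketched in a few lines of Java: producers publish on a topic without knowing the consumers, and consumers register interest per topic. This is a generic in-process topic bus with invented names, not the DACE/DAC API.

```java
import java.util.*;
import java.util.function.Consumer;

public class PubSubDemo {
    // Topic name -> handlers of the consumers subscribed to that topic.
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    // Consumers subscribe to the topics they wish to be informed of.
    public void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    // Producers publish on a topic; they never learn who the consumers are.
    public void publish(String topic, String message) {
        for (Consumer<String> h : subscribers.getOrDefault(topic, List.of())) {
            h.accept(message);
        }
    }

    public static List<String> demo() {
        PubSubDemo bus = new PubSubDemo();
        List<String> received = new ArrayList<>();
        bus.subscribe("quotes", received::add);
        bus.publish("quotes", "IBM 115.2");
        bus.publish("news", "dropped"); // no subscriber on this topic
        return received;
    }

    public static void main(String[] args) { System.out.println(demo()); }
}
```

What DACE adds on top of this basic shape is precisely what a single-process sketch cannot show: distribution, crash and partition tolerance, and the Topic Membership guarantee that a temporarily partitioned subscriber eventually receives published messages.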

Article ID: CPE585

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Implementing a dynamic processor allocation policy for multiprogrammed parallel applications in the Solaris™
Volume ID 13
Issue ID 6
Date May 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.585
Article ID CPE585
Author Name(s) Kelvin K. Yue1David J. Lilja2
Author Email(s) lilja@ece.umn.edu2
Affiliation(s) Sun Microsystems Inc., 901 San Antonio Road, MS MPK17-302, Palo Alto, CA 94303, U.S.A. 1Department of Electrical and Computer Engineering, Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN 55455, U.S.A. 2
Keyword(s) multiprogramming, process scheduling, processor allocation, operating system, parallelizing compiler, shared-memory multiprocessor, performance measurement,
Abstract
Parallel applications typically do not perform well in a multiprogrammed environment that uses time-sharing to allocate processor resources to the applications' parallel threads. Co-scheduling related parallel threads, or statically partitioning the system, often can reduce the applications' execution times, but at the expense of reducing the overall system utilization. To address this problem, there has been increasing interest in dynamically allocating processors to applications based on their resource demands and the dynamically varying system load. The Loop-Level Process Control (LLPC) policy (Yue K, Lilja D. Efficient execution of parallel applications in multiprogrammed multiprocessor systems. 10th International Parallel Processing Symposium, 1996; 448-456) dynamically adjusts the number of threads an application is allowed to execute based on the application's available parallelism and the overall system load. This study demonstrates the feasibility of incorporating the LLPC strategy into an existing commercial operating system and parallelizing compiler and provides further evidence of the performance improvement that is possible using this dynamic allocation strategy. In this implementation, applications are automatically parallelized and enhanced with the appropriate LLPC hooks so that each application interacts with the modified version of the Solaris operating system. The parallelism of the applications is then adjusted automatically when they are executed in a multiprogrammed environment so that all applications obtain a fair share of the total processing resources. Copyright © 2001 John Wiley & Sons, Ltd.
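The core idea, choosing a thread count per parallel loop from the current system load instead of always using every processor, can be sketched as below. The formula is purely illustrative (a fair-share heuristic with invented names), not the LLPC policy's actual decision rule.

```java
public class LlpcDemo {
    // Before entering a parallel loop, ask for only a fair share of the
    // machine given how many runnable jobs currently compete for it.
    public static int threadsFor(int processors, int runnableJobs) {
        int share = processors / Math.max(1, runnableJobs);
        return Math.max(1, share); // always keep at least one thread
    }

    public static void main(String[] args) {
        // Two competing jobs on an 8-CPU machine: use 4 threads for this loop.
        System.out.println(threadsFor(8, 2));
        // Heavily loaded machine: fall back to sequential execution.
        System.out.println(threadsFor(4, 8));
    }
}
```

Because the decision is re-evaluated at each loop, the application's parallelism tracks the varying load without any static partitioning of the system.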

Article ID: CPE562

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Access to SAP's Business Framework from Java-based applications
Volume ID 13
Issue ID 7
Date June 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.562
Article ID CPE562
Author Name(s) M. Aleksy1A. Korthaus2
Author Email(s) korthaus@wifo3.uni-mannheim.de2
Affiliation(s) University of Mannheim, Lehrstuhl für Wirtschaftsinformatik 3, Schloß, D-68131 Mannheim, Germany 1 2
Keyword(s) SAP's Business Framework, Business Objects, interoperability, Java,
Abstract
As the leading vendor of enterprise business standard software, SAP has recognized the need to adapt their R/3 system to current trends in software development and to meet market needs for speed of development, flexibility, openness, and interoperability. In this paper, we first present SAP's approach to object-oriented and component-based technology by describing the Business Framework, the concepts of Business Objects, BAPIs, and the Business Object Repository. On this basis, we then analyze current communication architectures and products enabling the interaction of external Java-based software applications with SAP R/3, point out the advantages and disadvantages of different solutions, and finally elaborate on potential strategies and steps for driving the evolution of SAP R/3 in order to further increase interoperability, openness, and flexibility. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE561

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Experience with using CORBA to implement a file caching coordination system
Volume ID 13
Issue ID 7
Date June 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.561
Article ID CPE561
Author Name(s) A. Sim1H. Nordberg2L. M. Bernardo3A. Shoshani4D. Rotem5
Author Email(s)
Affiliation(s) National Energy Research Scientific Computing Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, U.S.A. 1 2 3 4 5
Keyword(s) CORBA, storage management, mass storage system, file caching,
Abstract
We describe our experience with using several CORBA products to interconnect the software modules of a fairly complex system for file caching coordination from tertiary storage. The application area that the system was designed for is High Energy and Nuclear Physics (HENP). In this application area, the volume of data reaches hundreds of terabytes per year and therefore it is impractical to store them on disk systems. Rather, the data are organized into large files and stored on robotic tape systems that are managed by some Mass Storage System (MSS). The role of the Storage Access Coordination System (STACS) that we developed is to manage the caching of files from the MSS to a large disk cache that is shared by multiple HENP analysis programs. The system design involved multiple components developed by different people at different sites and the modules could potentially be distributed as well. In this paper we describe the architecture and implementation of STACS, emphasizing the inter-module communication requirements. We describe the use of CORBA interfaces between system components, and our experience with using multi-threaded CORBA and using hundreds of concurrent CORBA connections. STACS development was recently completed, and the system is being incorporated into an operational environment that started to produce data in the summer of 2000 [1]. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE557

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Editorial
Article Title Special Issue: Distributed Objects and Applications '99
Volume ID 13
Issue ID 7
Date June 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.557
Article ID CPE557
Author Name(s) Zahir Tari1Robert Meersman2
Author Email(s)
Affiliation(s) RMIT University, Australia 1Free University of Brussels, Belgium 2
Keyword(s)
Abstract
No abstract

Article ID: CPE558

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Evaluating policies and mechanisms to support distributed real-time applications with CORBA
Volume ID 13
Issue ID 7
Date June 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.558
Article ID CPE558
Author Name(s) Carlos O'Ryan1Douglas C. Schmidt2Fred Kuhns3Marina Spivak4Jeff Parsons5Irfan Pyarali6David L. Levine7
Author Email(s) schmidt@uci.edu2
Affiliation(s) Electrical & Computer Engineering Department, University of California, Irvine, CA 92697, U.S.A. 1 2 Department of Computer Science, Washington University, St. Louis, MO 63130, U.S.A. 3 4 5 6 7
Keyword(s) real-time CORBA, patterns and frameworks, distributed and real-time Middleware,
Abstract
To be an effective platform for performance-sensitive real-time systems, commodity-off-the-shelf (COTS) distributed object computing (DOC) middleware must support application quality of service (QoS) requirements end-to-end. However, conventional COTS DOC middleware does not provide this support, which makes it unsuited for applications with stringent latency, determinism, and priority preservation requirements. It is essential, therefore, to develop standards-based, COTS DOC middleware that permits the specification, allocation, and enforcement of application QoS requirements end-to-end. The real-time CORBA and messaging specifications in the CORBA 2.4 standard are important steps towards defining standards-based, COTS DOC middleware that can deliver end-to-end QoS support at multiple levels in distributed and embedded real-time systems. These specifications still lack sufficient detail, however, to portably configure and control processor, communication, and memory resources for applications with stringent QoS requirements. This paper provides four contributions to research on real-time DOC middleware. First, we illustrate how the CORBA 2.4 real-time and messaging specifications provide a starting point to address the needs of an important class of applications with stringent real-time requirements. Second, we illustrate how the CORBA 2.4 specifications are not sufficient to solve all the issues within this application domain. Third, we describe how we have implemented portions of these specifications, as well as several enhancements, using TAO, which is our open-source real-time CORBA ORB. Finally, we evaluate the performance of TAO empirically to illustrate how its features address the QoS requirements for certain classes of real-time applications. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE559

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title The COIL data mediator definition language
Volume ID 13
Issue ID 7
Date June 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.559
Article ID CPE559
Author Name(s) Christian Och1Richard M. Osborne2John Todd3Roger King4
Author Email(s) och@cs.colorado.edu1
Affiliation(s) Database Research Laboratory, Department of Computer Science, University of Colorado at Boulder, Boulder, CO 80308 0430, U.S.A. 1 2 3 4
Keyword(s) data mediators, data source integration and evolution, heterogeneous databases, interoperability, high-level integration semantics,
Abstract
Modern persistent applications typically run on top of numerous (often hundreds of) distributed, heterogeneous databases. Integrating data from various data sources in order to provide uniform, homogeneous, and consistent access to the data from the underlying data sources while preserving the integrity and autonomy of preexisting data sources is a huge problem, especially in evolving environments. The main issue that a global information sharing system has to address is heterogeneity on all levels of the participating data sources: the data might be stored in heterogeneous data sources, using different data models and access/middleware technologies. Another major type of heterogeneity that has to be addressed by a global information sharing system is due to differences in the semantics of the data from the data source participating in the integration process. In particular, the resolution of potential conflicts due to structural and semantic heterogeneity of the integrated data is an essential part of the integration process performed by a data integration environment. The COIL data mediator definition language supports the definition of the integration process and its semantics on a high level of abstraction. The language provides powerful language components which can be used to resolve potential conflicts due to structural and semantic heterogeneity, and to define the integration process in general. The semantics of a specific integration process are defined in a COIL data mediator definition (COIL program). The COIL compiler uses those specifications expressed in the COIL language to generate a standard Java object which captures the semantics of the integration process defined in the COIL mediator definition. The generated Java object is called a COIL data mediator. In general, a COIL data mediator is a user-defined, customized object/component which can be used like any other normal object in the Java development environment. 
A COIL data mediator presents a well-defined interface which may be used to access and configure the mediator at runtime. Therefore it is possible to change the data mediator's behavior and the semantics of the integration process in general at runtime. We present a detailed description of the COIL data mediator definition language and its language components, along with a detailed discussion of how COIL data mediators are used to overcome different heterogeneity problems. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE560

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title A multicast group communication protocol, engine, and bridge for CORBA
Volume ID 13
Issue ID 7
Date June 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.560
Article ID CPE560
Author Name(s) L. E. Moser1P. M. Melliar-Smith2P. Narasimhan3R. R. Koch4K. Berket5
Author Email(s)
Affiliation(s) Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106, U.S.A. 1 2 3 4 5
Keyword(s) multicast group communication, CORBA, fault-tolerant distributed systems,
Abstract
Multicast group communication is needed for fault-tolerant distributed systems and, in particular, for the new Fault Tolerant CORBA standard, to maintain strong replica consistency. However, different multicast group communication protocols are appropriate for different environments, which makes it difficult to define a single standard multicast protocol. In this paper, we present a multicast group communication engine and bridge for CORBA that allows multiple group communication protocols to be used concurrently. We also present the Fault Tolerant Multicast Protocol, a group communication protocol that allows different fault tolerance systems to interoperate. The group communication engine and bridge places Lamport timestamps on messages, and multicasts messages to groups, using one or more group communication protocols. The group communication protocols reliably deliver the timestamped messages in timestamp order to the group communication engine, which integrates these streams of messages into a single stream for delivery in timestamp order. The Fault Tolerant Multicast Protocol operates over IP Multicast, and consists of the Reliable Multicast Protocol which provides reliable source-ordered message delivery, the Reliable Ordered Multicast Protocol which provides reliable totally ordered message delivery, and the Host Group Membership Protocol which provides host group membership services. Copyright © 2001 John Wiley & Sons, Ltd.
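The engine's central step, combining several timestamp-ordered message streams into a single delivery stream, can be sketched as a k-way merge on Lamport timestamps. This is an illustration with invented names, not the paper's implementation.

```java
import java.util.*;

public class MergeDemo {
    public record Msg(long ts, String body) {}

    // Each input stream is already delivered in timestamp order by its own
    // group communication protocol; the engine repeatedly takes the pending
    // message with the smallest Lamport timestamp across all streams.
    public static List<Msg> merge(List<List<Msg>> streams) {
        List<Msg> out = new ArrayList<>();
        int[] pos = new int[streams.size()]; // next undelivered index per stream
        while (true) {
            int best = -1;
            for (int i = 0; i < streams.size(); i++) {
                if (pos[i] < streams.get(i).size()
                        && (best < 0 || streams.get(i).get(pos[i]).ts()
                                        < streams.get(best).get(pos[best]).ts())) {
                    best = i;
                }
            }
            if (best < 0) return out; // every stream drained
            out.add(streams.get(best).get(pos[best]++));
        }
    }

    public static void main(String[] args) {
        List<Msg> a = List.of(new Msg(1, "a1"), new Msg(4, "a4"));
        List<Msg> b = List.of(new Msg(2, "b2"), new Msg(3, "b3"));
        System.out.println(merge(List.of(a, b))); // delivered in timestamp order
    }
}
```

In the real engine the streams are unbounded, so a message can only be delivered once every stream has advanced past its timestamp; the finite-list sketch above elides that blocking condition.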

Article ID: CPE572

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title A Java commodity grid kit
Volume ID 13
Issue ID 8-9
Date July 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.572
Article ID CPE572
Author Name(s) Gregor von Laszewski1Ian Foster2Jarek Gawor3Peter Lane4
Author Email(s) gregor@mcs.anl.gov1
Affiliation(s) Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, U.S.A. 1 2 3 4
Keyword(s) Commodity Grid Toolkits, Java CoG Kit, Computational Grid,
Abstract
Developing advanced applications for the emerging national-scale ‘Computational Grid’ infrastructures is still a difficult task. Although Grid services are available that assist the application developers in authentication, remote access to computers, resource management, and infrastructure discovery, they provide a challenge because these services may not be compatible with the commodity distributed-computing technologies and frameworks used previously. The Commodity Grid project is working to overcome this difficulty by creating what we call Commodity Grid Toolkits (CoG Kits) that define mappings and interfaces between Grid and particular commodity frameworks. In this paper, we explain why CoG Kits are important, describe the design and implementation of a Java CoG Kit, and use examples to illustrate how CoG Kits can enable new approaches to application development based on the integrated use of commodity and Grid technologies. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE573

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Strategies for the efficient exploitation of loop-level parallelism in Java [Preliminary versions of the material presented in this paper appeared in the proceedings of the ACM 2000 Java Grande Conference and The Second Workshop on Java for High Performance Computing (ICS'2000).]
Volume ID 13
Issue ID 8-9
Date July 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.573
Article ID CPE573
Author Name(s) José Oliver1Jordi Guitart2Eduard Ayguadé3Nacho Navarro4Jordi Torres5
Author Email(s) eduard@ac.upc.es3
Affiliation(s) iSOCO, Intelligent Software Components, S.A. 1Computer Architecture Department, Technical University of Catalunya, Barcelona, Spain 2 3 4 5
Keyword(s) loop-level parallelism, Java Threads, program transformations,
Abstract
This paper analyzes the overheads incurred in the exploitation of loop-level parallelism using Java Threads and proposes some code transformations that minimize them. The transformations avoid the intensive use of Java Threads and reduce the number of classes used to specify the parallelism in the application (which reduces the time for class loading). The use of such transformations results in promising performance gains that may encourage the use of Java for exploiting loop-level parallelism in the framework of OpenMP. On average, the execution time for our synthetic benchmarks is reduced by 50% from the simplest transformation when eight threads are used. The paper explores some possible enhancements to the Java threading API oriented towards improving the application-runtime interaction. Copyright © 2001 John Wiley & Sons, Ltd.
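One transformation in this spirit, avoiding the intensive creation of Java Threads by reusing a long-lived worker pool across successive parallel loops, might look like the sketch below. It is a generic illustration with invented names, not the paper's transformed code.

```java
import java.util.*;
import java.util.concurrent.*;

public class LoopDemo {
    // One long-lived worker team, reused by every parallel loop (daemon
    // threads so the JVM can exit without an explicit shutdown).
    static final ExecutorService pool = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    // A parallel reduction loop: iterations are split into chunks and handed
    // to the existing workers instead of spawning fresh Threads per loop.
    public static long parallelSum(long[] a) {
        int chunks = 4, n = a.length;
        List<Future<Long>> parts = new ArrayList<>();
        for (int c = 0; c < chunks; c++) {
            final int lo = c * n / chunks, hi = (c + 1) * n / chunks;
            parts.add(pool.submit(() -> {
                long s = 0;
                for (int i = lo; i < hi; i++) s += a[i];
                return s;
            }));
        }
        long total = 0;
        try {
            for (Future<Long> f : parts) total += f.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        long[] a = new long[1000];
        for (int i = 0; i < a.length; i++) a[i] = i;
        System.out.println(parallelSum(a)); // 499500
    }
}
```

The pool and the submitted task objects are the only classes involved, so class loading and thread start-up costs are paid once rather than on every loop, which is the kind of overhead the paper's transformations target.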

Article ID: CPE574

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Editorial
Article Title Special Issue: ACM 2000 Java Grande Conference
Volume ID 13
Issue ID 8-9
Date July 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.574
Article ID CPE574
Author Name(s) Geoffrey Fox1
Author Email(s)
Affiliation(s) School of Computational Science and Information Technology, Florida State University 1
Keyword(s)
Abstract
No abstract

Article ID: CPE575

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Interceptors for Java Remote Method Invocation
Volume ID 13
Issue ID 8-9
Date July 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.575
Article ID CPE575
Author Name(s) N. Narasimhan1L. E. Moser2P. M. Melliar-Smith3
Author Email(s) nitya@alpha.ece.ucsb.edu1
Affiliation(s) Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106, U.S.A. 1 2 3
Keyword(s) interceptor, Java Remote Method Invocation, Dynamic Proxy, middleware,
Abstract
An interceptor is a software mechanism that provides the hooks that are needed to introduce additional code dynamically into the execution path of an application. By exploiting interceptors, developers can enhance and potentially modify the behavior of an application at runtime without having to revise or recompile the application code. We have identified three distinct interception points for the Java Remote Method Invocation (JavaRMI) model, at the proxy level, the transport level and the shared library level of the JavaRMI model. The interceptors implemented at these interception points employ the DynamicProxy API, the RMISocketFactory API, and a library mediation approach, respectively. Our interceptor implementations are novel in that they are transparent to the application, add nominal latency overheads and are easy to deploy, requiring only minimal modifications to the application. We describe how the interceptors can be exploited to introduce additional services (such as logging and profiling mechanisms) to the JavaRMI runtime. In particular, we describe the use of interceptors in the Aroma System to enhance the existing JavaRMI model with support for fault-tolerance through the consistent replication of JavaRMI objects. Copyright © 2001 John Wiley & Sons, Ltd.
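The proxy-level interception point can be illustrated with the standard Dynamic Proxy API the abstract mentions: extra behavior is injected into each call without revising or recompiling the target. This is a minimal local sketch, not the Aroma System's interceptor.

```java
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class InterceptDemo {
    public interface Greeter { String greet(String name); }

    public static final List<String> log = new ArrayList<>();

    // Wraps any Greeter in a dynamic proxy whose invocation handler records
    // the call (the injected service) before forwarding it to the target.
    public static Greeter intercepted(Greeter target) {
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                (proxy, method, args) -> {
                    log.add("call:" + method.getName()); // e.g. logging/profiling hook
                    return method.invoke(target, args);  // proceed with the real call
                });
    }

    public static void main(String[] args) {
        Greeter g = intercepted(name -> "hello " + name);
        System.out.println(g.greet("rmi")); // the call is logged, then answered
    }
}
```

In the JavaRMI setting the same pattern is applied to the remote stub, so the application keeps calling an ordinary interface while logging, profiling, or replication logic runs on every invocation.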

Article ID: CPE576

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title High-performance file I/O in Java: Existing approaches and bulk I/O extensions
Volume ID 13
Issue ID 8-9
Date July 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.576
Article ID CPE576
Author Name(s) Dan Bonachea1Phillip Dickens2Rajeev Thakur3
Author Email(s) bonachea@cs.berkeley.edu1
Affiliation(s) EECS Department, University of California at Berkeley, Berkeley, CA 94720, U.S.A. 1Dept. of Computer Science, Illinois Institute of Technology, 10 West 31st Street, Chicago, IL 60616, U.S.A. 2Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, U.S.A. 3
Keyword(s) Java, I/O, bulk I/O, high performance, Titanium,
Abstract
There is a growing interest in using Java as the language for developing high-performance computing applications. To be successful in the high-performance computing domain, however, Java must not only be able to provide high computational performance, but also high-performance I/O. In this paper, we first examine several approaches that attempt to provide high-performance I/O in Java, many of which are not obvious at first glance, and evaluate their performance on two parallel machines, the IBM SP and the SGI Origin2000. We then propose extensions to the Java I/O library that address the deficiencies in the Java I/O API and improve performance dramatically. The extensions add bulk (array) I/O operations to Java, thereby removing much of the overhead currently associated with array I/O in Java. We have implemented the extensions in two ways: in a standard JVM using the Java Native Interface (JNI) and in a high-performance parallel dialect of Java called Titanium. We describe the two implementations and present performance results that demonstrate the benefits of the proposed extensions. Copyright © 2001 John Wiley & Sons, Ltd.
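The overhead the extensions target can be seen by contrasting per-element output with a single bulk operation on the whole array. This is a simplified illustration using only the standard library, not the proposed bulk I/O API.

```java
import java.io.*;
import java.nio.ByteBuffer;
import java.util.Arrays;

public class BulkIoDemo {
    // Element-wise style: one small write request per array element, so the
    // per-call overhead is paid a.length times.
    public static byte[] elementWise(double[] a) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bos)) {
            for (double d : a) out.writeDouble(d);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Bulk style: pack the whole array once and hand it over as a single
    // array-sized request (both forms use big-endian IEEE 754 encoding).
    public static byte[] bulk(double[] a) {
        ByteBuffer buf = ByteBuffer.allocate(8 * a.length);
        buf.asDoubleBuffer().put(a);
        return buf.array();
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.5, -3.0};
        System.out.println(Arrays.equals(elementWise(a), bulk(a))); // identical bytes
    }
}
```

Both paths produce the same bytes; the difference is purely in how many I/O requests (and stream-level bookkeeping steps) are issued, which is where the measured performance gap comes from.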

Article ID: CPE577

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title DISCOVER: An environment for Web-based interaction and steering of high-performance scientific applications [The DISCOVER collaboratory can be accessed at http://www.discoverportal.org/]
Volume ID 13
Issue ID 8-9
Date July 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.577
Article ID CPE577
Author Name(s) V. Mann1V. Matossian2R. Muralidhar3M. Parashar4
Author Email(s) parashar@caip.rutgers.edu4
Affiliation(s) Department of Electrical and Computer Engineering, Rutgers University, 94 Brett Road, Piscataway, NJ 08854, U.S.A. 1 2 3 4
Keyword(s) Web-based computational collaboratory, computational interaction and steering,
Abstract
This paper presents the design, implementation, and deployment of the DISCOVER Web-based computational collaboratory. Its primary goal is to bring large distributed simulations to the scientists'/engineers' desktop by providing collaborative Web-based portals for monitoring, interaction and control. DISCOVER supports a three-tier architecture composed of detachable thin-clients at the front-end, a network of interaction servers in the middle, and a control network of sensors, actuators, and interaction agents at the back-end. The interaction servers enable clients to connect and collaboratively interact with registered applications using a browser. The application control network enables sensors and actuators to be encapsulated within, and directly deployed with the computational objects. The application interaction gateway manages overall interaction. It uses Java Native Interface to create Java proxies that mirror computational objects and allow them to be directly accessed at the interaction server. Security and authentication are provided using customizable access control lists and SSL-based secure servers. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE578

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title HBench:Java: An application-specific benchmarking framework for Java Virtual Machines
Volume ID 13
Issue ID 8-9
Date July 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.578
Article ID CPE578
Author Name(s) Xiaolan Zhang1Margo Seltzer2
Author Email(s) cxzhang@eecs.harvard.edu1
Affiliation(s) Division of Engineering and Applied Sciences, Harvard University, 33 Oxford Street, Cambridge, MA 02138, U.S.A. 1 2
Keyword(s) Java performance, benchmarking,
Abstract
Java applications represent a broad class of programs, ranging from programs running on embedded products to high-performance server applications. Standard Java benchmarks ignore this fact and assume a fixed workload. When an actual application's behavior differs from that included in a standard benchmark, the benchmark results are useless, if not misleading. In this paper, we present HBench:Java, an application-specific benchmarking framework, based on the concept that a system's performance must be measured in the context of the application of interest. HBench:Java employs a methodology that uses vectors to characterize the application and the underlying Java Virtual Machine (JVM) and carefully combines the two vectors to form a single metric that reflects a specific application's performance on a particular JVM such that the performance of multiple JVMs can be realistically compared. Our performance results demonstrate HBench:Java's superiority over traditional benchmarking approaches in predicting relative performance of real applications and its ability to pinpoint performance problems, even with a simplified vector. Copyright © 2001 John Wiley & Sons, Ltd.
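The vector methodology can be sketched as a dot product: an application vector of primitive-operation counts is combined with a JVM vector of per-operation costs, and the resulting scalar lets JVMs be ranked for that specific workload. All numbers and names below are invented for illustration; the paper's actual vector components and combination rule are more refined.

```java
public class HBenchDemo {
    // Combine an application vector (operation counts) with a JVM vector
    // (measured cost per operation, in ns) into one workload-specific metric.
    public static double predict(double[] appCounts, double[] jvmCostsNs) {
        double t = 0;
        for (int i = 0; i < appCounts.length; i++) {
            t += appCounts[i] * jvmCostsNs[i];
        }
        return t; // predicted running time in nanoseconds
    }

    public static void main(String[] args) {
        double[] app  = {1e6, 2e5}; // e.g. method calls, object allocations
        double[] jvmA = {50, 300};  // ns per operation measured on JVM A
        double[] jvmB = {80, 200};  // ns per operation measured on JVM B
        // The cheaper predicted time identifies the better JVM for THIS app;
        // a different application vector could reverse the ranking.
        System.out.println(predict(app, jvmA) < predict(app, jvmB));
    }
}
```

The point of the decomposition is that the JVM vector is measured once per JVM, while the application vector is measured once per application, so any pairing can be compared without rerunning the application on every JVM.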

Article ID: CPE579

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title An OpenMP-like interface for parallel programming in Java
Volume ID 13
Issue ID 8-9
Date July 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.579
Article ID CPE579
Author Name(s) M. E. Kambites1J. Obdržálek2J. M. Bull3
Author Email(s) m.bull@epcc.ed.ac.uk3
Affiliation(s) Department of Mathematics, University of York, Heslington, York YO10 5DD, England, U.K. 1Faculty of Informatics, Masaryk University, Botanická 68a, 602 00 Brno, Czech Republic 2Edinburgh Parallel Computing Centre, University of Edinburgh, Mayfield Road, Edinburgh EH9 3JZ, Scotland, U.K. 3
Keyword(s) Java, parallel programming, shared memory, directives, compiler,
Abstract
This paper describes the definition and implementation of an OpenMP-like set of directives and library routines for shared memory parallel programming in Java. A specification of the directives and routines is proposed and discussed. A prototype implementation, consisting of a compiler and a runtime library, both written entirely in Java, is presented, which implements most of the proposed specification. Some preliminary performance results are reported. Copyright © 2001 John Wiley & Sons, Ltd.
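A "parallel for" directive ultimately compiles down to a runtime routine that splits the iteration space across a team of threads and joins them. The sketch below shows such a routine in plain Java; it is illustrative only, not the specification or library proposed in the paper.

```java
import java.util.function.IntConsumer;

public class POmpDemo {
    // Runtime routine an OpenMP-like "parallel for" can be compiled to:
    // fork a team, give each thread a static interleaved schedule, then join.
    public static void parallelFor(int lo, int hi, int nthreads, IntConsumer body) {
        Thread[] team = new Thread[nthreads];
        for (int t = 0; t < nthreads; t++) {
            final int id = t;
            team[t] = new Thread(() -> {
                // Thread id executes iterations lo+id, lo+id+nthreads, ...
                for (int i = lo + id; i < hi; i += nthreads) body.accept(i);
            });
            team[t].start();
        }
        try {
            for (Thread th : team) th.join(); // implicit barrier at loop end
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        double[] a = new double[100];
        // In directive form, this loop would simply be annotated as parallel.
        parallelFor(0, a.length, 4, i -> a[i] = (double) i * i);
        System.out.println(a[10]); // 100.0
    }
}
```

The join at the end doubles as the barrier implied by the end of a parallel loop, and because each thread writes disjoint indices no further synchronization is needed in this example.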

Article ID: CPE580

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Using JavaNws to compare C and Java TCP-Socket performance
Volume ID 13
Issue ID 8-9
Date July 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.580
Article ID CPE580
Author Name(s) Chandra Krintz1Rich Wolski2
Author Email(s) ckrintz@cs.ucsd.edu1
Affiliation(s) Department of Computer Science and Engineering, University of California, San Diego, U.S.A. 1Computer Science Department, University of Tennessee, Knoxville, U.S.A. 2
Keyword(s) TCP/IP socket, performance, prediction, Java, C,
Abstract
As research and implementation continue to facilitate high-performance computing in Java, applications can benefit from resource management and prediction tools. In this work, we present such a tool for network round-trip time and bandwidth between a user's desktop and any machine running a Web server (this assumes that the user's browser is capable of supporting Java 1.1 and above). JavaNws is a Java implementation and extension of a powerful subset of the Network Weather Service (NWS), a performance prediction toolkit that dynamically characterizes and forecasts the performance available to an application. However, due to the Java language implementation and functionality (portability, security, etc.), it is unclear whether a Java program is able to measure and predict the network performance experienced by C-applications with the same accuracy as an equivalent C program. We provide a quantitative equivalence study of the Java and C TCP-socket interface and show that the data collected by the JavaNws is as predictable as that collected by the NWS (using C). Copyright © 2001 John Wiley & Sons, Ltd.
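The core measurement the paper compares across languages is a timed exchange over a TCP socket. A minimal sketch of such a probe in Java (ours, not the JavaNws code) bounces one byte off an echo service and times the exchange:

```java
import java.io.*;
import java.net.*;

// Sketch of a TCP round-trip-time probe in Java. Not the JavaNws
// implementation; an in-process echo server makes it self-contained.
public class RttProbe {
    // Bounce one byte off an echo service and return the elapsed time in ns.
    static long measureRttNanos(String host, int port) throws IOException {
        try (Socket s = new Socket(host, port)) {
            s.setTcpNoDelay(true);            // avoid Nagle delaying the 1-byte probe
            OutputStream out = s.getOutputStream();
            InputStream in = s.getInputStream();
            long t0 = System.nanoTime();
            out.write(1); out.flush();
            in.read();                        // block until the byte comes back
            return System.nanoTime() - t0;
        }
    }

    // Loopback echo server on an ephemeral port, so the sketch runs alone.
    static ServerSocket startEchoServer() throws IOException {
        ServerSocket server = new ServerSocket(0);
        new Thread(() -> {
            try (Socket s = server.accept()) {
                int b;
                while ((b = s.getInputStream().read()) != -1) {
                    s.getOutputStream().write(b);
                    s.getOutputStream().flush();
                }
            } catch (IOException ignored) { }
        }).start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        ServerSocket server = startEchoServer();
        long rtt = measureRttNanos("127.0.0.1", server.getLocalPort());
        System.out.println("loopback round-trip: " + rtt / 1_000 + " us");
        server.close();
    }
}
```

The equivalence question the paper studies is whether timings taken this way through the JVM's socket layer track those taken by the same loop written in C.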

Article ID: CPE581

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Parallel application experience with replicated method invocation
Volume ID 13
Issue ID 8-9
Date July 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.581
Article ID CPE581
Author Name(s) Jason Maassen1Thilo Kielmann2Henri E. Bal3
Author Email(s) jason@cs.vu.nl1
Affiliation(s) Division of Mathematics and Computer Science, Vrije Universiteit, Amsterdam, The Netherlands 1 2 3
Keyword(s) Java, object replication, RMI, parallel programming,
Abstract
We describe and evaluate a new approach to object replication in Java, aimed at improving the performance of parallel programs. Our programming model allows the programmer to define groups of objects that can be replicated and updated as a whole, using reliable, totally-ordered broadcast to send update methods to all machines containing a copy. The model has been implemented in the Manta high-performance Java system. We evaluate system performance both with microbenchmarks and with a set of five parallel applications. For the applications, we also evaluate ease of programming, compared to RMI implementations. We present performance results for a Myrinet-based workstation cluster as well as for a wide-area distributed system consisting of four such clusters. The microbenchmarks show that updating a replicated object on 64 machines only takes about three times the RMI latency in Manta. Applications using Manta's object replication mechanism perform at least as fast as manually optimized versions based on RMI, while keeping the application code as simple as with naive versions that use shared objects without taking locality into account. Using a replication mechanism in Manta's runtime system enables several unmodified applications to run efficiently even on the wide-area system. Copyright © 2001 John Wiley & Sons, Ltd.
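The replication idea can be illustrated with a toy stand-in (our own code, not Manta's): write methods are appended to a single totally-ordered update log that every replica replays, while reads go to the local copy.

```java
import java.util.*;

// Toy illustration of replicated method invocation. Our code, not Manta's:
// a shared ordered log stands in for reliable, totally-ordered broadcast.
public class Replicated {
    static class Counter {
        int v;
        void inc(int d) { v += d; }     // a "write" method, replicated
        int get() { return v; }         // a "read" method, served locally
    }

    // Stand-in for totally-ordered broadcast: every replica applies the
    // same update arguments in the same sequence.
    static List<Integer> updateLog = new ArrayList<>();

    public static void main(String[] args) {
        Counter[] replicas = { new Counter(), new Counter(), new Counter() };
        updateLog.add(5);                 // "broadcast" inc(5)
        updateLog.add(-2);                // "broadcast" inc(-2)
        for (Counter r : replicas)        // each machine replays the log in order
            for (int d : updateLog) r.inc(d);
        // Reads hit the local copy; total order guarantees all replicas agree.
        System.out.println(replicas[0].get() == 3 && replicas[2].get() == 3);  // true
    }
}
```

The real system replaces the shared log with a reliable broadcast protocol over the network, which is where the "about three times the RMI latency" update cost reported above comes from.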

Article ID: CPE586

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Development and performance analysis of real-world applications for distributed and parallel architectures
Volume ID 13
Issue ID 10
Date Aug 25 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.586
Article ID CPE586
Author Name(s) T. Fahringer1P. Blaha2A. Hössinger3J. Luitz4E. Mehofer5H. Moritsch6B. Scholz7
Author Email(s) tf@par.univie.ac.at1
Affiliation(s) Institute for Software Science, University of Vienna, Liechtensteinstrasse 22, A-1090 Vienna, Austria 1Institute of Physical and Theoretical Chemistry, Vienna University of Technology, Getreidemarkt 9/156, A-1060 Vienna, Austria 2Institute for Microelectronics, Vienna University of Technology, Gusshausstr. 27-29/E 360, A-1040 Vienna, Austria 3 4 5 Department of Business Administration, University of Vienna, Brünner Strasse 72, A-1210 Vienna, Austria 6 7
Keyword(s) performance analysis, performance monitoring, program instrumentation, performance visualization, parallel programs, distributed programs,
Abstract
Several large real-world applications have been developed for distributed and parallel architectures. We examine two different program development approaches. The first uses a high-level programming paradigm, which reduces the time to create a parallel program dramatically but sometimes at the cost of reduced performance; a source-to-source compiler has been employed to automatically compile programs written in the high-level programming paradigm into message-passing codes. The second is manual program development using a low-level programming paradigm, such as message passing, which enables the programmer to fully exploit a given architecture at the cost of a time-consuming and error-prone effort. Performance tools play a central role in supporting the performance-oriented development of applications for distributed and parallel architectures. SCALA, a portable instrumentation, measurement, and post-execution performance analysis system for distributed and parallel programs, has been used to analyze and to guide the application development, by selectively instrumenting and measuring the code versions, by comparing performance information of several program executions, by computing a variety of important performance metrics, by detecting performance bottlenecks, and by relating performance information back to the input program. We report several experiments in which SCALA is applied to real-world applications. These experiments were conducted on a NEC Cenju-4 distributed-memory machine and on a cluster of heterogeneous workstations and networks. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE587

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Simulating asynchronous hardware on multiprocessor platforms: the case of AMULET1
Volume ID 13
Issue ID 10
Date Aug 25 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.587
Article ID CPE587
Author Name(s) Georgios K. Theodoropoulos1
Author Email(s) gkt@cs.bham.ac.uk1
Affiliation(s) School of Computer Science, The University of Birmingham, Birmingham B15 2TT, U.K. 1
Keyword(s) asynchronous hardware, distributed simulation, CSP, occam, monitoring, load balancing,
Abstract
Synchronous VLSI design is approaching a critical point, with clock distribution becoming an increasingly costly and complicated issue and power consumption rapidly emerging as a major concern. Hence, recently, there has been a resurgence of interest in asynchronous digital design techniques which promise to liberate digital design from the inherent problems of synchronous systems. This activity has revealed a need for modelling and simulation techniques suitable for the asynchronous design style. The concurrent process algebra communicating sequential processes (CSP) and its executable counterpart, occam, are increasingly advocated as particularly suitable for this purpose. This paper focuses on issues related to the execution of CSP/occam models of asynchronous hardware on multiprocessor machines, including I/O, monitoring and debugging, partition, mapping and load balancing. These issues are addressed in the context of occarm, an occam simulation model of the AMULET1 asynchronous microprocessor; however, the solutions devised are more general and may be applied to other systems too. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE588

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Scalability and performance of OpenMP and MPI on a 128-processor SGI Origin 2000
Volume ID 13
Issue ID 10
Date Aug 25 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.588
Article ID CPE588
Author Name(s) Glenn R. Luecke1Wei-Hua Lin2
Author Email(s) grl@iastate.edu1
Affiliation(s) 291 Durham Center, Iowa State University, Ames, Iowa 50011-2251, U.S.A. 1 2
Keyword(s) OpenMP, MPI, scalability,
Abstract
The purpose of this paper is to investigate the scalability and performance of seven simple OpenMP test programs and to compare their performance with equivalent MPI programs on an SGI Origin 2000. Data distribution directives were used to make sure that the OpenMP implementation had the same data distribution as the MPI implementation. For the matrix-times-vector (test 5) and the matrix-times-matrix (test 7) tests, the syntax allowed in OpenMP 1.1 does not permit OpenMP compilers to generate efficient code, since the reduction clause is not currently allowed for arrays. (This problem is corrected in OpenMP 2.0.) For the remaining five tests, the OpenMP version performed and scaled significantly better than the corresponding MPI implementation, except for the right shift test (test 2) for a small message. Copyright © 2001 John Wiley & Sons, Ltd.
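The array-reduction restriction mentioned above is usually worked around by hand: each thread accumulates into a private partial array, and the partials are combined serially after the parallel region. A sketch of this pattern for matrix-times-vector, written here in Java threads (our own illustration; the paper's tests are OpenMP/MPI codes):

```java
// Manual array reduction for y = A*x: each thread owns a private partial
// vector, combined serially afterwards. This is what an OpenMP 2.0 array
// 'reduction' clause automates; names and structure here are illustrative.
public class MatVecReduction {
    static double[] matVec(double[][] a, double[] x, int nthreads) {
        int n = a.length;
        double[][] partial = new double[nthreads][n];   // one private copy per thread
        Thread[] ts = new Thread[nthreads];
        for (int t = 0; t < nthreads; t++) {
            final int tid = t;
            ts[t] = new Thread(() -> {
                for (int i = tid; i < n; i += nthreads)  // cyclic row split
                    for (int j = 0; j < x.length; j++)
                        partial[tid][i] += a[i][j] * x[j];
            });
            ts[t].start();
        }
        for (Thread th : ts) {
            try { th.join(); } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        double[] y = new double[n];
        for (double[] p : partial)                       // serial combine step
            for (int i = 0; i < n; i++) y[i] += p[i];
        return y;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[] y = matVec(a, new double[]{1, 1}, 2);
        System.out.println(y[0] + " " + y[1]);   // 3.0 7.0
    }
}
```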

Article ID: CPE606

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Editorial
Article Title Special Issue: Object-oriented Databases
Volume ID 13
Issue ID 11
Date Sep 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.606
Article ID CPE606
Author Name(s) Giovanna Guerrini1Isabella Merlo2Elena Ferrari3
Author Email(s)
Affiliation(s) 1 2 3
Keyword(s)
Abstract
No abstract

Article ID: CPE607

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Distributed data integration by object-oriented mediator servers
Volume ID 13
Issue ID 11
Date Sep 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.607
Article ID CPE607
Author Name(s) Tore Risch1Vanja Josifovski2
Author Email(s) Tore.Risch@dis.uu.se1 vanja@us.ibm.com2
Affiliation(s) Department of Information Science, Uppsala University, P.O. Box 513, SE-751 20 Uppsala, Sweden 1 2
Keyword(s) heterogeneous data integration, mediator systems, distributed databases, object-oriented databases,
Abstract
Integration of data from autonomous, distributed and heterogeneous data sources poses several technical challenges. This paper overviews the data integration system AMOS II based on the wrapper-mediator approach. AMOS II consists of: (i) a mediator database engine that can process and execute queries over data stored locally and in several external data sources, and (ii) object-oriented (OO) multi-database views for reconciliation of data and schema heterogeneities among sources with various capabilities. The data stored in different types of data sources is translated and integrated using OO mediation primitives, providing the user with a consistent view of the data in all the sources. Through its multi-database facilities many distributed AMOS II systems can interoperate in a federation. Since most data reside in the data sources, and to achieve high performance, the core of the system is a main-memory DBMS having a storage manager, query optimizer, transactions, client-server interface, disk backup, etc. The AMOS II data manager is optimized for main-memory access and is extensible so that new data types and query operators can be added or implemented in some external programming language. The extensibility is essential for providing seamless access to a variety of data sources. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE608

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Consistency management in object-oriented databases
Volume ID 13
Issue ID 11
Date Sep 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.608
Article ID CPE608
Author Name(s) H. Oakasha1S. Conrad2G. Saake3
Author Email(s) s.conrad@acm.org2
Affiliation(s) Cairo University, Fayoum Branch, Faculty of Science, Fayoum, Egypt 1Universität München, Institut für Informatik, Oettingenstrasse 67, 80538 München, Germany 2Universität Magdeburg, Fakultät für Informatik, Postfach 4120, 39016 Magdeburg, Germany 3
Keyword(s) object-oriented databases, constraint management, constraint maintenance, database integrity, integrity constraints,
Abstract
The paper presents concepts and ideas underlying an approach for consistency management in object-oriented (OO) databases. In this approach, constraints are considered first-class citizens and stored in a meta-database called the constraints catalog. When an object is created, the constraints of this object are retrieved from the constraints catalog and relationships between these constraints and the object are established. The structure of constraints has several features that enhance consistency management in OO database management systems and that conventional approaches do not provide in a satisfactory way. These include: monitoring object consistency at different levels of update granularity, integrity independence, and efficiency of constraint maintenance; controlling inconsistent objects; enabling and disabling constraints, globally for all objects or locally for individual objects; and declaring constraints on individual objects. All these features are provided by means of basic notations of OO data models. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE609

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title A temporal object-oriented model for digital libraries of documents
Volume ID 13
Issue ID 11
Date Sep 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.609
Article ID CPE609
Author Name(s) M. J. Aramburu Cabo1R. Berlanga Llavori2
Author Email(s) berlanga@inf.uji.es1
Affiliation(s) Departamento de Informática, Universitat Jaume I, Castellón, Spain 1 2
Keyword(s) structured documents, complex data models, digital libraries,
Abstract
This paper presents a model for the representation and retrieval of structured documents considering their temporal properties. The purpose of this model is to serve as a platform for the development of digital library applications. Thus, it consists of both a new data model and a query language that are specially adapted to the requirements of these applications. The main elements of the data model are a flexible type system for structured documents, and two temporal dimensions that represent the temporal properties of documents and the evolution of the database schema. As for its query language, it allows the retrieval of documents by specifying conditions on their structure, contents and temporal properties. This query language has been designed for exploiting the temporal information stored in a large digital library, making it possible to relate document contents in time, as well as to analyse the evolution of topics. The paper also includes some guidelines for the efficient implementation of databases of structured documents by adopting the proposed data and query models. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE610

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Compensation methods to support cooperative applications: A case study in automated verification of schema requirements for an advanced transaction model
Volume ID 13
Issue ID 11
Date Sep 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.610
Article ID CPE610
Author Name(s) David Spelt1Susan Even2
Author Email(s)
Affiliation(s) Centre for Telematics and Information Technology, University of Twente, Enschede, The Netherlands 1 2
Keyword(s) object-oriented databases, formal methods, theorem proving,
Abstract
Compensation plays an important role in advanced transaction models, cooperative work and workflow systems. A schema designer is typically required to supply for each transaction $T$ another transaction $T^{-1}$ to semantically undo the effects of $T$. Little attention has been paid to the verification of the desirable properties of such operations, however. This paper demonstrates the use of a higher-order logic theorem prover for verifying that compensating transactions return a database to its original state. It is shown how an OODB schema is translated to the language of the theorem prover so that proofs can be performed on the compensating transactions. Copyright © 2001 John Wiley & Sons, Ltd.
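The property being verified can be shown on a toy example (our own illustration, not the paper's OODB schema or theorem-prover encoding): applying a transaction $T$ and then its compensating transaction $T^{-1}$ must leave the database in its original state.

```java
import java.util.*;

// Toy illustration of the compensation property verified in the paper:
// T^{-1} composed with T is the identity on the database state.
// The schema (account -> balance) and operations are ours, for illustration.
public class Compensation {
    static Map<String, Integer> db = new HashMap<>();   // a tiny "database"

    // T: deposit an amount into an account.
    static void deposit(String acct, int amt) { db.merge(acct, amt, Integer::sum); }

    // T^{-1}: the compensating transaction semantically undoes the deposit.
    static void depositInverse(String acct, int amt) { db.merge(acct, -amt, Integer::sum); }

    public static void main(String[] args) {
        db.put("alice", 100);
        Map<String, Integer> before = new HashMap<>(db);  // snapshot the pre-state
        deposit("alice", 40);          // run T
        depositInverse("alice", 40);   // run T^{-1}
        System.out.println(db.equals(before));  // true: state restored
    }
}
```

The paper's contribution is proving such equalities mechanically, for all states and all parameter values, by translating the schema into the logic of a higher-order theorem prover rather than testing single cases as above.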

Article ID: CPE589

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title PoLAPACK: parallel factorization routines with algorithmic blocking
Volume ID 13
Issue ID 12
Date Oct 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.589
Article ID CPE589
Author Name(s) Jaeyoung Choi1
Author Email(s) choi@comp.ssu.ac.kr1
Affiliation(s) School of Computing, Soongsil University, 1-1, Sangdo-Dong, Dongjak-Ku, Seoul 156-743, Korea 1
Keyword(s)
Abstract
LU, QR, and Cholesky factorizations are the most widely used methods for solving dense linear systems of equations, and have been extensively studied and implemented on vector and parallel computers. Most of these factorization routines are implemented with block-partitioned algorithms in order to perform matrix-matrix operations, that is, to obtain the highest performance by maximizing reuse of data in the upper levels of memory, such as cache. Since parallel computers have different performance ratios of computation and communication, the optimal computational block sizes are different from one another in order to generate the maximum performance of an algorithm. Therefore, the data matrix should be distributed with the machine-specific optimal block size before the computation. Too small or too large a block size makes achieving good performance on a machine nearly impossible. In such a case, getting a better performance may require a complete redistribution of the data matrix. In this paper, we present parallel LU, QR, and Cholesky factorization routines with an ‘algorithmic blocking’ on two-dimensional block cyclic data distribution. With the algorithmic blocking, it is possible to obtain near-optimal performance irrespective of the physical block size. The routines are implemented on the Intel Paragon and the SGI/Cray T3E and compared with the corresponding ScaLAPACK factorization routines. Copyright © 2001 John Wiley & Sons, Ltd.
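The two-dimensional block cyclic distribution mentioned above maps global element $(i, j)$, with block sizes $(mb, nb)$ on a $p \times q$ process grid, to process $((\lfloor i/mb \rfloor \bmod p), (\lfloor j/nb \rfloor \bmod q))$; this is the same mapping ScaLAPACK uses. A small sketch (variable names ours):

```java
// Owner computation for a two-dimensional block cyclic data distribution:
// blocks of size mb x nb are dealt out cyclically over a p x q process grid.
public class BlockCyclic {
    static int ownerRow(int i, int mb, int p) { return (i / mb) % p; }
    static int ownerCol(int j, int nb, int q) { return (j / nb) % q; }

    public static void main(String[] args) {
        // 2x2 process grid, 2x2 blocks: element (5, 3) lies in block (2, 1),
        // so it lives on process row 2 mod 2 = 0 and process column 1 mod 2 = 1.
        System.out.println(ownerRow(5, 2, 2) + "," + ownerCol(3, 2, 2));  // 0,1
    }
}
```

The point of algorithmic blocking is that the computational block size used inside the factorization no longer has to equal the `mb`/`nb` of this physical layout, so the data need not be redistributed to get good performance.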

Article ID: CPE590

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Parallel versions of Stone's strongly implicit algorithm
Volume ID 13
Issue ID 12
Date Oct 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.590
Article ID CPE590
Author Name(s) J. S. Reeve1A. D. Scurr2J. H. Merlin3
Author Email(s) jsr@ecs.soton.ac.uk1
Affiliation(s) Department of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, U.K. 1 2 3
Keyword(s) strongly implicit methods, Stone's algorithm, parallel numerical algorithms,
Abstract
In this paper, we describe various methods of deriving a parallel version of Stone's Strongly Implicit Procedure (SIP) for solving sparse linear equations arising from finite difference approximation to partial differential equations (PDEs). Sequential versions of this algorithm have been very successful in solving semi-conductor, heat conduction and flow simulation problems and an efficient parallel version would enable much larger simulations to be run. An initial investigation of various parallelizing strategies was undertaken using a version of high performance Fortran (HPF) and the best methods were reprogrammed using the MPI message passing libraries for increased efficiency. Early attempts concentrated on developing a parallel version of the characteristic wavefront computation pattern of the existing sequential SIP code. However, a red-black ordering of grid points, similar to that used in parallel versions of the Gauss-Seidel algorithm, is shown to be far more efficient. The results of both the wavefront and red-black MPI based algorithms are reported for various problem sizes and numbers of processors on a sixteen node IBM SP2. Copyright © 2001 John Wiley & Sons, Ltd.
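The red-black idea referred to above colours grid points by the parity of $i + j$: every red point depends only on black neighbours and vice versa, so all points of one colour can be updated simultaneously. A sketch of one such sweep for a model Poisson problem (our own illustration in Java; the paper's codes are HPF/MPI, and SIP's update stencil differs from the simple Gauss-Seidel one shown here):

```java
// Red-black ordered relaxation sweep on a 2D grid (5-point stencil).
// Points with (i + j) even are "red", odd are "black"; within one colour
// the updates are independent, which is what makes the ordering parallel.
public class RedBlack {
    static void sweep(double[][] u, double[][] f, double h2) {
        int n = u.length;
        for (int colour = 0; colour <= 1; colour++)   // red pass, then black pass
            for (int i = 1; i < n - 1; i++)
                for (int j = 1; j < n - 1; j++)
                    if ((i + j) % 2 == colour)
                        u[i][j] = 0.25 * (u[i-1][j] + u[i+1][j]
                                        + u[i][j-1] + u[i][j+1] - h2 * f[i][j]);
    }

    public static void main(String[] args) {
        double[][] u = new double[6][6], f = new double[6][6];
        for (int j = 0; j < 6; j++) u[0][j] = 1.0;    // hot top boundary
        for (int k = 0; k < 200; k++) sweep(u, f, 0.0);
        System.out.println(u[1][2] > 0.0 && u[1][2] < 1.0);  // interior settles between boundaries
    }
}
```

In the MPI version, each processor updates its local points of one colour and then exchanges halo rows before the other colour's pass; unlike the wavefront scheme, every processor is busy in every step.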

Article ID: CPE591

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Real-time multi-spectral image fusion
Volume ID 13
Issue ID 12
Date Oct 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.591
Article ID CPE591
Author Name(s) Tiranee Achalakul1Stephen Taylor2
Author Email(s) tachalak@slgus.jnj.com1
Affiliation(s) 5568 Airport Road, Roanoke, VA 24012, U.S.A. 12-106 CST building, Syracuse University, Syracuse, NY 13244, U.S.A. 2
Keyword(s) principal component transform, multi-spectral camera, real-time image fusion, performance prediction,
Abstract
This paper describes a novel real-time multi-spectral imaging capability for surveillance applications. The capability combines a new high-performance multi-spectral camera system with a distributed algorithm that computes a spectral-screening principal component transform (PCT). The camera system uses a novel filter wheel design together with a high-bandwidth CCD camera to allow image cubes to be delivered at 110 frames s$^{-1}$ with a spectral coverage between 400 and 1000 nm. The filters used in a particular application are selected to highlight a particular object based on its spectral signature. The distributed algorithm allows image streams from a dispersed collection of cameras to be disseminated, viewed, and interpreted by a distributed group of analysts in real-time. It operates on networks of commercial-off-the-shelf multiprocessors connected with high-performance (e.g. gigabit) networking, taking advantage of multi-threading where appropriate. The algorithm uses a concurrent formulation of the PCT to de-correlate and compress a multi-spectral image cube. Spectral screening is used to give features that occur infrequently (e.g. mechanized vehicles in a forest) equal importance to those that occur frequently (e.g. trees in the forest). A human-centered color-mapping scheme is used to maximize the impact of spectral contrast on the human visual system. To demonstrate the efficacy of the multi-spectral system, plant-life scenes with both real and artificial foliage are used. These scenes demonstrate the system's ability to distinguish elements of a scene that cannot be distinguished with the naked eye. The capability is evaluated in terms of visual performance, scalability, and real-time throughput. Our previous work on predictive analytical modeling is extended to answer practical design questions such as ‘For a specified cost, what system can be constructed and what performance will it attain?’ Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE592

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Space-time tradeoffs for parallel 3D reconstruction algorithms for virus-structure determination
Volume ID 13
Issue ID 12
Date Oct 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.592
Article ID CPE592
Author Name(s) Dan C. Marinescu1Yongchang Ji2Robert E. Lynch3
Author Email(s) dcm@cs.purdue.edu1
Affiliation(s) Department of Computer Sciences, Purdue University, West Lafayette, IN 47907, U.S.A. 1 2 3
Keyword(s) 3D reconstruction, electron microscopy, parallel algorithm, speedup,
Abstract
The 3D electron-density determination of viruses, from experimental data provided by electron microscopy, is a data-intensive computation that requires the use of clusters of PCs or parallel computers. In this paper we report on three parallel algorithms used for 3D reconstruction of asymmetric objects from their 2D projections. We discuss their computational, communication, I/O, and space requirements and present some performance data. The algorithms are general and can be used for 3D reconstruction of asymmetric objects for applications other than structural biology. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE593

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Lesser Bear: a light-weight process library for SMP computers
Volume ID 13
Issue ID 12
Date Oct 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.593
Article ID CPE593
Author Name(s) Hisashi Oguma1Yasuichi Nakayama2
Author Email(s) oguma-h@igo.cs.uec.ac.jp1 yasu@cs.uec.ac.jp2
Affiliation(s) Department of Computer Science, The University of Electro-Communications, Chofu, Tokyo 182-8585, Japan 1 2
Keyword(s) thread library, SMP computer, parallelism, scheduler design,
Abstract
We have designed and implemented a light-weight process (thread) library called ‘Lesser Bear’ for SMP computers. Lesser Bear has high portability and thread-level parallelism. Creating UNIX processes as virtual processors and a memory-mapped file as a huge shared-memory space enables Lesser Bear to execute threads in parallel. Lesser Bear requires exclusive operation between peer virtual processors, and treats a shared-memory space as a critical section for synchronization of threads. Therefore, thread functions of the previous Lesser Bear are serialized. In this paper, we present a scheduling mechanism to execute thread functions in parallel. In the design of the proposed mechanism, we divide the entire shared-memory space into partial spaces for virtual processors, and prepare two queues (Protect Queue and Waiver Queue) for each partial space. We adopt an algorithm in which lock operations are not necessary for enqueueing. This algorithm allows us to propose a scheduling mechanism that can reduce the scheduling overhead. The mechanism is applied to Lesser Bear and evaluated by experimental results. Copyright © 2001 John Wiley & Sons, Ltd.
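The two-queue arrangement described above can be sketched as follows (our own code, not Lesser Bear's C implementation; only the Protect Queue / Waiver Queue names come from the abstract). Each virtual processor dequeues only from its own Protect Queue, while peers hand threads over through a lock-free Waiver Queue:

```java
import java.util.ArrayDeque;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch of a per-virtual-processor scheduler with two queues:
// - protectQueue: touched only by the owning virtual processor, so no lock;
// - waiverQueue: peers enqueue migrated threads here without taking a lock.
public class VirtualProcessor {
    final ArrayDeque<Runnable> protectQueue = new ArrayDeque<>();
    final ConcurrentLinkedQueue<Runnable> waiverQueue = new ConcurrentLinkedQueue<>();

    // Called only by the owning virtual processor.
    void enqueueLocal(Runnable t) { protectQueue.addLast(t); }

    // Called by peer virtual processors: lock-free enqueue.
    void enqueueRemote(Runnable t) { waiverQueue.add(t); }

    // One scheduler step: absorb handed-over threads, then pick the next one.
    Runnable next() {
        Runnable r;
        while ((r = waiverQueue.poll()) != null) protectQueue.addLast(r);
        return protectQueue.pollFirst();
    }

    public static void main(String[] args) {
        VirtualProcessor vp = new VirtualProcessor();
        vp.enqueueLocal(() -> System.out.println("local thread"));
        vp.enqueueRemote(() -> System.out.println("migrated thread"));
        vp.next().run();   // prints "local thread"
        vp.next().run();   // prints "migrated thread"
    }
}
```

The design point mirrors the paper's: because enqueueing onto another processor's space needs no lock, scheduling work no longer serializes on a single shared critical section.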

Article ID: CPE595

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Editorial
Article Title Special issue: formal techniques for Java programs
Volume ID 13
Issue ID 13
Date Nov 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.595
Article ID CPE595
Author Name(s) Susan Eisenbach1Gary T. Leavens2
Author Email(s)
Affiliation(s) Imperial College, London 1Iowa State University 2
Keyword(s)
Abstract
No abstract

Article ID: CPE596

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Type safety in the JVM: some problems in Java 2 SDK 1.2 and proposed solutions
Volume ID 13
Issue ID 13
Date Nov 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.596
Article ID CPE596
Author Name(s) Alessandro Coglio1Allen Goldberg2
Author Email(s) coglio@kestrel.edu1
Affiliation(s) Kestrel Institute, 3260 Hillview Avenue, Palo Alto, CA 94304, U.S.A. 1Shoulders Corporation, 800 El Camino Real, Mountain View, CA 94040, U.S.A. 2
Keyword(s) Java, JVM, type safety, bugs,
Abstract
In the course of our work in developing formal specifications for components of the Java Virtual Machine (JVM), we have uncovered subtle bugs in the bytecode verifier of Sun's Java 2 SDK 1.2. These bugs, which lead to type safety violations, relate to the naming of reference types. Under certain circumstances, these names can be spoofed through delegating class loaders. These flaws expose some inaccuracies and ambiguities in the JVM specification. We propose several solutions to all of these bugs. In particular, we propose a general solution that makes use of subtype loading constraints. Such constraints complement the equality loading constraints introduced in the Java 2 Platform, and are posted by the bytecode verifier when checking assignment compatibility of class types. By posting constraints instead of resolving and loading classes, the bytecode verifier in our solution has a cleaner interface with the rest of the JVM, and allows lazier loading. We sketch some excerpts of our mathematical formalization of this approach and of its type safety results. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE597

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Verified lightweight bytecode verification
Volume ID 13
Issue ID 13
Date Nov 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.597
Article ID CPE597
Author Name(s) Gerwin Klein1Tobias Nipkow2
Author Email(s) kleing@in.tum.de1
Affiliation(s) Department of Informatics, Technische Universität München, 80290 Munich, Germany 1 2
Keyword(s) Java, bytecode verification, theorem proving, data flow analysis,
Abstract
Eva and Kristoffer Rose proposed a (sparse) annotation of Java Virtual Machine code with types to enable a one-pass verification of well-typedness. We have formalized a variant of their proposal in the theorem prover Isabelle/HOL and proved soundness and completeness. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE598

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Hoare logic for Java in Isabelle/HOL
Volume ID 13
Issue ID 13
Date Nov 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.598
Article ID CPE598
Author Name(s) David von Oheimb1
Author Email(s)
Affiliation(s) Institut für Informatik, Technische Universität München, D 80290 München, Germany 1
Keyword(s) Hoare logic, axiomatic semantics, Java, Isabelle/HOL, verification, soundness, completeness, auxiliary variables, side effects, mutual recursion, dynamic binding, exception handling, type safety,
Abstract
This article presents a Hoare-style calculus for a substantial subset of Java Card, which we call Java$^{\ell ight}$. In particular, the language includes side-effecting expressions, mutual recursion, dynamic method binding, full exception handling, and static class initialization. The Hoare logic of partial correctness is proved not only sound (w.r.t. our operational semantics of Java$^{\ell ight}$, described in detail elsewhere) but even complete. It is the first logic for an object-oriented language that is provably complete. The completeness proof uses a refinement of the Most General Formula approach. The proof of soundness gives new insights into the role of type safety. Further by-products of this work are a new general methodology for handling side-effecting expressions and their results, the discovery of the strongest possible rule of consequence, and a flexible Call rule for mutual recursion. We also give a small but non-trivial application example. All definitions and proofs have been done formally with the interactive theorem prover Isabelle/HOL. This guarantees not only rigorous definitions, but also gives maximal confidence in the results obtained. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE599

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Java access protection through typing
Volume ID 13
Issue ID 13
Date Nov 1 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.599
Article ID CPE599
Author Name(s) Eva Rose1Kristoffer Høgsbro Rose2
Author Email(s) krisrose@watson.ibm.com2
Affiliation(s) GIE Dyade, INRIA-Rocquencourt, Domaine de Voluceau, Rocquencourt B.P.105, 78153 Le Chesnay, France 1IBM T. J. Watson Research Center, 30 Saw Mill River Road, Hawthorne, NY 10532, U.S.A. 2
Keyword(s) Java, Java Virtual Machine, Java bytecode verification, read-only field access,
Abstract
We propose an integration of field access rights into the Java type system such that those access permission checks which are now performed dynamically (at run time) can instead be done statically, i.e. checked by the Java compiler and rechecked (at link time) by the bytecode verifier. We explain how this can be extended to remove all dynamic checks of field read access rights, completely eliminating the overhead of get methods for reading the value of a field. Improvements include fast static lookup instead of dynamic dispatch for field access (without requiring a sophisticated inlining analysis), avoidance of the space required by get methods, and prevention of denial-of-service attacks on field access. We sketch a formalization of adding field access to the bytecode verifier which will make it possible to prove that the change is safe and backwards compatible. Copyright © 2001 John Wiley & Sons, Ltd.

Article ID: CPE594

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Open consensus
Volume ID 13
Issue ID 14
Date Dec 10 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.594
Article ID CPE594
Author Name(s) Romain Boichat1Svend Frølund2Rachid Guerraoui3
Author Email(s) Romain.Boichat@epfl.ch1
Affiliation(s) Swiss Federal Institute of Technology, Lausanne, Switzerland 1Hewlett-Packard Laboratories, Palo Alto, CA, U.S.A. 2 3
Keyword(s) modularity, distribution, reliability, consensus, total-order broadcast, open implementation,
Abstract
This paper presents the abstraction of open consensus and argues for its use as an effective component for building reliable agreement protocols in practical asynchronous systems where processes and links can crash and recover. The specification of open consensus has a decoupled, on-demand and re-entrant flavour that makes its use very efficient, especially in terms of forced logs, which are known to be a major source of overhead in distributed systems. We illustrate the use of open consensus as a basic building block to develop a modular, yet efficient, total-order broadcast protocol. Finally, we describe our Java implementation of the open-consensus abstraction and support our efficiency claims with practical performance measurements. Copyright © 2001 John Wiley & Sons, Ltd.
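The agreement property at the heart of any consensus abstraction (every propose returns the same decided value) can be illustrated with a deliberately trivial single-JVM stub. The interface and names below are our own, not the paper's API, and the stub ignores crashes, recovery, and forced logs, which are the paper's actual concern:

```java
import java.util.concurrent.atomic.AtomicReference;

// Trivial in-memory stand-in for a consensus object: the first proposed
// value wins, and every later propose() returns that same decision.
// Illustrates the agreement property only; no fault tolerance.
public class ConsensusStub<V> {
    private final AtomicReference<V> decision = new AtomicReference<>();

    // Propose a value; returns the decided value (first proposal wins).
    public V propose(V value) {
        decision.compareAndSet(null, value);
        return decision.get();
    }
}
```

The "open" abstraction in the paper goes further by letting clients drive when state is forced to stable storage, rather than baking a log into every round.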

Article ID: CPE611

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Deferring elimination of design alternatives in object-oriented methods
Volume ID 13
Issue ID 14
Date Dec 10 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.611
Article ID CPE611
Author Name(s) Mehmet Aksit1Francesco Marcelloni2
Author Email(s) france@iet.unipi.it2
Affiliation(s) TRESE project, Department of Computer Science, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands 1Dipartimento di Ingegneria della Informazione, Università di Pisa, Via Diotisalvi, 2-56126, Pisa, Italy 2
Keyword(s) design alternatives, object-oriented methods, fuzzy logic, adaptable design models, CASE environments, software artifacts,
Abstract
While developing systems, software engineers generally have to deal with a large number of design alternatives. Current object-oriented methods aim to eliminate design alternatives as soon as they are generated. Alternatives, however, should be eliminated only when sufficient information to make such a decision is available. Otherwise, alternatives have to be preserved to allow further refinements along the development process. Eliminating alternatives too early results in loss of information and excessively restricts the design space. This paper aims to enhance current object-oriented methods by modeling and controlling the design alternatives through the application of fuzzy-logic-based techniques. Using an example method, it is shown that the proposed approach increases the adaptability and reusability of design models. The method has been implemented and tested in our experimental CASE environment. Copyright © 2001 John Wiley & Sons, Ltd.
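The fuzzy-logic ingredient can be pictured with a single primitive: instead of accepting or rejecting a design alternative outright, a method can grade it with a membership degree in [0, 1] and defer the cut-off. The triangular membership function below is a common such primitive; its use here, and all parameters, are illustrative rather than taken from the paper:

```java
// Sketch: a triangular fuzzy membership function, the kind of primitive a
// fuzzy-logic-based design method could use to grade an alternative
// instead of making an early yes/no elimination decision.
public class FuzzyGrade {
    // Degree of membership in [0,1] of value x in the triangle (a, peak, b).
    public static double triangular(double x, double a, double peak, double b) {
        if (x <= a || x >= b) return 0.0;          // outside the support
        if (x == peak) return 1.0;                 // full membership at the peak
        return x < peak ? (x - a) / (peak - a)     // rising edge
                        : (b - x) / (b - peak);    // falling edge
    }
}
```

An alternative graded, say, 0.5 rather than 0 stays in the design space and can be re-evaluated once later development stages supply more information.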

Article ID: CPE612

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Parallel dynamic water supply scheduling in a cluster of computers
Volume ID 13
Issue ID 15
Date Dec 25 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.612
Article ID CPE612
Author Name(s) M. Damas1M. Salmerón2J. Ortega3G. Olivares4H. Pomares5
Author Email(s) mdamas@atc.ugr.es1
Affiliation(s) Department of Computer Architecture and Computer Technology, Facultad de Ciencias, University of Granada, Campus Fuentenueva s/n. E-18071, Granada, Spain 1 2 3 4 5
Keyword(s) neuro-dynamic programming, evolutionary computation, parallel genetic algorithms, neural networks, clusters of computers, scheduling of water supply networks,
Abstract
Parallelizing the complex planning and control problems that arise in industrial, service, and commercial environments not only allows control variables to be determined within the required times, but also improves the performance of the control procedure as more processors are involved in the execution of the parallel program. In this paper we describe a scheduling application in a water supply network to demonstrate the benefits of parallel processing. The procedure we propose combines dynamic programming with genetic algorithms and time series prediction in order to solve problems in which decisions are made in stages, and the states and controls belong to a continuous space. Taking into account the computational complexity of these applications and the time constraints that are usually imposed, the procedure has been implemented as a parallel program on a cluster of computers, an inexpensive and widely available platform that can make parallelism a practical means of tackling complex problems in many different environments. Copyright © 2001 John Wiley & Sons, Ltd.
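Of the three ingredients the abstract names, the genetic-algorithm stage is the easiest to sketch in isolation. The toy below evolves a binary 24-hour pump schedule toward a target number of pumping hours; the cost function, parameters, and all names are illustrative only, and the dynamic-programming and time-series-prediction stages (as well as the parallelization itself) are omitted:

```java
import java.util.Random;

// Toy sketch of the genetic-algorithm ingredient: evolve a binary pump
// schedule (true = pump on in that hour) toward a target pumping volume.
// Tournament selection, one-point crossover, occasional bit-flip mutation.
public class PumpGA {
    static final int HOURS = 24, POP = 20, GENS = 50, TARGET = 10;
    static final Random RNG = new Random(42); // fixed seed for repeatability

    // Cost: squared deviation from the target number of pumping hours.
    static int cost(boolean[] s) {
        int on = 0;
        for (boolean b : s) if (b) on++;
        int d = on - TARGET;
        return d * d;
    }

    static boolean[] randomSchedule() {
        boolean[] s = new boolean[HOURS];
        for (int i = 0; i < HOURS; i++) s[i] = RNG.nextBoolean();
        return s;
    }

    // Pick the fitter (lower-cost) of two random individuals.
    static boolean[] tournament(boolean[][] pop) {
        boolean[] a = pop[RNG.nextInt(POP)], b = pop[RNG.nextInt(POP)];
        return cost(a) <= cost(b) ? a : b;
    }

    // One evolutionary run; returns the best schedule in the final population.
    public static boolean[] evolve() {
        boolean[][] pop = new boolean[POP][];
        for (int i = 0; i < POP; i++) pop[i] = randomSchedule();
        for (int g = 0; g < GENS; g++) {
            boolean[][] next = new boolean[POP][];
            for (int i = 0; i < POP; i++) {
                boolean[] p1 = tournament(pop), p2 = tournament(pop);
                boolean[] child = new boolean[HOURS];
                int cut = RNG.nextInt(HOURS); // one-point crossover
                for (int j = 0; j < HOURS; j++)
                    child[j] = j < cut ? p1[j] : p2[j];
                if (RNG.nextInt(10) == 0) {   // mutate roughly 10% of children
                    int j = RNG.nextInt(HOURS);
                    child[j] = !child[j];
                }
                next[i] = child;
            }
            pop = next;
        }
        boolean[] best = pop[0];
        for (boolean[] s : pop) if (cost(s) < cost(best)) best = s;
        return best;
    }
}
```

In the paper's setting, each node of the cluster would evaluate or evolve part of the population in parallel, which is what makes the approach viable under real-time scheduling constraints.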

Article ID: CPE613

Publisher John Wiley & Sons, Ltd. Chichester, UK
Category Research Article
Article Title Comparing Windows NT, Linux, and QNX as the basis for cluster systems
Volume ID 13
Issue ID 15
Date Dec 25 2001
DOI(URI) http://dx.doi.org/10.1002/cpe.613
Article ID CPE613
Author Name(s) Avi Kavas1Dror G. Feitelson2
Author Email(s)
Affiliation(s) School of Computer Science and Engineering, The Hebrew University, 91904 Jerusalem, Israel 1 2
Keyword(s) Windows NT, Linux, QNX, node system, computational cluster,
Abstract
Clusters use commodity hardware and software components to provide an environment for high-performance parallel processing. A major issue in the development of a cluster system is the choice of the operating system that will run on each node. We compare three alternatives: Windows NT, Linux, and QNX, a real-time microkernel. The comparison is based on expressive power, performance, and ease-of-use metrics. The result is that none of these systems has a clear advantage over the others in all the metrics, but each has its strong and weak points. Thus any choice of a base system will involve some technical compromises, but not major ones. Copyright © 2001 John Wiley & Sons, Ltd.