Issues with Collaboration (Java Distributed Computing)

9.2. Issues with Collaboration

For the most part, the issues that come up with collaborative systems are similar to those that arise in other concurrent programming tasks. In the context of collaborative systems, some of these issues take on even more importance, and, of course, collaboration raises a few issues of its own. In this section, we'll look briefly at four of the most important issues: communication needs, identity management, shared state information, and performance.

9.2.1. Communication Needs

A collaborative system has multiple remote agents collaborating dynamically, so it must be flexible in its ability to route transactions. Depending on the application, the underlying communications might need to support point-to-point messages between agents, broadcast messages sent to the entire agent community, or "narrowcast" messages sent to specific groups of participating agents. An interactive chat server that supports chat rooms is a good example of this. Messages are normally broadcast to the entire group, but if an individual is in a single chat room, then her messages should only be sent to the other participants in that room. In some cases, you may need to support private, one-on-one messaging between agents or users in the system (e.g., for private discussions).
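The three delivery styles described above can be sketched with a small in-memory router. This is only an illustration, not a networked implementation: the class MessageRouter, its method names, and the use of plain strings for agents and messages are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory router; names and structure are illustrative only.
class MessageRouter {
    // Each agent's inbox, keyed by agent name.
    private final Map<String, List<String>> inboxes = new HashMap<>();
    // Group (e.g., chat room) membership, keyed by group name.
    private final Map<String, Set<String>> groups = new HashMap<>();

    void register(String agent) {
        inboxes.putIfAbsent(agent, new ArrayList<>());
    }

    void join(String group, String agent) {
        register(agent);
        groups.computeIfAbsent(group, g -> new HashSet<>()).add(agent);
    }

    // Point-to-point: deliver to exactly one agent, if it is registered.
    void sendTo(String from, String to, String text) {
        List<String> inbox = inboxes.get(to);
        if (inbox != null) {
            inbox.add(from + ": " + text);
        }
    }

    // Broadcast: deliver to every registered agent except the sender.
    void broadcast(String from, String text) {
        for (String agent : inboxes.keySet()) {
            if (!agent.equals(from)) {
                sendTo(from, agent, text);
            }
        }
    }

    // Narrowcast: deliver only to the members of one group, except the sender.
    void narrowcast(String from, String group, String text) {
        Set<String> members = groups.getOrDefault(group, Collections.emptySet());
        for (String agent : members) {
            if (!agent.equals(from)) {
                sendTo(from, agent, text);
            }
        }
    }

    List<String> inbox(String agent) {
        return inboxes.getOrDefault(agent, Collections.emptyList());
    }
}
```

In a chat server built along these lines, a message typed in a room would go through narrowcast(), a system-wide announcement through broadcast(), and a private discussion through sendTo().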

In addition to simple message-like communications, there may be a need for agents to have a richer interface to other agents. Agents may need to pass object data to each other, and remote agents may need to interact directly with each other through distributed object interfaces.

9.2.2. Maintaining Agent Identities

If multiple agents are engaged in a collaboration, there has to be some way to uniquely identify them so that messages can be addressed and delivered, tasks can be assigned, etc. Also, if access to the system or to certain resources associated with the system needs to be restricted, then participant identities will need to be authenticated as well. Depending on the application, there may also be data or other resources associated with individual agents. This information must be maintained along with agent identities, and in some cases access to these resources must be controlled based on identities.

A practical example of this issue is a shared whiteboard application. A shared whiteboard is a virtual drawing space that multiple remote users can view and "write" on in order to share information and ideas. It is the digital equivalent of a group of people working around a real whiteboard in a meeting room. In order for the individuals using the whiteboard to understand what is being contributed by whom, the whiteboard system has to keep some kind of identity information for each participant. Each participant's contributions to the whiteboard (written information, graphics, etc.) must be shown with a visual indication of who is responsible for them (e.g., color or shading). It may also be desirable to allow each individual the right to modify or delete only his contributions to the whiteboard, which means adding access control based on identities.
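As a rough sketch of identity-based access control on a whiteboard, each contribution can record its owner, and deletion can be refused for anyone else. The classes Contribution and Whiteboard are hypothetical, and the sketch assumes that the agent names passed in have already been authenticated elsewhere.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical record of one item drawn on the whiteboard.
class Contribution {
    final String owner;   // identity of the agent that added the item
    final String content; // stand-in for text or graphics

    Contribution(String owner, String content) {
        this.owner = owner;
        this.content = content;
    }
}

// Hypothetical whiteboard that tags every item with its owner.
class Whiteboard {
    private final Map<Integer, Contribution> items = new LinkedHashMap<>();
    private int nextId = 0;

    // Adds an item on behalf of an (assumed authenticated) agent and returns its id.
    synchronized int add(String owner, String content) {
        int id = nextId++;
        items.put(id, new Contribution(owner, content));
        return id;
    }

    // Returns the owner of an item, e.g., to choose its display color; null if absent.
    synchronized String ownerOf(int id) {
        Contribution c = items.get(id);
        return c == null ? null : c.owner;
    }

    // Deletes an item only if the requester owns it; returns whether it was deleted.
    synchronized boolean delete(String requester, int id) {
        Contribution c = items.get(id);
        if (c == null || !c.owner.equals(requester)) {
            return false;
        }
        items.remove(id);
        return true;
    }

    synchronized int size() {
        return items.size();
    }
}
```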

9.2.3. Shared State Information

In many collaborative systems some data and resources are shared among participants. Shared data is common to most distributed systems, but is particularly important in collaborative systems. A cooperative effort among computing agents is usually expressed in terms of a set of data that needs to be shared by all agents in the system. In our shared whiteboard example, the current contents of the whiteboard are shared among all agents. With multiple agents accessing and potentially modifying this shared state information, maintaining the integrity of the information will be an important issue. If two or more agents attempt to alter the same piece of shared state information, then there has to be a reasonable and consistent way to determine how to merge these requests, and how to make it known to the affected agents what's been done with their transactions.
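One possible way to resolve competing changes consistently, shown here only as a sketch, is a version check: an update is accepted only if the agent saw the latest version of the data, and the return value tells the agent what happened to its request. The class SharedCell and its methods are assumptions made for this example; the text does not prescribe this particular policy.

```java
// Hypothetical piece of shared state guarded by a version number.
class SharedCell {
    private String value = "";
    private int version = 0;

    synchronized String value() {
        return value;
    }

    synchronized int version() {
        return version;
    }

    // Accepts the update only if the caller based it on the current version.
    // The boolean result is how the caller learns the fate of its request.
    synchronized boolean tryUpdate(int expectedVersion, String newValue) {
        if (expectedVersion != version) {
            return false; // someone else changed the cell first
        }
        value = newValue;
        version++;
        return true;
    }
}
```

An agent whose update is rejected can reread the cell, reconcile its change with the new value, and try again with the new version number.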

9.2.4. Performance

Some collaborative systems have to make a trade-off between keeping shared state consistent across all the agents and maximizing the overall performance. There are situations, such as in shared whiteboard applications, where it's important that every agent's view of the shared state of the system be kept as up to date as possible. The simplest way to do this is to have a central mediator acting as a clearinghouse for agent events. The mediator gets notified whenever an agent does something that changes the shared state of the system, and the mediator is responsible for sending these state updates to all of the agents in the system. This also makes it simple to ensure that updates are sequenced correctly across all agents, if that's important to the application.
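A minimal sketch of such a mediator follows. It assumes everything runs in one process, and the names UpdateListener, Mediator, subscribe(), and publish() are invented for the example. Because a single object stamps every change with a sequence number before forwarding it, all agents see the changes in the same order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback through which an agent receives state updates.
interface UpdateListener {
    void onUpdate(long sequence, String change);
}

// Hypothetical central mediator: the one place where updates are ordered.
class Mediator {
    private final List<UpdateListener> listeners = new ArrayList<>();
    private long nextSequence = 0;

    synchronized void subscribe(UpdateListener listener) {
        listeners.add(listener);
    }

    // Assigns the next sequence number and forwards the change to every agent.
    // The method is synchronized, so two changes are never interleaved.
    synchronized void publish(String change) {
        long sequence = nextSequence++;
        for (UpdateListener listener : listeners) {
            listener.onUpdate(sequence, change);
        }
    }
}
```

Note that publish() notifies each agent in turn while holding the lock, which is exactly the kind of serialization that keeps ordering simple but limits throughput as the number of agents grows.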

The problem is that the central mediator can become a bottleneck as the size of the system scales up. If there are lots of agents to be notified or lots of changes that agents have to be notified about, then the mediator may have trouble keeping up with the traffic and the agents in the system will waste a lot of time waiting for updates. Another approach would be not to use a mediator at all, and instead have a peer-to-peer system where each agent broadcasts its updates to all the other agents. While this may improve the throughput of updates, it makes it difficult to maintain consistency across the system. With each update being broadcast independently and asynchronously, it can be quite a feat to make sure that each agent ends up with the same state after all the updates have been sent, especially if the order of the updates is important.
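The ordering problem can be illustrated with two peers that receive the same updates in different orders. The sketch below is built on assumptions not found in the text: each update carries a logical timestamp and a sender name, and peers agree to apply updates sorted by timestamp, breaking ties by sender. The classes Update and Peer are hypothetical, and a real system would also need some way to know when it has received all updates up to a given point before applying them.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical update broadcast by one peer to all the others.
class Update {
    final long time;     // logical timestamp chosen by the sender (an assumption)
    final String sender; // used only to break ties between equal timestamps
    final String text;   // the change itself; order of application matters

    Update(long time, String sender, String text) {
        this.time = time;
        this.sender = sender;
        this.text = text;
    }
}

// Hypothetical peer that buffers the updates it receives.
class Peer {
    private final List<Update> received = new ArrayList<>();

    void receive(Update update) {
        received.add(update);
    }

    // Naive approach: apply updates in whatever order they happened to arrive.
    String stateInArrivalOrder() {
        return apply(received);
    }

    // Agreed approach: every peer sorts by (time, sender) before applying.
    String stateInAgreedOrder() {
        List<Update> sorted = new ArrayList<>(received);
        sorted.sort(Comparator.comparingLong((Update u) -> u.time)
                .thenComparing(u -> u.sender));
        return apply(sorted);
    }

    // Stand-in for order-sensitive state: concatenates the update texts.
    private static String apply(List<Update> updates) {
        StringBuilder state = new StringBuilder();
        for (Update u : updates) {
            state.append(u.text);
        }
        return state.toString();
    }
}
```

If two peers receive the same pair of updates in opposite orders, their arrival-order states differ, while their agreed-order states match.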