Software Reflective Memory from Citrus Controls Software Solutions for Reflective Memory

Frequently Asked Questions

Analysts and Consultants


Table of Contents

  1. Q. How do hardware reflective memory systems compare with Mirror Memory?
  2. Q. How does timing on the host system compare?
  3. Q. What about networking characteristics?
  4. Q. Are collisions a factor on networks such as EtherNet?
  5. Q. Are special compiler support or coding techniques necessary?
  6. Q. What software overhead can be expected from the system?
  7. Q. What about determinacy?
  8. Q. How easily can the system be configured?
  9. Q. Tell me more about synchronization
  10. Q. Can a node be notified when data at an address changes?
  11. Q. Tell me about atomic operations on memory.
  12. Q. Are there interfaces to 3rd party products?

Q. How do hardware reflective memory systems compare with Mirror Memory?

A. There are differences in performance at both the host and network level.  AmirusMM is not a “poor man’s” reflective memory system – the characteristics are different.

Hardware systems require special-purpose networks and additional computer hardware. In return, they provide a quasi-real-time communication path which does not interfere with other network traffic. The hardware manages most of the communication without needing CPU intervention. If memory monitoring is implemented, the hardware can generate processor interrupts which require driver intervention. Data access requires backplane bandwidth.

The AmirusMM system is implemented purely in software, and runs on any network which supports UDP/IP. Data transmission and reception, together with exception processing, consume CPU cycles. As the 'hardware' support is usually through an integrated system network controller, backplane bandwidth requirements are minimal.

Since AmirusMM is fully integrated with the host operating system, processes and threads using the memory can be properly suspended and restarted when data saturation occurs or synchronization operations need to be performed. Hardware RM, on the other hand, has fewer options available: typically, these cases are handled by either discarding the new data, delaying backplane handshakes, or generating an interrupt which must be handled via special kernel code. All these mechanisms have their drawbacks, and some solutions cause corruption or data loss.


Q. How does timing on the host system compare?

A. A hardware reflective memory system is typically mapped into peripheral I/O space, frequently on the other side of a PCI or VME bridge.  Since access to the board is by necessity through non-cached memory, the read and write timing is similar and governed by program I/O latency and whatever else is occurring on the same bus.  With AmirusMM, the mapped memory is regular memory – memory reads are satisfied from the cache if possible, whereas memory writes trigger hardware exceptions.  In short, reads execute significantly faster than on hardware systems; writes execute significantly slower.  AmirusMM is a preferred solution if reflective memory access is predominantly read oriented and data transfer rates between nodes are modest.
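
As a rough illustration of the read/write trade-off described above, the following sketch computes the average cost per access for a given mix of reads and writes, and the read fraction at which the two approaches break even. All timing figures are hypothetical placeholders chosen only to show the shape of the trade-off; they are not measurements of AmirusMM or of any hardware board.

```python
def mean_access_ns(read_fraction, read_ns, write_ns):
    """Average cost per access for a given read/write mix."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    return read_fraction * read_ns + (1.0 - read_fraction) * write_ns


def break_even_read_fraction(hw_ns, sw_read_ns, sw_write_ns):
    """Read fraction above which the software scheme is cheaper on average.

    Assumes hardware reads and writes cost the same (hw_ns), and that a
    trapped software write costs more than a hardware access.
    """
    return (sw_write_ns - hw_ns) / (sw_write_ns - sw_read_ns)


# Hypothetical figures (nanoseconds), for illustration only.
HW_ACCESS_NS = 1000.0    # uncached access across a bus bridge
SW_READ_NS = 10.0        # read satisfied from cache
SW_WRITE_NS = 20000.0    # write that triggers an exception

threshold = break_even_read_fraction(HW_ACCESS_NS, SW_READ_NS, SW_WRITE_NS)
```

With these made-up numbers the software approach only wins on average when roughly 95% or more of accesses are reads, which is why the read/write mix of the application matters so much.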


Q. What about networking characteristics?

A. Hardware systems use dedicated networks with well-characterized timing.  Communication paths are limited to ensure sufficient responsiveness, and are typically in the range 100-1000 meters. Sizes up to 10 km are possible with additional bridging components.

AmirusMM uses commercial off-the-shelf (COTS) hardware, and configuration options allow trades between scale and bandwidth to be made.  With multicast support, it can be scaled to global size.  AmirusMM works best on networks with guaranteed bandwidth, or on local dedicated networks which do not provide an explicit bandwidth guarantee (such as EtherNet).  Behavior on shared EtherNet networks is usually quite acceptable for development.

It should be noted that worst-case hardware networking latencies can be significant: error correction and variable amounts of data production on different nodes can significantly affect the throughput of both individual nodes and the overall system.  Determinism is only guaranteed when data production is well characterized, processor interrupts are limited, and network errors are rare.  In this respect, the two systems are similar.


Q. Are collisions a factor on networks such as EtherNet?

A. AmirusMM uses a modified round-robin algorithm to keep collisions on a shared network to the absolute minimum.  During regular sustained operation, each node gets ‘permission’ to talk and never collides with cooperating nodes.  Some collisions may occur when new nodes join the network, or when traffic other than AmirusMM traffic shares the same cable. And of course there will inevitably be some traffic on bridged networks as various routers exchange connectivity information and network hardware reports its status. On a well-behaved, dedicated network, this additional traffic will be very limited.
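
The idea of handing out 'permission' to talk can be sketched with a plain round-robin schedule. This is an assumption-laden toy: the FAQ does not describe how AmirusMM modifies the algorithm, so the function name, its parameters, and the fixed ring order below are all hypothetical.

```python
from collections import deque


def round_robin_schedule(nodes, pending, slots):
    """Return the order of transmissions under plain round-robin.

    nodes   -- node names in ring order
    pending -- dict mapping node name to number of packets waiting
    slots   -- number of talk permissions to hand out
    """
    ring = deque(nodes)
    remaining = dict(pending)      # copy so the caller's dict is untouched
    sent = []
    for _ in range(slots):
        node = ring[0]
        ring.rotate(-1)            # permission moves on to the next node
        if remaining.get(node, 0) > 0:
            remaining[node] -= 1
            sent.append(node)      # only the permitted node transmits
    return sent


order = round_robin_schedule(["A", "B", "C"], {"A": 2, "B": 0, "C": 1}, 6)
# order == ["A", "C", "A"]
```

Because only the node holding permission transmits in any slot, cooperating nodes never talk at the same time in this model; collisions can only come from traffic outside the schedule.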

If a node experiences a timeout, a retransmission protocol broadcasts for missing data. This, too, is designed to minimize possible network collisions.


Q. Are special compiler support or coding techniques necessary?

A. Unlike some Distributed Shared Memory (DSM) implementations, neither special compiler support nor coding techniques are necessary for an application to use AmirusMM. This is because memory writes are signaled to the support software through the computer's exception handlers. The signaling mechanism is completely transparent to the application program, and allows programs using AmirusMM to be written in any language using any compiler.
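
The effect of trapping writes can be modeled in a few lines. The sketch below is only an analogy: it uses an explicit write() method where the real system relies on a processor exception, and the class name, the outgoing list, and the record format are hypothetical rather than part of AmirusMM.

```python
class TrappedMemory:
    """Toy model of a shared area whose writes are intercepted."""

    def __init__(self, size):
        self._data = bytearray(size)
        self.outgoing = []     # (offset, bytes) records for the network task

    def read(self, offset, length):
        # Reads go straight to local memory: no trap, no bookkeeping.
        return bytes(self._data[offset:offset + length])

    def write(self, offset, payload):
        # Stands in for the exception handler: the support software sees
        # the address and length, applies the write locally, and queues
        # it for transmission to the other nodes.
        if offset < 0 or offset + len(payload) > len(self._data):
            raise IndexError("write outside the shared area")
        self._data[offset:offset + len(payload)] = payload
        self.outgoing.append((offset, bytes(payload)))


memory = TrappedMemory(16)
memory.write(4, b"ab")
```

The application-facing behavior is the point here: reads cost nothing extra, while every write passes through support code that records what changed.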

Future releases of AmirusMM may include support for lazy writes and other caching techniques. Contact Citrus Controls for current thinking and implementation schedule.


 


Q. What software overhead can be expected from the system?

A. A background task communicates with the other cooperating nodes, and its resource use reflects a trade between responsiveness and overhead.  The overhead is a function of the intelligence of the networking card and the protocol stack used to implement the network interconnections. There are also various parameters which can be changed in the configuration program to adjust this trade. Typically, quiescent CPU requirements are at the 5% level or less, even if the Ring is optimized for low latency.

Each application program using AmirusMM incurs an overhead when writing to memory.  This overhead is primarily determined by the memory write-protection exception processing time, and is processor specific.  There is no additional overhead when performing a memory read; indeed, a read will often be significantly faster than a hardware memory access because the data may be found in the cache.


Q. What about determinacy?

A. Hardware vendors assert that their systems are deterministic -- you are assured that data written into a shared memory location will be visible to the other nodes within a certain time interval. For the most part (assuming that the network is error free) this is true. But except in the simplest of systems, this is not likely to be a particularly meaningful parameter.

In a multi-tasking implementation, there may be several processes or threads attempting to write to the shared memory at the same time. The buffers which hold the data prior to transmission may be full (causing the writer to stall), or part full (so that other data gets transmitted first and you end up waiting for the next transmission interval). Interrupts from elsewhere in the computer (including timer interrupts) may have a higher priority, and it could be a significant time before the write operations you are about to start actually execute. Finally, memory monitoring interrupts may monopolize the CPU if there is extensive incoming data from elsewhere.

In general-purpose operating systems, such as Windows 2000®, critical operating system functions can cause delays of several milliseconds. Data transmission in all systems, hardware reflective memories included, can be preempted unexpectedly. For this reason, any complete system should be carefully examined prior to final implementation. AmirusMM provides a number of utilities to characterize determinism and establish throughput parameters. AmirusMM™ is well matched to general-purpose operating systems such as Windows NT® or Linux. And finally, AmirusMM is a "Try before you buy" product: it costs you nothing to test out your application.


Q. How easily can the system be configured?

A. A management tool is used to configure the system.  Up to four different contiguous memory areas may be defined, and each area may be up to 16Mbytes in size or as small as 4Kbytes.  Any number of application programs may share memory on any host machine in any combination.
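
The limits quoted above can be expressed as a simple check. This is a minimal sketch that assumes 1 Kbyte means 1024 bytes; the function and constant names are hypothetical and are not part of the management tool.

```python
MAX_AREAS = 4
MIN_AREA_BYTES = 4 * 1024            # 4 Kbytes, assuming 1 Kbyte = 1024 bytes
MAX_AREA_BYTES = 16 * 1024 * 1024    # 16 Mbytes


def validate_areas(area_sizes):
    """Return a list of problems; an empty list means the layout fits the limits."""
    problems = []
    if len(area_sizes) > MAX_AREAS:
        problems.append(
            f"{len(area_sizes)} areas defined, limit is {MAX_AREAS}")
    for index, size in enumerate(area_sizes):
        if not MIN_AREA_BYTES <= size <= MAX_AREA_BYTES:
            problems.append(
                f"area {index}: {size} bytes is outside 4 Kbytes to 16 Mbytes")
    return problems
```

For example, validate_areas([4096, 16 * 1024 * 1024]) returns an empty list, while a single 1024-byte area or a fifth area each produce one reported problem.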

Also centrally managed are the various Ring parameters which establish the communication path (such as network address and port), and timing parameters (allowing a trade between overhead and responsiveness). You further have control over the relative priorities of the various threads within the control program MMService.

All these configuration options can be programmed and set from a single location using the MMManager utility. This utility communicates over the network with any inactive node (i.e. any node without active client programs).


Q. Tell me more about synchronization

A. There are two mechanisms for synchronization. First, memory can be flagged as 'sync', which ensures that a writing thread will not proceed until all other nodes have had their memory contents written. Second, there is a locking API which allows application programs to serialize access to resources.

Synchronization in memory is achieved by suspending the writing thread and flagging the outgoing data. When the data packet containing the flagged data has been received by each other node (and written into remote node memory), the suspended thread is restarted. The writing thread is suspended after the write operation, so that the local data (observed perhaps by another process or thread on the same node) gets updated before the other nodes see the data in their memory. However, when the writing thread is reactivated, all nodes see the same value.
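
The sequence just described (update locally, suspend the writer, resume once every other node has applied the data) can be modeled with ordinary threads. This is a toy sketch in a single process: dictionaries stand in for node memory, and the names SyncWrite and sync_write are hypothetical, not AmirusMM calls.

```python
import threading


class SyncWrite:
    """Tracks one 'sync' write until every peer has applied it."""

    def __init__(self, peer_count):
        self._remaining = peer_count
        self._lock = threading.Lock()
        self._all_applied = threading.Event()
        if peer_count == 0:
            self._all_applied.set()

    def peer_applied(self):
        # Called once per remote node after it has written the data.
        with self._lock:
            self._remaining -= 1
            if self._remaining == 0:
                self._all_applied.set()

    def wait(self, timeout=None):
        # The writing thread is suspended here; other threads keep running.
        return self._all_applied.wait(timeout)


def sync_write(local, peers, key, value):
    local[key] = value                   # the local copy is updated first
    pending = SyncWrite(len(peers))
    workers = []
    for peer in peers:
        def apply(p=peer):
            p[key] = value               # the remote node writes its memory
            pending.peer_applied()
        worker = threading.Thread(target=apply)
        worker.start()
        workers.append(worker)
    done = pending.wait(timeout=5.0)     # writer resumes when all peers are done
    for worker in workers:
        worker.join()
    return done


local_node = {}
remote_nodes = [{}, {}, {}]
completed = sync_write(local_node, remote_nodes, "x", 7)
```

When sync_write returns True, every node in the model holds the same value, which mirrors the guarantee stated above for the reactivated writing thread.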

The locking API can be used to simplify complex data structure access over all machines in the system. The application designer must agree upon a protocol system-wide -- for example, modifications to structure XXX can only be made after a process acquires lock YYY. Locks can be acquired in either Shared or Exclusive mode. The system monitors the owners (both by node and by process within a node) to make sure that the locks are released on premature failure.
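
The Shared and Exclusive modes, and the cleanup after a failed owner, can be illustrated with a small lock table. This sketch runs in one process with no networking and no blocking; the class, its method names, and the (node, process) owner tuples are hypothetical and do not reflect the actual locking API.

```python
class LockTable:
    """Toy shared/exclusive lock table keyed by lock name."""

    def __init__(self):
        self._holders = {}    # lock name -> (mode, set of owners)

    def try_acquire(self, name, owner, mode):
        if mode not in ("shared", "exclusive"):
            raise ValueError("mode must be 'shared' or 'exclusive'")
        held = self._holders.get(name)
        if held is None:
            self._holders[name] = (mode, {owner})
            return True
        held_mode, owners = held
        if mode == "shared" and held_mode == "shared":
            owners.add(owner)
            return True
        return False          # any combination involving exclusive conflicts

    def release(self, name, owner):
        held = self._holders.get(name)
        if held is None:
            return
        _, owners = held
        owners.discard(owner)
        if not owners:
            del self._holders[name]

    def release_all(self, owner):
        # Models the cleanup performed when an owner fails prematurely.
        for name in list(self._holders):
            self.release(name, owner)


table = LockTable()
first = table.try_acquire("YYY", ("node1", 10), "shared")
second = table.try_acquire("YYY", ("node2", 20), "shared")
blocked = table.try_acquire("YYY", ("node3", 30), "exclusive")
```

Here two readers share lock YYY while the would-be writer is refused; once both readers release (or are cleaned up after failing), the exclusive request succeeds.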

Since AmirusMM is fully integrated with the underlying host operating system, during synchronization operations the requesting thread can be properly suspended; other operations on the same machine can proceed without compromising system integrity or performance. Hardware implementations handle synchronization either through backplane timing extension (which can compromise system reliability) or by custom device driver support (which adds to the complexity and expense of the system).




Q. Can a node be notified when data at an address changes?

A. An application program can request that changes to a particular address be reported to it.  An application FIFO is created, and the program receives this information by calling a library routine.  No drivers or other special software are required to get this information.
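
A per-application FIFO of change reports can be sketched as follows. The class and method names are hypothetical stand-ins for the library routine mentioned above, and on_write() simulates the support software observing a write.

```python
from collections import deque


class ChangeNotifier:
    """Toy FIFO of change reports for watched addresses."""

    def __init__(self):
        self._watched = set()
        self._fifo = deque()

    def watch(self, address):
        self._watched.add(address)

    def on_write(self, address, value):
        # Called by the (simulated) support software for every write.
        if address in self._watched:
            self._fifo.append((address, value))

    def next_change(self):
        # Returns the oldest unreported change, or None if nothing is queued.
        return self._fifo.popleft() if self._fifo else None


notifier = ChangeNotifier()
notifier.watch(0x100)
notifier.on_write(0x100, 1)
notifier.on_write(0x200, 9)    # not watched, so not reported
notifier.on_write(0x100, 2)
```

Changes are delivered in the order they occurred, and writes to unwatched addresses never reach the application's queue.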

The system also allows application programs to receive status change events (such as other nodes joining or leaving), as well as events signaled by error conditions.



Q. Tell me about atomic operations on memory

A. An atomic operation is one that completes without interruption. Because Mirror Memory knows how much data is being written by the application, it can write the entire data area throughout the nodes in the system as one atomic operation.

However, system architecture constraints may limit the actual transfers into memory in an implementation-dependent way. For Pentium® processors, the only guarantee is that 8 bytes will be written without interruption. In addition, this is only guaranteed if the data being written is naturally aligned.
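
The alignment condition can be checked mechanically. In this sketch, natural alignment means the address is a multiple of the operand size, and the 8-byte limit is taken from the text above; the function names are hypothetical.

```python
def is_naturally_aligned(address, size):
    """True if address is a multiple of size (size must be a power of two)."""
    if size <= 0 or size & (size - 1):
        raise ValueError("size must be a positive power of two")
    return address % size == 0


def written_without_interruption(address, size, max_atomic=8):
    """Apply the rule quoted above: aligned and no larger than max_atomic bytes."""
    return size <= max_atomic and is_naturally_aligned(address, size)
```

For instance, an 8-byte value at address 0x1000 meets the condition, the same value at 0x1004 does not, and a 16-byte value fails regardless of alignment.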


 


Q. Are there interfaces to 3rd party products?

A. With version 1.1 of AmirusMM, Citrus has introduced a MATLAB interface. This is the first of several forthcoming announcements that will add support for many common 3rd party products.

Our MATLAB interface allows a MATLAB user to create data structures in AmirusMM shared memory. This provides a simple mechanism for the import and export of data for distributed applications, real-time data analysis, and control applications using this popular interface. Data synchronization for arrays is handled automatically through the Amirus distributed lock manager.

Version 1.2 includes support for Visual Basic .NET, simplifying the interface that would otherwise need to be implemented through the .NET framework.



 

 Copyright © 2003, 2004 Citrus Controls Incorporated
 Last modified: Monday January 26, 2004