Frequently Asked Questions
Analysts and Consultants
- Q. How do hardware reflective memory systems compare with Mirror Memory?
- Q. How does timing on the host system compare?
- Q. What about networking characteristics?
- Q. Are collisions a factor on networks such as Ethernet?
- Q. Are special compiler support or coding techniques necessary?
- Q. What software overhead can be expected from the system?
- Q. What about determinacy?
- Q. How easily can the system be configured?
- Q. Tell me more about synchronization.
- Q. Can a node be notified when data at an address changes?
- Q. Tell me about atomic operations on memory.
- Q. Are there interfaces to third-party products?
A. There are differences in performance at both the
host and network level. AmirusMM™ is not a “poor man’s”
reflective memory system – the characteristics are different.
Hardware systems require special-purpose networks and
additional computer hardware. In return, they provide a quasi-real-time
communication path which does not interfere with other network traffic. The
hardware manages most of the communication without needing CPU intervention. If
memory monitoring is implemented, the hardware can generate processor interrupts
which require driver intervention. Data access requires backplane bandwidth.
The AmirusMM™ system is implemented
purely in software and runs on any network that supports UDP/IP. Data
transmission and reception, together with exception processing, consume CPU
cycles. Because the 'hardware' support usually comes from an integrated
network controller, backplane bandwidth requirements are minimal.
Since AmirusMM™ is fully integrated with the host operating system,
processes and threads using the memory can be properly suspended and restarted
when data saturation occurs or synchronization operations need to be performed.
Hardware RM, on the other hand, has fewer options available -- typically, these
cases are handled by either discarding the new data, delaying backplane
handshakes, or generating an interrupt which must be handled via special kernel
code. All these mechanisms have their drawbacks, and some solutions cause
corruption or data loss.
A. A hardware reflective memory system is typically mapped into peripheral I/O
space, frequently on the other side of a PCI or VME bridge. Since access to the
board is by necessity through non-cached memory, read and write timings are
similar, governed by programmed I/O latency and whatever else is occurring on
the same bus. With AmirusMM™, the mapped memory is regular memory – memory
reads are satisfied from the cache if possible, whereas memory writes trigger
hardware exceptions. In short, reads execute significantly faster
than on hardware systems; writes execute significantly slower.
AmirusMM™ is a preferred solution if reflective memory access is
predominantly read-oriented and data transfer rates between nodes are modest.
A. Hardware systems use dedicated networks with well-characterized timing. Communication paths are limited in length to ensure sufficient responsiveness, and are
typically in the range of 100 to 1000 meters. Distances up to 10 km are possible with
additional bridging components.
AmirusMM™ uses commercial off-the-shelf (COTS) hardware, and
configuration options allow trade-offs between scale and bandwidth. With
multicast support, it can be scaled to global size. AmirusMM™ works
best on networks with guaranteed bandwidth, or on local dedicated networks that
do not provide an explicit bandwidth guarantee (such as Ethernet). Behavior on
shared Ethernet networks is usually quite
acceptable for development.
It should be noted that worst-case hardware networking latencies can be significant –
error correction and variable amounts of data production on different nodes can
significantly affect the throughput of both individual nodes and the overall
system. Determinism is only guaranteed when data production is well
characterized, processor interrupts are limited, and network errors are rare. In this respect, the two
systems are similar.
A. AmirusMM™ uses a modified round-robin algorithm to keep collisions on a
shared network to the absolute minimum. During regular sustained operation, each
node gets ‘permission’ to talk and never collides with cooperating nodes. Some
collisions may occur when new nodes join the network, or when
traffic other than AmirusMM™ traffic shares the same cable. And of course
there will inevitably be some traffic on bridged networks as various routers
exchange connectivity information and network hardware reports its status. On a
well-behaved, dedicated network, this additional traffic will be very limited.
If a node experiences a timeout, a retransmission protocol
broadcasts a request for the missing data. This, too, is designed to minimize possible network
collisions.
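The 'permission to talk' idea can be illustrated with a small simulation. This is only a sketch of plain round-robin rotation: the modifications AmirusMM™ applies to the algorithm are not described here, and the function name, node names, and slot model are all assumptions made for illustration.

```python
from collections import deque


def round_robin_schedule(nodes, pending, slots):
    """Grant permission to transmit to one node per slot, in fixed rotation.

    nodes   -- ordered list of node names (rotation order is assumed)
    pending -- dict mapping node name to the number of packets waiting
    slots   -- number of transmission slots to simulate
    Returns a list of (slot, node) pairs, one per packet actually sent.
    """
    ring = deque(nodes)
    remaining = dict(pending)
    sent = []
    for slot in range(slots):
        holder = ring[0]              # only the current holder may transmit
        if remaining.get(holder, 0) > 0:
            remaining[holder] -= 1
            sent.append((slot, holder))
        ring.rotate(-1)               # pass permission to the next node
    return sent


# Hypothetical three-node ring: A has two packets queued, C has one.
schedule = round_robin_schedule(["A", "B", "C"], {"A": 2, "B": 0, "C": 1}, 6)
```

Because at most one node holds permission in any slot, two cooperating nodes never transmit at the same time in this model; collisions can only come from traffic outside the rotation.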
A. Unlike some Distributed Shared Memory (DSM) implementations, neither special
compiler support nor coding techniques are necessary for an application to use
AmirusMM™. This is because memory writes are signaled to
the support software through the computer's exception handlers. The signaling
mechanism is completely transparent to the application program, and allows
programs using AmirusMM™ to be written in any language using any
compiler.
Future releases of AmirusMM™ may include support for lazy writes and
other caching techniques. Contact
Citrus Controls for current thinking and implementation schedule.
A. A background task communicates with the other cooperating nodes, and the
resources it uses trade responsiveness against overhead. The overhead is a function of the
intelligence of the networking card and the protocol stack used to implement the
network interconnections. There are also various parameters which can be changed
in the configuration program to trade responsiveness for overhead. Typically,
quiescent CPU requirements are 5% or less, even if the Ring is
optimized for low latency.
Each application program using AmirusMM™ incurs an overhead when writing to
memory. This overhead is primarily determined by the memory
write-protection exception processing time, and is processor-specific. There is
no additional overhead when performing a memory read, and a read is often
significantly faster than a hardware reflective memory access because the data may be found in
the cache.
A. Hardware vendors assert that their systems are
deterministic -- you are assured that data written into a shared memory location
will be visible to the other nodes within a certain time interval. For the most
part (assuming that the network is error free) this is true. But except in the
simplest of systems, this is not likely to be a particularly meaningful
parameter.
In a multi-tasking implementation, there may be several
processes or threads attempting to write to the shared memory at the same time.
The buffers which hold the data prior to transmission may be full (causing the
writer to stall), or partly full (so that other data gets transmitted first and
you end up waiting for the next transmission interval). Interrupts from
elsewhere in the computer (including timer interrupts) may have a higher priority,
and it could be a significant time before the write operations you are about to
start actually execute. Finally, memory monitoring interrupts may monopolize
the CPU if there is extensive incoming data from elsewhere.
In general-purpose operating systems, such as Windows 2000®, critical
operating system functions can cause delays of several milliseconds. Data
transmission in all systems -- hardware reflective memories included -- can be
preempted unexpectedly. For this
reason, any complete system should be carefully examined prior to final implementation.
AmirusMM™ provides a number of utilities to characterize
determinism and establish throughput parameters. AmirusMM™ is well matched to
general-purpose operating systems such as Windows NT® or Linux. And
finally, AmirusMM is a "Try before you buy" product -- it costs you nothing to
test out your application.
A. A management tool is used to configure the system.
Up to four different contiguous memory areas may be defined, and each area may
be as small as 4 Kbytes or as large as 16 Mbytes. Any number of application
programs may share memory on any host machine in any combination.
Also centrally managed are the various Ring parameters which establish the
communication path (such as network address and port), and timing parameters
(allowing a trade between overhead and responsiveness). You also have control
over the relative priorities of the various threads within the control program,
MMService.
All these configuration options can be programmed and set from a single location
using the MMManager utility. This utility communicates over the network with any
inactive node (i.e., any node without active client programs).
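The memory area limits above can be expressed as a simple pre-flight check. The sketch below is not the MMManager interface, which is not documented here; the class, function, and constant names are hypothetical, and it assumes that Kbytes and Mbytes mean powers of two (1024 and 1024 × 1024 bytes).

```python
from dataclasses import dataclass

# Limits taken from the text; binary units are an assumption.
MAX_AREAS = 4
MIN_AREA_BYTES = 4 * 1024
MAX_AREA_BYTES = 16 * 1024 * 1024


@dataclass
class MemoryArea:
    """Hypothetical description of one contiguous shared memory area."""
    name: str
    size_bytes: int


def validate_areas(areas):
    """Return a list of problems; an empty list means the layout fits the limits."""
    problems = []
    if len(areas) > MAX_AREAS:
        problems.append(f"{len(areas)} areas defined, at most {MAX_AREAS} allowed")
    for area in areas:
        if not MIN_AREA_BYTES <= area.size_bytes <= MAX_AREA_BYTES:
            problems.append(
                f"{area.name}: {area.size_bytes} bytes is outside the 4 Kbyte to 16 Mbyte range"
            )
    return problems
```

For example, `validate_areas([MemoryArea("telemetry", 64 * 1024)])` returns an empty list, while a 1 Kbyte area or a fifth area would each produce one reported problem.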
A. There are two mechanisms for synchronization. First,
memory can be flagged as 'sync', which ensures that a writing thread will not
proceed until all other nodes have had their memory contents written. Second,
there is a locking API that allows application programs to serialize access to
resources.
Synchronization in memory is achieved by suspending the writing
thread and flagging the outgoing data. When the data packet containing the
flagged data has been received by each other node (and written into remote node
memory), the suspended thread is restarted. The writing thread is suspended
after the write operation, so that the local data (observed perhaps by another
process or thread on the same node) gets updated before the other nodes see the
data in their memory. However, when the writing thread is reactivated, all nodes
see the same value.
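The suspend-until-received behavior of 'sync' memory can be modeled as a writer waiting on acknowledgements from every other node. The sketch below is an illustration only: the class and method names are hypothetical, and the real system performs this inside its support software rather than through an API like this one.

```python
import threading


class SyncWrite:
    """Tracks which remote nodes have not yet applied a flagged write."""

    def __init__(self, remote_nodes):
        self._waiting = set(remote_nodes)
        self._lock = threading.Lock()
        self._done = threading.Event()
        if not self._waiting:
            self._done.set()          # nothing to wait for on a single node

    def acknowledge(self, node):
        """Called when a remote node has written the flagged data to its memory."""
        with self._lock:
            self._waiting.discard(node)
            if not self._waiting:
                self._done.set()      # last acknowledgement restarts the writer

    def wait(self, timeout=None):
        """Block the writing thread; returns True once every node has acknowledged."""
        return self._done.wait(timeout)
```

In this model the writer updates local memory first, creates a `SyncWrite` for the other nodes, and then calls `wait()`. That ordering matches the description above: local observers see the new value before remote nodes do, but by the time `wait()` returns, every node holds the same value.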
The locking API can be used to
simplify access to complex data structures across all machines in the system. The
application designers must agree on a system-wide protocol -- for example,
modifications to structure XXX can only be made after a process acquires lock YYY. Locks can be acquired in either
Shared or Exclusive mode. The system tracks lock owners (both by
node and by process within a node) to make sure that locks are released if an
owner fails prematurely.
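The Shared and Exclusive modes, and the cleanup of locks held by a failed owner, can be sketched with a small in-process lock table. This is not the AmirusMM™ locking API, whose function names are not given here; the class below is hypothetical, it is non-blocking and non-reentrant for simplicity, and it ignores the network distribution that the real lock manager provides.

```python
class LockTable:
    """Toy shared/exclusive lock table keyed by lock name.

    Owners are (node, process) pairs, mirroring the per-node and
    per-process ownership tracking described in the text.
    """

    def __init__(self):
        self._holders = {}            # lock name -> (mode, set of owners)

    def try_acquire(self, name, owner, mode):
        """Return True if the lock is granted; mode is 'shared' or 'exclusive'."""
        if mode not in ("shared", "exclusive"):
            raise ValueError(mode)
        held = self._holders.get(name)
        if held is None:
            self._holders[name] = (mode, {owner})
            return True
        held_mode, owners = held
        if mode == "shared" and held_mode == "shared":
            owners.add(owner)         # any number of readers may share
            return True
        return False                  # exclusive conflicts with everything

    def release(self, name, owner):
        held = self._holders.get(name)
        if held is None:
            return
        held[1].discard(owner)
        if not held[1]:
            del self._holders[name]

    def release_all(self, node, process=None):
        """Drop every lock held by a failed process, or by a whole failed node."""
        for name in list(self._holders):
            for owner in list(self._holders[name][1]):
                if owner[0] == node and process in (None, owner[1]):
                    self.release(name, owner)
```

Under the example protocol above, a process would call `try_acquire("YYY", owner, "exclusive")` before modifying structure XXX and `release("YYY", owner)` afterwards; `release_all` stands in for the automatic cleanup when a node or process fails.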
Since AmirusMM™ is fully integrated with the underlying host operating
system, during synchronization operations the requesting thread can be properly
suspended -- other operations on the same machine can proceed without
compromising system integrity or performance. Hardware implementations handle
synchronization either through backplane timing extension (which can compromise
system reliability) or by custom device driver support (which adds to the
complexity and expense of the system).
A. An application program can request that changes to a particular address be
reported to it. An application FIFO is created and this information can be
received by the program by calling a library routine. No drivers or other
special software are required to get this information.
The system also allows application programs to receive
status change events (such as other nodes joining or leaving), as well as events
signaled by error conditions.
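The address-watch mechanism can be pictured as a per-application FIFO that receives one entry for each change to a watched address. The sketch below is a single-process model with hypothetical names; the actual library routine is not named here, and whether a write that leaves the value unchanged is reported is an assumption (this model does not report it).

```python
import queue


class WatchedMemory:
    """Byte-addressed memory that queues change reports for watched addresses."""

    def __init__(self, size):
        self._data = bytearray(size)
        self._watchers = {}           # address -> list of application FIFOs

    def watch(self, address):
        """Register interest in an address and return the FIFO to read from."""
        fifo = queue.Queue()
        self._watchers.setdefault(address, []).append(fifo)
        return fifo

    def write(self, address, value):
        """Store one byte and notify watchers if the value actually changed."""
        old = self._data[address]
        self._data[address] = value
        if value != old:
            for fifo in self._watchers.get(address, []):
                fifo.put((address, old, value))
```

An application would call `watch(address)` once and then drain the returned FIFO with `get()` (blocking) or `get_nowait()` (polling), much as the library routine described above hands change reports to the program.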
A. An atomic operation is one that completes without
interruption. Because Mirror Memory knows how much data is
being written by the application, it can write the entire data area throughout
the nodes in the system as one atomic operation.
However, system architecture constraints
may limit the actual transfers into memory in an implementation-dependent way.
For Pentium® processors, the only guarantee is that
8 bytes will be written without interruption. In addition, this is only
guaranteed if the data being written is naturally aligned.
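Natural alignment means that an item's address is a multiple of its own size. The check below is a minimal sketch of the Pentium® rule stated above (at most 8 bytes, naturally aligned); the function names are hypothetical, and treating only power-of-two sizes as alignable is an assumption.

```python
MAX_ATOMIC_BYTES = 8                  # limit quoted in the text for Pentium processors


def is_naturally_aligned(address, size):
    """True if size is a power of two and address is a multiple of size."""
    if size <= 0 or size & (size - 1):
        return False
    return address % size == 0


def written_without_interruption(address, size):
    """Apply the rule from the text: at most 8 bytes and naturally aligned."""
    return size <= MAX_ATOMIC_BYTES and is_naturally_aligned(address, size)
```

For instance, an 8-byte value at address 0x1000 passes the check, while the same value at 0x1004 does not, because 0x1004 is not a multiple of 8.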
A. With version 1.1 of AmirusMM, Citrus has introduced a
MATLAB interface. This is the first of our forthcoming product announcements to
include support for many common third-party products.
Our MATLAB interface allows a MATLAB user to create data structures in
AmirusMM shared memory. This provides a simple mechanism for the input and export
of data for distributed applications, real-time data analysis, and control
applications using this popular interface. Data synchronization for arrays is
handled automatically through the Amirus distributed lock manager.
Version 1.2 includes support for Visual Basic
.NET, simplifying the interface that would otherwise need to be implemented
through the .NET framework.