MAVRIC - Extended Braitenberg Architecture Overview

A Braitenberg Architecture

Valentino Braitenberg is a neurobiologist and Professor of Cybernetics at the Max-Planck-Institute for Biological Cybernetics.  In 1984 he published Vehicles: Experiments in Synthetic Psychology (MIT Press), which has become a classic in theoretical animatics.  In this book Braitenberg describes how seemingly complex or "motivated" behavior can arise from seemingly simple information processing architectures.  Simple animatic creatures can be (and have been) built that do, indeed, demonstrate many of the predicted behaviors.

The architecture that Braitenberg described consists of simple sensors, processing elements (for what he called logic), unidirectional wires to connect elements, and motors to move the animats.  Figure 1 shows two of the simplest vehicles, one labelled 'Hate', the other 'Love'.  These two vehicles are composed of the same elements but differ in that the wires cross in 'Love'.  It is fairly easy to see how an excitatory signal (set up by the activity in the photodetectors - yellow 'eyes') could activate the motor on the opposite side (red), causing it to run faster than the proximal motor.  The vehicle therefore turns toward the light.  When the vehicle is directly facing the light source, both sensors are equally active and both motors run at the same speed.  The vehicle will first turn toward the light and then go faster until it crashes into the light.  [Actually, Braitenberg called this one hate, if I recall correctly, because it would ram the light aggressively!  I chose the nomenclature to reflect the notions of avoidance and attraction that will play a prominent role in MAVRIC.]

Figure 1.  Two Braitenberg vehicles (after Braitenberg, 1984, p. 8), one which avoids (hates) light and one which seeks (loves) it.  Two photodetectors (yellow), set at slightly divergent angles from the centerline, provide semi-directional sensing of light sources.  Each sensor provides an excitatory signal that is propagated through a unidirectional wire (black) or, more generally, through a processing element or 'neuron' (light blue).  The excitatory signal excites a motor element (red) which turns a drive wheel (dark blue).  If one motor turns more rapidly than the other, then the vehicle will turn away from the faster motor, toward the side of the slower one.
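The difference between the two wirings in Figure 1 can be illustrated with a minimal sketch.  It assumes a purely linear excitatory connection from each sensor to one motor; the function names, the `gain` parameter and the sample sensor values are hypothetical and are not taken from Braitenberg or from the MAVRIC code.

```python
def motor_speeds(left_sensor, right_sensor, crossed, gain=1.0):
    """Return (left_motor, right_motor) speeds for a two-sensor vehicle.

    Each sensor excites exactly one motor through a single wire.
    crossed=False: each sensor drives the motor on its own side.
    crossed=True:  each sensor drives the motor on the opposite side.
    The linear gain is an assumption made for illustration.
    """
    if crossed:
        return gain * right_sensor, gain * left_sensor
    return gain * left_sensor, gain * right_sensor


def turn_direction(left_motor, right_motor):
    """A two-wheeled vehicle turns toward the side of its slower motor."""
    if left_motor > right_motor:
        return "right"
    if right_motor > left_motor:
        return "left"
    return "straight"


# Hypothetical situation: the light is off to the left, so the left
# photodetector is more active than the right one.
left_sensor, right_sensor = 0.8, 0.3

crossed_turn = turn_direction(*motor_speeds(left_sensor, right_sensor, crossed=True))
uncrossed_turn = turn_direction(*motor_speeds(left_sensor, right_sensor, crossed=False))
print(crossed_turn, uncrossed_turn)
```

With crossed wires the vehicle turns toward the light; with uncrossed wires it turns away.  Nothing in the code states "seek" or "avoid"; the behavior follows from the wiring alone.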

A number of more elaborate vehicles were described by Braitenberg, including some in which the wires have special memory and adaptive properties.  Wire connections may be inhibitory as well as excitatory.  Neurons can interact with each other, forming associative networks between sensory perception and motor output.  What is important to note about this approach is that the behavior of the vehicle is not programmed into it explicitly.  In fact, for more complex wiring schemes the behavior is not even necessarily implicitly present.  Rather, behavior emerges from interactions between internal elements through internal connections (if present) and through interaction with the external environment.  If memory is involved (Braitenberg's Ergotrix and Mnemotrix wires) these behaviors can become quite complex in both space and time.  Indeed, it can be argued that the actual behavior expressed by a sufficiently complex vehicle, having associative and causal memory, is not even predictable simply from knowledge of the wiring diagram.

An Extended Braitenberg Architecture

An architecture that increases the number and kinds of sensory detectors, associative networks and motor outputs while following the basic design of the Braitenberg vehicle has been termed the Extended Braitenberg Architecture, EBA (Pfeifer & Scheier, 1999).  This architecture can be easily implemented with simple light, sound, proximity and other relatively cheap sensors, a microcontroller or embedded computer program to simulate the neurons and networking, and various drive motors for wheels and other simple effectors.  Indeed, the Braitenberg architecture has inspired a wide range of embodied agents (see above reference).  In these robots, light, sound, proximity and many additional sensory modalities have been incorporated in a single agent, giving rise to impressive behavioral outputs.  Some of these systems have included learning, in the form of neural networks implementing the associative aspects of the Mnemotrix wire.

MAVRIC EBA

Overview

MAVRIC (Mobile, Autonomous Vehicle for Research in Intelligent Control) is similarly inspired by Braitenberg's work.  We have chosen the extended Braitenberg architecture to test theories in learning and behavior.  MAVRIC has the brain of a moronic snail.  We are attempting to demonstrate that even such a simplified nervous system employs some very sophisticated mechanisms for adaptation and learning.  MAVRIC's brain has some general resemblance to a living invertebrate brain.  There are subsystems for sensory and perceptual processing, associative learning to drive choice selection, and action selection/motor control.  These subsystems are implemented with adaptrode-based (dynamic) neurons.  We have further extended this architecture by including what might be called external (to the brain) body functions such as digestion, depletion of energy reserves with on-going activity and the monitoring of "tissue damage" resulting from interactions with "painful" objects.  These functions simulate endocrine and similar, longer time-scale, processes that are needed to provide evaluative feedback for the adaptrode learning mechanism.
 


Figure 2.  MAVRIC - an embodied EBA agent.  The robot is based on an ActivMedia Pioneer I platform.  Light sensors

The adaptrode fills the role of both a Mnemotrix and an Ergotrix wire in Braitenberg's world.  The memory trace encoding ability of the adaptrode provides the same functionality as Braitenberg claimed for Mnemotrix wire, which has a high initial resistance to the flow of current, but lowers its resistance according to the Hebb rule for associative encoding.  An Ergotrix wire is a little more complicated.  It will change its resistance only if the source neuron is excited before the sink neuron.  The Ergotrix wire encodes a temporally ordered correlation (causal correlation) giving rise to the possibility that the sink neuron could become a predictor of the activation of the source neuron.  This is exactly the effect given by the associative adaptrode, which enforces a strict temporal ordering in encoding memory traces.  Employing adaptrode-based neurons in an EBA significantly extends the Braitenberg model.

Figure 3, below, shows an overview of the MAVRIC EBA.  This figure shows a number of different kinds of elements and their relationships.  Detailed breakdowns of the subsystems are given below.  The fundamental design of MAVRIC is that it is motivated to obtain (by search) energy resources.  In our experimental setup, we simulate such a resource with a specific tone sounded when MAVRIC is in contact with the source object.  As long as the tone is sounding, MAVRIC is feeding and accumulating food in its "stomach" (not shown).  The ingested food (the integral of the tone amplitude over time) accumulates in the Food input slot, where it provides intermediate-term evaluative feedback (reward) to the Seek neuron.  The Digestion process converts food into energy over a longer time scale and provides longer-term evaluative (confirmation) feedback to the Seek neuron.

In general, sensory and pre-perceptual data are made available through input slots (inslots), most of which are on the left-hand side of the figure.  Eight-bit greyscale data is recorded in each 100 msec time slice and converted to a value between 0 and 1 for processing in the neural network.  Processing proceeds from left to right.  Some perceptual information, such as where the object is located relative to the centerline of the robot, is used directly by the action selection network.  Two main associative neurons modulate which action is selected based on whether the robot has learned to seek or avoid an object.
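The conversion of an eight-bit reading to a network input can be sketched as follows.  The text states only that the result lies between 0 and 1; the linear division by 255, the function name and the sample readings are assumptions made for illustration.

```python
SLICE_MS = 100  # sampling interval stated in the text


def normalize_reading(raw):
    """Map an 8-bit sensor reading (0..255) to a float in [0.0, 1.0].

    Linear scaling is an assumption; the original system may scale
    or clip its readings differently.
    """
    if not 0 <= raw <= 255:
        raise ValueError("expected an 8-bit value in 0..255")
    return raw / 255.0


# Hypothetical raw samples, one per 100 msec slice.
readings = [0, 64, 128, 255]
inslot_values = [normalize_reading(r) for r in readings]
print(inslot_values)
```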

Figure 3. Overview of the MAVRIC EBA-based brain and external body functions.  Elongated rectangles represent input slots (see text).  Other rectangles are either reflex activity (Wander and Escape) or endocrine-mediated processes (Digestion, Disruption).

Below we examine each of the four subsystems of this architecture and provide more detailed explanation of the functions.

Sensory and Perceptual System

The perceptual subsystem is diagrammed in Figure 4.  Perception, as used here, refers to the extraction of object information from the input data stream from sensors.  Specifically, we are interested in what features are being identified and where these are relative to the centerline at the front of the robot.  Typically the robot moves in a generally forward direction, backing up only when an obstruction or pain source is detected ahead.  In this figure some perceptual preprocessing has already been accomplished by the time signals are presented to the brain through inslots.  The sonar return signals from seven detectors have been combined so as to provide information about whether an object is being touched on the right or left (or both if the object is dead ahead).  Similarly, light is detected in three ways: the mere presence of a light-emitting object is important as well as whether it is on the right or left (or both).

Figure 4.  The perceptual subsystem processes raw data from sensors to determine the location and form of objects that MAVRIC encounters.  Circles represent neurons in the brain.  Neurons having small squares on the output line (axon) are thresholded; the number near the box gives the value of the threshold.  Such neurons act as filters, forcing a kind of fuzzy AND response.  Numbers near the synapses give the adaptrode number in the neuron (see the Brain specification).  Flat termini synapses are excitatory while circular termini are inhibitory.

The four separate tones, simulating odorants for a sense of "smell", are presented directly to the network.  The direction of this information cannot be determined from the single "nose" microphone.  In the current version, odor is either present or absent.  In a future version we intend to derive direction from changing gradient information.  In this model, odor 0 represents a "food" odor a priori, hard wired into the system.  Similarly, odor 3 represents "poison".  Both of these tones/odors drive specific neurons which interact with the external body functions.  For example, the sounding of tone 0 at the same time that MAVRIC is touching some object directly ahead constitutes an episode of feeding.  The touching neuron (2), which normally signals an undesirable situation from which MAVRIC would try to escape or at least turn away, is inhibited during feeding.  The feeding neuron (16) requires both tone 0 and touch ahead (neuron 1) to become excited, but then fires at a rate proportional to the volume of the tone.
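The fuzzy AND behavior of the feeding neuron and the inhibition of the touching neuron can be sketched roughly as below.  This is not the adaptrode-based neuron model; the threshold, the inhibition gain and the choice of the food signal as the inhibitory input are hypothetical values and assumptions chosen only to show the logic described in the text.

```python
def feeding_rate(tone0_volume, touch_ahead, threshold=0.5):
    """Sketch of the feeding neuron (16) as a thresholded fuzzy AND.

    The neuron is silent unless tone 0 is sounding and the touch-ahead
    signal reaches the (hypothetical) threshold.  Once active, its output
    is proportional to the volume of the tone.
    """
    if tone0_volume > 0.0 and touch_ahead >= threshold:
        return tone0_volume
    return 0.0


def touching_output(touch_ahead, food_signal, inhibition_gain=2.0):
    """Sketch of the touching neuron (2).

    Touch excites the neuron; the food signal inhibits it.  The gain is
    an assumed value, large enough that feeding silences the neuron.
    """
    return max(0.0, touch_ahead - inhibition_gain * food_signal)


# Touching an object dead ahead while tone 0 sounds: a feeding episode.
feeding = feeding_rate(tone0_volume=0.6, touch_ahead=1.0)
touching = touching_output(touch_ahead=1.0, food_signal=feeding)
print(feeding, touching)
```

Without the tone the same contact produces no feeding and an uninhibited touching signal, which is the condition that normally leads MAVRIC to turn away.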

Pain is activated by either touching or poison, above a certain level of activity in either of these.  Pain is used to reinforce the Avoid neuron (see below) as the "punishment" signal.  The Pain neuron (17) integrates pain-causing conditions, such as the presence of poison, and is available through an outslot (see Figure 7 below).  This value is transferred by the Disruption body function to a Pain inslot (9) so that it is available as the reinforcement signal.

Associative System

MAVRIC learns to associate light and odors 1 and 2 with the perception of food or pain.  These associations give rise to the drive of seeking or avoiding behaviors.  In seeking behavior MAVRIC is drawn toward a stimulus (say light) that it has learned to associate with the acquisition of food (as described above).  The association neurons, Seek and Avoid, provide the convergence zone for all of these otherwise disparate signals.  Adaptrodes for these synapses are multi-time scale, multi-associative as described in (Mobus, 1994).  Odor 0 drives the unconditionable stimulus input for the Seek neuron.  Hence, if any of the conditionable stimuli are active a short time (approximately 1/2 second) prior to Odor 0 becoming active, they become associated with the latter in the short-term trace.  If feeding ensues, the increase in activity of the Food inslot (11) will provide rewarding evaluative feedback (R in Figure 5).  Confirming evaluative feedback, on a longer time scale, comes from the increase in energy that occurs as the Digestion body function converts food into energy.
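The temporal ordering requirement can be illustrated with a toy synapse.  This is explicitly not the adaptrode of (Mobus, 1994); it is a simplified eligibility-trace stand-in, and every class name, rate and decay constant below is an assumption.  A conditionable stimulus (CS) leaves a decaying trace; the short-term association grows only at the onset of the unconditionable stimulus (US), in proportion to the trace left by earlier CS activity; reward then moves part of the short-term strength into a longer-term value.

```python
class AssociativeSynapse:
    """Toy CS -> US association with strict temporal ordering (a sketch)."""

    def __init__(self, trace_decay=0.8, learn_rate=0.5,
                 short_decay=0.95, consolidate_rate=0.2):
        self.trace_decay = trace_decay            # per-step decay of the CS trace
        self.learn_rate = learn_rate              # short-term potentiation rate
        self.short_decay = short_decay            # per-step decay of short-term strength
        self.consolidate_rate = consolidate_rate  # reward-gated transfer rate
        self.trace = 0.0
        self.short_term = 0.0
        self.long_term = 0.0
        self._us_prev = 0.0

    def step(self, cs, us, reward=0.0):
        """Advance one 100 msec slice."""
        # Potentiate only at US onset.  At this point the trace holds CS
        # activity from earlier slices only, so the CS must precede the US.
        if us > 0.0 and self._us_prev == 0.0:
            self.short_term += self.learn_rate * self.trace * (1.0 - self.short_term)
        # Reward gates consolidation of the short-term strength.
        gap = max(0.0, self.short_term - self.long_term)
        self.long_term += self.consolidate_rate * reward * gap
        self.short_term *= self.short_decay
        self.trace = max(self.trace_decay * self.trace, cs)
        self._us_prev = us


def run(cs_seq, us_seq, reward_seq):
    """Feed three equal-length sequences through a fresh synapse."""
    synapse = AssociativeSynapse()
    for cs, us, reward in zip(cs_seq, us_seq, reward_seq):
        synapse.step(cs, us, reward)
    return synapse


# CS for 0.5 s, then US (odor 0), then reward from feeding.
forward = run([1.0] * 5 + [0.0] * 5, [0.0] * 5 + [1.0] * 5, [0.0] * 7 + [1.0] * 3)
# Reverse order: US first, CS afterwards, no reward.
backward = run([0.0] * 5 + [1.0] * 5, [1.0] * 5 + [0.0] * 5, [0.0] * 10)
print(forward.short_term, forward.long_term, backward.short_term)
```

Only the forward pairing leaves a memory trace; the reversed order leaves both strengths at zero.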

Figure 5.  The associative subsystem allows feature sets to be associated with either seeking or avoiding behavior.

Avoidance behavior is driven either by Odor 3 (poison) or by light touching (that isn't feeding).  Intermediate-term evaluative feedback comes from the perception of pain, while longer-term feedback comes from the accumulation of tissue damage (see Disruption task below).

Note in Figure 5 that this circuit is not symmetrical.  The activation of avoidance will inhibit seeking behavior.  This is what required the Touching neuron to be inhibited by the presence of Odor 0 (food).  Avoidance has priority over seeking behavior as a result.

Action Selection

The combination of perceptions and learned behavioral propensities must eventually result in actual motor responses.  In true Braitenberg fashion, MAVRIC can only change the relative speeds of its two drive motors.  The Wander task normally controls these motors (see below), but Wander takes modulating signals from what we have called the action selection circuit (Figure 6).  This circuit's primary job is to associate the perceived location of an object (right/left or ahead) with the learned behavior mode (seek or avoid) and then map these associations onto a direction control layer.

The figure is largely self-explanatory.  The arrows on the right lead first to outslots which are mapped onto the appropriate control variables in the Wander task.  These signals modulate the oscillatory output of Wander so that the robot tends more strongly in the indicated direction.  Slow (14) and Fast (15) modulate the base speed of Wander.  A sufficiently strong, extended activation of Slow will stop MAVRIC (as for feeding).
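The mapping from behavior mode and object location onto direction signals can be sketched as a small function.  The multiplicative combination, the subtraction used for inhibition and the output names are assumptions for illustration; the actual circuit is a network of neurons, shown in Figure 6.

```python
def select_direction(seek, avoid, obj_left, obj_right):
    """Combine behavior mode with object location (all inputs in [0, 1]).

    Returns hypothetical outslot values:
      turn_left, turn_right  steering signals passed on to Wander
      avoid_ahead            drive for the Escape reflex
    """
    # Avoidance inhibits seeking, so avoidance takes priority.
    seek_eff = max(0.0, seek - avoid)
    # An object sensed on both sides is treated as dead ahead.
    ahead = min(obj_left, obj_right)
    # Seek turns toward the object; avoid turns away from it.
    turn_left = min(1.0, seek_eff * obj_left + avoid * obj_right)
    turn_right = min(1.0, seek_eff * obj_right + avoid * obj_left)
    return {
        "turn_left": turn_left,
        "turn_right": turn_right,
        "avoid_ahead": avoid * ahead,
    }


seek_left = select_direction(seek=1.0, avoid=0.0, obj_left=1.0, obj_right=0.0)
avoid_left = select_direction(seek=0.0, avoid=1.0, obj_left=1.0, obj_right=0.0)
print(seek_left, avoid_left)
```

When both Seek and Avoid are fully active for the same object, the seek contribution is cancelled and only the turn away from the object remains.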

Figure 6.  Action selection is performed by a network of neurons that associate either seeking or avoiding behavior with the location of the perceived object (light/touch/tone).
 

Body Functions

Some tasks are so reflexive, or involve such long time-scale processes, that they need not be represented by neural structures in the above sense.  These tasks are covered in MAVRIC by programmatic modules (microtasks in Saphira).  They are represented in Figure 7 by larger rectangles.

Figure 7.  Body functions are tasks outside the basic brain task that simulate reflex or endocrine functions in the body of an agent.

Wander is the main motor reflex in MAVRIC.  Wander causes MAVRIC to move in a drunken sailor walk unless it is being modulated by signals from one of the outslots shown (see Mobus & Fisher, 1994).
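How modulating signals might bias an oscillatory wander can be sketched as follows.  The sine wave is only a stand-in for the oscillator described in (Mobus & Fisher, 1994), and the function signature, base speed, wobble amplitude and period are all hypothetical.

```python
import math


def wander(t, turn_left=0.0, turn_right=0.0, slow=0.0, fast=0.0,
           base_speed=0.5, wobble=0.2, period_s=4.0):
    """Return (left_motor, right_motor) speeds at time t in seconds.

    turn_left, turn_right, slow and fast stand for outslot values from
    the action selection circuit; the remaining parameters are assumed.
    """
    # Slow and Fast scale the base speed; strong Slow brings it to zero.
    speed = max(0.0, base_speed * (1.0 + fast - slow))
    # Stand-in oscillation that produces the weaving path.
    swing = wobble * math.sin(2.0 * math.pi * t / period_s)
    # Positive bias steers right (left wheel faster), negative steers left.
    bias = swing + turn_right - turn_left
    left = max(0.0, speed * (1.0 + bias))
    right = max(0.0, speed * (1.0 - bias))
    return left, right


unmodulated = wander(0.0)
steer_left = wander(0.0, turn_left=0.5)
stopped = wander(0.0, slow=1.0)
print(unmodulated, steer_left, stopped)
```

At `t = 0` the oscillation contributes nothing, so the three calls isolate the effect of each modulating signal: equal wheel speeds, a faster right wheel for a left turn, and a full stop under strong Slow.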

Escape is a programmed behavior activated by the Avoid Ahead neuron (10).  It involves an inhibition of Wander along with a reversing of the wheels, a characteristic turn of about 180 degrees and a short run forward.  After the execution of this behavior MAVRIC goes back to wandering.

MR and ML stand for Motor Right and Motor Left respectively.  These tasks simply translate the real values from the outputs of Wander (and Escape) into appropriate form for the robot's motor commands.

Digestion's main job is to remove "food" from the "stomach" while increasing the stored energy available to MAVRIC.  The use of these variables as evaluative feedback to the Seek neuron has already been covered above.  Digestion operates over a much longer time scale than the feeding or food accumulation actions.  An increase in energy means that MAVRIC was successful in finding resources.  So the energy value is used to reinforce the learning of feature associations which led to successful feeding.
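The two time scales of feedback to the Seek neuron can be sketched with a simple stomach and energy model.  The first-order digestion rate, the efficiency and activity-cost parameters and the starting values are assumptions; only the overall flow (tone integrated into food, food converted slowly to energy, increases used as feedback) comes from the text.

```python
def digest_step(stomach, energy, digest_rate=0.05, efficiency=1.0,
                activity_cost=0.0):
    """Move a fraction of the stomach contents into stored energy.

    All rate parameters are hypothetical.  activity_cost models the
    depletion of energy reserves with on-going activity.
    """
    moved = digest_rate * stomach
    new_stomach = stomach - moved
    new_energy = max(0.0, energy + efficiency * moved - activity_cost)
    return new_stomach, new_energy


def feedback(previous, current):
    """Evaluative feedback is the increase in a body variable, never negative."""
    return max(0.0, current - previous)


stomach, energy = 0.0, 1.0

# Feeding: integrate the tone amplitude over five 100 msec slices.
reward = 0.0
for _ in range(5):
    previous = stomach
    stomach += 0.6
    reward = feedback(previous, stomach)  # intermediate-term reward signal

# Digestion: a slower process that converts food into energy.
energy_before = energy
for _ in range(20):
    stomach, energy = digest_step(stomach, energy)
confirmation = feedback(energy_before, energy)  # longer-term confirmation
print(reward, stomach, energy, confirmation)
```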

Disruption's task is analogous to Digestion's, except that it operates over a much shorter time scale.  Punishment and Damage signals provide very rapid feedback to the Avoid neuron.  The learning of a painful lesson is much quicker than that of learning a positive lesson!
 

References

Mobus, G.E., "Toward a theory of learning and representing causal inferences in neural networks", in Levine, D.S. and Aparicio, M. (Eds.), Neural Networks for Knowledge Representation and Inference, Lawrence Erlbaum Associates, 1994.

Mobus, G.E. and Fisher, P.S., "Foraging Search at the Edge of Chaos", presented at the Metroplex Institute of Neurodynamics Conference on Oscillations in Neural Networks, May 1994.  This paper appears as an invited chapter (16) in Levine, D.S., Brown, V.R. and Shirey, V.T. (Eds.), Oscillations in Neural Systems, Lawrence Erlbaum Associates, Publishers, Mahwah, New Jersey.



 

This material is based upon work supported by the National Science Foundation under Grant No. IIS-9907102.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.