Introducing a New Blog Series
Rarely do I make new year's resolutions. But this year I found myself pinky-swearing to a single resolution: write a series of tractable posts on the development of software for robotics. Accordingly, I intend to introduce background, techniques, and examples for the following topics and more:
- Software Architectural Approaches for Robotics (e.g., Deliberative, Reactive, Hybrid)
- Software Design Methodologies (e.g., for Behavior Based Robotics, for Hierarchical Systems)
- Integration Patterns (Message Channels, Publish/Subscribe, RPC)
- Machine Learning and Adaptation Techniques (Neural Nets, Genetic Algorithms, Fuzzy Logic)
- Planning Algorithms (Discrete Planning, Continuous Spaces, Motion Planning)
- Simultaneous Localization and Mapping Techniques (Kalman Filters, Particle Filters)
- Tools & Frameworks (Microsoft Robotics Studio, Robot Operating System)
I'm writing this series of posts for two reasons. The first, as always, is to bounce ideas off readers as a sanity check. The second is to spark readers' interest in this rapidly evolving field. The sophistication of software algorithms within robotics is beginning to lag behind the potential capabilities of available hardware. Accordingly, if this series acts as a motivational tool to get others involved in tackling the current algorithmic challenges in robotics, then all the better.
To make this series more interesting and applicable to the development audience that reads http://devlicio.us, I will attempt, when appropriate, to focus on patterns and general techniques that could be applicable to scenarios outside of robotics. Additionally, given this series' focus on software, I'll concentrate on leveraging simulators over physical, often expensive, robotics hardware to make it easier for readers to duplicate the provided examples.
Introduction to Architectural Paradigms of Robotic Control
While "architecture" is likely one of the least definable terms in software development, it is unavoidably a topic with one of the greatest impacts on the extensibility and maintainability of an application. Indeed, a frequent cause of an application re-write is the set of architectural decisions made early in the project, or the lack thereof. One of the difficulties in instituting proper architectural practices lies in the fact that architectural decisions must be made carefully at many different levels of project development. The architectural decisions at each level should be made in the context of the project requirements in order to strike a proper balance between speed of development, scalability of development, and future maintainability of the application.
To illustrate, consider a basic eCommerce site vs. a corporate financial management application which integrates with various third party tools. There are two aspects of architecture which must be carefully considered. The first is deciding which project contexts will require architectural consideration. The second is defining the architectural approach to meet the needs of the given project context, and implementing any necessary prefactoring, accordingly.
For a basic eCommerce site, the project contexts requiring architectural consideration would likely include: appropriate separation between presentation and control, separation between control and the domain, separation between the domain and data access, and appropriate integration with a payment gateway. Perhaps the active record pattern might be chosen as the data access mechanism. If inversion of control seems overkill for such a small application, perhaps direct communications to the payment gateway via the domain objects might be chosen as the means of integration. Accordingly, judicious prefactoring would suggest that an architectural spike be created demonstrating appropriate use of the active record pattern with a selected tool along with an example of how and where integration with the payment gateway would take place.
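To make the active record choice above concrete, here is a minimal sketch of the pattern in Python, using an in-memory sqlite3 database; the `Product` table, columns, and class are invented for this example and stand in for whatever tool an architectural spike would actually evaluate:

```python
import sqlite3

class Product:
    """Toy active record: the domain object carries its own persistence."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")

    def __init__(self, name, price, id=None):
        self.id, self.name, self.price = id, name, price

    def save(self):
        # The object knows how to insert itself into its own table.
        cur = self.conn.execute(
            "INSERT INTO products (name, price) VALUES (?, ?)",
            (self.name, self.price))
        self.id = cur.lastrowid
        return self

    @classmethod
    def find(cls, id):
        # Class-level finder, as is conventional for active record.
        row = cls.conn.execute(
            "SELECT id, name, price FROM products WHERE id = ?",
            (id,)).fetchone()
        return cls(row[1], row[2], id=row[0])

product = Product("widget", 9.99).save()
found = Product.find(product.id)
```

The point of a spike like this is simply to demonstrate where persistence lives before the team commits to the pattern.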
For the more complex and demanding needs of the financial management application, architectural considerations, in addition to those taken into account for the basic eCommerce site, might include: which integration patterns might be leveraged to facilitate client integration, which messaging patterns might facilitate server-side integration with third-party financial calculators, and where inversion of control is appropriate. Perhaps RESTful services would be included to provide integration support for clients requiring such capabilities. Messaging via a publish/subscribe mechanism using a composed message processor might be selected as the ideal means for coordinating data with third-party vendors on the server side. Again, judicious prefactoring, from an architectural perspective, would include developing architectural spikes along with the foundational pieces of such an application to provide appropriate guidance for developers assisting with the project.
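A composed message processor over publish/subscribe can be sketched very compactly. The following is a minimal in-memory illustration, not a production messaging stack; the channel names, vendors, and rate values are all invented:

```python
from collections import defaultdict

class MessageBus:
    """Minimal in-memory publish/subscribe channel."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self._subscribers[topic]:
            handler(message)

class CompositeRateProcessor:
    """Composed message processor: fan a request out to each vendor's
    channel, collect the replies, and aggregate them into one answer."""
    def __init__(self, bus, vendors):
        self.bus, self.vendors, self.replies = bus, vendors, []
        for vendor in vendors:
            bus.subscribe("rates.reply." + vendor, self.replies.append)

    def request_rates(self, query):
        for vendor in self.vendors:  # splitter step
            self.bus.publish("rates.request." + vendor, query)
        return sum(self.replies) / len(self.replies)  # aggregator step

# Two fake third-party calculators listening on their request channels.
bus = MessageBus()
bus.subscribe("rates.request.acme", lambda q: bus.publish("rates.reply.acme", 4.0))
bus.subscribe("rates.request.globex", lambda q: bus.publish("rates.reply.globex", 6.0))
average = CompositeRateProcessor(bus, ["acme", "globex"]).request_rates({"amount": 100})
```

In a real system the bus would be asynchronous and the aggregator would need correlation identifiers and timeouts, but the split/route/aggregate shape is the same.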
Developing software for robotics is no different, in this respect, than developing any other application. One must decide where the key architectural decisions need to be made and then provide adequate decisions and guidance to facilitate development in accordance with those decisions. As stated, these decisions must be made in consideration of the project needs. This post focuses on one project context, the architectural context of determining the overall approach to robotic motor control. We'll delve into three major architectural approaches to motor control along with highlighting a few specific implementations.
Deliberative vs. Reactive Robotics
Certainly the best source of intelligent systems to study and emulate is nature. From the humble ant to the exalted human, evolution has honed a variety of strategies for dealing with the physical world. Accordingly, if we are to create intelligent systems, a good place to start is by emulating the behaviors and responses to stimuli demonstrated by living creatures.
In the early days of robotics, it was presumed that the most effective approach to emulating intelligence was by taking in detailed measurements provided by sensors, creating an internal representation of the world (e.g., grid-based maps), making plans based on that internal representation, and sending pre-planned commands to actuators (devices which convert energy into motion) for moving and interacting with the world. Inevitably, this approach presumes that the internal representation of the world is highly accurate, that the sensor readings are precise, and that there is enough time to carry out a plan before the world changes. In other words, this approach is highly deliberative. Obviously, the world is highly dynamic, sensor readings are sometimes erroneous, and time is often of the essence when interacting with the world. E.g., you probably wouldn't want to spend much time building an internal representation of the world if there's a Mack truck speeding towards you.
At the other extreme from deliberative is a reactive approach to interacting with the world. Taking a purely reactive approach does away with plans and internal representations of the world altogether; instead, a reactive approach uses the world as its own model and reacts accordingly. Instead of plans, reactive approaches often rely upon finite state machines to modify behavior as the world changes. Rodney Brooks turned robotics thinking upside down when he introduced this paradigm shift in the 80s. The robots he and his team produced were much faster in their reaction times, when compared to deliberative robots, and exhibited surprisingly complex behavior, appearing quite intelligent in many scenarios. But as with any extreme, a purely reactive approach had its own drawbacks; its difficulty in managing complex scenarios which demand careful planning is one such example.
The figure at right illustrates some of the differences between deliberative and reactive systems. Adapted from (Arkin, 1998).
While there are certainly pros and cons to either approach, some scenarios are clearly better suited to one or the other, while still others call for more of a hybrid approach. Let's now take a look at each approach in more detail.
One of the first successful architectural implementations on a mobile robotic platform was the well-known robot Shakey, created at SRI (then the Stanford Research Institute) in the 1960s. Of particular note was the robot's architecture, which was made up of three predominant layers: a sensing layer, a planning layer, and an execution layer. Accordingly, as information was made available by sensors, Shakey would make plans for how to react to the perceived world and send those plans on to the execution layer for low-level control of actuators. This began the architectural paradigm known as sense-plan-act (SPA). As the SPA approach matured, additional layers were introduced in a hierarchical style, typically all of which have access to a global world model. The upper layers in the hierarchy use the world model to make plans reaching into the future. After plan development completes at the highest levels, the plans are broken down into smaller commands and passed to lower levels for execution. Lower layers then further decompose the commands into more actionable tasks, which are ultimately passed on to actuators at the lowest level. In a hierarchical approach, the sensing layers participate in keeping the internal representation of the world updated. Furthermore, to better accommodate dynamic changes to the environment, lower planning layers may suggest changes to plans based on recent changes to the world model.
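One pass of the sense-plan-act cycle can be sketched in a few lines. Everything here, including the world representation and the action names, is invented purely for illustration:

```python
def sense(world):
    """Sensing layer: reduce raw readings to an internal world model."""
    return {"obstacle_ahead": world["obstacle_distance"] < 1.0}

def plan(model):
    """Planning layer: build a complete action sequence up front,
    based entirely on the internal model."""
    if model["obstacle_ahead"]:
        return ["stop", "turn_left", "forward"]
    return ["forward"]

def act(steps):
    """Execution layer: run the plan open-loop. SPA implicitly assumes
    the world holds still long enough for the plan to remain valid."""
    return [("actuator", step) for step in steps]

world = {"obstacle_distance": 0.5}
commands = act(plan(sense(world)))
```

The weakness the reactive camp later attacked is visible in the structure itself: nothing between `plan` and `act` checks whether the world still matches the model.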
The diagram at right roughly illustrates this hierarchical architectural approach. As should be noted, the sensing layers update the world model while a hierarchy of planning and execution layers formulates plans and breaks those plans down into actionable tasks. Adapted from (Kortenkamp, 1998).
The current pinnacle of hierarchical architectures may be found in the reference architecture known as 4D/RCS (4 Dimensional / Real-time Control System). 4D/RCS is the latest version of the RCS reference model architecture for intelligent systems that has been evolving for decades. With 4D/RCS, six (more or less) layers are defined for creating and decomposing plans into low level action:
- Unit/Vehicle/Mission Layer: this layer decomposes the overall plan into actions for individual modules. Plans are regenerated every 2 hours to account for changes to the world model.
- Module Layer: converts actions into tasks to be performed on specific objects and schedules the tasks, accordingly. Plans are regenerated every 10 minutes.
- Task Layer: converts actions on a single object into sequences of moves to carry out the action. Plans are regenerated every minute.
- Elemental Layer: creates plans for each move, avoiding obstacles and collisions. Plans are regenerated every 5 seconds.
- Primitive Layer: determines motion primitives (speed, trajectory) to facilitate smooth movement. Plans are regenerated every 0.5 seconds.
- Servo Layer: sends servo control commands (velocity, force) to actuators. Plans are regenerated every 0.05 seconds.
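The defining feature of this hierarchy is that each layer replans on its own timescale. The scheduling idea can be sketched as follows, with the periods taken from the list above but expressed in integer milliseconds to avoid floating-point drift:

```python
# (layer name, replanning period in milliseconds), highest layer first.
LAYERS = [
    ("unit/vehicle/mission", 7_200_000),  # every 2 hours
    ("module", 600_000),                  # every 10 minutes
    ("task", 60_000),                     # every minute
    ("elemental", 5_000),                 # every 5 seconds
    ("primitive", 500),                   # every 0.5 seconds
    ("servo", 50),                        # every 0.05 seconds
]

def layers_due(t_ms):
    """Return the layers whose replanning period elapses at time t_ms."""
    return [name for name, period_ms in LAYERS if t_ms % period_ms == 0]

soon = layers_due(50)        # only the fastest layer fires
minute = layers_due(60_000)  # the task layer and everything below it fire
```

A real 4D/RCS node would run each layer in its own control loop; this just makes the timing relationship between the layers explicit.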
At the heart of every layer are one or more 4D/RCS nodes, each of which contains a behavior generation process that accepts commands from the next higher level and issues subgoals to the behavior generation processes at lower levels. Furthermore, each node reports its task execution status up the chain for consideration in further planning. While plans are being solidified and disseminated, a sensory processing process in each node receives sensory input from lower levels, updates the world model, and collates the sensory data into larger units which it passes on to the next higher level; e.g., points get converted to lines, which get converted to surfaces, which get converted to objects, with each successive rise through the 4D/RCS layers. A visual summary of a 4D/RCS node quickly shows just how complex such a system can become. But sometimes the demands of a task require a corresponding amount of sophistication in the architectural approach. A comprehensive review of the 4D/RCS approach is given in (Albus, 2006).
The primary advantage of deliberative, hierarchical approaches such as 4D/RCS is that competent plans for managing complex scenarios can be generated and broken down into smaller chunks for lower levels to execute. The obvious disadvantage comes in the form of much added complexity to the overall system, along with a penalty to the speed at which the system can react to a changing environment.
In the 1980s, Rodney Brooks, in an effort to overcome some of the limitations of sense-plan-act that Shakey and other such robots exhibited, introduced the concept of reactive control. "Simply put, reactive control is a technique for tightly coupling perception and action, typically in the context of motor behaviors, to produce timely robotic response in dynamic and unstructured worlds." (Arkin, 1998). In other words, by eliminating the reliance on maintaining an internal world model and avoiding spending large amounts of time generating plans, simple responses can be executed in reaction to specific stimuli, thus exhibiting behavior similar to that of living organisms. If there was any doubt about what Brooks was implying, he boldly titled one of his works Planning is Just a Way of Avoiding Figuring Out What to Do Next. This paradigm shift away from planning turned sense-plan-act into a simpler sense-act.
Accordingly, in a sense-act paradigm, the primary focus is on carefully defining behaviors and the environmental stimuli which should invoke those behaviors. This focus is deeply rooted in the idea of behaviorism; from a behaviorist perspective, behavior is defined in terms of stimulus and response by observing what an organism does. This approach of developing robotics around such behaviors, unsurprisingly enough, is commonly referred to as Behavior Based Robotics. By leveraging simple state machines, which we'll examine below, to define which behaviors are active for given stimuli, the overall architectural complexity is greatly reduced while giving rise to responsive, seemingly intelligent behaviors. (The question of whether or not the robot is truly intelligent and/or self-aware is a debate that I'll leave to others such as Searle and Dennett.) While the reactive approach took the spotlight for a number of years, and is still very appropriate in some cases, its limitations in managing complex scenarios and performing sophisticated planning showed that it was not a panacea for all situations.
The diagram at right clarifies that reactive systems remove planning and deal with sensory input within the context of each behavior. I.e., "Behavior 1" has no knowledge of the feedback coming from "Sensor 3." Adapted from (Kortenkamp, 1998).
A Design Methodology for Reactive Systems
Due to the simple nature of reactive systems, the corresponding design methodology for developing them is rather straightforward (Kortenkamp, 1998):
- Choose the tasks to be performed. E.g., find kitchen.
- Break down the tasks in terms of specific motor-behaviors necessary to accomplish each task. E.g., find wall, follow wall, recognize kitchen.
- For each action, find a condition under which each action should be taken. E.g., wall found.
- For each condition, find a set of perceptual measurements that are sufficient to determine the truth of the condition. E.g., long straight line connected to obstacle.
- For each measurement, design a custom sensory process to compute it. E.g., bumper sensor activated, edge length over threshold found.
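The steps above can be sketched as a small finite state machine for the find-kitchen example. The behavior names, conditions, and percept dictionary are all hypothetical, following the task breakdown in the list:

```python
# (current behavior, condition) -> next behavior
TRANSITIONS = {
    ("find_wall", "wall_found"): "follow_wall",
    ("follow_wall", "kitchen_recognized"): "done",
}

def step(state, percepts):
    """Advance the machine on the first transition whose condition holds;
    otherwise keep executing the current behavior."""
    for (src, condition), dst in TRANSITIONS.items():
        if src == state and percepts.get(condition):
            return dst
    return state

state = "find_wall"
state = step(state, {"wall_found": True})        # bumper/edge sensors fired
mid_state = state
state = step(state, {"kitchen_recognized": True})
```

Each condition would in practice be computed by the custom sensory processes from the last step of the methodology.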
When implemented, we find that the defined behaviors can be realized as states within a finite state machine and that found conditions act as the mechanism for changing from one behavior state to another. What's missing is an arbitration mechanism to determine which behavior wins out in light of competing conditions. Let's briefly look at implementation approaches to reactive systems along with how such arbitration is achieved.
Subsumption & Motor Schemas
A key concern with reactive, behavior based robotics is determining which behavior should take precedence if conditions exist to activate two or more behaviors. An arbitration mechanism needs to be introduced to resolve such competing situations. Brooks dealt with this issue by proposing an architecture known as subsumption. Simply enough, a subsumption architecture still uses a finite state machine to codify behaviors and transitions, but introduces behavioral layers which can provide input to other layers and override the behaviors of lower layers. The advantage lies in facilitating more sophisticated behaviors as a sum of "lesser" behaviors. It's better understood by reviewing the following example from (Arkin, 1998):
In the example, there are three behavioral layers: Avoid-Objects, Explore, and Back-Out-of-Tight-Situations. The Avoid-Objects Layer's responsibility is to avoid objects by moving away from any perceived obstacles. The Explore Layer's responsibility is to cover large areas of space in the absence of obstacles. The highest level assists the robot in getting out of tight spaces if the lower layers are unable to do so. Each discrete behavior can invoke transitions to other behaviors and/or provide input or advice to subsequent behaviors based on perceived conditions. The "trick" of subsumption is that higher levels can suppress commands between lower-level behaviors; consequently, higher layers are able to handle more complex scenarios by manipulating lower-level behaviors in addition to their own. To illustrate suppression, note in the above example that the "Reverse" behavior in the Back-Out-of-Tight-Situations Layer suppresses any commands that the "Forward" behavior is sending to the actuators. By doing so, complex behaviors may emerge from a number of basic behaviors and a relatively simple architectural approach.
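The suppression mechanism can be reduced to a few lines: each layer proposes a motor command (or stays silent), and a higher layer's output overrides everything beneath it. The layer logic and percept names below are simplified placeholders for the Arkin example:

```python
def explore(percepts):
    """Lowest layer: always willing to wander in open space."""
    return "wander"

def avoid_objects(percepts):
    """Middle layer: speaks up only when an obstacle is perceived."""
    return "turn_away" if percepts.get("obstacle_near") else None

def back_out_of_tight_situations(percepts):
    """Highest layer: takes over when the robot is stuck."""
    return "reverse" if percepts.get("stuck") else None

LAYERS = [explore, avoid_objects, back_out_of_tight_situations]  # low -> high

def arbitrate(percepts):
    """Later (higher) layers suppress earlier (lower) ones."""
    command = None
    for layer in LAYERS:
        proposal = layer(percepts)
        if proposal is not None:
            command = proposal  # suppression: overwrite the lower layer
    return command

open_space = arbitrate({})
blocked = arbitrate({"obstacle_near": True})
cornered = arbitrate({"obstacle_near": True, "stuck": True})
```

Note that no layer knows about any other; the precedence lives entirely in the ordering of `LAYERS`, which is exactly the hierarchy that becomes hard to decide on as systems grow.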
A major challenge in using a subsumption architecture is deciding on the appropriate hierarchy of behavioral layers. In more complex scenarios, it quickly gets sticky deciding which layer should have the right to suppress which other layers. (Otherwise known as spaghetti-prone.) Additionally, since the layers have bi-directional dependencies amongst each other in order to provide input and suppression, changes to one layer can have large impacts on other layers, often resulting in shotgun surgery with any change. Ronald Arkin introduced a subsequent architecture to address these and other concerns with an approach known as Motor Schema based control, which does away with arbitrating competing behaviors. Each active behavior in a Motor Schema based control system calculates a vector to be carried out by an actuator (e.g., wheels or an arm). Using vector addition, a final vector for each actuator is computed and sent for execution. With this approach, the outputs of behaviors are combined instead of arbitrated. This avoids the need to determine suppression hierarchies and makes for a more extensible application, within the limits of behavior based robotics.
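The vector-summation idea can be sketched directly. The two behaviors and the repulsion gain below are invented for illustration; real motor schemas typically normalize and weight the vectors more carefully:

```python
def move_to_goal(pose, goal):
    """Attractive vector pointing from the robot toward the goal."""
    return (goal[0] - pose[0], goal[1] - pose[1])

def avoid_obstacle(pose, obstacle, gain=2.0):
    """Repulsive vector pointing away from the obstacle."""
    return (gain * (pose[0] - obstacle[0]), gain * (pose[1] - obstacle[1]))

def combine(vectors):
    """Vector addition replaces arbitration: every behavior contributes."""
    return (sum(v[0] for v in vectors), sum(v[1] for v in vectors))

pose, goal, obstacle = (0.0, 0.0), (4.0, 0.0), (1.0, 1.0)
command = combine([move_to_goal(pose, goal),
                   avoid_obstacle(pose, obstacle)])
```

Adding a new behavior means adding one more vector to the sum; no existing behavior has to be told it can be suppressed, which is the extensibility win over subsumption.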
Most modern architectural approaches to robotics control attempt to combine the planning capabilities of deliberative systems with the responsiveness of reactive systems. Appropriately enough, this is referred to as a hybrid style. While the deliberative approach takes a sense-plan-act perspective and the reactive approach follows sense-act, a hybrid approach typically takes the form of plan, sense-act. So while the sense-act layers carry out behaviors, the deliberative planning layer can observe the progress of reactive behaviors and suggest direction based on reasoning, planning, and problem-solving. The diagram at right crudely demonstrates this combination of the two approaches. Adapted from (Kortenkamp, 1998).
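The plan, sense-act split can be illustrated with a tiny sketch: a slow deliberative planner hands waypoints to a fast reactive loop, which still dodges obstacles on its own. All names and the straight-line "planner" are simplifications invented for this example:

```python
def deliberative_plan(start, goal):
    """Slow, global reasoning: here, just a straight-line waypoint list."""
    midpoint = ((start[0] + goal[0]) / 2, (start[1] + goal[1]) / 2)
    return [start, midpoint, goal]

def reactive_step(pose, waypoint, obstacle_near):
    """Fast sense-act loop: reflexes win; the plan is only a suggestion."""
    if obstacle_near:
        return "dodge"
    return "move_toward:%s,%s" % waypoint

waypoints = deliberative_plan((0, 0), (10, 0))
calm = reactive_step((0, 0), waypoints[1], obstacle_near=False)
startled = reactive_step((0, 0), waypoints[1], obstacle_near=True)
```

The key property is that the reactive layer never blocks waiting on the planner; the planner's output merely biases what the reflexes do next.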
Three-Layer Architecture (3T)
James Firby's thesis proposing Reactive Action Packages (RAPs) (Firby, 1989) provided a solid approach for integrating deliberative planning with reactive behaviors in the form of a three-layer architecture. A multitude of subsequent architectures have emulated a similar approach, which has come to be known as the 3T architecture. From Erann Gat's essay On Three-Layer Architectures (Kortenkamp, 1998):
Three-layer architectures organize algorithms according to whether they contain no state, contain state reflecting memories about the past, or contain state reflecting predictions about the future. Stateless sensor-based algorithms inhabit the control component. Algorithms that contain memory about the past inhabit the sequencer. Algorithms that make predictions about the future inhabit the deliberator. ... In 3T the components are called the skill layer, the sequencing layer, and the planning [deliberative] layer.
As implied by its being a hybrid approach, the 3T architecture is not mutually exclusive with behavior based robotics. Indeed, the skill layer itself is made up of unique behaviors resembling those of reactive systems. The sequencing layer then uses the world model to determine when a change in behavior is required. The planning layer can then monitor the lower layers and perform more deliberative, time-consuming processes such as path planning. The primary variation amongst 3T implementations lies in whether the planning layer "runs the show" or the sequencing layer takes command and invokes the planning layer only when needed. Atlantis is one example that leaves primary control to the sequencing layer, which calls on the planning layer as needed. Other examples of 3T implementations include Bonasso's Integrating Reaction Plans and Layered Competences through Synchronous Control and Stanford's Junior, which took 2nd place in DARPA's Urban Challenge.
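Gat's state-based division of the three layers can be made concrete with a toy sketch: the skill is stateless, the sequencer remembers the past, and the deliberator reasons about the future. Every name here is hypothetical:

```python
def skill_forward(percepts):
    """Control component: stateless, purely sensor-driven."""
    return "stop" if percepts["blocked"] else "forward"

class Sequencer:
    """Sequencing layer: carries memory about the past (rooms visited)
    and decides which skill to activate next."""
    def __init__(self):
        self.visited = set()

    def choose_skill(self, room):
        self.visited.add(room)
        return skill_forward

class Deliberator:
    """Planning layer: makes predictions about the future, e.g. which
    rooms remain to be explored."""
    def plan_route(self, rooms, visited):
        return [room for room in rooms if room not in visited]

seq = Sequencer()
skill = seq.choose_skill("hallway")
command = skill({"blocked": False})
route = Deliberator().plan_route(["hallway", "kitchen"], seq.visited)
```

The division by state (none / past / future) is what keeps each layer testable on its own, which is much of the appeal of 3T-style designs.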
The intention of this article has been to provide an introductory overview of paradigms of robotic control including deliberative/hierarchical systems, reactive systems, and hybrid systems. While each approach is appropriate in select contexts, a hybrid architecture is very adaptable for accommodating a large variety of robotic control scenarios with sufficient planning and reactive capabilities. The interested reader is encouraged to dig further into the references and links provided to learn more about these various approaches and design methodologies for implementation.
In the next post, we will examine messaging strategies to facilitate the communications amongst the layers and components of robotic systems.
Albus, J., Madhavan, R., Messina, E. 2006. Intelligent Vehicle Systems: A 4D/RCS Approach.
Arkin, R. 1998. Behavior Based Robotics.
Firby, R. J. 1989. Adaptive Execution in Complex Dynamic Worlds.
Kortenkamp, D., Bonasso, R., Murphy, R. 1998. Artificial Intelligence and Mobile Robots.