This chapter briefly describes the overall architectural requirements for autonomous systems. It then details how implicit perceptual symbols emerge in the brain, presents supporting evidence from neuroscience, and introduces mathematical methods that can generate implicit symbols from perceptual information.
Introduction
When we look at a scene, we neither compute a precise 3D model of it nor are we able to describe everything we see in human language. Yet we understand what we see. More precisely, we understand the spatial and temporal order of what we see, which makes up our visual scene. We can recognize and name some objects in the scene using our acute foveal vision, and we understand what we have to do about it. But what does such “understanding” actually mean?
Neither perceptual nor cognitive processes form a linear bottom-up sequence. For instance, recognizing some objects may help you identify the order of things in the visual scene. Conversely, once you have an idea of what kind of scene you are observing, it becomes easier to recognize or identify objects depending on where they are. The overall process thus resembles filling in a puzzle.
When we fill in this puzzle, we do not start from scratch every time. Like any other puzzle, this one requires some prior knowledge, without which the problem cannot be solved.
Solving the problem of vision requires knowledge of the order in the world. We cannot estimate distances and timings precisely, which means that no precise 3D model is ever reconstructed in the brain. But we can always say that one object is closer to us than another, or that one event occurred earlier than another. This means that we understand the relational order of the world, which is the subject of a new relational topology.
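The idea that ordinal relations can be enough, without any metric distances, can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the relational topology referred to above: the function name `depth_order` and the object names are hypothetical, and the only input is a set of pairwise "closer than" observations. The standard-library `graphlib` module (Python 3.9 or later) is used to derive a consistent nearest-first order.

```python
from graphlib import TopologicalSorter


def depth_order(closer_than):
    """Return objects ordered nearest-first from pairwise ordinal facts.

    closer_than: iterable of (near, far) pairs, each meaning that `near`
    is closer to the observer than `far`. No distances are involved.
    (Hypothetical helper, introduced only for this illustration.)
    """
    predecessors = {}
    for near, far in closer_than:
        predecessors.setdefault(near, set())
        # `near` must come before `far` in the resulting order.
        predecessors.setdefault(far, set()).add(near)
    # Raises graphlib.CycleError if the observations contradict each other.
    return list(TopologicalSorter(predecessors).static_order())


# Invented observations: purely relational, no measurements.
observations = [("cup", "table"), ("table", "wall"), ("cup", "wall")]
order = depth_order(observations)
print(order)
```

The sketch recovers a depth ordering of the scene from relations alone, and contradictory observations (for example, A closer than B and B closer than A) are detected as a cycle rather than silently accepted.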
This order is determined both by the physical laws of the world and by the ways in which humans tend to organize information. For instance, humans frequently organize things that are close to one another into categories and hierarchies.
There is evidence from cognitive neuroscience that perceptual cortical areas are wired so that one third of the projections are feed-forward and two thirds are backward. This indicates that some context system exists and strongly affects human recognition capabilities.
The brain processes information hierarchically. The lower levels hold perceptual features, while the top levels hold cognitive models used for decision making.
The order in which this puzzle is filled in may also be crucial for survival. Some objects or parts of the visual scene may require an immediate reaction. In the general case, however, visual information is used mostly for planning, first on a tactical level and then on a strategic level. This conscious or subconscious planning dictates the search order in the visual scene.
Extracted information is used for tactical decision making and strategic planning. Tactical decision making in the real world is a rapid process because real life rarely gives a second chance. A successful tactical move may not be sufficient if the overall strategy fails in the following steps, so we also need to extract more information for strategic planning. This is where human hunters have always had the advantage over animals.
All these processes interact and involve higher-level cognitive knowledge that serves as a context system for recognition. This helps us resolve ambiguity in the recognition of real-world images: reliable recognition frequently requires a context, which decreases uncertainty.
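One simple way to picture how context can decrease uncertainty is a Bayesian update, sketched below in Python. This is an assumption made for illustration, not a method stated in the chapter: the labels, the likelihood values, and the context prior are all invented, and the function names `posterior` and `entropy_bits` are hypothetical. Uncertainty is measured as Shannon entropy in bits.

```python
import math


def posterior(likelihood, prior):
    """Combine appearance likelihoods with a context prior (Bayes' rule)."""
    unnormalized = {label: likelihood[label] * prior[label] for label in likelihood}
    total = sum(unnormalized.values())
    return {label: value / total for label, value in unnormalized.items()}


def entropy_bits(dist):
    """Shannon entropy of a discrete distribution, in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


# Invented numbers: a blurry dark rectangle looks equally like a phone or a remote.
appearance = {"phone": 0.4, "remote": 0.4, "stapler": 0.2}

# No context: every label is equally expected.
no_context = {label: 1 / 3 for label in appearance}

# Assumed context "office desk": remotes are unlikely there.
office_context = {"phone": 0.6, "remote": 0.05, "stapler": 0.35}

without_context = posterior(appearance, no_context)
with_context = posterior(appearance, office_context)

print(f"entropy without context: {entropy_bits(without_context):.2f} bits")
print(f"entropy with context:    {entropy_bits(with_context):.2f} bits")
```

With these particular numbers the context prior breaks the tie between the two equally plausible labels and lowers the entropy of the result. A misleading context could just as well increase it, which is one reason the reliability of the context system matters.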
Vision is not just about recognition, although recognition is an important part of it. Vision is only one part of a larger system that allows us to build and maintain a reliable model of the surrounding environment. It is part of a situation awareness system that is essential for survival, decision making, planning, and reaching our goals on tactical and strategic levels. All of this serves as a top-level context system, which significantly affects what we need to search for and recognize in the sensor stream.
Human senses face the same processing-load problems as any other information processing system. Certain information has to be identified and processed extremely quickly, while other information can safely be ignored. This is where our context system comes into play and affects how the senses process information.
There is another important observation: solving human problems may require a human level of intelligence. Animals are excellent at navigating and moving their bodies in the surrounding world. But even if an animal brain were able to drive a car, it still could not perform the tasks that a real human driver does, because a driver’s goals and tasks on the route are not only about driving.