Across all sectors of human innovation, the line between theory and practicality is constantly pushed to its limits. Computer vision (CV), the discipline of image-based decision making, is no exception to this struggle. At its core, CV research aims to create an artificial abstraction of human sight, which lends itself to powerful applications in emerging technologies such as automated vehicles and advanced medical imaging. Though considerable strides have been made in recent years, an important bottleneck remains: machines cannot extract meaning from images the way a human can. For instance, a person can look at the image below and understand that the row of houses sits in front of the metropolitan area in the background, whereas a machine cannot make such an observation on its own.
Until recently, researchers and developers addressed this, along with many other issues, through increasingly sophisticated image processing. While this approach was effective in theory, it did not carry over well to real-time systems (i.e. applications that require low-latency, high-throughput results). The fact of the matter is that processing full frames for AI applications is computationally expensive, cluttered with irrelevant information, and inconsistent in quality across time and location. From that insight came an important paradigm shift: event-based vision. Rather than processing every pixel in a given frame, event-based vision looks only at pixels that exhibit a change in light intensity (i.e. events). Once these events are obtained, they can be fed into artificial neural frameworks (e.g. a convolutional neural network) to drive a variety of complex decisions. By framing CV problems this way, real-time developers can reduce energy and bandwidth costs through the data’s sparsity, while still achieving minimal error in their output inferences. Furthermore, event-based data is far less sensitive to environmental noise, making it a robust option for handling diverse situations.
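The event-generation rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a model of any specific sensor: the 0.2 log-intensity threshold, the frame format, and the (row, col, polarity) event tuple are all assumptions made for the example.

```python
import numpy as np

def frames_to_events(prev_frame, curr_frame, threshold=0.2):
    """Emit events wherever log intensity changes beyond a threshold.

    Returns (row, col, polarity) tuples: polarity +1 for a brightness
    increase, -1 for a decrease. Unchanged pixels emit nothing, which is
    the source of the bandwidth savings described above.
    """
    eps = 1e-6  # avoid log(0)
    # Event sensors respond to relative (logarithmic) brightness change.
    diff = np.log(curr_frame + eps) - np.log(prev_frame + eps)
    rows, cols = np.where(np.abs(diff) > threshold)
    polarity = np.sign(diff[rows, cols]).astype(int)
    return list(zip(rows.tolist(), cols.tolist(), polarity.tolist()))

# A static scene produces no events; only the changed pixel fires.
a = np.full((4, 4), 0.5)
b = a.copy()
b[1, 2] = 1.0  # brightness jump at a single pixel
events = frames_to_events(a, b)
```

Note how the output is a sparse list rather than a dense frame: a mostly static scene yields almost no data to transmit or process.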
While event-based vision opens up many exciting possibilities, its neurologically inspired processing characteristics are difficult to implement in existing CMOS architectures. Because of its massively parallel computation requirements and irregular flow of information, event-based vision is shifting toward neuromorphic computing to streamline contextual extraction in an efficient and time-effective manner.
What is Neuromorphic Computing?
Neuromorphic computing refers to the use of very-large-scale integration (VLSI) systems to mimic the processes of the human nervous system. Before examining how this technology works, it is important to understand why it is needed.
Looking at the modern-day Von Neumann computer architecture, several limiting characteristics stand out. First, the physical separation between the CPU and memory introduces a scalability issue for systems with advanced perceptive needs. One of the most important concepts in neurological processing is error minimization, which involves many iterations of checking predictions against ground truth and back-propagating the differences. At the system level, this self-check characteristic translates to frequent data shuttling between the CPU and memory, which produces a severe temporal bottleneck. Another important limitation exists at the device level. The primary components of Von Neumann machines are conventional circuit elements such as transistors, resistors, and capacitors. While on their own they provide useful advantages such as speed and energy efficiency, they are limited in various performance metrics and offer little fault tolerance. With these system- and device-level limitations in mind, researchers are developing neuromorphic computing to ensure non-redundant information flow and robust system health for advanced developments.
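To make the data-shuttling point concrete, the error-minimization loop described above can be sketched for a toy one-weight model. The model, learning rate, and data here are invented for illustration; the takeaway is that every iteration re-reads and re-writes state that, on a Von Neumann machine, lives in memory physically separated from the processor.

```python
def minimize_error(weight, inputs, targets, lr=0.1, epochs=50):
    """Minimal error-minimization loop for a one-weight linear model.

    Each pass compares a prediction against ground truth and nudges the
    weight by the back-propagated error. On a Von Neumann machine, every
    one of these iterations shuttles the weight and the data between
    memory and the CPU -- the temporal bottleneck described above.
    """
    for _ in range(epochs):
        for x, y in zip(inputs, targets):
            prediction = weight * x          # forward pass
            error = prediction - y           # compare against ground truth
            weight -= lr * error * x         # back-propagate the difference
    return weight

# Learn weight ~2.0 from samples of y = 2x.
learned = minimize_error(0.0, inputs=[1.0, 2.0, 3.0], targets=[2.0, 4.0, 6.0])
```

A real network repeats this pattern across millions of weights, which is why co-locating memory and computation, as neuromorphic designs do, pays off.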
With the need for neuromorphic computing established, the next step is to understand its intuition from a technical standpoint. Just as the contemporary computer is composed of standard digital and analog circuit elements (i.e. transistors, resistors, capacitors, inductors), neuromorphic computing does not deviate from this foundation. Its real distinction as an emerging technology is how it manipulates those same building blocks to emulate the complex processes of the brain and its sensing constituents.
Rather than covering every aspect of neuromorphic computing, a simple example will be illustrated and explained, after which the reader should have the insight needed to extrapolate its functionality to large-scale applications. To begin, consider the diagram below:
The image on the left represents the anatomy of a neuron, the most fundamental building block of the human brain. Its job is to transport information, in the form of electrical energy, to other neurons to facilitate memory, critical thinking, emotions, and more. This process has three parts. First, the neuron’s dendrites receive a discrete electric potential (i.e. a voltage spike) from another neuron. Second, depending on that value, the neuron’s soma computes the output voltage it will deliver. Third, the soma transmits the calculated output voltage down its axon and across its synapses to be received by the next neuron. The image on the right provides a graphical representation of this output signal in the form of a hypothetical 100 mV spike.
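The three-step process above is often modeled in software as a leaky integrate-and-fire neuron. The sketch below is one simple, hypothetical discretization: the threshold, the leak factor, and the 100 mV spike height (chosen to match the figure) are illustrative assumptions, not values from any particular hardware.

```python
def lif_neuron(input_current, threshold=1.0, leak=0.9, spike_mv=100.0):
    """Simulate a leaky integrate-and-fire neuron over discrete time steps.

    The membrane potential integrates incoming current at the dendrites,
    leaks toward zero each step, and when the threshold is crossed the
    soma emits a fixed-height spike (here 100 mV, matching the figure)
    down the axon, after which the potential resets.
    """
    potential = 0.0
    spikes = []
    for current in input_current:
        potential = leak * potential + current  # integrate with leak
        if potential >= threshold:
            spikes.append(spike_mv)  # soma fires a spike down the axon
            potential = 0.0          # reset after firing
        else:
            spikes.append(0.0)       # sub-threshold: no output
    return spikes

# A steady sub-threshold input accumulates until the neuron fires once.
spikes = lif_neuron([0.4] * 5)
```

The key property is that output is all-or-nothing: information is carried by *when* spikes occur, not by a continuous output value.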
The figure below takes the same intuition from figure 3 and represents it as a mixed-signal application. Scaled to billions of such topologies, a design like this has the potential to grow in tremendous and unforeseen ways. And through advanced control strategies, each neuron can be adaptively monitored, and potentially rerouted, to prevent system-wide failure.
Combining Neuromorphic Computing with Event-based Vision
Though complete replication of human sight is still in its infancy, the ideology of neuromorphic computing has inspired a new era of sensing technology in the space of event-based vision. One of the more notable advancements has come from event-based cameras: devices that sense incident light and report only the strongest pixel-level intensity changes, analogous to how the human retina detects and reports environmental stimuli to the brain. Though a number of research efforts have begun to exploit this principle, one recent example is a 2018 IEEE publication (https://arxiv.org/pdf/1804.01310.pdf) explaining how event-based cameras, combined with a convolutional neural network, can produce highly accurate steering angle predictions for automated vehicles. Though the algorithmic portion of this example ran on a contemporary computer architecture, parallel developments in commercial neuromorphic chips, such as Intel’s Loihi and IBM’s TrueNorth, give hope for end-to-end solutions in the years to come.
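While the referenced paper's full pipeline is beyond a short example, its front end rests on a simple idea: accumulating a time window of camera events into per-polarity channels that a standard CNN can consume. The sketch below assumes a (row, col, polarity) event format and is a rough illustration of that accumulation step, not the paper's actual implementation.

```python
import numpy as np

def accumulate_events(events, height, width):
    """Accumulate a window of events into a two-channel image.

    Positive and negative polarities are histogrammed into separate
    channels, turning a sparse event stream into a dense tensor a
    conventional CNN can take as input. The (row, col, polarity) event
    format is an assumption for this sketch.
    """
    frame = np.zeros((2, height, width), dtype=np.float32)
    for row, col, polarity in events:
        channel = 0 if polarity > 0 else 1  # channel 0: ON, channel 1: OFF
        frame[channel, row, col] += 1.0
    # Normalize so the downstream network sees a consistent input range.
    peak = frame.max()
    if peak > 0:
        frame /= peak
    return frame

# Two ON events at one pixel, one OFF event at another.
events = [(0, 0, 1), (0, 0, 1), (2, 3, -1)]
frame = accumulate_events(events, height=4, width=4)
```

Once events are packed into this dense form, any off-the-shelf convolutional architecture can regress a steering angle from it.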
- Tang, Tianqi, et al. “Spiking Neural Network with RRAM: Can We Use It for Real-World Application?” Design, Automation & Test in Europe (DATE), 2015, pp. 860–865, doi:10.7873/DATE.2015.1085.
- Maqueda, Ana I., et al. “Event-Based Vision Meets Deep Learning on Steering Prediction for Self-Driving Cars.” IEEE, 2018, arxiv.org/pdf/1804.01310.pdf.
- “Spiking Neural Network with RRAM: Can We Use It for Real-World Application?” Scientific figure on ResearchGate. Available from: https://www.researchgate.net/Circuit-of-the-spike-neuron_fig3_300717563
- Poon, Chi-Sang, and Kuan Zhou. “Neuromorphic Silicon Neurons and Large-Scale Neural Networks: Challenges and Opportunities.” Frontiers in Neuroscience, vol. 5, 2011, doi:10.3389/fnins.2011.00108.
- Ahn, Byungik. “Neuron-like Digital Hardware Architecture for Large-Scale Neuromorphic Computing.” 2015 International Joint Conference on Neural Networks (IJCNN), 2015, doi:10.1109/ijcnn.2015.7280724.
- “San Francisco 2018: Best of San Francisco, CA Tourism.” TripAdvisor, www.tripadvisor.com/Tourism-g60713-San_Francisco_California-Vacations.html.