Traditional cameras have only captured video; they have not been able to contextualize what they saw or recorded.
That’s about to change. Computer vision (CV) uses artificial intelligence (AI) to convert video streams into meaningful information. Rethinking the camera as an image sensor backed by intelligence, analogous to the human eye and brain, opens up a diverse range of use cases in which computers perform tasks that previously required human intelligence.
Counting the cars in a parking lot, monitoring foot traffic in a retail store, and spotting defective products on a production line are some examples of how CV is helping in commercial applications. And it’s not limited to businesses. At home, smart cameras can tell us when a package has arrived, if a puppy has slipped out of the backyard, or when a vehicle is in the driveway.
The use of CV is growing exponentially in both consumer and commercial markets, with more than 2 billion camera devices used worldwide for surveillance and security. Technology has made great strides in security, computing, image processing, and cloud services, enabling future CV products to have better capabilities than ever before. However, there are some key hurdles to market adoption of such CV applications:
- High total solution cost: Traditionally, most CV algorithms required a powerful GPU to run complex, compute-hungry neural nets, making it impossible to run computer vision on local low-cost hardware. The solution required streaming video 24/7 to a cloud-based GPU to do the computation CV requires. Such an approach is prohibitively expensive, and it also raises privacy and security concerns and delays decision making. Moreover, if a business instead chooses to “rip and replace” its existing cameras with smart cameras, a location with 16-20 cameras might cost as much as $3,200 in smart camera equipment and another $1,000 for installation.
- Privacy and digital security concerns: IoT security is a top priority for the tech industry, but it is also a challenge. It is critical to ensure the security of video content, especially when these devices have acquired and stored image data related to people, places, and high-value assets. Unauthorized access to data from cameras monitoring factories, hospitals, schools, or homes is not only a serious violation of privacy rights, but it can also lead to untold harm such as criminal activity and leakage of confidential data. There is a need for technology to keep everyone safe and not create new avenues for privacy violations.
- Lack of accessibility to data or system settings: The traditional CCTV architecture stores camera data locally through a network video recorder (NVR) or digital video recorder (DVR). This model has many limitations, including the need for huge storage space and a limited number of physical connection ports per NVR. Moreover, critical time-sensitive alerts, as well as the data itself, are not accessible to a user who is not physically on the premises where the NVR/DVR is located. The architecture also does not let users access settings remotely, or monitor and update devices.
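To make the cost hurdle concrete, the rip-and-replace figures quoted in the first bullet can be worked through in a few lines. This is a rough sketch: the dollar amounts and camera counts come from the text above, while the assumption that equipment and installation simply add up, and the even per-camera split, are illustrative only.

```python
# Upper-bound figures quoted above for one location with 16-20 cameras.
EQUIPMENT_USD = 3_200
INSTALLATION_USD = 1_000


def rip_and_replace_cost(num_cameras: int) -> tuple[int, float]:
    """Return (total cost, average cost per camera) in USD.

    Assumption: the two quoted figures add up to the site total and are
    spread evenly across cameras; real quotes will vary.
    """
    if num_cameras <= 0:
        raise ValueError("num_cameras must be positive")
    total = EQUIPMENT_USD + INSTALLATION_USD
    return total, total / num_cameras


for n in (16, 20):
    total, per_camera = rip_and_replace_cost(n)
    print(f"{n} cameras: ${total:,} total, about ${per_camera:,.2f} per camera")
```

Under these assumptions, the quoted figures work out to $4,200 per location, or roughly $210 to $262.50 per camera depending on the camera count.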
There is a dire need for a solution that addresses all of these market hurdles. That solution would combine image sensor technology with AI at the edge (AIoT), enabling computers to perform increasingly complex inferences from massive amounts of CV data. New machine learning (ML) algorithms are being built that require less computing power and memory to perform specific functions such as detecting people or animals, identifying specific objects, and reading license plate numbers. These CV applications all require running ML algorithms on end devices, rather than sending data to the cloud for inference processing. Small neural nets running on inexpensive edge devices may be the holy grail of AIoT. By moving computing power closer to the data, we can reduce latency, cut bandwidth costs, improve data security and privacy, and lower the total solution cost.
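The bandwidth argument can be illustrated with a small sketch. It assumes a hypothetical on-device model has already labeled each frame, and that only frames matching an alert label are uploaded. The `Frame` type, the label names, and the frame sizes are invented for illustration and do not describe any particular product.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Frame:
    """One captured frame plus the labels a (hypothetical) edge model found."""
    timestamp_s: float
    size_bytes: int
    labels: Tuple[str, ...] = ()


def select_uploads(frames: Iterable[Frame], alert_labels: Set[str]) -> List[Frame]:
    """Keep only frames whose on-device detections include an alert label."""
    return [f for f in frames if any(label in alert_labels for label in f.labels)]


def bandwidth_saved(frames: List[Frame], uploads: List[Frame]) -> float:
    """Fraction of bytes NOT sent, compared with streaming every frame."""
    total = sum(f.size_bytes for f in frames)
    sent = sum(f.size_bytes for f in uploads)
    return 1.0 - sent / total if total else 0.0


# Illustrative data: four equally sized frames, one of which shows a person.
frames = [
    Frame(0.0, 100_000),
    Frame(1.0, 100_000, ("person",)),
    Frame(2.0, 100_000),
    Frame(3.0, 100_000, ("car",)),
]
uploads = select_uploads(frames, alert_labels={"person"})
saved = bandwidth_saved(frames, uploads)
print(f"uploaded {len(uploads)} of {len(frames)} frames, saved {saved:.0%} of bytes")
```

In this toy data, only the frame with a person leaves the device, so three quarters of the bytes are never sent. Actual savings depend entirely on how often alert events occur and on how the video is encoded.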
This technology also serves as a bridge to connect existing cameras to the cloud. Moving to a cloud-native model simplifies the deployment of AI analytics solutions: Cameras of any order can be configured and managed through configuration files downloaded to the device. It’s also a virtuous cycle: Video data for specific alerts can be used to train models stored in the cloud for specific use cases, producing better AI models. As the AI models improve, they need to upload less data. Moreover, the data collected can be compiled into insightful dashboards for businesses. For example, a business can use such an AI analytics dashboard to analyze the count of unique visitors, demographics such as gender and age distribution, or peak-demand staffing needs.
Security and privacy also get a boost from deploying an edge AI solution. Even if the camera is damaged or destroyed, the footage is stored safely in the cloud.
All in all, demands from new application scenarios are driving the need for continuous improvements across computing and imaging technologies. As computer vision advances, new cameras will do more than serve as image sensors; they will help us understand pictures and make better decisions. More efficient, more powerful, and more intelligent computing will let current cameras be more than just sensors, too. Bridging the analog and digital worlds is opening up new use cases, such as quality inspection, safety and compliance, and patient care, that were once unthinkable.
Yamin Durrani is CEO of Kami Vision, an edge-based vision artificial intelligence (AI) platform for businesses and their customers. He has over 20 years of experience in the technology sector.