Thursday, 30 May 2024 00:11
Machines that understand the world
We humans have a pair of eyes. When you open them, light from the surroundings enters and forms an image on the retina. The retina encodes that image as electrical impulses, and the optic nerve transmits them to your brain’s visual cortex, where the signals are interpreted in light of prior knowledge. Visual perception enables you to interact with your loved ones, read, learn, play, work, drive, and enjoy scenic views. Essentially, computer vision (CV) aims to achieve the same visual perception capabilities, or much better ones, using cameras and computers.
Most of the recent advances in CV have come within the last 20 years, thanks to the evolution of hardware-accelerated computing platforms such as GP-GPUs (general-purpose graphics processing units) and to research success in machine learning, particularly deep learning. CV now surpasses average human capability on some tasks and is showing signs of superhuman vision, which is allowing self-driving vehicles and autonomous robots to advance. CV is the branch of artificial intelligence (AI) that provides the eyes, and it draws on subjects such as image processing, pattern recognition, machine learning, mathematics, physics, and signal processing.
Humans achieve visual perception without ever thinking about it. For computers, however, learning these visual perception tasks is hugely difficult for many reasons. Visual perception goes beyond calculation; it is more about learning models of the visual world. A typical computer vision system starts with cameras, which come in numerous types, e.g., monocular cameras, pan-tilt-zoom cameras, stereo cameras, and structured-light depth cameras. A single capture by the camera is called an image frame, and it is the input to the CV system. The computing platform can be a mobile phone, laptop, PC, embedded system, or supercomputer. Nowadays, most of these systems use hardware accelerators such as GP-GPUs.
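To make the idea of an image frame concrete, here is a minimal sketch in Python. It assumes a frame is stored as a grid of red, green, and blue pixel values; the tiny 2x3 frame and the function name to_grayscale are made up for illustration, and the weights are the commonly used BT.601 luma weights rather than anything specified in this article.

```python
# A hypothetical 2x3 colour frame: each pixel is an (R, G, B) triple in 0..255.
frame = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(0, 0, 0), (128, 128, 128), (255, 255, 255)],
]

def to_grayscale(frame):
    """Convert an RGB frame to grayscale using the common BT.601 luma weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in frame
    ]

gray = to_grayscale(frame)
height, width = len(gray), len(gray[0])
print(height, width)  # number of rows and columns in the frame
print(gray)           # one brightness value per pixel
```

A real system would receive such frames from a camera many times per second, usually at far higher resolution.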
CV Capabilities
At its core, CV tries to see the world and understand it. Recognizing shapes, objects, written text (OCR), human faces, human posture, and activities are therefore the key capabilities on which research and development are focused. Most of these capabilities require extracting features of interest from the image, then training machine learning models and running inference with them to classify items into different classes and to recognize associations and interactions between them.
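The sequence of feature extraction, training, and inference can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the striped images are synthetic, the two gradient features and the nearest-centroid classifier are deliberately simple stand-ins for the deep learning models used in practice, and all function names (make_image, extract_features, train, classify) are hypothetical.

```python
import random

def make_image(kind, size=8, noise=0.1, rng=None):
    """Create a synthetic grayscale image (list of rows) with noisy stripes."""
    rng = rng or random.Random(0)
    img = []
    for r in range(size):
        row = []
        for c in range(size):
            # Horizontal stripes vary with the row index, vertical with the column.
            base = (r % 2) if kind == "horizontal" else (c % 2)
            row.append(base + rng.uniform(-noise, noise))
        img.append(row)
    return img

def extract_features(img):
    """Feature extraction: mean absolute intensity change along x and along y."""
    h, w = len(img), len(img[0])
    dx = sum(abs(img[r][c + 1] - img[r][c]) for r in range(h) for c in range(w - 1))
    dy = sum(abs(img[r + 1][c] - img[r][c]) for r in range(h - 1) for c in range(w))
    return (dx / (h * (w - 1)), dy / ((h - 1) * w))

def train(samples):
    """Training: average the feature vectors of each class (nearest-centroid model)."""
    sums, counts = {}, {}
    for img, label in samples:
        fx, fy = extract_features(img)
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + fx, sy + fy)
        counts[label] = counts.get(label, 0) + 1
    return {
        label: (sx / counts[label], sy / counts[label])
        for label, (sx, sy) in sums.items()
    }

def classify(model, img):
    """Inference: pick the class whose centroid is closest to the image's features."""
    fx, fy = extract_features(img)
    return min(
        model,
        key=lambda label: (model[label][0] - fx) ** 2 + (model[label][1] - fy) ** 2,
    )

rng = random.Random(42)
training_set = [
    (make_image(kind, rng=rng), kind)
    for kind in ("horizontal", "vertical")
    for _ in range(10)
]
model = train(training_set)
print(classify(model, make_image("vertical", rng=rng)))
```

The same three stages appear in real systems, except that the features are usually learned by the model itself rather than written by hand.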
Applications of CV
Anything that can be performed with human vision is a potential candidate for CV to perform efficiently and accurately. CV is a core enabling technology for autonomous robotics, self-driving vehicles, augmented reality, and visual data analytics. Each of these creates numerous applications in fields such as agriculture, industry, medicine, defence, mining, space, and the social sector.
In autonomous robotics and self-driving vehicles, CV helps with the following key tasks.
In augmented reality, CV provides the following core enabling capabilities.
In visual analytics, CV provides the following key capabilities.
Industrial
Agriculture
Soil and land surveying using aerial imagery
Mining
Medical
Transport
Space
Retail
Product quality recognition – e.g., for fish, fruits, vegetables, etc.
Social
Opportunities in enterprise and social development in Sri Lanka
Implement a more disciplined road traffic system with a CV-based, unbiased traffic violation monitoring system.
We are living at a very interesting point in the evolution of computer vision and AI in general. Hardware is becoming computationally more powerful, compact, cost-effective, and power-efficient, while software is becoming more robust and better able to handle finer details, and researchers keep pushing the boundaries. Sri Lanka has the potential to be a leader in these emerging technologies.
Author: Dr. Kalana Withanage obtained a PhD (2019) in computer vision/human-robot interaction (HRI) at the University of South Australia, and a BSc (Hons) in Electrical and Information Engineering at the University of Ruhuna (2008), Sri Lanka.