Through my affiliation with the Association of Information Technology Professionals (AITP) Atlanta, I was honored to host an incredible presentation by Chris DeBellis on the topic of computer vision.
Here are some of my key takeaways from the presentation.
Let’s start with the basics: what is artificial intelligence (AI)? AI has become synonymous with systems that can perform a specific task as well as a human could, and in some cases better. AI usually involves deep learning built on neural networks.
Let’s dive deeper. What is computer vision?
When you look at a picture of you and your family, you can see how many people are in the photo, what everyone is wearing and whether someone blinked. You can even pick out background details: where the picture was taken, what time of day it was, maybe even what season it is.
But to a computer, your family photo is just pixels: numerical values that represent shades of red, green and blue. With computer vision, a computer program can determine which pixels belong to each object – or, in the case of your family photo, each person – much as a human being would.
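To make that concrete, here is a minimal sketch (using NumPy, with made-up pixel values) of how a photo looks to a computer: nothing but a grid of red, green and blue intensities.

```python
import numpy as np

# A hypothetical 2x2 "photo": each pixel is three numbers (red, green, blue),
# each ranging from 0 (none) to 255 (full intensity).
photo = np.array([
    [[255, 0, 0], [0, 255, 0]],      # top row: a red pixel, a green pixel
    [[0, 0, 255], [255, 255, 255]],  # bottom row: a blue pixel, a white pixel
], dtype=np.uint8)

print(photo.shape)  # (2, 2, 3): height, width, 3 color channels
print(photo[0, 0])  # [255 0 0] -- the red pixel's channel values
```

Everything a vision system infers about a scene has to be computed from arrays like this one.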
Deep learning perception
Deep learning perception takes computer vision to the next level. A convolutional neural network (CNN) consists of multiple layers of artificial neurons and mathematical operations. The goal of a CNN is to recognize the real-life object in an image.
When processing an image, a CNN uses each layer to extract a specific kind of feature from the image. The first layer detects basic features like vertical and horizontal edges. And as the CNN moves deeper, it’s able to detect more complex things like corners and shapes. Eventually, its final layer can detect doors, cars, buildings and even faces.
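As an illustration, the edge detection done by an early CNN layer boils down to convolution: sliding a small filter over the image and measuring how strongly each patch matches it. Below is a minimal NumPy sketch with a made-up toy image and a hand-written vertical-edge filter (a real CNN learns its filters from data).

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide a kernel over a grayscale image (valid padding, stride 1)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A toy 4x4 grayscale image: dark on the left, bright on the right,
# so it contains a single vertical edge down the middle.
image = np.array([
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
], dtype=float)

# A simple vertical-edge filter, akin to what an early CNN layer might learn.
vertical_edge = np.array([
    [-1, 1],
    [-1, 1],
], dtype=float)

response = convolve2d(image, vertical_edge)
print(response)  # the response peaks exactly where the edge lies
```

The flat regions of the image produce a response of zero, while the column straddling the dark-to-bright transition produces a large value: that peak is the "detected edge".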
Deep learning perception utilizes multiple sensors and puts them together to make a full interpretation of the environment. There are two different types of sensors: 2D and 3D.
2D sensors include RGB color cameras and night vision. 3D sensors include RGB-D cameras, lidar and point cloud representations.
The 3D point cloud approach uses data points in 3D space that represent the surfaces of an object. With this approach you can rotate the model and take distance measurements – for example, to figure out how far a robot arm’s end-effector is from the object.
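A tiny sketch of that distance measurement, with hypothetical coordinates: the cloud is just an N×3 array of surface points, and the nearest-point distance falls out of basic vector math.

```python
import numpy as np

# A hypothetical point cloud: each row is an (x, y, z) point on an
# object's surface, in meters. These values are made up for illustration.
cloud = np.array([
    [0.50, 0.10, 0.30],
    [0.52, 0.12, 0.31],
    [0.48, 0.09, 0.29],
])

# Hypothetical position of the robot arm's end-effector.
end_effector = np.array([0.0, 0.0, 0.0])

# Euclidean distance from the end-effector to every surface point,
# then the distance to the closest one.
distances = np.linalg.norm(cloud - end_effector, axis=1)
nearest = distances.min()
print(f"nearest surface point is {nearest:.3f} m away")
```

Real systems work with millions of points rather than three, but the geometry is the same.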
Another 3D approach is 3D depth maps. Depth maps are able to show distance like the point cloud approach, but don’t give you a full understanding of the environment.
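The difference is easy to see in code: a depth map is just a 2D grid where each cell holds a distance, so per-pixel range is available directly, but there is no geometry to rotate or view from another angle. A minimal sketch with illustrative values:

```python
import numpy as np

# A toy 3x3 depth map: each cell is the distance (in meters) from the
# sensor to whatever surface appears at that pixel.
depth_map = np.array([
    [2.0, 2.0, 2.0],
    [2.0, 0.8, 2.0],  # a nearby object in the middle of the frame
    [2.0, 2.0, 2.0],
])

# Distance lookups are trivial -- but unlike a point cloud, the map only
# describes the scene from this one viewpoint; occluded surfaces are lost.
print(depth_map.min())  # closest surface: 0.8 m
```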
Through the use of computer vision, companies can use drones to inspect bridges and buildings that may be corroded. Drone inspections allow a closer look at areas where companies might not feel safe sending humans.
Drones can also inspect cell towers and power lines. The visual spectrum isn’t required for this; the infrared (IR) spectrum can provide a meaningful understanding of the landscape. When using drones to inspect power lines, companies can use IR to identify hot spots that are likely to fail in the near future.
Deployment of deep learning models
Computer vision engineers who specialize in deep learning models focus on:
- AI on “Edge” devices – In Edge AI, the AI algorithms run locally on a hardware device, without requiring a network connection. The device processes the data it generates to deliver real-time insights within a few milliseconds. Edge devices include IoT devices, cameras, drones, etc.
- Models deployed on mobile devices – Mobile deployment of computer vision algorithms is possible thanks to advances in the compute capacity and processing speed available in smaller handheld devices. Whether this deployment model is the right choice depends on the exact goals of the project.
- Robot Operating System (ROS) – ROS is a set of software libraries and tools to help build robot applications. The point of ROS is to create a robotics standard, so you don’t have to reinvent the wheel when building new robotic software.
- Machine Learning (ML) – In ML, engineers build and deploy models and infrastructure that collects and processes real-time streaming data.
There are also computer vision engineers whose primary focus is optimizing AI implementations for hardware acceleration, working with various coding languages and tools such as:
- Nvidia TensorRT – a software development kit (SDK) for high-performance deep learning inference.
- Intel OpenVINO – an SDK used to deploy high-performance deep learning inference.
- Qualcomm Snapdragon – this processor provides vision processing APIs with hardware acceleration and performance on mobile devices.
AI departments also have roles such as project managers, sales and marketing, and legal and policy. These are great areas of opportunity to look into if you’re interested in AI but don’t have a coding background.
In summary, what appear to be very simple actions taken by a machine are the result of countless hours of work by a variety of intelligent and motivated professionals, each responsible for very specific outcomes. One of Chris DeBellis’ projects resulted in only a few seconds of action but took two years of arduous work to get right. Computer vision, robotics, AI, machine learning and deep learning are all areas that hold a plethora of opportunity for those with the desire, knowledge……and patience! You can see the presentation in its entirety.
About the author
Steven Wright serves as a Senior Account Executive for Synergis and volunteers his time as President of the Atlanta chapter of the Association of Information Technology Professionals. Steve has had a career in technology spanning more than 25 years. He has always served in an advisory and relationship-development capacity, working within sales and business development groups for healthcare technology outsourcing, manufacturing, professional services and, most recently, the staffing industry. In his free time, Steve enjoys learning about new and emerging technologies. This love of tech has helped him aid clients and candidates alike in their career and talent journeys.