Of all the information humans acquire from their environment, visual information accounts for the largest share, roughly 80% of the total. As information technology has developed, human visual functions are increasingly being replicated in computers, robots, and other intelligent machines. Machine vision, a booming technology, is one such example: it uses image processing to enable automated detection and analysis in applications such as automatic inspection, process control, and robot navigation. Machine vision (MV) technology is already commercialized; vision sensors, lenses, high-speed cameras, light sources, vision software, image acquisition cards, and vision processors are becoming increasingly sophisticated. In industrial automation, machine vision is receiving growing attention and is widely used in areas such as autonomous vehicles, food production, packaging and logistics, robotics, and drones.
When it comes to machine vision, engineers may hold many different understandings. This article attempts to lay out the truth about machine vision from four angles.
Truth #1: Machine vision ≠ Computer vision
Machine vision is technology that uses optical devices and non-contact sensors to automatically receive and process images of real objects, in order to obtain needed information or to control the motion of a robot. In development since the 1950s, the technology truly took off and became widespread between 1980 and 1990. Over decades of development, machine vision has accumulated a variety of definitions of what it is and how it works.
The Automated Imaging Association (AIA) offers a broad definition: machine vision encompasses all industrial and non-industrial applications in which a combination of hardware and software provides operational guidance to devices for image-based capture and processing. SearchEnterpriseAI, on the other hand, defines machine vision more narrowly as "the vision capability of a computer": a system that uses one or more cameras, analog-to-digital converters (ADCs), and digital signal processors (DSPs) to transmit the resulting data to a computer or robot controller.
In practical applications, machine vision typically needs to work in conjunction with other advanced technologies, including natural language processing, robotic process automation (RPA), artificial intelligence (AI), and machine learning (ML), to provide the "vision" capabilities required for automation. You can think of machine vision as the eyes of automation, AI and ML as the brain, and RPA as the "keyboard operator" needed to complete the task. In recent years, the adoption of automation has accelerated, which is crucial for businesses to maintain their competitiveness. If we imagine automation as "digital employees" at work, without the addition of machine vision, all these "digital employees" would be blind.
Computer vision has become a hot topic in recent years, so how does it relate to machine vision? Broadly speaking, machine vision is a technological capability that combines existing technologies in new ways and applies them to solve real-world problems; it is a systems engineering discipline. Computer vision, by contrast, is a branch of computer science and is not tied to tangible hardware such as a camera mounted on a robot.
More specifically, machine vision is the system as a whole, while computer vision supplies the system's intelligence, its "brain" for processing information; without computer vision, machine vision cannot function. Machine learning, deep learning, and neural networks are three technologies machine vision systems use to process tasks faster. They can expand a machine vision system's understanding of what it needs to locate, making it a far more valuable asset. As computer vision advances, the potential applications of machine vision grow accordingly.
It is worth noting that machine vision and image processing are two different concepts. Image processing takes an image as input and outputs another, transformed image, whereas a machine vision system outputs information and decisions: it can detect and classify a wide variety of objects and items across industries including automotive, electronics and semiconductors, food and beverage, road and vehicle traffic or intelligent transportation systems (ITS), medical imaging, packaging, labeling and printing, pharmaceuticals, television broadcasting, and more. Machine vision-based technologies are becoming central to the creation of automation.
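The distinction can be made concrete in code. The sketch below is illustrative only (the function names are hypothetical, not from any specific library): the key difference is the type of the output. An image-processing routine maps an image to another image, while a machine vision routine maps an image to a decision.

```python
def brighten(image):
    """Image processing: image in, image out (a trivial brightness shift)."""
    return [[min(255, p + 10) for p in row] for row in image]

def inspect(image, brightness_limit=128):
    """Machine vision: image in, pass/fail decision out."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return "pass" if mean <= brightness_limit else "fail"

frame = [[100, 120], [90, 110]]
print(inspect(frame))  # a decision, not an image
```

In a real system the decision would feed a controller or reject mechanism rather than a print statement, but the input/output contrast is the same.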
Truth #2: Advances in hardware and software have driven progress in machine vision
Machine vision is the eye of industrial automation. Its main workflow is as follows: the system uses machine vision products (such as CMOS or CCD cameras) to convert the captured target into an image signal, which is transmitted to a dedicated image processing system. Based on information such as pixel distribution, brightness, and color, the image signal is converted into a digital signal, ultimately enabling the machine (a robot or other industrial tool) to complete industrial tasks such as manufacturing and quality verification.
Machine vision is a key element of Industry 4.0, helping industrial automation systems improve efficiency in various ways, such as by improving inventory management, detecting defective products, and enhancing manufacturing quality. To accurately simulate human perception, machine vision requires a range of devices and software. The continuous development of these hardware and software technologies further drives the evolution of machine vision technology.
#01, Smart Camera
Cameras are the primary devices machine vision systems use to inspect objects. A single inspection point may require multiple cameras to ensure every detail is correctly examined. Smart cameras are essential when a machine vision system needs to capture images and extract application-specific information from them. They typically include all necessary communication interfaces and can connect over Wi-Fi or to servers to transmit captured image data. Deep learning is a powerful tool here: it enables system designers to quickly automate complex and subjective decisions while improving product quality and productivity. Teledyne FLIR's Firefly DL camera features built-in deep learning inference, eliminating the need for a host system for classification tasks and significantly reducing system cost and complexity. Its compact size, light weight, and low power consumption make it ideal for embedding in mobile, desktop, and handheld systems.
Figure 1: Teledyne's FLIR Firefly DL camera features a small size and low power consumption.
(Image source: Teledyne)
Omron Industrial Automation's S133 UVC color CMOS camera is another smart camera. It features a built-in CMOS sensor, an ultra-compact design, and plug-and-play operation, making it an ideal choice for anyone seeking a camera with machine vision capabilities. Thanks to its ease of use, the S133 is popular in industrial/machine vision, automotive, and life sciences applications.
Figure 2: S133 UVC Color CMOS Camera
(Image source: Omron)
#02, 3D Camera
3D cameras capture the depth of objects in an image, showing the scene from different perspectives. Using 3D cameras in machine vision systems adds viewpoint variety and depth perception. Time-of-flight (ToF) cameras are 3D cameras that measure distance using the time-of-flight principle, allowing 3D imaging without scanning the object. The technology typically covers distances from a few meters to about 40 meters, captures up to 100 images per second, and offers a distance resolution of approximately 5 to 10 millimeters with a lateral resolution of approximately 200 x 200 pixels.
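The time-of-flight principle reduces to simple arithmetic: light travels to the object and back, so the distance is half of (speed of light x round-trip time). The sketch below shows both the pulsed form and the phase-based continuous-wave form; the numbers are illustrative, not taken from any particular camera.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def distance_from_round_trip(t_seconds):
    """Pulsed ToF: d = c * t / 2."""
    return C * t_seconds / 2

def distance_from_phase(phase_rad, f_mod_hz):
    """Continuous-wave ToF: d = c * phi / (4 * pi * f_mod)."""
    return C * phase_rad / (4 * math.pi * f_mod_hz)

# A 66.7 ns round trip corresponds to roughly 10 m:
print(distance_from_round_trip(66.7e-9))

# At 20 MHz modulation, the unambiguous range is c / (2 * f_mod), about 7.5 m;
# beyond that the measured phase wraps around and distances become ambiguous.
print(C / (2 * 20e6))
```

The unambiguous-range limit is one reason continuous-wave ToF cameras trade off modulation frequency (which improves depth resolution) against maximum working distance.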
Historically, ToF was often considered a low-precision 3D sensing technology due to doubts about its accuracy. In recent years, however, many leading companies have developed high-resolution products of up to 1.3 megapixels. High-precision ToF cameras can significantly improve the flexibility and automation of production. Sony's IMX556 DepthSense ToF sensor uses current-assisted photonic demodulation (CAPD) and back-side illumination (BSI) technologies, offering millimeter-level accuracy that compares favorably with existing ToF solutions on the market, and providing 640 x 480 resolution at 30 fps at a working distance of up to 6 meters.
Figure 3: The Sony IMX556 DepthSense ToF sensor delivers more detailed, faster-frame-rate 3D reconstructions of the object under test.
(Image source: Sony)
Texas Instruments' OPT8241 Time-of-Flight (ToF) sensor combines ToF sensing with an analog-to-digital converter and a programmable timing generator (TG). This device can deliver 320 x 240 resolution images at frame rates up to 150 frames per second. The built-in TG controls reset, modulation, and readout of the digitized sequence. Furthermore, the TG is programmable, allowing for flexible optimization of various depth-sensing performance metrics, such as power, motion robustness, signal-to-noise ratio, and ambient noise cancellation.
Figure 4: Block diagram of TI's OPT8241 ToF sensor system
(Image source: TI)
#03, Vision Sensor
Vision sensors are the core of a machine vision system and the source from which it extracts the characteristics of its environment. Their key components are image sensors such as CCDs and CMOS devices. Higher-resolution vision sensors generate images with more pixels, greatly improving image quality and making fine visual details easier to identify.
For a long time, CCD sensors were the mainstream technology for capturing high-quality, low-noise images. However, CCDs are expensive to manufacture, which translates into higher prices, and they consume significantly more power than CMOS sensors. CMOS sensor technology has now advanced to the point where it approaches the quality and functionality of CCD technology while being cheaper, smaller, and less power-hungry. CMOS cameras also typically offer higher frame rates than CCD cameras, a crucial feature for machine vision systems that rely on real-time image processing for automation or image data analysis. Furthermore, CMOS sensors are more sensitive to infrared wavelengths than CCD sensors; CMOS chip and camera manufacturers exploit this to capture infrared light, adding imaging capabilities for image recognition. Considering these factors, CMOS sensors are often the better fit for machine vision applications.
The Onsemi AR0130 is a 1/3-inch CMOS digital image sensor with an active pixel array of 1280H x 960V, capturing images using rolling shutter readout. This product includes sophisticated camera features such as automatic exposure control, windowing, and video and single-frame modes. The AR0130 is capable of capturing very sharp digital images and can capture continuous video and single frames, making it particularly suitable for high-performance machine vision applications.
#04, Light Source
As auxiliary imaging devices, light sources often play a crucial role in image quality. LED lighting products, for example, offer great flexibility: adjustable angles, additional wavelengths, and a more consistent spectral response. Light sources in a wide variety of wavelengths and shapes are available on the market, making selection relatively straightforward.
#05, Image Acquisition Card
Image acquisition cards (frame grabbers) typically take the form of computer expansion cards, and their primary function is to deliver the camera's image output to the host computer. An acquisition card converts the camera's analog or digital signals into an image data stream in a specific format, and can also control camera parameters such as trigger signals, exposure/integration time, and shutter speed. Acquisition cards come in different hardware architectures and host bus types, such as PCI, PCI-64, CompactPCI, PC/104, and ISA, to suit different types of cameras.
#06, Visual Processing Software
Machine vision software processes the input image data and computes the desired results. General-purpose machine vision software is distributed as C/C++ image libraries, ActiveX controls, and graphics-based programming environments. It can be specialized (for LCD inspection, BGA inspection, template alignment, and so on) or general-purpose, covering functions such as positioning, measurement, barcode/character recognition, and spot detection.
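One of the general-purpose operations named above, spot (blob) detection, can be sketched as connected-component labeling on a binary image. This is plain Python on a list-of-lists for clarity; a production vision package would use an optimized library routine.

```python
def find_blobs(binary):
    """Return a list of blobs; each blob is a list of (row, col) pixels."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not seen[r][c]:
                # Flood-fill (4-connectivity) from this unvisited foreground pixel.
                stack, blob = [(r, c)], []
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    blob.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                blobs.append(blob)
    return blobs

image = [[1, 1, 0, 0],
         [1, 0, 0, 1],
         [0, 0, 1, 1]]
print(len(find_blobs(image)))  # two separate spots
```

From each blob's pixel list, a vision package can derive the measurements the text mentions, such as spot position (centroid), area, and bounding box.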
Truth #3: The rapid development of the machine vision market owes much to the automotive industry
The value of machine vision in automation lies in its ability to quickly and efficiently capture and process large amounts of documents, images, and videos, far exceeding human capabilities in both quantity and speed.
Broad application prospects and huge market potential make machine vision an inevitably growing market. According to Markets and Markets, the machine vision market is expected to grow from $10.7 billion in 2020 to $14.7 billion in 2025, a compound annual growth rate (CAGR) of 6.5%.
According to data from Grand View Research, the global machine vision market reached $13.23 billion in 2021 and is projected to grow at a CAGR of 7.7% from 2022 to 2030. Demand for vision-guided robotic systems from the automotive, food and beverage, pharmaceutical and chemical, and packaging sectors is the primary driver of market growth. The automotive industry remains the largest adopter of machine vision systems globally, accounting for over 15.0% of revenue in 2021, and is expected to continue its steady growth in the coming years.
Figure 5: US Machine Vision Market Development Trends from 2020 to 2030, by Industry Segmentation
(Image source: Grand View Research)
Truth #4: Machine vision will play a significant role in robotics applications
In terms of market reach and applications, machine vision has plenty of room to expand. Seizing these opportunities takes some imagination: machine vision is not merely a replacement for a technician's eyes, but a way to let robots accomplish tasks technicians cannot. Machine vision gives robots the ability to "see" in real time and in high detail, allowing them to make decisions based on a comprehensive view of an object or environment. Robots are used increasingly worldwide, and when equipped with machine vision they gain greater accuracy, orientation, and understanding, enabling them to grasp objects more accurately, place them more precisely, and perform more complex tasks faster.
Machine vision is becoming increasingly important in robotics applications. According to a recent report by the Association for the Advancement of Automation (A3), the robotics and machine vision market saw substantial growth in the second quarter of 2021 compared to 2020. Industrial robots are already widely used, and with the emergence of collaborative robots and the rapid development of 3D machine vision, they will be used in combination more and more frequently.
Machine vision embodies a technological capability, as do other capabilities such as automation, machine learning, deep learning, and neural networks. It's a capability that can be integrated into other technologies and processes to benefit industries and improve business efficiency. Robots are increasingly incorporating machine vision, enabling them to perform more complex tasks. These tasks would be impossible without machine vision telling robots the exact location of objects. Machine vision is key to unlocking the full potential of automation, adding more intelligence to intelligent automation.