
Detailed Explanation of Machine Vision and Image Analysis Technologies

2026-04-06 06:48:25

I. Key Points

1. Not all vision-related projects require expert consultation; with the help of hardware and development tool vendors, developers lacking experience in vision system development can often complete most (if not all) of the development work and save their companies money.

2. Before you begin developing a vision system, you must answer about five or six questions; your answers will largely determine the hardware cost of the system.

3. You can greatly improve efficiency by choosing a menu-driven environment that allows you to start device development and then refine the program through graphical or syntax-based programming.

4. Get used to the idea that vision systems need careful maintenance after installation; you often cannot foresee the reasons an algorithm may need adjustment after the system has been running for a while.

Successfully developing a vision-based device can require so much expertise that many would-be developers hesitate to attempt the task, instead turning to consultants who have built their careers on mastering the nuances of every aspect of the technology. Often, a consultant can save you not only many times the consulting fee but also a significant amount of valuable time. Even so, a growing number of packaged software tools for vision-based system development allow those without machine vision or image analysis experience to undertake projects with confidence.

If you lack the appropriate experience, a good first step is to determine which tasks require external assistance and which you can likely complete quickly yourself using packaged software. Vendors of development tools and hardware can often help you make this judgment, and in many cases their websites offer tools to aid the decision. Calling such a vendor will usually connect you with an application engineer who can gather information about your equipment. When appropriate, most vendors will recommend consultants familiar with their products. Often, the most economical approach is to use consulting assistance for only certain parts of a project, such as the lighting.

Image analysis and machine vision are related yet distinct fields. In one sense, image analysis is a part of machine vision. However, in another sense, image analysis is a broader discipline. In reality, the boundary between these two fields is often blurred.

Machine vision applications are often commercially driven. For example, machine vision is a critical part of many manufacturing processes. On the other hand, "image analysis"—as most people understand it—is more likely to be used in scientific research laboratories. Some experts say that image analysis often deals with less precise operations than machine vision. Characterizing or classifying images of unknown objects, such as animal tissue cells in academic laboratories (Figure 1) or even clinical pathology laboratories, is one example.

Figure 1. A research team at Cold Spring Harbor Laboratory (New York) and the Howard Hughes Medical Institute used MATLAB and its image acquisition and image processing toolboxes to study how the mammalian brain works. Using the image acquisition toolbox, researchers can stream microscope images directly from a camera into MATLAB and use the image processing toolbox to analyze the images over time. To enable capture and analysis at the push of a button, the researchers created a graphical user interface for MATLAB.

In machine vision, you typically have a general idea of what the camera or image sensor is observing, but you need more specific information. Product-inspection equipment falls into this category. For example, you know which printed-circuit-board model an image depicts, but you must determine whether all the components are of the correct type and in the correct position. Determining component correctness and placement certainly involves image analysis, but the analysis is far more straightforward than the kind used in clinical laboratories.

II. Classification of Machine Vision Tasks

Several experts categorize the main machine vision tasks into the following types:

1. Count components, such as washers, nuts, and bolts, extracting the visual information from a noisy background.

2. Measure, or gauge, angles, dimensions, and relative positions.

3. Read information, such as extracting data from barcodes, performing OCR (Optical Character Recognition) on characters etched on semiconductor chips, and reading two-dimensional Data Matrix codes.

4. Compare objects, for example comparing a unit on the production line with a KGU (known-good unit) of the same type to identify manufacturing defects such as missing components or labels. The comparison might be a simple pattern subtraction, or it might involve geometric or vector-graphics matching algorithms; the latter are necessary if the objects being compared differ in size or orientation. Types of comparison include detecting the presence or absence of objects, matching colors, and checking print quality. The object being inspected might be as simple as an aspirin tablet whose labeling must be verified before packaging.
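The "simple pattern subtraction" comparison named in item 4 can be sketched in a few lines of Python. This is an illustrative toy, not any vendor's algorithm: images are plain nested lists of 8-bit grayscale values, and the function name and threshold are assumptions.

```python
def subtract_patterns(uut, kgu, threshold=30):
    """Count pixels whose absolute difference from the reference image
    exceeds the threshold; a nonzero count suggests a defect."""
    defects = 0
    for row_u, row_k in zip(uut, kgu):
        for pu, pk in zip(row_u, row_k):
            if abs(pu - pk) > threshold:
                defects += 1
    return defects

kgu = [[200, 200], [200, 200]]   # known-good reference image
uut = [[198, 203], [200, 90]]    # unit under test: one dark pixel
print(subtract_patterns(uut, kgu))  # 1: one pixel differs beyond the threshold
```

Real systems compare thousands of aligned pixels and apply the geometric matching mentioned above when size or orientation varies; this sketch shows only the subtraction core.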

This list is quite specific, which may mean you can use menu-driven, graphical development tools to create machine vision devices instead of writing code in a text-based language such as C++. Developers with long histories of programming machine vision devices in text-based languages often prefer to stick with the tools that have served them well over the years, but you can indeed use one of several menu-driven, graphical application development packages. Although some in the industry criticize this reluctance to change, ask yourself how you would feel if a consultant you hired for a particular device tried out a new software package for the first time on your job.

Even among graphical tools, vendors differentiate between those that truly offer programmability and those that merely let users configure devices. The configurable approach gets devices up and running faster and provides much of the flexibility developers need. Programmable functionality offers greater flexibility but can increase development time, especially for those using a tool for the first time. In some cases, both approaches produce output in the same language, so you can use the programmable functionality to modify or improve devices created with the configurable approach (Figure 2). The potential benefit of this flexibility is enormous: you can quickly get a device working at a rough level with the basic tools and then refine it with the more powerful ones. This approach reduces the likelihood of wasting time refining methods that you later discover have fundamental flaws.

Figure 2. Data Translation's Vision Foundry illustrates the advantages of toolkits that offer both approaches to device development. Such a toolkit lets you quickly validate concepts using configurable, menu-based, interactive tools and then improve your device through its programming capability. In Vision Foundry, most programming tasks can be accomplished by writing intuitive scripts.

III. Ongoing Adjustments

Perhaps more important, the easy interchangeability of these two methods simplifies the adjustments that are inevitable in many machine vision systems. For example, in AOI (Automated Optical Inspection), you might want to reject any UUT (unit under test) that differs from the KGU. Unfortunately, this strategy would likely reject a large portion of your manufactured units, even though most of them perform acceptably. A simple example of an AOI system rejecting a good part because of a minor difference occurs when the date code of a component on the UUT differs from the date code of the equivalent component on the KGU.

You can anticipate date-code issues during the device-design phase and ensure that the system ignores image differences in areas containing date codes. Unfortunately, other minor differences are harder to predict, and you must expect to modify the device when you discover them. In fact, some AOI-system software can make such modifications almost automatically: if you inform the system that it has rejected a good unit, the software will compare that unit's image with the original KGU and stop inspecting subsequent units in the areas of difference.
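The idea of ignoring image differences in areas containing date codes can be sketched as a masked comparison. This is a minimal illustration under assumptions, not production AOI code: images are nested lists of grayscale values, and the ignore region is a set of hypothetical (x, y) coordinates.

```python
def masked_diff_count(uut, kgu, ignore, threshold=30):
    """Count pixels that differ beyond the threshold, skipping coordinates
    in the ignore set (e.g. a region known to contain a date code)."""
    count = 0
    for y, (row_u, row_k) in enumerate(zip(uut, kgu)):
        for x, (pu, pk) in enumerate(zip(row_u, row_k)):
            if (x, y) not in ignore and abs(pu - pk) > threshold:
                count += 1
    return count

kgu = [[200, 200], [200, 200]]
uut = [[200, 90], [200, 200]]      # pixel (1, 0) holds a different date code
print(masked_diff_count(uut, kgu, set()))      # 1: flagged without the mask
print(masked_diff_count(uut, kgu, {(1, 0)}))   # 0: date-code region ignored
```

The near-automatic adjustment the paragraph describes amounts to the software growing this ignore set each time you report a falsely rejected unit.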

However, this method can sometimes produce unsatisfactory results. Suppose the inspection system is installed in a room where outside light can enter through a window, changing the illumination of the UUT. A human inspector might adapt to this change without a second thought, but it can cause the vision system to classify images of the same object as different objects, leading to unpredictable inspection failures. Although blocking the window would prevent outside light from entering, adjusting the test procedures to ensure that the KGU passes under the various lighting extremes might be more cost-effective.
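One inexpensive defense against illumination shifts like the window example is to normalize image brightness before comparison. The sketch below uses hypothetical names and assumes the lighting change scales all pixels uniformly, which real scenes rarely obey exactly; it is a sketch of the idea, not a cure for nonuniform lighting.

```python
def normalize_mean(image, target_mean=128.0):
    """Rescale pixel values so the image's mean equals target_mean,
    compensating for a uniform change in overall illumination."""
    flat = [p for row in image for p in row]
    scale = target_mean * len(flat) / sum(flat)
    return [[min(255, round(p * scale)) for p in row] for row in image]

reference = [[100, 100], [100, 100]]
brighter = [[150, 150], [150, 150]]    # same scene under stronger light
print(normalize_mean(reference) == normalize_mean(brighter))  # True
```

After normalization, a comparison such as pattern subtraction sees the two exposures as the same image instead of as two different objects.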

Even so, this example highlights the importance of lighting in machine vision and image analysis. Lighting itself is both a science and an art. Various lighting techniques have different advantages and disadvantages, and lighting methods for the UUT can solve or improve common machine vision problems (Reference 1).

IV. Project Costs and Timeframe

The cost of machine vision projects varies widely. Some projects cost less than $5,000, including hardware, pre-packaged software development tools, and the labor costs of the equipment developers. However, such low project costs likely do not include the costs of adjusting and debugging the equipment to achieve satisfactory performance. At the other end of the cost range, projects can exceed one million dollars. The most common examples of these projects are likely major upgrades to automated production lines in the automotive and aerospace industries. According to some suppliers, the most common project costs typically range from tens of thousands to slightly over one hundred thousand dollars. The project timeline from management approval to the vision system being operational in production is usually less than six months, and often only one or two months.

Unsurprisingly, almost all vision projects begin with answers to fundamental questions, and those answers largely determine the cost of the vision system's hardware: How many cameras are needed? What image resolution is required? Is color imaging necessary? How many frames per second must be acquired? Are cameras that produce analog output acceptable? If so, you must select a frame grabber board to convert the analog signals into digital form and, if necessary, to synchronize image acquisition with external trigger events.

Although some frame grabbers for analog cameras can accept input from multiple cameras simultaneously, it is more common for a board to interface with one camera at a time. If you choose a camera with a digital interface, will you use a "smart" camera capable of processing as well as acquiring images, or will the camera send raw (unprocessed) image data to a host PC for processing? And what interface standard or bus will the digital camera use to communicate with the host PC? Digital cameras that work with certain buses require a frame grabber, but unlike frame grabbers for analog cameras, those for digital cameras perform no analog-to-digital conversion.

Hardware-related considerations can extend beyond these issues. Some questions challenge the generally accepted default assumption that the vision system's host computer is a PC running a standard version of Windows (www.microsoft.com). Machine vision systems sometimes run on real-time operating systems, while image analysis software often runs on Unix or Linux. Additionally, like other real-time systems, many real-time vision systems use CPUs other than Pentium (www.intel.com) or Athlon (www.amd.com) devices.

V. Camera Interface

Interfacing the camera with the host computer remains a critical issue in vision-system design. Despite the emergence of cameras with digital interfaces, and despite imaging systems that use IEEE 1394 (also known as FireWire and i.LINK) to interface with cameras, the choice of camera interface still warrants careful consideration. (USB 2.0, which is rapidly becoming the mainstream high-speed PC peripheral interface, is not a factor in industrial imaging, primarily because, although its 480-Mbps data rate is nominally higher than original FireWire's, USB 2.0's host-centric protocol makes it slower for imaging than FireWire.)

FireWire is a popular high-speed serial bus in consumer video and home-entertainment systems. This plug-and-play bus uses a multidrop architecture and a peer-to-peer communication protocol. The standard's initial specification included data rates as high as 400 Mbps, with rates eventually planned to reach 3.2 Gbps. In January 2003, the IEEE released 1394b, and its proponents expected soon to see an 800-Mbps version in vision hardware. Yet despite the reasonable cost of industrial FireWire cameras, their increasing availability in consumer devices (where the required resolution, and sometimes frame rate, is more modest than in industrial devices), the convenience of their thin, flexible serial cables, and the noise immunity of the bus's digital signaling, the selection of such cameras remains limited.

Cost may also limit the adoption of FireWire in industrial imaging: industrial FireWire cameras are more expensive than industrial analog-output cameras of the same frame rate and resolution. On the other hand, cost comparisons between FireWire and analog cameras can be misleading. In a system with a built-in FireWire port, the camera typically requires no additional interface hardware, because such cameras include an ADC (analog-to-digital converter), whereas an analog camera requires a frame grabber to perform the necessary ADC function (Figure 3).

Figure 3 illustrates National Instruments' Celeron-based CVS-1454 CompactVision System, showcasing machine vision hardware designed for factory environments. While this system (top right) is not a standard office PC, it includes three FireWire ports, eliminating the need for special camera interface hardware. The system is used in conjunction with National Instruments' LabVIEW graphical development environment, which allows for rapid program development using interactive graphical tools, followed by full graphical programming functionality for device debugging if necessary.

FireWire cameras use IEEE 1394's isochronous protocol, which guarantees bandwidth and ensures that data packets arrive in the order they were sent, if they arrive at all. The standard's other, asynchronous protocol guarantees message delivery but does not guarantee that packets arrive in the order sent. Each isochronous device can issue a bandwidth request every 125 μs, that is, at a maximum rate of 8 kHz. The device acting as bus manager grants each requesting device the right to send a predetermined number of data packets within the subsequent 125 μs.
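The cycle arithmetic above implies a simple sizing rule: at 8000 cycles per second, a device must be granted enough payload in each 125 μs cycle to cover its average data rate. A rough sketch, with illustrative names and packet overhead ignored:

```python
CYCLES_PER_SECOND = 8000  # one isochronous cycle every 125 microseconds

def bytes_per_cycle(data_rate_bps):
    """Average payload in bytes a device needs in each 125 us cycle to
    sustain data_rate_bps; packet and header overhead are ignored."""
    return data_rate_bps / (8 * CYCLES_PER_SECOND)

# A camera streaming roughly 147.5 Mbps of pixel data would need about
# 2304 bytes of payload in every cycle
print(bytes_per_cycle(147_456_000))  # 2304.0
```

The bus manager's grant is exactly this kind of per-cycle byte budget, which is why adding cameras to the bus shrinks what each one can send.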

The more isochronous devices on the bus, the less bandwidth each device gets. With only one camera on the FireWire bus, a 1280×960-pixel monochrome camera can send approximately 15 frames per second, and a 640×480-pixel FireWire color camera approximately 30 frames per second. Although neither example seems to use the bus's full data-transfer capacity, the number of bits per pixel and the method the camera uses to format the data both affect the maximum frame rate. Incidentally, higher resolution isn't always better. Higher-resolution cameras are not only more expensive and typically slower in frame rate than lower-resolution cameras, but they also more easily reveal subtle differences between UUTs and KGUs, increasing the rate at which AOI systems erroneously report faults.
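The frame-rate figures above can be checked with back-of-the-envelope arithmetic. The bit depths here are assumptions (8-bit monochrome, 16-bit YUV 4:2:2 color), and the calculation ignores packet formatting, which is partly why real cameras fall short of the bus's nominal 400 Mbps:

```python
def data_rate_mbps(width, height, bits_per_pixel, fps):
    """Raw pixel-data rate in Mbps, ignoring formatting overhead."""
    return width * height * bits_per_pixel * fps / 1e6

mono = data_rate_mbps(1280, 960, 8, 15)    # assumed 8-bit monochrome
color = data_rate_mbps(640, 480, 16, 30)   # assumed 16-bit YUV 4:2:2 color
print(mono, color)  # 147.456 147.456, well under the nominal 400 Mbps
```

Under these assumptions, both cited examples work out to the same raw rate, which suggests the limit lies in per-cycle packet budgeting and formatting rather than in total bus bandwidth.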

VI. More Camera Interfaces

Besides FireWire, interface options for digital-output cameras include RS422 parallel interfaces and CameraLink (Table 1). RS422 camera interfaces are not fully standardized, so a camera usually requires a dedicated interface card. These cards are not frame grabbers in the sense of the interface cards for analog-output cameras, but they typically plug into the host PC's PCI bus. Parallel interfaces can prove impractical because they sometimes need more than 50 connections. Nevertheless, RS422 digital cameras remain popular and continue to be widely used.

The AIA's CameraLink is the highest-performance digital-output camera interface standard. Unlike FireWire, CameraLink allows only one camera per bus, but many PCs can accommodate multiple CameraLink buses. CameraLink can transmit data at speeds as high as 4.8 Gbps over point-to-point, unidirectional serial links combined in parallel, using SERDES (serializer/deserializer) technology. Each link carries data from 7 channels and uses LVDS (Low-Voltage Differential Signaling), which requires two wires per link. The number of channels determines the maximum data rate of a CameraLink bus. A fully configured bus can have 76 channels over 11 links and 22 wires, although the standard also provides for buses of 28 and 56 channels (4 and 8 links over 8 and 16 wires, respectively). Each CameraLink bus typically requires its own interface card in the PC.
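For the smaller CameraLink configurations, the channel and wire counts follow directly from the ratios stated above: 7 channels per serialized link and 2 wires (one LVDS pair) per link. A small sketch with illustrative names:

```python
BITS_PER_LINK = 7    # each serialized link carries 7 data channels
WIRES_PER_LINK = 2   # one LVDS differential pair per link

def cameralink_config(links):
    """Channels and wires for a CameraLink bus built from `links` links."""
    return {"channels": links * BITS_PER_LINK,
            "wires": links * WIRES_PER_LINK}

print(cameralink_config(4))  # {'channels': 28, 'wires': 8}
print(cameralink_config(8))  # {'channels': 56, 'wires': 16}
```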

Choosing the CameraLink bus currently involves writing additional software. Because CameraLink bus cards for PCs are both scarce and not fully standardized, packaged application-development software often lacks a CameraLink driver. Nevertheless, if you require CameraLink's impressive speed, you don't have many options.

Sometimes you can use smart cameras to reduce the amount of data your vision system has to process, because they can process or compress the data they capture before sending it to the host PC. Such cameras can reduce both the data rate between the camera and the host and the processing load on the host, but at greater expense. You must also ensure that the compression is truly lossless or that any losses during compression do not affect the inspection.
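A concrete example of the "truly lossless" compression the paragraph demands is run-length encoding, which works well on the flat backgrounds common in inspection images. This toy sketch (illustrative names, one image row as a plain list) includes the round trip that proves losslessness:

```python
def rle_encode(pixels):
    """Lossless run-length encoding: a list of (value, run_length) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1] = (p, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((p, 1))               # start a new run
    return runs

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original pixel list."""
    return [value for value, length in runs for _ in range(length)]

row = [255] * 100 + [0] * 20 + [255] * 100   # bright field with a dark part
encoded = rle_encode(row)
assert rle_decode(encoded) == row            # round trip: nothing was lost
print(len(encoded))  # 3 runs stand in for 220 pixel values
```

A smart camera applying a scheme like this would cut the camera-to-host data rate dramatically on uniform scenes while guaranteeing that the host reconstructs exactly the captured image.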
