0 Introduction
Machine vision systems are non-contact optical sensing systems. Integrating both hardware and software, they automatically acquire information from captured images or generate control actions. Over its roughly 15-year history, machine vision has progressed through three main stages: systems built from digital circuits, systems built around a PC and output devices, and embedded systems. Embedded machine vision systems, built on specialized computer technology, feature real-time multitasking operating systems, high-efficiency compression chips, and powerful microprocessors. They can integrate video compression, transmission, and processing entirely on chip and, after internal processing, connect directly to Ethernet or a WAN for real-time remote network monitoring, making them one of the current research hotspots.
In domestic and international research, there are three main ways to implement embedded machine vision systems:
(1) Systems based on a standard bus that use a DSP as the computing and control processor. Although DSP chips can process large amounts of information at high speed, their I/O interfaces are simple and hard to expand, and their control capabilities are weak, so this approach still has certain limitations.
(2) Machine vision systems based on a DSP+FPGA architecture. Combining an FPGA with a DSP enables wideband signal processing and greatly improves signal processing speed. However, FPGAs are programmed in hardware description languages, which makes algorithm development difficult; because functions are implemented in hardware, the system is also strongly affected by its operating environment.
(3) Machine vision systems using an ARM microprocessor or an ARM+DSP architecture. These offer powerful human-computer interaction, high integration, good real-time performance, and multitasking support. However, in such systems the ARM and DSP still exchange data through external circuit connections, which increases system instability.
Based on the advantages and disadvantages of the above-mentioned technical solutions, this paper proposes a novel machine vision system to achieve high-speed acquisition and storage of image information.
Its core chip is TI's latest dual-core embedded processor, which integrates an ARM core and a DSP core on a single chip; coordination between the two is achieved through software programming. A machine vision processing system built on this chip combines the strong control capabilities of the ARM core running embedded Linux with the powerful computing capabilities of the DSP, ensuring excellent real-time performance and stability and providing a robust hardware platform for video acquisition and processing in machine vision research and applications.
1 System Functions
This system is a high-speed image data acquisition and storage system. Through its hardware and software design, it can acquire three input signals in real time: two channels at a resolution of 640×480 with a frame rate of 60 frames per second and a pixel depth of 12 bits, and one channel at a resolution of 1024×1024 with a frame rate of 60 frames per second and a pixel depth of 12 bits. The acquired data is also stored in real time without compression.
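As a sanity check on these figures, the aggregate raw data rate can be estimated. This is a minimal sketch; it assumes 12-bit pixels (1.5 bytes each, packed) and decimal megabytes:

```python
def raw_data_rate_bytes(channels, bits_per_pixel=12, fps=60):
    """Aggregate raw data rate in bytes/s for a list of (width, height) channels."""
    pixels_per_second = sum(w * h for w, h in channels) * fps
    return pixels_per_second * bits_per_pixel // 8

# Two 640x480 channels plus one 1024x1024 channel, all at 60 fps, 12 bits/pixel.
rate = raw_data_rate_bytes([(640, 480), (640, 480), (1024, 1024)])
print(rate / 1e6)  # roughly 150 MB/s, the same order as the ~160 MB/s storage figure used later
```

The result (about 150 MB/s) is consistent with the approximate 160 MB/s storage requirement cited in the storage module design.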
As shown in Figure 1, the system controls the image sensor via a serial port, allowing three image data signals, a clock, and various synchronization signals to be input as required. The system then sequentially acquires, processes, and stores the image signals. Utilizing its built-in interface, the system can perform additional functions such as display, communication with a host computer, and keyboard control, enabling user-friendly human-machine interaction.
2 Hardware Design
This system uses TI's latest TMS320DM8168 chip from the DaVinci series. The chip integrates a 1 GHz ARM Cortex-A8, a 1 GHz TI C674x floating-point DSP, several second-generation programmable high-definition video coprocessors, an innovative high-definition video processing subsystem (HDVPSS), and comprehensive codec support at high-definition resolutions, including H.264, MPEG-4, and VC-1. It also provides multiple interfaces, such as Gigabit Ethernet, PCI Express, SATA2, DDR2/DDR3, USB 2.0, MMC/SD, HDMI, and DVI, supporting further functional expansion and complex applications.
This chip was used to design and implement the acquisition, processing, and display of two or three image signals with different resolutions. The hardware schematic is shown in Figure 2. The hardware modules involved in the development and design of this system include: an image acquisition interface module, an image acquisition module, an image storage module, and a peripheral interface module.
2.1 Image Acquisition Interface Module
As the connection between the image sensor and the high-speed acquisition system, this module can acquire images from, and control, cameras with USB or CameraLink interfaces. A USB connection is very convenient: since the system has a USB peripheral interface, such cameras can be connected according to the standard USB protocol. The CameraLink interface has an open interface protocol, allowing different manufacturers to preserve product differences while ensuring interoperability. The image acquisition interface module in this system therefore adopts the CameraLink interface protocol. The module uses the DS90CR288A, DS90LV049, and DS90LV047 to control the image sensor, acquire image information, and provide bidirectional communication between the image sensor and the image acquisition system.
2.2 Image Acquisition Module
The TMS320DM8168's HDVPSS (HD Video Processing Subsystem) provides video input and output interfaces. The video input interface allows external imaging devices (such as image sensors and video decoders) to be connected.
The HDVPSS supports up to three 1080p channels at 60 fps and can simultaneously support high-quality H.264 encoding and decoding of multiple D1 and CIF data streams. It provides two independent video capture input ports, each supporting scaling and pixel-format conversion. Each capture port can operate either as a single 16-bit input channel (with separate Y and Cb/Cr inputs) or as two independently clocked 8-bit input channels (with interleaved Y/C data). The first video input port can also operate in 24-bit mode to support RGB capture. All capture modes support capture clock speeds of up to 165 MHz, meeting the demands of high-speed image acquisition.
The High-Definition Video Processing Subsystem (HDVPSS) has two independent video capture input ports, VIP0 and VIP1. VIP0 can be configured in 24-bit, 16-bit, or two independent 8-bit modes, while VIP1 can be configured in 16-bit or two independent 8-bit modes. Given the capture frequency and these configuration modes, different data volumes can be handled in several ways. For simplicity of the storage design, this solution configures VIP0 for 24-bit acquisition. In this mode the maximum data rate is 165 MHz × 24 bits / 8 = 495 MB/s, which meets the bandwidth requirement.
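The 24-bit VIP0 bandwidth figure follows directly from the capture clock; a quick check (decimal megabytes assumed):

```python
CAPTURE_CLOCK_HZ = 165_000_000  # maximum capture clock of the DM8168 video ports
BUS_WIDTH_BITS = 24             # VIP0 configured for 24-bit acquisition

max_rate_bytes = CAPTURE_CLOCK_HZ * BUS_WIDTH_BITS // 8
print(max_rate_bytes // 1_000_000)  # 495 (MB/s)
```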
At the highest capture clock, the acquisition interval is 1/(165 MHz), approximately 6.1 ns. Based on these calculations, and for design convenience, three base-configuration CameraLink cameras with a frame rate of 200 fps are proposed, with the frame rate controlled by external triggering. Each CameraLink camera outputs two pixels at a time, each pixel being 12 bits (2×12 bits), which exactly matches VIP0's 24-bit acquisition mode. Taking three-channel time-division acquisition as an example, as shown in Figure 3, the three cameras take turns, each acquiring one frame per cycle. This requires generating the timing signals for three-channel time-division acquisition: a timer generates a pulse of 1/200 s width, which is sent to the three cameras with staggered delays. The timing relationship of the three acquisition signals is as follows: the first camera has no delay, the second camera is delayed by 1/200 s, and the third camera is delayed by 2/200 s.
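The time-division trigger schedule described above can be written out explicitly. This is a sketch; the function name is illustrative, while the per-camera offsets of k/200 s come from the text:

```python
TRIGGER_RATE_HZ = 200  # externally triggered frame rate per camera

def trigger_delay_s(camera_index):
    """Delay of camera k's trigger pulse relative to the timer pulse: k * 1/200 s."""
    return camera_index / TRIGGER_RATE_HZ

delays = [trigger_delay_s(k) for k in range(3)]
print(delays)  # [0.0, 0.005, 0.01], i.e. 0, 1/200 s, 2/200 s

# At the 165 MHz maximum capture clock, one sample interval is about 6.1 ns:
print(1 / 165e6)  # ~6.06e-09 s
```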
After receiving the command via the DS90LV047A, the camera serializes the captured image data into 4 LVDS data signals and 1 LVDS clock signal and transmits them through the MDR26 interface connector to the DS90CR288A. The DS90CR288A converts the serial data into 28 parallel signals plus 1 accompanying clock signal and passes them to the TMS320DM8168 video capture port VIP0 for acquisition.
2.3 Image Storage Module
Based on the above design, the required system storage speed is approximately 160 MB/s. For this data volume, a high-capacity, high-speed solid-state drive can be selected and written through the chip's SATA2 interface.
After acquisition, the HDVPSS subsystem is configured to send the data through the VPDMA and finally into DDR memory. When the amount of data in DDR memory reaches a set threshold, an interrupt is generated. The interrupt handler then starts a DMA transfer from memory to the solid-state drive at the stored address, writing the acquired image to the SSD through the SATA2 interface.
Then, the timer is started to generate the next frame rate pulse, and the data acquisition for the next cycle begins.
The external expansion memory is DDR3-1600, which the system supports. With the system's 32-bit memory controller, the memory bandwidth reaches 32/8 × 1600 M = 6.4 GB/s. In this mode, acquisition and storage can proceed in parallel: data captured into the cache is moved to DDR3 memory at a rate far higher than the port's per-second acquisition rate. Because this scheme acquires data frame by frame, and the data within each frame is already arranged in compact order, data rearrangement is greatly reduced; only some auxiliary data needs to be removed. The acquisition system formats all other relevant signals frame by frame, allowing the camera's clock signal to synchronize with the acquisition port's clock signal. A small amount of auxiliary data precedes the image signal, and it is skipped by setting the DMA start address appropriately. Even allowing for other system activity, the solid-state drive can hold DMA control for at least 80% of the time while storing image data from memory. With the selected drive's sustained write speed of 250 MB/s, the effective rate is 250 × 0.8 = 200 MB/s, which exceeds 160 MB/s, so each second of acquired data can be stored in real time. After the data has been uploaded, the existing data can be cleared to free drive space.
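The storage-side margins quoted above can be checked the same way. This is a sketch using the paper's own figures (80% DMA occupancy, 250 MB/s sustained write speed):

```python
# DDR3-1600 bandwidth with a 32-bit memory controller: 32/8 bytes x 1600 MT/s.
ddr3_bandwidth_mb = 32 // 8 * 1600   # 6400 MB/s = 6.4 GB/s

# Effective SSD write rate: sustained 250 MB/s, DMA held at least 80% of the time.
effective_ssd_mb = 250 * 0.8         # 200 MB/s

required_mb = 160                    # approximate acquisition data rate
print(ddr3_bandwidth_mb, effective_ssd_mb, effective_ssd_mb > required_mb)  # 6400 200.0 True
```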
2.4 Peripheral Interface Module
Thanks to the rich peripheral interfaces of the TMS320DM8168 chip, the system's external interfaces can be designed flexibly to control peripheral devices and communicate with external processors. Depending on requirements, the available interfaces include two Gigabit Ethernet MACs (10/100/1000 Mb/s) with GMII and MDIO interfaces; two USB ports with integrated USB 2.0 PHYs; dual DDR2/DDR3 SDRAM interfaces; and more, as shown in Figure 2.
The TMS320DM8168's two USB ports allow a keyboard and mouse to be connected when uploading acquired image data to a host computer. The LCD and VGA interfaces can display images directly. The serial port is used both for communication with the host computer and for controlling the CameraLink camera used in this design. The Gigabit Ethernet interface, with its very high speed, ensures fast transmission of image data.
These functions are implemented mainly by driving the peripheral interfaces through software programming; the specific solutions are described in the software design section below.
3 Software Design
This system runs the Linux operating system and provides a user-friendly interface, making operation flexible and enabling multitasking. Camera control, image acquisition and stopping, display, and image uploading can all be performed through the interface. Development consists of two parts: ported programs and self-developed programs. The software design is shown in Figure 4.
3.1 Porting Procedure
The ported programs include the Linux kernel and the network card, USB 2.0, LCD, serial port, VGA, and SATA2 drivers. TI provides excellent support here, offering a dedicated Linux kernel (version 2.6.37) for the DM8168, which can be developed against using TI's Linux EZ Software Development Kit (EZSDK).
3.2 Self-developed program
3.2.1 Driver
To operate correctly under the Linux operating system, the image acquisition circuitry requires driver support for the image acquisition application. The acquisition circuitry can be divided into several functional modules, each with its own driver: a camera acquisition driver (responsible for operations after data enters through VIP0); a control driver (responsible for controlling the Timer); and a driver that adapts the camera's operating state to external environmental conditions. The acquisition driver implements the `open` and `close` methods; the control driver implements `open`, `close`, and `ioctl`; and adaptive rate adjustment requires `open`, `close`, `ioctl`, and `read`. Device nodes are created in the `/dev` directory, and the application operates the hardware through these nodes.
3.2.2 Application
The application is developed with the Qt toolkit. It is designed as a multi-threaded program consisting of a main thread and an adaptive parameter-adjustment thread, and it implements data acquisition, stop, display, configuration, and upload functions, each corresponding to a specific button.
The program behind the data acquisition button calls the device node's `open` method. In `open`, the corresponding hardware is configured, the interrupt routine is registered, and the Timer is started to begin data acquisition. The process is shown in Figure 5.
Because the system already has a serial port driver, the configuration program can program the serial port directly. The adaptive rate-adjustment program starts a new thread from the main interface program. This thread reads data through the corresponding device node, determines whether an adjustment is needed, and if so resets the camera through the serial port device node or the control device node described above.
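The adjustment decision itself can be kept in a small pure function so that the thread body stays trivial. This is a sketch under assumed names; the brightness thresholds and the idea of returning a signed step are illustrative, not from the paper:

```python
import threading, queue

def decide_adjustment(measured, low=80, high=170):
    """Return +1/0/-1: raise, keep, or lower the camera setting for the measured value."""
    if measured < low:
        return +1
    if measured > high:
        return -1
    return 0

def adjustment_thread(samples, actions):
    """Reads measurements (a stand-in for the device node) and queues decisions."""
    for measured in iter(samples.get, None):   # None terminates the loop
        actions.put(decide_adjustment(measured))

samples, actions = queue.Queue(), queue.Queue()
t = threading.Thread(target=adjustment_thread, args=(samples, actions))
t.start()
for v in (50, 120, 200, None):
    samples.put(v)
t.join()
print([actions.get() for _ in range(3)])  # [1, 0, -1]
```

In the real system, the `samples` queue would be replaced by `read` calls on the adaptive driver's device node, and the queued decisions by writes to the serial port or control device node.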
4 Conclusion
The machine vision system constructed in this paper is a small, independent, controllable, multifunctional system with its own operating system. It is implemented through combined hardware and software design, and its functional modules include video image acquisition and processing, storage, communication, and display. Built on an advanced dual-core embedded processor, it acquires video signals from multiple image sensors in parallel at high speed, performs lossless image compression and image fusion as needed, and stores the data on large-capacity media in real time. It communicates with a host computer through multiple interfaces, features a user-friendly human-machine interface, and can drive various displays for high-definition display and information playback.
Because the platform runs a Linux operating system, system parameters can be set and functions selected without a host computer. The system can provide high-definition target information for airborne, missile-borne, and vehicle-mounted optoelectronic systems performing high-speed scanning, rapid detection, active identification, and precise tracking tasks. It is expected to find application in fields such as smart cities, security, industrial control, medical and educational institutions, logistics management, power grid operation, smart homes, smart cars, and food safety.