If the extracted features are not precise enough, any judgment based on them will inevitably be inaccurate. At the same time, if the features are not selective enough, or the feature space has too many dimensions, the subsequent discrimination algorithms can become extremely complex, a problem known as the "curse of dimensionality".
Commonly used image features for surface defect detection in the industry include geometric features, shape features, color features, texture features, and grayscale features.
The most basic features of a defect are its geometric features, generally expressed as the defect's perimeter, area, location, and centroid. The perimeter and area are the number of pixels on the defect's boundary and inside the defect region, respectively, so these features can be extracted by counting pixels.
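The pixel-counting idea can be sketched as follows. This is a minimal illustration, assuming the defect has already been segmented into a binary mask; the function name `geometric_features`, the 4-neighbour boundary rule, and the use of a bounding box for "location" are choices made here, not something the text prescribes.

```python
import numpy as np

def geometric_features(mask):
    """Geometric features of one defect given as a binary mask (True = defect pixel)."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("mask contains no defect pixels")

    # Area: number of pixels inside the defect region.
    area = int(mask.sum())

    # Pad with background so defects touching the image border are handled.
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    # A pixel is interior when it and all four of its 4-neighbours are defect pixels.
    interior = (
        padded[1:-1, 1:-1]
        & padded[:-2, 1:-1] & padded[2:, 1:-1]
        & padded[1:-1, :-2] & padded[1:-1, 2:]
    )
    # Perimeter: number of defect pixels that lie on the boundary.
    perimeter = int((mask & ~interior).sum())

    # Centroid: mean (row, column) coordinate of the defect pixels.
    rows, cols = np.nonzero(mask)
    centroid = (float(rows.mean()), float(cols.mean()))
    # Location, here taken as the bounding box (min_row, min_col, max_row, max_col).
    bbox = (int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max()))

    return {"area": area, "perimeter": perimeter, "centroid": centroid, "bbox": bbox}
```

Counting boundary pixels gives only a coarse perimeter estimate for diagonal or curved edges; it is used here because it matches the pixel-counting definition in the text.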
Shape features are descriptors such as rectangularity, slenderness, roundness, compactness, invariant moments, and eccentricity. Shape descriptions fall into two main categories, contour-based and region-based, depending on whether the features are extracted from the contour alone or from the whole shape region. The combination of geometric and shape features is an important basis for distinguishing defect types.
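A few region-based shape descriptors can be sketched as below. The definitions used here are common conventions rather than the only possible ones, and they are assumptions of this example: rectangularity as region area over axis-aligned bounding-box area, slenderness as the long side over the short side of that box, and eccentricity from the eigenvalues of the pixel-coordinate covariance matrix. The function name `shape_features` is hypothetical.

```python
import numpy as np

def shape_features(mask):
    """Region-based shape descriptors of a binary defect mask (True = defect pixel)."""
    mask = np.asarray(mask, dtype=bool)
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("mask contains no defect pixels")

    area = rows.size
    height = int(rows.max() - rows.min() + 1)
    width = int(cols.max() - cols.min() + 1)

    # Rectangularity: how well the region fills its axis-aligned bounding box.
    rectangularity = area / (height * width)
    # Slenderness: long side over short side of the bounding box.
    slenderness = max(height, width) / min(height, width)

    # Eccentricity from second-order central moments of the pixel coordinates.
    coords = np.stack([rows, cols], axis=1).astype(float)
    centred = coords - coords.mean(axis=0)
    cov = centred.T @ centred / area
    lam_min, lam_max = np.linalg.eigvalsh(cov)  # eigenvalues in ascending order
    if lam_max > 0:
        eccentricity = float(np.sqrt(1.0 - max(lam_min, 0.0) / lam_max))
    else:
        eccentricity = 0.0  # single pixel: no elongation to measure

    return {
        "rectangularity": float(rectangularity),
        "slenderness": float(slenderness),
        "eccentricity": eccentricity,
    }
```

With these definitions a filled square yields rectangularity 1, slenderness 1, and eccentricity 0, while a long thin scratch yields a large slenderness and an eccentricity close to 1.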
Color features are the most widely used visual features in image retrieval and are also primary perceptual cues in human image recognition. Unlike geometric and shape features, color features have a degree of rotation and translation invariance, which makes them robust. They can be extracted and matched using methods such as color histograms, color coherence vectors, and color moments.
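Two of these descriptors, color histograms and color moments, can be sketched as follows. The example assumes an 8-bit image stored as a height x width x channels array; the bin count of 8 per channel and the function names are illustrative choices. Because both descriptors ignore where each pixel is located, they do not change when the image is rotated by 90 degrees.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Per-channel normalised histogram of an 8-bit color image, concatenated."""
    image = np.asarray(image)
    hists = []
    for c in range(image.shape[2]):
        counts, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        hists.append(counts / counts.sum())  # each channel's bins sum to 1
    return np.concatenate(hists)

def color_moments(image):
    """First three color moments per channel: mean, standard deviation, skewness."""
    image = np.asarray(image, dtype=float)
    pixels = image.reshape(-1, image.shape[2])
    mean = pixels.mean(axis=0)
    std = pixels.std(axis=0)
    # Cube root of the third central moment, so it has the same units as the mean.
    skew = np.cbrt(((pixels - mean) ** 3).mean(axis=0))
    return np.concatenate([mean, std, skew])
```

Two images can then be compared by a distance between their descriptor vectors, for example the sum of absolute differences.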
Texture features are an important inherent property of images, reflecting the slowly or periodically varying structure and arrangement of an object's surface. Common methods for describing texture include statistical methods and spectral methods. Statistical methods use the moments of the image's gray-level histogram to describe the texture, while spectral methods describe it through the characteristics of the image's Fourier spectrum.
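Both families can be sketched briefly. The statistical part below computes a few histogram-moment descriptors (a smoothness measure, the third moment, and uniformity), and the spectral part sums the Fourier power spectrum over rings of constant radius so that periodic textures appear as peaks. The function names, the choice of these particular descriptors, and the normalisation of the variance by the squared gray-level range are assumptions of this example.

```python
import numpy as np

def histogram_texture_moments(gray, levels=256):
    """Statistical texture descriptors from the gray-level histogram (integer image)."""
    gray = np.asarray(gray)
    hist = np.bincount(gray.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()                      # probability of each gray level
    z = np.arange(p.size)
    mean = (z * p).sum()
    variance = ((z - mean) ** 2 * p).sum()
    # Smoothness is 0 for a constant region and grows with gray-level spread.
    smoothness = 1.0 - 1.0 / (1.0 + variance / (levels - 1) ** 2)
    third_moment = ((z - mean) ** 3 * p).sum()  # sign indicates histogram skew
    uniformity = (p ** 2).sum()                 # 1 when all pixels share one level
    return {
        "smoothness": float(smoothness),
        "third_moment": float(third_moment),
        "uniformity": float(uniformity),
    }

def radial_power_spectrum(gray):
    """Fourier power summed over rings of integer radius around the zero frequency."""
    gray = np.asarray(gray, dtype=float)
    gray = gray - gray.mean()                   # drop the DC term
    power = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    h, w = gray.shape
    y, x = np.indices((h, w))
    # After fftshift the zero frequency sits at index (h // 2, w // 2).
    radius = np.hypot(y - h // 2, x - w // 2).round().astype(int)
    return np.bincount(radius.ravel(), weights=power.ravel())
```

For a 64 x 64 image of stripes that repeat 8 times across the width, the radial profile peaks at radius 8, which is the stripe frequency in cycles per image.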
The grayscale features of a defect are statistics of how the gray values of its pixels are distributed over the image's quantization levels. They can be obtained from the image's grayscale histogram, for example its mean, variance, and entropy.
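These histogram statistics can be computed in a few lines. The sketch assumes an integer-valued grayscale image with 256 quantization levels; the function name `grayscale_features` and the use of base-2 logarithms for the entropy (so the result is in bits) are choices made for this example.

```python
import numpy as np

def grayscale_features(gray, levels=256):
    """Mean, variance and entropy of the gray-level histogram of an integer image."""
    gray = np.asarray(gray)
    hist = np.bincount(gray.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()                  # normalised histogram
    z = np.arange(p.size)                  # gray levels 0 .. levels-1
    mean = float((z * p).sum())
    variance = float(((z - mean) ** 2 * p).sum())
    nonzero = p[p > 0]                     # skip empty bins to avoid log(0)
    entropy = float(-(nonzero * np.log2(nonzero)).sum())
    return {"mean": mean, "variance": variance, "entropy": entropy}
```

An image whose pixels are half black (0) and half white (255) has mean 127.5 and entropy of exactly 1 bit, while a constant image has zero variance and zero entropy.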
Feature extraction from defect images transforms the image space into a feature space. In practical projects, various basic features of the image are typically combined into a single feature vector that describes the defect comprehensively. However, not all features are useful for subsequent defect detection and image understanding. Extracting too many features produces a high-dimensional feature vector with redundant information and costly computation, which calls for dimensionality reduction methods such as Principal Component Analysis (PCA). Conversely, extracting too few features describes the defect inaccurately, leading to unsatisfactory accuracy and precision.
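As a rough illustration of the dimensionality-reduction step, PCA can be written with a singular value decomposition. The sketch assumes a feature matrix with one row per defect sample and one column per feature, and that the features are not all constant; the function name `pca_reduce` is hypothetical.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project feature vectors (one per row) onto their top principal components."""
    X = np.asarray(features, dtype=float)
    centred = X - X.mean(axis=0)               # PCA works on mean-centred data
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centred, full_matrices=False)
    components = vt[:n_components]
    variances = singular_values ** 2 / (X.shape[0] - 1)
    explained_ratio = variances[:n_components] / variances.sum()
    reduced = centred @ components.T            # shape: (n_samples, n_components)
    return reduced, components, explained_ratio
```

When the combined features have very different units or ranges (for example area in pixels next to an entropy in bits), they are usually standardised to zero mean and unit variance first, otherwise the largest-valued feature dominates the principal components. The explained-variance ratio gives a practical way to choose how many components to keep.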