An algorithm for detecting low-contrast objects in different target environments
https://doi.org/10.38013/2542-0542-2021-2-76-89
Abstract
We propose an algorithm for detecting low-contrast objects in different target environments for use in an optoelectronic system. The algorithm detects low-contrast objects in a complex environment in real time, taking into account the relative movement of the camera and the object.
For citations:
Volchkova D.S., Dolgova A.S. An algorithm for detecting low-contrast objects in different target environments. Journal of «Almaz – Antey» Air and Space Defence Corporation. 2021;(2):76-89. https://doi.org/10.38013/2542-0542-2021-2-76-89
Introduction
Optoelectronic systems (OES) are widely used as part of automatic target tracking systems to track objects that can be observed in clear sky or cloudy conditions in order to detect objects and estimate motion parameters in real time. Observation objects may include airplanes, helicopters, self-propelled anti-aircraft vehicles (SPAAV) and others.
According to open source data, studies in this field suggest re-equipping the existing weapon systems with OES. This paper analyses one of the trends in enhancing the functional capabilities of OES for anti-aircraft defence (AAD) systems operated by land forces. Such enhancement makes it possible to detect air and ground targets in real time using algorithms that process images acquired via an optical channel. The quality of the solution largely depends on how accurately the region containing the moving object is extracted from a series of frames, because all initial data are extracted from images [1].
Fig. 1. Image blur effect: a – static image, b – image shift by 10 pixels
Images obtained through video recording of high-speed objects are susceptible to distortions, the most critical of which is blurring of the moving object by around 10 pixels (Fig. 1). Both the object and the camera can move, which further complicates the problem. In addition, recording is often carried out in insufficient illumination, under natural interference such as snow, rain or mist, or with image blurring due to limited camera exposure time. The problem is further complicated by dynamic changes in the scene, such as changes in illumination or weather. An efficient detection algorithm must cope with changes in a frame that are not moving objects, such as swaying foliage or image changes caused by camera shake. The algorithm must also adapt to illumination changes, both smooth (the transition between day and night) and abrupt (lights switched on or off). Illumination changes can be either global or local (for example, object shadows and reflections). Another adverse factor is that moving objects may have characteristics similar to those of the background, which makes object extraction much more difficult. Moving object detection in adverse conditions is therefore a complex problem that has to be solved by combining different image processing methods.
An overview of existing methods
The main types of interference affecting image object extraction are illumination change, camera position change, colour noise and system noise. Adaptive filtering is widely used to eliminate noise at the motion detection phase. The use of a sigma-delta filter at the background subtraction phase is proposed in [2] in order to obtain a more accurate object mask. This method reduces the impact of an unstable background and of noise caused by slight motion of the camera or background. The Gaussian filter is also used to reduce noise caused by an unstable background [3]. To eliminate the impact of noise, especially noise caused by illumination change, multithreshold processing is used [4]. For example, the double-threshold method is proposed in [5].
To avoid threshold determination, the Sobel edge detection algorithm was used for an image obtained as a result of pixel-by-pixel difference between the current image and a reference image [6].
Statistical analysis is based on noise distribution simulation [7][8][9][10][11]. Instead of applying a threshold to a difference image, this method compares the statistical behaviour of small regions around each pixel of the difference image with a noise model.
The study [12] proposes a detection method with the object viewed as a system of elements, i.e. selected boundary lines. This method is based on the neural network technique that employs the reference image of an object for training, with the reference frame based on such an image. The analysed image is processed using the Haar filter with subsequent extraction and connection of points. The detection procedure includes formation of the analysed image frame, search of the best alignment with the reference image, making a decision on detection by comparing the amount of matched lines with the threshold value. The disadvantage of the method is that it can be applied only to a limited category of objects of particular shape.
An improved image background extraction algorithm is proposed in [13]. To improve extraction, a boundary detection method with subsequent boundary filling and alignment with the detected object is used. Detection of a moving object is optimized with the help of median filtering and morphological processing methods. This algorithm is efficient only for processing images shot by a stationary camera.
A wavelet transform-based method that transforms the image into the frequency domain may be used for detecting moving objects [14]. In the study [15], the wavelet transform is used for background evaluation based on previous frames in a video sequence, as well as for extracting a moving object and its location. Another method, based on fractional lower-order statistics, was proposed in [16]. Compared with spatial domain methods, frequency domain methods are less affected by background illumination change when extracting the region with a moving object. However, the problem of shadows remains just as significant for them. The distinctive feature of frequency domain methods is their high computational complexity, which is why they are used much less commonly than spatial domain methods.
The research paper [17] proposes an algorithm that extracts moving objects from an initial video sequence in real time using low-end equipment. The algorithm is based on the observation that the more often a pixel takes a particular colour, the higher the probability that this colour belongs to the background.
According to the analysis, despite its significance, the problem of detecting a moving object in adverse conditions is not completely solved. Some problems can be solved using computationally-complex algorithms, but solving real-world problems requires high computational speed.
That is why the urgent problem is to develop a low-contrast object real-time detection algorithm resistant to the impact of additive noises, natural interference, camera shaking and background illumination change.
The study objective is to improve the low-contrast object detection accuracy in an image sequence by using a combination of methods of image processing in the frequency domain and mathematical statistics apparatus.
Detection algorithm description
Assume there is a video sequence in YCrCb 4:2:2 format, i.e. an ordered set of image frames indexed by the frame number t, where $z_{i,j}$ is the brightness of the pixel with coordinates (i, j). The task is to detect a low-contrast object in the video sequence frames and extract its image. In this paper, low-contrast objects are moving objects whose visible contrast does not exceed 0.3.
A camera can be mounted on a moving target or a fixed platform. Received data are obtained in the visible spectrum and represented in the RGB colour space. The camera was focused at infinity, which made it possible to detect remote objects. The camera’s noise model was not known beforehand.
The algorithm has an important limitation: objects of interest, such as air or ground objects, must occupy a small portion of the camera coverage area. Images that do not satisfy this condition are treated as exceptions and are not processed further.
The proposed low-contrast object detection algorithm can be divided into two phases:
1. Frame preprocessing:
1.1. filtering;
1.2. contrast enhancement.
2. Extraction of motion region:
2.1. evaluation and compensation of inter-frame geometric transformations;
2.2. detection of low-contrast objects and frame subtraction;
2.3. detection marker placement.
The block diagram of the proposed algorithm is shown in Fig. 2. Description of each sub-algorithm is given below.
Fig. 2. Block diagram of proposed algorithm
1. Frame preprocessing
The problem of moving object detection in a video sequence consists in detecting considerable changes while ignoring minor ones. An image sequence is prepared for further analysis at the frame preprocessing phase. Frame preprocessing includes operations intended to correct distortions in geometry and brightness.
1.1. Filtering
OESs included in surface-to-air missile systems (SAMS) generate images of air and ground targets, the specific feature of which is that such images mostly represent dark objects on a light background. In our study, we used a median filtering algorithm, which was more efficient for noise elimination in a dark image.
Unlike a smoothing filter, a median filter preserves brightness jumps (object contours). If impulse noise is the interference source, a median filter is expected to be the more efficient solution. In this paper, the median filter uses a sliding window of 3×3 pixels.
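The 3×3 median filtering step can be illustrated with a minimal sketch. The function name `median_filter_3x3` and the border handling (replication of the nearest edge pixel) are assumptions made for illustration; the text does not specify how frame borders are treated.

```python
from statistics import median


def median_filter_3x3(image):
    """Apply a 3x3 median filter to a 2D list of brightness values.

    Assumption: border pixels are handled by replicating the nearest
    edge pixel; the paper does not specify border handling.
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            window = []
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    # Clamp indices so the window stays inside the image.
                    ii = min(max(i + di, 0), height - 1)
                    jj = min(max(j + dj, 0), width - 1)
                    window.append(image[ii][jj])
            result[i][j] = median(window)
    return result


# A single impulse on a flat background is removed.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
filtered = median_filter_3x3(noisy)

# A vertical brightness step (an object contour) is left unchanged.
step = [[0, 0, 100, 100] for _ in range(4)]
filtered_step = median_filter_3x3(step)
```

In this example the isolated impulse disappears while the brightness step keeps its position and sharpness, which is the property the paragraph above relies on.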
1.2. Contrast enhancement
Noise elimination minimizes the number of false responses of the difference motion detector to interference that occurs during image recording and transmission, but frames containing the same scene may still differ from one another to a great extent. This difference is caused by changes in the illumination level between frames. A change in the illumination level may be caused by artificial lights being switched on or off in an outdoor scene or by changing weather conditions. At such moments, pixel-by-pixel differences between two adjacent frames usually reach very high values, causing false responses of the detector, which registers motion across the entire area with variable illumination. To prevent such false responses, the preprocessing phase includes a procedure for correcting the brightness levels of the video frame sequence [7].
The most widespread image contrast enhancement algorithm is histogram equalization. It maps the frame obtained after the filtering procedure onto the brightness range bounded by $gr_{\min}$ and $gr_{\max}$, the minimum and maximum boundaries of histogram expansion.
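The expression used by the authors is not reproduced in the text above. The following is a minimal sketch under the assumption that the brightness range of the filtered frame is stretched linearly to the interval from `gr_min` to `gr_max`; this is one common reading of "histogram expansion", whereas equalization in the strict sense would remap brightness through the cumulative histogram. The function name `stretch_contrast` and the default bounds 0 and 255 are illustrative.

```python
def stretch_contrast(image, gr_min=0, gr_max=255):
    """Linearly stretch the brightness of a 2D list to [gr_min, gr_max].

    Assumption: linear stretching between the darkest and brightest
    pixel of the frame; the paper's exact expression is not available.
    """
    flat = [value for row in image for value in row]
    lowest, highest = min(flat), max(flat)
    if highest == lowest:
        # A flat frame carries no contrast to stretch.
        return [[gr_min for _ in row] for row in image]
    scale = (gr_max - gr_min) / (highest - lowest)
    return [
        [round(gr_min + (value - lowest) * scale) for value in row]
        for row in image
    ]


low_contrast = [[100, 110], [120, 150]]
stretched = stretch_contrast(low_contrast)
```

Here the original brightness span of 50 levels is expanded to the full range of 255 levels, so small brightness differences between the object and the background become larger in absolute terms.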
2. Extraction of motion region
To extract the region with a moving object, we need to find all image pixels that correspond to a moving object, i.e. all image pixels shall be divided into two groups: background and foreground (moving object). A unified approach to solving the problem is to compare the current frame with a reference model of the background.
2.1. Evaluation and compensation of inter-frame geometric distortions
An original image is generally a dynamic, or time-varying, image. A camera is often mounted on a mobile object. This results in generating images of multiple frames with mutual shifts, turns and scale variations.
To compare two adjacent frames, the images must be registered to each other to correct relative spatial shifts, differences in amplification, displacements caused by rotation, and geometric distortions.
As the distortion model, we used an affine similarity model, which includes the reference axis shift parameters, the scale factor and the rotation angle. Reference axis shift estimation is based on the phase correlation method.
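After passing the preprocessing phase, two adjacent frames, $g_a(x, y)$ and $g_b(x, y)$, are sent to the algorithm input.
Then each frame is processed using the discrete Fourier transform (DFT):
$$G_a(u, v) = F\{g_a(x, y)\}, \quad G_b(u, v) = F\{g_b(x, y)\}, \qquad (1)$$
where $g_a(x, y)$, $g_b(x, y)$ are the frames after the filtering procedure; $G_a(u, v)$, $G_b(u, v)$ are the DFTs of the images $g_a(x, y)$, $g_b(x, y)$; $F\{\ldots\}$ is the DFT.
If the second frame is the first one shifted by $(\Delta x, \Delta y)$, the discrete Fourier transforms of the images will be phase-shifted:
$$G_b(u, v) = G_a(u, v) \exp\left[-2\pi i \left(\frac{u \Delta x}{M} + \frac{v \Delta y}{N}\right)\right], \qquad (2)$$
where M, N are the image height and width in pixels;
$x = 0, 1, \ldots, M - 1$;
$y = 0, 1, \ldots, N - 1$.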
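Calculated values are converted into logarithmic polar coordinates to calculate the scale factor and the rotation angle. Let us calculate the cross-spectral power density:
$$R(u, v) = \frac{G_a^{*}(u, v)\, G_b(u, v)}{\left|G_a^{*}(u, v)\, G_b(u, v)\right|}, \qquad (3)$$
where $^{*}$ denotes complex conjugation.
Using the inverse Fourier transform, we get the normalized cross-correlation (phase correlation):
$$r(x, y) = F^{-1}\{R(u, v)\}, \qquad (4)$$
where R is the normalized cross-power spectrum; $F^{-1}$ is the inverse Fourier transform.
Let us determine the position of the peak of r, which gives the estimate of the inter-frame shift $(\Delta x, \Delta y)$:
$$(\Delta x, \Delta y) = \arg\max_{(x, y)} r(x, y). \qquad (5)$$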
Further, using an affine transform, the previous frame is shifted relative to the current one based on the obtained shift estimates (5) (Fig. 3).
Fig. 3. Compensation of inter-frame geometric distortions
The advantage of the phase correlation method is its resistance to noise, occlusions and other defects.
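The shift estimation by phase correlation can be sketched as follows. This is an illustration rather than the authors' implementation: it uses a naive DFT from the standard library, which is practical only for very small frames, it covers translation only (the log-polar step for scale and rotation is omitted), and the function names `dft2` and `phase_correlation_shift` are hypothetical. The conjugate is placed on the first frame, an assumed convention under which the peak of r falls directly at the shift of the second frame relative to the first.

```python
import cmath


def dft2(image, inverse=False):
    """Naive 2D DFT of an M x N list of numbers (illustration only)."""
    M, N = len(image), len(image[0])
    sign = 1 if inverse else -1
    out = [[0j] * N for _ in range(M)]
    for u in range(M):
        for v in range(N):
            total = 0j
            for x in range(M):
                for y in range(N):
                    angle = sign * 2 * cmath.pi * (u * x / M + v * y / N)
                    total += image[x][y] * cmath.exp(1j * angle)
            out[u][v] = total / (M * N) if inverse else total
    return out


def phase_correlation_shift(frame_a, frame_b):
    """Estimate the cyclic shift (dx, dy) of frame_b relative to frame_a.

    dx is the shift along rows (0..M-1), dy along columns (0..N-1).
    """
    M, N = len(frame_a), len(frame_a[0])
    Ga, Gb = dft2(frame_a), dft2(frame_b)
    R = [[0j] * N for _ in range(M)]
    for u in range(M):
        for v in range(N):
            cross = Ga[u][v].conjugate() * Gb[u][v]
            magnitude = abs(cross)
            # Normalise to unit magnitude; skip empty frequency bins.
            R[u][v] = cross / magnitude if magnitude > 1e-12 else 0j
    r = dft2(R, inverse=True)
    # The peak of the real part of r marks the shift.
    return max(
        ((x, y) for x in range(M) for y in range(N)),
        key=lambda p: r[p[0]][p[1]].real,
    )


# Synthetic 8x8 frame with three bright pixels on a dark background.
frame_a = [[0] * 8 for _ in range(8)]
frame_a[1][2], frame_a[4][5], frame_a[6][1] = 9, 5, 7
# frame_b is frame_a cyclically shifted by 2 rows and 3 columns.
frame_b = [
    [frame_a[(x - 2) % 8][(y - 3) % 8] for y in range(8)] for x in range(8)
]
shift = phase_correlation_shift(frame_a, frame_b)
print(shift)  # (2, 3)
```

For frames of realistic size a fast Fourier transform would be used instead of the naive double sum; the structure of the computation stays the same.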
Fig. 4 shows the result of aligning 12 frames selected from a video sequence (every third frame) with shift compensation. The developed algorithm aligned adjacent frames of the video sequence, which makes it possible to estimate the true change in the position of a moving object relative to the scene. Fig. 4 demonstrates good accuracy of background linking.
Fig. 4. Result of linking of 12 sample frames. Linking accuracy is 3 pixels
2.2. Detection of low-contrast objects and frame subtraction
The low-contrast object detection algorithm is built on the apparatus of mathematical statistics and probability theory.
The developed algorithm scans each video sequence frame with a window of 31×31 pixels for remote objects and 61×61 pixels for near objects. The window size was selected based on the hypothesis that mathematical statistics and probability theory algorithms are efficient for large samples.
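The mathematical expectation and dispersion are calculated in the selected window for each incoming frame. The mathematical expectation of an image portion is its average brightness level, calculated as follows:
$$m = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} z_{i,j}, \qquad (6)$$
where $z_{i,j}$ is the image portion after the filtering procedure and n is the window side in pixels.
The root-mean-square error of the image is the measure of brightness dispersion, calculated as follows:
$$\sigma = \sqrt{\frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \left(z_{i,j} - m\right)^2}. \qquad (7)$$
The next step is to normalize the image by subtracting the mathematical expectation and dividing by the root-mean-square error:
$$\hat{z}_{i,j} = \frac{z_{i,j} - m}{\sigma}. \qquad (8)$$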
After the above-listed procedures are completed, the resulting image has a histogram corresponding to the normal distribution (Fig. 5b).
Fig. 5. Histogram: а – original image, b – normalized image
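The window statistics and the normalization step can be sketched as follows. The function name `normalize_window` is hypothetical, and dividing by the number of pixels (rather than by that number minus one) is an assumption based on the term "root-mean-square".

```python
import math


def normalize_window(window):
    """Return (mean, rms, normalized window) for a 2D list of brightness values.

    Assumption: the dispersion is the population root-mean-square
    deviation, i.e. the sum of squares is divided by the pixel count.
    """
    pixels = [value for row in window for value in row]
    count = len(pixels)
    mean = sum(pixels) / count
    rms = math.sqrt(sum((value - mean) ** 2 for value in pixels) / count)
    if rms == 0:
        # A perfectly flat window has no deviation to normalise.
        return mean, rms, [[0.0 for _ in row] for row in window]
    normalized = [[(value - mean) / rms for value in row] for row in window]
    return mean, rms, normalized


mean, rms, normalized = normalize_window([[1, 3], [5, 7]])
```

After normalization the window has zero mean and unit root-mean-square deviation, so the same numeric thresholds can be applied to windows with different brightness and contrast.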
Since the histogram of the obtained image corresponds to the normal distribution, the three-sigma rule can be applied to the image as an extra filtering option.
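How exactly the rule is applied is not shown in the text. The sketch below assumes that, in a window already normalized to zero mean and unit deviation, pixels deviating from the mean by more than three sigma are flagged as candidates, while the rest are treated as background. The function name `three_sigma_mask` and the parameter `k` are illustrative.

```python
def three_sigma_mask(normalized, k=3.0):
    """Flag pixels of a normalized window whose deviation exceeds k sigma.

    Assumption: values are already expressed in units of sigma, so the
    comparison is made directly against k.
    """
    return [[1 if abs(value) > k else 0 for value in row] for row in normalized]


mask = three_sigma_mask([[0.5, -3.2], [2.9, 4.0]])
```

For a normal distribution about 99.7 % of the values lie within three sigma of the mean, so only a small fraction of pure-noise pixels would pass such a test.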
To separate stationary and moving objects, the background subtraction procedure is carried out. Background modelling is the main stage of background subtraction algorithms. The developed algorithm employs the inter-frame difference method:
(9)
where the two operands are the images after compensation of inter-frame distortions.
The method uses the previous frame as the background model for the current frame. It does not require a large amount of memory for storing previous frames, is easy to implement and supports real-time image processing.
When applied directly, the method is sensitive to noise and to any change in a frame (for example, snow, rain, swaying trees). However, when combined with inter-frame geometric distortion compensation and methods of mathematical statistics, it reduces the probabilities of false alarm and target miss.
In addition, because only the previous frame is used, the method fails to extract regions with moving objects once the objects stop. It also cannot determine the internal pixels of a large, uniformly coloured moving object. Contrast enhancement, which highlights the difference between the object and the background, eliminates this drawback.
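A minimal sketch of inter-frame differencing is given below. The use of the absolute difference and of a fixed `threshold` to obtain a binary mask are assumptions; the exact expression is not reproduced in the text. Both frames are assumed to be already aligned by the compensation step.

```python
def frame_difference_mask(current, previous, threshold):
    """Binary motion mask from two aligned frames of equal size.

    Assumption: a pixel is marked as moving when the absolute
    brightness difference between the frames exceeds the threshold.
    """
    return [
        [1 if abs(c - p) > threshold else 0 for c, p in zip(row_c, row_p)]
        for row_c, row_p in zip(current, previous)
    ]


previous_frame = [[10, 12], [10, 10]]
current_frame = [[10, 10], [10, 60]]
motion_mask = frame_difference_mask(current_frame, previous_frame, threshold=5)
```

Only the previous frame has to be kept in memory, which reflects the low storage cost noted above; the small change of 2 brightness levels is ignored, while the change of 50 levels is marked as motion.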
2.3. Placing the detection marker
To ease the perception of detected objects in the binary mask, a special marker is placed on the original image in order to display the location of the detected object (Fig. 6).
Fig. 6. Example of detection marker placement: а – first frame, b – second frame, c – binary mask, d – original image with marker
The important task for further processing, including plotting the moving object’s trajectory, is to improve the spatial accuracy of extraction of the moving object region from the image sequence. The proposed algorithm uses a new combination of the method of computation of geometric distortions in the frequency domain with subsequent application of a statistical method of frame analysis in order to identify objects of interest.
Research results
For software implementation and testing of the algorithm, we used a standard personal computer with an Intel Core i7-4770K 3.50 GHz CPU.
Video records were acquired in different weather conditions, including clear sky, partly cloudy and cloudy weather throughout the day. Video data were recorded with resolution of 800×600 pixels at a rate of 24 frames per second. The data set comprised 250 video sequences randomly selected from among the available data. Each sequence comprised 100 frames that formed a continuous 10-second interval. A video sequence contained single and multiple objects leaving and entering the camera coverage area.
Simulation model structure and components
The functional components of the program are shown in Fig. 7, including program modules and inter-module connections. The main module is responsible for integrated program control.
Fig. 7. Functional program components
A video sequence is transmitted to the main module input which serves for displaying algorithm operation results on the screen. The video file loader module converts a video sequence into frames to be sent to the filtering module input. The module for evaluation and compensation of geometric distortions simultaneously receives two frames – current and previous ones – from the frame buffer. Frames generated in the marker placement module are sent to the main module input for displaying.
Detection accuracy evaluation
The algorithm analysis resulted in evaluation of the quality of object detection in different conditions. The sample in question comprised 100 video sequences.
To evaluate the algorithm performance, we calculated the correct detection probability as the fraction of properly detected objects in the image and the false alarm probability as the fraction of false responses of the detector:
$$P_{cd} = \frac{N_{det}}{N_{obj}}, \qquad (10)$$
$$P_{fa} = \frac{M_{fa}}{H \cdot W}, \qquad (11)$$
where $P_{cd}$ is the probability of correct detection;
$N_{det}$ is the number of objects detected by the algorithm;
$N_{obj}$ is the number of objects in the image;
$P_{fa}$ is the probability of false alarm;
$M_{fa}$ is the number of false alarm pixels;
H, W are the image height and width, respectively.
The average correct detection probability was 0.92. The false alarm probability was $10^{-5}$.
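The two quality measures can be computed as in the following sketch, which assumes that the correct detection probability is the ratio of detected objects to objects present and that the false alarm probability is the ratio of false alarm pixels to the total number of pixels in the frame. The function names and the sample counts are illustrative and are not taken from the experiment.

```python
def correct_detection_probability(detected_objects, total_objects):
    """Fraction of objects present in the image that were detected."""
    return detected_objects / total_objects


def false_alarm_probability(false_alarm_pixels, height, width):
    """Fraction of frame pixels that produced a false detector response."""
    return false_alarm_pixels / (height * width)


# Illustrative counts for a frame of 800x600 pixels.
p_cd = correct_detection_probability(92, 100)
p_fa = false_alarm_probability(5, height=600, width=800)
```

With these illustrative counts, five false alarm pixels in a frame of 480,000 pixels correspond to a false alarm probability of roughly $10^{-5}$.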
Solving object detection problems involves the Neyman–Pearson criterion, which does not require the knowledge of a priori probabilities and is based on a fixed false alarm probability. That is why the false alarm probability shall be very low: from $10^{-10}$ to $10^{-5}$.
Evaluation of visible target contrast
Contrast of the recognition object with the background (K) is determined as the ratio of the absolute value of the difference between the object and background brightness to the background brightness.
Contrast of the recognition object with the background is considered:
• high – at K over 0.5 (great difference in brightness between the object and the background);
• medium – at K of 0.25 to 0.5 (visible difference in brightness between the object and the background);
• low – at K less than 0.3 (minor difference in brightness between the object and the background).
The following procedure was used for visible target contrast evaluation.
1. A region containing only the object and a region containing the background are extracted from the original image. Both regions have similar sizes.
2. Average brightness values are calculated in both object and background regions.
3. The object contrast relative to the background is calculated by the Vorobel formula [9].
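The procedure above can be sketched as follows. The formula cited in step 3 is not reproduced in the text, so the sketch uses the definition of K given at the start of this section (absolute brightness difference divided by the background brightness); the function names and the rectangular region parameters are hypothetical.

```python
def region_mean(image, top, left, height, width):
    """Average brightness of a rectangular region of a 2D list."""
    values = [
        image[i][j]
        for i in range(top, top + height)
        for j in range(left, left + width)
    ]
    return sum(values) / len(values)


def visible_contrast(object_brightness, background_brightness):
    """Contrast K = |object - background| / background."""
    return abs(object_brightness - background_brightness) / background_brightness


# Synthetic frame: background of brightness 100 on the left,
# a darker object of brightness 85 on the right.
frame = [[100, 100, 85, 85] for _ in range(4)]
background = region_mean(frame, top=0, left=0, height=4, width=2)
target = region_mean(frame, top=0, left=2, height=4, width=2)
contrast = visible_contrast(target, background)
```

In this synthetic example the contrast equals 0.15, which falls in the low-contrast range considered in the paper.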
Fig. 8. Graphic interface of main program module
Visible target contrast estimates for two example video sequences are given in Table 1. The region with a low-contrast object is shown in a white frame in the images.
Table 1
Evaluation of visible target contrast
Thus, the developed algorithm detects objects with an average contrast of 0.15.
Analysis of computational resources
The algorithm is implemented in C++ using an object-oriented approach. The program was developed in the Microsoft Visual Studio 2019 environment.
Average time required for the algorithm to process a single frame in the video sequence for different CPUs is given in Table 2.
Table 2
Algorithm operation time
The exact algorithm operation time depends on the CPU used, the type of data (halftone or colour video sequence), and the programming language (if the algorithm is implemented in a different language).
The results of object detection algorithm performance in different target environments are given in Table 3.
Table 3
Results of detection algorithm operation
Conclusion
The important factor in operation of image processing systems such as OESs being part of SPAAV and SAMS is the minimization of operator’s intervention into combat operation control. For this purpose, the moving target detection algorithm is implemented to significantly facilitate operator’s actions when locking on the target. The algorithm rationale is calculation of background estimates and generation of a binary image with the extracted moving object, based on such estimates.
The average probability of correct detection by the algorithm is 0.92; the false alarm probability is $10^{-5}$. The average detection algorithm operation time is 200 ms for a frame size of 800×600 pixels. The developed algorithm is able to detect moving objects whose visible contrast is less than 0.3. It is also applicable to objects with contrast over 0.3.
Although the proposed algorithm is a complete solution to the problem of low contrast object detection, there are many ways for its further development to improve its performance:
1) improving the quality of an original image by pre-filtering: restoration of lost image segments, additional enhancement of its important details;
2) use of a background model based on a combination of Gaussians;
3) use of a segmentation algorithm: extraction of a precise target contour, deletion of the target shadow region;
4) use of convolutional neural networks for solving the image recognition problem.
A software package for implementing low-contrast object detection and tracking algorithms has been developed. The developed algorithm and software can be used in development (refinement) of digital image processing systems related to aerial, ground and surface target imagery to be processed by OES for upgraded and advanced surface-to-air missile systems and self-propelled AA vehicles, as well as in robotics and dual-use technologies.
Thus, the use of the proposed algorithm allows SAMS combat crews to detect low-contrast air and ground targets in given conditions, using equipment fitted with OES. With such systems used individually or integrated with other standard systems, a combat crew can fulfil air defence tasks to protect objects from attacks.
References
1. Volchkova D.S., Smirnov P.V. Detection of low-contrast objects in different target environments. Modern Problems of Design, Production and Operation of Radio Engineering Systems: XI All-Russian Scientific and Practical Conference (with participation of CIS countries): Collection of scientific papers. Ulyanovsk, 10–11 October 2019. Ulyanovsk: UlSTU, 2019. P. 100–102. (In Russ.)
2. Vargas M., Milla J.M., Toral S.L., Barrero F. An Enhanced Background Estimation Algorithm for Vehicle Detection in Urban Traffic Scenes. Vehicular Technology, IEEE Transactions. 2010. V. 59 (8). P. 3694–3709.
3. Stauffer C. Learning patterns of activity using real-time tracking. IEEE Trans. Pattern Analysis and Machine Intelligence. 2000. V. 22. P. 747–757.
4. Wang L., Yung N.H.C. Extraction of Moving objects from their Background based on multiple adaptive threshold and boundary evaluation. IEEE Trans. Intelligent transportation systems. 2010. V. 11. P. 40–51.
5. Haritaoglu I., Harwood D., Davis LS. Realtime surveillance of people and their activities. IEEE Trans. Pattern Analysis and Machine Intelligence. 2000. V. 22. P. 809–830.
6. Cavallaro A., Ebrahimi T. Change detection based on color edges, circuits and systems. The 2001 IEEE Int Symposium. 2001. P. 141–144.
7. Aach T., Kaup A., Mester R. Statistical model-based change detection in moving video. Signal Processing. 1993. V. 31. P. 165–180.
8. Cavallaro A., Ebrahimi T. Video object extraction based on adaptive background and statistical change detection. Proc. SPIE Electronic Imaging 2001 – Visual Communications and Image Processing. 2001. V. 4310. P. 465–475.
9. Hotter M., Mester R., Muller F. Detection and description of moving objects by stochastic modelling and analysis of complex scenes. Signal Proces: Image Comm. 1996. V. 8. P. 281–293.
10. Mech R., Wollborn M. A noise robust method for 2D shape estimation of moving objects in video sequences considering a moving camera. Signal Processing. 1998. V. 66 (2). P. 203–217.
11. Neri A., Colonnese S., Russo G., Talone P. Automatic moving object and background separation. Signal Processing. 1998. V. 66 (2). P. 219–232.
12. Alfimtsev A.N., Lychkov I.I. A method for real-time object detection in a video stream. Vestnik TGTU. 2011. V. 17. No. 1. P. 44–55. (In Russ.)
13. Zuo J., Jia Z., Yang J., et al. Moving object detection in video sequence images based on an improved visual background extraction algorithm. Multimed Tools Appl. 2020. V. 79. P. 29663–29684. DOI: 10.1007/s11042-020-09530-0
14. Antić B., Crnojević V., Ćulibrk D. Efficient wavelet based detection of moving objects. Proc. 16th Int. Conf. Digital Signal Process. 2009. P. 1–6.
15. Töreyin B.U., Enis Çetin A., Aksay A., Akhan M.B. Moving Object Detection in Wavelet Compressed Video. Signal Processing: Image Communication, EURASIP. 2005. V. 20. P. 255–264.
16. Bagci M., Yardimci Y., Cetin A.E. Moving object detection using adaptive subband decomposition and fractional lower order statistics in video sequences. Signal Process. International Journal of Signal Processing. 2002. P. 1942–1947.
17. Butler D., Sridharan S., Bove V.M. Jr. Realtime Adaptive Background Segmentation. Acoustics, Speech, and Signal Processing. 2003. P. 349–352.
About the Authors
D. S. Volchkova
Russian Federation
Volchkova Daria Sergeevna – Design Engineer of the 3rd Category; Post-graduate student, Department of Radio Engineering.
Research interests: digital image processing.
Ulyanovsk
A. S. Dolgova
Russian Federation
Dolgova Alyona Sergeevna – Design Engineer of the 3rd Category.
Research interests: digital image processing.
Ulyanovsk