Journal of Multimedia Information System
Korea Multimedia Society
Section A

# A Survey of Real-time Road Detection Techniques Using Visual Color Sensor

Gwang-Soo Hong1, Byung-Gyu Kim2,*, Debi Prosad Dogra3, Partha Pratim Roy4
1Dept. of Computer Engineering, SunMoon University, Asan, Korea, gs.Hong@ivpl.sookmyung.ac.kr
2Dept. of IT Engineering, Sookmyung Women’s University, Seoul, Korea, bg.kim@sm.ac.kr
3IIT Bhubaneswar, India, dpdogra@iitbbs.ac.in
4IIT Roorkee, India, proy.fcs@iitr.ac.in
*Corresponding Author: Byung-Gyu Kim, Dept. of IT Engineering, Sookmyung Women’s University, Seoul, Korea, Tel: 82-2-2077-7293, bg.kim@sm.ac.kr.

© Copyright 2018 Korea Multimedia Society. This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received: Feb 13, 2018 ; Accepted: Feb 19, 2018

Published Online: Mar 31, 2018

## Abstract

Road recognition and lane departure warning systems were commercialized roughly ten years ago, but they remain an early-stage technology: they are offered mainly as expensive options on premium vehicles and have very few users. Because a system installed on a vehicle must operate reliably and without errors, robust feature extraction and tracking algorithms that provide reliable information are required. In this paper, we investigate and analyze various real-time road detection algorithms based on color information. Through this analysis, we suggest algorithms that are applicable in practice.

Keywords: Imaging Sensor; Smart Car; Color Model; Color Segmentation; Road Detection

## I. INTRODUCTION

An advanced driver assistance system (ADAS) assists the driver in driving. A lane departure warning system informs the driver, by visual, audible, or vibration signals, when the vehicle leaves its lane. No warning is issued when the driver changes lanes intentionally or when the turn signal is switched on [1], [2]. Sensors used for lane detection include vision sensors, laser sensors, and infrared sensors. This technology was first introduced in vehicles in 2000 and has since been adopted in some of the latest models that support advanced driver assistance, from manufacturers such as Nissan, Toyota, Honda, Audi, General Motors, and Mercedes-Benz.

Collision warning using radar, rather than vision systems, was studied before lane keeping systems. Radar is used to estimate the distance to the vehicle ahead and its speed, and from these the likelihood of a collision, but such a warning system has many drawbacks. In particular, the distance information is highly accurate only when the direction in which the radar is aimed coincides with the direction of travel of the vehicle; it becomes unreliable when the radar direction differs from the actual direction of travel or when the radar is not aimed at the road ahead.

Vehicles with vision systems were first commercialized in Japan, but limitations in image noise rejection have made a fully automatic system difficult to realize. Research is therefore actively being carried out on how to prevent accidents by identifying nearby vehicles in advance and by preventing lane departure, which is the cause of most traffic accidents.

Recent studies have suggested image processing techniques to separate lanes, vehicles, and obstacles from road images obtained in complex environments such as urban roads. Techniques ranging from image processing to vehicle identification, tracking, and collision avoidance have also been proposed. Sensors such as laser, radar, ultrasonic, and infrared sensors are used to detect nearby terrain information and obstacles. Vehicle detection is essential for safe driving in driver assistance systems. Related research is receiving much attention, and various detection algorithms have been proposed. However, to increase the efficiency of the vehicle detector, the areas in which vehicles are present should first be extracted from the complex road image.

Model-based vehicle area extraction techniques, which assume a combination of typical straight lines and curves, are robust to noise and to loss of lane information. However, they are not suitable for detecting vehicle areas on arbitrary roadway types because they focus on a particular type of road. To resolve this issue, a technique has been proposed that extracts the vehicle area from lanes detected with the B-snake technique, which applies to both straight and curved roads.

Lane detection techniques that exploit the color and width information of roads and lanes have also been suggested. Many studies extract the road area, that is, the area in which vehicles travel, from lanes detected using the Hough transform, a technique that detects linear components in an image. The Hough transform is known as an efficient way to detect pixels arranged along a straight line in an image. However, detecting the road area with a standard Hough transform works well only when the background in the road image is simple. In road images with complex and diverse backgrounds, such as urban roads, it is difficult to estimate the exact road area because of the complicated edge information.
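To make the voting idea behind the Hough transform concrete, the following is a minimal sketch in plain Python. It is not the implementation used in any of the surveyed works; the function name `hough_accumulate`, the one-degree angular resolution, and the one-pixel rho resolution are assumptions chosen for illustration. Each edge pixel votes for every line (rho, theta) that passes through it, and collinear pixels pile their votes into the same accumulator cell.

```python
import math


def hough_accumulate(points, n_theta=180):
    """Vote in (rho, theta-index) space for a list of (x, y) edge pixels.

    A line is parameterised as rho = x*cos(theta) + y*sin(theta), with
    theta sampled at n_theta evenly spaced angles in [0, pi).
    Returns a dict mapping (rho, theta_index) -> number of votes.
    """
    acc = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[(rho, t)] = acc.get((rho, t), 0) + 1
    return acc


# Ten pixels on the vertical line x = 5 all vote for (rho=5, theta=0).
votes = hough_accumulate([(5, y) for y in range(10)])
print(votes[(5, 0)])
```

In a cluttered urban scene many unrelated edge pixels also vote, so accumulator peaks become less distinct, which is the limitation noted above.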

This paper is organized as follows. Various robust road feature analysis and recognition techniques based on color are discussed in Section II. Finally, conclusions are drawn in the last section.

## II. RELATED WORKS ON ROAD DETECTION WITH COLOR INFORMATION

This section describes techniques for detecting roads according to color space. Because road areas differ in color from the background, they can be detected on the basis of color.

2.1. Robust Color model on illumination change

Given a model image in RGB, the HSV components are obtained using the following formulas [3].

$H=\begin{cases}\theta , & B\le G\\ {360}^{\circ }-\theta , & B>G\end{cases}$
(1)
$\theta ={\cos}^{-1}\left\{\frac{\frac{1}{2}\left[\left(R-G\right)+\left(R-B\right)\right]}{{\left[{\left(R-G\right)}^{2}+\left(R-B\right)\left(G-B\right)\right]}^{\frac{1}{2}}}\right\}$
(2)
$S=1-\frac{3}{\left(R+G+B\right)}\left[\min\left(R,G,B\right)\right]$
(3)
$V=\frac{1}{3}\left(R+G+B\right)$
(4)
Fig. 1. HSV color space.

RGB values are normalized to the range [0, 1], and the angle θ is measured from the red axis of the HSV space. The hue can be normalized to the range [0, 1] by dividing the value obtained from Equation (1) by 360°.
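The conversion can be sketched per pixel as follows. This is a minimal illustration under stated assumptions rather than the code of [3]: the angle θ is taken as the inverse cosine of the ratio in the θ formula, the hue is taken as θ when B ≤ G and 360° − θ otherwise (the usual convention), and hue and saturation are set to 0 where they are undefined (gray or black pixels). The function name `rgb_to_hsv_components` is hypothetical.

```python
import math


def rgb_to_hsv_components(r, g, b):
    """Convert one RGB pixel (channels in [0, 1]) to (H, S, V).

    H is returned normalised to [0, 1], i.e. degrees divided by 360.
    """
    total = r + g + b
    v = total / 3.0
    if total == 0.0:
        return 0.0, 0.0, 0.0  # black: hue and saturation undefined, use 0
    s = 1.0 - 3.0 * min(r, g, b) / total
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0.0:
        h_deg = 0.0  # gray pixel: hue undefined, use 0
    else:
        ratio = max(-1.0, min(1.0, num / den))  # guard against rounding
        theta = math.degrees(math.acos(ratio))
        h_deg = theta if b <= g else 360.0 - theta
    return h_deg / 360.0, s, v


print(rgb_to_hsv_components(0.0, 1.0, 0.0))  # pure green
```

With this convention pure red, green, and blue map to normalized hues of 0, 1/3, and 2/3 respectively.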

Rural roads are very complex; their edges and features are difficult to find clearly, and they are not easy to detect in gray images. As in other papers, road detection is carried out using the HSI color model, in which the brightness component can be separated. Figure 2 (a) shows an original image, and Figures 2 (b), (c), and (d) show its individual HSI components. The component images show that I, the brightness component, best represents the road. Detecting roads using the brightness component I is the opposite of the approach taken in other papers. In that work, only the I component is used, and an improved Fuzzy C-Means (FCM) algorithm is employed to detect the road region [4].

Fig. 2. HSV color components of real road image.
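The improved FCM is not described here in enough detail to reproduce. As a point of reference only, standard fuzzy C-means on one-dimensional intensity values can be sketched as below; the fuzzifier m = 2, the evenly spaced initial centers, and the names `memberships` and `fcm_1d` are assumptions for illustration and are not taken from the cited work.

```python
def memberships(values, centers, m=2.0, eps=1e-12):
    """Fuzzy membership u[i][k] of value i in cluster k (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    u = []
    for x in values:
        d = [abs(x - c) + eps for c in centers]  # eps avoids division by zero
        u.append([1.0 / sum((dk / dj) ** power for dj in d) for dk in d])
    return u


def fcm_1d(values, n_clusters=2, m=2.0, n_iter=50):
    """Standard fuzzy C-means on scalar intensities; returns (centers, u)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centers spread evenly over the intensity range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    for _ in range(n_iter):
        u = memberships(values, centers, m)
        # Each center is the mean of all values weighted by u^m.
        centers = [
            sum(u[i][k] ** m * x for i, x in enumerate(values))
            / sum(u[i][k] ** m for i in range(len(values)))
            for k in range(n_clusters)
        ]
    return centers, memberships(values, centers, m)


intensities = [0.10, 0.12, 0.11, 0.90, 0.88, 0.91]  # dark vs. bright pixels
centers, u = fcm_1d(intensities)
print(centers)
```

Unlike hard clustering, every pixel keeps a graded membership in each cluster, which is what makes the method tolerant of gradual brightness changes along a road surface.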

The suggested FCM algorithm is as follows:

Step 1. Compute the average gray value of each of the N areas into which the FCM algorithm divides the image.

Step 2. Locate areas where the gray values change significantly. Here, Cn denotes the illumination value of each region and C is the average illumination.

${C}_{n}={\left(\frac{1}{c}\sum_{i=0}^{c-1}{\left\Vert {x}_{i}-x\right\Vert }^{2}\right)}^{\frac{1}{2}},\quad C=\frac{1}{N}\sum_{n=0}^{N-1}{C}_{n}$
(5)

Using Eq. (5), areas whose gray values satisfy Cn ≥ α·C are considered true road areas.

Step 3. The FCM computation is repeated over all N areas until the condition is satisfied.
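The selection rule of Eq. (5) can be sketched as follows. Two points are assumptions made for this illustration: the reference value x in Eq. (5) is taken to be the mean gray value of the region, and each region is given simply as a list of its gray values. The names `region_contrast` and `select_regions` and the default α = 1 are hypothetical.

```python
import math


def region_contrast(values):
    """C_n of Eq. (5): RMS deviation of a region's gray values.

    Assumption: the reference x is the region's own mean gray value.
    """
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def select_regions(regions, alpha=1.0):
    """Return (selected indices, all C_n, C) using the test C_n >= alpha * C."""
    contrasts = [region_contrast(r) for r in regions]
    c_avg = sum(contrasts) / len(contrasts)  # C: average over the N regions
    selected = [n for n, c_n in enumerate(contrasts) if c_n >= alpha * c_avg]
    return selected, contrasts, c_avg


regions = [[10, 10, 10, 10], [0, 20, 0, 20], [5, 15, 5, 15]]
print(select_regions(regions))
```

Raising α makes the test stricter, so fewer regions pass; the value used in the cited work is not stated here.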

Figure 3 shows the results of the improved FCM. This method uses the improved FCM algorithm to detect rural roads from the I component, that is, the brightness component. However, it is applicable only to unpaved rural roads.

Fig. 3. The result of the improved FCM.

A multi-color detection model using genetic programming (GP) has been proposed [5]. GP represents an individual as a variable-size tree structure, and it is widely applied to complex, practical design and optimization problems. Figure 4 shows the crossover structure of genetic programming.

Fig. 4. Crossover structure of genetic programming.

Figure 5 illustrates the mutation operation in genetic programming. Each node of the tree represents a function or a terminal (a constant or variable; see Figure 4), and each individual is composed of a set of functions and terminals. The crossover operator (Figure 4) and the mutation operator (Figure 5) in GP replace the subtree at a selected point with another subtree. The terminals used for the conversion are the values of R, G, B, I1, I2, I3, Y, Cb, Cr, H, S, L, L*, a*, and b*, with every channel converted to the range [0, 1]. Including a random value in [0, 1], a total of 16 terminals are used.

Fig. 5. Mutation of genetic programming.
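The subtree operations described above can be sketched as follows. This is an illustration only: the function set (`add`, `sub`, `mul`), the reduced terminal list, the tuple-based tree encoding, and all function names are assumptions, since the function set of [5] is not given here.

```python
import random

# Hypothetical function set; the one used in [5] is not stated in this survey.
FUNCTIONS = {"add": lambda a, b: a + b,
             "sub": lambda a, b: a - b,
             "mul": lambda a, b: a * b}
TERMINALS = ["R", "G", "B", "H", "S", "L"]  # subset of the 15 channels, for brevity


def paths(tree, prefix=()):
    """List the index paths of every node; a tree is (name, left, right) or a terminal."""
    out = [prefix]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            out.extend(paths(child, prefix + (i,)))
    return out


def get_at(tree, path):
    for i in path:
        tree = tree[i]
    return tree


def replace_at(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new),) + tree[i + 1:]


def crossover(a, b, rng):
    """Swap one randomly chosen subtree of a with one of b."""
    pa, pb = rng.choice(paths(a)), rng.choice(paths(b))
    return replace_at(a, pa, get_at(b, pb)), replace_at(b, pb, get_at(a, pa))


def random_tree(rng, depth=2):
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS + [round(rng.random(), 3)])  # channel or constant
    name = rng.choice(sorted(FUNCTIONS))
    return (name, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def mutate(tree, rng):
    """Replace one randomly chosen subtree with a freshly generated one."""
    return replace_at(tree, rng.choice(paths(tree)), random_tree(rng))


def evaluate(tree, pixel):
    """Evaluate a tree on a pixel given as a dict of channel values in [0, 1]."""
    if isinstance(tree, tuple):
        return FUNCTIONS[tree[0]](evaluate(tree[1], pixel), evaluate(tree[2], pixel))
    return pixel[tree] if isinstance(tree, str) else tree


rng = random.Random(0)
parent_a = ("add", "R", ("mul", "G", 0.5))
parent_b = ("sub", ("mul", "B", "S"), "H")
print(crossover(parent_a, parent_b, rng))
print(mutate(parent_a, rng))
```

Because crossover only exchanges subtrees, the total number of nodes across the two children equals that of the two parents, whereas mutation can grow or shrink an individual.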

Figures 6 and 7 show segmentation results in the RGB and HSV color spaces and with the GP method. In these results, the GP method performs better than the other color space models. For colors it has learned, the GP technique gives accurate detection results, apart from noise in darker parts of the image; for colors it has not learned, it fails to detect some regions, depending on their brightness.

Fig. 6. Results by segmentation in RGB space, HSV, and GP method.
Fig. 7. Results by segmentation in RGB space, HSV, and GP method.

2.3. Road Detection Using Region Splitting

The human brain unconsciously performs image segmentation to understand scenes. Many computer vision problems also require high-quality image segmentation. Image search, object tracking, face recognition, augmented reality, and gesture recognition are more than half solved once segmentation is carried out successfully.

Segmenting ordinary natural images, such as the one in Figure 8, is a very difficult problem. The segmentation of an input image f can be defined by Eq. (6) [6]-[8].

Fig. 8. An example of image region segmentation.
$\left\{\begin{array}{l}{r}_{i}\cap {r}_{j}=\emptyset ,\quad 1\le i,j\le n,\ i\ne j\\ \bigcup_{i=1}^{n}{r}_{i}=f\\ q\left({r}_{i}\right)=\text{True},\quad 1\le i\le n\\ q\left({r}_{i}\cup {r}_{j}\right)=\text{False},\quad 1\le i,j\le n,\ {r}_{i}\text{ and }{r}_{j}\text{ are neighboring regions}\end{array}\right.$
(6)

Here, ri is the i-th region obtained from segmentation and n is the number of regions. The first condition means that regions cannot overlap, and the second requires that the regions together cover the entire image. In the third condition, q(ri) is a predicate that is true when all pixels belonging to region ri have the same characteristics (for example, similar brightness). The third and fourth conditions thus require that pixels within a region share the same characteristics and that neighboring regions have different characteristics.
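The four conditions of Eq. (6) can be checked mechanically, as in the sketch below. The predicate `homogeneous` (gray-value range within a tolerance of 10), the 4-neighbor definition of adjacency, and the representation of each region as a set of (row, column) pixels are assumptions chosen for illustration; Eq. (6) itself leaves q unspecified.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def homogeneous(image, region, tol=10):
    """Example predicate q: gray values in the region span at most tol."""
    vals = [image[y][x] for y, x in region]
    return max(vals) - min(vals) <= tol


def are_neighbors(a, b):
    """True if some pixel of region a is 4-adjacent to a pixel of region b."""
    return any((y + dy, x + dx) in b for y, x in a for dy, dx in N4)


def is_valid_segmentation(image, regions, q=homogeneous):
    """Check the four conditions of Eq. (6) for a list of pixel sets."""
    all_pixels = {(y, x) for y in range(len(image)) for x in range(len(image[0]))}
    union = set().union(*regions)
    if sum(len(r) for r in regions) != len(union):
        return False  # condition 1 violated: some regions overlap
    if union != all_pixels:
        return False  # condition 2 violated: image not fully covered
    if not all(q(image, r) for r in regions):
        return False  # condition 3 violated: a region is not homogeneous
    for i in range(len(regions)):
        for j in range(i + 1, len(regions)):
            if are_neighbors(regions[i], regions[j]) and q(image, regions[i] | regions[j]):
                return False  # condition 4 violated: neighbors could be merged
    return True


image = [[10, 12, 200],
         [11, 13, 205]]
road = {(0, 0), (0, 1), (1, 0), (1, 1)}
sky = {(0, 2), (1, 2)}
print(is_valid_segmentation(image, [road, sky]))
```

The fourth condition is what rules out over-segmentation: splitting the dark area above into two columns would still satisfy the first three conditions but not the fourth.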

Figure 9 shows the connectivity used in image segmentation [9]-[10]. It indicates how the pixel at the center of each pattern is connected to neighboring pixels with the same characteristics. Image segmentation algorithms continue to evolve, but their practicality remains limited in unconstrained conditions. Many currently deployed application systems rely on local features rather than regions. Humans, however, use region information rather than local features when recognizing objects.

Fig. 9. Various types of the connectivity.
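The effect of the connectivity definition can be seen by counting connected components under two common choices. The sketch below assumes that 4-connectivity and 8-connectivity are among the types in question, which is the usual case but is not stated explicitly here; the function name `count_components` is hypothetical.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(mask, offsets):
    """Count connected groups of non-zero pixels using the given neighbour offsets."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                count += 1
                seen[y][x] = True
                queue = deque([(y, x)])
                while queue:  # breadth-first flood fill of one component
                    cy, cx = queue.popleft()
                    for dy, dx in offsets:
                        ny, nx = cy + dy, cx + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


diagonal = [[1, 0],
            [0, 1]]
print(count_components(diagonal, N4), count_components(diagonal, N8))
```

Two pixels that touch only at a corner form two regions under 4-connectivity and a single region under 8-connectivity, so the choice directly changes the segmentation result.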
2.4. Clustering-based Segmentation

In the case of color images, pixels can be mapped into a three-dimensional space. Figure 10 shows the mapped result. Points in this three-dimensional space lie closer together as their colors become more similar. Color segmentation can be carried out using this property. The representative approach is data clustering [11]-[13].

Fig. 10. The RGB space of a color image.
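As one concrete instance of data clustering in RGB space, the sketch below applies plain k-means to pixel colors. k-means is used here only as a familiar example; the clustering methods of [11]-[13] may differ, and the function names, the caller-supplied initial centers, and the sample colors are assumptions.

```python
def nearest(p, centers):
    """Index of the center closest to RGB point p (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda k: sum((p[d] - centers[k][d]) ** 2 for d in range(3)))


def kmeans_rgb(points, init_centers, n_iter=20):
    """Cluster RGB triples; returns (centers, label per point)."""
    centers = [tuple(c) for c in init_centers]
    for _ in range(n_iter):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(m[d] for m in members) / len(members)
                                         for d in range(3)))
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return centers, [nearest(p, centers) for p in points]


pixels = [(0.40, 0.40, 0.42), (0.42, 0.41, 0.40), (0.38, 0.39, 0.40),   # grayish
          (0.10, 0.60, 0.12), (0.12, 0.62, 0.10), (0.11, 0.58, 0.09)]   # greenish
centers, labels = kmeans_rgb(pixels, [pixels[0], pixels[3]])
print(labels)
```

Pixels with similar colors fall into the same cluster regardless of where they are in the image, so a spatial clean-up step is typically needed afterwards.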

The CAMShift algorithm improves the MeanShift algorithm, a color segmentation method, for use in tracking environments: it compensates for a shortcoming of MeanShift by adjusting the size of the search window itself [14], [15].

It is used to track objects at high speed, but it performs poorly under lighting changes and against noisy backgrounds. Using the distribution of hue values in the region of the detected object, it predicts and detects the new location of the object and then tracks the object around its center.

Once a region of interest (ROI) is given, it is converted to the hue channel of the HSV color model, and a 1-dimensional histogram of the ROI is built, stored, and used. Several windows are set within the image, and the size and center position of each window are updated repeatedly; the new center is the average of the pixel positions within the window, weighted by their pixel values.
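The histogram step can be sketched as follows. The bin count, the scaling of the histogram so that its largest bin equals 1, and the names `hue_histogram` and `back_project` are assumptions for illustration. Hue values are taken as already normalized to [0, 1].

```python
def hue_bin(h, n_bins):
    """Histogram bin index for a hue value in [0, 1]."""
    return min(int(h * n_bins), n_bins - 1)


def hue_histogram(roi_hues, n_bins=16):
    """1-D hue histogram of the ROI, scaled so the largest bin equals 1."""
    hist = [0] * n_bins
    for h in roi_hues:
        hist[hue_bin(h, n_bins)] += 1
    peak = max(hist)
    return [count / peak for count in hist] if peak else [0.0] * n_bins


def back_project(hue_image, hist):
    """Replace each pixel's hue by the histogram value of its bin."""
    n_bins = len(hist)
    return [[hist[hue_bin(h, n_bins)] for h in row] for row in hue_image]


roi = [0.05, 0.06, 0.07, 0.55]           # hues sampled inside the ROI
hist = hue_histogram(roi, n_bins=10)
prob = back_project([[0.05, 0.95],
                     [0.55, 0.30]], hist)
print(prob)
```

The resulting map is bright where a pixel's hue is common in the ROI and dark elsewhere, and it is this map, rather than the raw image, on which the search windows are moved.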

This process is repeated until the windows converge, and convergence is determined by the change in the location and size of the windows. That is, a window has converged when its position and size do not differ from their previous values. When all windows in an area are clustered, the maximum window output can vary. Figure 11 shows a result of the CAMShift algorithm.

Fig. 11. The road candidates detection by CAMshift algorithm.
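The window update can be sketched for a single window as follows. This is a simplified illustration, not the procedure of [14], [15]: it tracks one square window on a probability map with values in [0, 1], and the side length 2·sqrt(M00) is a commonly quoted CAMShift heuristic adopted here as an assumption. The name `camshift_window` is hypothetical.

```python
import math


def camshift_window(prob, window, max_iter=20):
    """Iterate one search window (x, y, w, h) on a 2-D probability map.

    Each pass moves the window to the weighted centroid of the values it
    covers and resizes it from the zeroth moment M00. Stops when the window
    no longer changes or after max_iter passes.
    """
    rows, cols = len(prob), len(prob[0])
    for _ in range(max_iter):
        x0, y0, w, h = window
        m00 = m10 = m01 = 0.0
        for y in range(max(0, y0), min(rows, y0 + h)):
            for x in range(max(0, x0), min(cols, x0 + w)):
                p = prob[y][x]
                m00 += p
                m10 += x * p
                m01 += y * p
        if m00 == 0.0:
            break  # window covers no probability mass
        cx, cy = m10 / m00, m01 / m00               # weighted centroid
        side = max(1, math.floor(2.0 * math.sqrt(m00) + 0.5))
        new_window = (math.floor(cx - (side - 1) / 2.0 + 0.5),
                      math.floor(cy - (side - 1) / 2.0 + 0.5),
                      side, side)
        if new_window == window:
            break  # position and size unchanged: converged
        window = new_window
    return window


# 20x20 map with a 4x4 block of ones at rows 10-13, columns 12-15.
prob = [[1.0 if 10 <= y <= 13 and 12 <= x <= 15 else 0.0 for x in range(20)]
        for y in range(20)]
print(camshift_window(prob, (8, 6, 6, 6)))
```

Starting from a window that only clips a corner of the block, the window drifts toward the block and grows until it encloses it, which mirrors the convergence test described above.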

## III. CONCLUSION

In this paper, color-based image processing techniques were investigated for separating lanes, vehicles, and obstacles from road images obtained in complex settings such as city roads. Our analysis suggests that the color space must be chosen carefully first, because the right choice can make detection insensitive to a variety of lighting changes. Prior knowledge, such as the region of interest (ROI), should also be defined. With this information, a color segmentation or merging algorithm should be developed to suppress noise components. Using the geometric structure of the camera view in the detection system is also recommended.

## Acknowledgement

The authors would like to thank all reviewers and staff members of JMIS for evaluating our paper in the peer review process.

## REFERENCES

[1].

F. Diego, J.M.A. Alvarez, J. Serrat, A.M. Lopez. “Vision-based road detection via on-line video registration.” International IEEE Conference on Intelligent Transportation Systems (ITSC), pp. 1135-1140, 2010.

[2].

J.M.A. Alvarez, T. Gevers, A.M. Lopez. “Vision-based road detection using road models.” IEEE International Conference on Image Processing (ICIP), pp. 2073-2076, 2009.

[3].

J. Lu, M. Yang, H. Wang, B. Zhang. “Vision-based real-time road detection in urban traffic.” International Society for Optics and Photonics, pp. 75-82, 2002.

[4].

J. Sivic, A. Zisserman. “Video Google: A text retrieval approach to object matching in videos.” International Conference on Computer Vision, vol. 2, pp. 1470-1477, 2003.

[5].

H. Dahlkamp, A. Kaehler, D. Stavens, S. Thrun, G. Bradski. “Self-supervised monocular road detection in desert terrain.” Robotics: Science and Systems, 2006.

[6].

J.M.A. Alvarez, T. Gevers, Y. LeCun, A.M. Lopez. “Road scene segmentation from a single image.” European Conference on Computer Vision (ECCV), pp. 376-389, 2012.

[7].

J.M.A. Alvarez, A.M. Lopez. “Road detection based on illuminant invariance.” IEEE Transactions on Intelligent Transportation Systems, vol. 12, pp. 184-193, 2011.

[8].

J.M.A. Alvarez, M. Salzmann, N. Barnes. “Large-scale semantic co-labeling of image sets.” IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 501-508, 2014.

[9].

H. Kong, J.Y. Audibert, J. Ponce. “General road detection from a single image.” IEEE Transactions on Image Processing, vol. 19, pp. 2211-2220, 2010.

[10].

L. Bai, Y. Wang, M. Fairhurst. “An extended hyperbola model for road tracking for video-based personal navigation.” Knowledge-Based Systems, pp. 265-272, 2008.

[11].

J.M.A. Alvarez, T. Gevers, A.M. Lopez. “3D scene priors for road detection.” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 57-64, 2010.

[12].

T. Kuhnl, J. Fritsch. “Visio-spatial road boundary detection for unmarked urban and rural roads.” IEEE Intelligent Vehicles Symposium Proceedings, pp. 1251-1256, 2014.

[13].

F. Diego, J.M.A. Alvarez, J. Serrat, A.M. Lopez. “Vision-based road detection via on-line video registration.” International IEEE Conference on Intelligent Transportation Systems (ITSC), pp. 1135-1140, 2010.

[14].

J.M.A. Alvarez, T. Gevers, A.M. Lopez. “Vision-based road detection using road models.” IEEE International Conference on Image Processing (ICIP), pp. 2073-2076, 2009.

[15].

I. Katramados, S. Crumpler, T.P. Breckon. “Real-time traversable surface detection by color space fusion and temporal analysis.” International Conference on Computer Vision Systems (ICVS), pp. 265-274, 2009.