Author affiliations: State Key Laboratory of Automotive Simulation and Control; Beijing Advanced Innovation Center for Big Data and Brain Computing
Journal: Sensors 2019, 19, 1089
Abstract: Region proposal network (RPN) based object detection, such as Faster Regions with CNN (Faster R-CNN), has gained considerable attention due to its high accuracy and fast speed. However, it has room for improvement when used in special application situations, such as on-board vehicle detection. The original RPN locates multiscale anchors uniformly on each pixel of the last feature map and classifies whether an anchor is part of the foreground or background with one pixel in the last feature map. The receptive field of each pixel in the last feature map is fixed in the original Faster R-CNN and does not coincide with the anchor size. Hence, only a certain part can be seen for large vehicles and too much useless information is contained in the feature for small vehicles. This reduces detection accuracy. Furthermore, the perspective projection results in the vehicle bounding box size becoming related to the bounding box position, thereby reducing the effectiveness and accuracy of the uniform anchor generation method. This reduces both detection accuracy and computing speed. After the region proposal stage, many regions of interest (ROI) are generated. The ROI pooling layer projects an ROI to the last feature map and forms a new feature map with a fixed size for final classification and box regression. The number of feature map pixels in the projected region can also influence the detection performance, but this is not accurately controlled in former works. In this paper, the original Faster R-CNN is optimized, especially for on-board vehicle detection. This paper tries to solve the above-mentioned problems. The proposed method is tested on the KITTI dataset and the result shows a significant improvement without too many tricky parameter adjustments and training skills. The proposed method can also be used on other objects with obvious foreshortening effects, such as on-board pedestrian detection. The basic idea of the proposed method does not rely on a concrete implementation and thus, most deep learning based object detectors with multiscale feature maps can be optimized with it.
Keywords: vehicle detection; anchor generation optimization; receptive field matching; ROI assignment
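
To make the "uniform anchor generation" that the abstract criticizes concrete, here is a minimal sketch of how a standard RPN places the same set of multiscale anchors on every cell of the last feature map. It is an illustration only, not the paper's code: the function name `uniform_anchors` is hypothetical, the stride of 16 and the scales and ratios are the commonly used Faster R-CNN defaults (assumed here, not taken from this paper), and the 24 x 78 feature map is an assumed size roughly matching a KITTI-sized image at stride 16.

```python
from itertools import product
from math import sqrt


def uniform_anchors(feat_h, feat_w, stride=16,
                    scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors placed at every feature-map cell.

    Hypothetical helper: stride, scales and ratios are assumed defaults,
    not values reported in the paper.
    """
    anchors = []
    for row, col in product(range(feat_h), range(feat_w)):
        # Centre of this feature-map cell in input-image pixels.
        cx = (col + 0.5) * stride
        cy = (row + 0.5) * stride
        for scale, ratio in product(scales, ratios):
            # ratio = height / width; the anchor area stays at scale ** 2.
            w = scale / sqrt(ratio)
            h = scale * sqrt(ratio)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


# Assumed feature-map size for illustration only.
anchors = uniform_anchors(feat_h=24, feat_w=78)
print(len(anchors))  # 24 * 78 * 9 = 16848 anchors
```

Every row of the feature map receives the same nine anchor shapes, regardless of whether a vehicle of that size could plausibly appear at that image height. Under perspective projection, distant (small) vehicles cluster near the horizon and nearby (large) ones lower in the image, so many of these uniformly placed anchors can never match a ground-truth box. This is the inefficiency the abstract attributes to the original scheme.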