Review: PR-012-Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- R-CNN introduced many of the foundational ideas behind R-CNN-style detection networks.
- First, it generates about 2k region proposals using the selective search algorithm. Selective search groups pixels based on their similarity in features such as RGB, HSV, texture, etc.
- Then, each proposal is warped to a fixed size and passed through a CNN to extract features.
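- Afterwards, the bounding boxes are refined with the bounding-box regression equations, which shift and scale each proposal P toward its ground-truth box.
- The regressor is trained by minimizing the L2 loss between the regression targets and the predictions d(P) = w⋅ϕ(P).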
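
The refinement step can be sketched in a few lines. This is a minimal illustration, assuming boxes are given as (center x, center y, width, height) and following the parameterization in the R-CNN paper; the function names `regression_targets`, `apply_deltas`, and `l2_loss` are hypothetical and chosen only for this example.

```python
import math

def regression_targets(p, g):
    """Targets (tx, ty, tw, th) that map proposal p onto ground-truth box g.

    Both boxes are (center_x, center_y, width, height).
    """
    px, py, pw, ph = p
    gx, gy, gw, gh = g
    return ((gx - px) / pw, (gy - py) / ph,
            math.log(gw / pw), math.log(gh / ph))

def apply_deltas(p, d):
    """Refine proposal p with predicted deltas d = (dx, dy, dw, dh)."""
    px, py, pw, ph = p
    dx, dy, dw, dh = d
    return (pw * dx + px, ph * dy + py,
            pw * math.exp(dw), ph * math.exp(dh))

def l2_loss(t, d):
    """Squared error between targets t and predictions d (no regularizer)."""
    return sum((ti - di) ** 2 for ti, di in zip(t, d))
```

In the real model the deltas would come from a learned linear function of the CNN features, w⋅ϕ(P); here they are passed in directly so the transformation itself is easy to check. Applying the exact targets to a proposal recovers the ground-truth box.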
- Fast R-CNN improves on R-CNN, especially in training and inference time.
- Instead of running the CNN separately on each of the roughly 2k RoIs generated by selective search, Fast R-CNN runs the CNN once on the whole image and shares a single feature map across all RoIs.
- One new idea is RoI pooling.
- RoI pooling converts the features inside each RoI, whatever its size, into a fixed-size output by dividing the RoI into a grid and max-pooling each cell.
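
A rough sketch of RoI max pooling on a single-channel feature map follows. It assumes the RoI is already expressed in integer feature-map coordinates as (x0, y0, x1, y1) with exclusive upper bounds; the function name `roi_max_pool` and the bin-boundary rounding are illustrative choices, not the exact implementation used by Fast R-CNN.

```python
def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool the region `roi` of a 2D feature map to an out_h x out_w grid.

    feature_map: list of rows (2D list of numbers).
    roi: (x0, y0, x1, y1) in feature-map coordinates, upper bounds exclusive.
    """
    x0, y0, x1, y1 = roi
    h, w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # Floor the bin start and ceil the bin end so no bin is empty.
        ys = y0 + (i * h) // out_h
        ye = y0 + -(-((i + 1) * h) // out_h)
        row = []
        for j in range(out_w):
            xs = x0 + (j * w) // out_w
            xe = x0 + -(-((j + 1) * w) // out_w)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled
```

RoIs of different sizes, such as a 4×4 and a 5×3 region, both come out as 2×2 grids, which is what lets the following fully connected layers accept them.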
- Using selective search is a huge bottleneck! Can we do better than this?
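- Yes! We can use a Region Proposal Network (RPN).
- The RPN is a small convolutional network that slides over the shared feature map and, at each location, predicts objectness scores and box offsets for a set of reference boxes (anchors), which gives us RoIs.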
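
In the Faster R-CNN paper, the RPN does not predict boxes from scratch: at every feature-map location it scores and adjusts a fixed set of reference boxes called anchors (3 scales × 3 aspect ratios by default). The sketch below only generates those anchors; the function name `generate_anchors`, the stride of 16, and the choice to center each anchor in the middle of its cell are assumptions made for illustration.

```python
import math

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x0, y0, x1, y1) in image coordinates.

    One anchor per (location, scale, ratio); ratio is height / width,
    and each anchor keeps an area of scale * scale.
    """
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            # Center of this feature-map cell in image coordinates.
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
            for s in scales:
                for r in ratios:
                    w = s / math.sqrt(r)
                    h = s * math.sqrt(r)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors
```

A 2×3 feature map yields 2 × 3 × 9 = 54 anchors. The RPN's classification and regression heads would then produce one objectness score and one set of four box offsets per anchor.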