We address the classic problems of detection and segmentation using a part-based detector that operates on a novel type of part, which we call a poselet. Poselets are tightly clustered both in appearance space (and are thus easy to detect) and in configuration space (and are thus helpful for localization and segmentation). We demonstrate that poselets are effective for detection, pose extraction, segmentation, action/pose estimation and attribute classification. Poselet construction requires extra annotations beyond the object bounds. To train poselets we have created H3D (Humans in 3D), a dataset of 1200+ person annotations. The annotations include the joints, the extracted 3D pose, keypoint visibility and region labels. We have also annotated the people in the training and validation sets of PASCAL VOC 2009.
Our poselet classifier achieves state-of-the-art results for the person category on PASCAL VOC 2007, 2008, 2009 and 2010 as well as on our dataset, H3D.
You can browse the 150 poselets for the person category.
The following are results as of September 23, 2010 for the Person category of the PASCAL VOC challenges.
Challenge | Poselets | Second-highest score |
---|---|---|
VOC 2010 | 48.5 | 47.5 *** |
VOC 2009 | 48.6 | 47.9 *** |
VOC 2008 | 54.1 | 43.1 ** |
VOC 2007 | 46.9 | 43.2 * |
* P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan, Object Detection with Discriminatively Trained Part Based Models (Release 4, 2010)
** P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan, Object Detection with Discriminatively Trained Part Based Models, PAMI (preprint, 2009)
*** P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan, PASCAL VOC 2010 competition
In this comparison we included all methods participating in Comp 3 (trained on VOC data) and Comp 4 (trained on own data). Our method requires extra annotations, so we competed in Comp 4, but we were the only submission in that category.
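To make the comparison concrete, the margin over the second-highest score can be computed directly from the table. The snippet below is only an illustrative sketch in Python: the numbers are copied from the table, while the names `scores` and `margins` are hypothetical and not part of the released code.

```python
# Scores copied from the results table: (poselets, second-highest), Person category.
scores = {
    "VOC 2010": (48.5, 47.5),
    "VOC 2009": (48.6, 47.9),
    "VOC 2008": (54.1, 43.1),
    "VOC 2007": (46.9, 43.2),
}


def margins(table):
    """Return the poselets score minus the second-highest score, rounded to one decimal."""
    return {name: round(ours - other, 1) for name, (ours, other) in table.items()}


for challenge, gap in margins(scores).items():
    print(f"{challenge}: +{gap:.1f} points")
```

The margin is largest on VOC 2008 (11.0 points) and smallest on VOC 2009 (0.7 points).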
Core papers:
Applications papers:
Below is stand-alone code that takes an image and draws bounding boxes around the people in it; it can also perform interactive visualization of the poselets. Requirements: Matlab + Image Processing Toolbox. The code is released under a non-commercial license. The released code and trained detector are similar to those we used in the PASCAL 2010 competition, which are slightly more accurate (but slower) than those in our ECCV 2010 paper.
June 2011 BETA release. The trained models are the same, but the code has been cleaned up and there are many visualization utilities. Please send email to lbourdev at eecs.berkeley.edu if you have problems with the new version.
Note: If you use WinZip and Matlab reports that your file is corrupt, please try WinRAR. If you need an older release, please let us know.
The Java3D tool that we used to create H3D and a video tutorial are available here. There are no license restrictions on using the tool for your own annotations.
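If you plan to store your own annotations, the fields listed for H3D (joints, extracted 3D pose, keypoint visibility and region labels) could be held in a record along the following lines. This is a minimal sketch in Python under stated assumptions: the class names, field names, joint names and region labels are all hypothetical, and it is not the H3D file format or the output format of the annotation tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Keypoint:
    """One annotated joint (hypothetical layout, not the H3D format)."""
    xy: Tuple[float, float]           # image coordinates in pixels
    xyz: Tuple[float, float, float]   # extracted 3D position, arbitrary units
    visible: bool                     # False if the joint is occluded


@dataclass
class PersonAnnotation:
    """All annotations for one person (hypothetical layout)."""
    bounds: Tuple[float, float, float, float]   # x_min, y_min, x_max, y_max
    keypoints: Dict[str, Keypoint] = field(default_factory=dict)
    region_labels: List[str] = field(default_factory=list)

    def visible_keypoints(self) -> List[str]:
        """Names of the joints marked visible, in sorted order."""
        return sorted(name for name, kp in self.keypoints.items() if kp.visible)


# Illustrative values only; none of these come from the dataset.
person = PersonAnnotation(bounds=(10.0, 20.0, 110.0, 320.0))
person.keypoints["left_shoulder"] = Keypoint((40.0, 80.0), (0.2, 1.4, 0.0), True)
person.keypoints["right_shoulder"] = Keypoint((80.0, 82.0), (-0.2, 1.4, 0.1), False)
person.region_labels += ["face", "upper clothes"]
print(person.visible_keypoints())
```

Keeping visibility on each keypoint, rather than in a separate list, means an occluded joint can still carry its estimated position.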
For comments or questions about poselets, please email Lubomir Bourdev: lbourdev-at-eecs dot berkeley dot edu.