We present convolutional neural networks for the tasks of keypoint (pose) prediction
and action classification of people in unconstrained images. Our approach
involves training an R-CNN detector with loss functions depending on the task
being tackled. We evaluate our method on the challenging PASCAL VOC dataset
and compare it to previous leading approaches. Our method gives state-of-the-art
results for keypoint and action prediction. Additionally, we introduce a new
dataset for action detection, the task of simultaneously localizing people and classifying
their actions, and present results using our approach.
Before using the available source code, you need to install Caffe.
You can download the action dataset used in the paper. The dataset contains the PASCAL VOC Action 2012 images, with complete annotations of all people and their action labels.
Dataset download: action_dataset.tar.gz
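Once the archive is downloaded, it can be unpacked with Python's standard library. The following is a minimal sketch, not part of the released code: the helper name `extract_archive` and the destination directory are hypothetical, and nothing is assumed about the layout of files inside the archive.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the sorted member names.

    Hypothetical helper for illustration; it makes no assumption about
    how the dataset is organized inside the archive.
    """
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        members = tar.getmembers()
        # Refuse entries whose paths would land outside the destination directory.
        for member in members:
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
        return sorted(member.name for member in members)
```

For example, `extract_archive("action_dataset.tar.gz", "action_dataset")` would unpack the download into a local `action_dataset` directory (a name chosen here only for illustration) and return the list of extracted entries for inspection.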
If you use our system, please cite this work. The BibTeX entry is provided below for your convenience.
@inproceedings{poseactionrcnn,
Author = {G. Gkioxari and B. Hariharan and R. Girshick and J. Malik},
Title = {R-CNNs for Pose Estimation and Action Detection},
ArchivePrefix = {arXiv},
Eprint = {1406.5212},
PrimaryClass = {cs.CV},
Year = {2014}}
For any questions regarding the work or the implementation, contact the author at gkioxari@eecs.berkeley.edu.