Call for participation:
While datasets exist for image segmentation and object recognition, there is no publicly available and commonly used dataset for human action recognition. A survey of publications from recent major conferences shows that a considerable number of different datasets exist (see the list of publications below), but that there is no exchange of video material. We believe that commonly accepted datasets will facilitate progress in the field and enable a better understanding of the techniques involved.
Therefore we propose the creation of a public repository of video sequences for action recognition. First, we would like to encourage you to make existing video sequences available online. We will create a webpage at Berkeley that collects references to the existing datasets. We also offer to record additional footage containing a broad set of actions in different environments. We are open to suggestions on the set of actions, locations, viewpoints, etc.
Video sequences of people in different environments have been collected for the "Performance Evaluation of Tracking and Surveillance" (PETS) workshops. However, the focus of PETS is tracking and counting people, not action recognition; the set of actions in that dataset is therefore small (walking, standing, running) and not sufficient for our purpose. In contrast, some authors have collected large sets of video sequences of various actions performed by different people in varying environments. If made available, this material could serve as the basis of an extensive dataset that could become the standard in the field.
We hope that, in the spirit of cooperative scientific progress, future publications will make use of these common datasets for two reasons: (1) so that different algorithms can be compared to each other, and (2) so that progress in performance can be tracked over time. While there is always room for application-specific advances in understanding and recognizing actions, some common ground is needed in order to understand the relationships between the various techniques. We hope that common datasets will prove to be a step in that direction.
If you own an existing dataset and wish to make it available, please send the URL to one of the contact addresses below and it will be added to the webpage.
The website is:
http://www.cs.berkeley.edu/projects/vision/action
Contact:
Prof. Jitendra Malik
Slav Petrov
Alex Berg
Links to Datasets:
- "Free Viewpoint Action Recognition using Motion History Volumes (CVIU Nov./Dec. '06)."
D. Weinland, R. Ronfard, E. Boyer
- "Actions as Space-Time Shapes (ICCV '05)."
M. Blank, L. Gorelick, E. Shechtman, M. Irani, R. Basri
- "Recognizing Human Actions: A Local SVM Approach (ICPR '04)."
C. Schuldt, I. Laptev and B. Caputo
- "Propagation Networks for Recognizing Partially Ordered Sequential Activity (CVPR '04)."
Y. Shi, Y. Huang, D. Minnen, A. Bobick, I. Essa
- "Tracking Multiple Objects through Occlusions (CVPR '05)."
Y. Huang, I. Essa
- Sixth IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS - ECCV 2004)
Recent Action Recognition Papers:
- D. Weinland, R. Ronfard, E. Boyer (CVIU Nov./Dec. '06)
"Free Viewpoint Action Recognition using Motion History Volumes"
11 actors, each performing 13 actions 3 times: Check Watch, Cross Arms, Scratch Head, Sit Down, Get Up, Turn Around, Walk, Wave, Punch, Kick, Point, Pick Up, Throw.
Multiple views of 5 synchronized and calibrated cameras are provided.
- A. Yilmaz, M. Shah (ICCV '05)
"Recognizing Human Actions in Videos Acquired by Uncalibrated Moving Cameras"
18 Sequences, 8 Actions: 3 x Running, 3 x Bicycling, 3 x Sitting-down, 2 x Walking, 2 x Picking-up, 1 x Waving Hands, 1 x Forehand Stroke, 1 x Backhand Stroke
- Y. Sheikh, M. Shah (ICCV '05)
"Exploring the Space of an Action for Human Action Recognition"
6 Actions: Sitting, Standing, Falling, Walking, Dancing, Running
- M. Blank, L. Gorelick, E. Shechtman, M. Irani, R. Basri (ICCV '05)
"Actions as Space-Time Shapes"
81 Sequences, 9 Actions, 9 People: Running, Walking, Bending, Jumping-Jack, Jumping-Forward-On-Two-Legs, Jumping-In-Place-On-Two-Legs, Galloping-Sideways, Waving-Two-Hands, Waving-One-Hand; Ballet
- A. Yilmaz, M. Shah (CVPR '05)
"Action Sketch: A Novel Action Representation"
28 Sequences, 12 Actions: 7 x Walking, 4 x Aerobics, 2 x Dancing, 2 x Sit-down, 2 x Stand-up, 2 x Kicking, 2 x Surrender, 2 x Hands-down, 2 x Tennis, 1 x Falling
- E. Shechtman, M. Irani (CVPR '05)
"Space-Time Behavioral Correlation"
Walking, Diving, Jumping, Waving Arms, Waving Hands, Ballet Figure, Water Fountain
- Y. Shi, Y. Huang, D. Minnen, A. Bobick, I. Essa (CVPR '04)
"Propagation Networks for Recognition of Partially Ordered Sequential Actions"
Glucose Monitor Calibration
- C. Schuldt, I. Laptev and B. Caputo (ICPR '04)
"Recognizing Human Actions: A Local SVM Approach."
6 Actions x 25 Subjects x 4 Scenarios
- V. Parameswaran, R. Chellappa (CVPR '03)
"View Invariants for Human Action Recognition"
25 x Walk, 6 x Run, 18 x Sit-down
- D. Minnen, I. Essa, T. Starner (CVPR '03)
"Expectation Grammars: Leveraging High-Level Expectations for Activity Recognition"
Towers of Hanoi (hands only)
- A. Efros, A. Berg, G. Mori, J. Malik (ICCV '03)
"Recognizing Actions at a Distance"
Soccer, Tennis, Ballet