Jason J. Corso
Code and Data Downloads
Links to code and data downloads are included with the relevant project
pages and publication entries. I also list them here in one concise
place, although some newer ones may still be missing.
YouCook2 is the largest task-oriented, instructional video dataset in the vision community. It contains 2000 long untrimmed videos from 89 cooking recipes. The procedure steps for each video are annotated with temporal boundaries and described by imperative English sentences. See the arXiv report for the paper and data.
Click-Here CNNs: project page with the code and data for our ICCV 2017 paper.
A2D: Actor-Action Dataset is a new
dataset to support a broad class of video understanding problems:
action recognition, actor-class recognition, multi-label
actor/action recognition, and actor-action semantic segmentation.
Data and evaluation code are available. This dataset was released
with our CVPR 2015 paper.
Video2Text.net:
A website and web service for automatically converting videos to natural
language sentences based on the video content. The website showcases our work
in the vision+language domain and implements our various
video-to-text methods.
YouCook data set: 88 challenging
cooking videos (third-person viewpoint, different
backgrounds, dynamic camera and person movement) with natural
language annotations (about 8 per video) and object and action
annotations. Includes a benchmark ROUGE scoring evaluation. The
data set was published with our CVPR 2013 paper.
Hierarchy Agreement Index: an implementation of our AAAI LBP 2013
cross-hierarchy evaluation tool for general use.
Random Forest Distance: tree-structured metric learning that implicitly adapts the metric over the sample space, based on our KDD 2012 paper. (Code updated 2/28/14)
Action Bank: full code and processed data sets [direct link to code]
LIBSVX: A Supervoxel Library and Benchmark for Early Video Processing. Implements a suite of supervoxel video segmentation methods as well as a set of quantitative 2D and 3D metrics for evaluating supervoxel quality.
Graph-Shifts code (Java) and example data.
Video label propagation code and benchmark data set.
UB/College Park stereo building facade dataset. [more information].