Jason J. Corso
Publications List
|
tag: video understanding
[1]
|
L. Zhou, Y. Kalantidis, X. Chen, J. J. Corso, and M. Rohrbach.
Grounded video description.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2019.
[ bib |
.pdf ]
|
[2]
|
B. Griffin and J. J. Corso.
BubbleNets: Learning to select the guidance frame in video object
segmentation by deep sorting frames.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2019.
[ bib |
.pdf ]
|
[3]
|
B. Griffin and J. J. Corso.
Tukey-inspired video object segmentation.
In Proceedings of IEEE Winter Conference on Applications of
Computer Vision, 2019.
[ bib |
http ]
|
[4]
|
H. Huang, L. Zhou, W. Zhang, J. J. Corso, and C. Xu.
Dynamic graph modules for modeling object-object interactions in
activity recognition.
In Proceedings of the British Machine Vision Conference, 2019.
[ bib |
.pdf ]
|
[5]
|
K. Min and J. J. Corso.
TASED-Net: Temporally-aggregating spatial encoder-decoder network
for video saliency detection.
In Proceedings of IEEE International Conference on Computer
Vision, 2019.
[ bib |
.pdf ]
|
[6]
|
L. Zhou, C. Xu, and J. J. Corso.
Towards automatic learning of procedures from web instructional
videos.
In Proceedings of AAAI Conference on Artificial Intelligence,
2018.
[ bib |
code |
data |
http ]
|
[7]
|
L. Zhou, Y. Zhou, J. J. Corso, R. Socher, and C. Xiong.
End-to-end dense video captioning with masked transformer.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2018.
[ bib |
code |
.pdf ]
|
[8]
|
X. Sun, R. Szeto, and J. J. Corso.
A Temporally-Aware Interpolation Network for Video Frame
Inpainting.
In Proceedings of Asian Conference on Computer Vision (ACCV),
2018.
[ bib |
code |
project |
http ]
|
[9]
|
L. Zhou, N. Louis, and J. J. Corso.
Weakly-supervised video object grounding from text by loss weighting
and object interaction.
In Proceedings of British Machine Vision Conference, 2018.
[ bib |
.pdf ]
|
[10]
|
M. R. Ganesh, E. Hofesmann, B. Min, N. Gafoor, and J. J. Corso.
T-RECS: Training for rate-invariant embeddings by controlling speed
for action recognition.
Technical Report 1803.08094, arXiv, 2018.
[ bib |
http ]
|
[11]
|
E. Hofesmann, M. R. Ganesh, and J. J. Corso.
M-PACT: An open source platform for repeatable activity
classification research.
Technical Report 1804.05879, arXiv, 2018.
[ bib |
code |
http ]
|
[12]
|
T. Han, H. Yao, C. Xu, X. Sun, Y. Zhang, and J. J. Corso.
Dancelets mining for video recommendation based on dance styles.
IEEE Transactions on Multimedia, 19(4), 2017.
[ bib ]
|
[13]
|
Y. Yan, C. Xu, D. Cai, and J. J. Corso.
Weakly supervised actor-action segmentation via robust multi-task
ranking.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2017.
[ bib ]
|
[14]
|
L. Zhou, C. Xu, P. Koch, and J. J. Corso.
Watch what you just said: Image captioning with text-conditional
attention.
In Proceedings of the Thematic Workshops of ACM Multimedia,
2017.
[ bib ]
|
[15]
|
V. Dhiman, Q.-H. Tran, J. J. Corso, and M. Chandraker.
A continuous occlusion model for road scene understanding.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2016.
[ bib ]
|
[16]
|
C. Xu and J. J. Corso.
Actor-action semantic segmentation with grouping-process models.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2016.
[ bib |
data ]
|
[17]
|
C. Xu and J. J. Corso.
LIBSVX: A supervoxel library and benchmark for early video
processing.
International Journal of Computer Vision, 119:272--290, 2016.
[ bib ]
|
[18]
|
R. Xu, C. Xiong, W. Chen, and J. J. Corso.
Jointly modeling deep video and compositional text to bridge vision
and language in a unified framework.
In Proceedings of AAAI Conference on Artificial Intelligence,
2015.
[ bib |
.pdf ]
|
[19]
|
J. Lu, R. Xu, and J. J. Corso.
Human action segmentation with hierarchical supervoxel consistency.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2015.
[ bib |
.pdf ]
|
[20]
|
C. Xu, S.-H. Hsieh, C. Xiong, and J. J. Corso.
Can humans fly? Action understanding with multiple classes of
actors.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2015.
[ bib |
poster |
data |
.pdf ]
|
[21]
|
W. Chen and J. J. Corso.
Action detection by implicit intentional motion clustering.
In Proceedings of IEEE International Conference on Computer
Vision, 2015.
[ bib |
poster |
.pdf ]
|
[22]
|
S. Oh, S. McCloskey, I. Kim, A. Vahdat, K. Cannons, H. Hajimirsadeghi, G. Mori,
A. G. A. Perera, M. Pandey, and J. J. Corso.
Multimedia event detection with multimodal feature fusion and
temporal concept localization.
Machine Vision and Applications, 25:49--69, 2014.
[ bib |
http ]
|
[23]
|
P. Agarwal, S. Kumar, J. Ryde, J. J. Corso, and V. N. Krovi.
Estimating dynamics on-the-fly using monocular video for vision-based
robotics.
IEEE/ASME Transactions on Mechatronics, 19(4):1412--1423, 2014.
[ bib |
http ]
|
[24]
|
C. Xu, R. F. Doell, S. J. Hanson, C. Hanson, and J. J. Corso.
A study of actor and action semantic retention in video supervoxel
segmentation.
International Journal of Semantic Computing, 2014.
Selected as a Best Paper from ICSC; an earlier version appeared as
arXiv:1311.3318.
[ bib |
.pdf ]
|
[25]
|
S. Kumar, M. S. Narayanan, P. Singhal, J. J. Corso, and V. Krovi.
Surgical tool attributes from monocular video.
In Proceedings of IEEE International Conference on Robotics and
Automation, 2014.
[ bib ]
|
[26]
|
W. Chen, C. Xiong, R. Xu, and J. J. Corso.
Actionness ranking with lattice conditional ordinal random fields.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2014.
[ bib |
poster |
code |
.pdf ]
|
[27]
|
A. Barbu, D. Barrett, W. Chen, N. Siddharth, C. Xiong, J. J. Corso,
C. D. Fellbaum, C. Hanson, S. J. Hanson, S. Hélie, E. Malaia, B. A.
Pearlmutter, J. M. Siskind, T. M. Talavage, and R. B. Wilbur.
Seeing is worse than believing: Reading people's minds better than
computer-vision methods recognize actions.
In Proceedings of European Conference on Computer Vision, 2014.
[ bib |
.pdf ]
|
[28]
|
P. Das, R. K. Srihari, and J. J. Corso.
Translating related words to videos and back through latent topics.
In Proceedings of Sixth ACM International Conference on Web
Search and Data Mining, 2013.
[ bib |
.pdf ]
|
[29]
|
P. Das, C. Xu, R. F. Doell, and J. J. Corso.
A thousand frames in just a few words: Lingual description of videos
through latent topics and sparse object stitching.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2013.
[ bib |
poster |
data |
.pdf ]
|
[30]
|
S. Kumar, M. Narayanan, P. Singhal, J. J. Corso, and V. Krovi.
Product of tracking experts for surgical tool visual tracking.
In IEEE Conference on Automation Science and Engineering, 2013.
[ bib |
.pdf ]
|
[31]
|
C. Xu, R. F. Doell, S. J. Hanson, C. Hanson, and J. J. Corso.
Are actor and action semantics retained in video supervoxel
segmentation?
In Proceedings of IEEE International Conference on Semantic
Computing, 2013.
[ bib |
.pdf ]
|
[32]
|
C. Xu, S. Whitt, and J. J. Corso.
Flattening supervoxel hierarchies by the uniform entropy slice.
In Proceedings of the IEEE International Conference on Computer
Vision, 2013.
[ bib |
poster |
project |
video |
.pdf ]
|
[33]
|
A. Barbu, N. Siddharth, C. Xiong, J. J. Corso, C. D. Fellbaum,
C. Hanson, S. J. Hanson, S. Hélie, E. Malaia, B. A. Pearlmutter, J. M.
Siskind, T. M. Talavage, and R. B. Wilbur.
The compositional nature of verb and argument representations in the
human brain.
Technical Report 1306.2293, arXiv, 2013.
[ bib |
http ]
|
[34]
|
C. Xu and J. J. Corso.
Evaluation of super-voxel methods for early video processing.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2012.
[ bib |
code |
project |
.pdf ]
|
[35]
|
S. Sadanand and J. J. Corso.
Action bank: A high-level representation of activity in video.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2012.
[ bib |
code |
project |
.pdf ]
|
[36]
|
P. Agarwal, S. Kumar, J. Ryde, J. J. Corso, and V. N. Krovi.
Estimating human dynamics on-the-fly using monocular video for pose
estimation.
In Proceedings of Robotics: Science and Systems, 2012.
[ bib |
.pdf ]
|
[37]
|
R. Xu, P. Agarwal, S. Kumar, V. N. Krovi, and J. J. Corso.
Combining skeletal pose with local motion for human activity
recognition.
In Proceedings of VII Conference on Articulated Motion and
Deformable Objects, 2012.
[ bib |
slides |
.pdf ]
|
[38]
|
M. A. Bustamante and J. J. Corso.
Using probabilistic ontologies for video exploration.
In Proceedings of the Eighteenth Americas Conference on
Information Systems, 2012.
[ bib ]
|
[39]
|
P. Agarwal, S. Kumar, J. Ryde, J. J. Corso, and V. N. Krovi.
An optimization based framework for human pose estimation in
monocular videos.
In Proceedings of International Symposium on Visual Computing,
2012.
[ bib |
.pdf ]
|
[40]
|
C. Xiong and J. J. Corso.
Coaction discovery: Segmentation of common actions across multiple
videos.
In Proceedings of Multimedia Data Mining Workshop in Conjunction
with the ACM SIGKDD Conference on Knowledge Discovery and Data Mining
(MDMKDD), 2012.
[ bib |
.pdf ]
|
[41]
|
C. Xu, C. Xiong, and J. J. Corso.
Streaming hierarchical video segmentation.
In Proceedings of European Conference on Computer Vision, 2012.
[ bib |
code |
project |
.pdf ]
|
[42]
|
A. Y. C. Chen and J. J. Corso.
Temporally consistent multi-class video-object segmentation with the
video graph-shifts algorithm.
In Proceedings of the 2011 IEEE Workshop on Motion and Video
Computing, 2011.
[ bib |
code |
project |
.pdf ]
|
[43]
|
P. Agarwal, S. Kumar, J. J. Corso, and V. N. Krovi.
Estimating dynamics on-the-fly using monocular video.
In Proceedings of 4th Annual Dynamic Systems and Control
Conference, 2011.
[ bib |
.pdf ]
|
[44]
|
A. Y. C. Chen and J. J. Corso.
Propagating multi-class pixel labels throughout video frames.
In Proceedings of Western New York Image Processing Workshop,
2010.
[ bib |
.pdf ]
|
[45]
|
D. Burschka, J. J. Corso, M. Dewan, W. Lau, M. Li, H. Lin,
P. Marayong, N. Ramey, G. D. Hager, B. Hoffman, D. Larkin, and C. Hasser.
Navigating Inner Space: 3-D Assistance for Minimally Invasive
Surgery.
Robotics and Autonomous Systems, 2005.
[ bib ]
|
[46]
|
G. Ye, J. J. Corso, and G. D. Hager.
Real-Time Vision for Human-Computer Interaction, chapter 7:
Visual Modeling of Dynamic Gestures Using 3D Appearance and Motion Features,
pages 103--120.
Springer-Verlag, 2005.
[ bib |
.pdf ]
|
[47]
|
J. J. Corso.
Techniques for Vision-Based Human-Computer Interaction.
PhD thesis, The Johns Hopkins University, 2005.
[ bib |
.pdf ]
|
[48]
|
G. Ye, J. J. Corso, and G. D. Hager.
Gesture Recognition Using 3D Appearance and Motion Features.
In Proceedings of Workshop on Real-time Vision for
Human-Computer Interaction (at CVPR 2004), 2004.
[ bib |
.pdf ]
|
[49]
|
N. Ramey, J. J. Corso, W. W. Lau, D. Burschka, and G. D. Hager.
Real Time 3D Surface Tracking and Its Applications.
In Proceedings of Workshop on Real-time 3D Sensors and Their
Use (at CVPR 2004), 2004.
[ bib |
.pdf ]
|
[50]
|
R. Szeto, X. Sun, K. Lu, and J. J. Corso.
A Temporally-Aware Interpolation Network for Video Frame
Inpainting.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 2020 (to appear).
[ bib ]
|
[51]
|
Y. Yan, C. Xu, D. Cai, and J. J. Corso.
A weakly supervised multi-task ranking framework for actor-action
semantic segmentation.
International Journal of Computer Vision, 2019 (to appear).
[ bib ]
|