Jason J. Corso
Publications List
tag: computer vision
L. Zhou, H. Palangi, L. Zhang, H. Hu, J. J. Corso, and J. Gao.
Unified vision-language pre-training for image captioning and VQA.
In Proceedings of AAAI Conference on Artificial Intelligence,
2020.
[ bib ]
B. Griffin, V. Florence, and J. J. Corso.
Video object segmentation-based visual servo control and object depth
estimation on a mobile robot.
In Proceedings of IEEE Winter Conference on Applications of
Computer Vision, 2020.
[ bib ]
J. Y. Song, S. J. Lemmer, M. X. Liu, S. Yan, J. Kim, J. J. Corso,
and W. S. Lasecki.
Popup: Reconstructing 3D video using particle filtering to aggregate
crowd responses.
In Proceedings of ACM International Conference on Intelligent
User Interfaces, 2019.
[ bib |
http ]
H. Tang, X. Chen, W. Wang, D. Xu, J. J. Corso, N. Sebe, and Y. Yan.
Attribute-guided sketch generation.
In Proceedings of IEEE Conference on Automatic Face and Gesture
Recognition, 2019.
[ bib |
http ]
H. Tang, D. Xu, Y. Yan, Y. Wang, J. J. Corso, and N. Sebe.
Multi-channel attention selection GAN with cascaded semantic
guidance for cross-view image translation.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2019.
[ bib |
.pdf ]
L. Zhou, Y. Kalantidis, X. Chen, J. J. Corso, and M. Rohrbach.
Grounded video description.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2019.
[ bib |
.pdf ]
B. Griffin and J. J. Corso.
BubbleNets: Learning to select the guidance frame in video object
segmentation by deep sorting frames.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2019.
[ bib |
.pdf ]
B. Griffin and J. J. Corso.
Tukey-inspired video object segmentation.
In Proceedings of IEEE Winter Conference on Applications of
Computer Vision, 2019.
[ bib |
http ]
H. Huang, L. Zhou, W. Zhang, J. J. Corso, and C. Xu.
Dynamic graph modules for modeling object-object interactions in
activity recognition.
In Proceedings of the British Machine Vision Conference, 2019.
[ bib |
.pdf ]
K. Min and J. J. Corso.
TASED-net: Temporally-aggregating spatial encoder-decoder network
for video saliency detection.
In Proceedings of IEEE International Conference on Computer
Vision, 2019.
[ bib |
.pdf ]
S. Kumar, V. Dhiman, P. Koch, and J. J. Corso.
Learning compositional sparse bimodal models.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 40(5):1032--1044, 2018.
[ bib |
code ]
L. Zhou, Y. Zhou, J. J. Corso, R. Socher, and C. Xiong.
End-to-end dense video captioning with masked transformer.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2018.
[ bib |
code |
.pdf ]
X. Sun, R. Szeto, and J. J. Corso.
A Temporally-Aware Interpolation Network for Video Frame
Inpainting.
In Proceedings of Asian Conference on Computer Vision (ACCV),
2018.
[ bib |
code |
project |
http ]
L. Zhou, N. Louis, and J. J. Corso.
Weakly-supervised video object grounding from text by loss weighting
and object interaction.
In Proceedings of British Machine Vision Conference, 2018.
[ bib |
.pdf ]
M. R. Ganesh, E. Hofesmann, B. Min, N. Gafoor, and J. J. Corso.
T-recs: Training for rate-invariant embeddings by controlling speed
for action recognition.
Technical Report 1803.08094, ARXIV, 2018.
[ bib |
http ]
E. Hofesmann, M. R. Ganesh, and J. J. Corso.
M-PACT: An open source platform for repeatable activity
classification research.
Technical Report 1804.05879, ARXIV, 2018.
[ bib |
code |
http ]
M. El Banani and J. J. Corso.
Adviser networks: Learning what question to ask for human-in-the-loop
viewpoint estimation.
Technical Report 1802.01666, ARXIV, 2018.
[ bib |
code |
http ]
T. Han, H. Yao, C. Xu, X. Sun, Y. Zhang, and J. J. Corso.
Dancelets mining for video recommendation based on dance styles.
IEEE Transactions on Multimedia, 19(4), 2017.
[ bib ]
C. Chen and J. J. Corso.
Joint occlusion boundary detection and figure/ground assignment by
extracting common-fate fragments in a back-projection scheme.
Pattern Recognition, 64:15--28, 2017.
[ bib ]
Y. Yan, C. Xu, D. Cai, and J. J. Corso.
Weakly supervised actor-action segmentation via robust multi-task
ranking.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2017.
[ bib ]
R. Szeto and J. J. Corso.
Click-here: Human-localized keypoints as guidance for viewpoint
estimation.
In Proceedings of IEEE International Conference on Computer
Vision, 2017.
[ bib |
poster |
code |
project |
data |
.pdf ]
L. Zhou, C. Xu, P. Koch, and J. J. Corso.
Watch what you just said: Image captioning with text-conditional
attention.
In Proceedings of the Thematic Workshops of ACM Multimedia,
2017.
[ bib ]
V. Dhiman, Q.-H. Tran, J. J. Corso, and M. Chandraker.
A continuous occlusion model for road scene understanding.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2016.
[ bib ]
C. Xu and J. J. Corso.
Actor-action semantic segmentation with grouping-process models.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2016.
[ bib |
data ]
C. Xu and J. J. Corso.
LIBSVX: A supervoxel library and benchmark for early video
processing.
International Journal of Computer Vision, 119:272--290, 2016.
[ bib ]
R. Xu, C. Xiong, W. Chen, and J. J. Corso.
Jointly modeling deep video and compositional text to bridge vision
and language in a unified framework.
In Proceedings of AAAI Conference on Artificial Intelligence,
2015.
[ bib |
.pdf ]
J. Lu, R. Xu, and J. J. Corso.
Human action segmentation with hierarchical supervoxel consistency.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2015.
[ bib |
.pdf ]
C. Xu, S.-H. Hsieh, C. Xiong, and J. J. Corso.
Can humans fly? Action understanding with multiple classes of
actors.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2015.
[ bib |
poster |
data |
.pdf ]
W. Chen and J. J. Corso.
Action detection by implicit intentional motion clustering.
In Proceedings of IEEE International Conference on Computer
Vision, 2015.
[ bib |
poster |
.pdf ]
S. Oh, S. McCloskey, I. Kim, A. Vahdat, K. Cannons, H. Hajimirsadeghi, G. Mori,
A. G. A. Perera, M. Pandey, and J. J. Corso.
Multimedia event detection with multimodal feature fusion and
temporal concept localization.
Machine Vision and Applications, 25:49--69, 2014.
[ bib |
http ]
P. Agarwal, S. Kumar, J. Ryde, J. J. Corso, and V. N. Krovi.
Estimating dynamics on-the-fly using monocular video for vision-based
robotics.
IEEE/ASME Transactions on Mechatronics, 19(4):1412--1423, 2014.
[ bib |
http ]
C. Xu, R. F. Doell, S. J. Hanson, C. Hanson, and J. J. Corso.
A study of actor and action semantic retention in video supervoxel
segmentation.
International Journal of Semantic Computing, 2014.
Selected as a Best Paper from ICSC; an earlier version appeared in
Proceedings of IEEE International Conference on Semantic Computing, 2013.
[ bib |
.pdf ]
W. Chen, C. Xiong, R. Xu, and J. J. Corso.
Actionness ranking with lattice conditional ordinal random fields.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2014.
[ bib |
poster |
code |
.pdf ]
S. Kumar, V. Dhiman, and J. J. Corso.
Learning compositional sparse models of bimodal percepts.
In Proceedings of AAAI Conference on Artificial Intelligence,
2014.
[ bib |
code |
.pdf ]
A. Barbu, D. Barrett, W. Chen, N. Siddharth, C. Xiong, J. J. Corso,
C. D. Fellbaum, C. Hanson, S. J. Hanson, S. Hélie, E. Malaia, B. A.
Pearlmutter, J. M. Siskind, T. M. Talavage, and R. B. Wilbur.
Seeing is worse than believing: Reading people's minds better than
computer-vision methods recognize actions.
In Proceedings of European Conference on Computer Vision, 2014.
[ bib |
.pdf ]
J. J. Corso.
Toward parts-based scene understanding with pixel-support
parts-sparse pictorial structures.
Pattern Recognition Letters: Special Issue on Scene
Understanding and Behavior Analysis, 34(7):762--769, 2013.
Early version appears as arXiv.org tech report 1108.4079v1.
[ bib |
.pdf ]
Y. Miao and J. J. Corso.
Hamiltonian streamline guided feature extraction with application to
face detection.
Journal of Neurocomputing, 120:226--234, 2013.
Early version appears as arXiv.org tech report 1108.3525v1.
[ bib |
http ]
P. Das, R. K. Srihari, and J. J. Corso.
Translating related words to videos and back through latent topics.
In Proceedings of Sixth ACM International Conference on Web
Search and Data Mining, 2013.
[ bib |
.pdf ]
J. A. Delmerico, D. Baran, P. David, J. Ryde, and J. J. Corso.
Ascending stairway modeling from dense depth imagery for
traversability analysis.
In Proceedings of IEEE International Conference on Robotics and
Automation, 2013.
[ bib |
project |
.pdf ]
P. Das, C. Xu, R. F. Doell, and J. J. Corso.
A thousand frames in just a few words: Lingual description of videos
through latent topics and sparse object stitching.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2013.
[ bib |
poster |
data |
.pdf ]
C. Xu, R. F. Doell, S. J. Hanson, C. Hanson, and J. J. Corso.
Are actor and action semantics retained in video supervoxel
segmentation?
In Proceedings of IEEE International Conference on Semantic
Computing, 2013.
[ bib |
.pdf ]
C. Xu, S. Whitt, and J. J. Corso.
Flattening supervoxel hierarchies by the uniform entropy slice.
In Proceedings of the IEEE International Conference on Computer
Vision, 2013.
[ bib |
poster |
project |
video |
.pdf ]
J. A. Delmerico, P. David, and J. J. Corso.
Building facade detection, segmentation, and parameter estimation for
mobile robot stereo vision.
Image and Vision Computing, 31(11):841--852, 2013.
[ bib |
project |
data |
.pdf ]
C. Xu and J. J. Corso.
Evaluation of super-voxel methods for early video processing.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2012.
[ bib |
code |
project |
.pdf ]
S. Sadanand and J. J. Corso.
Action bank: A high-level representation of activity in video.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2012.
[ bib |
code |
project |
.pdf ]
P. Agarwal, S. Kumar, J. Ryde, J. J. Corso, and V. N. Krovi.
Estimating human dynamics on-the-fly using monocular video for pose
estimation.
In Proceedings of Robotics Science and Systems, 2012.
[ bib |
.pdf ]
R. Xu, P. Agarwal, S. Kumar, V. N. Krovi, and J. J. Corso.
Combining skeletal pose with local motion for human activity
recognition.
In Proceedings of VII Conference on Articulated Motion and
Deformable Objects, 2012.
[ bib |
slides |
.pdf ]
M. A. Bustamante and J. J. Corso.
Using probabilistic ontologies for video exploration.
In Proceedings of the Eighteenth Americas Conference on
Information Systems, 2012.
[ bib ]
P. Agarwal, S. Kumar, J. Ryde, J. J. Corso, and V. N. Krovi.
An optimization based framework for human pose estimation in
monocular videos.
In Proceedings of International Symposium on Visual Computing,
2012.
[ bib |
.pdf ]
C. Xiong and J. J. Corso.
Coaction discovery: Segmentation of common actions across multiple
videos.
In Proceedings of Multimedia Data Mining Workshop in Conjunction
with the ACM SIGKDD Conference on Knowledge Discovery and Data Mining
(MDMKDD), 2012.
[ bib |
.pdf ]
C. Xu, C. Xiong, and J. J. Corso.
Streaming hierarchical video segmentation.
In Proceedings of European Conference on Computer Vision, 2012.
[ bib |
code |
project |
.pdf ]
J. A. Delmerico, J. J. Corso, D. Baran, P. David, and J. Ryde.
Ascending stairway modeling: A first step toward autonomous
multi-floor exploration.
In Proceedings of IEEE/RSJ Intelligent Robots and Systems (Video
Proceedings), 2012.
[ bib |
project |
video ]
C. S. Lea and J. J. Corso.
Efficient hierarchical Markov random fields for object detection on a
mobile robot.
Technical Report 1111.1599v1, arXiv, November 2011.
[ bib ]
Y. Miao and J. J. Corso.
Hamiltonian streamline guided feature extraction with applications to
face detection.
Technical Report 1108.3525v1, arXiv, August 2011.
[ bib ]
A. Y. C. Chen and J. J. Corso.
Temporally consistent multi-class video-object segmentation with the
video graph-shifts algorithm.
In Proceedings of the 2011 IEEE Workshop on Motion and Video
Computing, 2011.
[ bib |
code |
project |
.pdf ]
D. R. Schlegel, A. Y. C. Chen, C. Xiong, J. A. Delmerico, and J. J.
Corso.
AirTouch: Interacting with computer systems at a distance.
In Proceedings of IEEE Winter Vision Meetings: Workshop on
Applications of Computer Vision (WACV), 2011.
[ bib |
.pdf ]
P. Agarwal, S. Kumar, J. J. Corso, and V. N. Krovi.
Estimating dynamics on-the-fly using monocular video.
In Proceedings of 4th Annual Dynamic Systems and Control
Conference, 2011.
[ bib |
.pdf ]
J. A. Delmerico, P. David, and J. J. Corso.
Building facade detection, segmentation, and parameter estimation for
mobile robot localization and guidance.
In Proceedings of International Conference on Intelligent Robots
and Systems, 2011.
[ bib |
project |
data |
.pdf ]
A. Perera, S. Oh, M. Leotta, I. Kim, B. Byun, C.-H. Lee, S. McCloskey, J. Liu,
B. Miller, Z. F. Huang, A. Vahdat, W. Yang, G. Mori, K. Tang, D. Koller,
L. Fei-Fei, K. Li, G. Chen, J. J. Corso, Y. Fu, and R. K.
Srihari.
GENIE TRECVID2011 multimedia event detection: Late-fusion
approaches to combine multiple audio-visual features.
In NIST TRECVID Workshop, 2011.
[ bib ]
A. Y. C. Chen and J. J. Corso.
On the effects of normalization in adaptive MRF hierarchies.
In Proceedings of CompImage '10---Computational Modeling of
Objects Presented in Images, 2010.
[ bib |
.pdf ]
M. R. Malgireddy, J. J. Corso, S. Setlur, V. Govindaraju, and
D. Mandalapu.
A framework for hand gesture recognition and spotting using
sub-gesture modeling.
In Proceedings of the 20th International Conference on Pattern
Recognition, 2010.
[ bib |
.pdf ]
J. A. Delmerico, J. J. Corso, and P. David.
Boosting with stereo features for building facade detection on mobile
platforms.
In Proceedings of Western New York Image Processing Workshop,
2010.
[ bib |
.pdf ]
A. Y. C. Chen and J. J. Corso.
Propagating multi-class pixel labels throughout video frames.
In Proceedings of Western New York Image Processing Workshop,
2010.
[ bib |
.pdf ]
J. J. Corso and G. D. Hager.
Image Description with Features that Summarize.
Computer Vision and Image Understanding, 113:446--458, 2009.
[ bib |
.pdf ]
T. J. Burns and J. J. Corso.
Robust unsupervised segmentation of degraded document images with
topic models.
In Proceedings of IEEE Conference on Computer Vision and Pattern
Recognition, 2009.
[ bib |
.pdf ]
R. Rodrigues, G. Schroeder, J. J. Corso, and V. Govindaraju.
Unconstrained face recognition using MRF priors and manifold
traversing.
In Proceedings of IEEE International Conference on Biometrics:
Theory, Applications, Systems, 2009.
[ bib |
.pdf ]
I. Nwogu and J. J. Corso.
Labeling irregular graphs with belief propagation.
In Proceedings of International Workshop on Combinatorial Image
Analysis, volume LNCS 4958, pages 295--305, 2008.
[ bib |
.pdf ]
J. J. Corso, Z. Tu, and A. Yuille.
MRF Labeling with a Graph-Shifts Algorithm.
In Proceedings of International Workshop on Combinatorial Image
Analysis, volume LNCS 4958, pages 172--184, 2008.
[ bib |
.pdf ]
I. Nwogu and J. J. Corso.
(BP)²: Beyond Pairwise Belief Propagation, Labeling by
Approximating Kikuchi Free Energies.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2008.
[ bib |
.pdf ]
J. J. Corso.
Discriminative Modeling by Boosting on Multilevel Aggregates.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2008.
[ bib |
.pdf ]
J. J. Corso, A. Yuille, and Z. Tu.
Graph-Shifts: Natural Image Labeling by Dynamic Hierarchical
Computing.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 2008.
[ bib |
code |
project |
.pdf ]
J. J. Corso, G. Ye, D. Burschka, and G. D. Hager.
A Practical Paradigm and Platform for Video-Based Human-Computer
Interaction.
IEEE Computer, 42(5):48--55, 2008.
[ bib |
.pdf ]
I. Nwogu, J. J. Corso, and T. Bittner.
The design of an ontology-enhanced anatomy labeler.
Technical Report 2008-09, University at Buffalo SUNY, 2008.
[ bib |
.pdf ]
A. Y. C. Chen, J. J. Corso, and L. Wang.
HOPS: Efficient region labeling using higher order proxy
neighborhoods.
In Proceedings of International Conference on Pattern
Recognition, 2008.
[ bib |
.pdf ]
J. Li, S. Tulyakov, F. Farooq, J. J. Corso, and V. Govindaraju.
Integrating minutiae based fingerprint matching with local mutual
information.
In Proceedings of International Conference on Pattern
Recognition, 2008.
[ bib |
.pdf ]
D. Burschka, J. J. Corso, M. Dewan, W. Lau, M. Li, H. Lin,
P. Marayong, N. Ramey, G. D. Hager, B. Hoffman, D. Larkin, and C. Hasser.
Navigating Inner Space: 3-D Assistance for Minimally Invasive
Surgery.
Robotics and Autonomous Systems, 2005.
[ bib ]
G. Ye, J. J. Corso, and G. D. Hager.
Real-Time Vision for Human-Computer Interaction, chapter 7:
Visual Modeling of Dynamic Gestures Using 3D Appearance and Motion Features,
pages 103--120.
Springer-Verlag, 2005.
[ bib |
.pdf ]
D. Burschka, G. Ye, J. J. Corso, and G. D. Hager.
A Practical Approach for Integrating Vision-Based Methods into
Interactive 2D/3D Applications.
Technical report, The Johns Hopkins University, 2005.
CIRL Lab Technical Report CIRL-TR-05-01.
[ bib |
.pdf ]
J. J. Corso, G. Ye, and G. D. Hager.
Analysis of Composite Gestures with a Coherent Probabilistic
Graphical Model.
Virtual Reality, 8(4):242--252, 2005.
[ bib |
.pdf ]
J. J. Corso.
Techniques for Vision-Based Human-Computer Interaction.
PhD thesis, The Johns Hopkins University, 2005.
[ bib |
.pdf ]
J. J. Corso and G. D. Hager.
Coherent Regions for Concise and Stable Image Description.
In Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, volume 2, pages 184--190, 2005.
[ bib |
.pdf ]
G. Ye, J. J. Corso, and G. D. Hager.
Gesture Recognition Using 3D Appearance and Motion Features.
In Proceedings of Workshop on Real-time Vision for
Human-Computer Interaction (at CVPR 2004), 2004.
[ bib |
.pdf ]
N. Ramey, J. J. Corso, W. W. Lau, D. Burschka, and G. D. Hager.
Real Time 3D Surface Tracking and Its Applications.
In Proceedings of Workshop on Real-time 3D Sensors and Their
Use (at CVPR 2004), 2004.
[ bib |
.pdf ]
J. J. Corso, M. Dewan, and G. D. Hager.
Image Segmentation Through Energy Minimization Based Subspace
Fusion.
Technical Report CIRL-TR-04-01, The Johns Hopkins University, 2004.
[ bib |
.pdf ]
G. Ye, J. J. Corso, D. Burschka, and G. D. Hager.
VICs: A Modular HCI Framework Using Spatio-Temporal Dynamics.
Machine Vision and Applications, 16(1):13--20, 2004.
[ bib ]
J. J. Corso, M. Dewan, and G. D. Hager.
Image Segmentation Through Energy Minimization Based Subspace
Fusion.
In Proceedings of 17th International Conference on Pattern
Recognition (ICPR 2004), 2004.
[ bib |
.pdf ]
J. J. Corso.
Vision-Based Techniques for Dynamic, Collaborative Mixed-Realities.
In B. J. Thompson, editor, Research Papers of the Link
Foundation Fellows, volume 4. University of Rochester Press, 2004.
Invited Report for Link Foundation Fellowship.
[ bib ]
G. Ye, J. J. Corso, D. Burschka, and G. D. Hager.
VICs: A Modular Vision-Based HCI Framework.
In Proceedings of 3rd International Conference on Computer
Vision Systems, pages 257--267, 2003.
[ bib |
.pdf ]
J. J. Corso, D. Burschka, and G. D. Hager.
Direct Plane Tracking in Stereo Images for Mobile Navigation.
In Proceedings of International Conference on Robotics and
Automation, 2003.
[ bib |
.pdf ]
J. J. Corso, D. Burschka, and G. D. Hager.
The 4DT: Unencumbered HCI With VICs.
In Proceedings of CVPRHCI, 2003.
[ bib |
.pdf ]
J. J. Corso, N. Ramey, and G. D. Hager.
Stereo-Based Direct Surface Tracking with Deformable Parametric
Models.
Technical report, The Johns Hopkins University, 2003.
CIRL Lab Technical Report 2003-02.
[ bib |
.pdf ]
G. Ye, J. J. Corso, G. D. Hager, and A. M. Okamura.
VisHap: Augmented Reality Combining Haptics and Vision.
In Proceedings of IEEE International Conference on Systems, Man
and Cybernetics, 2003.
[ bib |
.pdf ]
J. J. Corso, G. Ye, D. Burschka, and G. D. Hager.
Software Systems for Vision-Based Spatial Interaction.
In Proceedings of 2002 Workshop on Intelligent Human
Augmentation and Virtual Environments, pages D--26 and D--56, 2002.
[ bib ]
J. J. Corso and J. D. Cohen.
Out-Of-Core Voxelization of Large Scalar Fields for Interactive
Multiresolution Volume Rendering.
Technical report, The Johns Hopkins University, 2002.
Graphics Lab Technical Report.
[ bib ]
J. J. Corso and G. D. Hager.
Planar Surface Tracking Using Direct Stereo.
Technical report, The Johns Hopkins University, 2002.
CIRL Lab Technical Report.
[ bib |
.pdf ]
R. Szeto, X. Sun, K. Lu, and J. J. Corso.
A Temporally-Aware Interpolation Network for Video Frame
Inpainting.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 2020 (to appear).
[ bib ]
Y. Yan, C. Xu, D. Cai, and J. J. Corso.
A weakly supervised multi-task ranking framework for actor-action
semantic segmentation.
International Journal of Computer Vision, 2019 (to appear).
[ bib ]