Jason J. Corso; EECS @ U of Michigan

Jason J. Corso

Publications List

tag: computer vision

[1]	L. Zhou, H. Palangi, L. Zhang, H. Hu, J. J. Corso, and J. Gao. Unified vision-language pre-training for image captioning and VQA. In Proceedings of AAAI Conference on Artificial Intelligence, 2020. [ bib ]
[2]	B. Griffin, V. Florence, and J. J. Corso. Video object segmentation-based visual servo control and object depth estimation on a mobile robot. In Proceedings of IEEE Winter Conference on Applications of Computer Vision, 2020. [ bib ]
[3]	J. Y. Song, S. J. Lemmer, M. X. Liu, S. Yan, J. Kim, J. J. Corso, and W. S. Lasecki. Popup: Reconstructing 3D video using particle filtering to aggregate crowd responses. In Proceedings of ACM International Conference on Intelligent User Interfaces, 2019. [ bib \| http ]
[4]	H. Tang, X. Chen, W. Wang, D. Xu, J. J. Corso, N. Sebe, and Y. Yan. Attribute-guided sketch generation. In Proceedings of IEEE Conference on Automatic Face and Gesture Recognition, 2019. [ bib \| http ]
[5]	H. Tang, D. Xu, Y. Yan, Y. Wang, J. J. Corso, and N. Sebe. Multi-channel attention selection GAN with cascaded semantic guidance for cross-view image translation. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2019. [ bib \| .pdf ]
[6]	L. Zhou, Y. Kalantidis, X. Chen, J. J. Corso, and M. Rohrbach. Grounded video description. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2019. [ bib \| .pdf ]
[7]	B. Griffin and J. J. Corso. BubbleNets: Learning to select the guidance frame in video object segmentation by deep sorting frames. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2019. [ bib \| .pdf ]
[8]	B. Griffin and J. J. Corso. Tukey-inspired video object segmentation. In Proceedings of IEEE Winter Conference on Applications of Computer Vision, 2019. [ bib \| http ]
[9]	H. Huang, L. Zhou, W. Zhang, J. J. Corso, and C. Xu. Dynamic graph modules for modeling object-object interactions in activity recognition. In Proceedings of the British Machine Vision Conference, 2019. [ bib \| .pdf ]
[10]	K. Min and J. J. Corso. TASED-Net: Temporally-aggregating spatial encoder-decoder network for video saliency detection. In Proceedings of IEEE International Conference on Computer Vision, 2019. [ bib \| .pdf ]
[11]	S. Kumar, V. Dhiman, P. Koch, and J. J. Corso. Learning compositional sparse bimodal models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(5):1032--1044, 2018. [ bib \| DOI \| code ]
[12]	L. Zhou, Y. Zhou, J. J. Corso, R. Socher, and C. Xiong. End-to-end dense video captioning with masked transformer. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2018. [ bib \| code \| .pdf ]
[13]	X. Sun, R. Szeto, and J. J. Corso. A Temporally-Aware Interpolation Network for Video Frame Inpainting. In Proceedings of Asian Conference on Computer Vision (ACCV), 2018. [ bib \| code \| project \| http ]
[14]	L. Zhou, N. Louis, and J. J. Corso. Weakly-supervised video object grounding from text by loss weighting and object interaction. In Proceedings of British Machine Vision Conference, 2018. [ bib \| .pdf ]
[15]	M. R. Ganesh, E. Hofesmann, B. Min, N. Gafoor, and J. J. Corso. T-RECS: Training for rate-invariant embeddings by controlling speed for action recognition. Technical Report 1803.08094, arXiv, 2018. [ bib \| http ]
[16]	E. Hofesmann, M. R. Ganesh, and J. J. Corso. M-PACT: An open source platform for repeatable activity classification research. Technical Report 1804.05879, arXiv, 2018. [ bib \| code \| http ]
[17]	M. El Banani and J. J. Corso. Adviser networks: Learning what question to ask for human-in-the-loop viewpoint estimation. Technical Report 1802.01666, arXiv, 2018. [ bib \| code \| http ]
[18]	T. Han, H. Yao, C. Xu, X. Sun, Y. Zhang, and J. J. Corso. Dancelets mining for video recommendation based on dance styles. IEEE Transactions on Multimedia, 19(4), 2017. [ bib ]
[19]	C. Chen and J. J. Corso. Joint occlusion boundary detection and figure/ground assignment by extracting common-fate fragments in a back-projection scheme. Pattern Recognition, 64:15--28, 2017. [ bib ]
[20]	Y. Yan, C. Xu, D. Cai, and J. J. Corso. Weakly supervised actor-action segmentation via robust multi-task ranking. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2017. [ bib ]
[21]	R. Szeto and J. J. Corso. Click-here: Human-localized keypoints as guidance for viewpoint estimation. In Proceedings of IEEE International Conference on Computer Vision, 2017. [ bib \| poster \| code \| project \| data \| .pdf ]
[22]	L. Zhou, C. Xu, P. Koch, and J. J. Corso. Watch what you just said: Image captioning with text-conditional attention. In Proceedings of the Thematic Workshops of ACM Multimedia, 2017. [ bib ]
[23]	V. Dhiman, Q.-H. Tran, J. J. Corso, and M. Chandraker. A continuous occlusion model for road scene understanding. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016. [ bib ]
[24]	C. Xu and J. J. Corso. Actor-action semantic segmentation with grouping-process models. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016. [ bib \| data ]
[25]	C. Xu and J. J. Corso. LIBSVX: A supervoxel library and benchmark for early video processing. International Journal of Computer Vision, 119:272--290, 2016. [ bib ]
[26]	R. Xu, C. Xiong, W. Chen, and J. J. Corso. Jointly modeling deep video and compositional text to bridge vision and language in a unified framework. In Proceedings of AAAI Conference on Artificial Intelligence, 2015. [ bib \| .pdf ]
[27]	J. Lu, R. Xu, and J. J. Corso. Human action segmentation with hierarchical supervoxel consistency. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2015. [ bib \| .pdf ]
[28]	C. Xu, S.-H. Hsieh, C. Xiong, and J. J. Corso. Can humans fly? Action understanding with multiple classes of actors. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2015. [ bib \| poster \| data \| .pdf ]
[29]	W. Chen and J. J. Corso. Action detection by implicit intentional motion clustering. In Proceedings of IEEE International Conference on Computer Vision, 2015. [ bib \| poster \| .pdf ]
[30]	S. Oh, S. McCloskey, I. Kim, A. Vahdat, K. Cannons, H. Hajimirsadeghi, G. Mori, A. G. A. Perera, M. Pandey, and J. J. Corso. Multimedia event detection with multimodal feature fusion and temporal concept localization. Machine Vision and Applications, 25:49--69, 2014. [ bib \| http ]
[31]	P. Agarwal, S. Kumar, J. Ryde, J. J. Corso, and V. N. Krovi. Estimating dynamics on-the-fly using monocular video for vision-based robotics. IEEE/ASME Transactions on Mechatronics, 19(4):1412--1423, 2014. [ bib \| http ]
[32]	C. Xu, R. F. Doell, S. J. Hanson, C. Hanson, and J. J. Corso. A study of actor and action semantic retention in video supervoxel segmentation. International Journal of Semantic Computing, 2014. Selected as a Best Paper from ICSC; an earlier version appeared as arXiv:1311.3318. [ bib \| .pdf ]
[33]	W. Chen, C. Xiong, R. Xu, and J. J. Corso. Actionness ranking with lattice conditional ordinal random fields. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2014. [ bib \| poster \| code \| .pdf ]
[34]	S. Kumar, V. Dhiman, and J. J. Corso. Learning compositional sparse models of bimodal percepts. In Proceedings of AAAI Conference on Artificial Intelligence, 2014. [ bib \| code \| .pdf ]
[35]	A. Barbu, D. Barrett, W. Chen, N. Siddharth, C. Xiong, J. J. Corso, C. D. Fellbaum, C. Hanson, S. J. Hanson, S. Hélie, E. Malaia, B. A. Pearlmutter, J. M. Siskind, T. M. Talavage, and R. B. Wilbur. Seeing is worse than believing: Reading people's minds better than computer-vision methods recognize actions. In Proceedings of European Conference on Computer Vision, 2014. [ bib \| .pdf ]
[36]	J. J. Corso. Toward parts-based scene understanding with pixel-support parts-sparse pictorial structures. Pattern Recognition Letters: Special Issue on Scene Understanding and Behavior Analysis, 34(7):762--769, 2013. Early version appears as arXiv.org tech report 1108.4079v1. [ bib \| .pdf ]
[37]	Y. Miao and J. J. Corso. Hamiltonian streamline guided feature extraction with application to face detection. Journal of Neurocomputing, 120:226--234, 2013. Early version appears as arXiv.org tech report 1108.3525v1. [ bib \| http ]
[38]	P. Das, R. K. Srihari, and J. J. Corso. Translating related words to videos and back through latent topics. In Proceedings of Sixth ACM International Conference on Web Search and Data Mining, 2013. [ bib \| .pdf ]
[39]	J. A. Delmerico, D. Baran, P. David, J. Ryde, and J. J. Corso. Ascending stairway modeling from dense depth imagery for traversability analysis. In Proceedings of IEEE International Conference on Robotics and Automation, 2013. [ bib \| project \| .pdf ]
[40]	P. Das, C. Xu, R. F. Doell, and J. J. Corso. A thousand frames in just a few words: Lingual description of videos through latent topics and sparse object stitching. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2013. [ bib \| poster \| data \| .pdf ]
[41]	C. Xu, R. F. Doell, S. J. Hanson, C. Hanson, and J. J. Corso. Are actor and action semantics retained in video supervoxel segmentation? In Proceedings of IEEE International Conference on Semantic Computing, 2013. [ bib \| .pdf ]
[42]	C. Xu, S. Whitt, and J. J. Corso. Flattening supervoxel hierarchies by the uniform entropy slice. In Proceedings of the IEEE International Conference on Computer Vision, 2013. [ bib \| poster \| project \| video \| .pdf ]
[43]	J. A. Delmerico, P. David, and J. J. Corso. Building facade detection, segmentation, and parameter estimation for mobile robot stereo vision. Image and Vision Computing, 31(11):841--852, 2013. [ bib \| project \| data \| .pdf ]
[44]	C. Xu and J. J. Corso. Evaluation of super-voxel methods for early video processing. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2012. [ bib \| code \| project \| .pdf ]
[45]	S. Sadanand and J. J. Corso. Action bank: A high-level representation of activity in video. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2012. [ bib \| code \| project \| .pdf ]
[46]	P. Agarwal, S. Kumar, J. Ryde, J. J. Corso, and V. N. Krovi. Estimating human dynamics on-the-fly using monocular video for pose estimation. In Proceedings of Robotics Science and Systems, 2012. [ bib \| .pdf ]
[47]	R. Xu, P. Agarwal, S. Kumar, V. N. Krovi, and J. J. Corso. Combining skeletal pose with local motion for human activity recognition. In Proceedings of VII Conference on Articulated Motion and Deformable Objects, 2012. [ bib \| slides \| .pdf ]
[48]	M. A. Bustamante and J. J. Corso. Using probabilistic ontologies for video exploration. In Proceedings of the Eighteenth Americas Conference on Information Systems, 2012. [ bib ]
[49]	P. Agarwal, S. Kumar, J. Ryde, J. J. Corso, and V. N. Krovi. An optimization based framework for human pose estimation in monocular videos. In Proceedings of International Symposium on Visual Computing, 2012. [ bib \| .pdf ]
[50]	C. Xiong and J. J. Corso. Coaction discovery: Segmentation of common actions across multiple videos. In Proceedings of Multimedia Data Mining Workshop in Conjunction with the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (MDMKDD), 2012. [ bib \| .pdf ]
[51]	C. Xu, C. Xiong, and J. J. Corso. Streaming hierarchical video segmentation. In Proceedings of European Conference on Computer Vision, 2012. [ bib \| code \| project \| .pdf ]
[52]	J. A. Delmerico, J. J. Corso, D. Baran, P. David, and J. Ryde. Ascending stairway modeling: A first step toward autonomous multi-floor exploration. In Proceedings of IEEE/RSJ Intelligent Robots and Systems (Video Proceedings), 2012. [ bib \| project \| video ]
[53]	C. S. Lea and J. J. Corso. Efficient hierarchical Markov random fields for object detection on a mobile robot. Technical Report 1111.1599v1, arXiv, November 2011. [ bib ]
[54]	Y. Miao and J. J. Corso. Hamiltonian streamline guided feature extraction with applications to face detection. Technical Report 1108.3525v1, arXiv, August 2011. [ bib ]
[55]	A. Y. C. Chen and J. J. Corso. Temporally consistent multi-class video-object segmentation with the video graph-shifts algorithm. In Proceedings of the 2011 IEEE Workshop on Motion and Video Computing, 2011. [ bib \| code \| project \| .pdf ]
[56]	D. R. Schlegel, A. Y. C. Chen, C. Xiong, J. A. Delmerico, and J. J. Corso. AirTouch: Interacting with computer systems at a distance. In Proceedings of IEEE Winter Vision Meetings: Workshop on Applications of Computer Vision (WACV), 2011. [ bib \| .pdf ]
[57]	P. Agarwal, S. Kumar, J. J. Corso, and V. N. Krovi. Estimating dynamics on-the-fly using monocular video. In Proceedings of 4th Annual Dynamic Systems and Control Conference, 2011. [ bib \| .pdf ]
[58]	J. A. Delmerico, P. David, and J. J. Corso. Building facade detection, segmentation, and parameter estimation for mobile robot localization and guidance. In Proceedings of International Conference on Intelligent Robots and Systems, 2011. [ bib \| project \| data \| .pdf ]
[59]	A. Perera, S. Oh, M. Leotta, I. Kim, B. Byun, C.-H. Lee, S. McCloskey, J. Liu, B. Miller, Z. F. Huang, A. Vahdat, W. Yang, G. Mori, K. Tang, D. Koller, L. Fei-Fei, K. Li, G. Chen, J. J. Corso, Y. Fu, and R. K. Srihari. GENIE TRECVID2011 multimedia event detection: Late-fusion approaches to combine multiple audio-visual features. In NIST TRECVID Workshop, 2011. [ bib ]
[60]	A. Y. C. Chen and J. J. Corso. On the effects of normalization in adaptive MRF hierarchies. In Proceedings of CompImage '10---Computational Modeling of Objects Presented in Images, 2010. [ bib \| .pdf ]
[61]	M. R. Malgireddy, J. J. Corso, S. Setlur, V. Govindaraju, and D. Mandalapu. A framework for hand gesture recognition and spotting using sub-gesture modeling. In Proceedings of the 20th International Conference on Pattern Recognition, 2010. [ bib \| .pdf ]
[62]	J. A. Delmerico, J. J. Corso, and P. David. Boosting with stereo features for building facade detection on mobile platforms. In Proceedings of Western New York Image Processing Workshop, 2010. [ bib \| .pdf ]
[63]	A. Y. C. Chen and J. J. Corso. Propagating multi-class pixel labels throughout video frames. In Proceedings of Western New York Image Processing Workshop, 2010. [ bib \| .pdf ]
[64]	J. J. Corso and G. D. Hager. Image Description with Features that Summarize. Computer Vision and Image Understanding, 113:446--458, 2009. [ bib \| .pdf ]
[65]	T. J. Burns and J. J. Corso. Robust unsupervised segmentation of degraded document images with topic models. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2009. [ bib \| .pdf ]
[66]	R. Rodrigues, G. Schroeder, J. J. Corso, and V. Govindaraju. Unconstrained face recognition using MRF priors and manifold traversing. In Proceedings of IEEE International Conference on Biometrics: Theory, Applications, Systems, 2009. [ bib \| .pdf ]
[67]	I. Nwogu and J. J. Corso. Labeling irregular graphs with belief propagation. In Proceedings of International Workshop on Combinatorial Image Analysis, volume LNCS 4958, pages 295--305, 2008. [ bib \| .pdf ]
[68]	J. J. Corso, Z. Tu, and A. Yuille. MRF Labeling with a Graph-Shifts Algorithm. In Proceedings of International Workshop on Combinatorial Image Analysis, volume LNCS 4958, pages 172--184, 2008. [ bib \| .pdf ]
[69]	I. Nwogu and J. J. Corso. (BP)²: Beyond Pairwise Belief Propagation, Labeling by Approximating Kikuchi Free Energies. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2008. [ bib \| .pdf ]
[70]	J. J. Corso. Discriminative Modeling by Boosting on Multilevel Aggregates. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2008. [ bib \| .pdf ]
[71]	J. J. Corso, A. Yuille, and Z. Tu. Graph-Shifts: Natural Image Labeling by Dynamic Hierarchical Computing. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2008. [ bib \| code \| project \| .pdf ]
[72]	J. J. Corso, G. Ye, D. Burschka, and G. D. Hager. A Practical Paradigm and Platform for Video-Based Human-Computer Interaction. IEEE Computer, 42(5):48--55, 2008. [ bib \| .pdf ]
[73]	I. Nwogu, J. J. Corso, and T. Bittner. The design of an ontology-enhanced anatomy labeler. Technical Report 2008-09, University at Buffalo SUNY, 2008. [ bib \| .pdf ]
[74]	A. Y. C. Chen, J. J. Corso, and L. Wang. HOPS: Efficient region labeling using higher order proxy neighborhoods. In Proceedings of International Conference on Pattern Recognition, 2008. [ bib \| .pdf ]
[75]	J. Li, S. Tulyakov, F. Farooq, J. J. Corso, and V. Govindaraju. Integrating minutiae based fingerprint matching with local mutual information. In Proceedings of International Conference on Pattern Recognition, 2008. [ bib \| .pdf ]
[76]	D. Burschka, J. J. Corso, M. Dewan, W. Lau, M. Li, H. Lin, P. Marayong, N. Ramey, G. D. Hager, B. Hoffman, D. Larkin, and C. Hasser. Navigating Inner Space: 3-D Assistance for Minimally Invasive Surgery. Robotics and Autonomous Systems, 2005. [ bib ]
[77]	G. Ye, J. J. Corso, and G. D. Hager. Real-Time Vision for Human-Computer Interaction, chapter 7: Visual Modeling of Dynamic Gestures Using 3D Appearance and Motion Features, pages 103--120. Springer-Verlag, 2005. [ bib \| .pdf ]
[78]	D. Burschka, G. Ye, J. J. Corso, and G. D. Hager. A Practical Approach for Integrating Vision-Based Methods into Interactive 2D/3D Applications. Technical report, The Johns Hopkins University, 2005. CIRL Lab Technical Report CIRL-TR-05-01. [ bib \| .pdf ]
[79]	J. J. Corso, G. Ye, and G. D. Hager. Analysis of Composite Gestures with a Coherent Probabilistic Graphical Model. Virtual Reality, 8(4):242--252, 2005. [ bib \| .pdf ]
[80]	J. J. Corso. Techniques for Vision-Based Human-Computer Interaction. PhD thesis, The Johns Hopkins University, 2005. [ bib \| .pdf ]
[81]	J. J. Corso and G. D. Hager. Coherent Regions for Concise and Stable Image Description. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, volume 2, pages 184--190, 2005. [ bib \| .pdf ]
[82]	G. Ye, J. J. Corso, and G. D. Hager. Gesture Recognition Using 3D Appearance and Motion Features. In Proceedings of Workshop on Real-time Vision for Human-Computer Interaction (at CVPR 2004), 2004. [ bib \| .pdf ]
[83]	N. Ramey, J. J. Corso, W. W. Lau, D. Burschka, and G. D. Hager. Real Time 3D Surface Tracking and Its Applications. In Proceedings of Workshop on Real-time 3D Sensors and Their Use (at CVPR 2004), 2004. [ bib \| .pdf ]
[84]	J. J. Corso, M. Dewan, and G. D. Hager. Image Segmentation Through Energy Minimization Based Subspace Fusion. Technical Report CIRL-TR-04-01, The Johns Hopkins University, 2004. [ bib \| .pdf ]
[85]	G. Ye, J. J. Corso, D. Burschka, and G. D. Hager. VICs: A Modular HCI Framework Using Spatio-Temporal Dynamics. Machine Vision and Applications, 16(1):13--20, 2004. [ bib ]
[86]	J. J. Corso, M. Dewan, and G. D. Hager. Image Segmentation Through Energy Minimization Based Subspace Fusion. In Proceedings of 17th International Conference on Pattern Recognition (ICPR 2004), 2004. [ bib \| .pdf ]
[87]	J. J. Corso. Vision-Based Techniques for Dynamic, Collaborative Mixed-Realities. In B. J. Thompson, editor, Research Papers of the Link Foundation Fellows, volume 4. University of Rochester Press, 2004. Invited Report for Link Foundation Fellowship. [ bib ]
[88]	G. Ye, J. J. Corso, D. Burschka, and G. D. Hager. VICs: A Modular Vision-Based HCI Framework. In Proceedings of 3rd International Conference on Computer Vision Systems, pages 257--267, 2003. [ bib \| .pdf ]
[89]	J. J. Corso, D. Burschka, and G. D. Hager. Direct Plane Tracking in Stereo Images for Mobile Navigation. In Proceedings of International Conference on Robotics and Automation, 2003. [ bib \| .pdf ]
[90]	J. J. Corso, D. Burschka, and G. D. Hager. The 4DT: Unencumbered HCI With VICs. In Proceedings of CVPRHCI, 2003. [ bib \| .pdf ]
[91]	J. J. Corso, N. Ramey, and G. D. Hager. Stereo-Based Direct Surface Tracking with Deformable Parametric Models. Technical report, The Johns Hopkins University, 2003. CIRL Lab Technical Report 2003-02. [ bib \| .pdf ]
[92]	G. Ye, J. J. Corso, G. D. Hager, and A. M. Okamura. VisHap: Augmented Reality Combining Haptics and Vision. In Proceedings of IEEE International Conference on Systems, Man and Cybernetics, 2003. [ bib \| .pdf ]
[93]	J. J. Corso, G. Ye, D. Burschka, and G. D. Hager. Software Systems for Vision-Based Spatial Interaction. In Proceedings of 2002 Workshop on Intelligent Human Augmentation and Virtual Environments, pages D--26 and D--56, 2002. [ bib ]
[94]	J. J. Corso and J. D. Cohen. Out-Of-Core Voxelization of Large Scalar Fields for Interactive Multiresolution Volume Rendering. Technical report, The Johns Hopkins University, 2002. Graphics Lab Technical Report. [ bib ]
[95]	J. J. Corso and G. D. Hager. Planar Surface Tracking Using Direct Stereo. Technical report, The Johns Hopkins University, 2002. CIRL Lab Technical Report. [ bib \| .pdf ]
[96]	R. Szeto, X. Sun, K. Lu, and J. J. Corso. A Temporally-Aware Interpolation Network for Video Frame Inpainting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020 (to appear). [ bib ]
[97]	Y. Yan, C. Xu, D. Cai, and J. J. Corso. A weakly supervised multi-task ranking framework for actor-action semantic segmentation. International Journal of Computer Vision, 2019 (to appear). [ bib ]

Please report broken links to Prof. Corso at jjcorso@eecs.umich.edu.