TennisVid2Text: Fine-grained Descriptions for Domain Specific Videos

Mohak Sukhwani and CV Jawahar

Abstract

Automatically describing videos has long been a fascinating problem. In this work, we attempt to describe videos from a specific domain: broadcast videos of lawn tennis matches. Given a video shot from a tennis match, we intend to generate a textual commentary similar to what a human expert would write on a sports website. Unlike many recent works that focus on generating short captions, we are interested in generating semantically richer descriptions. This demands a detailed low-level analysis of the video content, especially the actions and interactions among subjects. We address this by limiting our domain to the game of lawn tennis. We generate rich descriptions by leveraging a large corpus of human-created descriptions harvested from the internet. We evaluate our method on a newly created tennis video data set. Extensive analyses demonstrate that our approach addresses both the semantic correctness and the readability aspects of the task. It also outperforms competing baselines.

Session

Poster 2

Files

Extended Abstract (PDF, 430K)
Paper (PDF, 1292K)
Supplemental Materials (ZIP, 660K)

DOI

10.5244/C.29.117
https://dx.doi.org/10.5244/C.29.117

Citation

Mohak Sukhwani and CV Jawahar. TennisVid2Text: Fine-grained Descriptions for Domain Specific Videos. In Xianghua Xie, Mark W. Jones, and Gary K. L. Tam, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 117.1-117.12. BMVA Press, September 2015.

Bibtex

@inproceedings{BMVC2015_117,
	title={TennisVid2Text: Fine-grained Descriptions for Domain Specific Videos},
	author={Mohak Sukhwani and CV Jawahar},
	year={2015},
	month={September},
	pages={117.1-117.12},
	articleno={117},
	numpages={12},
	booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
	publisher={BMVA Press},
	editor={Xianghua Xie and Mark W. Jones and Gary K. L. Tam},
	doi={10.5244/C.29.117},
	isbn={1-901725-53-7},
	url={https://dx.doi.org/10.5244/C.29.117}
}