Perceptual Science Series
From Co-Occurrence to Correspondence
Dr. Benjamin Taskar
Monday, November 02, 2009, 12:00pm - 07:00pm
University of Pennsylvania, Computer and Information Sciences Department
While supervised learning methods for classification and structured prediction are very effective in many domains, they require detailed and precise labeling of large amounts of data. Weakly or ambiguously labeled data present major challenges as well as opportunities. For example, to build a machine translation system, we typically have large amounts of translated sentences to learn from, but without word or phrase level correspondence. Copious images and videos on the web or your hard drive are typically labeled with captions of who and what is in the picture, but not where and when. The challenges are both theoretical and algorithmic: under what assumptions can we guarantee effective and efficient learning of precise correspondence from pure co-occurrence? I will describe our ongoing work on weakly supervised learning approaches for machine translation and parsing of images, videos and text.