Designing convex surrogates for discrete prediction tasks via embeddings

Date: 

Friday, September 23, 2022, 1:00pm to 2:30pm

Location: 

SEC 1.413

We will have our long-awaited student back-to-back talk after two years. It will be in-person and each talk takes 25 minutes including the Q&A. For this week we have:

Speaker: Jessie Finocchiaro, Harvard

Title: Designing convex surrogates for discrete prediction tasks via embeddings

Abstract: We formalize and study the natural approach of designing convex surrogate loss functions via embeddings, for problems such as classification, ranking, or structured prediction. In this approach, one embeds each of the finitely many predictions (e.g. rankings) as a point in Rd, assigns the original loss values to these points, and “convexifies” the loss in some way to obtain a surrogate. We establish a strong connection between this approach and polyhedral (piecewise-linear convex) surrogate losses: every discrete loss is embedded by some polyhedral loss, and every polyhedral loss embeds some discrete loss. Our results are constructive, as we illustrate with some examples such as top-k classification and structured prediction tasks.

Speaker: Manuel Wüthrich, Harvard

Title: Representation with Incomplete Votes

Abstract: Platforms for online civic participation rely on methods for condensing thousands of comments into a relevant handful based on whether participants agree or disagree with them. We argue that these methods should guarantee fair representation of the participants, as their outcomes may affect the health of the conversation and inform impactful downstream decisions. To that end, we draw on the literature on approval-based committee elections. Our setting is novel in that the approval votes are incomplete since participants will typically not vote on all comments. We prove that this complication renders non-adaptive algorithms impractical in terms of the amount of information they must gather. Therefore, we develop an adaptive algorithm that uses information more efficiently by presenting incoming participants with statements that appear promising based on votes by previous participants. We prove that this method satisfies commonly used notions of fair representation, even when participants only vote on a small fraction of comments. Finally, an empirical evaluation on real data shows that the proposed algorithm provides representative outcomes in practice.

As a reminder, we will have talks every Friday, starting sharply at 1:05pm. Most talks this semester (including this week's) are in-person with an option to Zoom in, though a few will be virtual only. The permanent Zoom link for this academic year is below.

The room is on the ground level at the NW corner of the Harvard Science and Engineering Complex, which is open to the public: