#  Spring EconCS 2025 Seminars 

 
#### Date and Time

 **February 7, 2025** 

 01:30PM - 02:30PM EST 

#### Location

 **SEC LL 2.221**  


**Time & Location**: Friday, February 7, 1:30pm - 2:30pm at SEC LL 2.221

**Speaker 1:** André Cruz (PhD student at MPI, currently visiting Rediet Abebe at Harvard)

**Title:** Evaluating language models as risk scorers

**Abstract:** Current LLM benchmarks predominantly focus on accuracy in realizable (factual) tasks. Such benchmarks necessarily fail to evaluate LLMs’ ability to quantify ground-truth outcome uncertainty. In this work, we leverage US Census data to evaluate LLMs’ ability to generate meaningful real-world distributions. We introduce folktexts, a Python package to standardize the evaluation of uncertainty, calibration, and fairness of LLMs on real-world tabular data tasks. We find that predictive risk scores produced by state-of-the-art LLMs have high predictive signal but are wildly miscalibrated. Our evaluation reveals a general inability of instruction-tuned LLMs to express data uncertainty in multiple-choice Q&A, exhibiting strong over-confidence bias across a variety of benchmark tasks. These differences in ability to quantify data uncertainty cannot be revealed in realizable settings, and highlight a blind spot in the current evaluation ecosystem that folktexts covers.
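To illustrate the distinction the abstract draws between predictive signal and calibration, here is a minimal sketch on synthetic data. It does not use folktexts or Census data; the helper names (`auc`, `expected_calibration_error`, `sharpen`) and the choice of a 10-bin calibration error are assumptions made for illustration only. The sketch shows that an over-confident but monotone distortion of well-calibrated risk scores leaves the ranking quality (AUC) essentially unchanged while the calibration error grows.

```python
import random


def auc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def expected_calibration_error(scores, labels, n_bins=10):
    """Bin-weighted average gap between mean score and observed positive rate."""
    bins = [[] for _ in range(n_bins)]
    for s, y in zip(scores, labels):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        bins[idx].append((s, y))
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_score = sum(s for s, _ in b) / len(b)
        pos_rate = sum(y for _, y in b) / len(b)
        ece += len(b) / len(scores) * abs(mean_score - pos_rate)
    return ece


def sharpen(p, k=4.0):
    """Hypothetical over-confidence model: a monotone map pushing p toward 0 or 1."""
    return p**k / (p**k + (1.0 - p) ** k)


rng = random.Random(0)
true_risk = [rng.random() for _ in range(2000)]            # calibrated by construction
labels = [1 if rng.random() < p else 0 for p in true_risk]  # outcomes drawn from the risk
overconfident = [sharpen(p) for p in true_risk]             # same ranking, distorted scale

auc_cal, auc_over = auc(true_risk, labels), auc(overconfident, labels)
ece_cal = expected_calibration_error(true_risk, labels)
ece_over = expected_calibration_error(overconfident, labels)

print(f"AUC: calibrated={auc_cal:.3f}  over-confident={auc_over:.3f}")
print(f"ECE: calibrated={ece_cal:.3f}  over-confident={ece_over:.3f}")
```

Because `sharpen` preserves the ordering of scores (up to floating-point ties), an accuracy- or ranking-based benchmark would not tell the two score sets apart; only a calibration metric does.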

**Speaker 2:** Ben Schiffer (PhD student at Harvard Statistics Department)

**Title:** Clone-Robust AI Alignment

**Abstract:** A key challenge in training Large Language Models (LLMs) is properly aligning them with human preferences. Reinforcement Learning with Human Feedback (RLHF) uses pairwise comparisons from human annotators to train reward functions and has emerged as a popular alignment method. However, input datasets in RLHF can be unbalanced due to adversarial manipulation or inadvertent repetition. Therefore, we want RLHF algorithms to perform well even when the set of alternatives is not uniformly distributed. Drawing on insights from social choice theory, we introduce robustness to approximate clones, a desirable property of RLHF algorithms which requires that adversarially adding near-duplicate alternatives does not significantly change the learned reward function. We first demonstrate that the standard RLHF algorithm based on regularized maximum likelihood estimation (MLE) fails to satisfy this property. We then propose the weighted MLE, a new RLHF algorithm that modifies the standard regularized MLE by weighting alternatives based on their similarity to other alternatives. This new algorithm guarantees robustness to approximate clones while preserving desirable theoretical properties. Joint work with Ariel Procaccia and Shirley Zhang.

As a reminder you can find the up-to-date schedule for the upcoming talks [at this google sheet](https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.google.com_spreadsheets_d_1wPTWoLfoMrAiceoh4jUoqQg9iu0KiVVwCGqproWbiUo_edit-3Fgid-3D711432702-23gid-3D711432702&d=DwMFaQ&c=WO-RGvefibhHBZq3fL85hQ&r=ZOP6tLIqLOHbdgCvrXjUlPta0tw7K_-ivqiItQhh6LQ&m=3pvvDy9lA_p1WUYlBr54Tmoyk5-WrgF50MArdmLfgIwT3CwnpCdbT8-ZMUXTMa5A&s=dSfMiNhuVZIASRKebntspWzGoEFGSV3zc6msCNojJNw&e=) (requires Harvard login) or on the [BEACH](https://urldefense.proofpoint.com/v2/url?u=https-3A__www.boston-2Dec-2Dhub.com_&d=DwMFaQ&c=WO-RGvefibhHBZq3fL85hQ&r=ZOP6tLIqLOHbdgCvrXjUlPta0tw7K_-ivqiItQhh6LQ&m=3pvvDy9lA_p1WUYlBr54Tmoyk5-WrgF50MArdmLfgIwT3CwnpCdbT8-ZMUXTMa5A&s=RzPtB-nCnr9c1J0BDsRlrBwD00H6EfGZkuUTV_9zU_0&e=) calendar (publicly available).


See also:

- [EconCS Seminars](/taxonomycalendarseminar/seminars)
- [Spring EconCS 2025 Seminars](/taxonomycalendarseminar/spring-econcs-2025-seminars)
- [Seminar history](/taxonomycalendarseminar/seminar-history)
 
 