Learning From People

THIS REPORT HAS BEEN WITHDRAWN

Nihar Shah

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2017-130
July 23, 2017

Learning from people represents a new and expanding frontier for data science. Crowdsourcing, where data is collected from non-experts online, is now extensively employed in academic research, in industry, and for many societal causes. Two critical challenges in crowdsourcing and learning from people are (i) developing algorithms for maximally accurate learning and estimation that operate under minimal modeling assumptions, and (ii) designing incentive mechanisms to elicit high-quality data from people. In this thesis, we address these fundamental challenges in the context of several canonical problem settings that arise in learning from people.

For the challenge of estimation, various algorithms have been proposed in the literature, but their reliance on strong parameter-based assumptions is severely limiting. In this thesis, we introduce a class of "permutation-based" models that are considerably richer than classical parameter-based models. We present estimation algorithms that we show are both statistically optimal and significantly more robust than prior state-of-the-art methods. We also prove that our estimators automatically adapt and are simultaneously optimal over the classical parameter-based models, thereby enjoying a surprising win-win in the statistical bias-variance tradeoff.

As for the second challenge of incentivizing people, we design a class of payment mechanisms that take a "multiplicative" form. For several common interfaces in crowdsourcing, we show that these multiplicative mechanisms are, surprisingly, the only mechanisms that can guarantee honest responses and satisfy a mild and natural requirement that we call no-free-lunch. We show that our mechanisms have several additional desirable qualities, and their simplicity gives them further practical appeal.

Advisors: Kannan Ramchandran and Martin Wainwright

Author Comments: see http://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-133.html