Mechanism Design with Bandit Feedback.

We study a multi-round welfare-maximising mechanism design problem in instances where agents do not know their values. On each round, a mechanism assigns an allocation to each agent in a set of agents and charges each of them a price; the agents then provide (stochastic) feedback to the mechanism for the allocation they received. This is motivated by applications in cloud markets and online advertising, where an agent may know her value for an allocation only after experiencing it. The mechanism therefore needs to explore different allocations for each agent while simultaneously attempting to find the socially optimal set of allocations. Our focus is on truthful and individually rational mechanisms which imitate the classical VCG mechanism in the long run. To that end, we define three notions of regret: for the welfare, for the individual utilities of each agent, and for the utility of the mechanism. We show that these three terms are interdependent via an $\Omega(T^{\frac{2}{3}})$ lower bound on the maximum of the three after $T$ rounds of allocations, and describe a family of anytime algorithms which achieve this rate. Our framework provides flexibility to control the pricing scheme so as to trade off between the agent and seller regrets, and additionally to control the degree of truthfulness and individual rationality.
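To make the setting concrete, the following is a minimal sketch of the classical VCG rule applied to empirical mean values estimated from noisy feedback. It is not the paper's algorithm: the function names `empirical_means` and `vcg`, the finite-outcome representation, and the use of plain sample averages are all assumptions made for illustration.

```python
from typing import List, Tuple


def empirical_means(feedback: List[List[List[float]]]) -> List[List[float]]:
    """Average the noisy observations collected so far.

    feedback[i][s] is the list of feedback values agent i reported
    for outcome s (hypothetical data layout; each list must be non-empty).
    """
    return [[sum(obs) / len(obs) for obs in agent] for agent in feedback]


def vcg(values: List[List[float]]) -> Tuple[int, List[float]]:
    """Classical VCG on a finite outcome set.

    values[i][s] is the (estimated) value of agent i for outcome s.
    Returns the welfare-maximising outcome and each agent's payment.
    """
    n_agents = len(values)
    n_outcomes = len(values[0])

    def welfare(s: int, skip: int = -1) -> float:
        # Total value of outcome s, optionally leaving one agent out.
        return sum(values[i][s] for i in range(n_agents) if i != skip)

    best = max(range(n_outcomes), key=welfare)

    payments = []
    for i in range(n_agents):
        # Agent i pays the externality it imposes on the other agents:
        # their best achievable welfare without i, minus their welfare
        # under the chosen outcome.
        best_without_i = max(welfare(s, skip=i) for s in range(n_outcomes))
        payments.append(best_without_i - welfare(best, skip=i))
    return best, payments


# Two agents, one item. Outcome 0 gives the item to agent 0,
# outcome 1 gives it to agent 1.
feedback = [
    [[4.0, 6.0], [0.0]],   # agent 0: noisy values for outcomes 0 and 1
    [[0.0], [2.0, 4.0]],   # agent 1: noisy values for outcomes 0 and 1
]
outcome, payments = vcg(empirical_means(feedback))
```

In this single-item example the rule reduces to a second-price auction: the agent with the higher estimated value receives the item and pays the other agent's estimated value. A learning mechanism would additionally need to decide on which rounds to explore and how to price under estimation error, which this sketch leaves out.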

Publication Info

Year: 2020
Published: —
Language: en

Preprint Details

Link Of The Paper: https://arxiv.org/pdf/2004.08924.pdf

Timeline

Created: June 19, 2026

Related publications

VCG Mechanism Design with Unknown Agent Values under Stochastic Bandit Feedback

Learning Competitive Equilibria in Exchange Economies with Bandit Feedback

Truthful mobile crowd sensing with interdependent valuations

Online Learning of Competitive Equilibria in Exchange Economies.

Fair and Efficient Memory Sharing: Confronting Free Riders