Index-based policies for discounted multi-armed bandits on parallel machines

Wilkinson DJ; Glazebrook KD

Lookup NU author(s): Professor Kevin Glazebrook, Professor Darren Wilkinson

Downloads

Full text for this publication is not currently held within this repository. Alternative links are provided below where available.

Abstract

We utilize and develop elements of the recent achievable region account of Gittins indexation by Bertsimas and Niño-Mora to design index-based policies for discounted multi-armed bandits on parallel machines. The policies analyzed have expected rewards which come within an O(α) quantity of optimality, where α > 0 is a discount rate. In the main, the policies make an initial once-for-all allocation of bandits to machines, with each machine then handling its own workload optimally. This allocation must take careful account of the index structure of the bandits. The corresponding limit policies are average-overtaking optimal.

Publication metadata

Author(s): Wilkinson DJ; Glazebrook KD

Publication type: Article

Publication status: Published

Journal: Annals of Applied Probability

Year: 2000

Volume: 10

Issue: 3

Pages: 877-896

ISSN (print): 1050-5164

ISSN (electronic):

Publisher: Institute of Mathematical Statistics

URL: http://www.jstor.org/stable/2667323
