From: GzLi (笑梨), Board: DataMining
Subject: Machine Learning 47(2/3) <2>
Posted at: Nanjing University Lily BBS (Thu Jul 18 00:45:08 2002), on-site post
Title: Finite-time Analysis of the Multiarmed Bandit Problem
Journal: Machine Learning
ISSN: 0885-6125
Volume/Issue: Vol. 47, No. 2/3    Publication date: May/June 2002
Pages: 235-256 (22 pages)
Authors:
Auer, Peter    University of Technology Graz, A-8010 Graz, Austria. pauer@igi.tu-graz.ac.at
Cesa-Bianchi, Nicolò    DTI, University of Milan, via Bramante 65, I-26013 Crema, Italy. cesa-bianchi@dti.unimi.it
Fischer, Paul    Lehrstuhl Informatik II, Universität Dortmund, D-44221 Dortmund, Germany. fischer@ls2.informatik.uni-dortmund.de
 
 
Abstract:
Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to find profitable actions while taking the empirically best action as often as possible. A popular measure of a policy's success in addressing this dilemma is the regret, that is, the loss due to the fact that the globally optimal policy is not followed all the time. One of the simplest examples of the exploration/exploitation dilemma is the multi-armed bandit problem. Lai and Robbins were the first to show that the regret for this problem has to grow at least logarithmically in the number of plays. Since then, policies which asymptotically achieve this regret have been devised by Lai and Robbins and many others. In this work we show that the optimal logarithmic regret is also achievable uniformly over time, with simple and efficient policies, and for all reward distributions with bounded support.
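
The "simple and efficient policies" mentioned in the abstract are upper-confidence-bound (UCB) index rules; the regret after n plays is mu*·n minus the expected total reward collected, where mu* is the mean reward of the best arm. Below is a minimal Python sketch of a UCB1-style rule (empirical mean plus confidence radius sqrt(2 ln n / n_i)), assuming rewards bounded in [0, 1]. The Bernoulli arms, function names, and horizon are hypothetical illustrations, not taken from the paper.

import math
import random

def ucb1(pull, n_arms, horizon):
    # Minimal UCB1-style sketch: rewards are assumed to lie in [0, 1].
    # pull(i) returns one random reward from arm i.
    counts = [0] * n_arms    # number of times each arm has been played
    means = [0.0] * n_arms   # empirical mean reward of each arm
    total = 0.0
    # Initialization: play every arm once.
    for i in range(n_arms):
        r = pull(i)
        counts[i], means[i] = 1, r
        total += r
    # Afterwards, always play the arm with the largest upper confidence index.
    for n in range(n_arms, horizon):
        index = [means[i] + math.sqrt(2.0 * math.log(n) / counts[i])
                 for i in range(n_arms)]
        i = max(range(n_arms), key=index.__getitem__)
        r = pull(i)
        counts[i] += 1
        means[i] += (r - means[i]) / counts[i]   # incremental mean update
        total += r
    return total / horizon

# Hypothetical usage: three Bernoulli arms with unknown success probabilities.
probs = [0.3, 0.5, 0.7]
avg = ucb1(lambda i: 1.0 if random.random() < probs[i] else 0.0,
           n_arms=len(probs), horizon=10000)
print("average reward over 10000 plays:", avg)

Over a long horizon a rule of this kind tends to concentrate its plays on the best arm (here the 0.7 arm), which is the behaviour the logarithmic regret bound formalises.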
 


--
              ***  Be dignified and steady, humble and tolerant; see things through, and keep others' welfare at heart  ***
Have you mined today? DataMining  http://DataMining.bbs.lilybbs.net
MathTools  http://bbs.sjtu.edu.cn/cgi-bin/bbsdoc?board=MathTools

※ Modified: by GzLi on Jul 18 00:46:50. [FROM: 211.80.38.29]
※ Source: Nanjing University Lily BBS bbs.nju.edu.cn [FROM: 211.80.38.29]
