Best-of-All-Worlds Bounds for Online Learning with Feedback Graphs

Liad Erez, Tomer Koren

Abstract: We study the online learning with feedback graphs framework introduced by Mannor and Shamir (2011), in which the feedback received by the online learner is specified by a graph $G$ over the available actions. We develop an algorithm that simultaneously achieves regret bounds of the form: $\smash{\mathcal{O}(\sqrt{\theta(G) T})}$ with adversarial losses; $\mathcal{O}(\theta(G)\operatorname{polylog}{T})$ with stochastic losses; and $\mathcal{O}(\theta(G)\operatorname{polylog}{T} + \smash{\sqrt{\theta(G) C ...

Interpretation of convexity lemma

Entering edit mode

4.0 years ago

Dustin 125

In order to define the Tsallis-Shannon entropy which they will use as a regularizer the authors define what they call the Tsallis-perspective (H) of a convex function h. The issue is that H is not always convex. However, given a certain condition on h given in Lemma 1 in section 3.1 we get a lower bound on the hessian of H which implies that H is convex.

What is a geometric interpretation of the inequality in Lemma 1?

Alternatively is there a natural calculation in which the left hand side of the inequality shows up?

entropy • 1.1k views

ADD COMMENT

last updated by Admin User 1 • posted by Dustin

We appreciate the help!

Best-of-All-Worlds Bounds for Online Learning with Feedback Graphs