0 votes • 0 answers • 128 views

Interpretation of convexity lemma

Best-of-All-Worlds Bounds for Online Learning with Feedback Graphs

updated 14 months ago by Admin User 1 • written 14 months ago by Dustin

0 votes • 0 answers • 108 views

Generalizing attention length beyond training data length

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

updated 14 months ago by Admin User 1 • written 14 months ago by Dustin

Recent Awards

Invitee to Yuval Filmus ▴ 15
Founding Contributor to FrankS ▴ 55
Invitee to marco.fellous-asiani ▴ 15
Founding Contributor to Ken ▴ 45

Recent Replies

Answer: Implementing an algorithm by Andy 46

Use a distribution that's symmetric around the origin and normalize the results so they lie on the sphere. E.g. you can use a Gaussian. He…

Answer: Application of Property RD by Dustin 125

It is used to prove the bound in proposition 4.6 which is then used crucially in proposition 4.7 to prove the uniform boundedness of the op…
