Lê Nguyên Hoang: Created page with "Alignment is the problem of designing the goal of a maximization algorithm, so that the goal reflects what we really want to optimize. It has been argued that the aggregatio..."

2020-01-20T21:32:38Z

Alignment is the problem of designing the goal of a maximization algorithm, so that the goal reflects what we really want to optimize. It has been argued that the [[aggregation|social choice]] [https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/17052/15857 NGAD+][https://dblp.org/rec/bibtex/conf/aaai/NoothigattuGADR18 18] of humans' [[volition]] [https://intelligence.org/files/CEV.pdf Yudkowsky][https://scholar.google.ch/scholar?hl=en&as_sdt=0%2C5&q=coherent+extrapolated+volition+yudkowsky&btnG= 04] could be the best approach to alignment [https://laboutique.edpsciences.fr/produit/1107/9782759824304/Le%20fabuleux%20chantier HoangElmhamdi][https://scholar.google.ch/scholar?hl=en&as_sdt=0%2C5&q=Le+fabuleux+chantier%3A+Rendre+l%27intelligence+artificielle+robustement+b%C3%A9n%C3%A9fique&btnG= 19<sup>FR</sup>].

== Framing other approaches as alignment ==

Cite Hadfield. Interruptibility.

== Pitfalls ==

Alignment is widely recognized as a difficult problem. Take the case of companies, for example.

In fact, alignment is so hard that even in the very restricted yet widely studied framework of supervised learning, there is still no good theory of what function should be optimized. Indeed, because of [[overfitting]], it has become common to add regularization terms to the training objective, but their usefulness has lately been questioned by the discovery of [[double descent]]. From a [[Bayesian]] viewpoint, this all boils down to noting that learning is not naturally described by an optimization framework, and that it must invoke some prior.
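One standard way to see how a prior enters is to compare a regularized training objective with maximum a posteriori (MAP) estimation. The derivation below is only a sketch under assumptions that are not part of the original text: the data <code>D = {(x_i, y_i)}</code> are independent given the parameters, the loss, the regularizer and the weight <code>λ</code> are generic placeholders, and the Gaussian prior is just one illustrative choice.

```latex
\begin{align}
\hat{\theta}_{\mathrm{reg}}
  &= \arg\min_{\theta} \; \sum_{i=1}^{n} \ell\big(f_\theta(x_i), y_i\big) + \lambda\, \Omega(\theta), \\
\hat{\theta}_{\mathrm{MAP}}
  &= \arg\max_{\theta} \; p(\theta \mid D)
   = \arg\min_{\theta} \; \Big[ -\sum_{i=1}^{n} \log p(y_i \mid x_i, \theta) \; - \; \log p(\theta) \Big].
\end{align}
% The two coincide when
%   \ell(f_\theta(x), y) = -\log p(y \mid x, \theta)   and   \lambda\,\Omega(\theta) = -\log p(\theta) + \text{const}.
% Example (assumed prior): \theta \sim \mathcal{N}(0, \sigma^2 I) gives
%   -\log p(\theta) = \tfrac{1}{2\sigma^2}\lVert\theta\rVert_2^2 + \text{const},
% i.e. an L2 penalty with \lambda = 1/(2\sigma^2) and \Omega(\theta) = \lVert\theta\rVert_2^2.
```

Under this reading, choosing a regularizer amounts to choosing a prior, so the question of which function to optimize cannot be settled by the data alone. A fully Bayesian treatment would go further and keep the whole posterior rather than a single maximizer.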

[[Goodhart's law]] captures this difficulty: when a measure becomes a target, it ceases to be a good measure.
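Goodhart's law can be illustrated with a toy sketch in which an optimizer maximizes a measurable proxy instead of the goal we actually care about. Everything here is hypothetical and chosen only for illustration: the functions <code>true_utility</code> and <code>proxy</code>, their formulas, and the candidate grid do not come from any cited work.

```python
# Toy illustration of Goodhart's law (all functions and numbers are assumptions).

def true_utility(x: float) -> float:
    """Hypothetical real goal: benefits saturate, then turn harmful."""
    return x - 0.5 * x ** 2


def proxy(x: float) -> float:
    """Hypothetical measurable proxy: tracks the real goal only for small x."""
    return x


def optimize(objective, candidates):
    """Return the candidate that maximizes the given objective."""
    return max(candidates, key=objective)


# Candidate actions 0.0, 0.1, ..., 3.0.
candidates = [i / 10 for i in range(0, 31)]

x_proxy = optimize(proxy, candidates)        # what a proxy-maximizer picks
x_true = optimize(true_utility, candidates)  # what we would actually want

print("proxy-optimal action:", x_proxy, "true utility:", true_utility(x_proxy))
print("truly optimal action:", x_true, "true utility:", true_utility(x_true))
```

In this sketch the proxy and the true goal agree near zero, but pushing the proxy as far as possible selects an action whose true utility is lower than that of the action we would have wanted.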

[https://arxiv.org/pdf/1901.00064.pdf Eckersley][https://dblp.org/rec/bibtex/conf/aaai/Eckersley19 18] exploits the repugnant conclusion and variants to argue against implementing a utility function.
