Welcome to the [[Robustly beneficial|Robustly Beneficial]] wiki!
  
This wiki aims to help readers grasp the scope and the limits of current AI ethics research. It lists references, key ideas and relevant open questions to make algorithms robustly beneficial. Please also check our [https://www.youtube.com/watch?v=WWbw4cla2jw&list=PLgqL_7nXb23FKk_rUfs7vnvyrPshYPfA8 Robustly Beneficial Podcast] ([https://podcasts.apple.com/fr/podcast/robustly-beneficial-podcast/id1496159681 iTunes], [https://playlists.podmytube.com/UCgl_MmjatQif8juz3Lt6iPw/PLgqL_7nXb23FKk_rUfs7vnvyrPshYPfA8.xml RSS]), our [https://www.youtube.com/playlist?list=PLgqL_7nXb23HvhToBb9FwFxj83navY6oq&playnext=1&index=1 Robustly Beneficial Talks] and our [https://twitter.com/robustlyb Twitter account].
  
 
The wiki has just been launched, so most pages are still being written. But they will never be finished — this is the whole point of a wiki!
 
== The structure of the wiki ==

The wiki can be roughly divided into four main categories.
 
=== Why AI ethics is becoming critical ===
 
  
If you are new to AI ethics, you should probably start with the [[AI risks]] page. You could then go into arguably today's most important case of AI ethics, namely [[YouTube]]. Note that algorithms also offer formidable [[AI opportunities]] that are definitely worth considering. Find out more by reading about [[online polarization]], [[misinformation]], [[addiction]], [[mental health]] or [[hate]]. And as an example of an urgent AI ethics dilemma, check [https://twitter.com/le_science4all/status/1227690739104174080 this Twitter thread] on the responses to an "is climate change a hoax?" search.
  
 
And if you know little about the current state of algorithmic research, you might want to check the latest [[impressive advances in AI]]. Or you could check some [[funny applications of AI]]. You can also read Lê's [https://www.lesswrong.com/posts/bwqDrSZvhEDKxRf6z/a-rant-against-robots rant against robots].
 
 
=== How today's (and probably tomorrow's) AIs work ===
 
  
The most important principle of today's AI is surely [[machine learning]]. Today, it mostly relies on [[stochastic gradient descent]] for (deep) [[neural networks]], which allow [[representational learning]] (see [[convolutional neural network]], [[residual network]], [[transformer]]). See also [[Turing 1950]], [[convexity]], [[generative adversarial network]], [[specialized hardware]] and [[linear systems]].
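
To make the idea concrete, here is a minimal sketch of stochastic gradient descent on a one-parameter linear model, written in plain Python/NumPy with made-up data; it illustrates the general principle rather than any particular system's training code.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: y = 3x + noise. SGD should recover the weight 3.0.
x = rng.normal(size=1000)
y = 3.0 * x + 0.1 * rng.normal(size=1000)

w = 0.0    # initial weight
lr = 0.01  # learning rate

for step in range(5000):
    i = rng.integers(len(x))               # pick one example at random
    grad = 2.0 * (w * x[i] - y[i]) * x[i]  # gradient of its squared error
    w -= lr * grad                         # take a small step downhill

print(w)  # close to 3.0
</syntaxhighlight>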
  
[[Bayesianism]] has been argued to be the ideal form of supervised and unsupervised learning, assuming infinite computational power (see [[Solomonoff's demon]], [[Laplace 1814]]). It has numerous desirable properties, such as [[statistical admissibility]], [[Bayesian agreement]] and the [[Bayesian brain]] hypothesis. See also [[Bayesian examination]] and [[conjugate priors]].
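
As a small illustration of why conjugate priors are convenient, here is a generic beta-binomial update in Python, with made-up counts; it is a textbook sketch, not code from any referenced work.

<syntaxhighlight lang="python">
# Conjugate-prior update: Beta prior + Bernoulli likelihood.
# With prior Beta(a, b) and k successes out of n trials, the posterior
# is Beta(a + k, b + n - k) -- the update is exact and instantaneous.

a, b = 1.0, 1.0   # uniform prior over an unknown coin bias
k, n = 7, 10      # made-up observations: 7 heads in 10 flips

a_post, b_post = a + k, b + (n - k)
print(a_post / (a_post + b_post))  # posterior mean: 8/12, about 0.667
</syntaxhighlight>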
  
 
A branch of learning called [[reinforcement learning]], which relies on [[Q-learning]] or [[policy learning]], seems likely to become the core framework of today's and tomorrow's AIs. [[AIXI]] achieves the upper bound of [[Legg-Hutter intelligence]], which aims to measure [[artificial general intelligence|general intelligence]].
 
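To illustrate the core of Q-learning, here is a tabular sketch on a made-up four-state chain environment; everything in it (the environment, rewards and hyperparameters) is invented for illustration.

<syntaxhighlight lang="python">
import numpy as np

# Tabular Q-learning on a made-up 4-state chain: moving right from
# state 2 reaches the goal state 3, which pays a reward of 1.
n_states, n_actions = 4, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1  # learning rate, discount, exploration
rng = np.random.default_rng(0)

def step(s, a):
    s2 = min(s + 1, 3) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == 3 else 0.0)

for episode in range(500):
    s = 0
    while s != 3:
        # Epsilon-greedy: mostly exploit, sometimes explore.
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
        s2, r = step(s, a)
        # Core update: nudge Q(s,a) toward r + gamma * max_a' Q(s',a').
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

print(Q)  # the "go right" column dominates in every state
</syntaxhighlight>
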
To understand the gap between [[Bayesianism]]/[[AIXI]] and practical [[machine learning]], we need to understand the constraints of [[computational complexity]] theory. Building upon the [[Church-Turing thesis]], [[Kolmogorov-Solomonoff complexity]] and knowledge from [[human brain]] computations yields some insights into [[human-level AI]], in addition to experts' [[AI predictions]]. See also [[entropy]] and [[sophistication]].

AIs are already doing [[distributed learning]], which raises numerous challenges, like [[Byzantine fault tolerance]] and [[model drift]].
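
To build intuition for why Byzantine fault tolerance matters in distributed learning, here is a toy Python sketch contrasting the mean with the coordinate-wise median, one simple robust aggregation rule among several that have been proposed; the workers and numbers are made up.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Five honest workers report gradients near the true value [1, 2];
# one Byzantine worker reports an arbitrary, huge vector.
honest = [np.array([1.0, 2.0]) + 0.1 * rng.normal(size=2) for _ in range(5)]
byzantine = [np.array([1e6, -1e6])]
grads = np.stack(honest + byzantine)

print(grads.mean(axis=0))        # the mean is dragged away by one attacker
print(np.median(grads, axis=0))  # the coordinate-wise median barely moves
</syntaxhighlight>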
 
=== Why AI safety and ethics is harder than meets the eye ===
 
  
We want to get algorithms to do what we would really want them to do. But this turns out to raise numerous highly nontrivial problems, like [[Goodhart's law]], [[overfitting]], [[robust statistics]], [[confounding variables]], [[adversarial attacks]], [[algorithmic bias]], [[cognitive bias]], [[backfire effect]], [[distributional shift]], [[privacy]], [[human liabilities]], [[interpretability]], [[reward hacking]], [[wireheading]] and [[instrumental convergence]]. Because of all such problems, it seems crucial that algorithms be able to reason about their ignorance, using [[Bayesianism|Bayesian]] principles, [[moral uncertainty]] and [[second opinion querying]]. Algorithms must be [[robustly beneficial]].
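
Overfitting, one of the simplest of these failure modes, is easy to reproduce; the following toy Python sketch (with made-up data) also gives a flavor of Goodhart's law, since the model optimizes a proxy (training error) rather than the target (generalization).

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Fit polynomials to 10 noisy samples of sin(x) and compare errors.
x_train = np.linspace(0.0, 3.0, 10)
y_train = np.sin(x_train) + 0.2 * rng.normal(size=10)
x_test = np.linspace(0.0, 3.0, 100)
y_test = np.sin(x_test)

for degree in (2, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(degree, train_err, test_err)

# The degree-9 polynomial nearly zeroes the training error yet does worse
# on fresh data: it optimized the proxy, not the target.
</syntaxhighlight>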
  
AI ethics also demands that we solve thorny philosophical dilemmas, like the [[repugnant conclusion]], [[Newcomb's paradox]] and [[moral realism]]. Unfortunately, we have numerous [[cognitive bias|cognitive biases]], which seem critical to understand in order to solve AI ethics. Results about [[counterfactual]]s, [[von Neumann-Morgenstern preferences]] and [[Dutch book]]s also seem useful to consider.
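
A Dutch book can be checked with elementary arithmetic: an agent whose betting prices violate the probability axioms can be sold a portfolio of bets that loses money no matter what happens. A made-up numerical sketch:

<syntaxhighlight lang="python">
# An agent incoherently prices a $1 bet on "rain" at 0.6 AND a $1 bet
# on "no rain" at 0.6, so its "probabilities" sum to 1.2 instead of 1.
p_rain, p_no_rain = 0.6, 0.6

income = p_rain + p_no_rain  # the bookmaker sells the agent both bets
payout = 1.0                 # exactly one of the two bets pays $1

print(income - payout)       # 0.2: a sure profit, whatever the weather
</syntaxhighlight>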
  
 
=== How to solve AI ethics (hopefully) ===
 
  
To solve AI ethics, [http://ceur-ws.org/Vol-2301/paper_1.pdf Hoang][https://dblp.org/rec/bibtex/conf/aaai/Hoang19 19a] proposed the [[ABCDE roadmap]], which decomposes the [[alignment]] problem into numerous (hopefully) orthogonal and complementary subproblems. Such subproblems include [[data certification]], perhaps through [[Blockchain]]; [[world model inference]] of [[knowledge representation]], through [[Bayesianism]] and/or [[representational learning]]; [[volition]] learning, perhaps from [[Preference learning from comparisons|comparisons]], together with [[social choice]] solutions; [[corrigibility]]; and safe [[reinforcement learning]].
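
For the volition-learning subproblem, one natural starting point (our illustration, not necessarily the roadmap's own choice) is to fit a Bradley-Terry model to pairwise comparisons; here is a minimal Python sketch with made-up data.

<syntaxhighlight lang="python">
import numpy as np

# Made-up pairwise comparisons among 3 options, as (winner, loser) pairs.
comparisons = [(0, 1), (0, 1), (1, 0), (0, 2), (1, 2), (0, 2)]
u = np.zeros(3)  # latent utilities, defined up to an additive constant
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(500):
    grad = np.zeros(3)
    for w, l in comparisons:
        p = sigmoid(u[w] - u[l])  # modeled probability that w beats l
        grad[w] += 1.0 - p        # gradient of the log-likelihood
        grad[l] -= 1.0 - p
    u += lr * grad
    u -= u.mean()                 # pin down the additive constant

print(u)  # option 0 ranks first, option 2 last
</syntaxhighlight>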
  
 
The fabulous endeavor to make AIs robustly beneficial can seem overwhelming, given how extraordinarily interdisciplinary it is. While it is worthwhile to have an overview of the problem, we believe it is also useful for aspiring contributors to identify more precise problems they can contribute to. In this wiki, we propose targeted research directions for different areas of expertise and research interests. Please check the following pages that may be of interest to you.
 
 
   <li>[[how journalists can contribute]]</li>
 
 
   <li>[[how medical doctors can contribute]]</li>
 
  <li>[[how physicists can contribute]]</li>
 
   <li>[[how educators can contribute]]</li>
 
 
   <li>[[how science communicators can contribute]]</li>
 
 
== About the authors ==
 
  
This wiki is written and edited mostly by members of the [[Robustly Beneficial group]], which regularly meets at EPFL, in Lausanne, Switzerland. Please feel free to [https://groups.google.com/forum/#!forum/lausannealignment ask to join]. So far, the main authors are [[User:Lê_Nguyên_Hoang|Lê Nguyên Hoang]], [[User:El_Mahdi_El_Mhamdi|El Mahdi El Mhamdi]] and [[User:Louis_Faucon|Louis Faucon]].  
  
 
Lê and Mahdi recently co-wrote the book <em>The Fabulous Endeavor: Make Artificial Intelligence Robustly Beneficial</em> [https://laboutique.edpsciences.fr/produit/1107/9782759824304/Le%20fabuleux%20chantier HoangElmhamdi][https://scholar.google.ch/scholar?hl=en&as_sdt=0%2C5&q=Le+fabuleux+chantier%3A+Rendre+l%27intelligence+artificielle+robustement+b%C3%A9n%C3%A9fique&btnG= 19<sup>FR</sup>] (the English version is pending).
 
