Difference between revisions of "Welcome to the Robustly Beneficial Wiki"

From RB Wiki

Revision as of 23:33, 22 February 2020

Welcome to the Robustly Beneficial wiki!

This wiki aims to better grasp the scope and the limits of current AI ethics research. It lists references, key ideas and open questions relevant to making algorithms robustly beneficial. Please also check out our Robustly Beneficial Podcast (iTunes, RSS), our Robustly Beneficial Talks and our Twitter account.

The wiki has just been launched, so most pages are still being written. They will never be finished, though: that is the whole point of a wiki!

The structure of the wiki

The wiki can be roughly divided into four main categories.

Why AI ethics is becoming critical

If you are new to AI ethics, you should probably start with the AI risks page. You could then look into what is arguably today's most important case of AI ethics, namely YouTube. Find out more by reading about online polarization, misinformation, addiction, mental health or hate. Note that algorithms also offer formidable AI opportunities that are definitely worth considering. And as an example of an urgent AI ethics dilemma, check this Twitter thread on responses to an "is climate change a hoax?" search.

And if you know little about the current state of algorithmic research, you might want to check the latest impressive advances in AI. Or you could check some funny applications of AI. You can also read Lê's rant against robots.

How today's (and probably tomorrow's) AIs work

The most important principle of today's AI is surely machine learning. Today, it mostly relies on stochastic gradient descent for (deep) neural networks, which allow representational learning (see convolutional neural network, residual network, transformer). See also Turing 1950, convexity, generative adversarial network, specialized hardware and linear systems.
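
To make the idea of stochastic gradient descent concrete, here is a minimal sketch on a toy problem rather than a deep network. It fits a line to noiseless points by updating the parameters one sample at a time. The function name, the learning rate and the data are assumptions chosen for illustration only.

```python
import random

def sgd_linear_fit(data, lr=0.05, epochs=200, seed=0):
    """Fit y ~ w * x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)          # visit the samples in a random order
        for x, y in samples:
            error = (w * x + b) - y   # prediction error on a single sample
            w -= lr * error * x       # gradient of 0.5 * error**2 w.r.t. w
            b -= lr * error           # gradient of 0.5 * error**2 w.r.t. b
    return w, b

# Toy data generated from y = 2x + 1 (hypothetical example values).
points = [(x, 2.0 * x + 1.0) for x in [-1.0, -0.5, 0.0, 0.5, 1.0]]
w, b = sgd_linear_fit(points)
print(w, b)  # both should end up close to 2 and 1
```

Deep learning applies the same per-sample (or per-batch) update, with the gradients computed by backpropagation through the network.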

Bayesianism has been argued to be the ideal form of supervised and unsupervised learning, if we had infinite computational power (see Solomonoff's demon, Laplace 1814). It has numerous desirable properties, like statistical admissibility and Bayesian agreement, and it underlies the Bayesian brain hypothesis. See also Bayesian examination and conjugate priors.
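
As an illustration of why conjugate priors are convenient, here is a minimal sketch of the Beta-Bernoulli case, where the Bayesian update reduces to counting successes and failures. The function names and the observations are hypothetical.

```python
def beta_bernoulli_update(alpha, beta, observations):
    """Update a Beta(alpha, beta) prior on a coin's bias with 0/1 outcomes."""
    for outcome in observations:
        if outcome:
            alpha += 1   # each success increments the first parameter
        else:
            beta += 1    # each failure increments the second parameter
    return alpha, beta

def beta_mean(alpha, beta):
    """Posterior mean of the coin's bias under Beta(alpha, beta)."""
    return alpha / (alpha + beta)

# Uniform prior Beta(1, 1), then three successes and one failure.
posterior = beta_bernoulli_update(1, 1, [1, 1, 0, 1])
print(posterior, beta_mean(*posterior))
```

The posterior stays in the Beta family, so no integral needs to be computed, which is the computational appeal of conjugacy.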

A branch of learning called reinforcement learning, which relies on Q-learning or policy learning, seems likely to be the core framework of today's and tomorrow's AIs. AIXI achieves the upper bound of Legg-Hutter intelligence, which aims to measure general intelligence.
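
For readers unfamiliar with Q-learning, the sketch below shows a single tabular update step. It is a simplified illustration, not how large-scale systems are implemented. The state and action names, the step size `alpha` and the discount `gamma` are assumptions.

```python
def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value."""
    # Value of the best action available in the next state (0.0 if unseen).
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    target = reward + gamma * best_next
    old = q.get((state, action), 0.0)
    # Move the old estimate a fraction alpha of the way toward the target.
    q[(state, action)] = old + alpha * (target - old)
    return q[(state, action)]

# Hypothetical table with one known entry.
q = {("s1", "left"): 2.0}
value = q_learning_update(q, "s0", "right", reward=1.0,
                          next_state="s1", actions=["left", "right"])
print(value)  # approximately 1.4, i.e. 0.5 * (1.0 + 0.9 * 2.0)
```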

To understand the gap between Bayesianism/AIXI and practical machine learning, we need to understand the constraints of computational complexity theory. Building upon the Church-Turing thesis, Kolmogorov-Solomonoff complexity and knowledge of human brain computations, this theory offers some insights into human-level AI, in addition to experts' AI predictions. See also entropy and sophistication.

AIs are already doing distributed learning, which raises numerous challenges, like Byzantine fault tolerance and model drift.
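
To illustrate the Byzantine fault tolerance challenge, here is a minimal sketch comparing plain averaging of worker gradients with a coordinate-wise median, which is one commonly studied robust aggregation rule. It is chosen here for simplicity and is not presented as the approach advocated on this wiki. The gradient values are made up.

```python
from statistics import mean, median

def coordinate_mean(gradients):
    """Average the workers' gradient vectors coordinate by coordinate."""
    return [mean(coords) for coords in zip(*gradients)]

def coordinate_median(gradients):
    """Take the median of the workers' gradients coordinate by coordinate."""
    return [median(coords) for coords in zip(*gradients)]

# Four honest workers roughly agree; one Byzantine worker sends garbage.
honest = [[1.0, -2.0], [1.1, -1.9], [0.9, -2.1], [1.0, -2.0]]
byzantine = [[1000.0, 1000.0]]

averaged = coordinate_mean(honest + byzantine)
robust = coordinate_median(honest + byzantine)
print(averaged)  # dragged far away by the single malicious vector
print(robust)    # stays with the honest majority
```

A single malicious worker can move the average arbitrarily far, whereas the median remains within the range of honest values as long as honest workers form a majority.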

Why AI safety and ethics is harder than meets the eye

We want to get algorithms to do what we would really want them to do. But this turns out to raise numerous highly nontrivial problems, like Goodhart's law, overfitting, robust statistics, confounding variables, adversarial attacks, algorithmic bias, cognitive bias, backfire effect, distributional shift, privacy, human liabilities, interpretability, reward hacking, wireheading and instrumental convergence. Because of all such problems, it seems crucial that algorithms be able to reason about their ignorance, using Bayesian principles, moral uncertainty and second opinion querying. Algorithms must be robustly beneficial.

AI ethics also demands that we solve thorny philosophical dilemmas, like the repugnant conclusion, Newcomb's paradox and moral realism. Unfortunately, we have numerous cognitive biases, which seem critical to understand to solve AI ethics. Results about counterfactuals, von Neumann-Morgenstern preferences and Dutch books also seem useful to consider.
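
As a small worked example of a Dutch book, consider an agent whose betting prices on an event A and on its negation do not sum to 1. The sketch below assumes the agent is willing to both buy and sell a bet paying 1 at its stated price. The function name and the prices are hypothetical.

```python
def guaranteed_loss(price_a, price_not_a):
    """Sure loss of an agent trading unit bets on A and not-A at these prices.

    Exactly one of the two bets pays 1. If the prices sum to more than 1,
    a bookie sells both bets to the agent; if they sum to less than 1,
    the bookie buys both from the agent. Either way the agent loses the gap.
    """
    return abs(price_a + price_not_a - 1.0)

incoherent = guaranteed_loss(0.6, 0.6)  # prices sum to 1.2
coherent = guaranteed_loss(0.6, 0.4)    # prices sum to 1.0
print(incoherent, coherent)
```

With prices of 0.6 and 0.6, the agent pays 1.2 in total and receives 1 whatever happens, so it loses about 0.2 for sure. With coherent prices, no such sure loss can be constructed from these two bets.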

How to solve AI ethics (hopefully)

To solve AI ethics, Hoang19a proposed the ABCDE roadmap, which decomposes the alignment problem into numerous (hopefully) orthogonal and complementary subproblems. Such subproblems include data certification, perhaps through Blockchain; world model inference, through Bayesianism and/or representational learning; volition learning, perhaps from comparisons and social choice solutions; corrigibility; and safe reinforcement learning.

The fabulous endeavor to make AIs robustly beneficial can seem overwhelming, given how extraordinarily interdisciplinary it is. While it is worthwhile to have an overview of the problem, we believe it is also useful for aspiring contributors to identify more precise problems they can contribute to. In this wiki, we propose targeted research directions for different areas of expertise and research interests. Please check the following pages that may be of interest to you.

About the authors

This wiki is written and edited mostly by members of the Robustly Beneficial group, which regularly meets at EPFL, in Lausanne, Switzerland. Please feel free to ask to join. So far, the main authors are Lê Nguyên Hoang, El Mahdi El Mhamdi and Louis Faucon.

Lê and Mahdi recently co-wrote the book The Fabulous Endeavor: Make Artificial Intelligence Robustly Beneficial HoangElmhamdi19FR (the English version is pending).