Welcome to the Robustly Beneficial Wiki

Welcome to the Robustly Beneficial wiki!

This wiki aims to better grasp the scope and the limits of current AI ethics research. It lists references, key ideas and relevant open questions on how to make algorithms robustly beneficial. Please also check out our Robustly Beneficial Podcast (iTunes, RSS), our Robustly Beneficial Talks and our Twitter account.

The wiki has just been launched, so most pages are still being written. But they will never be finished: this is the whole point of a wiki!

The structure of the wiki

The wiki can be roughly divided into four main categories.

Why AI ethics is becoming critical

If you are new to AI ethics, you should probably start with the AI risks page. You could then turn to what is arguably today's most important case in AI ethics, namely YouTube, and find out more by reading about online polarization, misinformation, addiction, mental health or hate. Note that algorithms also offer formidable AI opportunities that are definitely worth considering.

And if you know little about the current state of algorithmic research, you might want to check the latest impressive advances in AI. Or you could check some funny applications of AI. You can also read Lê's rant against robots.

How today's (and probably tomorrow's) AIs work

The most important principle of today's AI is surely machine learning. Today, it mostly relies on stochastic gradient descent for (deep) neural networks, which allow representation learning (see convolutional neural network, residual network, transformer). See also Turing 1950, convexity, generative adversarial network and linear systems.
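As a rough illustration of stochastic gradient descent, the sketch below fits a one-dimensional linear model rather than a deep neural network. The function name `sgd_linear_fit`, the learning rate, the number of epochs and the toy data are all assumptions made for this example; they do not come from the wiki.

```python
import random


def sgd_linear_fit(points, lr=0.05, epochs=200, seed=0):
    """Fit y ~ w * x + b by stochastic gradient descent on squared error.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(points)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the samples in a random order each epoch
        for x, y in data:
            err = (w * x + b) - y
            # Gradient of 0.5 * err**2 is err * x for w and err for b.
            w -= lr * err * x
            b -= lr * err
    return w, b


# Noise-free toy data generated from y = 3x - 1 (an assumption).
points = [(i / 10, 3.0 * (i / 10) - 1.0) for i in range(-10, 11)]
w, b = sgd_linear_fit(points)
print(w, b)
```

Each update uses a single sample, which is what makes the descent "stochastic". Deep learning applies the same update rule to many more parameters, usually with mini-batches and gradients computed by backpropagation.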

Bayesianism has been argued to be the ideal form of supervised and unsupervised learning, were infinite computational power available (see Solomonoff's demon). It has numerous desirable properties, like statistical admissibility and Bayesian agreement, and it relates to the Bayesian brain hypothesis. See also Bayesian examination and conjugate priors.
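Conjugate priors are one of the rare cases where a Bayesian update is computationally cheap. The sketch below shows the standard Beta-Bernoulli case, where updating the prior amounts to counting successes and failures. The function names and the example observations are assumptions made for illustration.

```python
def beta_bernoulli_update(alpha, beta, observations):
    """Return the Beta posterior parameters after observing 0/1 outcomes.

    A Beta(alpha, beta) prior on a coin's bias is conjugate to the
    Bernoulli likelihood, so the posterior is again a Beta distribution.
    """
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Uniform prior Beta(1, 1), then three successes and one failure.
posterior = beta_bernoulli_update(1.0, 1.0, [1, 1, 0, 1])
print(posterior, beta_mean(*posterior))
```

With a uniform prior and these four observations, the posterior is Beta(4, 2), whose mean is 2/3.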

Reinforcement learning, a branch of machine learning that relies on Q-learning or policy learning, seems likely to be the core framework of today's and tomorrow's AIs. AIXI achieves the upper bound for Legg-Hutter intelligence, which aims to measure general intelligence.
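To give a feel for Q-learning, here is a minimal tabular sketch on a made-up chain environment: the agent starts in state 0, can move left or right, and receives a reward of 1 upon reaching the last state. The environment, the hyperparameters and the function name `q_learning` are assumptions chosen for this example.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical deterministic chain."""
    rng = random.Random(seed)
    # q[s][a] estimates the return of action a in state s (0 = left, 1 = right).
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy exploration; ties are broken in favor of "right".
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # Q-learning update: move q[s][a] toward the bootstrapped target.
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


q = q_learning()
for state, values in enumerate(q):
    print(state, values)
```

In this toy setting, the learned value of moving right from state 0 approaches gamma cubed, since the reward is reached on the fourth step and discounted three times. Practical systems replace the table with a neural network.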

To understand the gap between Bayesianism/AIXI and practical machine learning, we need to understand the constraints of computational complexity theory. By building upon the Church-Turing thesis and knowledge about human brain computations, this theory offers some insights into human-level AI, in addition to experts' AI predictions. See also entropy and sophistication.

AIs are already doing distributed learning, which raises numerous challenges, like Byzantine fault tolerance and model drift.
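One way to see why Byzantine fault tolerance matters in distributed learning is to compare two ways of aggregating the gradients reported by workers. The sketch below uses a coordinate-wise median, which is only one of several known robust aggregation rules and is not claimed to be the approach favored by this wiki. The worker reports are invented for the example.

```python
from statistics import median


def mean_aggregate(gradients):
    """Average the reported gradients coordinate by coordinate."""
    n = len(gradients)
    return [sum(col) / n for col in zip(*gradients)]


def median_aggregate(gradients):
    """Take the coordinate-wise median of the reported gradients."""
    return [median(col) for col in zip(*gradients)]


# Four honest workers roughly agree; one Byzantine worker reports garbage.
honest = [[1.0, -2.0], [1.1, -1.9], [0.9, -2.1], [1.0, -2.0]]
byzantine = [[1e6, 1e6]]
reports = honest + byzantine

print(mean_aggregate(reports))    # dominated by the single faulty report
print(median_aggregate(reports))  # stays close to the honest gradients
```

A single arbitrary report is enough to move the mean anywhere, whereas the median stays within the range of honest values as long as honest workers form a majority.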

Why AI safety and ethics is harder than meets the eye

We want to get algorithms to do what we would really want them to do. But this turns out to raise numerous highly nontrivial problems, like Goodhart's law, overfitting, robust statistics, confounding variables, adversarial attacks, algorithmic bias, cognitive bias, backfire effect, distributional shift, privacy, interpretability, reward hacking, wireheading and instrumental convergence. Because of all such problems, it seems crucial that AIs be able to reason about their ignorance, using Bayesian principles, moral uncertainty and second opinion querying.

AI ethics also demands that we solve thorny philosophical dilemmas, like the repugnant conclusion, Newcomb's paradox and moral realism. Unfortunately, we have numerous cognitive biases, and understanding them seems critical to solving AI ethics. Results about counterfactuals, the von Neumann-Morgenstern theorem and Dutch books also seem useful to consider.

How to solve AI ethics (hopefully)

To solve AI ethics, Hoang 19a proposed the ABCDE roadmap, which decomposes the alignment problem into numerous (hopefully) orthogonal and complementary subproblems. Such subproblems include data certification, perhaps through blockchain, world model inference through Bayesianism and/or representation learning, volition learning and social choice solutions, corrigibility and safe reinforcement learning.

The fabulous endeavor to make AIs robustly beneficial can seem overwhelming, given how extraordinarily interdisciplinary it is. While it is worthwhile to have an overview of the problem, we believe it is also useful for aspiring contributors to identify more precise problems they can contribute to. In this wiki, we propose targeted research directions for different areas of expertise and research interests. Please check the following pages that may be of interest to you.

About the authors

This wiki is written and edited mostly by members of the Lausanne Alignment Club, which is mostly a group of PhD students, postdoctoral researchers and professors at the École Polytechnique Fédérale de Lausanne, in Switzerland. So far, the main authors are Lê Nguyên Hoang, El Mahdi El Mhamdi and Louis Faucon. Please feel free to get in touch with them for further information (or just to say thanks!).

Lê and Mahdi recently co-wrote the book The Fabulous Endeavor: Make Artificial Intelligence Robustly Beneficial (HoangElmhamdi 19, in French; the English version is pending).
