Welcome to the Robustly Beneficial Wiki

What is this wiki about? And why?

This wiki aims to list references, ideas and research questions in AI ethics. We hope to help better visualize the scope and the limits of current AI ethics.

The wiki is managed by the Lausanne Alignment Club, based at EPFL, Switzerland. Please also check out our Robustly Beneficial Podcast (iTunes, RSS) and our Robustly Beneficial Talks.

The wiki has just been launched, so most pages are still being written. They will never be finished, though: that is the whole point of a wiki!

New here? How about starting with these pages?

The pages can be roughly divided into four categories.

Why AI ethics is becoming critical

If you are new to AI ethics, you should probably start with the AI risks page. You could then look into arguably today's most important case of AI ethics, namely YouTube, and find out more by reading about online polarization, misinformation, addiction, mental health or hate. Note that algorithms also offer formidable AI opportunities that are definitely worth considering.

And if you know little about the current state of algorithmic research, you might want to check the latest impressive advances in AI. Or you could check some funny applications of AI. You can also read Lê's rant against robots.

How today's (and probably tomorrow's) AIs work

The most important principle of today's AI is surely machine learning. Today, it mostly relies on stochastic gradient descent for (deep) neural networks, which allow representational learning (see convolutional neural network, residual network, LSTM). See also Turing 1950, convexity, generative adversarial network, transformer and linear systems.
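As a rough illustration of stochastic gradient descent, here is a minimal sketch on a toy one-dimensional linear model rather than a deep neural network. The function name `sgd_linear_fit`, the toy data and the hyperparameters are all assumptions made up for this example.

```python
import random

def sgd_linear_fit(points, learning_rate=0.05, epochs=200, seed=0):
    """Fit y ~ w * x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(points)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the samples in a fresh random order
        for x, y in data:
            error = (w * x + b) - y
            # Gradient of 0.5 * error**2 with respect to w and b,
            # estimated from a single sample (hence "stochastic").
            w -= learning_rate * error * x
            b -= learning_rate * error
    return w, b

# Hypothetical noise-free toy data drawn from y = 2x + 1.
toy_points = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
w, b = sgd_linear_fit(toy_points)
```

Each update uses only one sample, which is what distinguishes the stochastic variant from full-batch gradient descent. Deep learning applies the same update rule to millions of parameters, with gradients obtained by backpropagation.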

Bayesianism has been argued to be the ideal form of supervised and unsupervised learning, if we had infinite computational power (see Solomonoff's demon). It has numerous desirable properties, such as statistical admissibility and Bayesian agreement, and it underlies the Bayesian brain hypothesis. See also Bayesian examination and conjugate priors.
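To make the idea of conjugate priors concrete, here is a minimal sketch of a Beta prior updated with Bernoulli (coin-flip) observations. The helper names `beta_bernoulli_update` and `beta_mean` and the sample data are assumptions for this example.

```python
def beta_bernoulli_update(alpha, beta, observations):
    """Return the Beta posterior parameters after seeing 0/1 outcomes.

    Because the Beta prior is conjugate to the Bernoulli likelihood,
    Bayes' rule reduces to counting successes and failures.
    """
    for outcome in observations:
        if outcome:
            alpha += 1
        else:
            beta += 1
    return alpha, beta

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Start from a uniform prior Beta(1, 1) and observe three successes, one failure.
posterior = beta_bernoulli_update(1, 1, [1, 1, 0, 1])
posterior_mean = beta_mean(*posterior)
```

Here the posterior is Beta(4, 2), with mean 2/3. Exact updates like this are only available for special model families, which is one reason practical machine learning often departs from the Bayesian ideal.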

A branch of learning called reinforcement learning, which relies on Q-learning or policy learning, seems likely to become the core framework of today's and tomorrow's AIs. AIXI achieves the upper bound for Legg-Hutter intelligence, which aims to measure general intelligence.
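The Q-learning update can be sketched on a tiny, made-up environment: a line of five states where reaching the rightmost state yields a reward of 1. The environment, the constants and the purely random behavior policy are assumptions chosen for simplicity. Q-learning is off-policy, so it can learn greedy values from random exploration, although practical agents usually use something like epsilon-greedy instead.

```python
import random

N_STATES = 5        # states 0..4 on a line; state 4 is terminal
ACTIONS = (-1, +1)  # move left or move right

def step(state, action):
    """Deterministic toy dynamics: reward 1.0 on reaching the last state."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, max_steps=1000, seed=0):
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            a = rng.randrange(len(ACTIONS))  # random exploration (off-policy)
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q

q_table = q_learning()
# Greedy action in each non-terminal state, read off the learned table.
greedy_policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
```

In this toy setting the greedy policy read from the table moves right in every non-terminal state. Large-scale systems replace the table with a neural network approximating the same values.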

To understand the gap between Bayesianism/AIXI and practical machine learning, we need to understand the constraints of computational complexity theory. By building upon the Church-Turing thesis and knowledge from human brain computations, this allows some insights into human-level AI, in addition to experts' AI predictions. See also entropy and sophistication.

AIs are already doing distributed learning, which raises numerous challenges, like Byzantine fault tolerance and model drift.
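To illustrate why Byzantine fault tolerance matters in distributed learning, the sketch below compares plain averaging of worker gradients with a coordinate-wise median. The median is only one of several robust aggregation rules discussed in the literature, and the gradient values here are invented for the example.

```python
from statistics import median

def average(gradients):
    """Plain coordinate-wise mean: a single bad worker can move it arbitrarily."""
    return [sum(coords) / len(coords) for coords in zip(*gradients)]

def coordinate_wise_median(gradients):
    """Coordinate-wise median: tolerates a minority of arbitrary values."""
    return [median(coords) for coords in zip(*gradients)]

# Four honest workers report similar gradients (hypothetical values).
honest = [[1.0, -2.0], [1.1, -1.9], [0.9, -2.1], [1.0, -2.0]]
# One Byzantine worker reports an arbitrary, adversarial vector.
byzantine = [[1000.0, 1000.0]]

naive = average(honest + byzantine)
robust = coordinate_wise_median(honest + byzantine)
```

With these numbers the average is dragged far from the honest gradients, while the median stays at `[1.0, -2.0]`.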

Why AI safety and ethics is harder than meets the eye

We want to get algorithms to do what we would really want them to do. But this turns out to raise numerous highly nontrivial problems, like Goodhart's law, overfitting, robust statistics, confounding variables, adversarial attacks, algorithmic bias, cognitive bias, backfire effect, distributional shift, privacy, interpretability, reward hacking, wireheading and instrumental convergence. Because of all such problems, it seems crucial that AIs be able to reason about their ignorance, using Bayesian principles, moral uncertainty and second opinion querying.

AI ethics also demands that we solve thorny philosophical dilemmas, like the repugnant conclusion, Newcomb's paradox and moral realism. Unfortunately, we have numerous cognitive biases, which seem critical to understand if we are to solve AI ethics. Results about counterfactuals, the von Neumann-Morgenstern theorem and Dutch book arguments also seem useful to consider.

How to solve AI ethics (hopefully)

To solve AI ethics, Hoang 19a proposed the ABCDE roadmap, which decomposes the alignment problem into numerous (hopefully) orthogonal and complementary subproblems. Such subproblems include data certification (perhaps through blockchain), world model inference through Bayesianism and/or representational learning, volition learning and social choice solutions, corrigibility and safe reinforcement learning.

About the authors

Lê holds a PhD in applied mathematics from the École Polytechnique of Montreal. He did postdoctoral research at MIT, before joining EPFL as a science communicator. He runs the EPFL YouTube channels Wandida and ZettaBytes, and the French-speaking YouTube channels and podcasts Science4All, Axiome and Probablement. His book The Equation of Knowledge: From Bayes' Rule to a Unified Philosophy of Science is due out in 2020 with CRC Press (the French version is already out, Hoang18^FR). Lê is still partly involved in research projects, but his current main focus is synthesizing and organizing the current effort in AI ethics.

In 2020, Lê and Mahdi co-wrote the book The Fabulous Endeavor: Make Artificial Intelligence Robustly Beneficial (HoangElmhamdi 19^FR; the English version is pending).
