<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://robustlybeneficial.org/wiki/index.php?action=history&amp;feed=atom&amp;title=ABCDE_roadmap</id>
	<title>ABCDE roadmap - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://robustlybeneficial.org/wiki/index.php?action=history&amp;feed=atom&amp;title=ABCDE_roadmap"/>
	<link rel="alternate" type="text/html" href="https://robustlybeneficial.org/wiki/index.php?title=ABCDE_roadmap&amp;action=history"/>
	<updated>2026-04-29T07:52:56Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.34.0</generator>
	<entry>
		<id>https://robustlybeneficial.org/wiki/index.php?title=ABCDE_roadmap&amp;diff=262&amp;oldid=prev</id>
		<title>Lê Nguyên Hoang: /* Motivation and justification */</title>
		<link rel="alternate" type="text/html" href="https://robustlybeneficial.org/wiki/index.php?title=ABCDE_roadmap&amp;diff=262&amp;oldid=prev"/>
		<updated>2020-03-04T08:58:28Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Motivation and justification&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 08:58, 4 March 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l17&quot; &gt;Line 17:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 17:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Motivation and justification ==&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Motivation and justification ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;The fundamental assumption of the ABCDE roadmap is that tomorrow's most powerful algorithms will be performing reinforcement learning.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;The fundamental assumption of the ABCDE roadmap is that tomorrow's most powerful algorithms will be performing &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;within the [[&lt;/ins&gt;reinforcement learning&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]] framework&lt;/ins&gt;. From a theoretical perspective, this assumption is strongly backed by the [[AIXI]] framework and, for instance, [[Solomonoff's completeness]] theorem.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;From a theoretical perspective, this assumption is strongly backed by the [[AIXI]] framework and, for instance, [[Solomonoff's completeness]] theorem.&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;From an empirical perspective, it is also backed by the numerous recent successes of reinforcement learning, for instance in Go, Chess and Shogi [https://arxiv.org/pdf/1712.01815 SHSAL+][https://dblp.org/rec/bibtex/journals/corr/abs-1712-01815 17] [https://arxiv.org/pdf/1911.08265 SAHSS+][https://dblp.org/rec/bibtex/journals/corr/abs-1911-08265 19], in video games like Atari games [https://arxiv.org/pdf/1312.5602 MKSGA+][https://dblp.org/rec/bibtex/journals/corr/MnihKSGAWR13 13] or StarCraft [https://www.nature.com/articles/s41586-019-1724-z.epdf VBCMD+][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Grandmaster+level+in+StarCraft+II+using+multi-agent+reinforcement+learning&amp;amp;btnG= 19], in combinatorial problems like protein folding [https://kstatic.googleusercontent.com/files/b4d715e8f8b6514cbfdc28a9ad83e14b6a8f86c34ea3b3cc844af8e76767d21ac3df5b0a9177d5e3f6a40b74caf7281a386af0fab8ca62f687599abaf8c8810f EJKSG+][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=De+novo+structure+prediction+with+deep%C2%ADlearning+based+scoring&amp;amp;btnG= 18] [https://www.nature.com/articles/s41586-019-1923-7.epdf?author_access_token=Z_KaZKDqtKzbE7Wd5HtwI9RgN0jAjWel9jnR3ZoTv0MCcgAwHMgRx9mvLjNQdB2TlQQaa7l420UCtGo8vYQ39gg8lFWR9mAZtvsN_1PrccXfIbc6e-tGSgazNL_XdtQzn1PHfy21qdcxV7Pw-k3htw%3D%3D SEJKS+][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Improved+protein+structure+prediction+using+potentials+from+deep+learning&amp;amp;btnG= 20], or in arguably today's most influential algorithm, namely YouTube's recommendation system [https://www.ijcai.org/proceedings/2019/0360.pdf IJWNA+][https://dblp.org/rec/bibtex/conf/ijcai/IeJWNAWCCB19 19].&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;From an empirical perspective, it is also backed by the numerous recent successes of reinforcement learning, for instance in Go, Chess and Shogi [https://arxiv.org/pdf/1712.01815 SHSAL+][https://dblp.org/rec/bibtex/journals/corr/abs-1712-01815 17] [https://arxiv.org/pdf/1911.08265 SAHSS+][https://dblp.org/rec/bibtex/journals/corr/abs-1911-08265 19], in video games like Atari games [https://arxiv.org/pdf/1312.5602 MKSGA+][https://dblp.org/rec/bibtex/journals/corr/MnihKSGAWR13 13] or StarCraft [https://www.nature.com/articles/s41586-019-1724-z.epdf VBCMD+][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Grandmaster+level+in+StarCraft+II+using+multi-agent+reinforcement+learning&amp;amp;btnG= 19], in combinatorial problems like protein folding [https://kstatic.googleusercontent.com/files/b4d715e8f8b6514cbfdc28a9ad83e14b6a8f86c34ea3b3cc844af8e76767d21ac3df5b0a9177d5e3f6a40b74caf7281a386af0fab8ca62f687599abaf8c8810f EJKSG+][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=De+novo+structure+prediction+with+deep%C2%ADlearning+based+scoring&amp;amp;btnG= 18] [https://www.nature.com/articles/s41586-019-1923-7.epdf?author_access_token=Z_KaZKDqtKzbE7Wd5HtwI9RgN0jAjWel9jnR3ZoTv0MCcgAwHMgRx9mvLjNQdB2TlQQaa7l420UCtGo8vYQ39gg8lFWR9mAZtvsN_1PrccXfIbc6e-tGSgazNL_XdtQzn1PHfy21qdcxV7Pw-k3htw%3D%3D SEJKS+][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Improved+protein+structure+prediction+using+potentials+from+deep+learning&amp;amp;btnG= 20], or in arguably today's most influential algorithm, namely YouTube's recommendation system [https://www.ijcai.org/proceedings/2019/0360.pdf IJWNA+][https://dblp.org/rec/bibtex/conf/ijcai/IeJWNAWCCB19 19].&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Lê Nguyên Hoang</name></author>
		
	</entry>
	<entry>
		<id>https://robustlybeneficial.org/wiki/index.php?title=ABCDE_roadmap&amp;diff=24&amp;oldid=prev</id>
		<title>Lê Nguyên Hoang: Created page with &quot;The ABCDE roadmap refers to a decomposition proposed by [http://ceur-ws.org/Vol-2301/paper_1.pdf Hoang][https://dblp.org/rec/bibtex/conf/aaai/Hoang19 19a] to better highlight...&quot;</title>
		<link rel="alternate" type="text/html" href="https://robustlybeneficial.org/wiki/index.php?title=ABCDE_roadmap&amp;diff=24&amp;oldid=prev"/>
		<updated>2020-01-20T21:32:27Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;The ABCDE roadmap refers to a decomposition proposed by [http://ceur-ws.org/Vol-2301/paper_1.pdf Hoang][https://dblp.org/rec/bibtex/conf/aaai/Hoang19 19a] to better highlight...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;The ABCDE roadmap refers to a decomposition proposed by [http://ceur-ws.org/Vol-2301/paper_1.pdf Hoang][https://dblp.org/rec/bibtex/conf/aaai/Hoang19 19a] to better highlight the key challenges of AI ethics and safety. It is also discussed at greater length in [https://arxiv.org/pdf/1809.01036 Hoang][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=A+Roadmap+for+Robust+End-to-End+Alignment&amp;amp;btnG= 19b] and [https://laboutique.edpsciences.fr/produit/1107/9782759824304/Le%20fabuleux%20chantier HoangElmhamdi][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Le+fabuleux+chantier%3A+Rendre+l%27intelligence+artificielle+robustement+b%C3%A9n%C3%A9fique&amp;amp;btnG= 19&amp;lt;sup&amp;gt;FR&amp;lt;/sup&amp;gt;].&lt;br /&gt;
&lt;br /&gt;
== The decomposition ==&lt;br /&gt;
&lt;br /&gt;
The decomposition consists of 5 steps: Alice, Bob, Charlie, Dave and Erin.&lt;br /&gt;
&lt;br /&gt;
Erin's goal is quality data collection and certification, relying on all sorts of sensors and user inputs, as well as on cryptography. Techniques related to [[Blockchain]] may also be useful to guarantee the traceability of the data.&lt;br /&gt;
&lt;br /&gt;
Dave is in charge of world model inference from Erin's data. In particular, Dave should correct for sampling biases and account for uncertainty due to data incompleteness. To do so, Dave may rely on heuristic forms of [[Bayesianism]], such as [[representational learning]] constructed with [[GAN]]-like architectures.&lt;br /&gt;
&lt;br /&gt;
Charlie must compute human preferences. In particular, she should probably implement some [[social choice]] mechanism to combine incompatible preferences. She should also probably distinguish [[volition]] from instinctive preferences. Combining techniques like [[inverse reinforcement learning]] and [[active learning]] is probably critical to designing Charlie.&lt;br /&gt;
&lt;br /&gt;
Bob would design incentive-compatible rewards to be given to Alice. By combining Erin, Dave and Charlie's computations, Bob could send to Alice humans' preferences for different states of the world, including the (likely) current state of the world and the probable future states of the world. But doing so directly would likely be dangerous, because of [[Goodhart's law]] and [[wireheading]]. Instead, Bob could enable and incentivize the [[corrigibility]] of Erin, Dave and Charlie's computations, by feeding Alice larger rewards when Erin, Dave and Charlie perform more accurate computations. Designing Bob may be called the [[programmed corrigibility]] problem.&lt;br /&gt;
&lt;br /&gt;
Finally, Alice is going to perform [[reinforcement learning]] using Bob's rewards.&lt;br /&gt;
&lt;br /&gt;
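To make the intended data flow concrete, here is a deliberately toy sketch of the five-step pipeline. This is a hypothetical illustration, not part of the roadmap's specification: every function below is an assumed stand-in for what would be a vastly more complex component.&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;
# A minimal, hypothetical sketch of the ABCDE data flow described above.
# Each function is an illustrative stand-in, not the roadmap's actual design.
import statistics

def erin():
    """Erin: collect (certified) raw observations. Here, fixed toy data."""
    return [0.9, 1.1, 1.0, 4.0]  # the outlier stands in for a sampling bias

def dave(raw_data):
    """Dave: infer a world model from Erin's data, correcting for biases.
    Toy version: a robust location estimate instead of a naive mean."""
    return statistics.median(raw_data)

def charlie(world_model):
    """Charlie: compute aggregate human preferences over world states.
    Toy version: a median-based social choice rule over reported scores."""
    reported_scores = [0.2, 0.7, 0.6]  # hypothetical per-human evaluations
    return statistics.median(reported_scores) * world_model

def bob(preference_score, computation_accuracy):
    """Bob: turn Charlie's preference signal into an incentive-compatible
    reward, scaled by how accurately Erin, Dave and Charlie computed."""
    return preference_score * computation_accuracy

def alice(reward, value_estimate, learning_rate=0.1):
    """Alice: one reinforcement-learning value update from Bob's reward."""
    return value_estimate + learning_rate * (reward - value_estimate)

value = 0.0
for step in range(100):
    reward = bob(charlie(dave(erin())), computation_accuracy=0.95)
    value = alice(reward, value)
print(value)  # Alice's estimate converges toward Bob's reward signal
&lt;/pre&gt;&lt;br /&gt;
&lt;br /&gt;
The point is only the shape of the composition: Bob mediates between Alice and the Erin-Dave-Charlie pipeline, so that Alice never optimizes the raw preference signal directly.&lt;br /&gt;
&lt;br /&gt;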
== Motivation and justification ==&lt;br /&gt;
&lt;br /&gt;
The fundamental assumption of the ABCDE roadmap is that tomorrow's most powerful algorithms will be performing reinforcement learning.&lt;br /&gt;
&lt;br /&gt;
From a theoretical perspective, this assumption is strongly backed by the [[AIXI]] framework and, for instance, [[Solomonoff's completeness]] theorem.&lt;br /&gt;
&lt;br /&gt;
From an empirical perspective, it is also backed by the numerous recent successes of reinforcement learning, for instance in Go, Chess and Shogi [https://arxiv.org/pdf/1712.01815 SHSAL+][https://dblp.org/rec/bibtex/journals/corr/abs-1712-01815 17] [https://arxiv.org/pdf/1911.08265 SAHSS+][https://dblp.org/rec/bibtex/journals/corr/abs-1911-08265 19], in video games like Atari games [https://arxiv.org/pdf/1312.5602 MKSGA+][https://dblp.org/rec/bibtex/journals/corr/MnihKSGAWR13 13] or StarCraft [https://www.nature.com/articles/s41586-019-1724-z.epdf VBCMD+][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Grandmaster+level+in+StarCraft+II+using+multi-agent+reinforcement+learning&amp;amp;btnG= 19], in combinatorial problems like protein folding [https://kstatic.googleusercontent.com/files/b4d715e8f8b6514cbfdc28a9ad83e14b6a8f86c34ea3b3cc844af8e76767d21ac3df5b0a9177d5e3f6a40b74caf7281a386af0fab8ca62f687599abaf8c8810f EJKSG+][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=De+novo+structure+prediction+with+deep%C2%ADlearning+based+scoring&amp;amp;btnG= 18] [https://www.nature.com/articles/s41586-019-1923-7.epdf?author_access_token=Z_KaZKDqtKzbE7Wd5HtwI9RgN0jAjWel9jnR3ZoTv0MCcgAwHMgRx9mvLjNQdB2TlQQaa7l420UCtGo8vYQ39gg8lFWR9mAZtvsN_1PrccXfIbc6e-tGSgazNL_XdtQzn1PHfy21qdcxV7Pw-k3htw%3D%3D SEJKS+][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Improved+protein+structure+prediction+using+potentials+from+deep+learning&amp;amp;btnG= 20], or in arguably today's most influential algorithm, namely YouTube's recommendation system [https://www.ijcai.org/proceedings/2019/0360.pdf IJWNA+][https://dblp.org/rec/bibtex/conf/ijcai/IeJWNAWCCB19 19].&lt;br /&gt;
&lt;br /&gt;
Reinforcement learning algorithms are probably going to keep improving. The only way to make sure that these algorithms will be robustly beneficial is arguably to make sure that they optimize a desirable goal. This is known as the [[alignment]] problem.&lt;br /&gt;
&lt;br /&gt;
In the case of reinforcement learning, the goal is the sum of discounted future rewards. As a result, the reward function is critical to AI ethics and safety. The ABCDE roadmap highlights this, by further decomposing the computation of the reward function into its components.&lt;br /&gt;
&lt;br /&gt;
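As a concrete rendering of this goal (standard reinforcement-learning notation, not taken from the sources above), the discounted return that Alice would maximize from a time t can be written as&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;G_t = \sum_{k=0}^{\infty} \gamma^k \, r_{t+k}, \qquad \gamma \in [0, 1),&lt;/pre&gt;&lt;br /&gt;
&lt;br /&gt;
where r denotes the reward function and \gamma the discount factor; every design choice in r is thus inherited, term by term, by Alice's goal.&lt;br /&gt;
&lt;br /&gt;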
It is argued that the ABCDE roadmap is likely to be useful, because it decomposes the alignment problem into numerous subproblems which are (hopefully) both independent enough to be tackled separately and complementary enough that solutions to the subproblems can be easily combined into a solution to the global alignment problem.&lt;/div&gt;</summary>
		<author><name>Lê Nguyên Hoang</name></author>
		
	</entry>
</feed>