<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://robustlybeneficial.org/wiki/index.php?action=history&amp;feed=atom&amp;title=Robust_statistics</id>
	<title>Robust statistics - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://robustlybeneficial.org/wiki/index.php?action=history&amp;feed=atom&amp;title=Robust_statistics"/>
	<link rel="alternate" type="text/html" href="https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;action=history"/>
	<updated>2026-04-28T13:33:21Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.34.0</generator>
	<entry>
		<id>https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=232&amp;oldid=prev</id>
		<title>Lê Nguyên Hoang: /* Robustness to additive poisoning */</title>
		<link rel="alternate" type="text/html" href="https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=232&amp;oldid=prev"/>
		<updated>2020-02-26T07:00:25Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Robustness to additive poisoning&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 07:00, 26 February 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l33&quot; &gt;Line 33:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 33:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;On the other hand, assuming the true distribution is a spherical normal distribution &amp;lt;math&amp;gt;\mathcal N(0,I)&amp;lt;/math&amp;gt;, Tukey proposed another approach based on identifying the directions of largest variance, since these are likely to be the &amp;quot;attack line&amp;quot; of the adversary [https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Mathematics+and+the+picturing+of+data+tukey&amp;amp;btnG= Tukey75]. &amp;lt;em&amp;gt;Tukey's median&amp;lt;/em&amp;gt; yields &amp;lt;math&amp;gt;O(\varepsilon)&amp;lt;/math&amp;gt;-error with high probability &amp;lt;math&amp;gt;1-\tau&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;n=\Omega\left( \frac{d+\log(1/\tau)}{\varepsilon^2}\right)&amp;lt;/math&amp;gt; data points. Unfortunately, Tukey's median is NP-hard to compute, and its computation is typically exponential in d.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;On the other hand, assuming the true distribution is a spherical normal distribution &amp;lt;math&amp;gt;\mathcal N(0,I)&amp;lt;/math&amp;gt;, Tukey proposed another approach based on identifying the directions of largest variance, since these are likely to be the &amp;quot;attack line&amp;quot; of the adversary [https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Mathematics+and+the+picturing+of+data+tukey&amp;amp;btnG= Tukey75]. &amp;lt;em&amp;gt;Tukey's median&amp;lt;/em&amp;gt; yields &amp;lt;math&amp;gt;O(\varepsilon)&amp;lt;/math&amp;gt;-error with high probability &amp;lt;math&amp;gt;1-\tau&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;n=\Omega\left( \frac{d+\log(1/\tau)}{\varepsilon^2}\right)&amp;lt;/math&amp;gt; data points. Unfortunately, Tukey's median is NP-hard to compute, and its computation is typically exponential in d.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[https://arxiv.org/pdf/1811.09380.pdf CDG][https://dblp.org/rec/bibtex/conf/soda/0002D019 19] [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] proved that such a bound can be achieved in quasi-linear time, even for heavy-tailed distributions with bounded but unknown variance. [https://arxiv.org/pdf/1811.09380.pdf CDG][https://dblp.org/rec/bibtex/conf/soda/0002D019 19] proved that in the strong poisoning model, their quasi-linear-time algorithm achieves &amp;lt;math&amp;gt;O(\sqrt{\varepsilon})&amp;lt;/math&amp;gt; error, which is asymptotically optimal in terms of performance and computation time. On the other hand, [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] used an approach akin to median-of-means: for &amp;lt;math&amp;gt;K = \Omega(\varepsilon n)&amp;lt;/math&amp;gt;, they designed an algorithm that achieves an error &amp;lt;math&amp;gt;O\left( \sqrt{\frac{Tr(\Sigma)}{n}} + \sqrt{\frac{||\Sigma||_{op} K}{n}} \right)&amp;lt;/math&amp;gt; in time &amp;lt;math&amp;gt;\tilde O(nd + uKd)&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;u&amp;lt;/math&amp;gt; is an integer parameter needed to guarantee a subgaussian decay rate of errors, holding with probability &amp;lt;math&amp;gt;1-\exp(-\Theta(K+u))&amp;lt;/math&amp;gt;. Note that &amp;lt;math&amp;gt;Tr(\Sigma)&amp;lt;/math&amp;gt; is essentially the &amp;quot;effective dimension&amp;quot; of the data points.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[https://arxiv.org/pdf/1811.09380.pdf CDG][https://dblp.org/rec/bibtex/conf/soda/0002D019 19] [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] proved that such a bound can be achieved in quasi-linear time, even for heavy-tailed distributions with bounded but unknown variance. [https://arxiv.org/pdf/1811.09380.pdf CDG][https://dblp.org/rec/bibtex/conf/soda/0002D019 19] proved that in the strong poisoning model, their quasi-linear-time algorithm achieves &amp;lt;math&amp;gt;O(&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;||\Sigma||_{op} &lt;/ins&gt;\sqrt{\varepsilon})&amp;lt;/math&amp;gt; error, which is asymptotically optimal in terms of performance and computation time. 
On the other hand, [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] used an approach akin to median-of-means: for &amp;lt;math&amp;gt;K = \Omega(\varepsilon n)&amp;lt;/math&amp;gt;, they designed an algorithm that achieves an error &amp;lt;math&amp;gt;O\left( \sqrt{\frac{Tr(\Sigma)}{n}} + \sqrt{\frac{||\Sigma||_{op} K}{n}} \right)&amp;lt;/math&amp;gt; in time &amp;lt;math&amp;gt;\tilde O(nd + uKd)&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;u&amp;lt;/math&amp;gt; is an integer parameter needed to guarantee a subgaussian decay rate of errors, holding with probability &amp;lt;math&amp;gt;1-\exp(-\Theta(K+u))&amp;lt;/math&amp;gt;. Note that &amp;lt;math&amp;gt;Tr(\Sigma)&amp;lt;/math&amp;gt; is essentially the &amp;quot;effective dimension&amp;quot; of the data points.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Their technique relies on partitioning the data into &amp;lt;math&amp;gt;K&amp;lt;/math&amp;gt; buckets, computing the mean of each bucket, and replacing the computation of the median of the means by a covering SDP that fits all configurations of bucket poisoning. It turns out that approximations of such a covering SDP can be found in quasi-linear time [https://arxiv.org/pdf/1201.5135 PTZ][https://dblp.org/rec/bibtex/conf/spaa/PengT12 12]. Rather than being used to directly compute a mean estimator, this is actually used to perform gradient descent, starting from the coordinate-wise median and then descending along the direction provided by the covering SDP. [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] proved that only &amp;lt;math&amp;gt;O(\log d)&amp;lt;/math&amp;gt; such steps are needed to guarantee their bound.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Their technique relies on partitioning the data into &amp;lt;math&amp;gt;K&amp;lt;/math&amp;gt; buckets, computing the mean of each bucket, and replacing the computation of the median of the means by a covering SDP that fits all configurations of bucket poisoning. It turns out that approximations of such a covering SDP can be found in quasi-linear time [https://arxiv.org/pdf/1201.5135 PTZ][https://dblp.org/rec/bibtex/conf/spaa/PengT12 12]. Rather than being used to directly compute a mean estimator, this is actually used to perform gradient descent, starting from the coordinate-wise median and then descending along the direction provided by the covering SDP. [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] proved that only &amp;lt;math&amp;gt;O(\log d)&amp;lt;/math&amp;gt; such steps are needed to guarantee their bound.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Lê Nguyên Hoang</name></author>
		
	</entry>
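	<!-- The entry above walks through the DepersinLecué19 recipe: partition the data into K buckets, average each bucket, then aggregate the bucket means robustly (the paper replaces the plain median step by a covering SDP and O(log d) guided descent steps). A minimal illustrative sketch follows; it is not their algorithm: it assumes a simple coordinate-wise median as the aggregation step, which is enough to see why bucketing blunts additive poisoning.

```python
import numpy as np

def median_of_means(X, K, seed=0):
    """Median-of-means sketch: bucket the rows of X, average each bucket,
    then take the coordinate-wise median of the bucket means.

    NOTE: simplified stand-in for DepersinLecue19, who replace the median
    step by a covering-SDP-guided descent (O(log d) steps) to obtain
    their subgaussian guarantee.
    """
    n, _ = X.shape
    rng = np.random.default_rng(seed)
    buckets = np.array_split(rng.permutation(n), K)   # K random buckets
    bucket_means = np.stack([X[b].mean(axis=0) for b in buckets])
    return np.median(bucket_means, axis=0)            # robust aggregation

# Toy additive poisoning: eps*n outliers, true mean is 0.
rng = np.random.default_rng(1)
n, d, eps = 2000, 10, 0.05
X = rng.normal(size=(n, d))
X[: int(eps * n)] += 100.0                            # poisoned points
print(np.linalg.norm(X.mean(axis=0)))                 # ruined by outliers
print(np.linalg.norm(median_of_means(X, K=400)))      # stays near 0
```

	With K = Omega(eps*n) buckets, at most a constant fraction of bucket means can be corrupted, so any aggregator with a positive breakdown point survives; roughly speaking, the coordinate-wise median can lose a sqrt(d) factor in the worst case, which is the slack the SDP-guided descent is designed to remove. -->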
	<entry>
		<id>https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=231&amp;oldid=prev</id>
		<title>Lê Nguyên Hoang: /* Robustness to additive poisoning */</title>
		<link rel="alternate" type="text/html" href="https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=231&amp;oldid=prev"/>
		<updated>2020-02-26T06:59:30Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Robustness to additive poisoning&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 06:59, 26 February 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l33&quot; &gt;Line 33:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 33:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;On the other hand, assuming the true distribution is a spherical normal distribution &amp;lt;math&amp;gt;\mathcal N(0,I)&amp;lt;/math&amp;gt;, Tukey proposed another approach based on identifying the directions of largest variance, since these are likely to be the &amp;quot;attack line&amp;quot; of the adversary [https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Mathematics+and+the+picturing+of+data+tukey&amp;amp;btnG= Tukey75]. &amp;lt;em&amp;gt;Tukey's median&amp;lt;/em&amp;gt; yields &amp;lt;math&amp;gt;O(\varepsilon)&amp;lt;/math&amp;gt;-error with high probability &amp;lt;math&amp;gt;1-\tau&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;n=\Omega\left( \frac{d+\log(1/\tau)}{\varepsilon^2}\right)&amp;lt;/math&amp;gt; data points. Unfortunately, Tukey's median is NP-hard to compute, and its computation is typically exponential in d.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;On the other hand, assuming the true distribution is a spherical normal distribution &amp;lt;math&amp;gt;\mathcal N(0,I)&amp;lt;/math&amp;gt;, Tukey proposed another approach based on identifying the directions of largest variance, since these are likely to be the &amp;quot;attack line&amp;quot; of the adversary [https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Mathematics+and+the+picturing+of+data+tukey&amp;amp;btnG= Tukey75]. &amp;lt;em&amp;gt;Tukey's median&amp;lt;/em&amp;gt; yields &amp;lt;math&amp;gt;O(\varepsilon)&amp;lt;/math&amp;gt;-error with high probability &amp;lt;math&amp;gt;1-\tau&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;n=\Omega\left( \frac{d+\log(1/\tau)}{\varepsilon^2}\right)&amp;lt;/math&amp;gt; data points. Unfortunately, Tukey's median is NP-hard to compute, and its computation is typically exponential in d.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[https://arxiv.org/pdf/1811.09380.pdf CDG][https://dblp.org/rec/bibtex/conf/soda/0002D019 19] [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] proved that such a bound can be achieved in quasi-linear time, even for heavy-tailed distributions with bounded but unknown variance. [https://arxiv.org/pdf/1811.09380.pdf CDG][https://dblp.org/rec/bibtex/conf/soda/0002D019 19] proved that in the strong poisoning model, their quasi-linear-time algorithm achieves &amp;lt;math&amp;gt;O(\sqrt{\varepsilon}&amp;lt;/math&amp;gt; error, which is asymptotically optimal in terms of performance and computation time. On the other hand, [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] used an approach akin to median-of-means: for &amp;lt;math&amp;gt;K = \Omega(\varepsilon n)&amp;lt;/math&amp;gt;, they designed an algorithm that achieves an error &amp;lt;math&amp;gt;O\left( \sqrt{\frac{Tr(\Sigma)}{n}} + \sqrt{\frac{||\Sigma||_{op} K}{n}} \right)&amp;lt;/math&amp;gt; in time &amp;lt;math&amp;gt;\tilde O(nd + uKd)&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;u&amp;lt;/math&amp;gt; is an integer parameter needed to guarantee a subgaussian decay rate of errors, holding with probability &amp;lt;math&amp;gt;1-\exp(-\Theta(K+u))&amp;lt;/math&amp;gt;. Note that &amp;lt;math&amp;gt;Tr(\Sigma)&amp;lt;/math&amp;gt; is essentially the &amp;quot;effective dimension&amp;quot; of the data points.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[https://arxiv.org/pdf/1811.09380.pdf CDG][https://dblp.org/rec/bibtex/conf/soda/0002D019 19] [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] proved that such a bound can be achieved in quasi-linear time, even for heavy-tailed distributions with bounded but unknown variance. [https://arxiv.org/pdf/1811.09380.pdf CDG][https://dblp.org/rec/bibtex/conf/soda/0002D019 19] proved that in the strong poisoning model, their quasi-linear-time algorithm achieves &amp;lt;math&amp;gt;O(\sqrt{\varepsilon}&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;)&lt;/ins&gt;&amp;lt;/math&amp;gt; error, which is asymptotically optimal in terms of performance and computation time. 
On the other hand, [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] used an approach akin to median-of-means: for &amp;lt;math&amp;gt;K = \Omega(\varepsilon n)&amp;lt;/math&amp;gt;, they designed an algorithm that achieves an error &amp;lt;math&amp;gt;O\left( \sqrt{\frac{Tr(\Sigma)}{n}} + \sqrt{\frac{||\Sigma||_{op} K}{n}} \right)&amp;lt;/math&amp;gt; in time &amp;lt;math&amp;gt;\tilde O(nd + uKd)&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;u&amp;lt;/math&amp;gt; is an integer parameter needed to guarantee a subgaussian decay rate of errors, holding with probability &amp;lt;math&amp;gt;1-\exp(-\Theta(K+u))&amp;lt;/math&amp;gt;. Note that &amp;lt;math&amp;gt;Tr(\Sigma)&amp;lt;/math&amp;gt; is essentially the &amp;quot;effective dimension&amp;quot; of the data points.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Their technique relies on partitioning the data into &amp;lt;math&amp;gt;K&amp;lt;/math&amp;gt; buckets, computing the mean of each bucket, and replacing the computation of the median of the means by a covering SDP that fits all configurations of bucket poisoning. It turns out that approximations of such a covering SDP can be found in quasi-linear time [https://arxiv.org/pdf/1201.5135 PTZ][https://dblp.org/rec/bibtex/conf/spaa/PengT12 12]. Rather than being used to directly compute a mean estimator, this is actually used to perform gradient descent, starting from the coordinate-wise median and then descending along the direction provided by the covering SDP. [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] proved that only &amp;lt;math&amp;gt;O(\log d)&amp;lt;/math&amp;gt; such steps are needed to guarantee their bound.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Their technique relies on partitioning the data into &amp;lt;math&amp;gt;K&amp;lt;/math&amp;gt; buckets, computing the mean of each bucket, and replacing the computation of the median of the means by a covering SDP that fits all configurations of bucket poisoning. It turns out that approximations of such a covering SDP can be found in quasi-linear time [https://arxiv.org/pdf/1201.5135 PTZ][https://dblp.org/rec/bibtex/conf/spaa/PengT12 12]. Rather than being used to directly compute a mean estimator, this is actually used to perform gradient descent, starting from the coordinate-wise median and then descending along the direction provided by the covering SDP. [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] proved that only &amp;lt;math&amp;gt;O(\log d)&amp;lt;/math&amp;gt; such steps are needed to guarantee their bound.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Lê Nguyên Hoang</name></author>
		
	</entry>
	<entry>
		<id>https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=230&amp;oldid=prev</id>
		<title>Lê Nguyên Hoang at 06:58, 26 February 2020</title>
		<link rel="alternate" type="text/html" href="https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=230&amp;oldid=prev"/>
		<updated>2020-02-26T06:58:43Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 06:58, 26 February 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l3&quot; &gt;Line 3:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 3:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;This question has arguably become crucial, as large-scale algorithms perform learning from users' data. Yet, clearly, if the algorithm is used by thousands, millions or billions of users, many of the data points will likely be corrupted, because of bugs [https://www.youtube.com/watch?v=yb2zkxHDfUE standupmaths20], or because some users will maliciously want to exploit or attack the algorithm. This latter case is known as a &amp;lt;em&amp;gt;poisoning attack&amp;lt;/em&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;This question has arguably become crucial, as large-scale algorithms perform learning from users' data. Yet, clearly, if the algorithm is used by thousands, millions or billions of users, many of the data points will likely be corrupted, because of bugs [https://www.youtube.com/watch?v=yb2zkxHDfUE standupmaths20], or because some users will maliciously want to exploit or attack the algorithm. This latter case is known as a &amp;lt;em&amp;gt;poisoning attack&amp;lt;/em&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Over the last three years, there have been fascinating advances, both for classical statistical tasks [https://arxiv.org/pdf/1911.05911.pdf DiakonikolasKane][https://dblp.org/rec/bibtex/journals/corr/abs-1911-05911 19] [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] and [[modern machine learning]] [http://papers.nips.cc/paper/6617-machine-learning-with-adversaries-byzantine-tolerant-gradient-descent.pdf BEGS][https://dblp.org/rec/bibtex/conf/nips/BlanchardMGS17 17], especially in very high-dimensional settings such as [[training neural networks]] [http://proceedings.mlr.press/v80/mhamdi18a EGR]. We discussed robust statistics in [https://www.youtube.com/watch?v=QguWgfGsG-k RB2].&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Over the last three years, there have been fascinating advances, both for classical statistical tasks [https://arxiv.org/pdf/1911.05911.pdf DiakonikolasKane][https://dblp.org/rec/bibtex/journals/corr/abs-1911-05911 &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;19] [https://arxiv.org/pdf/1811.09380.pdf CDG][https://dblp.org/rec/bibtex/conf/soda/0002D019 &lt;/ins&gt;19] [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] and [[modern machine learning]] [http://papers.nips.cc/paper/6617-machine-learning-with-adversaries-byzantine-tolerant-gradient-descent.pdf BEGS][https://dblp.org/rec/bibtex/conf/nips/BlanchardMGS17 17], especially in very high-dimensional settings such as [[training neural networks]] [http://proceedings.mlr.press/v80/mhamdi18a EGR]. We discussed robust statistics in [https://www.youtube.com/watch?v=QguWgfGsG-k RB2].&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Example of the median ==&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Example of the median ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l33&quot; &gt;Line 33:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 33:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;On the other hand, assuming the true distribution is a spherical normal distribution &amp;lt;math&amp;gt;\mathcal N(0,I)&amp;lt;/math&amp;gt;, Tukey proposed another approach based on identifying the directions of largest variance, since these are likely to be the &amp;quot;attack line&amp;quot; of the adversary [https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Mathematics+and+the+picturing+of+data+tukey&amp;amp;btnG= Tukey75]. &amp;lt;em&amp;gt;Tukey's median&amp;lt;/em&amp;gt; yields &amp;lt;math&amp;gt;O(\varepsilon)&amp;lt;/math&amp;gt;-error with high probability &amp;lt;math&amp;gt;1-\tau&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;n=\Omega\left( \frac{d+\log(1/\tau)}{\varepsilon^2}\right)&amp;lt;/math&amp;gt; data points. Unfortunately, Tukey's median is NP-hard to compute, and its computation is typically exponential in d.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;On the other hand, assuming the true distribution is a spherical normal distribution &amp;lt;math&amp;gt;\mathcal N(0,I)&amp;lt;/math&amp;gt;, Tukey proposed another approach based on identifying the directions of largest variance, since these are likely to be the &amp;quot;attack line&amp;quot; of the adversary [https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Mathematics+and+the+picturing+of+data+tukey&amp;amp;btnG= Tukey75]. &amp;lt;em&amp;gt;Tukey's median&amp;lt;/em&amp;gt; yields &amp;lt;math&amp;gt;O(\varepsilon)&amp;lt;/math&amp;gt;-error with high probability &amp;lt;math&amp;gt;1-\tau&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;n=\Omega\left( \frac{d+\log(1/\tau)}{\varepsilon^2}\right)&amp;lt;/math&amp;gt; data points. Unfortunately, Tukey's median is NP-hard to compute, and its computation is typically exponential in d.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] proved that such a bound can be achieved &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;for additive poisoning &lt;/del&gt;in quasi-linear time, even for heavy-tailed distributions with bounded but unknown variance. &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;More precisely&lt;/del&gt;, &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;by using &lt;/del&gt;an approach akin to median-of-means: for &amp;lt;math&amp;gt;K = \Omega(\varepsilon n)&amp;lt;/math&amp;gt;, they designed an algorithm that achieves an error &amp;lt;math&amp;gt;O\left( \sqrt{\frac{Tr(\Sigma)}{n}} + \sqrt{\frac{||\Sigma||_{op} K}{n}} \right)&amp;lt;/math&amp;gt; in time &amp;lt;math&amp;gt;\tilde O(nd + uKd)&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;u&amp;lt;/math&amp;gt; is an integer parameter needed to guarantee a subgaussian decay rate of errors, holding with probability &amp;lt;math&amp;gt;1-\exp(-\Theta(K+u))&amp;lt;/math&amp;gt;. Note that &amp;lt;math&amp;gt;Tr(\Sigma)&amp;lt;/math&amp;gt; is essentially the &amp;quot;effective dimension&amp;quot; of the data points.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[https://arxiv.org/pdf/1811.09380.pdf CDG][https://dblp.org/rec/bibtex/conf/soda/0002D019 19] &lt;/ins&gt;[https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] proved that such a bound can be achieved in quasi-linear time, even for heavy-tailed distributions with bounded but unknown variance. &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[https://arxiv.org/pdf/1811.09380.pdf CDG][https://dblp.org/rec/bibtex/conf/soda/0002D019 19] proved that in the strong poisoning model, their quasi-linear-time algorithm achieves &amp;lt;math&amp;gt;O(\sqrt{\varepsilon}&amp;lt;/math&amp;gt; error, which is asymptotically optimal in terms of performance and computation time. On the other hand&lt;/ins&gt;, &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] used &lt;/ins&gt;an approach akin to median-of-means: for &amp;lt;math&amp;gt;K = \Omega(\varepsilon n)&amp;lt;/math&amp;gt;, they designed an algorithm that achieves an error &amp;lt;math&amp;gt;O\left( \sqrt{\frac{Tr(\Sigma)}{n}} + \sqrt{\frac{||\Sigma||_{op} K}{n}} \right)&amp;lt;/math&amp;gt; in time &amp;lt;math&amp;gt;\tilde O(nd + uKd)&amp;lt;/math&amp;gt;. 
Here, &amp;lt;math&amp;gt;u&amp;lt;/math&amp;gt; is an integer parameter needed to guarantee a subgaussian decay rate of errors, holding with probability &amp;lt;math&amp;gt;1-\exp(-\Theta(K+u))&amp;lt;/math&amp;gt;. Note that &amp;lt;math&amp;gt;Tr(\Sigma)&amp;lt;/math&amp;gt; is essentially the &amp;quot;effective dimension&amp;quot; of the data points.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Their technique relies on partitioning the data into &amp;lt;math&amp;gt;K&amp;lt;/math&amp;gt; buckets, computing the mean of each bucket, and replacing the computation of the median of the means by a covering SDP that fits all configurations of bucket poisoning. It turns out that approximations of such a covering SDP can be found in quasi-linear time [https://arxiv.org/pdf/1201.5135 PTZ][https://dblp.org/rec/bibtex/conf/spaa/PengT12 12]. Rather than being used to directly compute a mean estimator, this is actually used to perform gradient descent, starting from the coordinate-wise median and then descending along the direction provided by the covering SDP. [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] proved that only &amp;lt;math&amp;gt;O(\log d)&amp;lt;/math&amp;gt; such steps are needed to guarantee their bound.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Their technique relies on partitioning the data into &amp;lt;math&amp;gt;K&amp;lt;/math&amp;gt; buckets, computing the mean of each bucket, and replacing the computation of the median of the means by a covering SDP that fits all configurations of bucket poisoning. It turns out that approximations of such a covering SDP can be found in quasi-linear time [https://arxiv.org/pdf/1201.5135 PTZ][https://dblp.org/rec/bibtex/conf/spaa/PengT12 12]. Rather than being used to directly compute a mean estimator, this is actually used to perform gradient descent, starting from the coordinate-wise median and then descending along the direction provided by the covering SDP. [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] proved that only &amp;lt;math&amp;gt;O(\log d)&amp;lt;/math&amp;gt; such steps are needed to guarantee their bound.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Lê Nguyên Hoang</name></author>
		
	</entry>
	<entry>
		<id>https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=203&amp;oldid=prev</id>
		<title>Lê Nguyên Hoang: /* Poisoning models */</title>
		<link rel="alternate" type="text/html" href="https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=203&amp;oldid=prev"/>
		<updated>2020-02-11T08:50:26Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Poisoning models&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 08:50, 11 February 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l21&quot; &gt;Line 21:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 21:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Other models include an adversary with only erasing power, or an adversary that must choose its &amp;lt;em&amp;gt;outliers&amp;lt;/em&amp;gt; without knowledge of the values of the &amp;lt;em&amp;gt;inliers&amp;lt;/em&amp;gt;. Evidently, any guarantee for stronger poisoning models will also hold for such weaker poisoning models [https://arxiv.org/pdf/1911.05911.pdf DiakonikolasKane][https://dblp.org/rec/bibtex/journals/corr/abs-1911-05911 19].&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Other models include an adversary with only erasing power, or an adversary that must choose its &amp;lt;em&amp;gt;outliers&amp;lt;/em&amp;gt; without knowledge of the values of the &amp;lt;em&amp;gt;inliers&amp;lt;/em&amp;gt;. Evidently, any guarantee for stronger poisoning models will also hold for such weaker poisoning models [https://arxiv.org/pdf/1911.05911.pdf DiakonikolasKane][https://dblp.org/rec/bibtex/journals/corr/abs-1911-05911 19].&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Perhaps the most general form of poisoning attack is the following. Consider a &amp;quot;true dataset&amp;quot; &amp;lt;math&amp;gt;D&amp;lt;/math&amp;gt;. However, the attacker gets to distort the dataset using a distortion function &amp;lt;math&amp;gt;f \in \mathcal F&amp;lt;/math&amp;gt;, thereby yielding &amp;lt;math&amp;gt;f(D)&amp;lt;/math&amp;gt;. Suppose we now have a best-possible machine learning algorithm &amp;lt;math&amp;gt;ML&amp;lt;/math&amp;gt; that learns from data. It would ideally compute &amp;lt;math&amp;gt;ML(D)&amp;lt;/math&amp;gt;. But we can only exploit &amp;lt;math&amp;gt;f(D)&amp;lt;/math&amp;gt;, via some hopefully robust machine learning algorithm &amp;lt;math&amp;gt;RML(f(D))&amp;lt;/math&amp;gt;. What we would like is to guarantee that &amp;lt;math&amp;gt;d(ML(D),RML(f(D))) &amp;lt; bound&amp;lt;/math&amp;gt;, for a suitable distance &amp;lt;math&amp;gt;d&amp;lt;/math&amp;gt; and any &amp;lt;math&amp;gt;f \in \mathcal F&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;RML&amp;lt;/math&amp;gt; is tractable. A further generalization of this could consist in assuming a prior probabilistic belief on the set &amp;lt;mathcal&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;&amp;gt;&lt;/del&gt;F&amp;lt;/&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;mathcal&lt;/del&gt;&amp;gt; of attack models that we need to defend against.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Perhaps the most general form of poisoning attack is the following. Consider a &amp;quot;true dataset&amp;quot; &amp;lt;math&amp;gt;D&amp;lt;/math&amp;gt;. However, the attacker gets to distort the dataset using a distortion function &amp;lt;math&amp;gt;f \in \mathcal F&amp;lt;/math&amp;gt;, thereby yielding &amp;lt;math&amp;gt;f(D)&amp;lt;/math&amp;gt;. Suppose we now have a best-possible machine learning algorithm &amp;lt;math&amp;gt;ML&amp;lt;/math&amp;gt; that learns from data. It would ideally compute &amp;lt;math&amp;gt;ML(D)&amp;lt;/math&amp;gt;. But we can only exploit &amp;lt;math&amp;gt;f(D)&amp;lt;/math&amp;gt;, via some hopefully robust machine learning algorithm &amp;lt;math&amp;gt;RML(f(D))&amp;lt;/math&amp;gt;. What we would like is to guarantee that &amp;lt;math&amp;gt;d(ML(D),RML(f(D))) &amp;lt; bound&amp;lt;/math&amp;gt;, for a suitable distance &amp;lt;math&amp;gt;d&amp;lt;/math&amp;gt; and any &amp;lt;math&amp;gt;f \in \mathcal F&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;RML&amp;lt;/math&amp;gt; is tractable.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;A further generalization of this could consist in assuming a prior probabilistic belief on the set &amp;lt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;math&amp;gt;\&lt;/ins&gt;mathcal F&amp;lt;/&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;math&lt;/ins&gt;&amp;gt; of attack models that we need to defend against&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;. This would correspond to the study of Byzantine Bayesian learning, which we may model as &amp;lt;math&amp;gt;\max_{RML} \mathbb E_{\mathcal F} \left[ \min_{f \in \mathcal F} \mathbb E[u|D,RML(f(D))] \right]&amp;lt;/math&amp;gt;&lt;/ins&gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;It is noteworthy that robustness to such attacks is useful, even if there is no adversary. Indeed, data may still get corrupted, because of bugs, crashes or misuse by a human operator (see for instance the case of gene mutations caused by Excel that plagued genetic research [https://www.sciencemag.org/news/2016/08/one-five-genetics-papers-contains-errors-thanks-microsoft-excel Boddy16]). Our algorithms need to remain performant despite such issues. An algorithm robust to strong attacks will be robust to such weaker flaws.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;It is noteworthy that robustness to such attacks is useful, even if there is no adversary. Indeed, data may still get corrupted, because of bugs, crashes or misuse by a human operator (see for instance the case of gene mutations caused by Excel that plagued genetic research [https://www.sciencemag.org/news/2016/08/one-five-genetics-papers-contains-errors-thanks-microsoft-excel Boddy16]). Our algorithms need to remain performant despite such issues. An algorithm robust to strong attacks will be robust to such weaker flaws.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Lê Nguyên Hoang</name></author>
		
	</entry>
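	<!-- The entry above formalizes poisoning as a worst-case requirement: d(ML(D), RML(f(D))) < bound for every distortion f in F. A one-dimensional toy sketch follows, assuming (purely for illustration) that F is additive poisoning appending eps*n arbitrary points, that ML is the empirical mean, and that the median plays the role of the robust algorithm RML.

```python
import numpy as np

def additive_poison(D, eps, value):
    """A distortion f in F: append eps*|D| adversarial points at `value`."""
    k = int(eps * len(D))
    return np.concatenate([D, np.full(k, value)])

rng = np.random.default_rng(0)
D = rng.normal(size=10_000)      # the "true dataset" D
ideal = D.mean()                 # ML(D): what we would ideally compute

for value in (1e3, 1e6, 1e9):    # ever stronger attacks f in F
    fD = additive_poison(D, eps=0.1, value=value)
    mean_err = abs(fD.mean() - ideal)        # grows without bound
    median_err = abs(np.median(fD) - ideal)  # stays O(eps)
    print(f"value={value:.0e}  mean err={mean_err:.3g}  median err={median_err:.3g}")
```

	No choice of f in F moves the median by more than the eps-quantile shift (about 0.13 here), so a uniform bound over all of F exists for it; the mean's error grows with the attack value, so no such bound exists for the mean. -->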
	<entry>
		<id>https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=163&amp;oldid=prev</id>
		<title>Lê Nguyên Hoang at 08:33, 2 February 2020</title>
		<link rel="alternate" type="text/html" href="https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=163&amp;oldid=prev"/>
		<updated>2020-02-02T08:33:28Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 08:33, 2 February 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l3&quot; &gt;Line 3:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 3:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;This question has arguably become crucial, as large-scale algorithms perform learning from users' data. Yet, clearly, if the algorithm is used by thousands, millions or billions of users, many of the data points will likely be corrupted, because of bugs [https://www.youtube.com/watch?v=yb2zkxHDfUE standupmaths20], or because some users will maliciously want to exploit or attack the algorithm. This latter case is known as a &amp;lt;em&amp;gt;poisoning attack&amp;lt;/em&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;This question has arguably become crucial, as large-scale algorithms perform learning from users' data. Yet, clearly, if the algorithm is used by thousands, millions or billions of users, many of the data points will likely be corrupted, because of bugs [https://www.youtube.com/watch?v=yb2zkxHDfUE standupmaths20], or because some users will maliciously want to exploit or attack the algorithm. This latter case is known as a &amp;lt;em&amp;gt;poisoning attack&amp;lt;/em&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Over the last three years, there have been fascinating advances, both for classical statistical tasks [https://arxiv.org/pdf/1911.05911.pdf DiakonikolasKane][https://dblp.org/rec/bibtex/journals/corr/abs-1911-05911 19] [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] and [[modern machine learning]] [http://papers.nips.cc/paper/6617-machine-learning-with-adversaries-byzantine-tolerant-gradient-descent.pdf BEGS][https://dblp.org/rec/bibtex/conf/nips/BlanchardMGS17 17], especially in very high-dimensional settings such as [[training neural networks]] [http://proceedings.mlr.press/v80/mhamdi18a EGR].&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Over the last three years, there have been fascinating advances, both for classical statistical tasks [https://arxiv.org/pdf/1911.05911.pdf DiakonikolasKane][https://dblp.org/rec/bibtex/journals/corr/abs-1911-05911 19] [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] and [[modern machine learning]] [http://papers.nips.cc/paper/6617-machine-learning-with-adversaries-byzantine-tolerant-gradient-descent.pdf BEGS][https://dblp.org/rec/bibtex/conf/nips/BlanchardMGS17 17], especially in very high-dimensional settings such as [[training neural networks]] [http://proceedings.mlr.press/v80/mhamdi18a EGR&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]. We discussed robust statistics in [https://www.youtube.com/watch?v=QguWgfGsG-k RB2&lt;/ins&gt;].&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Example of the median ==&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Example of the median ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Lê Nguyên Hoang</name></author>
		
	</entry>
	<entry>
		<id>https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=113&amp;oldid=prev</id>
		<title>El Mahdi El Mhamdi at 10:31, 27 January 2020</title>
		<link rel="alternate" type="text/html" href="https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=113&amp;oldid=prev"/>
		<updated>2020-01-27T10:31:46Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 10:31, 27 January 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l3&quot; &gt;Line 3:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 3:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;This question has arguably become crucial, as large-scale algorithms perform learning from users' data. Yet, clearly, if the algorithm is used by thousands, millions or billions of users, many of the data will likely be corrupted, because of bugs [https://www.youtube.com/watch?v=yb2zkxHDfUE standupmaths20], or because some users will maliciously want to exploit or attack the algorithm. This latter case is known as a &amp;lt;em&amp;gt;poisoning attack&amp;lt;/em&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;This question has arguably become crucial, as large-scale algorithms perform learning from users' data. Yet, clearly, if the algorithm is used by thousands, millions or billions of users, many of the data will likely be corrupted, because of bugs [https://www.youtube.com/watch?v=yb2zkxHDfUE standupmaths20], or because some users will maliciously want to exploit or attack the algorithm. This latter case is known as a &amp;lt;em&amp;gt;poisoning attack&amp;lt;/em&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Over the last three years, there have been fascinating recent advances, both for classical statistical tasks [https://arxiv.org/pdf/1911.05911.pdf DiakonikolasKane][https://dblp.org/rec/bibtex/journals/corr/abs-1911-05911 19] [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] and [[&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;neural networks&lt;/del&gt;]] [http://papers.nips.cc/paper/6617-machine-learning-with-adversaries-byzantine-tolerant-gradient-descent.pdf &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;BMGS&lt;/del&gt;][https://dblp.org/rec/bibtex/conf/nips/BlanchardMGS17 17].&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Over the last three years, there have been fascinating recent advances, both for classical statistical tasks [https://arxiv.org/pdf/1911.05911.pdf DiakonikolasKane][https://dblp.org/rec/bibtex/journals/corr/abs-1911-05911 19] [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] and [[&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;modern machine learning&lt;/ins&gt;]] [http://papers.nips.cc/paper/6617-machine-learning-with-adversaries-byzantine-tolerant-gradient-descent.pdf &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;BEGS&lt;/ins&gt;][https://dblp.org/rec/bibtex/conf/nips/BlanchardMGS17 17&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;] especially in very high dimensional settings such as [[training neural networks]] [http://proceedings.mlr.press/v80/mhamdi18a EGR&lt;/ins&gt;].&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Example of the median ==&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Example of the median ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l61&quot; &gt;Line 61:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 61:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;The main application of robust statistics (at least relevant to AI ethics) seems to be the aggregation of [[stochastic gradient descent|stochastic gradients]] for [[neural networks]]. In this setting, even a linear-time algorithm in &amp;lt;math&amp;gt;\Omega(nd)&amp;lt;/math&amp;gt; is impractical if we demand &amp;lt;math&amp;gt;n \geq d&amp;lt;/math&amp;gt; (which is necessary to have dimension-independent guarantees). In practice, this setting is often carried out with batches whose size &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; is significantly smaller than &amp;lt;math&amp;gt;d&amp;lt;/math&amp;gt;. In fact, despite conventional wisdom and PAC-learning theory, it seems that &amp;lt;math&amp;gt;n \ll d&amp;lt;/math&amp;gt; may be a desirable setting to do neural network learning (see [[overfitting]] where we discuss double descent). For &amp;lt;math&amp;gt;n \ll d&amp;lt;/math&amp;gt;, is there a gain in using algorithms more complex than coordinate-wise median?&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;The main application of robust statistics (at least relevant to AI ethics) seems to be the aggregation of [[stochastic gradient descent|stochastic gradients]] for [[neural networks]]. In this setting, even a linear-time algorithm in &amp;lt;math&amp;gt;\Omega(nd)&amp;lt;/math&amp;gt; is impractical if we demand &amp;lt;math&amp;gt;n \geq d&amp;lt;/math&amp;gt; (which is necessary to have dimension-independent guarantees). In practice, this setting is often carried out with batches whose size &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; is significantly smaller than &amp;lt;math&amp;gt;d&amp;lt;/math&amp;gt;. In fact, despite conventional wisdom and PAC-learning theory, it seems that &amp;lt;math&amp;gt;n \ll d&amp;lt;/math&amp;gt; may be a desirable setting to do neural network learning (see [[overfitting]] where we discuss double descent). For &amp;lt;math&amp;gt;n \ll d&amp;lt;/math&amp;gt;, is there a gain in using algorithms more complex than coordinate-wise median?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[http://papers.nips.cc/paper/6617-machine-learning-with-adversaries-byzantine-tolerant-gradient-descent.pdf &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;BMGS&lt;/del&gt;][https://dblp.org/rec/bibtex/conf/nips/BlanchardMGS17 17] proposed Krum and multi-Krum, aggregation algorithms for this setting that have weaker robustness guarantees but are more efficient. Is it possible to improve upon them? (A sketch of Krum and of the coordinate-wise median follows this table.)&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[http://papers.nips.cc/paper/6617-machine-learning-with-adversaries-byzantine-tolerant-gradient-descent.pdf &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;BEGS&lt;/ins&gt;][https://dblp.org/rec/bibtex/conf/nips/BlanchardMGS17 17] proposed Krum and multi-Krum, aggregation algorithms for this setting that have weaker robustness guarantees but are more efficient. Is it possible to improve upon them? (A sketch of Krum and of the coordinate-wise median follows this table.)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;== Robust statistics for agreement and multi-agent settings ==&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Another avenue where robust statistics are needed is the increasingly multi-agent setting on which modern AI is built. [https://dl.acm.org/doi/10.1145/2488608.2488657 MH2013]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
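Below is a minimal Python/NumPy sketch of the two aggregation rules discussed in this revision: the coordinate-wise median and Krum. The Krum score follows the description in [BEGS17] (sum of squared distances to the n − f − 2 nearest neighbours); everything else, including the function names and the toy data, is our own illustration rather than code from the cited papers.

```python
# Hedged sketch of two Byzantine-robust aggregation rules; the Krum scoring
# rule follows [BEGS17], while names and toy data are illustrative only.
import numpy as np

def coordinate_wise_median(gradients: np.ndarray) -> np.ndarray:
    """Aggregate n gradients of shape (n, d) by taking the median
    independently along each of the d coordinates (O(nd) up to sorting)."""
    return np.median(gradients, axis=0)

def krum(gradients: np.ndarray, f: int) -> np.ndarray:
    """Krum with n gradients, at most f of them Byzantine: score each
    gradient by the sum of squared distances to its n - f - 2 nearest
    neighbours, and return the gradient with the lowest score."""
    n = len(gradients)
    diffs = gradients[:, None, :] - gradients[None, :, :]
    dist2 = (diffs ** 2).sum(axis=-1)          # pairwise squared distances
    scores = np.empty(n)
    for i in range(n):
        others = np.delete(dist2[i], i)        # drop the zero self-distance
        scores[i] = np.sort(others)[: n - f - 2].sum()
    return gradients[np.argmin(scores)]

# Toy batch: 10 honest gradients near the all-ones vector, 2 poisoned ones.
rng = np.random.default_rng(0)
honest = 1.0 + 0.1 * rng.standard_normal((10, 5))
poisoned = 100.0 * rng.standard_normal((2, 5))
batch = np.vstack([honest, poisoned])
print(coordinate_wise_median(batch))           # close to (1, ..., 1)
print(krum(batch, f=2))                        # one of the honest gradients
```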
		<author><name>El Mahdi El Mhamdi</name></author>
		
	</entry>
	<entry>
		<id>https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=111&amp;oldid=prev</id>
		<title>Lê Nguyên Hoang: /* What if there are more outliers than inliers? */</title>
		<link rel="alternate" type="text/html" href="https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=111&amp;oldid=prev"/>
		<updated>2020-01-26T20:52:03Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;What if there are more outliers than inliers?&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 20:52, 26 January 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l55&quot; &gt;Line 55:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 55:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Second, one might try to apply robust Bayesian inference to the data, which would yield a set of posterior beliefs. This framework has yet to be defined.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Second, one might try to apply robust Bayesian inference to the data, which would yield a set of posterior beliefs. This framework has yet to be defined.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Third, we may assume that the data are more or less [[data certification|certified]]. This is a natural setting, as we humans often judge the reliability of raw data based on its source, and we usually consider a continuum of reliability, rather than a clear-cut binary classification. Algorithms should probably do the same at some point, but there does not yet seem to be an algorithmic framework to pose this problem. In particular, it could be interesting to analyze threat models. (An illustrative sketch follows this table.)&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Third, we may assume that the data are more or less [[data certification|certified]]. This is a natural setting, as we humans often judge the reliability of raw data based on its source, and we usually consider a continuum of reliability, rather than a clear-cut binary classification. Algorithms should probably do the same at some point, but there does not yet seem to be an algorithmic framework to pose this problem. In particular, it could be interesting to analyze threat models &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;where different degrees or sorts of certification have different levels of liability&lt;/ins&gt;. (An illustrative sketch follows this table.)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Robust statistics for neural networks ==&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Robust statistics for neural networks ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
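As noted above, no algorithmic framework for graded certification exists yet, so the following Python sketch is purely hypothetical: it merely illustrates one way a continuum of reliability could enter an estimator, by attaching a certification score in [0, 1] to each data point and replacing the plain median with a weighted median. All names and numbers are invented for illustration.

```python
# Purely hypothetical: certification scores in [0, 1] feed a weighted median.
import numpy as np

def weighted_median(values: np.ndarray, weights: np.ndarray) -> float:
    """Smallest value v such that the total weight of points <= v reaches
    half of the overall weight (reduces to the median for equal weights)."""
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cumulative = np.cumsum(weights)
    return float(values[np.searchsorted(cumulative, 0.5 * weights.sum())])

# Well-certified points (weight 1.0) near 0; dubious ones (weight 0.1) at 50.
values = np.array([0.1, -0.2, 0.0, 0.3, 50.0, 50.0, 50.0])
weights = np.array([1.0, 1.0, 1.0, 1.0, 0.1, 0.1, 0.1])
print(weighted_median(values, weights))  # stays near 0 despite the outliers
```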
		<author><name>Lê Nguyên Hoang</name></author>
		
	</entry>
	<entry>
		<id>https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=110&amp;oldid=prev</id>
		<title>Lê Nguyên Hoang: /* What if there are more outliers than inliers? */</title>
		<link rel="alternate" type="text/html" href="https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=110&amp;oldid=prev"/>
		<updated>2020-01-26T20:51:31Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;What if there are more outliers than inliers?&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 20:51, 26 January 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l55&quot; &gt;Line 55:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 55:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Second, one might try to apply robust Bayesian inference to the data, which would yield a set of posterior beliefs. This framework has yet to be defined.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Second, one might try to apply robust Bayesian inference to the data, which would yield a set of posterior beliefs. This framework has yet to be defined.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Third, we may assume that the data are more or less certified. This is a natural setting, as we humans often judge the reliability of raw data based on its source, and we usually consider a continuum of reliability, rather than a clear-cut binary classification. Algorithms should probably do the same at some point, but there does not yet seem to be an algorithmic framework to pose this problem.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Third, we may assume that the data are more or less &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[[data certification|&lt;/ins&gt;certified&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]]&lt;/ins&gt;. This is a natural setting, as we humans often judge the reliability of raw data based on its source, and we usually consider a continuum of reliability, rather than a clear-cut binary classification. Algorithms should probably do the same at some point, but there does not yet seem to be an algorithmic framework to pose this problem&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;. In particular, it could be interesting to analyze threat models&lt;/ins&gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Robust statistics for neural networks ==&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Robust statistics for neural networks ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Lê Nguyên Hoang</name></author>
		
	</entry>
	<entry>
		<id>https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=99&amp;oldid=prev</id>
		<title>Lê Nguyên Hoang at 08:00, 25 January 2020</title>
		<link rel="alternate" type="text/html" href="https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=99&amp;oldid=prev"/>
		<updated>2020-01-25T08:00:14Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 08:00, 25 January 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot; &gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Robust statistics is the problem of estimating parameters from unreliable empirical data. Typically, suppose that a fraction of the training dataset is compromised. Can we design algorithms that nevertheless succeed in learning adequately from such a partially compromised dataset?&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Robust statistics is the problem of estimating parameters from unreliable empirical data. Typically, suppose that a fraction of the training dataset is compromised. Can we design algorithms that nevertheless succeed in learning adequately from such a partially compromised dataset?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;This question has arguably become crucial, as large-scale algorithms perform learning from users' data. Yet, clearly, if the algorithm is used by thousands, millions or billions of users, many of the data will likely be corrupted, because of bugs [https://www.&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;sciencemag&lt;/del&gt;.&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;org&lt;/del&gt;/&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;news/2016/08/one-five-genetics-papers-contains-errors-thanks-microsoft-excel Boddy16&lt;/del&gt;], or because some users will maliciously want to exploit or attack the algorithm. This latter case is known as a &amp;lt;em&amp;gt;poisoning attack&amp;lt;/em&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;This question has arguably become crucial, as large-scale algorithms perform learning from users' data. Yet, clearly, if the algorithm is used by thousands, millions or billions of users, many of the data will likely be corrupted, because of bugs [https://www.&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;youtube&lt;/ins&gt;.&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;com&lt;/ins&gt;/&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;watch?v=yb2zkxHDfUE standupmaths20&lt;/ins&gt;], or because some users will maliciously want to exploit or attack the algorithm. This latter case is known as a &amp;lt;em&amp;gt;poisoning attack&amp;lt;/em&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Over the last three years, there have been fascinating recent advances, both for classical statistical tasks [https://arxiv.org/pdf/1911.05911.pdf DiakonikolasKane][https://dblp.org/rec/bibtex/journals/corr/abs-1911-05911 19] [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] and [[neural networks]] [http://papers.nips.cc/paper/6617-machine-learning-with-adversaries-byzantine-tolerant-gradient-descent.pdf BMGS][https://dblp.org/rec/bibtex/conf/nips/BlanchardMGS17 17].&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Over the last three years, there have been fascinating recent advances, both for classical statistical tasks [https://arxiv.org/pdf/1911.05911.pdf DiakonikolasKane][https://dblp.org/rec/bibtex/journals/corr/abs-1911-05911 19] [https://arxiv.org/pdf/1906.03058 DepersinLecué][https://scholar.google.ch/scholar?hl=en&amp;amp;as_sdt=0%2C5&amp;amp;q=Robust+subgaussian+estimation+of+a+mean+vector+in+nearly+linear+time&amp;amp;btnG= 19] and [[neural networks]] [http://papers.nips.cc/paper/6617-machine-learning-with-adversaries-byzantine-tolerant-gradient-descent.pdf BMGS][https://dblp.org/rec/bibtex/conf/nips/BlanchardMGS17 17].&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Lê Nguyên Hoang</name></author>
		
	</entry>
	<entry>
		<id>https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=96&amp;oldid=prev</id>
		<title>Lê Nguyên Hoang: /* Poisoning models */</title>
		<link rel="alternate" type="text/html" href="https://robustlybeneficial.org/wiki/index.php?title=Robust_statistics&amp;diff=96&amp;oldid=prev"/>
		<updated>2020-01-24T17:09:10Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Poisoning models&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 17:09, 24 January 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l21&quot; &gt;Line 21:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 21:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Other models include an adversary with only erasing power, or an adversary that must choose its &amp;lt;em&amp;gt;outliers&amp;lt;/em&amp;gt; without knowledge of the values of the &amp;lt;em&amp;gt;inliers&amp;lt;/em&amp;gt;. Evidently, any guarantee for such weaker poisoning models will also hold for stronger poisoning models [https://arxiv.org/pdf/1911.05911.pdf DiakonikolasKane][https://dblp.org/rec/bibtex/journals/corr/abs-1911-05911 19].&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Other models include an adversary with only erasing power, or an adversary that must choose its &amp;lt;em&amp;gt;outliers&amp;lt;/em&amp;gt; without knowledge of the values of the &amp;lt;em&amp;gt;inliers&amp;lt;/em&amp;gt;. Evidently, any guarantee for such weaker poisoning models will also hold for stronger poisoning models [https://arxiv.org/pdf/1911.05911.pdf DiakonikolasKane][https://dblp.org/rec/bibtex/journals/corr/abs-1911-05911 19].&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Perhaps the most general form of poisoning attack is the following. Consider a &amp;quot;true dataset&amp;quot; &amp;lt;math&amp;gt;D&amp;lt;/math&amp;gt;. However, the attacker gets to distort the dataset using a distortion function &amp;lt;math&amp;gt;f \in \mathcal F&amp;lt;/math&amp;gt;, thereby yielding &amp;lt;math&amp;gt;f(D)&amp;lt;/math&amp;gt;. Suppose we now have a best-possible machine learning algorithm &amp;lt;math&amp;gt;ML&amp;lt;/math&amp;gt; that learns from data. It would ideally compute &amp;lt;math&amp;gt;ML(D)&amp;lt;/math&amp;gt;. But we can only exploit &amp;lt;math&amp;gt;f(D)&amp;lt;/math&amp;gt;, by some hopefully robust machine learning algorithm &amp;lt;math&amp;gt;RML(f(D))&amp;lt;/math&amp;gt;. What we would like is to guarantee that &amp;lt;math&amp;gt;d(ML(D),RML(f(D))) &amp;lt; bound&amp;lt;/math&amp;gt;, for a suitable distance &amp;lt;math&amp;gt;d&amp;lt;/math&amp;gt; and any &amp;lt;math&amp;gt;f \in \mathcal F&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;RML&amp;lt;/math&amp;gt; is tractable. (A numerical illustration follows this table.)&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Perhaps the most general form of poisoning attack is the following. Consider a &amp;quot;true dataset&amp;quot; &amp;lt;math&amp;gt;D&amp;lt;/math&amp;gt;. However, the attacker gets to distort the dataset using a distortion function &amp;lt;math&amp;gt;f \in \mathcal F&amp;lt;/math&amp;gt;, thereby yielding &amp;lt;math&amp;gt;f(D)&amp;lt;/math&amp;gt;. Suppose we now have a best-possible machine learning algorithm &amp;lt;math&amp;gt;ML&amp;lt;/math&amp;gt; that learns from data. It would ideally compute &amp;lt;math&amp;gt;ML(D)&amp;lt;/math&amp;gt;. But we can only exploit &amp;lt;math&amp;gt;f(D)&amp;lt;/math&amp;gt;, by some hopefully robust machine learning algorithm &amp;lt;math&amp;gt;RML(f(D))&amp;lt;/math&amp;gt;. What we would like is to guarantee that &amp;lt;math&amp;gt;d(ML(D),RML(f(D))) &amp;lt; bound&amp;lt;/math&amp;gt;, for a suitable distance &amp;lt;math&amp;gt;d&amp;lt;/math&amp;gt; and any &amp;lt;math&amp;gt;f \in \mathcal F&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;RML&amp;lt;/math&amp;gt; is tractable&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;. A further generalization of this could consist in assuming a prior probabilistic belief on the set &amp;lt;math&amp;gt;\mathcal F&amp;lt;/math&amp;gt; of attack models that we need to defend against&lt;/ins&gt;. (A numerical illustration follows this table.)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;It is noteworthy that robustness to such attacks is useful, even if there is no adversary. Indeed, data may still get corrupted, because of bugs, crashes or misuse by a human operator (see for instance the case of gene mutations caused by Excel that plagued genetic research [https://www.sciencemag.org/news/2016/08/one-five-genetics-papers-contains-errors-thanks-microsoft-excel Boddy16]). Our algorithms need to remain performant despite such issues. An algorithm robust to strong attacks will be robust to such weaker flaws.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;It is noteworthy that robustness to such attacks is useful, even if there is no adversary. Indeed, data may still get corrupted, because of bugs, crashes or misuse by a human operator (see for instance the case of gene mutations caused by Excel that plagued genetic research [https://www.sciencemag.org/news/2016/08/one-five-genetics-papers-contains-errors-thanks-microsoft-excel Boddy16]). Our algorithms need to remain performant despite such issues. An algorithm robust to strong attacks will be robust to such weaker flaws.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
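To make the guarantee d(ML(D), RML(f(D))) &lt; bound concrete, here is a small numerical sketch (our own, not from the cited references), assuming D is drawn from N(0, 1), ML is the empirical mean, f replaces an ε-fraction of the points with a huge constant, and RML is the median. The naive mean on f(D) is shifted by roughly ε times the constant, while the median stays within O(ε) of ML(D).

```python
# Hypothetical illustration: mean vs. median under eps-fraction poisoning.
import numpy as np

rng = np.random.default_rng(1)
n, eps = 10_000, 0.1
D = rng.standard_normal(n)        # "true dataset" D ~ N(0, 1)
fD = D.copy()
fD[: int(eps * n)] = 1e6          # attacker's distortion f: corrupt eps*n points

ml_on_true = D.mean()             # ML(D): what we would ideally compute
naive = fD.mean()                 # mean on f(D): shifted by ~ eps * 1e6
robust = np.median(fD)            # RML(f(D)): off by O(eps) only

print(abs(naive - ml_on_true))    # ~ 1e5, grows with the poisoning constant
print(abs(robust - ml_on_true))   # ~ 0.14 here, i.e. O(eps)
```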
		<author><name>Lê Nguyên Hoang</name></author>
		
	</entry>
</feed>