<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://en.formulasearchengine.com/index.php?action=history&amp;feed=atom&amp;title=Indecomposable_distribution</id>
	<title>Indecomposable distribution - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://en.formulasearchengine.com/index.php?action=history&amp;feed=atom&amp;title=Indecomposable_distribution"/>
	<link rel="alternate" type="text/html" href="https://en.formulasearchengine.com/index.php?title=Indecomposable_distribution&amp;action=history"/>
	<updated>2026-06-02T00:39:25Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.43.0-wmf.28</generator>
	<entry>
		<id>https://en.formulasearchengine.com/index.php?title=Indecomposable_distribution&amp;diff=249378&amp;oldid=prev</id>
		<title>en&gt;Michael Hardy: /* Related concepts */</title>
		<link rel="alternate" type="text/html" href="https://en.formulasearchengine.com/index.php?title=Indecomposable_distribution&amp;diff=249378&amp;oldid=prev"/>
		<updated>2014-06-29T18:47:53Z</updated>

		<summary type="html">&lt;p&gt;&lt;span class=&quot;autocomment&quot;&gt;Related concepts&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 20:47, 29 June 2014&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot;&gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;{{no footnotes|date=September 2013}}&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Andrew Simcox &lt;/ins&gt;is the &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;name his parents gave him &lt;/ins&gt;and &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;he totally loves this title&lt;/ins&gt;. &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Her family lives &lt;/ins&gt;in &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Ohio&lt;/ins&gt;. &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;To climb &lt;/ins&gt;is some &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;thing I really appreciate doing&lt;/ins&gt;. &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;She functions &lt;/ins&gt;as a &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;travel agent but quickly she&lt;/ins&gt;&#039;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;ll be &lt;/ins&gt;on &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;her own&lt;/ins&gt;.&amp;lt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;br&lt;/ins&gt;&amp;gt;&amp;lt;br&amp;gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;My blog&lt;/ins&gt;; &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;psychic solutions &lt;/ins&gt;by &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;lynne &lt;/ins&gt;([&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;http&lt;/ins&gt;://&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;www&lt;/ins&gt;.&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;seekavideo&lt;/ins&gt;.&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;com&lt;/ins&gt;/&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;playlist&lt;/ins&gt;/&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;2199&lt;/ins&gt;/&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;video&lt;/ins&gt;/ www.&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;seekavideo&lt;/ins&gt;.&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;com&lt;/ins&gt;])&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;{{ref improve|date=September 2013}}&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;The &#039;&#039;&#039;cross-entropy (CE) method&#039;&#039;&#039; attributed to [[Reuven Rubinstein]] &lt;/del&gt;is &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;a general [[Monte Carlo method|Monte Carlo]] approach to&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[Combinatorial optimization|combinatorial]] and [[Continuous optimization|continuous]] multi-extremal [[Optimization (mathematics)|optimization]] and [[importance sampling]]. &lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;The method originated from &lt;/del&gt;the &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;field of &#039;&#039;rare event simulation&#039;&#039;, where&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;very small probabilities need to be accurately estimated, for example in  network reliability analysis, queueing models, or performance analysis of telecommunication systems.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;The CE method can be applied to static &lt;/del&gt;and &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;noisy combinatorial optimization problems such as the [[traveling salesman problem]], the [[quadratic assignment problem]], [[Sequence_alignment|DNA sequence alignment]], the [[Maxcut|max-cut]] problem and the buffer allocation problem, as well as continuous [[global optimization]] problems with many local [[extremum|extrema]]&lt;/del&gt;.&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;In a nutshell the CE method consists of two phases:&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;#Generate a random data sample (trajectories, vectors, etc.) according to a specified mechanism.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;#Update the parameters of the random mechanism based on the data to produce a &quot;better&quot; sample &lt;/del&gt;in &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;the next iteration&lt;/del&gt;. &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;This step involves minimizing the [[cross entropy|&#039;&#039;cross-entropy&#039;&#039;]] or  [[Kullback–Leibler divergence]].&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; &lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;==Estimation via importance sampling==&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Consider the general problem of estimating the quantity &amp;lt;math&amp;gt;\ell = \mathbb{E}_{\mathbf{u}}[H(\mathbf{X})] = \int H(\mathbf{x})\, f(\mathbf{x}; \mathbf{u})\, \textrm{d}\mathbf{x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;H&amp;lt;/math&amp;gt; &lt;/del&gt;is some &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&#039;&#039;performance function&#039;&#039; and &amp;lt;math&amp;gt;f(\mathbf{x};\mathbf{u})&amp;lt;/math&amp;gt; is a member of some [[parametric family]] of distributions&lt;/del&gt;. &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Using [[importance sampling]] this quantity can be estimated &lt;/del&gt;as &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;\hat{\ell} = \frac{1}{N} \sum_{i=1}^N H(\mathbf{X}_i) \frac{f(\mathbf{X}_i; \mathbf{u})}{g(\mathbf{X}_i)}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathbf{X}_1,\dots,\mathbf{X}_N&amp;lt;/math&amp;gt; is &lt;/del&gt;a &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;random sample from &amp;lt;math&amp;gt;g\,&amp;lt;/math&amp;gt;. For positive &amp;lt;math&amp;gt;H&amp;lt;/math&amp;gt;, the theoretically &lt;/del&gt;&#039;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&#039;optimal&#039;&#039; importance sampling [[probability density function|density]] (pdf)is given by &lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt; g^*(\mathbf{x}) = H(\mathbf{x}) f(\mathbf{x};\mathbf{u})/\ell&amp;lt;/math&amp;gt;. This, however, depends &lt;/del&gt;on &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;the unknown &amp;lt;math&amp;gt;\ell&amp;lt;/math&amp;gt;. The CE method aims to approximate the optimal PDF by adaptively selecting members of the parametric family that are closest (in the [[Kullback–Leibler divergence|Kullback–Leibler]] sense) to the optimal PDF &amp;lt;math&amp;gt;g^*&amp;lt;/math&amp;gt;&lt;/del&gt;.&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;==Generic CE algorithm==&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;# Choose initial parameter vector &lt;/del&gt;&amp;lt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;math&amp;gt;\mathbf{v}^{(0)}&amp;lt;/math&amp;gt;; set t = 1.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;# Generate a random sample &amp;lt;math&amp;gt;\mathbf{X}_1,\dots,\mathbf{X}_N&amp;lt;/math&amp;gt; from &amp;lt;math&amp;gt;f(\cdot;\mathbf{v}^{(t-1)})&amp;lt;/math&lt;/del&gt;&amp;gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;/p&amp;gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;# Solve for &amp;lt;math&amp;gt;\mathbf{v}^{(t)}&amp;lt;/math&amp;gt;, where&lt;/del&gt;&amp;lt;br&amp;gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;\mathbf{v}^{(t)} = \mathop{\textrm{argmax}}_{\mathbf{v}} \frac{1}{N} \sum_{i=1}^N H(\mathbf{X}_i)\frac{f(\mathbf{X}_i;\mathbf{u})}{f(\mathbf{X}_i;\mathbf{v}^{(t-1)})} \log f(\mathbf{X}_i;\mathbf{v})&amp;lt;/math&amp;gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;# If convergence is reached then &#039;&#039;&#039;stop&#039;&#039;&#039;&lt;/del&gt;; &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;otherwise, increase t &lt;/del&gt;by &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;1 and reiterate from step 2.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;In several cases, the solution to step 3 can be found &#039;&#039;analytically&#039;&#039;.  Situations in which this occurs are&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* When &amp;lt;math&amp;gt;f\,&amp;lt;/math&amp;gt; belongs to the [[Exponential_family|natural exponential family]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* When &amp;lt;math&amp;gt;f\,&amp;lt;/math&amp;gt; is [[discrete space|discrete]] with finite [[Support (mathematics)|support]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* When &amp;lt;math&amp;gt;H(\mathbf{X}) = \mathrm{I}_{\{\mathbf{x}\in A\}}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;f(\mathbf{X}_i;\mathbf{u}) = f(\mathbf{X}_i;\mathbf{v}^{&lt;/del&gt;(&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;t-1)})&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;\mathbf{v}^{(t)}&amp;lt;/math&amp;gt; corresponds to the &lt;/del&gt;[&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[Maximum likelihood|maximum likelihood estimator]] based on those &amp;lt;math&amp;gt;\mathbf{X}_k \in A&amp;lt;/math&amp;gt;.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;== Continuous optimization&amp;amp;mdash;example==&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;The same CE algorithm can be used for optimization, rather than estimation. &lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Suppose the problem is to maximize some function &amp;lt;math&amp;gt;S(x)&amp;lt;/math&amp;gt;, for example, &lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;S(x) = \textrm{e}^{-(x-2)^2} + 0.8\,\textrm{e}^{-(x+2)^2}&amp;lt;/math&amp;gt;. &lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;To apply CE, one considers first the &#039;&#039;associated stochastic problem&#039;&#039; of estimating&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;\mathbb{P}_{\boldsymbol{\theta}}(S(X)\geq\gamma)&amp;lt;/math&amp;gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;for a given &#039;&#039;level&#039;&#039; &amp;lt;math&amp;gt;\gamma\,&amp;lt;/math&amp;gt;, and parametric family &amp;lt;math&amp;gt;\left\{f(\cdot;\boldsymbol{\theta})\right\}&amp;lt;/math&amp;gt;, for example the 1-dimensional &lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[Gaussian distribution]],&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;parameterized by its mean &amp;lt;math&amp;gt;\mu_t\,&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; (so &amp;lt;math&amp;gt;\boldsymbol{\theta} = (\mu,\sigma^2)&amp;lt;/math&amp;gt; here).&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Hence, for a given &amp;lt;math&amp;gt;\gamma\,&amp;lt;/math&amp;gt;, the goal is to find &amp;lt;math&amp;gt;\boldsymbol{\theta}&amp;lt;/math&amp;gt; so that&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;D_{\mathrm{KL}}(\textrm{I}_{\{S(x)\geq\gamma\}}\|f_{\boldsymbol{\theta}})&amp;lt;/math&amp;gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;is minimized. This is done by solving the sample version (stochastic counterpart) of the KL divergence minimization problem, as in step 3 above.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;It turns out that parameters that minimize the stochastic counterpart for this choice of target distribution and&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;parametric family are the sample mean and sample variance corresponding to the &#039;&#039;elite samples&#039;&#039;, which are those samples that have objective function value &amp;lt;math&amp;gt;\geq\gamma&amp;lt;/math&amp;gt;.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;The worst of the elite samples is then used as the level parameter for the next iteration.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;This yields the following randomized algorithm that happens to coincide with the so-called Estimation of Multivariate Normal Algorithm (EMNA), an [[estimation of distribution algorithm]].&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;===Pseudo-code===&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; 1. mu:=-6; sigma2:=100; t&lt;/del&gt;:&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;=0; maxits=100;    &lt;/del&gt;// &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Initialize parameters&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; 2. N:=100; Ne:=10;                           //&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; 3. while t &amp;lt; maxits and sigma2 &amp;gt; epsilon     // While maxits not exceeded and not converged&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; 4.  X = SampleGaussian(mu,sigma2,N);         // Obtain N samples from current sampling distribution&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; 5.  S = exp(-(X-2)^2) + 0.8 exp(-(X+2)^2);   // Evaluate objective function at sampled points&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; 6.  X = sort(X,S);                           // Sort X by objective function values (in descending order)&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; 7.  mu = mean(X(1:Ne)); sigma2=var(X(1:Ne)); // Update parameters of sampling distribution&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; 8&lt;/del&gt;. &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; t = t+1;                                 // Increment iteration counter&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; 9&lt;/del&gt;. &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;return mu                                 &lt;/del&gt;// &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Return mean of final sampling distribution as solution&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;==Related methods==&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*[[Simulated annealing]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*[[Genetic algorithms]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*[[Harmony search]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*[[Estimation of distribution algorithm]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*[[Tabu search]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;==See also==&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*[[Cross entropy]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*[[Kullback–Leibler divergence]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*[[Randomized algorithm]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*[[Importance sampling]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;==References==&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*De Boer, P-T., Kroese, D.P, Mannor, S. and Rubinstein, R.Y. (2005). A Tutorial on the Cross-Entropy Method. &#039;&#039;Annals of Operations Research&#039;&#039;, &#039;&#039;&#039;134&#039;&#039;&#039; (1), 19–67.[http:&lt;/del&gt;//www.&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;maths.uq.edu.au/~kroese/ps/aortut&lt;/del&gt;.&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;pdf&lt;/del&gt;]&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*Rubinstein, R.Y. (1997). Optimization of Computer simulation Models with Rare Events, &#039;&#039;European Journal of Operations Research&#039;&#039;, &#039;&#039;&#039;99&#039;&#039;&#039;, 89–112.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*Rubinstein, R.Y., Kroese, D.P. (2004&lt;/del&gt;)&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;. &#039;&#039;The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning&#039;&#039;, Springer-Verlag, New York.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;==External links==&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*[http://iew3.technion.ac.il/CE/ Homepage for the CE method]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[Category:Heuristics]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[Category:Optimization algorithms and methods]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[Category:Monte Carlo methods]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[Category:Machine learning]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>en&gt;Michael Hardy</name></author>
	</entry>
	<entry>
		<id>https://en.formulasearchengine.com/index.php?title=Indecomposable_distribution&amp;diff=14332&amp;oldid=prev</id>
		<title>174.53.163.119: /* Indecomposable */ spacing</title>
		<link rel="alternate" type="text/html" href="https://en.formulasearchengine.com/index.php?title=Indecomposable_distribution&amp;diff=14332&amp;oldid=prev"/>
		<updated>2011-08-19T17:47:16Z</updated>

		<summary type="html">&lt;p&gt;&lt;span class=&quot;autocomment&quot;&gt;Indecomposable: &lt;/span&gt; spacing&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;{{no footnotes|date=September 2013}}&lt;br /&gt;
{{ref improve|date=September 2013}}&lt;br /&gt;
The &amp;#039;&amp;#039;&amp;#039;cross-entropy (CE) method&amp;#039;&amp;#039;&amp;#039; attributed to [[Reuven Rubinstein]] is a general [[Monte Carlo method|Monte Carlo]] approach to&lt;br /&gt;
[[Combinatorial optimization|combinatorial]] and [[Continuous optimization|continuous]] multi-extremal [[Optimization (mathematics)|optimization]] and [[importance sampling]]. &lt;br /&gt;
The method originated from the field of &amp;#039;&amp;#039;rare event simulation&amp;#039;&amp;#039;, where&lt;br /&gt;
very small probabilities need to be accurately estimated, for example in  network reliability analysis, queueing models, or performance analysis of telecommunication systems.&lt;br /&gt;
The CE method can be applied to static and noisy combinatorial optimization problems such as the [[traveling salesman problem]], the [[quadratic assignment problem]], [[Sequence_alignment|DNA sequence alignment]], the [[Maxcut|max-cut]] problem and the buffer allocation problem, as well as continuous [[global optimization]] problems with many local [[extremum|extrema]].&lt;br /&gt;
&lt;br /&gt;
In a nutshell the CE method consists of two phases:&lt;br /&gt;
&lt;br /&gt;
#Generate a random data sample (trajectories, vectors, etc.) according to a specified mechanism.&lt;br /&gt;
#Update the parameters of the random mechanism based on the data to produce a &amp;quot;better&amp;quot; sample in the next iteration. This step involves minimizing the [[cross entropy|&amp;#039;&amp;#039;cross-entropy&amp;#039;&amp;#039;]] or  [[Kullback–Leibler divergence]].&lt;br /&gt;
 &lt;br /&gt;
==Estimation via importance sampling==&lt;br /&gt;
Consider the general problem of estimating the quantity &amp;lt;math&amp;gt;\ell = \mathbb{E}_{\mathbf{u}}[H(\mathbf{X})] = \int H(\mathbf{x})\, f(\mathbf{x}; \mathbf{u})\, \textrm{d}\mathbf{x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;H&amp;lt;/math&amp;gt; is some &amp;#039;&amp;#039;performance function&amp;#039;&amp;#039; and &amp;lt;math&amp;gt;f(\mathbf{x};\mathbf{u})&amp;lt;/math&amp;gt; is a member of some [[parametric family]] of distributions. Using [[importance sampling]] this quantity can be estimated as &amp;lt;math&amp;gt;\hat{\ell} = \frac{1}{N} \sum_{i=1}^N H(\mathbf{X}_i) \frac{f(\mathbf{X}_i; \mathbf{u})}{g(\mathbf{X}_i)}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathbf{X}_1,\dots,\mathbf{X}_N&amp;lt;/math&amp;gt; is a random sample from &amp;lt;math&amp;gt;g\,&amp;lt;/math&amp;gt;. For positive &amp;lt;math&amp;gt;H&amp;lt;/math&amp;gt;, the theoretically &amp;#039;&amp;#039;optimal&amp;#039;&amp;#039; importance sampling [[probability density function|density]] (pdf)is given by &lt;br /&gt;
&amp;lt;math&amp;gt; g^*(\mathbf{x}) = H(\mathbf{x}) f(\mathbf{x};\mathbf{u})/\ell&amp;lt;/math&amp;gt;. This, however, depends on the unknown &amp;lt;math&amp;gt;\ell&amp;lt;/math&amp;gt;. The CE method aims to approximate the optimal PDF by adaptively selecting members of the parametric family that are closest (in the [[Kullback–Leibler divergence|Kullback–Leibler]] sense) to the optimal PDF &amp;lt;math&amp;gt;g^*&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==Generic CE algorithm==&lt;br /&gt;
# Choose initial parameter vector &amp;lt;math&amp;gt;\mathbf{v}^{(0)}&amp;lt;/math&amp;gt;; set t = 1.&lt;br /&gt;
# Generate a random sample &amp;lt;math&amp;gt;\mathbf{X}_1,\dots,\mathbf{X}_N&amp;lt;/math&amp;gt; from &amp;lt;math&amp;gt;f(\cdot;\mathbf{v}^{(t-1)})&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
# Solve for &amp;lt;math&amp;gt;\mathbf{v}^{(t)}&amp;lt;/math&amp;gt;, where&amp;lt;br&amp;gt;&amp;lt;math&amp;gt;\mathbf{v}^{(t)} = \mathop{\textrm{argmax}}_{\mathbf{v}} \frac{1}{N} \sum_{i=1}^N H(\mathbf{X}_i)\frac{f(\mathbf{X}_i;\mathbf{u})}{f(\mathbf{X}_i;\mathbf{v}^{(t-1)})} \log f(\mathbf{X}_i;\mathbf{v})&amp;lt;/math&amp;gt;&lt;br /&gt;
# If convergence is reached then &amp;#039;&amp;#039;&amp;#039;stop&amp;#039;&amp;#039;&amp;#039;; otherwise, increase t by 1 and reiterate from step 2.&lt;br /&gt;
&lt;br /&gt;
In several cases, the solution to step 3 can be found &amp;#039;&amp;#039;analytically&amp;#039;&amp;#039;.  Situations in which this occurs are&lt;br /&gt;
* When &amp;lt;math&amp;gt;f\,&amp;lt;/math&amp;gt; belongs to the [[Exponential_family|natural exponential family]]&lt;br /&gt;
* When &amp;lt;math&amp;gt;f\,&amp;lt;/math&amp;gt; is [[discrete space|discrete]] with finite [[Support (mathematics)|support]]&lt;br /&gt;
* When &amp;lt;math&amp;gt;H(\mathbf{X}) = \mathrm{I}_{\{\mathbf{x}\in A\}}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;f(\mathbf{X}_i;\mathbf{u}) = f(\mathbf{X}_i;\mathbf{v}^{(t-1)})&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;\mathbf{v}^{(t)}&amp;lt;/math&amp;gt; corresponds to the [[Maximum likelihood|maximum likelihood estimator]] based on those &amp;lt;math&amp;gt;\mathbf{X}_k \in A&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Continuous optimization&amp;amp;mdash;example==&lt;br /&gt;
The same CE algorithm can be used for optimization, rather than estimation. &lt;br /&gt;
Suppose the problem is to maximize some function &amp;lt;math&amp;gt;S(x)&amp;lt;/math&amp;gt;, for example, &lt;br /&gt;
&amp;lt;math&amp;gt;S(x) = \textrm{e}^{-(x-2)^2} + 0.8\,\textrm{e}^{-(x+2)^2}&amp;lt;/math&amp;gt;. &lt;br /&gt;
To apply CE, one considers first the &amp;#039;&amp;#039;associated stochastic problem&amp;#039;&amp;#039; of estimating&lt;br /&gt;
&amp;lt;math&amp;gt;\mathbb{P}_{\boldsymbol{\theta}}(S(X)\geq\gamma)&amp;lt;/math&amp;gt;&lt;br /&gt;
for a given &amp;#039;&amp;#039;level&amp;#039;&amp;#039; &amp;lt;math&amp;gt;\gamma\,&amp;lt;/math&amp;gt;, and parametric family &amp;lt;math&amp;gt;\left\{f(\cdot;\boldsymbol{\theta})\right\}&amp;lt;/math&amp;gt;, for example the 1-dimensional &lt;br /&gt;
[[Gaussian distribution]],&lt;br /&gt;
parameterized by its mean &amp;lt;math&amp;gt;\mu_t\,&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; (so &amp;lt;math&amp;gt;\boldsymbol{\theta} = (\mu,\sigma^2)&amp;lt;/math&amp;gt; here).&lt;br /&gt;
Hence, for a given &amp;lt;math&amp;gt;\gamma\,&amp;lt;/math&amp;gt;, the goal is to find &amp;lt;math&amp;gt;\boldsymbol{\theta}&amp;lt;/math&amp;gt; so that&lt;br /&gt;
&amp;lt;math&amp;gt;D_{\mathrm{KL}}(\textrm{I}_{\{S(x)\geq\gamma\}}\|f_{\boldsymbol{\theta}})&amp;lt;/math&amp;gt;&lt;br /&gt;
is minimized. This is done by solving the sample version (stochastic counterpart) of the KL divergence minimization problem, as in step 3 above.&lt;br /&gt;
It turns out that parameters that minimize the stochastic counterpart for this choice of target distribution and&lt;br /&gt;
parametric family are the sample mean and sample variance corresponding to the &amp;#039;&amp;#039;elite samples&amp;#039;&amp;#039;, which are those samples that have objective function value &amp;lt;math&amp;gt;\geq\gamma&amp;lt;/math&amp;gt;.&lt;br /&gt;
The worst of the elite samples is then used as the level parameter for the next iteration.&lt;br /&gt;
This yields the following randomized algorithm that happens to coincide with the so-called Estimation of Multivariate Normal Algorithm (EMNA), an [[estimation of distribution algorithm]].&lt;br /&gt;
&lt;br /&gt;
===Pseudo-code===&lt;br /&gt;
 1. mu:=-6; sigma2:=100; t:=0; maxits=100;    // Initialize parameters&lt;br /&gt;
 2. N:=100; Ne:=10;                           //&lt;br /&gt;
 3. while t &amp;lt; maxits and sigma2 &amp;gt; epsilon     // While maxits not exceeded and not converged&lt;br /&gt;
 4.  X = SampleGaussian(mu,sigma2,N);         // Obtain N samples from current sampling distribution&lt;br /&gt;
 5.  S = exp(-(X-2)^2) + 0.8 exp(-(X+2)^2);   // Evaluate objective function at sampled points&lt;br /&gt;
 6.  X = sort(X,S);                           // Sort X by objective function values (in descending order)&lt;br /&gt;
 7.  mu = mean(X(1:Ne)); sigma2=var(X(1:Ne)); // Update parameters of sampling distribution&lt;br /&gt;
 8.  t = t+1;                                 // Increment iteration counter&lt;br /&gt;
 9. return mu                                 // Return mean of final sampling distribution as solution&lt;br /&gt;
&lt;br /&gt;
==Related methods==&lt;br /&gt;
*[[Simulated annealing]]&lt;br /&gt;
*[[Genetic algorithms]]&lt;br /&gt;
*[[Harmony search]]&lt;br /&gt;
*[[Estimation of distribution algorithm]]&lt;br /&gt;
*[[Tabu search]]&lt;br /&gt;
&lt;br /&gt;
==See also==&lt;br /&gt;
*[[Cross entropy]]&lt;br /&gt;
*[[Kullback–Leibler divergence]]&lt;br /&gt;
*[[Randomized algorithm]]&lt;br /&gt;
*[[Importance sampling]]&lt;br /&gt;
&lt;br /&gt;
==References==&lt;br /&gt;
*De Boer, P-T., Kroese, D.P, Mannor, S. and Rubinstein, R.Y. (2005). A Tutorial on the Cross-Entropy Method. &amp;#039;&amp;#039;Annals of Operations Research&amp;#039;&amp;#039;, &amp;#039;&amp;#039;&amp;#039;134&amp;#039;&amp;#039;&amp;#039; (1), 19–67.[http://www.maths.uq.edu.au/~kroese/ps/aortut.pdf]&lt;br /&gt;
*Rubinstein, R.Y. (1997). Optimization of Computer simulation Models with Rare Events, &amp;#039;&amp;#039;European Journal of Operations Research&amp;#039;&amp;#039;, &amp;#039;&amp;#039;&amp;#039;99&amp;#039;&amp;#039;&amp;#039;, 89–112.&lt;br /&gt;
*Rubinstein, R.Y., Kroese, D.P. (2004). &amp;#039;&amp;#039;The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning&amp;#039;&amp;#039;, Springer-Verlag, New York.&lt;br /&gt;
&lt;br /&gt;
==External links==&lt;br /&gt;
*[http://iew3.technion.ac.il/CE/ Homepage for the CE method]&lt;br /&gt;
&lt;br /&gt;
[[Category:Heuristics]]&lt;br /&gt;
[[Category:Optimization algorithms and methods]]&lt;br /&gt;
[[Category:Monte Carlo methods]]&lt;br /&gt;
[[Category:Machine learning]]&lt;/div&gt;</summary>
		<author><name>174.53.163.119</name></author>
	</entry>
</feed>