<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://en.formulasearchengine.com/index.php?action=history&amp;feed=atom&amp;title=Locally_finite_operator</id>
	<title>Locally finite operator - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://en.formulasearchengine.com/index.php?action=history&amp;feed=atom&amp;title=Locally_finite_operator"/>
	<link rel="alternate" type="text/html" href="https://en.formulasearchengine.com/index.php?title=Locally_finite_operator&amp;action=history"/>
	<updated>2026-05-22T23:30:31Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.43.0-wmf.28</generator>
	<entry>
		<id>https://en.formulasearchengine.com/index.php?title=Locally_finite_operator&amp;diff=28251&amp;oldid=prev</id>
		<title>en&gt;David Eppstein: Added tags to the page using Page Curation (unreferenced)</title>
		<link rel="alternate" type="text/html" href="https://en.formulasearchengine.com/index.php?title=Locally_finite_operator&amp;diff=28251&amp;oldid=prev"/>
		<updated>2012-11-09T06:33:20Z</updated>

		<summary type="html">&lt;p&gt;Added tags to the page using &lt;a href=&quot;https://en.wikipedia.org/wiki/Page_Curation&quot; class=&quot;extiw&quot; title=&quot;wikipedia:Page Curation&quot;&gt;Page Curation&lt;/a&gt; (unreferenced)&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;[[File:Svm 8 polinomial.JPG|thumb|300px|Illustration of the mapping &amp;lt;math&amp;gt;\varphi&amp;lt;/math&amp;gt;. On the left a set of samples in the input space, on the right the same samples in the feature space where the polynomial kernel &amp;lt;math&amp;gt;K(x,y)&amp;lt;/math&amp;gt; (for some values of the parameters &amp;lt;math&amp;gt;c&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;d&amp;lt;/math&amp;gt; is the inner product. The hyperplane learned in feature space by an SVM is an ellipse in the input space.]]&lt;br /&gt;
In [[machine learning]], the &amp;#039;&amp;#039;&amp;#039;polynomial kernel&amp;#039;&amp;#039;&amp;#039; is a [[kernel function]] commonly used with [[support vector machine]]s (SVMs) and other [[Kernel trick|kernelized]] models, that represents the similarity of vectors (training samples) in a feature space over polynomials of the original variables, allowing learning of non-linear models.&lt;br /&gt;
&lt;br /&gt;
Intuitively, the polynomial kernel looks not only at the given features of input samples to determine their similarity, but also combinations of these.&lt;br /&gt;
&lt;br /&gt;
==Definition==&lt;br /&gt;
For degree-&amp;#039;&amp;#039;d&amp;#039;&amp;#039; polynomials, the polynomial kernel is defined as&amp;lt;ref&amp;gt;http://www.cs.tufts.edu/~roni/Teaching/CLT/LN/lecture18.pdf&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;K(x,y) = (x^\top y + c)^d&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; are vectors in the &amp;#039;&amp;#039;input space&amp;#039;&amp;#039;, i.e. vectors of features computed from training or test samples, &amp;lt;math&amp;gt;c \geq 0&amp;lt;/math&amp;gt; is a constant trading off the influence of higher-order versus lower-order terms in the polynomial. When &amp;lt;math&amp;gt;c = 0&amp;lt;/math&amp;gt;, the kernel is called homogeneous.&amp;lt;ref&amp;gt;{{cite arXiv&lt;br /&gt;
|last=Shashua&lt;br /&gt;
|first=Amnon&lt;br /&gt;
|eprint=0904.3664&lt;br /&gt;
|title=Introduction to Machine Learning: Class Notes 67577&lt;br /&gt;
|class=cs.LG&lt;br /&gt;
|year=2009&lt;br /&gt;
|version=v1&lt;br /&gt;
|accessdate=26 March 2013&lt;br /&gt;
}}&amp;lt;/ref&amp;gt; (A further generalized polykernel divides &amp;lt;math&amp;gt;x^\top y&amp;lt;/math&amp;gt; by a user-specified scalar parameter &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt;.&amp;lt;ref name=&amp;quot;lin2012&amp;quot;/&amp;gt;)&lt;br /&gt;
&lt;br /&gt;
As a kernel, &amp;lt;math&amp;gt;K&amp;lt;/math&amp;gt; corresponds an inner product in a feature space based on some mapping &amp;lt;math&amp;gt;\varphi&amp;lt;/math&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;K(x,y) = \langle \varphi(x), \varphi(y) \rangle&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The nature of &amp;lt;math&amp;gt;\varphi&amp;lt;/math&amp;gt; can be glanced from an example. Let &amp;lt;math&amp;gt;d=2&amp;lt;/math&amp;gt;, so we get the special case of the quadratic kernel. Then&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;K(x,y) = \left(\sum_{i=1}^n x_i y_i + c\right)^2 = \sum_{i=1}^n x_i^2 y_i^2 + \sum_{i=2}^n \sum_{j=1}^{i-1} \sqrt{2} x_i y_i \sqrt{2} x_j y_j + \sum_{i=1}^n \sqrt{2c} x_i \sqrt{2c} y_i + c^2&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
From this it follows that the feature map is given by:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\varphi(x) = \langle x_n^2, \ldots, x_1^2, \sqrt{2} x_n x_{n-1}, \ldots, \sqrt{2} x_n x_1, \sqrt{2} x_{n-1} x_{n-2}, \ldots, \sqrt{2} x_{n-1} x_{1}, \ldots, \sqrt{2} x_{2} x_{1}, \sqrt{2c} x_n, \ldots, \sqrt{2c} x_1, c \rangle&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When the input features are binary-valued (booleans), &amp;lt;math&amp;gt;c = 1&amp;lt;/math&amp;gt; and the &amp;lt;math&amp;gt;\sqrt{2}&amp;lt;/math&amp;gt; terms are ignored, the mapped features correspond to [[Logical conjunction|conjunction]]s of input features.&amp;lt;ref name=&amp;quot;Goldberg2008&amp;quot;&amp;gt;Yoav Goldberg and Michael Elhadad (2008). splitSVM: Fast, Space-Efficient, non-Heuristic, Polynomial Kernel Computation for NLP Applications. Proc. ACL-08: HLT.&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Practical use==&lt;br /&gt;
Although the [[RBF kernel]] is more popular in SVM classification than the polynomial kernel, the latter is quite popular in [[natural language processing]] (NLP).&amp;lt;ref name=&amp;quot;Goldberg2008&amp;quot;/&amp;gt;&amp;lt;ref name=&amp;quot;Chang2010&amp;quot;&amp;gt;Yin-Wen Chang, Cho-Jui Hsieh, Kai-Wei Chang, Michael Ringgaard and Chih-Jen Lin (2010). [http://jmlr.csail.mit.edu/papers/v11/chang10a.html Training and testing low-degree polynomial data mappings via linear SVM]. J. Machine Learning Research &amp;#039;&amp;#039;&amp;#039;11&amp;#039;&amp;#039;&amp;#039;:1471–1490.&amp;lt;/ref&amp;gt;&lt;br /&gt;
The most common degree is &amp;#039;&amp;#039;d&amp;#039;&amp;#039;=2, since larger degrees tend to [[overfitting|overfit]] on NLP problems.&lt;br /&gt;
&lt;br /&gt;
Various ways of computing the polynomial kernel (both exact and approximate) have been devised as alternatives to the usual non-linear SVM training algorithms, including:&lt;br /&gt;
&lt;br /&gt;
* full expansion of the kernel prior to training/testing with a linear SVM,&amp;lt;ref name=&amp;quot;Chang2010&amp;quot;/&amp;gt; i.e. full computation of the mapping &amp;lt;math&amp;gt;\varphi&amp;lt;/math&amp;gt;&lt;br /&gt;
* [[Association rule learning|basket mining]] (using a variant of the [[apriori algorithm]]) for the most commonly occurring feature conjunctions in a training set to produce an approximate expansion&amp;lt;ref name=&amp;quot;Kudo2003&amp;quot;&amp;gt;T. Kudo and Y. Matsumoto (2003). Fast methods for kernel-based text analysis. Proc. ACL 2003.&amp;lt;/ref&amp;gt;&lt;br /&gt;
* [[inverted index]]ing of support vectors&amp;lt;ref name=&amp;quot;Kudo2003&amp;quot;/&amp;gt;&amp;lt;ref name=&amp;quot;Goldberg2008&amp;quot;/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
One problem with the polynomial kernel is that may it suffer from [[Numerical stability|numerical instability]]: when &amp;lt;math&amp;gt;x^\top y + c &amp;lt; 1&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;K(x, y) = (x^\top y + c)^d&amp;lt;/math&amp;gt; tends to zero as &amp;lt;math&amp;gt;d&amp;lt;/math&amp;gt; is increased, whereas when &amp;lt;math&amp;gt;x^\top y + c &amp;gt; 1&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;K(x, y)&amp;lt;/math&amp;gt; tends to infinity.&amp;lt;ref name=&amp;quot;lin2012&amp;quot;&amp;gt;Chih-Jen Lin (2012). [http://www.csie.ntu.edu.tw/~cjlin/talks/mlss_kyoto.pdf Machine learning software: design and practical use]. Talk at Machine Learning Summer School, Kyoto.&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==References==&lt;br /&gt;
&amp;lt;references/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[Category:Kernel methods for machine learning]]&lt;/div&gt;</summary>
		<author><name>en&gt;David Eppstein</name></author>
	</entry>
</feed>