<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://en.formulasearchengine.com/index.php?action=history&amp;feed=atom&amp;title=Triple_modular_redundancy</id>
	<title>Triple modular redundancy - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://en.formulasearchengine.com/index.php?action=history&amp;feed=atom&amp;title=Triple_modular_redundancy"/>
	<link rel="alternate" type="text/html" href="https://en.formulasearchengine.com/index.php?title=Triple_modular_redundancy&amp;action=history"/>
	<updated>2026-06-02T03:42:32Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.43.0-wmf.28</generator>
	<entry>
		<id>https://en.formulasearchengine.com/index.php?title=Triple_modular_redundancy&amp;diff=16061&amp;oldid=prev</id>
		<title>en&gt;DavidCary: mention data scrubbing</title>
		<link rel="alternate" type="text/html" href="https://en.formulasearchengine.com/index.php?title=Triple_modular_redundancy&amp;diff=16061&amp;oldid=prev"/>
		<updated>2013-07-11T05:05:03Z</updated>

		<summary type="html">&lt;p&gt;mention data scrubbing&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;&amp;#039;&amp;#039;&amp;#039;Cosine similarity&amp;#039;&amp;#039;&amp;#039; is a measure of similarity between two vectors of an [[inner product space]] that measures the [[cosine]] of the angle between them. The cosine of 0° is 1, and it is less than 1 for any other angle. It is thus a judgement of orientation and not magnitude: two vectors with the same orientation have a Cosine similarity of 1, two vectors at 90° have a similarity of 0, and two vectors diametrically opposed have a similarity of -1, independent of their magnitude. Cosine similarity is particularly used in positive space, where the outcome is neatly bounded in [0,1].&lt;br /&gt;
&lt;br /&gt;
Note that these bounds apply for any number of dimensions, and Cosine similarity is most commonly used in high-dimensional positive spaces. For example, in [[Information Retrieval]] and [[text mining]], each term is notionally assigned a different dimension and a document is characterised by a vector where the value of each dimension corresponds to the number of times that term appears in the document. Cosine similarity then gives a useful measure of how similar two documents are likely to be in terms of their subject matter.&amp;lt;ref&amp;gt;Singhal, Amit (2001). &amp;quot;Modern Information Retrieval: A Brief Overview&amp;quot;. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering 24 (4): 35–43.&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The technique is also used to measure cohesion within [[cluster]]s in the field of [[data mining]].&amp;lt;ref&amp;gt;P.-N. Tan, M. Steinbach &amp;amp; V. Kumar, &amp;quot;Introduction to Data Mining&amp;quot;, Addison-Wesley (2005), ISBN 0-321-32136-7, chapter 8; page 500.&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Cosine distance&amp;#039;&amp;#039; is a term often used for the complement in positive space, that is: &amp;lt;math&amp;gt;D_C(A,B) = 1 - S_C(A,B)&amp;lt;/math&amp;gt;. This is not a proper [[distance metric]], however: it does not satisfy the triangle inequality, and it violates the coincidence axiom because two distinct vectors with the same orientation have distance 0. To repair the triangle inequality whilst maintaining the same ordering, it is necessary to convert to angular distance (see below).&lt;br /&gt;
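&lt;br /&gt;
The failure of the triangle inequality can be illustrated with three vectors in the plane. The following is a minimal sketch, not a reference implementation; the function name cosine_distance and the sample vectors are chosen here purely for illustration. For these vectors the direct distance is 1, while the detour through the middle vector is roughly 0.59, which a proper metric would not allow.&lt;br /&gt;

```python
import math


def cosine_distance(a, b):
    """Complement of cosine similarity: 1 - cos(angle between a and b)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Illustrative vectors: b lies halfway (45 degrees) between a and c.
a, b, c = [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]
direct = cosine_distance(a, c)                           # a to c in one step
detour = cosine_distance(a, b) + cosine_distance(b, c)   # a to c via b
# The triangle inequality would require direct to be no larger than detour.
print(direct, detour)
```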
&lt;br /&gt;
One of the reasons for the popularity of Cosine similarity is that it is very efficient to evaluate, especially for sparse vectors, as only the non-zero dimensions need to be considered.&lt;br /&gt;
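&lt;br /&gt;
As a hedged illustration of the sparse case, the sketch below stores each vector as a Python dictionary that maps a dimension (here a term) to its non-zero value, so that absent dimensions are never visited. The function name sparse_cosine_similarity and the sample term counts are assumptions made for this example only.&lt;br /&gt;

```python
import math


def sparse_cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as {dimension: value} dicts."""
    # Iterate over the smaller dict; a dimension missing from either vector
    # contributes nothing to the dot product.
    if len(a) != min(len(a), len(b)):
        a, b = b, a
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical term counts for two short documents.
doc1 = {"cosine": 2, "angle": 1}
doc2 = {"cosine": 1, "vector": 3}
print(sparse_cosine_similarity(doc1, doc2))
```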
&lt;br /&gt;
==Definition==&lt;br /&gt;
&lt;br /&gt;
The cosine of two vectors can be derived by using the [[Euclidean vector#Dot product|Euclidean dot product]] formula:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\mathbf{a}\cdot\mathbf{b}&lt;br /&gt;
=\left\|\mathbf{a}\right\|\left\|\mathbf{b}\right\|\cos\theta&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Given two [[Vector (geometric)|vectors]] of attributes, &amp;#039;&amp;#039;A&amp;#039;&amp;#039; and &amp;#039;&amp;#039;B&amp;#039;&amp;#039;, the cosine similarity, &amp;#039;&amp;#039;cos(θ)&amp;#039;&amp;#039;, is represented using a [[dot product]] and [[Magnitude (mathematics)#Euclidean vectors|magnitude]] as&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\text{similarity} = \cos(\theta) = {A \cdot B \over \|A\| \|B\|} = \frac{ \sum\limits_{i=1}^{n}{A_i \times B_i} }{ \sqrt{\sum\limits_{i=1}^{n}{(A_i)^2}} \times \sqrt{\sum\limits_{i=1}^{n}{(B_i)^2}} }&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The resulting similarity ranges from &amp;minus;1, meaning exactly opposite, to 1, meaning exactly the same orientation, with 0 indicating orthogonality (decorrelation), and in-between values indicating intermediate similarity or dissimilarity.&lt;br /&gt;
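&lt;br /&gt;
The formula above translates directly into code. The following is a minimal sketch in plain Python, assuming dense vectors given as equal-length lists of numbers; the function name cosine_similarity is illustrative and does not refer to any particular library.&lt;br /&gt;

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1, 2, 3], [2, 4, 6]))   # same orientation
print(cosine_similarity([1, 0], [0, 1]))         # orthogonal
print(cosine_similarity([1, 2], [-1, -2]))       # exactly opposite
```

Up to floating-point rounding, the three calls give the boundary values 1, 0 and &amp;minus;1 described above, regardless of the magnitudes of the vectors.&lt;br /&gt;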
&lt;br /&gt;
For text matching, the attribute vectors &amp;#039;&amp;#039;A&amp;#039;&amp;#039; and &amp;#039;&amp;#039;B&amp;#039;&amp;#039; are usually the [[tf-idf|term frequency]] vectors of the documents.  The cosine similarity can be seen as a method of normalizing document length during comparison.&lt;br /&gt;
&lt;br /&gt;
In the case of [[information retrieval]], the cosine similarity of two documents will range from 0 to 1, since the term frequencies ([[tf-idf]] weights) cannot be negative. The angle between two term frequency vectors cannot be greater than&amp;amp;nbsp;90°.&lt;br /&gt;
&lt;br /&gt;
=== Angular similarity ===&lt;br /&gt;
&lt;br /&gt;
The term &amp;quot;cosine similarity&amp;quot; has also been used on occasion to express a different coefficient, although the most common use is as defined above. Using the same calculation of similarity, the normalised angle between the vectors can be used as a bounded similarity function within [0,1], calculated from the above definition of similarity by:&lt;br /&gt;
:&amp;lt;math&amp;gt;1 - \frac{ \cos^{-1}( \text{similarity} )}{ \pi} &amp;lt;/math&amp;gt;&lt;br /&gt;
in a domain where vector coefficients may be positive or negative, or&lt;br /&gt;
:&amp;lt;math&amp;gt;1 - \frac{ 2 \cdot \cos^{-1}( \text{similarity} ) }{ \pi }&amp;lt;/math&amp;gt;&lt;br /&gt;
in a domain where the vector coefficients are always positive.  &lt;br /&gt;
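&lt;br /&gt;
Both variants can be sketched as a single function that takes an already computed cosine similarity. This is an illustrative sketch only; the name angular_similarity and the positive_only flag are assumptions introduced here, and the clamping step is a defensive choice against rounding error rather than part of the definition.&lt;br /&gt;

```python
import math


def angular_similarity(similarity, positive_only=False):
    """Map a cosine similarity to a normalised-angle similarity in [0, 1]."""
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    clamped = max(-1.0, min(1.0, similarity))
    angle = math.acos(clamped)
    # With non-negative coefficients the angle is at most pi/2,
    # so the angle is scaled by 2 before normalising by pi.
    scale = 2.0 if positive_only else 1.0
    return 1.0 - scale * angle / math.pi


print(angular_similarity(1.0))                        # identical orientation
print(angular_similarity(-1.0))                       # opposite orientation
print(angular_similarity(0.0, positive_only=True))    # orthogonal, positive space
```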
&lt;br /&gt;
Although the term &amp;quot;cosine similarity&amp;quot; has been used for this angular similarity, the usage is odd: the cosine of the angle serves only as a convenient mechanism for calculating the angle itself and is no part of the meaning. Note also that subtracting this coefficient from 1 does not give a proper [[distance metric]]: two vectors with an angle of 0 between them but different L2-norms are distinct, yet their distance would be 0, whereas a proper metric assigns distance 0 only to identical vectors.&lt;br /&gt;
For most uses, however, this is not an important property. Where only the relative ordering of similarity or distance within a set of vectors matters, the choice of function is immaterial, as the resulting order is unaffected.&lt;br /&gt;
&lt;br /&gt;
=== Confusion with &amp;quot;Tanimoto&amp;quot; coefficient ===&lt;br /&gt;
&lt;br /&gt;
On occasion, cosine similarity has been mistaken{{citation needed|date=August 2013}} for a specialised form of a similarity coefficient with a similar algebraic form:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;T(A,B) = {A \cdot B \over \|A\|^2 +\|B\|^2 - A \cdot B}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In fact, this algebraic form [[Jaccard index#Tanimoto_Similarity_and_Distance|was first defined by Tanimoto]] as a mechanism for calculating the [[Jaccard coefficient]] in the case where the sets being compared are represented as [[bit vector]]s. While the formula extends to vectors in general, it has quite different properties from cosine similarity and bears little relation other than its superficial appearance.&lt;br /&gt;
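&lt;br /&gt;
One difference between the two coefficients is their sensitivity to magnitude. The sketch below, with illustrative function names and sample vectors that are not taken from any source, compares both on two vectors that share an orientation but differ in length: cosine similarity gives 1, whereas the Tanimoto form gives 2/3.&lt;br /&gt;

```python
import math


def cosine_similarity(a, b):
    """A.B / (|A| |B|)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def tanimoto(a, b):
    """T(A, B) = A.B / (|A|^2 + |B|^2 - A.B)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sum(x * x for x in a) + sum(y * y for y in b) - dot)


a, b = [1.0, 1.0], [2.0, 2.0]   # same orientation, different magnitude
print(cosine_similarity(a, b), tanimoto(a, b))
```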
&lt;br /&gt;
=== Ochiai coefficient ===&lt;br /&gt;
In biology, this coefficient is also known as the Ochiai coefficient, the Ochiai-Barkman coefficient, or the Otsuka-Ochiai coefficient:&amp;lt;ref&amp;gt;&amp;#039;&amp;#039;Ochiai A.&amp;#039;&amp;#039; Zoogeographical studies on the soleoid fishes found Japan and its neighboring regions. II // Bull. Jap. Soc. sci. Fish. 1957. V. 22. № 9. P. 526-530.&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;&amp;#039;&amp;#039;Barkman J.J.&amp;#039;&amp;#039; Phytosociology and ecology of cryptogamic epiphytes, including a taxonomic survey and description of their vegetation units in Europe. – Assen. Van Gorcum. 1958. 628 p.&amp;lt;/ref&amp;gt;&lt;br /&gt;
:&amp;lt;math&amp;gt;K =\frac{n(A \cap B)}{\sqrt{n(A) \times n(B)}}&amp;lt;/math&amp;gt;&lt;br /&gt;
Here, &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; are sets, and &amp;lt;math&amp;gt;n(A)&amp;lt;/math&amp;gt; is the number of elements in &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;. If sets are represented as [[bit vector]]s, the Ochiai coefficient can be seen to be the same as the cosine similarity.&lt;br /&gt;
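&lt;br /&gt;
The equivalence with cosine similarity on bit vectors can be checked numerically. The following sketch uses two small, made-up sets over a five-element universe; the names ochiai, universe, bits_a and bits_b are illustrative assumptions.&lt;br /&gt;

```python
import math


def ochiai(a, b):
    """Ochiai coefficient of two non-empty sets."""
    return len(a.intersection(b)) / math.sqrt(len(a) * len(b))


def cosine_similarity(u, v):
    """A.B / (|A| |B|) for equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))


universe = ["a", "b", "c", "d", "e"]
A = {"a", "b", "c"}
B = {"b", "c", "d", "e"}
# Represent each set as a bit vector over the universe.
bits_a = [1 if item in A else 0 for item in universe]
bits_b = [1 if item in B else 0 for item in universe]
print(ochiai(A, B), cosine_similarity(bits_a, bits_b))
```

For bit vectors the dot product counts the shared elements and each squared norm counts the elements of one set, which is why the two values agree.&lt;br /&gt;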
&lt;br /&gt;
== Properties ==&lt;br /&gt;
Cosine similarity is related to [[Euclidean distance]] as follows. Denote Euclidean distance by the usual |&amp;#039;&amp;#039;A&amp;#039;&amp;#039; - &amp;#039;&amp;#039;B&amp;#039;&amp;#039;|, and observe that&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;|A - B|^2 = (A - B)^\top (A - B) = |A|^2 + |B|^2 - 2 A^\top B&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
by [[Polynomial expansion|expansion]]. When &amp;#039;&amp;#039;A&amp;#039;&amp;#039; and &amp;#039;&amp;#039;B&amp;#039;&amp;#039; are normalized to unit length, |&amp;#039;&amp;#039;A&amp;#039;&amp;#039;| = |&amp;#039;&amp;#039;B&amp;#039;&amp;#039;| = 1 and &amp;lt;math&amp;gt;A^\top B = \cos(A, B)&amp;lt;/math&amp;gt;, so the previous expression is equal to&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;2 (1 - \cos(A, B))&amp;lt;/math&amp;gt;&lt;br /&gt;
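&lt;br /&gt;
For unit-length vectors, the squared Euclidean distance therefore equals 2(1 &amp;minus; cos(&amp;#039;&amp;#039;A&amp;#039;&amp;#039;, &amp;#039;&amp;#039;B&amp;#039;&amp;#039;)). A small numerical check of this identity is sketched below; the helper names and the sample vectors are assumptions made for illustration.&lt;br /&gt;

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_distance(u, v):
    """Squared Euclidean distance |u - v|^2."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


a = normalize([3.0, 4.0])
b = normalize([1.0, 0.0])
lhs = squared_distance(a, b)
rhs = 2 * (1 - dot(a, b))   # for unit vectors the dot product is the cosine
print(lhs, rhs)
```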
&lt;br /&gt;
== See also ==&lt;br /&gt;
* [[Sørensen similarity index|Sørensen&amp;#039;s quotient of similarity]]&lt;br /&gt;
* [[Hamming distance]]&lt;br /&gt;
* [[Correlation]]&lt;br /&gt;
* [[Dice&amp;#039;s coefficient]]&lt;br /&gt;
* [[Jaccard index]]&lt;br /&gt;
* [[SimRank]]&lt;br /&gt;
* [[Information retrieval]]&lt;br /&gt;
&lt;br /&gt;
==References==&lt;br /&gt;
{{reflist}}&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
* [http://www.appliedsoftwaredesign.com/archives/cosine-similarity-calculator/ Online Cosine Similarity Calculator]&lt;br /&gt;
* [http://mathforum.org/kb/message.jspa?messageID=5658016&amp;amp;tstart=0 Weighted cosine measure]&lt;br /&gt;
* [http://pyevolve.sourceforge.net/wordpress/?p=2497 A tutorial on cosine similarity using Python]&lt;br /&gt;
&lt;br /&gt;
{{DEFAULTSORT:Cosine Similarity}}&lt;br /&gt;
[[Category:Information retrieval]]&lt;/div&gt;</summary>
		<author><name>en&gt;DavidCary</name></author>
	</entry>
</feed>