| The '''Fowlkes–Mallows index'''<ref>{{cite journal|last=Fowlkes|first=E. B.|coauthors=Mallows, C. L.|title=A Method for Comparing Two Hierarchical Clusterings|journal=Journal of the American Statistical Association|date=1 September 1983|volume=78|issue=383|pages=553|doi=10.2307/2288117}}</ref> is an [[Cluster_analysis#External_evaluation|external evaluation]] method used to determine the similarity between two clusterings (sets of clusters obtained from a clustering algorithm). The comparison can be between two hierarchical clusterings or between a clustering and a benchmark classification. A higher value of the Fowlkes–Mallows index indicates greater similarity between the clusters and the benchmark classification.
|
| |
|
| ==Preliminaries==
| The '''Fowlkes–Mallows index''', when used to compare the results of two clustering algorithms, is defined as<ref>{{cite journal|last=Halkidi|first=Maria|coauthors=Batistakis, Yannis, Vazirgiannis, Michalis|journal=Journal of Intelligent Information Systems|date=1 January 2001|volume=17|issue=2/3|pages=107–145|doi=10.1023/A:1012801612483}}</ref>
| |
| | |
| :<math>FM = \sqrt{ \frac {TP}{TP+FP} \cdot \frac{TP}{TP+FN} }</math>
| |
| :where <math>TP</math> is the number of [[true positive]]s, <math>FP</math> is the number of [[false positives]], and <math>FN</math> is the number of [[false negatives]].
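As a minimal sketch of the formula above, the index can be computed as the geometric mean of the two ratios once the three counts are known. The function name `fowlkes_mallows_from_counts` and the choice to return 0 when there are no true positives are assumptions made for this illustration, not part of the cited definition.

```python
import math


def fowlkes_mallows_from_counts(tp: int, fp: int, fn: int) -> float:
    """Return sqrt(TP/(TP+FP) * TP/(TP+FN)) for the given counts."""
    if tp == 0:
        # Assumed convention for this sketch: no true positives gives 0.0,
        # which also avoids dividing by zero when all counts are zero.
        return 0.0
    precision = tp / (tp + fp)  # TP / (TP + FP)
    recall = tp / (tp + fn)     # TP / (TP + FN)
    return math.sqrt(precision * recall)


# Hypothetical counts, chosen only to illustrate the call.
example_fm = fowlkes_mallows_from_counts(tp=6, fp=2, fn=2)
print(example_fm)
```

With these hypothetical counts both ratios are 6/8, so the result is 0.75.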
| |
| | |
| ==Definition==
| |
| Consider two hierarchical clusterings of <math>n</math> objects, labeled <math>A_1</math> and <math>A_2</math>. The trees <math>A_1</math> and <math>A_2</math> can each be cut to produce <math>k=2,\ldots,n-1</math> clusters (by either selecting clusters at a particular height of the tree or setting a different strength of the hierarchical clustering). For each value of <math>k</math>, the following table can then be created
| |
| | |
| :<math>M=[m_{i,j}] \qquad (i=1,\ldots,k \text{ and } j=1,\ldots,k) </math>
| |
| | |
| where <math>m_{i,j}</math> is the number of objects common to the <math>i</math>th cluster of <math>A_1</math> and the <math>j</math>th cluster of <math>A_2</math>. The '''Fowlkes–Mallows index''' for the specific value of <math>k</math> is then defined as
| |
| | |
| : <math>B_k=\frac{T_k}{\sqrt{P_kQ_k}}</math>
| |
| where
| |
| :<math>T_k=\sum_{i=1}^{k}\sum_{j=1}^{k}m_{i,j}^2-n</math>
| |
| :<math>P_k=\sum_{i=1}^{k}(\sum_{j=1}^{k}m_{i,j})^2-n</math>
| |
| :<math>Q_k=\sum_{j=1}^{k}(\sum_{i=1}^{k}m_{i,j})^2-n</math>
| |
| | |
| <math>B_k</math> can then be calculated for every value of <math>k</math> and the similarity between the two clusterings can be shown by plotting <math>B_k</math> versus <math>k</math>. For each <math>k</math> we have <math>0 \le B_k \le 1</math>.
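The matrix-based definition above can be sketched for a single value of <math>k</math> as follows. The sketch assumes each clustering is given as a flat list of cluster labels, one per object, as would result from cutting a tree. The helper names `contingency_table` and `fowlkes_mallows_bk`, the example labels, and the zero-denominator convention are all hypothetical choices for illustration.

```python
import math
from collections import Counter


def contingency_table(labels_a, labels_b):
    """Map each (cluster of A1, cluster of A2) pair to its count m[i, j]."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same n objects")
    return Counter(zip(labels_a, labels_b))


def fowlkes_mallows_bk(labels_a, labels_b):
    """Compute B_k = T_k / sqrt(P_k * Q_k) from two label lists."""
    n = len(labels_a)
    m = contingency_table(labels_a, labels_b)
    row_sums = Counter(labels_a)  # sum over j of m[i, j]
    col_sums = Counter(labels_b)  # sum over i of m[i, j]
    t_k = sum(v * v for v in m.values()) - n
    p_k = sum(v * v for v in row_sums.values()) - n
    q_k = sum(v * v for v in col_sums.values()) - n
    if p_k == 0 or q_k == 0:
        # Assumed convention: a clustering made only of singletons gives 0.0.
        return 0.0
    return t_k / math.sqrt(p_k * q_k)


# Hypothetical labels for n = 6 objects and k = 2 clusters per tree.
a1 = [0, 0, 0, 1, 1, 1]
a2 = [0, 0, 1, 1, 1, 1]
b_k = fowlkes_mallows_bk(a1, a2)
print(b_k)
```

For these labels, <math>T_k=8</math>, <math>P_k=12</math> and <math>Q_k=14</math>, so <math>B_k = 8/\sqrt{168} \approx 0.617</math>. Repeating the call for each cut of the trees would give the points for a plot of <math>B_k</math> against <math>k</math>.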
| |
| | |
| The '''Fowlkes–Mallows index''' can also be defined in terms of the number of pairs of points that are grouped together or apart in the two hierarchical clusterings. If we define
| |
| | |
| :<math>TP</math> as the number of pairs of points that are in the same cluster in both <math>A_1</math> and <math>A_2</math>.
| |
| :<math>FP</math> as the number of pairs of points that are in the same cluster in <math>A_1</math> but not in <math>A_2</math>.
| |
| :<math>FN</math> as the number of pairs of points that are in the same cluster in <math>A_2</math> but not in <math>A_1</math>.
| |
| :<math>TN</math> as the number of pairs of points that are in different clusters in both <math>A_1</math> and <math>A_2</math>.
| |
| | |
| It can be shown that the four counts have the following property
| |
| :<math>TP+FP+FN+TN=n(n-1)/2</math>
| |
| | |
| and that the '''Fowlkes–Mallows index''' for two clusterings can be defined as<ref>{{cite journal|last=MEILA|first=M|title=Comparing clusterings—an information based distance|journal=Journal of Multivariate Analysis|date=1 May 2007|volume=98|issue=5|pages=873–895|doi=10.1016/j.jmva.2006.11.013}}</ref>
| |
| :<math>FM = \sqrt{ \frac {TP}{TP+FP} \cdot \frac{TP}{TP+FN} }</math>
| |
| :where <math>TP</math> is the number of [[true positive]]s, <math>FP</math> is the number of [[false positives]], and <math>FN</math> is the number of [[false negatives]].
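The pair-based definition can be illustrated by enumerating all <math>n(n-1)/2</math> pairs of objects directly. This is a sketch under the assumption that both clusterings are given as flat label lists. The names `pair_counts` and `fowlkes_mallows_from_pairs` and the example labels are hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
import math
from itertools import combinations


def pair_counts(labels_a, labels_b):
    """Count TP, FP, FN, TN over all unordered pairs of objects."""
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]  # pair together in A1
        same_b = labels_b[i] == labels_b[j]  # pair together in A2
        if same_a and same_b:
            tp += 1
        elif same_a:
            fp += 1  # together in A1 but not in A2
        elif same_b:
            fn += 1  # together in A2 but not in A1
        else:
            tn += 1
    return tp, fp, fn, tn


def fowlkes_mallows_from_pairs(labels_a, labels_b):
    tp, fp, fn, _ = pair_counts(labels_a, labels_b)
    if tp == 0:
        return 0.0  # assumed convention when no pair is grouped in both
    return math.sqrt((tp / (tp + fp)) * (tp / (tp + fn)))


# Hypothetical labels for n = 6 objects.
a1 = [0, 0, 0, 1, 1, 1]
a2 = [0, 0, 1, 1, 1, 1]
tp, fp, fn, tn = pair_counts(a1, a2)
n = len(a1)
print(tp, fp, fn, tn, tp + fp + fn + tn == n * (n - 1) // 2)
print(fowlkes_mallows_from_pairs(a1, a2))
```

For these labels the counts are <math>TP=4</math>, <math>FP=2</math>, <math>FN=3</math> and <math>TN=6</math>, which sum to <math>15 = 6 \cdot 5/2</math>. The two definitions agree because <math>T_k = 2\,TP</math>, <math>P_k = 2(TP+FP)</math> and <math>Q_k = 2(TP+FN)</math>, so the factors of 2 cancel in <math>B_k</math>.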
| |
| | |
| ==Discussion==
| |
| Because the index increases with the number of true positives, a higher index means greater similarity between the two clusterings being compared. A basic test of the validity of this index is to compare two clusterings that are unrelated to each other. Fowlkes and Mallows showed that for two unrelated clusterings, the value of this index approaches zero as the total number of data points chosen for clustering increases, whereas the value of the [[Rand index]] for the same data quickly approaches <math>1</math>,<ref>{{cite journal|last=Fowlkes|first=E. B.|coauthors=Mallows, C. L.|title=A Method for Comparing Two Hierarchical Clusterings|journal=Journal of the American Statistical Association|date=1 September 1983|volume=78|issue=383|pages=553|doi=10.2307/2288117}}</ref> making the Fowlkes–Mallows index a much more accurate representation for unrelated data. The index also performs well when noise is added to an existing dataset and the clusterings of the original and noisy data are compared: Fowlkes and Mallows showed that the value of the index decreases as the noise component increases. The index also indicated similarity even when the noisy dataset had a different number of clusters than the original dataset, which makes it a reliable tool for measuring the similarity between two clusterings.
| |
| | |
| == References ==
| |
| {{reflist}}
| |
| | |
| ==Further reading==
| |
| *{{cite doi|10.1109/WI-IAT.2010.148}}
| |
| | |
| {{DEFAULTSORT:Fowlkes-Mallows index}}
| |
| [[Category:Cluster analysis]]
| |
| [[Category:Clustering criteria]]
| |