Cluster chemistry: Difference between revisions

en>Rjwilmsi
m fix page format using AWB (9780)
'''Biclustering''', '''co-clustering''', or '''two-[[mode (statistics)|mode]] clustering'''<ref>
{{cite journal
| author = Van Mechelen I, Bock HH, De Boeck P
| year = 2004
| title = Two-mode clustering methods: a structured overview
| journal = Statistical Methods in Medical Research
| volume = 13
| issue = 5
| pages = 363–94
| doi = 10.1191/0962280204sm373ra
| pmid = 15516031
}}
</ref> is a [[data mining]] technique which allows simultaneous [[cluster analysis|clustering]] of the rows and columns of a [[matrix (mathematics)|matrix]].
The term was first introduced by Mirkin,<ref name="mirkin">
{{cite book
  | last = Mirkin
  | first = Boris
  | title = Mathematical Classification and Clustering
  | publisher = Kluwer Academic Publishers
  | year = 1996
  | isbn = 0-7923-4159-7 }}
</ref> although the technique was originally introduced much earlier<ref name="mirkin"/> (i.e., by J.A. Hartigan<ref>
{{cite journal
| author = Hartigan JA
| year = 1972
| month =
| title = Direct clustering of a data matrix
| journal = Journal of the American Statistical Association
| volume = 67
| issue = 337
| pages = 123–9
| doi = 10.2307/2284710
| publisher = American Statistical Association
| jstor = 2284710
}}
</ref>).


Given a set of <math>m</math> rows and <math>n</math> columns (i.e., an <math>m \times n</math> matrix), a biclustering algorithm generates biclusters: subsets of rows that exhibit similar behavior across a subset of columns, or vice versa.


== Complexity ==


The complexity of the biclustering problem depends on the exact problem formulation, and particularly on the merit function used to evaluate the quality of a given bicluster. However, most interesting variants of this problem are [[NP-complete]], requiring either large [[computer|computational]] effort or the use of lossy [[heuristics]] to short-circuit the calculation.<ref name=madeira-oliveira />
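
As a rough illustration of why exhaustive search is impractical, the sketch below counts the candidate biclusters of an <math>m \times n</math> matrix, assuming that any non-empty subset of rows combined with any non-empty subset of columns is a candidate. The function name is illustrative. This only shows the size of the search space and is not an argument about NP-completeness itself.

```python
def candidate_bicluster_count(m, n):
    """Number of (row subset, column subset) pairs with both subsets non-empty."""
    return (2 ** m - 1) * (2 ** n - 1)


# Even a small 10 x 10 matrix has over a million candidate submatrices.
small = candidate_bicluster_count(10, 10)
print(small)
```

The count doubles with every added row or column, which is one reason practical algorithms rely on heuristics or on restricted problem formulations.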
== Type of Bicluster ==
Different biclustering algorithms have different definitions of bicluster.<ref name="madeira-oliveira">
{{cite journal
| author = Madeira SC, Oliveira AL
| year = 2004
| title = Biclustering Algorithms for Biological Data Analysis: A Survey
| journal = IEEE Transactions on Computational Biology and Bioinformatics
| volume = 1
| issue = 1
| pages = 24–45
| doi = 10.1109/TCBB.2004.2
| pmid = 17048406
}}
</ref>
 
They are:
 
#Bicluster with constant values (a),
#Bicluster with constant values on rows (b) or columns (c),
#Bicluster with coherent values (d, e).
 
{| border="0" cellspacing="20"
|
{| | border="1px solid black" cellpadding="5" cellspacing="0"
|+a) Bicluster with constant values
|-
| 2.0 || 2.0 || 2.0 || 2.0 || 2.0
|-
| 2.0 || 2.0 || 2.0 || 2.0 || 2.0
|-
| 2.0 || 2.0 || 2.0 || 2.0 || 2.0
|-
| 2.0 || 2.0 || 2.0 || 2.0 || 2.0
|-
| 2.0 || 2.0 || 2.0 || 2.0 || 2.0
|}
|
{| | border="1px solid black" cellpadding="5" cellspacing="0"
|+b) Bicluster with constant values on rows
|-
| 1.0 || 1.0 || 1.0 || 1.0 || 1.0
|-
| 2.0 || 2.0 || 2.0 || 2.0 || 2.0
|-
| 3.0 || 3.0 || 3.0 || 3.0 || 3.0
|-
| 4.0 || 4.0 || 4.0 || 4.0 || 4.0
|-
| 4.0 || 4.0 || 4.0 || 4.0 || 4.0
|}
|
{| | border="1px solid black" cellpadding="5" cellspacing="0"
|+c) Bicluster with constant values on columns
|-
| 1.0 || 2.0 || 3.0 || 4.0 || 5.0
|-
| 1.0 || 2.0 || 3.0 || 4.0 || 5.0
|-
| 1.0 || 2.0 || 3.0 || 4.0 || 5.0
|-
| 1.0 || 2.0 || 3.0 || 4.0 || 5.0
|-
| 1.0 || 2.0 || 3.0 || 4.0 || 5.0
|}
|}
 
{| border="0" cellspacing="20"
|
{| | border="1px solid black" cellpadding="5" cellspacing="0"
|+d) Bicluster with coherent values (additive)
|-
| 1.0 || 4.0 || 5.0 || 0.0 || 1.5
|-
| 4.0 || 7.0 || 8.0 || 3.0 || 4.5
|-
| 3.0 || 6.0 || 7.0 || 2.0 || 3.5
|-
| 5.0 || 8.0 || 9.0 || 4.0 || 5.5
|-
| 2.0 || 5.0 || 6.0 || 1.0 || 2.5
|}
|
{| | border="1px solid black" cellpadding="5" cellspacing="0"
|+e) Bicluster with coherent values (multiplicative)
|-
| 1.0 || 0.5 || 2.0 || 0.2 || 0.8
|-
| 2.0 || 1.0 || 4.0 || 0.4 || 1.6
|-
| 3.0 || 1.5 || 6.0 || 0.6 || 2.4
|-
| 4.0 || 2.0 || 8.0 || 0.8 || 3.2
|-
| 5.0 || 2.5 || 10.0 || 1.0 || 4.0
|}
|}
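
The patterns in tables (a) to (e) can be checked numerically. The following is a minimal sketch, assuming NumPy and noise-free values up to a small tolerance; the function names are illustrative and do not come from any biclustering library. It treats an additive bicluster as one whose rows differ only by a constant shift, and a multiplicative bicluster as one whose rows differ only by a constant factor (tested here as a rank-one condition, which assumes a non-zero top-left entry).

```python
import numpy as np


def is_constant(b, tol=1e-9):
    """(a) Every entry has the same value."""
    b = np.asarray(b, dtype=float)
    return bool(np.all(np.abs(b - b[0, 0]) <= tol))


def is_constant_rows(b, tol=1e-9):
    """(b) Each row is constant; different rows may hold different values."""
    b = np.asarray(b, dtype=float)
    return bool(np.all(np.abs(b - b[:, :1]) <= tol))


def is_constant_columns(b, tol=1e-9):
    """(c) Each column is constant; different columns may differ."""
    b = np.asarray(b, dtype=float)
    return bool(np.all(np.abs(b - b[:1, :]) <= tol))


def is_additive(b, tol=1e-9):
    """(d) Subtracting the first row leaves rows that are each constant."""
    b = np.asarray(b, dtype=float)
    return is_constant_rows(b - b[:1, :], tol)


def is_multiplicative(b, tol=1e-9):
    """(e) Rank-one test without division: a_ij * a_00 == a_i0 * a_0j.

    Assumes a_00 is non-zero; otherwise the test is degenerate.
    """
    b = np.asarray(b, dtype=float)
    return bool(np.all(np.abs(b * b[0, 0] - np.outer(b[:, 0], b[0, :])) <= tol))


# Tables (d) and (e) from above.
d = np.array([[1.0, 4.0, 5.0, 0.0, 1.5],
              [4.0, 7.0, 8.0, 3.0, 4.5],
              [3.0, 6.0, 7.0, 2.0, 3.5],
              [5.0, 8.0, 9.0, 4.0, 5.5],
              [2.0, 5.0, 6.0, 1.0, 2.5]])
e = np.array([[1.0, 0.5, 2.0, 0.2, 0.8],
              [2.0, 1.0, 4.0, 0.4, 1.6],
              [3.0, 1.5, 6.0, 0.6, 2.4],
              [4.0, 2.0, 8.0, 0.8, 3.2],
              [5.0, 2.5, 10.0, 1.0, 4.0]])
print(is_additive(d), is_multiplicative(e))
```

Under these definitions the constant patterns are special cases of the additive one, so a practical checker would test the most specific pattern first. Real data is noisy, so algorithms typically replace the exact tests with a merit function that measures deviation from the pattern.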
 
<!-- [[File:bicluster.JPG]] -->
 
The relationship between these cluster models and other types of clustering such as [[correlation clustering]] is discussed by Kriegel, Kröger and Zimek.<ref>{{cite journal
  | last = Kriegel
  | first = H.-P.
  | coauthors = Kröger, P., Zimek, A.
  | title = Clustering High Dimensional Data: A Survey on Subspace Clustering, Pattern-based Clustering, and Correlation Clustering
  | journal = ACM Transactions on Knowledge Discovery from Data (TKDD)
  | volume = 3
  | issue = 1
  | pages = 1–58
  | date = March 2009
  | url = http://doi.acm.org/10.1145/1497577.1497578
  | doi = 10.1145/1497577.1497578}}
</ref>
 
== Algorithms ==
 
Many biclustering [[algorithms]] have been developed for [[bioinformatics]], including: block clustering, CTWC (Coupled Two-Way Clustering), ITWC (Interrelated Two-Way Clustering), δ-bicluster, δ-pCluster, δ-pattern, FLOC, OPC, Plaid Model, OPSMs (Order-preserving submatrices), Gibbs, SAMBA (Statistical-Algorithmic Method for Bicluster Analysis),<ref>
{{cite journal
| author = Tanay A, Sharan R, Kupiec M and Shamir R
| year = 2004
| title = Revealing modularity and organization in the yeast molecular network by integrated analysis of highly heterogeneous genomewide data
| journal = Proc Natl Acad Sci USA
| volume = 101
| issue = 9
| pages = 2981–2986
| doi = 10.1073/pnas.0308661100
| pmid = 14973197
| pmc = 365731
}}</ref> Robust Biclustering Algorithm (RoBA), Crossing Minimization,<ref name=ahsan/> cMonkey,<ref>
{{cite journal
| author = Reiss DJ, Baliga NS, Bonneau R
| year = 2006
| title = Integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks
| journal = BMC Bioinformatics
| volume = 7
| pages = 280
| doi = 10.1186/1471-2105-7-280
| pmid = 16749936
| pmc = 1502140
}}</ref> PRMs, DCC, LEB (Localize and Extract Biclusters), QUBIC (QUalitative BIClustering), BCCA (Bi-Correlation Clustering Algorithm) and FABIA (Factor Analysis for Bicluster Acquisition).<ref>
{{cite journal
| author = [[Sepp Hochreiter|Hochreiter S]], Bodenhofer U, Heusel M,  Mayr A, Mitterecker A, Kasim A, Khamiakova T, Van Sanden S, Lin D, Talloen W,  Bijnens L, Gohlmann HWH, Shkedy Z, Clevert DA
| year = 2010
| title = FABIA: factor analysis for bicluster acquisition
| journal = Bioinformatics
| pmid = 20418340
| volume = 26
| issue = 12
| pmc = 2881408
| pages = 1520–1527
| doi = 10.1093/bioinformatics/btq227
}}</ref> Biclustering algorithms have also been proposed and used in other application fields under the names coclustering, bidimensional clustering, and subspace clustering.<ref name=madeira-oliveira />
 
Given the known importance of discovering local patterns in [[time-series data]], recent proposals have addressed the biclustering problem in the specific case of time series [[gene expression]] data. In this case, the interesting biclusters can be restricted to those with [[contiguous]] columns. This restriction leads to a [[tractable problem]] and enables the development of efficient exhaustive [[enumeration]] algorithms such as CCC-Biclustering <ref name="ccc-biclustering">
{{cite journal
| author = Madeira SC, Teixeira MC, Sá-Correia I, Oliveira AL
| year = 2010
| title = Identification of Regulatory Modules in Time Series Gene Expression Data using a Linear Time Biclustering Algorithm
| journal = IEEE Transactions on Computational Biology and Bioinformatics
| volume = 7
| issue = 1
| pages = 153–165
| doi = 10.1109/TCBB.2008.34
}}
</ref> and ''e''-CCC-Biclustering.<ref name="e-ccc-biclustering">
{{cite journal
| author = Madeira SC, Oliveira AL
| year = 2009
| title = A polynomial time biclustering algorithm for finding approximate expression patterns in gene expression time series
| journal = Algorithms for Molecular Biology
| volume = 4
| issue = 8
}}
</ref> These [[algorithm]]s find and report all maximal biclusters with coherent and contiguous columns: CCC-Biclustering handles perfect expression patterns in time linear in the size of the time series gene expression [[matrix (mathematics)|matrix]], and ''e''-CCC-Biclustering handles approximate patterns in [[polynomial]] time. Both use efficient [[string processing]] techniques based on [[suffix tree]]s.
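
To make the contiguous-column restriction concrete, here is a naive sketch that enumerates maximal biclusters with perfect patterns over contiguous columns. It is not CCC-Biclustering: it uses brute-force grouping instead of suffix trees and is therefore slower than the linear-time algorithm. The input is assumed to be already discretized into one symbol per time point (here the hypothetical alphabet U, D, N for up, down and no change), and the function name and parameters are illustrative.

```python
from collections import defaultdict


def contiguous_biclusters(matrix, min_rows=2, min_cols=2):
    """Return (rows, start, end, pattern) for each maximal bicluster.

    matrix: sequence of equal-length strings, one per gene (row).
    Columns start..end-1 form the contiguous window of time points.
    """
    n_cols = len(matrix[0]) if matrix else 0
    found = []
    for start in range(n_cols):
        for end in range(start + min_cols, n_cols + 1):
            # Group all rows that share the same pattern in this window,
            # so every group is already maximal with respect to rows.
            groups = defaultdict(list)
            for r, row in enumerate(matrix):
                groups[tuple(row[start:end])].append(r)
            for pattern, rows in groups.items():
                if len(rows) < min_rows:
                    continue
                # The window can grow if all rows agree on the adjacent column.
                can_left = start > 0 and len({matrix[r][start - 1] for r in rows}) == 1
                can_right = end < n_cols and len({matrix[r][end] for r in rows}) == 1
                if not (can_left or can_right):
                    found.append((rows, start, end, "".join(pattern)))
    return found


genes = ["UDDU", "UDDN", "NDDU", "UUNN"]
result = contiguous_biclusters(genes)
for rows, start, end, pattern in result:
    print(rows, (start, end), pattern)
```

Because only contiguous windows are considered, the number of candidate column sets is quadratic in the number of time points rather than exponential, which is what makes exhaustive enumeration feasible in this setting.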
 
Some recent algorithms, including cMonkey, have attempted to include additional support for biclustering rectangular matrices in the form of other [[datatype]]s.
 
There is an ongoing debate about how to judge the results of these methods, as biclustering allows overlap between clusters and some [[algorithms]] allow the exclusion of hard-to-reconcile columns/conditions. Not all of the available algorithms are deterministic and the analyst must pay attention to the degree to which results represent stable [[minima]]. Because this is an [[unsupervised classification]] problem, the lack of a [[gold standard (test)|gold standard]] makes it difficult to spot errors in the results. One approach is to utilize multiple biclustering algorithms, with majority or [[super-majority]] voting amongst them deciding the best result. Another way is to analyse the quality of shifting and scaling patterns in biclusters.<ref>
{{cite journal
| author = Aguilar-Ruiz JS
| year = 2005
| title = Shifting and scaling patterns from gene expression data
| journal = Bioinformatics
| volume = 21
| issue = 10
| pages = 3840–3845
| doi = 10.1093/bioinformatics/bti641
| pmid = 16144809
}}
</ref> Biclustering has been used in the domain of text mining (or classification), where it is popularly known as co-clustering.<ref name="chi-sim">{{cite journal
| author = Bisson G. and Hussain F.
| year = 2008
| title = Chi-Sim: A new similarity measure for the co-clustering task
| journal = ICMLA
| pages = 211–217
| doi = 10.1109/ICMLA.2008.103
 
}}
</ref> Text corpora are represented in a [[vector (mathematics and physics)|vector]]ial form as a [[matrix (mathematics)|matrix]] D whose rows denote the documents and whose columns denote the words in the dictionary. Matrix elements D<sub>ij</sub> denote the occurrence of word j in document i. [[Co-clustering]] algorithms are then applied to discover blocks in D that correspond to a group of documents (rows) characterized by a group of words (columns).
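
The document-by-word representation can be sketched as follows. This is a minimal example assuming NumPy, whitespace tokenization and raw occurrence counts as the matrix entries; the function name and the toy corpus are illustrative only.

```python
import numpy as np


def document_term_matrix(documents):
    """Build D where D[i, j] counts occurrences of word j in document i."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    index = {word: j for j, word in enumerate(vocabulary)}
    D = np.zeros((len(documents), len(vocabulary)), dtype=int)
    for i, tokens in enumerate(tokenized):
        for word in tokens:
            D[i, index[word]] += 1
    return D, vocabulary


corpus = ["gene expression data", "gene expression matrix", "stock market data"]
D, vocabulary = document_term_matrix(corpus)
print(vocabulary)
print(D)
```

In this toy corpus the first two documents share the words "gene" and "expression", which is the kind of document and word block a co-clustering algorithm would aim to recover.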
 
Several families of approaches have been proposed: information-theoretic approaches based on the information content of the resulting blocks, matrix-based approaches such as [[singular value decomposition|SVD]] and BVD, and graph-based approaches. [[Information-theoretic]] algorithms [[iterative]]ly assign each row to a cluster of documents and each column to a cluster of words such that the mutual information is maximized. Matrix-based methods focus on the decomposition of matrices into blocks such that the error between the original matrix and the matrices regenerated from the decomposition is minimized. Graph-based methods tend to minimize the cuts between the clusters. Given two groups of documents d<sub>1</sub> and d<sub>2</sub>, the number of cuts can be measured as the number of words that occur in documents of both groups d<sub>1</sub> and d<sub>2</sub>.
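
The information-theoretic objective can be illustrated by evaluating the mutual information between document clusters and word clusters for a given assignment. The sketch below only scores an assignment; it does not perform the iterative reassignment. It assumes NumPy, treats the normalized matrix D as a joint distribution over documents and words, and uses illustrative names throughout.

```python
import numpy as np


def coclustering_mutual_information(D, row_labels, col_labels):
    """Mutual information (in bits) between row clusters and column clusters."""
    D = np.asarray(D, dtype=float)
    row_labels = np.asarray(row_labels)
    col_labels = np.asarray(col_labels)
    k = int(row_labels.max()) + 1
    l = int(col_labels.max()) + 1
    R = np.eye(k)[row_labels]          # m x k one-hot row-cluster indicators
    C = np.eye(l)[col_labels]          # n x l one-hot column-cluster indicators
    joint = R.T @ D @ C                # mass of each (row cluster, column cluster) block
    joint /= joint.sum()
    p_rows = joint.sum(axis=1, keepdims=True)
    p_cols = joint.sum(axis=0, keepdims=True)
    nonzero = joint > 0                # skip empty blocks, which contribute zero
    return float(np.sum(joint[nonzero] * np.log2(joint[nonzero] / (p_rows @ p_cols)[nonzero])))


blocks = np.array([[2, 2, 0, 0],
                   [2, 2, 0, 0],
                   [0, 0, 2, 2],
                   [0, 0, 2, 2]])
good = coclustering_mutual_information(blocks, [0, 0, 1, 1], [0, 0, 1, 1])
bad = coclustering_mutual_information(blocks, [0, 1, 0, 1], [0, 1, 0, 1])
print(good, bad)
```

An assignment that matches the block structure keeps the dependence between documents and words, while an assignment that mixes the blocks loses it; an iterative algorithm would move rows and columns between clusters to increase this score.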
 
More recently, Bisson and Hussain<ref name="chi-sim"/> proposed a new approach that uses the similarity between words and the similarity between documents to [[co-clustering|co-cluster]] the matrix. Their method (known as '''χ-Sim''', for cross similarity) is based on finding document-document similarity and word-word similarity, and then using classical clustering methods such as [[hierarchical clustering]]. Instead of explicitly clustering rows and columns alternately, they consider higher-order occurrences of words, inherently taking into account the documents in which they occur. Thus, the similarity between two words is calculated based on the documents in which they occur and also the documents in which "similar" words occur. The idea is that two documents about the same topic do not necessarily use the same set of words to describe it, but rather a subset of the words and other similar words that are characteristic of that topic. Taking higher-order similarities in this way takes the [[latent semantic analysis|latent semantic]] structure of the whole corpus into consideration, with the result of generating a better clustering of the documents and words.
 
In text databases, for a document collection defined by a document-by-term matrix D (of size m by n, where m is the number of documents and n is the number of terms), the cover-coefficient-based clustering methodology<ref>
{{cite journal
| author = Can, F., Ozkarahan, E. A.
| year = 1990
| title = Concepts and effectiveness of the cover coefficient based clustering methodology for text databases
| journal = ACM Transactions on Database Systems
| volume = 15
| issue = 4
| pages = 483–517
| doi = 10.1145/99935.99938
}}
</ref> yields the same number of clusters for both documents and terms (words) using a double-stage probability experiment. According to the cover coefficient concept, the number of clusters can also be roughly estimated by the formula <math>(m \times n) / t</math>, where t is the number of non-zero entries in D. Note that each row and each column of D must contain at least one non-zero element.
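
The estimate <math>(m \times n) / t</math> is straightforward to compute. The following is a small sketch assuming NumPy; the function name is illustrative, and the check on empty rows and columns mirrors the requirement stated above.

```python
import numpy as np


def estimated_cluster_count(D):
    """Rough number of clusters, (m * n) / t, with t the count of non-zero entries."""
    D = np.asarray(D)
    m, n = D.shape
    nonzero = D != 0
    if not (nonzero.any(axis=1).all() and nonzero.any(axis=0).all()):
        raise ValueError("every row and every column needs a non-zero entry")
    t = int(nonzero.sum())
    return (m * n) / t


# 3 documents, 6 terms, 9 non-zero entries.
example = np.array([[1, 1, 1, 0, 0, 0],
                    [0, 1, 1, 0, 1, 0],
                    [1, 0, 0, 1, 0, 1]])
estimate = estimated_cluster_count(example)
print(estimate)
```

A sparser matrix (smaller t) gives a larger estimate, reflecting that documents sharing few terms tend to fall into more clusters.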
 
In contrast to other approaches, FABIA is a multiplicative model that assumes realistic [[non-Gaussianity|non-Gaussian]] signal distributions with [[heavy tails]]. FABIA uses well-understood model selection techniques such as variational approaches and applies the [[Bayesian probability|Bayesian]] framework. The generative framework allows FABIA to determine the [[information content]] of each bicluster, which helps separate spurious biclusters from true biclusters.
 
== See also ==
* [[Formal concept analysis]]
* [[Biclique]]
* [[Galois connection]]
 
== References ==
{{Reflist|refs=
<ref name=ahsan>
{{cite journal
|last1=Abdullah
|first1=Ahsan
|last2=Hussain
|first2=Amir
|title=A new biclustering technique based on crossing minimization
|journal=Neurocomputing
|year=2006
|pages=1882–1896
|url= http://linkinghub.elsevier.com/retrieve/pii/S0925231206001615
|doi=10.1016/j.neucom.2006.02.018
|volume=69
|issue=16–18
}}
</ref>
}}
 
=== Others ===
* A. Tanay, R. Sharan, and R. Shamir, "Biclustering Algorithms: A Survey", in ''Handbook of Computational Molecular Biology'', edited by Srinivas Aluru, Chapman (2004)
* {{cite journal | author = Kluger Y, Basri R, Chang JT, Gerstein MB | year = 2003 | title = Spectral Biclustering of Microarray Data: Coclustering Genes and Conditions | url = | journal = Genome Research | volume = 13 | issue = 4| pages = 703–716 | doi = 10.1101/gr.648603 | pmid = 12671006 | pmc = 430175 }}
 
==External links==
* [http://www.bioinf.jku.at/software/fabia/fabia.html FABIA: Factor Analysis for Bicluster Acquisition, an R package] &mdash;software
 
[[Category:Cluster analysis]]
[[Category:Bioinformatics]]

Revision as of 11:16, 23 February 2014
