| {{Other uses|Color code (disambiguation){{!}}Color code}}
| In [[computer science]] and [[graph theory]], the method of '''color-coding'''<ref>Alon, N., Yuster, R., and Zwick, U. 1994. Color-coding: a new method for finding simple paths, cycles and other small subgraphs within large graphs. In Proceedings of the Twenty-Sixth Annual ACM Symposium on theory of Computing (Montreal, Quebec, Canada, May 23–25, 1994). STOC '94. ACM, New York, NY, 326–335. DOI= http://doi.acm.org/10.1145/195058.195179</ref><ref name="orig">Alon, N., Yuster, R., and Zwick, U. 1995. Color-coding. J. ACM 42, 4 (Jul. 1995), 844–856. DOI= http://doi.acm.org/10.1145/210332.210337</ref> efficiently finds ''k''-vertex [[Path (graph theory)|simple paths]], ''k''-vertex [[Cycle (graph theory)|cycles]], and other small [[Glossary of graph theory#Subgraphs|subgraphs]] within a given [[graph theory|graph]] using [[probabilistic algorithms]], which can then be [[Derandomization#Derandomization|derandomized]] and turned into [[deterministic algorithm]]s. This method shows that many subcases of the [[subgraph isomorphism|subgraph isomorphism problem]] (an [[NP-complete]] problem) can in fact be solved in [[polynomial time]].
| |
| | |
| The theory and analysis of the color-coding method were proposed in 1994 by [[Noga Alon]], [[Raphael Yuster]], and [[Uri Zwick]].
| |
| | |
| ==Results==
| |
| | |
| The following results can be obtained through the method of color-coding:
| |
| | |
| * For every fixed constant <math>k</math>, if a graph <math>G = (V, E)</math> contains a simple cycle of size <math>k</math>, then such a cycle can be found in:
| |
| ** O(<math>V^\omega</math>) expected time, or
| |
| ** O(<math>V^\omega \log V</math>) worst-case time, where <math>\omega</math> is the exponent of [[matrix multiplication]].<ref>[[Coppersmith–Winograd algorithm|Coppersmith–Winograd Algorithm]]</ref>
| |
| | |
| * For every fixed constant <math>k</math>, and every graph <math>G = (V, E)</math> that is in any nontrivial [[Minor_(graph_theory)#Minor-closed graph families|minor-closed graph family]] (e.g., a [[planar graph]]), if <math>G</math> contains a simple cycle of size <math>k</math>, then such a cycle can be found in:
| |
| ** O(<math>V</math>) expected time, or
| |
| ** O(<math>V \log V</math>) worst-case time.
| |
| | |
| * If a graph <math>G = (V, E)</math> contains a subgraph isomorphic to a bounded [[treewidth]] graph which has <math>O(\log V)</math> vertices, then such a subgraph can be found in [[polynomial time]].
| |
| | |
| ==The method==
| |
| | |
| To solve the problem of finding a subgraph <math>H = (V_H, E_H)</math> in a given graph <math>G = (V, E)</math>, where <math>H</math> can be a path, a cycle, or any bounded [[treewidth]] graph where <math>|V_H| = O(\log V)</math>, the method of color-coding begins by randomly coloring each vertex of <math>G</math> with one of <math>k = |V_H|</math> colors, and then tries to find a colorful copy of <math>H</math> in the colored <math>G</math>. Here, a graph is colorful if every vertex in it is colored with a distinct color. The method repeats two steps, (1) randomly coloring the graph and (2) searching for a colorful copy of the target subgraph, and the target subgraph, if it exists, is found with high probability once the process has been repeated a sufficient number of times.
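The two repeated steps can be sketched in Python for the special case where <math>H</math> is a path on <math>k</math> vertices. This is a minimal illustration, not the matrix-based procedure of the original paper: it assumes the graph is given as a dictionary <code>adj</code> mapping each vertex to a set of neighbours, and it finds colorful paths by dynamic programming over sets of colors. The names <code>has_colorful_path</code> and <code>color_coding_has_path</code> are hypothetical.

```python
import random


def has_colorful_path(adj, coloring, k):
    """Return True if some path on k vertices uses k distinct colors.

    adj: dict mapping each vertex to a set of neighbours (assumed format).
    coloring: dict mapping each vertex to a color.
    """
    # reachable[v] holds the color sets of colorful paths that end at v.
    reachable = {v: {frozenset([coloring[v]])} for v in adj}
    for _ in range(k - 1):
        extended = {v: set() for v in adj}
        for u in adj:
            for used in reachable[u]:
                for v in adj[u]:
                    # Extending only with an unused color keeps the path simple.
                    if coloring[v] not in used:
                        extended[v].add(used | {coloring[v]})
        reachable = extended
    return any(reachable[v] for v in adj)


def color_coding_has_path(adj, k, trials, rng=random):
    """Repeat (1) random coloring and (2) colorful search for `trials` rounds."""
    vertices = list(adj)
    for _ in range(trials):
        coloring = {v: rng.randrange(k) for v in vertices}
        if has_colorful_path(adj, coloring, k):
            return True
    return False
```

The error is one-sided: a colorful path is necessarily simple, so an answer of <code>True</code> is always correct, while <code>False</code> may only mean that no trial happened to color an existing path colorfully.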
| |
| | |
| Suppose a copy of <math>H</math> in <math>G</math> becomes colorful with some non-zero probability <math>p</math>. It immediately follows that if the random coloring is repeated <math>\tfrac{1}{p}</math> times, then this copy is expected to become colorful once. Although <math>p</math> is small, it can be shown that if <math>|V_H| = O(\log V)</math>, then <math>p</math> is only polynomially small. Suppose further that there exists an algorithm that, given a graph <math>G</math> and a coloring which maps each vertex of <math>G</math> to one of the <math>k</math> colors, finds a colorful copy of <math>H</math>, if one exists, within some runtime <math>O(r)</math>. Then the expected time to find a copy of <math>H</math> in <math>G</math>, if one exists, is <math>O(\tfrac{r}{p})</math>.
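To get a feel for how many repetitions are involved, the following Python sketch computes the success probability and the number of independent colorings needed to push the failure probability below a chosen threshold. It assumes <math>p = k!/k^k</math>, the probability that <math>k</math> fixed vertices receive <math>k</math> distinct colors under a uniformly random coloring; the function names and the <code>failure_prob</code> parameter are illustrative.

```python
import math


def colorful_probability(k):
    """Probability that k fixed vertices get k distinct colors out of k."""
    return math.factorial(k) / k**k


def trials_for_confidence(k, failure_prob):
    """Smallest t with (1 - p)^t <= failure_prob, for p = k!/k^k."""
    p = colorful_probability(k)
    if p == 1.0:
        # k = 1: every coloring is colorful, a single trial suffices.
        return 1
    return math.ceil(math.log(failure_prob) / math.log(1.0 - p))
```

For example, <code>trials_for_confidence(2, 0.01)</code> evaluates to 7, since each trial succeeds with probability 1/2 and <math>(1/2)^7 < 0.01</math>.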
| |
| | |
| Sometimes it is also desirable to use a more restricted version of colorfulness. For example, in the context of finding cycles in [[planar graphs]], it is possible to develop an algorithm that finds well-colored cycles. Here, a cycle is well-colored if its vertices are colored by consecutive colors.
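As a rough illustration of the restricted notion, the Python sketch below checks whether a given cycle is well-colored. It rests on an assumption not spelled out above: colors are taken from <code>0</code> to <code>k - 1</code>, and "consecutive" is read as each vertex having the color of its predecessor plus one modulo <math>k</math>, in one of the two directions around the cycle. The function name <code>is_well_colored</code> is hypothetical.

```python
def is_well_colored(cycle, coloring, k):
    """Check whether the colors increase by one (mod k) around the cycle.

    cycle: sequence of the k vertices in cyclic order (assumed input format).
    coloring: dict mapping each vertex to a color in 0..k-1.
    """
    if len(cycle) != k:
        return False
    colors = [coloring[v] for v in cycle]
    # Walk the cycle in the given direction, then in the opposite one.
    forward = all((colors[(i + 1) % k] - colors[i]) % k == 1 for i in range(k))
    backward = all((colors[i] - colors[(i + 1) % k]) % k == 1 for i in range(k))
    return forward or backward
```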
| |
| | |
| ===Example===
| |
| | |
| An example would be finding a simple cycle of length <math>k</math> in graph <math>G = (V, E)</math>.
| |
| | |
| Under the random coloring method, each simple cycle of length <math>k</math> becomes colorful with probability <math>k!/k^k > \tfrac{1}{e^k}</math>, since there are <math>k^k</math> ways of coloring the <math>k</math> vertices on the cycle, among which there are <math>k!</math> colorful occurrences. Then an algorithm (described below) of runtime <math>O(V^\omega)</math> can be used to find colorful cycles in the randomly colored graph <math>G</math>. Therefore, it takes <math>e^k\cdot O(V^\omega)</math> expected time overall to find a simple cycle of length <math>k</math> in <math>G</math>.
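The lower bound on the probability can be checked from the power series of the exponential function, since the single term with <math>i = k</math> is smaller than the whole sum of positive terms:

: <math>e^k = \sum_{i=0}^{\infty} \frac{k^i}{i!} > \frac{k^k}{k!} \quad\Longrightarrow\quad \frac{k!}{k^k} > \frac{1}{e^k}.</math>

For instance, with <math>k = 4</math> the probability is <math>24/256 \approx 0.094</math>, while <math>e^{-4} \approx 0.018</math>.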
| |
| | |
| The colorful cycle-finding algorithm works by first finding all pairs of vertices in ''V'' that are connected by a colorful simple path of length ''k'' − 1, and then checking whether the two vertices in each pair are adjacent. Given a coloring function <math>c: V\rightarrow \{1, \dots, k\}</math> that colors graph <math>G</math>, enumerate all partitions of the color set <math>\{1, \dots, k\}</math> into two subsets <math>C_1</math>, <math>C_2</math> of size <math>k/2</math> each. Each partition divides <math>V</math> into <math>V_1</math> and <math>V_2</math> accordingly; let <math>G_1</math> and <math>G_2</math> denote the subgraphs induced by <math>V_1</math> and <math>V_2</math> respectively. Then, recursively find colorful paths of length <math>k/2 - 1</math> in each of <math>G_1</math> and <math>G_2</math>. Suppose the boolean matrices <math>A_1</math> and <math>A_2</math> record which pairs of vertices in <math>G_1</math> and <math>G_2</math>, respectively, are connected by a colorful path, and let <math>B</math> be the matrix describing the adjacency relations between vertices of <math>V_1</math> and those of <math>V_2</math>. Then the boolean product <math>A_1BA_2</math> gives the pairs of vertices in <math>V</math> that are connected by a colorful path of length <math>k-1</math> whose first half uses the colors of <math>C_1</math> and whose second half uses those of <math>C_2</math>; taking the union over all partitions gives all pairs connected by a colorful path of length <math>k-1</math>. Thus, the number of matrix multiplications satisfies the recurrence <math>t(k) \le 2^k\cdot t(k/2)</math>, which yields a runtime of <math>2^{O(k)}\cdot V^\omega</math>, that is, <math>O(V^\omega)</math> for fixed <math>k</math>. Although this algorithm finds only the end points of the colorful path, another algorithm by Alon and Naor<ref>Alon, N. and Naor, M. 1994 Derandomization, Witnesses for Boolean Matrix Multiplication and Construction of Perfect Hash Functions. Technical Report. UMI Order Number: CS94-11., Weizmann Science Press of Israel.</ref> that finds colorful paths themselves can be incorporated into it.
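The recursion over color partitions can be sketched in Python as follows. This is an illustrative simplification: sets of vertex pairs stand in for the boolean matrices, so the combination step plays the role of the product <math>A_1BA_2</math> but does not achieve the fast matrix multiplication running time. The adjacency format (a dictionary of neighbour sets) and the function names are assumptions, and <math>k \ge 3</math> is assumed for the cycle test.

```python
from itertools import combinations


def colorful_path_pairs(adj, coloring, colors):
    """Pairs (u, v) joined by a path using each color in `colors` exactly once."""
    colors = frozenset(colors)
    if len(colors) == 1:
        (c,) = colors
        return {(v, v) for v in adj if coloring[v] == c}
    pairs = set()
    half = len(colors) // 2
    # Enumerate the color sets C1 used by the first part of the path.
    for first in combinations(sorted(colors), half):
        c1 = frozenset(first)
        c2 = colors - c1
        left = colorful_path_pairs(adj, coloring, c1)    # plays the role of A1
        right = colorful_path_pairs(adj, coloring, c2)   # plays the role of A2
        ends_by_start = {}
        for y, v in right:
            ends_by_start.setdefault(y, set()).add(v)
        # Join a left path u..x and a right path y..v through an edge x-y (B).
        for u, x in left:
            for y in adj[x]:
                for v in ends_by_start.get(y, ()):
                    pairs.add((u, v))
    return pairs


def has_colorful_cycle(adj, coloring, k):
    """True if the end points of some colorful k-vertex path are adjacent."""
    pairs = colorful_path_pairs(adj, coloring, range(1, k + 1))
    return any(v in adj[u] for u, v in pairs)
```

Because the two halves use disjoint color sets, they cannot share a vertex, which is why the joined path is automatically simple.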
| |
| | |
| ==Derandomization==
| |
| | |
| The [[derandomization]] of color-coding involves enumerating possible colorings of a graph <math>G</math>, such that the randomness of coloring <math>G</math> is no longer required. For the target subgraph <math>H</math> in <math>G</math> to be discoverable, the enumeration has to include at least one coloring in which <math>H</math> is colorful. To achieve this, enumerating a <math>k</math>-perfect family <math>F</math> of hash functions from <math>\{1, 2, \dots, |V|\}</math> to <math>\{1, 2, \dots, k\}</math> is sufficient. By definition, <math>F</math> is <math>k</math>-perfect if for every subset <math>S</math> of <math>\{1, 2, \dots, |V|\}</math> where <math>|S| = k</math>, there exists a hash function <math>h\in F</math> such that <math>h: S \rightarrow \{1, 2, \dots, k\}</math> is [[perfect hash|perfect]]. In other words, for any given <math>k</math> vertices there must exist a hash function in <math>F</math> that colors them with <math>k</math> distinct colors.
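The definition can be made concrete with a brute-force check in Python. This is only a sketch for tiny inputs, since it enumerates every <math>k</math>-subset; it assumes each hash function is represented as a dictionary from <code>1..n</code> to colors, and the name <code>is_k_perfect</code> is hypothetical.

```python
from itertools import combinations


def is_k_perfect(family, n, k):
    """Check that every k-subset of {1..n} is colored injectively by some h."""
    for subset in combinations(range(1, n + 1), k):
        # The subset is covered if some h assigns it k distinct colors.
        if not any(len({h[x] for x in subset}) == k for h in family):
            return False
    return True
```

A derandomized search then runs the colorful-subgraph algorithm once for each function in the family instead of once per random coloring.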
| |
| | |
| There are several approaches to constructing such a <math>k</math>-perfect hash family:
| |
| | |
| # The best explicit construction is by [[Moni Naor]], [[Leonard J. Schulman]], and [[Aravind Srinivasan]],<ref>Naor, M., Schulman, L. J., and Srinivasan, A. 1995. Splitters and near-optimal derandomization. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science (October 23–25, 1995). FOCS. IEEE Computer Society, Washington, DC, 182.</ref> where a family of size <math>e^k k^{O(\log k)} \log |V|</math> can be obtained. This construction does not require the target subgraph to exist in the original subgraph finding problem.
| |
| # Another explicit construction by [[Jeanette P. Schmidt]] and [[Alan Siegel]]<ref name="SS90">Schmidt, J. P. and Siegel, A. 1990. The spatial complexity of oblivious k-probe Hash functions. SIAM J. Comput. 19, 5 (Sep. 1990), 775-786. DOI= http://dx.doi.org/10.1137/0219054</ref> yields a family of size <math>2^{O(k)}\log^2 |V|</math>.
| |
| # Another construction that appears in the original paper of [[Noga Alon]] et al.<ref name="orig" /> can be obtained by first building a <math>k</math>-perfect family that maps <math>\{1, 2, \dots, |V|\}</math> to <math>\{1, 2,\dots, k^2\}</math>, followed by building another <math>k</math>-perfect family that maps <math>\{1, 2, \dots, k^2\}</math> to <math>\{1, 2, \dots, k\}</math>. In the first step, it is possible to construct such a family with <math>2n\log k</math> random bits (where <math>n = |V|</math>) that are almost <math>2\log k</math>-wise independent,<ref>Naor, J. and Naor, M. 1990. Small-bias probability spaces: efficient constructions and applications. In Proceedings of the Twenty-Second Annual ACM Symposium on theory of Computing (Baltimore, Maryland, United States, May 13–17, 1990). H. Ortiz, Ed. STOC '90. ACM, New York, NY, 213-223. DOI= http://doi.acm.org/10.1145/100216.100244</ref><ref>Alon, N., Goldreich, O., Hastad, J., and Peralta, R. 1990. Simple construction of almost k-wise independent random variables. In Proceedings of the 31st Annual Symposium on Foundations of Computer Science (October 22–24, 1990). SFCS. IEEE Computer Society, Washington, DC, 544-553 vol.2. DOI= http://dx.doi.org/10.1109/FSCS.1990.89575</ref> and the sample space needed for generating those random bits can be as small as <math>k^{O(1)}\log |V|</math>. In the second step, it has been shown by Jeanette P. Schmidt and Alan Siegel<ref name="SS90"/> that the size of such a <math>k</math>-perfect family can be <math>2^{O(k)}</math>. Consequently, by composing the <math>k</math>-perfect families from both steps, a <math>k</math>-perfect family of size <math>2^{O(k)}\log |V|</math> that maps from <math>\{1, 2, \dots, |V|\}</math> to <math>\{1, 2, \dots, k\}</math> can be obtained.
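The composition step of the third approach can be illustrated with a short Python sketch. It only shows how two families are combined, not how either family is constructed; hash functions are assumed to be dictionaries, and <code>compose_families</code> is a hypothetical name.

```python
def compose_families(first, second):
    """Return every composition g(h(x)) with h from `first` and g from `second`.

    first: functions from {1..n} to an intermediate range (e.g. {1..k^2}).
    second: functions from that intermediate range to {1..k}.
    """
    return [{x: g[h[x]] for x in h} for h in first for g in second]
```

The composed family has size <code>len(first) * len(second)</code>. It is <math>k</math>-perfect whenever both inputs are: for any <math>k</math> vertices some <code>h</code> maps them to <math>k</math> distinct intermediate values, and some <code>g</code> then maps those values to <math>k</math> distinct colors.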
| |
| | |
| In the case of derandomizing well-coloring, where the vertices of the subgraph are colored with consecutive colors, a <math>k</math>-perfect family of hash functions from <math>\{1, 2, \dots, |V|\}</math> to <math>\{1, 2, \dots, k!\}</math> is needed. A sufficient <math>k</math>-perfect family which maps from <math>\{1, 2, \dots, |V|\}</math> to <math>\{1, 2, \dots, k^k\}</math> can be constructed in a way similar to the first step of approach 3 above. In particular, it is done by using <math>nk\log k</math> random bits (where <math>n = |V|</math>) that are almost <math>k\log k</math>-wise independent, and the size of the resulting <math>k</math>-perfect family will be <math>k^{O(k)}\log |V|</math>.
| |
| | |
| The derandomization of the color-coding method can be easily parallelized, yielding efficient [[NC (complexity)|NC]] algorithms.
| |
| | |
| ==Applications==
| |
| | |
| Color-coding has also attracted much attention in the field of bioinformatics. One example is the detection of [[Wnt signaling pathway|signaling pathways]] in [[protein-protein interaction]] (PPI) networks. Another is the discovery and counting of [[Structural motif|motifs]] in PPI networks. Studying both [[Wnt signaling pathway|signaling pathways]] and [[Structural motif|motifs]] allows a deeper understanding of the similarities and differences of many biological functions, processes, and structures among organisms.
| |
| | |
| Due to the huge amount of gene data that can be collected, searching for pathways or motifs can be highly time-consuming. However, by exploiting the color-coding method, motifs or signaling pathways with <math>k=O(\log n)</math> vertices in a network <math>G</math> with <math>n</math> vertices can be found in polynomial time. This makes it possible to explore more complex or larger structures in PPI networks.<ref>Alon, N., Dao, P., Hajirasouliha, I., Hormozdiari, F., and Sahinalp, S. C. 2008. Biomolecular network motif counting and discovery by color coding. Bioinformatics 24, 13 (Jul. 2008), i241-i249. DOI= http://dx.doi.org/10.1093/bioinformatics/btn163</ref><ref>Hüffner, F., Wernicke, S., and Zichner, T. 2008. Algorithm Engineering for Color-Coding with Applications to Signaling Pathway Detection. Algorithmica 52, 2 (Aug. 2008), 114-132. DOI= http://dx.doi.org/10.1007/s00453-007-9008-7</ref>
| |
| | |
| ==References==
| {{reflist}}
| |
| | |
| {{DEFAULTSORT:Color-Coding}}
| |
| [[Category:Graph algorithms]]
| |