Truncated tesseract: Difference between revisions

Latest revision as of 06:10, 2 January 2015


@@ Line 1: / Line 1: @@
-Using '''universal hashing''' (in a [[randomized algorithm]] or data structure) refers to selecting a [[hash function]] at random from a family of hash functions with a certain mathematical property (see definition below). This guarantees a low number of collisions in expectation, even if the data is chosen by an adversary. Many universal families are known (for hashing integers, vectors, strings), and their evaluation is often very efficient. Universal hashing has numerous uses in computer science, for example in implementations of [[hash table]]s, randomized algorithms, and [[cryptography]].
+Nice to satisfy you, my title is Refugia. Puerto Rico is exactly where he's usually been residing but she needs to transfer simply because of her family members. To collect cash is what his family members and him appreciate. For years he's been operating as a receptionist.<br><br>Feel free to surf to my web blog :: [http://www.sex-porn-tube.ch/user/JJeannere std testing at home]
-== Introduction ==
-{{see also|Hash function}}
-Assume we want to map keys from some universe <math>U</math> into <math>m</math> bins (labelled <math>[m] = \{0, \dots, m-1\}</math>). The algorithm will have to handle some data set <math>S \subseteq U</math> of <math>|S|=n</math> keys, which is not known in advance. Usually, the goal of hashing is to obtain a low number of collisions (keys from <math>S</math> that land in the same bin). A deterministic hash function cannot offer any guarantee in an adversarial setting if the size of <math>U</math> is greater than <math>m \cdot n</math>: by the pigeonhole principle, some bin then receives more than <math>n</math> keys of <math>U</math>, and the adversary may choose <math>S</math> entirely from the [[Image (mathematics)|preimage]] of that bin. This means that all data keys land in the same bin, making hashing useless. Furthermore, a deterministic hash function does not allow for ''rehashing'': sometimes the input data turns out to be bad for the hash function (e.g. there are too many collisions), so one would like to change the hash function.
-The solution to these problems is to pick a function randomly from a family of hash functions. A family of functions <math>H = \{ h : U \to [m] \}</math> is called a '''universal family''' if, <math>\forall x, y \in U, ~ x\ne y: ~~ \Pr_{h\in H} [h(x) = h(y)] \le \frac{1}{m}</math>.
-In other words, any two keys of the universe collide with probability at most <math>1/m</math> when the hash function <math>h</math> is drawn randomly from <math>H</math>. This is exactly the probability of collision we would expect if the hash function assigned truly random hash codes to every key. Sometimes, the definition is relaxed to allow collision probability <math>O(1/m)</math>. This concept was introduced by Carter and Wegman<ref name=CW77>
-{{cite journal
-   | last1 = Carter | first1 = Larry
-   | last2 = Wegman | first2 = Mark N. | author2-link = Mark N. Wegman
-   | title = Universal Classes of Hash Functions
-   | journal = Journal of Computer and System Sciences
-   | volume = 18
-   | issue = 2
-   | pages = 143–154
-   | year = 1979
-   | doi = 10.1016/0022-0000(79)90044-8
-   | id = Conference version in STOC'77
-}}</ref> in 1977, and has found numerous applications in computer science (see, for example <ref name=Miltersen>
-{{cite web
-   | last = Miltersen
-   | first = Peter Bro
-   | title = Universal Hashing
-   | url = http://www.daimi.au.dk/~bromille/Notes/un.pdf
-   | format = PDF
-   | archiveurl = http://www.webcitation.org/5hmOaVISI
-   | archivedate = 24 June 2009
-}}</ref>). If we have an upper bound of <math>\epsilon<1</math> on the collision probability, we say that we have <math>\epsilon</math>-almost universality.
-Many, but not all, universal families have the following stronger '''uniform difference property''':
-: <math>\forall x,y\in U, ~ x\ne y</math>, when <math>h</math> is drawn randomly from the family <math>H</math>, the difference <math>h(x)-h(y) ~\bmod~ m</math> is uniformly distributed in <math>[m]</math>. Note that the definition of universality is only concerned with whether <math>h(x)-h(y)=0</math>, which counts collisions. The uniform difference property is stronger.
-(Similarly, a universal family can be XOR universal if <math>\forall x,y\in U, ~ x\ne y</math>, the value <math>h(x) \oplus h(y) ~\bmod~ m</math> is uniformly distributed in <math>[m]</math> where <math>\oplus</math> is the bitwise exclusive or operation. This is only possible if <math>m</math> is a power of two.)
-An even stronger condition is [[Pairwise independent|pairwise independence]]: for all <math>x,y\in U, ~ x\ne y</math>, the probability that <math>x,y</math> hash to any given pair of hash values <math>z_1, z_2</math> is the same as if they were perfectly random: <math>P(h(x)=z_1 \land h(y)=z_2)= 1/m^2</math>. Pairwise independence is sometimes called strong universality.
-Another property is uniformity. We say that a family is uniform if all hash values are equally likely: <math>P(h(x)=z)=1/m</math> for any hash value <math>z</math>. Universality does not imply uniformity. However, strong universality does imply uniformity.
-Given a family with the uniform difference property, one can produce a pairwise independent or strongly universal hash family by adding a uniformly distributed random constant with values in <math>[m]</math> to the hash functions. (Similarly, if <math>m</math> is a power of two, we can achieve pairwise independence from an XOR universal hash family by doing an exclusive or with a uniformly distributed random constant.) Since a shift by a constant is sometimes irrelevant in applications (e.g. hash tables), a careful distinction between the uniform difference property and pairwise independence is sometimes not made.<ref>
-{{cite book
- | last1 = Motwani
- | first1 = Rajeev
- | last2 = Raghavan
- | first2 = Prabhakar
- | title = Randomized Algorithms
- | publisher = Cambridge University Press
- | year = 1995
- | isbn = 0-521-47465-5
- | page = 221
-}}
-</ref>
-For some applications (such as hash tables), it is important for the least significant bits of the hash values to be also universal. When a family is strongly universal, this is guaranteed: if <math>H</math> is a strongly universal family with <math>m=2^L</math>, then the family made of the functions <math>h \bmod{2^{L'}}</math> for all <math>h \in H</math> is also strongly universal for <math>L'\leq L</math>. Unfortunately, the same is not true of (merely) universal families. For example the family made of the identity function <math>h(x)=x</math> is clearly universal, but the family made of the function <math>h(x)=x  \bmod{2^{L'}}</math> fails to be universal.
-== Mathematical guarantees ==
-For any fixed set <math>S</math> of <math>n</math> keys, using a universal family guarantees the following properties.
-# For any fixed <math>x</math> in <math>S</math>, the expected number of keys in the bin <math>h(x)</math> is <math>n/m</math>. When implementing hash tables by [[Hash table#Separate chaining|chaining]], this number is proportional to the expected running time of an operation involving the key <math>x</math> (for example a query, insertion or deletion).
-# The expected number of pairs of keys <math>x,y</math> in <math>S</math> with <math>x\ne y</math> that collide (<math>h(x) = h(y)</math>) is bounded above by <math>n(n-1)/(2m)</math>, which is of order <math>O(n^2/m)</math>. When the number of bins, <math>m</math>, is <math>O(n)</math>, the expected number of collisions is <math>O(n)</math>. When hashing into <math>n^2</math> bins, there are no collisions at all with probability at least a half.
-# The expected number of keys in bins with at least <math>t</math> keys in them is bounded above by <math>2n/(t-2(n/m)+1)</math>.<ref name=BDP>
-{{cite journal
-   | doi = 10.1007/s00453-007-9036-3
-   | last1 = Baran  | first1 = Ilya
-   | last2 = Demaine  | first2 = Erik D.
-   | last3 = Pătraşcu | first3 = Mihai | author3-link = Mihai Pătraşcu
-   | title = Subquadratic Algorithms for 3SUM
-   | journal = Algorithmica
-   | volume = 50
-   | issue = 4
-   | pages = 584–596
-   | year = 2008
-   | url = http://people.csail.mit.edu/mip/papers/3sum/3sum.pdf
-}}</ref> Thus, if the capacity of each bin is capped to three times the average size (<math>t = 3n/m</math>), the total number of keys in overflowing bins is at most <math>O(m)</math>. This only holds with a hash family whose collision probability is bounded above by <math>1/m</math>. If a weaker definition is used, bounding it by <math>O(1/m)</math>, this result is no longer true.<ref name=BDP />
-As the above guarantees hold for any fixed set <math>S</math>, they hold if the data set is chosen by an adversary. However, the adversary has to make this choice before (or independent of) the algorithm's random choice of a hash function. If the adversary can observe the random choice of the algorithm, randomness serves no purpose, and the situation is the same as deterministic hashing.
-The second and third guarantees are typically used in conjunction with [[Double hashing|rehashing]]. For instance, a randomized algorithm may be prepared to handle some <math>O(n)</math> number of collisions. If it observes too many collisions, it chooses another random <math>h</math> from the family and repeats. Universality guarantees that the number of repetitions is a [[Geometric distribution|geometric random variable]].
-== Constructions ==
-Since any computer data can be represented as one or more machine words, one generally needs hash functions for three types of domains: machine words ("integers"); fixed-length vectors of machine words; and variable-length vectors ("strings").
-=== Hashing integers ===
-This section refers to the case of hashing integers that fit in machine words; thus, operations like multiplication, addition, division, etc. are cheap machine-level instructions. Let the universe to be hashed be <math>U = \{0, \dots, u-1\}</math>.
-The original proposal of Carter and Wegman<ref name=CW77 /> was to pick a prime <math>p \ge u</math> and define
-: <math> h_{a,b}(x) = ((ax + b)~\bmod ~ p)~\bmod ~ m</math>
-where <math>a,b</math> are randomly chosen integers modulo <math>p</math> with <math>a \neq 0</math>. Technically, adding <math>b</math> is not needed for universality (but it does make the hash function 2-independent).
-(This is a single iteration of a [[linear congruential generator]].)
-To see that <math>H = \{ h_{a,b} \}</math> is a universal family, note that <math>h(x) = h(y)</math> only holds when
-: <math>ax+b \equiv ay + b + i\cdot m \pmod{p}</math>
-for some integer <math>i</math> between <math>0</math> and <math>(p-1)/m</math>. If <math>x \neq y</math>, their difference, <math>x-y</math>, is nonzero and has an inverse modulo <math>p</math>. Solving for <math>a</math>,
-: <math>a \equiv i\cdot m \cdot (x-y)^{-1} \pmod{p}</math>.
-There are <math>p-1</math> possible choices for <math>a</math> (since <math>a=0</math> is excluded) and, varying <math>i</math> in the allowed range, <math>\lfloor (p-1)/m \rfloor</math> possible nonzero values for the right hand side. Thus the collision probability is at most
-: <math>\lfloor (p-1)/m \rfloor / (p-1)</math>
-which is at most <math>1/m</math>, as required. This analysis also shows that <math>b</math> does not have to be randomised in order to have universality.
-Another way to see <math>H</math> is a universal family is via the notion of [[statistical distance]]. Write the difference <math>h(x) - h(y)</math> as
-: <math>h(x)-h(y) \equiv (a(x-y)~ \bmod~ p) \pmod{m}</math>.
-Since <math>x - y</math> is nonzero and <math>a</math> is uniformly distributed in <math>\{1,\dots,p-1\}</math>, it follows that <math>a(x-y)</math> modulo <math>p</math> is also uniformly distributed in <math>\{1,\dots,p-1\}</math>. The distribution of <math>(h(x)-h(y)) ~\bmod~ m</math> is thus almost uniform, up to a difference in probability of <math>\pm 1/p</math> between the samples. As a result, the statistical distance to a uniform family is <math>O(m/p)</math>, which becomes negligible when <math>p \gg m</math>.
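The family <math>h_{a,b}</math> above can be sketched in C as follows. This is a minimal illustration, not a reference implementation: the function names <code>cw_hash</code> and <code>cw_count_collisions</code> are hypothetical, and the code assumes <math>p < 2^{32}</math> so that <math>ax+b</math> fits in a 64-bit word. The second function enumerates every pair <math>(a,b)</math> for a small prime, which makes it possible to check the collision bound empirically.

```c
#include <assert.h>
#include <stdint.h>

/* h_{a,b}(x) = ((a*x + b) mod p) mod m, for a prime p, 1 <= a < p, 0 <= b < p.
   Assumption: p < 2^32, so a*x + b cannot overflow 64 bits. */
uint32_t cw_hash(uint32_t a, uint32_t b, uint32_t x, uint32_t p, uint32_t m)
{
    assert(p > 1 && m > 0 && a >= 1 && a < p && b < p && x < p);
    uint64_t t = (uint64_t)a * x + b;
    return (uint32_t)((t % p) % m);
}

/* Counts the pairs (a, b) under which x and y collide.
   Exhaustive over all (p-1)*p pairs, so only sensible for small p. */
uint64_t cw_count_collisions(uint32_t x, uint32_t y, uint32_t p, uint32_t m)
{
    uint64_t collisions = 0;
    for (uint32_t a = 1; a < p; a++)
        for (uint32_t b = 0; b < p; b++)
            if (cw_hash(a, b, x, p, m) == cw_hash(a, b, y, p, m))
                collisions++;
    return collisions;
}
```

For example, with <math>p=17</math> and <math>m=4</math> there are <math>16 \cdot 17 = 272</math> pairs <math>(a,b)</math>, so universality requires that at most <math>272/4 = 68</math> of them make two distinct keys collide.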
-==== Avoiding modular arithmetic ====
-The state of the art for hashing integers is the '''multiply-shift''' scheme described by Dietzfelbinger et al. in 1997.<ref name=DHKP97>
-{{cite journal
-  | last1 = Dietzfelbinger | first1 = Martin
-  | last2 = Hagerup | first2 = Torben
-  | last3 = Katajainen | first3 = Jyrki
-  | last4 = Penttonen | first4 = Martti
-  | title = A Reliable Randomized Algorithm for the Closest-Pair Problem
-  | journal = Journal of Algorithms
-  | volume = 25
-  | issue = 1
-  | pages = 19–51
-  | doi = 10.1006/jagm.1997.0873
-  | url = http://www.diku.dk/~jyrki/Paper/CP-11.4.1997.ps
-  | accessdate = 10 February 2011
-  | format = Postscript
-  | year = 1997
-}}</ref> By avoiding modular arithmetic, this method is much easier to implement and also runs significantly faster in practice (usually by at least a factor of four<ref>
-{{cite web
-   | last = Thorup
-   | first = Mikkel | authorlink = Mikkel Thorup
-   | title = Text-book algorithms at SODA
-   | url = http://mybiasedcoin.blogspot.com/2009/12/text-book-algorithms-at-soda-guest-post.html
-}}</ref>). The scheme assumes the number of bins is a power of two, <math>m=2^M</math>. Let <math>w</math> be the number of bits in a machine word. Then the hash functions are parametrised over odd positive integers <math>a < 2^w</math> (that fit in a word of <math>w</math> bits). To evaluate <math>h_{a}(x)</math>, multiply <math>x</math> by <math>a</math> modulo <math>2^w</math> and then keep the high order <math>M</math> bits as the hash code. In mathematical notation, this is
-: <math>h_a(x) = (a\cdot x\,\, \bmod\, 2^w)\,\, \mathrm{div}\,\, 2^{w-M}</math>
-and it can be implemented in [[C (programming language)|C]]-like programming languages by
-: <math>h_a(x) = </math> <code>(unsigned) (a*x) >> (w-M)</code>
-This scheme does ''not'' satisfy the uniform difference property and is only ''<math>2/m</math>-almost-universal''; for any <math>x\neq y</math>, <math>\Pr\{h_a(x) = h_a(y)\} \le 2/m</math>.
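A slightly fuller version of the one-line expression above, fixed to <math>w = 32</math>, might look like the sketch below. The function name <code>multiply_shift</code> and the choice of 32-bit words are assumptions made for illustration. The explicit check on <math>M</math> is there because shifting a 32-bit value by 32 positions is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* h_a(x) = (a*x mod 2^32) div 2^(32-M), i.e. w = 32, with a odd. */
uint32_t multiply_shift(uint32_t a, uint32_t x, unsigned M)
{
    assert((a & 1u) == 1u);      /* the multiplier must be odd */
    assert(M >= 1 && M <= 32);   /* keeps the shift amount in 0..31 */
    /* Unsigned multiplication wraps modulo 2^32, which is the "mod 2^w". */
    return (uint32_t)(a * x) >> (32 - M);
}
```

The caller is expected to draw <code>a</code> uniformly at random among odd 32-bit integers; the sketch does not do this itself.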
-To understand the behavior of the hash function, notice that, if <math>ax \bmod 2^w</math> and <math>ay\bmod 2^w</math> have the same highest-order <math>M</math> bits, then <math>a(x-y) \bmod 2^w</math> has either all 1's or all 0's as its highest-order <math>M</math> bits (depending on whether <math>ax \bmod 2^w</math> or <math>ay \bmod 2^w</math> is larger).
-Assume that the least significant set bit of <math>x-y</math> appears on position <math>w-c</math>. Since <math>a</math> is a random odd integer and odd integers have inverses in the [[Ring (mathematics)|ring]] <math>Z_{2^w}</math>, it follows that <math>a(x-y)\bmod 2^w</math> will be uniformly distributed among <math>w</math>-bit integers with the least significant set bit on position <math>w-c</math>. If <math>c > M</math>, the highest-order <math>M</math> bits all lie above that position and are uniformly distributed, so the probability that they are all 0's or all 1's is at most <math>2/2^M=2/m</math>.
-On the other hand, if <math>c < M</math>, then the highest-order <math>M</math> bits of <math>a(x-y) \bmod 2^w</math> contain both 0's and 1's, so it is certain that <math>h_a(x) \ne h_a(y)</math>. Finally, if <math>c=M</math> then bit <math>w-M</math> of <math>a(x-y) \bmod 2^w</math> is 1 and <math>h_a(x)=h_a(y)</math> if and only if bits <math>w-1,\ldots,w-M+1</math> are also 1, which happens with probability <math>1/2^{M-1}=2/m</math>.
-This analysis is tight, as can be shown with the example <math>x=2^{w-M-2}</math> and <math>y=3x</math>.  To obtain a truly 'universal' hash function, one can use the multiply-add-shift scheme
-: <math>h_{a,b}(x) = ((ax + b) \bmod 2^w)\, \mathrm{div}\, 2^{w-M}</math>
-which can be implemented in [[C (programming language)|C]]-like programming languages by
-: <math>h_{a,b}(x) = </math> <code>(unsigned) (a*x+b) >> (w-M)</code>
-where <math>a</math> is a random odd positive integer with <math>a < 2^w</math> and <math>b</math> is a random non-negative integer with <math>b < 2^{w-M}</math>.  With these choices of <math>a</math> and <math>b</math>, <math>\Pr\{h_{a,b}(x) = h_{a,b}(y)\}\le 1/m</math> for all <math>x\not\equiv y\pmod{2^w}</math>.<ref name="w03">{{cite thesis
-  | type = Ph.D.
-  | last = Woelfel | first = Philipp
-  | title = Über die Komplexität der Multiplikation in eingeschränkten Branchingprogrammmodellen
-  | publisher = Universität Dortmund
-  | url = http://pages.cpsc.ucalgary.ca/~woelfel/paper/diss/index.html
-  | accessdate = 18 September 2012
-  | format = PDF
-  | year = 2003
-}}</ref> This differs slightly but importantly from the mistranslation in the English paper.<ref name="w99">{{cite conference
-  | last1 = Woelfel | first1 = Philipp
-  | title = Efficient Strongly Universal and Optimally Universal Hashing
-  | conference = Mathematical Foundations of Computer Science 1999
-  | series = LNCS
-  | volume = 1672
-  | pages = 262–272
-  | doi = 10.1007/3-540-48340-3_24
-  | url = http://www.springerlink.com/content/a10p748w7pr48682/
-  | accessdate = 17 May 2011
-  | format = PDF
-  | year = 1999
-}}</ref>
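The multiply-add-shift scheme can be sketched the same way, again for <math>w = 32</math>. The function name <code>multiply_add_shift</code> is hypothetical, and the assertions simply restate the parameter ranges given above (<math>a</math> odd, <math>b < 2^{w-M}</math>).

```c
#include <assert.h>
#include <stdint.h>

/* h_{a,b}(x) = ((a*x + b) mod 2^32) div 2^(32-M), with w = 32. */
uint32_t multiply_add_shift(uint32_t a, uint32_t b, uint32_t x, unsigned M)
{
    assert((a & 1u) == 1u);                   /* a must be odd */
    assert(M >= 1 && M <= 32);                /* shift amount stays in 0..31 */
    assert(b < (UINT32_C(1) << (32 - M)));    /* b < 2^(w-M) */
    /* Unsigned arithmetic wraps modulo 2^32, giving the "mod 2^w" for free. */
    return (uint32_t)(a * x + b) >> (32 - M);
}
```

As with multiply-shift, choosing <code>a</code> and <code>b</code> at random is left to the caller.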
-=== Hashing vectors ===
-This section is concerned with hashing a fixed-length vector of machine words. Interpret the input as a vector <math>\bar{x} = (x_0, \dots, x_{k-1})</math> of <math>k</math> machine words (integers of <math>w</math> bits each). If <math>H</math> is a universal family with the uniform difference property, the following family (dating back to Carter and Wegman<ref name=CW77 />) also has the uniform difference property (and hence is universal):
-: <math>h(\bar{x}) = \left( \sum_{i=0}^{k-1} h_i(x_i) \right)\,\bmod~m</math>, where each <math>h_i\in H</math> is chosen independently at random.
-If <math>m</math> is a power of two, one may replace summation by exclusive or.<ref name=thorup09>
-{{cite conference
-   | last = Thorup | first = Mikkel | authorlink = Mikkel Thorup
-   | title = String hashing for linear probing
-   | booktitle = Proc. 20th ACM-SIAM Symposium on Discrete Algorithms (SODA)
-   | pages = 655–664
-   | year = 2009
-   | url = http://www.siam.org/proceedings/soda/2009/SODA09_072_thorupm.pdf
-}}, section 5.3</ref>
-In practice, if double-precision arithmetic is available, this is instantiated with the multiply-shift hash family of Dietzfelbinger et al.<ref name=DGMP /> Initialize the hash function with a vector <math>\bar{a} = (a_0, \dots, a_{k-1})</math> of random '''odd''' integers on <math>2w</math> bits each. Then if the number of bins is <math>m=2^M</math> for <math>M\le w</math>:
-: <math>h_{\bar{a}}(\bar{x}) =  \left(\big( \sum_{i=0}^{k-1} x_i \cdot a_i \big) ~\bmod ~ 2^{2w} \right) \,\, \mathrm{div}\,\, 2^{2w-M}</math>.
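For <math>w = 32</math>, double-precision arithmetic is just 64-bit arithmetic, and the formula above might be sketched as below. The function name <code>vector_multiply_shift</code> and the array-based interface are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* h(x) = ((sum_i x[i] * a[i]) mod 2^64) div 2^(64-M), with w = 32.
   a[] holds k odd 64-bit multipliers, x[] holds k 32-bit words. */
uint32_t vector_multiply_shift(const uint64_t *a, const uint32_t *x,
                               size_t k, unsigned M)
{
    assert(M >= 1 && M <= 32);         /* M <= w */
    uint64_t sum = 0;
    for (size_t i = 0; i < k; i++) {
        assert((a[i] & 1u) == 1u);     /* each multiplier must be odd */
        sum += a[i] * x[i];            /* x[i] widens to 64 bits; wraps mod 2^64 */
    }
    return (uint32_t)(sum >> (64 - M));
}
```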
-It is possible to halve the number of multiplications, which roughly translates to a two-fold speed-up in practice.<ref name=thorup09 /> Initialize the hash function with a vector <math>\bar{a} = (a_0, \dots, a_{k-1})</math> of random '''odd''' integers on <math>2w</math> bits each. The following hash family is universal:<ref name=black>
-{{cite conference
-   | last1 = Black | first1 = J.
-  | last2 = Halevi | first2 = S.
-  | last3 = Krawczyk | first3 = H.
-  | last4 = Krovetz | first4 = T.
-   | title = UMAC: Fast and Secure Message Authentication
-   | booktitle = Advances in Cryptology (CRYPTO '99)
-   | year = 1999
-   | url = http://www.cs.ucdavis.edu/~rogaway/papers/umac-full.pdf
-}}, Equation 1</ref>
-: <math>h_{\bar{a}}(\bar{x}) = \left(\Big( \sum_{i=0}^{\lceil k/2 \rceil - 1} (x_{2i} + a_{2i}) \cdot (x_{2i+1} + a_{2i+1}) \Big) \bmod ~ 2^{2w} \right) \,\, \mathrm{div}\,\, 2^{2w-M}</math>.
-If double-precision operations are not available, one can interpret the input as a vector of half-words (<math>w/2</math>-bit integers). The algorithm will then use <math>\lceil k/2 \rceil</math> multiplications, where <math>k</math> was the number of half-words in the vector. Thus, the algorithm runs at a "rate" of one multiplication per word of input.
-The same scheme can also be used for hashing integers, by interpreting their bits as vectors of bytes. In this variant, the vector technique is known as [[tabulation hashing]] and it provides a practical alternative to multiplication-based universal hashing schemes.<ref>{{cite conference
- | last1 = Pătraşcu | first1 = Mihai | author1-link = Mihai Pătraşcu
- | last2 = Thorup | first2 = Mikkel | author2-link = Mikkel Thorup
- | arxiv = 1011.5200
- | title = The power of simple tabulation hashing
- | doi = 10.1145/1993636.1993638
- | pages = 1–10
- | booktitle = Proceedings of the 43rd annual ACM Symposium on Theory of Computing (STOC '11)
- | year = 2011}}</ref>
-Strong universality at high speed is also possible.<ref name="kaser2013">{{cite journal
- | last1 = Kaser | first1 = Owen
- | last2 = Lemire | first2 = Daniel
- | arxiv =  1202.4961
- | title = Strongly universal string hashing is fast
- | year = 2013
- | journal = Computer Journal
- | url = http://dx.doi.org/10.1093/comjnl/bxt070
- | doi = 10.1093/comjnl/bxt070
- | publisher = Oxford University Press}}</ref> Initialize the hash function with a vector <math>\bar{a} = (a_0, \dots, a_{k})</math> of random integers on <math>2w</math> bits. Compute
-: <math>h_{\bar{a}}^{\mathrm{strong}}(\bar{x}) = \left(\Big(a_0 + \sum_{i=0}^{k-1} a_{i+1} x_{i}\Big) \bmod ~ 2^{2w} \right) \,\, \mathrm{div}\,\, 2^w</math>.
-The result is strongly universal on <math>w</math> bits. Experimentally, it was found to run at 0.2 CPU cycle per byte on recent Intel processors for <math>w = 32</math>.
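A minimal C sketch of this strongly universal scheme for <math>w = 32</math> is given below. The function name <code>strongly_universal_hash</code> is hypothetical, and the code assumes the caller supplies <math>k+1</math> random 64-bit values in <code>a</code> for an input of <math>k</math> words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* h(x) = ((a[0] + sum_{i<k} a[i+1] * x[i]) mod 2^64) div 2^32, with w = 32.
   a[] must hold k+1 random 64-bit integers; x[] holds k 32-bit words. */
uint32_t strongly_universal_hash(const uint64_t *a, const uint32_t *x, size_t k)
{
    assert(a != NULL && (k == 0 || x != NULL));
    uint64_t sum = a[0];
    for (size_t i = 0; i < k; i++)
        sum += a[i + 1] * x[i];    /* wraps modulo 2^64 */
    return (uint32_t)(sum >> 32);  /* keep the high-order 32 bits */
}
```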
-=== Hashing strings ===
-This refers to hashing a ''variable-sized'' vector of machine words. If the length of the string can be bounded by a small number, it is best to use the vector solution from above (conceptually padding the vector with zeros up to the upper bound). The space required is the maximal length of the string, but the time to evaluate <math>h(s)</math> is just the length of <math>s</math>. As long as zeroes are forbidden in the string, the zero-padding can be ignored when evaluating the hash function without affecting universality.<ref name=thorup09 /> Note that if zeroes are allowed in the string, then it might be best to append a fictitious non-zero (e.g., 1) character to all strings prior to padding: this will ensure that universality is not affected.<ref name=kaser2013 />
-Now assume we want to hash <math>\bar{x} = (x_0,\dots, x_\ell)</math>, where a good bound on <math>\ell</math> is not known a priori. A universal family proposed by Dietzfelbinger et al.<ref name=DGMP>
-{{cite conference
-   | last1 = Dietzfelbinger | first1 = Martin
-   | last2 = Gil | first2 = Joseph
-   | last3 = Matias | first3 = Yossi
-   | last4 = Pippenger | first4 = Nicholas
-   | title = Polynomial Hash Functions Are Reliable (Extended Abstract)
-   | booktitle = Proc. 19th International Colloquium on Automata, Languages and Programming (ICALP)
-   | pages = 235–246
-   | year = 1992
-}}</ref>
-treats the string <math>x</math> as the coefficients of a polynomial modulo a large prime. If <math>x_i \in [u]</math>, let <math>p \ge \max \{ u, m \}</math> be a prime and define:
-:<math>h_a(\bar{x}) = h_\mathrm{int} \left( \big(\sum_{i=0}^\ell x_i\cdot  a^i \big) \bmod ~p \right)</math>, where <math>a \in [p]</math> is uniformly random and <math>h_\mathrm{int}</math> is chosen randomly from a universal family mapping integer domain <math>[p] \mapsto [m]</math>.
-Consider two strings <math>\bar{x}, \bar{y}</math> and let <math>\ell</math> be the length of the longer one; for the analysis, the shorter string is conceptually padded with zeros up to length <math>\ell</math>. A collision before applying <math>h_\mathrm{int}</math> implies that <math>a</math> is a root of the polynomial with coefficients <math>\bar{x} - \bar{y}</math>. This polynomial has at most <math>\ell</math> roots modulo <math>p</math>, so the collision probability is at most <math>\ell/p</math>. The probability of collision through the random <math>h_\mathrm{int}</math> brings the total collision probability to <math>\frac{1}{m} + \frac{\ell}{p}</math>. Thus, if the prime <math>p</math> is sufficiently large compared to the length of strings hashed, the family is very close to universal (in [[statistical distance]]).
-To mitigate the computational penalty of modular arithmetic, two tricks are used in practice:<ref name=thorup09 />
-# One chooses the prime <math>p</math> to be close to a power of two, such as a [[Mersenne prime]]. This allows arithmetic modulo <math>p</math> to be implemented without division (using faster operations like addition and shifts). For instance, on modern architectures one can work with <math>p = 2^{61}-1</math>, while <math>x_i</math>'s are 32-bit values.
-# One can apply vector hashing to blocks. For instance, one applies vector hashing to each 16-word block of the string, and applies string hashing to the <math>\lceil k/16 \rceil</math> results. Since the slower string hashing is applied on a substantially smaller vector, this will essentially be as fast as vector hashing.
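The first trick can be sketched in C for <math>p = 2^{61}-1</math> and 32-bit characters. This covers only the inner polynomial evaluation; the outer function <math>h_\mathrm{int}</math> is not shown. The names <code>P61</code>, <code>mul_mod_p61</code> and <code>poly_hash_p61</code> are hypothetical, and the code assumes a compiler that provides <code>unsigned __int128</code> (a GCC/Clang extension, not standard C).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define P61 ((UINT64_C(1) << 61) - 1)   /* the Mersenne prime 2^61 - 1 */

/* (u * v) mod P61 for u, v < P61, using masks, shifts and adds only.
   Works because 2^61 = 1 (mod P61), so hi*2^61 + lo = hi + lo (mod P61). */
uint64_t mul_mod_p61(uint64_t u, uint64_t v)
{
    unsigned __int128 prod = (unsigned __int128)u * v;             /* < 2^122 */
    uint64_t s = (uint64_t)(prod & P61) + (uint64_t)(prod >> 61);  /* < 2^62 */
    s = (s & P61) + (s >> 61);                                     /* <= P61 + 1 */
    return s >= P61 ? s - P61 : s;
}

/* (sum_i x[i] * a^i) mod P61, evaluated by Horner's rule, for 0 <= a < P61. */
uint64_t poly_hash_p61(const uint32_t *x, size_t len, uint64_t a)
{
    assert(a < P61);
    uint64_t acc = 0;
    for (size_t i = len; i-- > 0; ) {
        acc = mul_mod_p61(acc, a) + x[i];   /* < P61 + 2^32 < 2 * P61 */
        if (acc >= P61)
            acc -= P61;
    }
    return acc;
}
```

Horner's rule is used so that each character costs one modular multiplication and one addition, rather than recomputing powers of <math>a</math>.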
-==See also==
-* [[K-independent hashing]]
-* [[Rolling hashing]]
-* [[Tabulation hashing]]
-* [[Min-wise independence]]
-* [[Universal one-way hash function]]
-* [[Low-discrepancy sequence]]
-* [[Perfect hashing]]
-== References ==
-<references />
-== Further reading ==
-* {{cite book
-   | last = Knuth
-   | first = Donald Ervin
-   | authorlink = Donald Knuth
-   | title = [[The Art of Computer Programming]], Vol. III: Sorting and Searching
-   | edition = 3rd
-   | year = 1998
-   | publisher = Addison-Wesley
-   | location = Reading, Mass; London
-   | isbn = 0-201-89685-0
-  }}
-==External links==
-* [http://opendatastructures.org/versions/edition-0.1e/ods-java/5_1_ChainedHashTable_Hashin.html#SECTION00811000000000000000 Open Data Structures - Section 5.1.1 - Multiplicative Hashing]
-[[Category:Cryptographic hash functions]]
-[[Category:Hashing]]
-[[Category:Search algorithms]]
-[[Category:Computational complexity theory]]
