In [[computer science]] and [[information theory]], '''Tunstall coding''' is a form of [[entropy coding]] used for [[lossless data compression]].
== History ==
Tunstall coding was the subject of Brian Parker Tunstall's PhD thesis in 1967, while at Georgia Institute of Technology. The subject of that thesis was "Synthesis of noiseless compression codes".<ref>{{cite book|last=Tunstall|first=Brian Parker|title=Synthesis of noiseless compression codes|accessdate=2013-01-20|date=December 1967|publisher=[[Georgia Institute of Technology]]}}</ref>
Its design is a precursor to [[Lempel-Ziv]].
== Properties ==
Unlike [[variable-length code]]s, which include [[Huffman coding|Huffman]] and [[Lempel–Ziv|Lempel–Ziv coding]], Tunstall coding is a [[code]] which maps source symbols to a fixed number of bits.<ref>http://www.rle.mit.edu/rgallager/documents/notes1.pdf, Study of Tunstall's algorithm at [[MIT]]</ref>
Unlike [[Typical set|typical set encoding]], Tunstall coding parses a stochastic source with codewords of variable length.
It can be shown<ref>[http://ipg.epfl.ch/lib/exe/fetch.php?media=en:courses:2013-2014:itc:tunstall.pdf], Study of Tunstall's algorithm from [[EPFL]]'s Information Theory department</ref> that, for a large enough dictionary, the number of bits per source letter can be arbitrarily close to <math>H(U)</math>, the [[Entropy (information theory)|entropy]] of the source.
== Algorithm ==
The algorithm requires as input an alphabet <math>\mathcal{U}</math>, along with a probability for each letter of that alphabet. It also requires an arbitrary constant <math>C</math>, which is an upper bound on the size of the dictionary that it will compute. The dictionary in question, <math>D</math>, is constructed as a tree of probabilities, in which each edge is associated with a letter from the input alphabet. The algorithm proceeds as follows:

 D := tree of <math>|\mathcal{U}|</math> leaves, one for each letter in <math>\mathcal{U}</math>.
 While <math>|D| < C</math>:
     Convert most probable leaf to tree with <math>|\mathcal{U}|</math> leaves.
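The construction above can be sketched in Python with a heap over the leaf probabilities. This is only an illustrative sketch, not part of the algorithm's original statement; the names <code>tunstall_dictionary</code>, <code>probs</code> and <code>max_size</code> are made up for the example.

```python
import heapq

def tunstall_dictionary(probs, max_size):
    """Build a Tunstall dictionary as (word, probability) pairs.

    probs: mapping from each source letter to its probability.
    max_size: the upper bound C on the dictionary size.
    """
    # One leaf per letter to start; heapq is a min-heap, so
    # probabilities are stored negated to pop the most probable leaf.
    heap = [(-p, w) for w, p in probs.items()]
    heapq.heapify(heap)
    # Each expansion replaces one leaf with |U| leaves, a net gain
    # of |U| - 1 words; stop before the dictionary would exceed C.
    while len(heap) + len(probs) - 1 <= max_size:
        neg_p, word = heapq.heappop(heap)
        for letter, p in probs.items():
            heapq.heappush(heap, (neg_p * p, word + letter))
    return [(w, -np) for np, w in sorted(heap)]

# Example: a binary source. With C = 4 the dictionary becomes
# {'aaa', 'aab', 'ab', 'b'}, a complete prefix-free set of words.
words = tunstall_dictionary({'a': 0.7, 'b': 0.3}, 4)
```

Note that the leaf probabilities in the returned dictionary always sum to 1, since every expansion splits a leaf's probability among its children.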
== Example ==
Let's imagine that we wish to encode the string "hello, world". Let's further assume (somewhat unrealistically) that the input alphabet <math>\mathcal{U}</math> contains only the characters that occur in "hello, world": 'h', 'e', 'l', ',', ' ', 'w', 'o', 'r', 'd'. We can therefore compute the probability of each character from its frequency in the input string. For instance, the letter 'l' appears three times in a string of 12 characters, so its probability is <math>\frac{3}{12}</math>.

We initialize the tree, starting with a tree of <math>|\mathcal{U}|=9</math> leaves. Each word is therefore directly associated with a letter of the alphabet. The 9 words that we thus obtain can be encoded into a fixed-sized output of <math>\lceil \log_2(9) \rceil = 4</math> bits.
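These initial probabilities and the codeword length can be checked with a few lines of Python (a sketch added for illustration; the variable names are made up):

```python
import math
from collections import Counter

text = "hello, world"
counts = Counter(text)  # character frequencies in the input
probs = {c: n / len(text) for c, n in counts.items()}

# 'l' occurs 3 times in the 12-character string.
assert probs['l'] == 3 / 12

# 9 distinct characters, so each of the 9 initial words
# needs ceil(log2(9)) = 4 bits.
bits = math.ceil(math.log2(len(probs)))
```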
[[File:Tunstall-1.png|Tunstall "hello, world" example — one iteration]]
We then take the leaf with highest probability (here, <math>w_1</math>), and convert it into yet another tree of <math>|\mathcal{U}|=9</math> leaves, one for each character. We re-compute the probabilities of those leaves. For instance, the sequence of two letters 'l' happens once: of the three occurrences of 'l', only one is followed by another 'l', so the resulting probability is <math>\frac{1}{3} \cdot \frac{3}{12} = \frac{1}{12}</math>.

We obtain 17 words, which can each be encoded into a fixed-sized output of <math>\lceil \log_2(17) \rceil = 5</math> bits.
[[File:Tunstall-2.png|Tunstall "hello, world" example — two iterations]]
Note that we could iterate further, increasing the number of words by <math>|\mathcal{U}|-1=8</math> every time.
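The growth of the dictionary, and of the corresponding codeword length, over a few more iterations can be tabulated with a short sketch (illustrative only):

```python
import math

alphabet_size = 9
size = alphabet_size           # initial dictionary: one word per letter
for iteration in range(1, 4):
    size += alphabet_size - 1  # one leaf becomes 9 leaves: net gain of 8
    bits = math.ceil(math.log2(size))
    print(iteration, size, bits)
# 1 17 5
# 2 25 5
# 3 33 6
```

Note that the second iteration (25 words) still fits in 5-bit codewords, so it packs more source letters per codeword at no extra cost; only at 33 words does the codeword length grow to 6 bits.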
== Limitations ==
Tunstall coding requires the algorithm to know, prior to the parsing operation, the probability distribution of the letters of the alphabet. This issue is shared with [[Huffman coding]].
Its fixed-length block output makes it less flexible than [[Lempel-Ziv]], which has a similar dictionary-based design but a variable-sized block output.
== References ==
{{reflist}}

{{Compression methods}}

[[Category:Lossless compression algorithms]]