'''Sliding window based part-of-speech tagging''' is used to [[part-of-speech tagging|part-of-speech tag]] a text.


A high percentage of words in a [[natural language]] can, out of context, be assigned more than one part of speech. The proportion of such ambiguous words is typically around 30%, although it depends greatly on the language. Resolving this ambiguity is important in many areas of [[natural language processing]]; for example, in [[machine translation]], changing the part of speech of a word can dramatically change its translation.


Sliding window based part-of-speech taggers are programs that assign a single part of speech to a given lexical form of a word by looking at a fixed-size "window" of words around the word to be disambiguated.
 
The two main advantages of this approach are:
 
* It is possible to train the tagger automatically, eliminating the need to tag a corpus by hand.
* The tagger can be implemented as a [[finite-state machine|finite-state automaton]] ([[Mealy machine]]); see the sketch at the end of this article.
 
==Formal definition==
Let
 
:<math>\Gamma = \{ \gamma_{1}, \gamma_{2}, \ldots, \gamma_{|\Gamma|} \}</math>
 
be the set of grammatical tags of the application, that is, the set of all possible tags which may be assigned to a word, and let
 
:<math>W = \{ w_{1}, w_{2}, \ldots \}</math>
 
be the vocabulary of the application. Let
 
:<math> T : W \rightarrow P ( \Gamma )</math>  
 
be the morphological analysis function, which assigns to each word <math>w</math> its set of possible tags <math>T(w) \subseteq \Gamma</math>; it can be implemented by a full-form lexicon or by a morphological analyser. A minimal sketch of such a lookup follows.
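For illustration, <math>T</math> can be realised as a lookup in a full-form lexicon. The Python sketch below uses a toy tag set and lexicon; all entries and names are illustrative, not part of the formal definition.

<syntaxhighlight lang="python">
# Toy full-form lexicon mapping each word form to its set of possible tags.
# The entries and tag names are illustrative examples only.
LEXICON = {
    "he":     {"PRON"},
    "runs":   {"NOUN", "VERB"},   # ambiguous: plural noun or 3rd-person verb
    "from":   {"ADP"},
    "danger": {"NOUN"},
}

def T(w):
    """Morphological analysis T : W -> P(Γ): the set of possible tags of w."""
    return LEXICON[w.lower()]

print(T("runs"))  # {'NOUN', 'VERB'}: an ambiguous word
</syntaxhighlight>

Let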
 
:<math>\Sigma = \{ \sigma_{1}, \sigma_{2}, \ldots, \sigma_{|\Sigma|} \}</math>
 
be the set of word classes, which in general will be a [[Partition of a set|partition]] of <math>W</math>, with the restriction that for each <math>\sigma \in \Sigma</math> all of the words <math>w \in \sigma</math> receive the same set of tags; that is, all of the words in each word class <math>\sigma</math> belong to the same ambiguity class.
 
Normally, <math>\Sigma</math> is constructed so that each high-frequency word forms its own word class, while low-frequency words are grouped by ambiguity class. This gives good performance on frequent ambiguous words without requiring too many parameters in the tagger; a sketch of the construction follows.
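A minimal sketch of this construction, assuming a frequency threshold chosen by the implementer (the threshold value and all names are illustrative):

<syntaxhighlight lang="python">
from collections import Counter

def build_word_classes(corpus_words, T, freq_threshold=100):
    """Return a function mapping each word to its word class in Σ:
    a singleton class for frequent words, an ambiguity class otherwise."""
    freq = Counter(corpus_words)

    def word_class(w):
        if freq[w] >= freq_threshold:
            return ("WORD", w)               # singleton class for this word
        return ("AMBIG", frozenset(T(w)))    # class = set of possible tags
    return word_class
</syntaxhighlight>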
 
With these definitions, it is possible to state the problem in the following way: given a text <math>w[1] w[2] \ldots w[L] \in W^*</math>, each word <math>w[t]</math> is assigned a word class <math>\sigma[t] \in \Sigma</math> (by looking it up in the lexicon or applying the morphological analyser), yielding an ambiguously tagged text <math>\sigma[1] \sigma[2] \ldots \sigma[L] \in \Sigma^*</math>. The job of the tagger is to produce a tagged text <math>\gamma[1] \gamma[2] \ldots \gamma[L]</math> (with <math>\gamma[t] \in T(\sigma[t])</math>) that is as correct as possible.
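With the sketches above, turning a text into its ambiguously tagged form is a direct mapping (a hypothetical pipeline reusing the illustrative functions defined earlier):

<syntaxhighlight lang="python">
def ambiguity_classes(words, word_class):
    """Map a text w[1]..w[L] to its ambiguously tagged form σ[1]..σ[L]."""
    return [word_class(w) for w in words]

words = ["he", "runs", "from", "danger"]
word_class = build_word_classes(words, T, freq_threshold=2)
sigmas = ambiguity_classes(words, word_class)
# Each word occurs once here, so every word falls below the threshold and
# "runs" maps to the ambiguity class ('AMBIG', frozenset({'NOUN', 'VERB'})).
</syntaxhighlight>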
 
A statistical tagger looks for the most probable tag sequence for an ambiguously tagged text <math>\sigma[1] \sigma[2] \ldots \sigma[L]</math>:
 
:<math>\gamma^*[1] \ldots \gamma^*[L] = \operatorname*{arg\,max}_{\gamma[t] \in T(\sigma[t])} p(\gamma[1] \ldots \gamma[L] \mid \sigma[1] \ldots \sigma[L]) </math>
 
Using [[Bayes' theorem|Bayes' formula]], and dropping the factor <math>p(\sigma[1] \ldots \sigma[L])</math>, which is constant for a given text, this is converted into:
 
:<math>\gamma^*[1] \ldots \gamma^*[L] = \operatorname*{arg\,max}_{\gamma[t] \in T(\sigma[t])} p(\gamma[1] \ldots \gamma[L]) \, p(\sigma[1] \ldots \sigma[L] \mid \gamma[1] \ldots \gamma[L])</math>
 
where <math>p(\gamma[1] \gamma[2] \ldots \gamma[L])</math> is the probability of a particular tag sequence (the syntactic probability) and <math>p(\sigma[1] \ldots \sigma[L] \mid \gamma[1] \ldots \gamma[L])</math> is the probability that this tag sequence corresponds to the text <math>\sigma[1] \ldots \sigma[L]</math> (the lexical probability).
 
In a [[Markov model]], these probabilities are approximated as products. The syntactic probabilities are modelled by a first-order Markov process:
 
:<math>p(\gamma[1] \gamma[2] \ldots \gamma[L]) = \prod_{t=0}^{L} p(\gamma[t+1] \mid \gamma[t])</math>
 
where <math>\gamma[0]</math> and <math>\gamma[L+1]</math> are delimiter symbols.
 
Lexical probabilities are independent of context:
 
:<math>p(\sigma[1] \sigma[2] \ldots \sigma[L] \mid \gamma[1] \gamma[2] \ldots \gamma[L]) = \prod_{t=1}^{L} p(\sigma[t] \mid \gamma[t])</math>
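Under these two independence assumptions, the probability of a candidate tag sequence factorises into transition and lexical terms, which can be scored in log space. A minimal sketch, assuming the model parameters <code>p_trans</code> and <code>p_lex</code> have already been estimated from a corpus (both names are illustrative):

<syntaxhighlight lang="python">
import math

def sequence_log_prob(gammas, sigmas, p_trans, p_lex, delim="#"):
    """log p(γ[1..L]) + log p(σ[1..L] | γ[1..L]) under the Markov model.

    p_trans[(g, g_next)]: p(γ[t+1] = g_next | γ[t] = g); delim pads both ends.
    p_lex[(s, g)]:        p(σ[t] = s | γ[t] = g).
    """
    padded = [delim] + list(gammas) + [delim]       # γ[0] and γ[L+1] delimiters
    syntactic = sum(math.log(p_trans[(padded[t], padded[t + 1])])
                    for t in range(len(padded) - 1))
    lexical = sum(math.log(p_lex[(s, g)]) for s, g in zip(sigmas, gammas))
    return syntactic + lexical
</syntaxhighlight>

A full statistical tagger would maximise this score over all admissible tag sequences (for example with the [[Viterbi algorithm]]); the sliding-window approach below avoids that global search.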
 
The sliding window tagger instead approximates the conditional probability of the tag sequence directly, as a product of local decisions:
 
:<math>p(\gamma[1] \gamma[2] \ldots \gamma[L] \mid \sigma[1] \sigma[2] \ldots \sigma[L]) \simeq \prod_{t=1}^{L} p(\gamma[t] \mid C_{(-)}[t] \, \sigma[t] \, C_{(+)}[t])</math>
 
where <math>C_{(-)}[t] = \sigma[t - N_{(-)}] \, \sigma[t - N_{(-)} + 1] \ldots \sigma[t - 1]</math> is the left context of size <math>N_{(-)}</math> and <math>C_{(+)}[t] = \sigma[t + 1] \, \sigma[t + 2] \ldots \sigma[t + N_{(+)}]</math> is the right context of size <math>N_{(+)}</math>.
 
In this way, the sliding window algorithm only has to take into account a context of size <math>N_{(-)} + N_{(+)} + 1</math>. For most applications <math>N_{(-)} = N_{(+)} = 1</math>. For example, to tag the ambiguous word "runs" in the sentence "He runs from danger", only the tags of the neighbouring words "He" and "from" need to be taken into account.
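A minimal sketch of the resulting tagger for <math>N_{(-)} = N_{(+)} = 1</math>, assuming a table <code>p_window</code> of context-conditioned tag probabilities estimated beforehand (all names here are illustrative):

<syntaxhighlight lang="python">
def sliding_window_tag(sigmas, tags_of, p_window, n_minus=1, n_plus=1, pad="#"):
    """For each t, pick the tag maximising p(γ[t] | C(-)[t] σ[t] C(+)[t]).

    p_window[(left, sigma, right)]: dict mapping each candidate tag to its
    estimated probability in that context; tags_of(sigma) gives T(σ).
    """
    padded = [pad] * n_minus + list(sigmas) + [pad] * n_plus
    result = []
    for t, sigma in enumerate(sigmas):
        i = t + n_minus                              # position of σ[t] in padded
        left = tuple(padded[i - n_minus:i])          # C(-)[t]
        right = tuple(padded[i + 1:i + 1 + n_plus])  # C(+)[t]
        dist = p_window.get((left, sigma, right), {})
        result.append(max(tags_of(sigma), key=lambda g: dist.get(g, 0.0)))
    return result
</syntaxhighlight>

Because every decision depends only on a fixed-size window of ambiguity classes, the probability table can be compiled into a finite-state transducer, which is what makes the Mealy-machine implementation mentioned above possible.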
 
==Further reading==
 
* Sanchez-Villamil, E., Forcada, M. L., and Carrasco, R. C. (2005). "[http://www.dlsi.ua.es/~mlf/docum/sanchezvillamil04p.pdf Unsupervised training of a finite-state sliding-window part-of-speech tagger]". ''Lecture Notes in Computer Science / Lecture Notes in Artificial Intelligence'', vol. 3230, pp. 454–463.
 
[[Category:Computational linguistics]]
