The '''factored language model''' ('''FLM''') is an extension of a conventional [[language model]]. In an FLM, each word is viewed as a vector of ''k'' factors: <math>w_i = \{f_i^1, ..., f_i^k\}.</math> An FLM provides the probabilistic model <math>P(f|f_1, ..., f_N)</math>, where the prediction of a factor <math>f</math> is based on <math>N</math> parents <math>\{f_1, ..., f_N\}</math>. For example, if <math>w</math> represents a word token and <math>t</math> represents a [[part of speech]] tag for English, the expression <math>P(w_i|w_{i-2}, w_{i-1}, t_{i-1})</math> gives a model for predicting the current word token based on a traditional [[N-gram]] model as well as the part-of-speech tag of the previous word.
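As a purely illustrative sketch (not taken from the reference below), the conditional probability <math>P(w_i|w_{i-2}, w_{i-1}, t_{i-1})</math> can be estimated by relative frequency over a corpus in which each token carries both of its factors; the toy corpus and function name here are assumptions made for the example:

<syntaxhighlight lang="python">
from collections import Counter

# Toy factored corpus (assumed for illustration): each token is a
# vector of two factors, (word, part-of-speech tag).
corpus = [
    ("the", "DT"), ("dog", "NN"), ("barks", "VBZ"),
    ("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ"),
]

# Count each parent context (w_{i-2}, w_{i-1}, t_{i-1}) and the
# context extended with the predicted word w_i.
context_counts = Counter()
joint_counts = Counter()
for i in range(2, len(corpus)):
    context = (corpus[i - 2][0], corpus[i - 1][0], corpus[i - 1][1])
    context_counts[context] += 1
    joint_counts[context + (corpus[i][0],)] += 1

def p_word(w, w_prev2, w_prev1, t_prev1):
    """Maximum-likelihood estimate of P(w_i | w_{i-2}, w_{i-1}, t_{i-1})."""
    context = (w_prev2, w_prev1, t_prev1)
    if context_counts[context] == 0:
        return 0.0
    return joint_counts[context + (w,)] / context_counts[context]

print(p_word("barks", "the", "dog", "NN"))  # 1.0 on this toy corpus
</syntaxhighlight>

Raw relative-frequency estimates of this kind are extremely sparse in practice, which is why the smoothing and back-off techniques discussed below are needed.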
A major advantage of factored language models is that they allow users to specify linguistic knowledge, such as the relationship between word tokens and parts of speech in English, or morphological information (stems, roots, etc.) in Arabic.
As with [[N-gram]] models, smoothing techniques are necessary for parameter estimation. In particular, generalized back-off is used in training an FLM.
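The following minimal sketch illustrates only the basic back-off idea and is an assumption made for this article, not the generalized parallel back-off algorithm of the reference (which may drop any parent, not just the last one, and combine several reduced contexts); the count tables, threshold, and fixed discount weight are likewise illustrative and not properly normalized:

<syntaxhighlight lang="python">
from collections import Counter

# Assumed toy counts over parent contexts of decreasing length; in a real
# FLM these come from the training corpus for every parent subset used.
context_counts = Counter({
    ("the", "dog", "NN"): 1,   # full context (w-2, w-1, t-1)
    ("the", "dog"): 1,         # after dropping t-1
    ("the",): 2,               # after also dropping w-1
})
joint_counts = Counter({
    ("the", "dog", "NN", "barks"): 1,
    ("the", "dog", "barks"): 1,
    ("the", "barks"): 1,
    ("the", "sleeps"): 1,
})

def backoff_prob(word, parents, vocab_size=5, threshold=1, discount=0.7):
    """Rough estimate of P(word | parents): if the full parent context is
    unseen or too rare, drop the last parent and recurse."""
    if not parents:
        return 1.0 / vocab_size  # uniform base case
    context = tuple(parents)
    c_joint = joint_counts[context + (word,)]
    if context_counts[context] >= threshold and c_joint > 0:
        # Discounted relative frequency for an observed event.
        return discount * c_joint / context_counts[context]
    # Back off: redistribute the held-out mass over a shorter context.
    return (1.0 - discount) * backoff_prob(word, parents[:-1],
                                           vocab_size, threshold, discount)

print(backoff_prob("barks", ("the", "dog", "NN")))   # seen: 0.7 * 1/1
print(backoff_prob("sleeps", ("the", "dog", "NN")))  # backs off twice, then P(sleeps | the)
</syntaxhighlight>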
==References==
*{{cite conference | author=J. Bilmes and K. Kirchhoff | url=http://ssli.ee.washington.edu/people/bilmes/mypapers/hlt03.pdf | title=Factored Language Models and Generalized Parallel Backoff | booktitle=Human Language Technology Conference | year=2003}}

[[Category:Statistical natural language processing]]
[[Category:Probabilistic models]]

{{compu-AI-stub}}