Main Page: Difference between revisions

From formulasearchengine
Jump to navigation Jump to search
No edit summary
No edit summary
 
(394 intermediate revisions by more than 100 users not shown)
Line 1: Line 1:
A '''finite state transducer''' ('''FST''') is a [[finite state machine]] with two tapes: an input tape and an output tape. This contrasts with an ordinary [[finite state automaton]] (or [[finite state acceptor]]), which has a single tape.
This is a preview for the new '''MathML rendering mode''' (with SVG fallback), which is availble in production for registered users.


==Overview==
If you would like use the '''MathML''' rendering mode, you need a wikipedia user account that can be registered here [[https://en.wikipedia.org/wiki/Special:UserLogin/signup]]
* Only registered users will be able to execute this rendering mode.
* Note: you need not enter a email address (nor any other private information). Please do not use a password that you use elsewhere.


An automaton can be said to ''recognize'' a string if we view the content of its tape as input.  In other words, the automaton computes a function that maps strings into the set {0,1}.  Alternatively, we can say that an automaton ''generates'' strings, which means viewing its tape as an output tape.  On this view, the automaton generates a [[formal language]], which is a set of strings.  The two views of automata are equivalent: the function that the automaton computes is precisely the [[indicator function]] of the set of strings it generates.  The class of languages generated by finite automata is known as the class of [[regular language]]s.
Registered users will be able to choose between the following three rendering modes:  


The two tapes of a transducer are typically viewed as an input tape and an output tape.  On this view, a transducer is said to ''transduce'' (i.e., translate) the contents of its input tape to its output tape, by accepting a string on its input tape and generating another string on its output tape.  It may do so [[nondeterministic]]ally and it may produce more than one output for each input string.  A transducer may also produce no output for a given input string, in which case it is said to ''reject'' the input.  In general, a transducer computes a [[relation (mathematics)|relation]] between two formal languages.  The class of relations computed by finite state transducers is known as the class of [[rational relation]]s.
'''MathML'''
:<math forcemathmode="mathml">E=mc^2</math>


Finite-state transducers are often used for [[phonology|phonological]] and [[morphology (linguistics)|morphological analysis]] in [[natural language processing]] research and applications. Pioneers in this field include [[Ronald Kaplan]], [[Lauri Karttunen]], [[Martin Kay]] and [[Kimmo Koskenniemi]].<ref>{{Harvnb|Koskenniemi|1983}}</ref>
<!--'''PNG'''  (currently default in production)
A common way of using transducers is in a so-called "cascade", where transducers for various operations are combined into a single transducer by repeated application of the composition operator (defined below).
:<math forcemathmode="png">E=mc^2</math>


==Formal construction==
'''source'''
:<math forcemathmode="source">E=mc^2</math> -->


Formally, a finite transducer ''T'' is a 6-tuple (''Q'', Σ, Γ, ''I'', ''F'', δ) such that:
<span style="color: red">Follow this [https://en.wikipedia.org/wiki/Special:Preferences#mw-prefsection-rendering link] to change your Math rendering settings.</span> You can also add a [https://en.wikipedia.org/wiki/Special:Preferences#mw-prefsection-rendering-skin Custom CSS] to force the MathML/SVG rendering or select different font families. See [https://www.mediawiki.org/wiki/Extension:Math#CSS_for_the_MathML_with_SVG_fallback_mode these examples].


* ''Q'' is a [[finite set]], the set of ''states'';
==Demos==
* Σ is a finite set, called the ''input alphabet'';
* Γ is a finite set, called the ''output alphabet'';
* ''I'' is a [[subset]] of ''Q'', the set of ''initial states'';
* ''F'' is a subset of ''Q'', the set of ''final states''; and
* <math>\delta \subseteq Q \times (\Sigma\cup\{\epsilon\}) \times (\Gamma\cup\{\epsilon\}) \times Q</math> (where ε is the [[empty string]]) is the ''transition relation''.


We can view (''Q'', δ) as a labeled [[directed graph]], known as the ''transition graph'' of ''T'': the set of vertices is ''Q'', and  <math>(q,a,b,r)\in\delta</math> means that there is a labeled edge going from vertex ''q'' to vertex ''r''. We also say that ''a'' is the ''input label'' and ''b'' the ''output label'' of that edge.
Here are some [https://commons.wikimedia.org/w/index.php?title=Special:ListFiles/Frederic.wang demos]:


NOTE: This definition of finite transducer is also called ''letter transducer'' (Roche and Schabes 1997); alternative definitions are possible, but can all be converted into transducers following this one.


Define the ''extended transition relation'' <math>\delta^*</math> as the smallest set such that:
* accessibility:
** Safari + VoiceOver: [https://commons.wikimedia.org/wiki/File:VoiceOver-Mac-Safari.ogv video only], [[File:Voiceover-mathml-example-1.wav|thumb|Voiceover-mathml-example-1]], [[File:Voiceover-mathml-example-2.wav|thumb|Voiceover-mathml-example-2]], [[File:Voiceover-mathml-example-3.wav|thumb|Voiceover-mathml-example-3]], [[File:Voiceover-mathml-example-4.wav|thumb|Voiceover-mathml-example-4]], [[File:Voiceover-mathml-example-5.wav|thumb|Voiceover-mathml-example-5]], [[File:Voiceover-mathml-example-6.wav|thumb|Voiceover-mathml-example-6]], [[File:Voiceover-mathml-example-7.wav|thumb|Voiceover-mathml-example-7]]
** [https://commons.wikimedia.org/wiki/File:MathPlayer-Audio-Windows7-InternetExplorer.ogg Internet Explorer + MathPlayer (audio)]
** [https://commons.wikimedia.org/wiki/File:MathPlayer-SynchronizedHighlighting-WIndows7-InternetExplorer.png Internet Explorer + MathPlayer (synchronized highlighting)]
** [https://commons.wikimedia.org/wiki/File:MathPlayer-Braille-Windows7-InternetExplorer.png Internet Explorer + MathPlayer (braille)]
** NVDA+MathPlayer: [[File:Nvda-mathml-example-1.wav|thumb|Nvda-mathml-example-1]], [[File:Nvda-mathml-example-2.wav|thumb|Nvda-mathml-example-2]], [[File:Nvda-mathml-example-3.wav|thumb|Nvda-mathml-example-3]], [[File:Nvda-mathml-example-4.wav|thumb|Nvda-mathml-example-4]], [[File:Nvda-mathml-example-5.wav|thumb|Nvda-mathml-example-5]], [[File:Nvda-mathml-example-6.wav|thumb|Nvda-mathml-example-6]], [[File:Nvda-mathml-example-7.wav|thumb|Nvda-mathml-example-7]].
** Orca: There is ongoing work, but no support at all at the moment [[File:Orca-mathml-example-1.wav|thumb|Orca-mathml-example-1]], [[File:Orca-mathml-example-2.wav|thumb|Orca-mathml-example-2]], [[File:Orca-mathml-example-3.wav|thumb|Orca-mathml-example-3]], [[File:Orca-mathml-example-4.wav|thumb|Orca-mathml-example-4]], [[File:Orca-mathml-example-5.wav|thumb|Orca-mathml-example-5]], [[File:Orca-mathml-example-6.wav|thumb|Orca-mathml-example-6]], [[File:Orca-mathml-example-7.wav|thumb|Orca-mathml-example-7]].
** From our testing, ChromeVox and JAWS are not able to read the formulas generated by the MathML mode.


* <math>\delta\subseteq\delta^*</math>;
==Test pages ==
* <math>(q,\epsilon,\epsilon,q)\in\delta^*</math> for all <math>q\in Q</math>; and
* whenever <math>(q,x,y,r) \in \delta^*</math> and <math>(r,a,b,s) \in \delta</math> then <math>(q,xa,yb,s) \in \delta^*</math>.


The extended transition relation is essentially the reflexive [[transitive closure]] of the transition graph that has been augmented to take edge labels into account.  The elements of <math>\delta^*</math> are known as ''paths''.  The edge labels of a path are obtained by concatenating the edge labels of its constituent transitions in order.
To test the '''MathML''', '''PNG''', and '''source''' rendering modes, please go to one of the following test pages:
*[[Displaystyle]]
*[[MathAxisAlignment]]
*[[Styling]]
*[[Linebreaking]]
*[[Unique Ids]]
*[[Help:Formula]]


The ''behavior'' of the transducer ''T'' is the rational relation [''T''] defined as follows: <math>x[T]y</math> [[if and only if]] there exists <math>i \in I</math> and <math>f \in F</math> such that <math>(i,x,y,f) \in \delta^*</math>.  This is to say that ''T'' transduces a string <math>x\in\Sigma^*</math> into a string <math>y\in\Gamma^*</math> if there exists a path from an initial state to a final state whose input label is ''x'' and whose output label is ''y''.
*[[Inputtypes|Inputtypes (private Wikis only)]]
 
*[[Url2Image|Url2Image (private Wikis only)]]
Finite State Transducers can be weighted, where each transition is labeled with a weight in addition to the input and output labels. This property makes FSTs very useful for machine learning applications. A Weighted Finite State Transducer (WFST) over a [[semiring]] ''K'' can be defined similarly to an unweighted one as an 8-tuple ''T''=(''Q'', Σ, Γ, ''I'', ''F'', E, λ, ρ), where:
==Bug reporting==
* ''Q'', Σ, Γ, ''I'', ''F'' are defined as above;
If you find any bugs, please report them at [https://bugzilla.wikimedia.org/enter_bug.cgi?product=MediaWiki%20extensions&component=Math&version=master&short_desc=Math-preview%20rendering%20problem Bugzilla], or write an email to math_bugs (at) ckurs (dot) de .
* <math> E \subseteq Q \times (\Sigma\cup\{\epsilon\}) \times (\Gamma\cup\{\epsilon\}) \times Q \times K</math> (where ε is the [[empty string]]) is the finite set of transitions;
* <math>\lambda:  I \rightarrow K </math> maps initial states to weights;
* <math>\rho: F \rightarrow K </math> maps final states to weights.
In order to make certain operations on WFSTs well-defined, the weights have to form a [[Semiring]]. Two typical semirings used in practice are Log and Tropical Semirings.
 
==Operations on finite state transducers==
 
The following operations defined on finite automata also apply to finite transducers:
 
* [[Union (set theory)|Union]].  Given transducers ''T'' and ''S'', there exists a transducer <math>T\cup S</math> such that <math>x[T\cup S]y</math> if and only if <math>x[T]y</math> or <math>x[S]y</math>.
 
* Concatenation.  Given transducers ''T'' and ''S'', there exists a transducer <math>T\cdot S</math> such that <math>wx[T\cdot S]yz</math> if and only if <math>w[T]y</math> and <math>x[S]z</math>.
 
* [[Kleene closure]].  Given a transducer ''T'', there exists a transducer <math>T^*</math> with the following properties: (1) <math>\epsilon[T^*]\epsilon</math>; (2) if <math>w[T^*]y</math> and <math>x[T]z</math> then <math>wx[T^*]yz</math>; and <math>x[T^*]y</math> does not hold unless mandated by (1) or (2).
 
* [[Intersection (set theory)|Intersection]]. Given transducers ''T'', ''S'', the intersection of these transducers ''H'' has the property that (1) x[''H'']y if and only if  x[''T'']y and x[''S'']y.
 
* [[Composition of relations|Composition]]. Given a transducer ''T'' on alphabets Σ and Γ and a transducer ''S'' on alphabets Γ and Δ, there exists a transducer <math>T \circ S</math> on Σ and Δ such that <math>x[T \circ S]z</math> if and only if there exists a string <math>y\in\Gamma^*</math> such that <math>x[T]y</math> and <math>y[S]z</math>. This operation extends to the weighted case.<ref>{{harvnb|Mohri|2004|pp=3–5}}</ref>
 
:This definition uses the same notation which is used in mathematics for [[Composition of relations|relation composition]]. However, the conventional reading for relation composition is the other way around: given two relations <math>T</math> and <math>S</math>, <math>(x,z)\in T\circ S</math> when there exist some <math>y</math> such that <math>(x,y)\in S</math> and <math>(y,z)\in T</math>.
 
* [[Projection (mathematics)|Projection]] to an automaton.  There are two projection functions: <math>\pi_1</math> preserves the input tape, and <math>\pi_2</math> preserves the output tape.  The first projection, <math>\pi_1</math> is defined as follows:
 
:* Given a transducer ''T'', there exists a finite automaton <math>\pi_1 T</math> such that <math>\pi_1 T</math> accepts ''x'' if and only if there exists a string ''y'' for which <math>x[T]y</math>.
 
:The second projection, <math>\pi_2</math> is defined similarly.
 
* [[Determinization]]. Given a transducer ''T'', we want to build an equivalent transducer which has a unique initial state and such that no two transitions leaving any state share the same input label. The [[powerset construction]] can be extended to transducers, or even weighted transducers, but sometimes fails to halt; indeed, some non-deterministic transducers do not admit equivalent deterministic transducers.<ref>[http://www.let.rug.nl/~vannoord/papers/preds/node22.html]</ref> [[Characterization (mathematics)|Characterizations]] of determinizable transducers have been proposed<ref>{{harvnb|Mohri|2004|pp=5–6}}</ref> along with efficient algorithms to test them<ref>{{harvnb|Allauzen|2003}}</ref>: they rely on the [[semiring]] used in the weighted case as well as a general property on the structure of the transducer (the [[twins property]]).
 
* Weight pushing for the weighted case.<ref>{{harvnb|Mohri|2004|pp=7–9}}</ref>
 
* Minimization for the weighted case.<ref>{{harvnb|Mohri|2004|pp=9–11}}</ref>
 
* Removal of [[epsilon-transitions]].
 
==Additional properties of finite state transducers==
 
* It is [[decidable]] whether the relation [''T''] of a transducer ''T'' is empty.
 
* It is decidable whether there exists a string ''y'' such that ''x''[''T'']''y'' for a given string ''x''.
 
* It is [[undecidable]] whether two transducers are equivalent.
 
* If one defines the alphabet of labels <math>L=(\Sigma\cup\{\epsilon\}) \times (\Gamma\cup\{\epsilon\})</math>, finite state transducers are isomorphic to [[nondeterministic finite automata|NDFA]] over the alphabet <math>L</math>, and may therefore be determinized (turned into [[deterministic finite automaton|deterministic finite automata]] over the alphabet <math>L=[(\Sigma\cup\{\epsilon\}) \times \Gamma] \cup [\Sigma \times (\Gamma\cup\{\epsilon\})]</math> ) and subsequently minimized so that they have the minimum number of states.{{Citation needed|date=September 2008}}
 
==Applications==
 
* Context-sensitive rewriting rules of the form ''a → b / c _ d'', used in [[linguistics]] to model [[phonological rule]]s and [[sound change]], are computationally equivalent to finite-state transducers, provided that application is nonrecursive, i.e. the rule is not allowed to rewrite the same substring twice. <ref>{{cite web | url=http://acl.ldc.upenn.edu/J/J94/J94-3001.pdf | title=Regular Models of Phonological Rule Systems | accessdate=August 25, 2012}}</ref>
 
==See also==
 
=== Internal links ===
* [[Mealy machine]]
* [[Moore machine]]
* [[Morphological dictionary]]
* [[foma (software)]]
 
=== External links ===
* [http://openfst.org/ OpenFst], an open-source library for FST operations.
* [http://www.ims.uni-stuttgart.de/tcl/SOFTWARE/SFST.html Stuttgart Finite State Transducer Tools], another open-source FST toolkit
* [http://jsalatas.ictpro.gr/java-fst-framework-api-review/ java FST Framework], an open-source java FST Framework capable of handling OpenFst text format.
 
==Notes==
{{reflist|3}}
 
== References ==
 
<references/>
<div class="references-small">
*{{cite journal
  | last1 = Allauzen
  | first1 = Cyril
  | last2 = Mohri
  | first2 = Mehryar
  | title = Efficient Algorithms for Testing the Twins Property
  | journal = Journal of Automata, Languages and Combinatorics
  | year = 2003
  | volume = 8
  | issue = 2
  | pages = 117–144
  | url = http://www.cs.nyu.edu/~mohri/pub/twins.pdf
}}
*{{citation
  | last = Koskenniemi
  | first = Kimmo
  | authorlink = Kimmo Koskenniemi
  | title = Two-level morphology: A general computational model of word-form recognition and production
  | publisher = Department of General Linguistics, [[University of Helsinki]]
  | year = 1983
  | url = http://www.ling.helsinki.fi/~koskenni/doc/Two-LevelMorphology.pdf
}}
*{{cite journal
  | last = Mohri
  | first = Mehryar
  | title = Weighted Finite-State Transducer Algorithms: An Overview
  | journal = Formal Languages and Applications
  | year = 2004
  | volume = 148
  | issue = 620
  | pages = 551–564
  | url = http://www.cs.nyu.edu/~mohri/pub/fla.pdf
}}
</div>
 
==Further reading==
 
* {{cite book |last= Jurafsky|first= Daniel |coauthors= James H. Martin|authorlink=Daniel Jurafsky|title= [[Speech and Language Processing]] |publisher= [[Prentice Hall]] |year= 2000 |isbn= 0-13-095069-6 |pages=71–83}}
* {{cite book |last= Kornai|first= Andras|authorlink=Andras Kornai|title=[[Extended Finite State Models of Language]] |publisher= [[Cambridge University Press]] |year=1999 |isbn= 0-521-63198-X}}
* {{cite book |last= Roche|first= Emmanuel |coauthors= Yves Schabes|authorlink=Emmanuel Roche|title= [[Finite-state language processing]] |publisher= [[MIT Press]] |year= 1997 |isbn= 0-262-18182-7|pages=1–65}}
* {{cite book |last= Beesley|first= Kenneth R. |coauthors= Lauri Karttunen|authorlink=Kenneth R. Beesley|title= [[Finite State Morphology]] |publisher= [[Center for the Study of Language and Information]] |year= 2003 |isbn= 1-57586-434-7}}
* {{cite book |last= Roark|first= Brian |coauthors= Richard Sproat|authorlink=Brian Roark|title= [[Computational Approaches to Morphology and Syntax]] |publisher= [[Oxford University Press]] |year= 2007 |isbn= 0-19-927478-9}}
* {{cite book |last= Berstel|first= Jean |title= [[Transductions and Context-free Languages]] |publisher= [[Teubner Verlag]] |year= 1979 }}. [http://www-igm.univ-mlv.fr/~berstel/LivreTransductions/LivreTransductions.html Free PDF version]
 
{{DEFAULTSORT:Finite State Transducer}}
[[Category:Models of computation]]
[[Category:Formal languages]]
[[Category:Automata theory]]
 
[[bs:Konačni transduktor]]
[[ca:Transductor d'estats finits]]
[[de:Transduktor (Informatik)]]
[[es:Transductor de estados finitos]]
[[fr:Transducteur fini]]
[[hr:Konačni pretvornik]]
[[sr:Коначни трансдуктор]]

Latest revision as of 22:52, 15 September 2019

This is a preview for the new MathML rendering mode (with SVG fallback), which is availble in production for registered users.

If you would like use the MathML rendering mode, you need a wikipedia user account that can be registered here [[1]]

  • Only registered users will be able to execute this rendering mode.
  • Note: you need not enter a email address (nor any other private information). Please do not use a password that you use elsewhere.

Registered users will be able to choose between the following three rendering modes:

MathML

E=mc2


Follow this link to change your Math rendering settings. You can also add a Custom CSS to force the MathML/SVG rendering or select different font families. See these examples.

Demos

Here are some demos:


Test pages

To test the MathML, PNG, and source rendering modes, please go to one of the following test pages:

Bug reporting

If you find any bugs, please report them at Bugzilla, or write an email to math_bugs (at) ckurs (dot) de .