Template:Automatic taxobox/doc/gory technical details

From formulasearchengine
Jump to navigation Jump to search

Template:Automatic taxobox/doc/nav/boxed

Technical details

An accompanying page for each taxon, at Template:Taxonomy/Taxon_name, uses a template to list the parent of each taxon. Template:Tlx then consults these templates to produce a full taxonomy. It only displays "major" taxonomic ranks (i.e. not sub, super, nano etc), with the exception of the immediate parents to the taxon. Instructions for the manual creation of this page, if it does not yet exist, appear in the automatic-taxobox; it should be easy to create a bot to automatically create pages for higher taxa based on their current taxoboxes. Thus the hierarchy can be automatically generated, minimizing the work for editors of new pages whilst creating a consistent taxonomy, thus increasing the utility of Wikipedia (see Template:Cite doi).

A list of all templates involved in generating an automatic taxobox, and their relationships, can be found at Template:Automatic_taxobox/map.

Maintenance

Sandboxing this template is difficult, but for testing purposes, the templates Template:Taxonomy/Test-40 (etc) may be useful.


Extracting taxonomy information via API

This system makes it straightforward to recover full taxonomic information for any taxon using an API. The raw Wikitext of each Template:Taxonomy/Taxon_name is in a consistent format with a line specifying "|rank=taxonomic rank (latinized)", "|parent=Parent taxon", and "|link=Wikilink".

Each parent taxon can be successively queried until the desired taxonomic level is obtained.

An example of an application using the API to browse taxonomies is available at http://taxobrowser.erikhaugen.com.

Algorithm

The automatic taxobox is placed directly above the first paragraph. Since the user is not required to enter any information about the scientific classification within the article, the automatic taxobox searches a database of taxonomy templates for one that matches the supplied Template:Para parameter (or, if none is supplied, the article's title, ignoring any parenthetical expressions).

If the taxon cannot be found in the database, the editor is prompted to enter the taxon's data onto a specified page where that data will be accessed for that taxon's taxobox and for all descendant taxa's taxoboxes. Contained on this page are the taxonomic rank, a link to the article about that taxon, the format which should be used when displaying that taxon in a taxobox, the name of the parent taxon, an indicator of extinction, and a list of references for all of this information.

Once the taxon is identified in the database, the system pulls all the information except the reference parameter. The same is done for this taxon's parent, and the grandparent, great-grandparent, etc., until a top-level taxon is reached (e.g. Life, Veterovata, Ichnos).

Due to technical limitations, this path is traversed several times during the processing of the code, which causes a severe server-end lag when loading an article. If you have ideas for improving this, we'd love to hear them.

Because the tree is only accessible from the taxon the classification's being calculated for, the total time required for these lookups (not taking into consideration any other overhead) is therefore triangular, and the Big O time can be represented as , where corresponds to the number of levels deep the taxon in question is (including the top-level taxon). The number of times a taxon is looked up can be then calculated by , where is the number of levels deep the taxon in question is from the lowest-level color-defining taxon.

Once traversals are complete, the information is sorted out from each taxon. Ranks are converted to English displayable words, links are formed to each taxon, extinction daggers are added, appropriate ranks are italicized, and taxa deemed unimportant to that taxonomy are discarded. The final result looks exactly like a manual taxobox, except for the edit link in the bar that says "Scientific classification".