Supertree methods combine source trees, that is, trees obtained from the literature with overlapping species sets, into a single tree. Nyakatura and Bininda-Emonds [1] selected matrix representation with parsimony (MRP) [5] as the method of choice for generating a supertree of carnivore species. The workflow is illustrated in Figure 1: MRP constructs a new data matrix (the MRP-matrix) in which each species in the source trees is represented by a row. The columns of the MRP-matrix are built by encoding the source trees. For a given node of a rooted source tree, the species descending from that node are assigned the character '1'; the remaining species in the tree receive the character '0'. Species not in the source tree are assigned the character '?'. Thus, each internal branch of each source tree contributes one column to the matrix representation of the data. The MRP-matrix resembles a multiple sequence alignment with binary characters {0,1} and missing characters {?}. This superficial 'homology' is exploited to reconstruct the most parsimonious tree(s) of the encoded branches from the source trees [3]. The supertree (or supertrees) displays the phylogenetic relationships of all species in the source trees. However, in contrast to multiple sequence alignments obtained from DNA or proteins, the variability we observe in the MRP-matrix cannot be modeled by probabilistic models of evolutionary change.
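The encoding step described above can be illustrated with a minimal sketch. It assumes that rooted source trees are written as nested Python tuples with species names as strings; this representation, the function names (leaves, clades, mrp_matrix) and the toy species are illustrative assumptions and are not taken from [1]. The root itself is skipped because its column would contain only '1' and carry no information.

```python
def leaves(tree):
    """Return the set of species names found in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Yield the species set of every internal node below the root."""
    if isinstance(tree, str):
        return
    for child in tree:
        if not isinstance(child, str):
            yield leaves(child)
            yield from clades(child)


def mrp_matrix(source_trees):
    """Encode source trees as a species -> character-string mapping."""
    all_species = sorted(set().union(*(leaves(t) for t in source_trees)))
    rows = {species: [] for species in all_species}
    for tree in source_trees:
        in_tree = leaves(tree)
        for clade in clades(tree):  # one matrix column per internal node
            for species in all_species:
                if species not in in_tree:
                    rows[species].append("?")  # absent from this source tree
                elif species in clade:
                    rows[species].append("1")  # descends from this node
                else:
                    rows[species].append("0")  # in the tree, outside the clade
    return {species: "".join(chars) for species, chars in rows.items()}


# Two hypothetical source trees with overlapping species sets.
tree_a = (("cat", "lion"), "dog")
tree_b = ((("dog", "wolf"), "bear"), "cat")

matrix = mrp_matrix([tree_a, tree_b])
for species, characters in matrix.items():
    print(f"{species:5s} {characters}")
```

The resulting matrix has one row per species and three columns (one from the first tree, two from the second). In a real analysis it would be passed to a parsimony program, a step that is not shown here.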
As with almost all phylogenetic methods that deal with large collections of heterogeneous data, there are many pitfalls in compiling and analyzing the data (reviewed in [6]). Nyakatura and Bininda-Emonds [1] went to great lengths to avoid such systematic errors. The critical issues are the quality of the source trees, the fact that different source trees may be based on overlapping raw data and are thus no longer independent, and the lack of an obvious way to weight the source trees.
With the increasing availability of molecular sequence data, this classical method of supertree reconstruction will soon be replaced by supertree analysis of molecular data, which avoids potential dependency problems. All that will be required is to infer a tree for each multiple sequence alignment and to apply MRP to the inferred source trees. In such situations, the tedious compilation of source trees from the literature is unnecessary. The carnivore supertree in [1] already includes 74 gene trees. On the other hand, including source trees from the literature provides phylogenetic information for species for which no molecular data are yet available, as is the case for 57 out of 286 carnivore species.
Finally, in contrast to modern phylogenetic inference, the supertree approach lacks any statistical model of evolutionary change, although supertree methods still infer the 'true' relationships very well. Thus, the phylogenetic information contained in the source trees, together with a careful analysis, is sufficient to reconstruct the phylogeny of large systematic groups. Some progress has been made towards incorporating statistical analysis into the supertree approach. For example, bootstrap methods have been applied to evaluate the support for the supertree by randomly sampling with replacement from the source trees. Recently, a new approach to supertree reconstruction was proposed: matrix representation with likelihood (MRL) [7]. MRL is one step towards more statistical thinking in supertree reconstruction.
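The bootstrap idea mentioned above, sampling source trees with replacement, can be sketched as follows. This is only an illustration under stated assumptions: the function names (bootstrap_source_trees, clade_support), the fixed random seed and the representation of each replicate supertree as a set of clades are hypothetical, and the step that infers a supertree from each replicate is left to an external parsimony analysis.

```python
import random


def bootstrap_source_trees(source_trees, n_replicates, seed=0):
    """Draw replicate collections of source trees, sampling with replacement."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    n = len(source_trees)
    return [rng.choices(source_trees, k=n) for _ in range(n_replicates)]


def clade_support(replicate_supertrees, clade):
    """Fraction of replicate supertrees that contain the given clade.

    Each replicate supertree is assumed to be given as a set of frozensets,
    one frozenset of species names per clade.
    """
    hits = sum(1 for supertree in replicate_supertrees if clade in supertree)
    return hits / len(replicate_supertrees)


# Hypothetical labels standing in for source trees.
source_trees = ["tree_a", "tree_b", "tree_c", "tree_d"]
replicates = bootstrap_source_trees(source_trees, n_replicates=100)

# Hypothetical clade sets, as they might be read from two replicate supertrees.
replicate_supertrees = [
    {frozenset({"cat", "lion"})},
    {frozenset({"cat", "dog"})},
]
support = clade_support(replicate_supertrees, frozenset({"cat", "lion"}))
```

Each replicate contains as many trees as the original collection, so some source trees appear several times and others not at all. The support of a clade in the supertree is then the proportion of replicate supertrees in which that clade is recovered.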