Animal Evolution: Reassembling animal evolution: a four-dimensional puzzle

18.1 Introduction

Drawing from the latest literature and the contributions in this volume, we consider some of the recent progress made in the study of animal evolution and the hurdles that remain. Each of the disciplines considered (palaeontology, evo-devo, phylogenetics, and the incorporation of genomic data) has made major contributions to our understanding of how animals have diversified. Together, these pursuits are resulting in a return to whole-organism biology where the link between genotype and phenotype is considered in the context of changing physical and biological environments. The modern approach integrates across all these sometimes disparate disciplines, with the aim of reconciling available evidence to describe the patterns and processes that have led to the existing diversity of animal life.
Arguably, there is one underlying common quest that unites the goals of individual researchers: the search for homology, which means recognizing it, defining it, and using it. Whether the aim is to establish shared ancestry of form or of function, similar challenges face those contemplating strings of nucleotides, protein structure, gene expression, biochemical pathways, organ systems, or fossilized microstructures. As we move towards a greater understanding of evolution and the biological entities undergoing selection, it is the study of homology that allows us to detect patterns and interpret processes.
Gaps in our knowledge can be daunting. At best they define the limits of our ignorance, and at worst they prevent any meaningful or confident interpretation of available information. We consider how some of the major gaps are being addressed with the renaissance of whole-organism biology, the development of improved models, and the advent of new technologies.

18.2 Phylogenies and phylogenetics

Since the first credible molecular estimate of animal relationships was published by Field et al. (1988) there have been a number of significant changes in our understanding of the evolution of the animal kingdom. The largest shift has been away from the widely held assumption of gradualism, whereby morphologically simpler animals such as flatworms were placed towards the base of the tree, and complex features such as coeloms and segments were thought to be homologous and to define major groups of animals higher up the tree. The tree widely accepted today has its roots firmly in Field et al.’s study and in subsequent studies that added to the sampling of small subunit (SSU) ribosomal RNA gene (rDNA) sequences; the major revolutions have, until recently, almost all come from efforts using SSU rDNA. Terms such as Ecdysozoa and Lophotrochozoa draw upon shared morphological features, but their roots stem from SSU rDNA. The new animal phylogeny, hand in hand with comparative developmental studies of homologous gene expression, has forced a reassessment of the evolution and homology of many characteristics of animals, and a recognition of the pervasive effects of the loss of characters and secondary simplification of body plans (Copley et al., 2004; Jenner, 2004c), as is apparent in the flatworms.
While there has been enormous progress in our understanding of metazoan phylogeny, leading to broad agreement over the outline of the animal tree (Halanych, 2004; Telford, 2006), there remain a number of hotly contested questions. Inevitably, the outstanding questions are the hardest to answer, and the difficulties encountered are likely to stem from multiple sources. The first major source of difficulty arises when the living phyla emerged in an explosive radiation, leaving little chance for the fixation of informative substitutions; such a situation is exemplified by the difficulty of resolving relationships between the lophotrochozoan clades (Dunn et al., 2008). The second important source of difficulty arises when living exemplars are the result of unusual patterns of genomic evolution that violate assumptions of the models used to reconstruct trees, resulting in inaccuracies in their placement on the tree (Philippe and Telford, 2006). This is undoubtedly seen in the case of the acoel flatworms, chaetognaths, myzostomids, gnathostomulids, and various other ‘Problematica’.
The tendency for phylogeneticists to contradict each other over the placement of problematic groups may be rather frustrating to outsiders but is inevitable. First, all animals that have ever been described have also been positioned somewhere on a phylogenetic tree. Any progress to be made inevitably involves changing this position and hence introduces contradiction. Secondly, as alluded to above, all the easily solved aspects of the tree were answered 10 or 20 years ago, meaning anything currently worth studying is by definition problematic. A reliable phylogeny is fundamental to comparative biology and to our understanding of evolution, and progress continues.
The progress currently being made stems from the combination of four approaches: much larger data sets (phylogenomics), which avoid stochastic error from limited samples; data from additional representatives of problematic taxa to avoid or reduce systematic error; alternative sources of data (e.g. microRNAs) and, potentially, other rare genomic changes which it is hoped are resistant to homoplastic evolution (Rokas and Holland, 2000; Boore, 2006); and finally, improved methods of tree reconstruction that more accurately model the underlying process of molecular evolution, so further reducing the possibility of systematic error (Philippe and Telford, 2006). The biggest contributors to progress in terms of data are the new, cheap technologies for DNA sequencing. We are not far from the day when any given species (with a ‘normal’-sized genome) will have its genome completely sequenced for less than the sum that a single gene may have cost 25 years ago. This will provide the greatest possible source of data for phylogenetic analysis, and the resolution of any remaining errors will be the province of the model makers.

18.3 Palaeontology

The frustrations inherent in reconstructing the phylogeny of living animals are echoed by the problems of palaeontology. Many fossils are hard to decipher, especially for outsiders, and confusion is exacerbated by vehement disagreements among the experts over their interpretation. As an example, since its description in 1886 the Lower Cambrian Emmonaspis cambrensis has been linked with graptolites (hemichordates), chordates, and arthropods, and even with Ediacaran frond-like organisms (Conway Morris, 1993b). Beyond the well-known problems of preservation and interpretation (Budd and Jensen, 2000), the most interesting fossils (those in the stem lineages of living taxa, with the potential to show the order of acquisition of clade synapomorphies) are the hardest to interpret and to relate to modern groups, precisely because they lack those synapomorphies.
Despite the undoubted problems of palaeontology, fossils are unique in their ability to inform us about certain aspects of evolution (Smith, 1994). While comparisons of living taxa within an accurate phylogenetic framework give tremendous insight into the pattern of evolution, this approach remains limited by the fact that most of the steps of evolution leading to living clades are absent. As an example, it seems clear that the closest relatives of the arthropods are to be found amongst the cycloneuralian worms. It is not clear, however, how much a comparison of priapulids and arthropods will tell us about the stages by which segments and jointed appendages were acquired in the arthropod stem; in such a case, fossils can be of enormous importance.
The importance of studying fossil lineages for our understanding of the evolution of crown groups has been discussed. Stem-lineage fossils make an important contribution in several ways: they break long branches leading to crown groups and show intermediate character states; they may reveal unsuspected character homologies or indeed convergent evolution between extant groups; they can highlight character loss in certain groups; and, finally, they provide the sole means to calibrate evolutionary trees by giving minimum divergence times of living clades. Fossils are also able to provide the ecological background to specific evolutionary events, perhaps most spectacularly the great extinctions and the invasion of new habitats such as the land. All of this information is provided uniquely by fossils; it is vital that evolutionary biologists do not damn fossil evidence too readily based on the difficulties inherent in the field. Palaeontologists themselves recognize the problems they face, and efforts are being made to strengthen the objectivity of fossil interpretation and to understand the limits of inference, for example in calibrating trees (Drummond et al., 2006; Marshall, 2008) and in the interpretation of biological evidence for historical events (Budd and Jensen, 2000; Domazet-Lošo et al., 2007; Peterson et al., 2007; Donoghue and Purnell, 2009). Newly discovered deposits, new tools to visualize internal and microscopic features, new methods of detecting and characterizing biomolecules, and simply returning repeatedly to problematic taxa in the light of new evidence will keep the study of fossils alive.

18.4 Developmental evolution

A phylogenetic tree can describe the relationships of living and fossil taxa; mapping the characteristics of those taxa onto the framework of the tree permits us to track the evolution of those characters, showing in which groups, and even at what time, key morphological novelties have evolved. While this combination of a dated phylogenetic framework and the distribution of characters provides a historical description or pattern of character evolution, to understand morphological novelty and how such morphological change has occurred at the level of the genome and the embryo (the process of morphological evolution) we need to study the genetics behind changes in ontogeny (see, for example, Moczek, 2008).
The birth of modern developmental evolutionary biology came 25 years ago with the molecular cloning of the homeobox motif from Drosophila homeotic genes (Carrasco et al., 1984; McGinnis et al., 1984), alongside the amazing discovery that the same motif (and indeed the same genes) existed in vertebrates with conserved functions. Comparative molecular genetic analyses of development have since changed our view of the evolution of developmental mechanisms and the origins of novel morphology, revealing surprising conservation and providing an alternative to phylogenetic proximity for determining homology. The promise of current evo-devo research lies in expanding its focus to new groups of organisms. While a great deal of progress continues to be made using comparisons of expression patterns (using in situ hybridization) for detecting similarity of function of homologous genes and identifying homology of characters, the export of genomics and true functional studies (e.g. RNA interference and transgenesis) to animals not previously considered model organisms is extremely exciting (see, for example, Abzhanov et al., 2008, and Vera et al., 2008).
By expanding beyond the traditional model organisms, practitioners of developmental evolutionary biology are able to build on the discoveries of the phylogeneticists and palaeontologists to address some of the more intriguing questions in morphological evolution. Current questions revealed by the new animal phylogeny and palaeontological discoveries include the origins of arthropods from the cycloneuralian worms such as priapulids and kinorhynchs, the unexpected relationship of the deuterostome-like brachiopods to lophotrochozoans such as annelids and molluscs, and the possible origins of bilaterians from animals resembling the acoel flatworms.
In addition to investigating specifics such as those questions mentioned above, another focus of developmental evolutionary studies is the generalities of the genetics behind morphological evolution. A current debate concerns the relative importance of changes in regulatory DNA versus coding DNA of genes (Carroll, 2008; Stern and Orgogozo, 2008; Wagner and Lynch, 2008). One thing on which both sides seem to agree, however, and perhaps this realization is more fundamental than scoring points, is that changes of small effect predominate. Cis-regulatory changes are common due to the possibility of making subtle changes in independent enhancers, and coding changes occur where their pleiotropic effects are minimized. There is nothing new under the sun, however (Ecclesiastes 1:9–14), and this debate harks back, of course, to R. A. Fisher’s analogy of the focusing of a microscope using small adjustments (Fisher, 1930).

18.5 Mind the gaps

Addressing what is missing in the study of animal evolution is unavoidable and necessary, not least because it demonstrates openness, attempts to define the limits of our knowledge, and indicates possible directions for future research. The influence of missing empirical information can be substantial, and assessing the impact of missing fossils, missing taxa, and missing data is almost a discipline in itself in systematics. What is not known can influence estimates of tree topology and stability and the biological inferences we are prepared to make (see Wiens, 2006; Geuten et al., 2007; Fitzhugh, 2008). In phylogeny, should missing features be scored as losses or simply as missing data, and when are multiple related missing features indicative of single losses (e.g. the deletion of strings of nucleotides or the loss of entire organ systems)? In palaeontology and evo-devo, when can absence of evidence be used as evidence of absence?
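The practical weight of the scoring question can be illustrated with a toy calculation. The following is a minimal sketch, assuming a fixed rooted tree and a single binary character counted with Fitch parsimony; the four taxa (A to D), the tree, and the codings are hypothetical and stand in for no real data set. It compares the number of inferred changes when an unobserved feature ('?') is read as unknown with the number inferred when it is scored as a loss.

```python
# Toy sketch: Fitch parsimony for one binary character on a fixed, rooted tree.
# Taxon names, the tree, and the character codings are all hypothetical.

def fitch(tree, states):
    """Return (possible root states, minimum number of changes) for a nested-tuple tree."""
    if isinstance(tree, str):          # a leaf: look up its set of allowed states
        return states[tree], 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:                         # children can agree: no extra change needed
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def changes(tree, coding, missing_as_loss):
    """Count changes when '?' is read either as absence ('0') or as unknown."""
    states = {}
    for taxon, value in coding.items():
        if value == "?":
            states[taxon] = {"0"} if missing_as_loss else {"0", "1"}
        else:
            states[taxon] = {value}
    return fitch(tree, states)[1]


tree = (("A", "B"), ("C", "D"))
coding = {"A": "1", "B": "?", "C": "1", "D": "1"}

print(changes(tree, coding, missing_as_loss=False))  # '?' treated as missing data
print(changes(tree, coding, missing_as_loss=True))   # '?' treated as a loss
```

With this toy coding, treating the unobserved feature in taxon B as missing data implies no change on the tree, whereas scoring it as a loss implies one. A single coding decision therefore adds an evolutionary event to the reconstruction.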
Incomplete information necessarily pushes us either towards caution, in the fear that any inferences from gappy data may be deemed premature, or towards bravery (perhaps even foolhardiness) as the constant need to take stock of available evidence forces phylogenetic estimates, character mapping, taxonomic revisions, recalibrated histories, and the desire to provide a narrative that explains biodiversity through space and time. Diligent researchers are keen to indicate the strength of their arguments by circumscribing the limits and possible influence of what is not known, at the risk of undermining any conclusions drawn from what is known. In contrast, selective sampling can provide more robust arguments and may obviate the need to consider uncertainty or less compelling scenarios. Though we do not set out to sample selectively, the nature of certain data sets puts us firmly at the mercy of exemplars. Just as the early days of SSU rDNA estimates of animal phylogeny relied on single taxa as representatives of entire phyla, we have seen phylogenomic analyses suffering from over-representation of taxonomically biased model organisms or unbalanced data sets as more or fewer expressed sequence tags (ESTs) are recruited for analysis from unrelated research. Using all available evidence from GenBank to estimate animal interrelationships would be cumbersome and unwise, but that is not to say we should not consider all the available data for statements on homology, and sample them for balanced representative data sets.
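One simple way to picture balanced sampling is to cap the number of exemplars drawn from each higher group. The sketch below is an illustration under stated assumptions, not a published protocol. The function name `balanced_sample`, the record layout (taxon, group, number of genes recovered), and all values are hypothetical; the rule of preferring the most complete taxa within each group is one arbitrary choice among many.

```python
from collections import defaultdict

# Toy sketch: cap the number of exemplar taxa per higher group so that
# well-studied groups do not swamp the data set. All records are hypothetical.

def balanced_sample(records, per_group):
    """Keep at most `per_group` taxa per group, preferring the most complete."""
    by_group = defaultdict(list)
    for taxon, group, n_genes in records:
        by_group[group].append((n_genes, taxon))
    chosen = []
    for group in sorted(by_group):     # fixed group order keeps the output reproducible
        # Most genes first; ties broken alphabetically by taxon name.
        ranked = sorted(by_group[group], key=lambda item: (-item[0], item[1]))
        chosen.extend(taxon for _, taxon in ranked[:per_group])
    return chosen


records = [
    ("arthropod_1", "Arthropoda", 900),
    ("arthropod_2", "Arthropoda", 850),
    ("arthropod_3", "Arthropoda", 400),
    ("annelid_1", "Annelida", 300),
    ("acoel_1", "Acoelomorpha", 120),
]

print(balanced_sample(records, per_group=2))
```

Here the heavily sequenced group contributes only two exemplars, while poorly sampled groups keep every taxon they have. Capping in this way reduces over-representation, but it cannot supply the missing taxa themselves.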
Balancing taxon and character sampling is difficult, and has been the focus of empirical and theoretical studies (e.g. Graybeal, 1998; Pollock et al., 2002), but there is little doubt that with each new data set we are liable to repeat the mistakes of insufficient or biased sampling. In many cases we simply do not know that our sampling is insufficient or biased, or may not be able to address any shortfalls until new data sets become available. Many gaps in phylogenetic data sets await attention on key taxa for known characters that need to be scored. Meanwhile, expert morphologists and taxonomists are declining in number, character coding is frequently controversial, archival specimens may not be available or suitable for sampling the missing data, and the animals may be difficult to sample, being rare, cryptic, geographically isolated, elusive, or extinct. We need to live with gaps but also to recognize the need to address them when the opportunity arises.
The age of genomics arrived with the expectation that knowledge of complete genetic blueprints would provide a surfeit of phylogenetic information for robust tree reconstruction. This has yet to occur, since our efforts to uncover form, function, and homology have succeeded for very few components of genomes (Kuzniar et al., 2008). For animal evolutionary biologists the era of post-genomics is a long way off, not just because of the lack of understanding of available genomes, but also because of the lack of characterized genomes themselves. Sampling systematically across the animal tree of life is an important strategy in developing comparative genomic data sets, but until now evolutionary biologists have rarely dictated sampling priorities. Furthermore, even a cursory look at the revolutions in molecular systematics shows how sampling just a few key taxa can upset the entire understanding of animal evolution. For example, it was preliminary molecular systematic surveys of flatworms that highlighted the phylogenetic uniqueness of acoelomorph flatworms (Carranza et al., 1997; Littlewood et al., 1999) and that led ultimately to their current status, their distinctness from the Platyhelminthes, and their importance as links to our deep bilaterian past (Baguñà et al., 2008; Hejnol and Martindale, 2008b). Undoubtedly, denser sampling of animal genomes will provide more surprises.
Whilst evolutionary biologists are constantly concerned with homology either implicitly or explicitly (see recent review by Szucsich and Wirkner, 2007), large-scale data sets are moving us away from an intimate understanding of all the statements of homology that we make or rely upon. To some, this may appear to be neglecting our responsibility as those whose task it is to detect, highlight, and interpret the evidence for shared ancestry. Recently there has been a shift from poring over nucleotide and amino acid alignments with reference to secondary structures, open reading frames, and function, where indels (insertion/deletion markers) might be placed judiciously and exclusion sets chosen carefully, to a need for automation in order to harness considerable volumes of data (Wong et al., 2008). A plethora of data requires the building and implementation of bioinformatic pipelines to make many of these decisions for us, swiftly, consistently (with given criteria), and routinely in the hope that we are minimizing noise and maximizing signal. Whilst these routines and algorithms might be born of an understanding of the underlying data, such automated efforts do not negate the need to make evolutionary sense of the biological data, and we must be wary of opening new gaps in our understanding.

18.6 Learning from the past and taking advantage of the present

In an era dominated by unprecedented access to information, we have an opportunity to embrace considerable bodies of primary data, meta-data, and the thoughts and arguments of generations of researchers. Global efforts to digitize literature and specimens, internet tools that mine, parse, and link databases, and concerted global efforts by a generation of researchers willing to synthesize existing information are generating new understanding, whilst complementary efforts by others to generate primary data continue unabated. Indeed, the increase in the rate at which gene sequence data can now be generated with second-generation sequencing is phenomenal, and third-generation sequencing, now on the horizon, promises orders of magnitude more data (Shendure and Ji, 2008). The information revolution is vast in scale and breadth and brings with it new powers and challenges, not least for bioinformaticians (Helaers et al., 2008; Pop and Salzberg, 2008). New ways of studying genomes and inferring historical events challenge underlying philosophies and resurrect arguments against phenetics, but there is little doubt that presence/absence of genes, gene networks and biochemical pathways, relative arrangement of genes, and so on, provide an entirely new vocabulary with which to consider the past (Boore, 2006; Ding et al., 2008; Dutilh et al., 2008).
Although we strive for pragmatic approaches to the onslaught of information, and welcome the opportunities to bring disparate fields back into the fold, caution is always at the back of our minds. For example, although we might expect to be able to access information at the click of a mouse, at what point should we accept the following without a second thought: a gene sequence with no associated voucher specimen, a distribution map based on inaccurate identifications or DNA barcodes, a tree topology based on data we have not seen, a cluster of genes we have not verified as being orthologous, a supertree? Clearly, no individual can make all these decisions independently, and it is as a community that we police ourselves and the data we choose to accept as fit for purpose. Systematics continues to be about maximizing the signal and minimizing the noise, but there is a constant battle against a modern trend towards ‘one-gene-fits-all’ approaches, the undermining of systems that ‘ain’t broke and don’t need fixing’ (e.g. the Linnaean system of classification), a lack of rigour in understanding or implementing the tools (and underlying philosophies) of the trade, and false claims that we will have catalogued or barcoded every species on the planet and resolved the position of every twig of the tree of life within the next 25 years. Rhetoric aside, there has been no better time to study animal evolution.

Sunday, March 21, 2010
