In a paper published in The American Naturalist in 1985, Joseph Felsenstein proposed a new method – phylogenetically independent contrasts – that allowed the incorporation of phylogenetic information into comparative analyses. Using Felsenstein’s method, biologists could overcome the statistical problem of non-independence of species due to shared ancestry. Thirty-one years after the paper was published, I spoke to Joseph Felsenstein about his motivation to develop this method, the impact of the paper on the comparative method in ecology and evolution and subsequent developments in methods to deal with phylogenetic non-independence of data.
Citation: Felsenstein, J. (1985). Phylogenies and the comparative method. The American Naturalist, 125(1), 1-15.
Date of interview: 23 November 2016 (via Skype)
Hari Sridhar: I would like to ask you about your motivation to do the work presented in this paper. This came many years after you had started working on phylogenies. Is this one of the first papers where you discussed phylogenies and the comparative method?
Joseph Felsenstein: The first one. The first one where I did, and one of the first ones where anybody did. Really, it is the first one where anybody did. Well, there’s one 1942 paper by somebody else that talked about pairs of species that were implicitly on a phylogeny; they weren’t explicit. Anyway, it’s a very obscure method.
HS: How did you get interested in this topic?
JF: Well, I had been working on phylogenies since my PhD work, although I’m really trained as a theoretical population geneticist. But in the late 1970s, I switched my work into working mostly on phylogenies. And in my thesis work I had done some particular technical work, which turned out to be useful for this comparative method problem. But the real impetus for it was that British biologists, in particular, Paul Harvey and Tim Clutton-Brock, had been worrying about the problem in the late 1970s – from 1977 on – and if you graph, for example, brain weight against body weight across species, they were trying to figure out, should you do it with species or should you use a higher level in the classification system and do genera, for example? And they couldn’t quite figure out how to think very clearly about this. Paul Harvey passed through Seattle in the early 80s, and I heard of the problem through him. In 1982 he gave a seminar on the problem in our Zoology Department, and I realized that I could make use of some of the algebra that I had done in my thesis. So I basically sat down and figured out how to do that, that you could make these contrasts on a phylogeny and they would be independent of each other, whereas the species on a phylogeny are not independent. In fact, that’s the whole point of a phylogeny, is it’s a diagram of how they are not independent. We and a chimpanzee are not independent of each other because we share a lot of common ancestry. So, I was able to recycle this algebra from my thesis work and address the problem that Paul Harvey and Tim Clutton-Brock had been talking about and come up with this.
I went on sabbatical leave in 1982, to Britain, to the University of Edinburgh, which is where I had done my post-doc 15 years before, and in going around and in travelling to various universities in Britain, I gave some talks on this, including visiting Paul Harvey at the University of Sussex. Anyway, I submitted the paper after I came back to Seattle in late 1983. The story is told in more detail in the recent paper by Ray Huey, Ted Garland, and Michael Turelli in The American Naturalist in 2019. I contributed an Appendix to their paper telling the story.
HS: And around the same time, were you aware of other people who were sort of working on the same kind of topic, or was this completely new?
JF: I became aware in 1983 of the work of Mark Ridley, who had done a thesis addressing the problem with categorical characters. He had a method which I found a little harder to understand. I think I cited that in my paper.
HS: Yes, you did cite him.
JF: In that paper? Okay. So I was obviously aware of that, but really his methods…let me just see here…Yeah, Ridley…Yeah, it was actually a monograph that he wrote in 1983. So he was addressing the same issue. It’s just that his methods are a bit obscure. I don’t entirely know. Even if you asked me to explain them today, I’d have to go back and look at them. I don’t really know; because other people have not picked them up.
HS: Stepping back a bit, could you also tell us, you know how you got interested in this area of research? You said your PhD thesis was on population genetics, and you’ve done a lot of work on the statistics around genetics and phylogenies. Can you tell us how you got interested in this combination of topics?
JF: Well, actually, it didn’t quite work that way. My thesis was on phylogenies. I was trained as a theoretical population geneticist and one of my earliest mentors was Jim Crow – James F. Crow – at the University of Wisconsin, when I was an undergraduate there. And then I went to graduate school with Richard Lewontin who was then at the University of Chicago, subsequently he was at Harvard. And I was working on my PhD, a very ambitious, and in the end, unsuccessful PhD problem that I had posed myself in his lab. It wasn’t working. Meanwhile I got some people in the same department, the biochemical geneticist Jack Hubby, and the Drosophila systematist, – rather famous Drosophila systematist – Lynn Throckmorton were both on the same floor of the same building, and they had some protein band data, and they wanted a clustering program used on it. They gave me a copy of Sokal and Sneath’s book Numerical Taxonomy, I read this and then I implemented the method and I ultimately got a diagram for them which they never used. But I got hooked on the methods. I got quite fascinated with these methods of making tree-like diagrams.
And you said that that’s genetics, but it isn’t because a lot of people were using morphological characters that weren’t explicitly genetic. And people were using molecular sequences, and in ways that weren’t really genetic. They were just looking at the number of substitutions going from one sequence to another. Each species was represented by one molecular sequence, with no consideration of within-species variation, so there was no population genetics. So actually, it represented me moving outside of theoretical population genetics and starting to work on this sideline. And then in the end Dick Lewontin said, “Well, if your main thesis topic is not succeeding, why don’t you write up this tree stuff?”, which I did – for my thesis. Then I went to the University of Washington and for the first 10 years did mostly theoretical population genetics, but on the side I kept working on phylogenies. And then by the 1980s, I decided that that area was heating up and I should really spend more time on it.
HS: Do you remember you know how long it took you to do the work presented in this paper and write it up? Do you remember when and where you did most of it?
JF: I don’t have very clear recollections of that. Paul Harvey tells me that after he gave his seminar on the problem, he went to my population genetics lunchtime seminar the next day. And in that informal seminar I explained how one could use contrasts to solve his problem. It was pretty quick because I had already got the contrasts in my thesis work 10 years before, as a quick way of evaluating likelihoods on trees. And then I realized that if we have correlated characters, and we did a contrast on each character, we preserve the character correlations, but we remove the correlations between species, and you get these independent contrasts. And then it was pretty obvious after that. So I don’t think it took very long. It took a little longer to write it up and get it through review and so on.
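The contrast calculation described in this answer can be sketched in a few lines of Python. This is only a minimal illustration under a Brownian motion assumption, not the code from the paper or from PHYLIP: the nested-tuple tree layout, the function names `contrasts` and `contrast_correlation`, and the four-species data are all hypothetical. Each internal node yields one standardized contrast, and its own value is a weighted average of its two descendants, with its branch lengthened to reflect the uncertainty of that average.

```python
from math import sqrt

def contrasts(node, values):
    """Return (node value, adjusted branch length, list of standardized contrasts).

    A node is a pair (label, branch_length). For a tip, label is the species
    name; for an internal node, label is a pair (left_node, right_node).
    """
    label, length = node
    if isinstance(label, str):          # tip: use the observed species value
        return values[label], length, []
    left, right = label
    x1, v1, c1 = contrasts(left, values)
    x2, v2, c2 = contrasts(right, values)
    # Difference between sister lineages, scaled by its standard deviation
    # under Brownian motion (variance proportional to v1 + v2).
    c = (x1 - x2) / sqrt(v1 + v2)
    # Weighted average of the descendants; shorter branches weigh more.
    x = (x1 * v2 + x2 * v1) / (v1 + v2)
    # Lengthen the branch below this node to account for estimation error.
    v = length + v1 * v2 / (v1 + v2)
    return x, v, c1 + c2 + [c]

def contrast_correlation(cx, cy):
    """Correlation of two sets of contrasts, computed through the origin."""
    sxy = sum(a * b for a, b in zip(cx, cy))
    return sxy / sqrt(sum(a * a for a in cx) * sum(b * b for b in cy))

# Hypothetical tree ((A:1,B:1):1,(C:1,D:1):1) and two made-up characters.
tree = ((((("A", 1.0), ("B", 1.0)), 1.0), ((("C", 1.0), ("D", 1.0)), 1.0)), 0.0)
x = {"A": 1.0, "B": 3.0, "C": 6.0, "D": 10.0}
y = {name: 2.0 * value + 5.0 for name, value in x.items()}

_, _, cx = contrasts(tree, x)
_, _, cy = contrasts(tree, y)
r = contrast_correlation(cx, cy)
```

Four species give three contrasts per character. Because each contrast is a difference, the constant offset in the made-up second character drops out, and the correlation between the two sets of contrasts reflects only how the characters change together along the tree.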
HS: Was American Naturalist the first place you submitted this to?
JF: Yeah. It was, yes.
HS: Do you remember if it had a relatively smooth ride through peer review? Do you remember what the review process was like?
JF: I do. I mean, I have the reviews right here. It had four reviews and aside, for instance, from complaining about some jokey wording that I had, which I removed, the reviewers were not hostile. This was unusual then because there was a lot of warfare going on between people working on numerical methods, like me, and morphological systematists who had gotten excited about Willi Hennig’s methods for doing phylogenies in a very non-statistical, philosophical approach. So I was having all these wars with people over phylogenetic methodology, but the reviewers at The American Naturalist I got were not hostile, but all of them expressed the same doubt. They all said, “This method requires phylogenies; what we’ve got is classifications; we mostly don’t have phylogenies. So, we think this method is probably useless.” And so, you know, some of them said, in effect, “Well, you know, you could publish it, but mostly we won’t have what’s needed to do this method.” And I think they weren’t looking very far ahead. It did, however, just barely get accepted because they all said, “Well, I guess there’s enough here to publish it, even if it isn’t very useful because nobody has phylogenies”. And I was thinking that if I had submitted it one year earlier, it probably would not have been published because people would have been even more negative about the existence of phylogenies. So it came along at the first moment when it would have been accepted in that journal. Now we look back and it looks silly to, you know, raise that objection. You know, it came out in 1985. 1985 is the year that Polymerase Chain Reaction methods came along, and suddenly you could get lots of loci out of many organisms; so the number of phylogenies was rapidly going up.
HS: I want to talk a little more about the reviews. At one point in the paper you say “some reviewers of this paper felt that the message was rather ‘nihilistic’”. And then in the Acknowledgments you thank the reviewers for saving you from yourself at one point. Tell us more about this.
JF: It was the jokey remark that I made, which was a bit arrogant and insulting. I imagined the student who said, “I want to do comparative methods but I don’t want to use phylogenies” and made a snide suggestion as to what other lines of work this student could go into. But I was trying to make the point that you’ve got to use phylogenies. I did it a little bit too fiercely and the reviewers objected to the fierceness. It was not a technical point.
HS: Earlier you said that one reason the reviewers thought this method wasn’t useful was because phylogenies weren’t available. What did you feel about this criticism then? Did you feel that that situation with regard to availability of phylogenies would change fairly soon?
JF: Oh, I could see it happening. I could see the number of phylogenies increasing. I could see that they were more and more practical. Remember that I produced, in 1980, the PHYLIP package for inferring phylogenies, which was the first widely distributed package of programs. So I could watch, by its distribution, the interest going up. And I got up to 100 distributions, and I went “Oh, that is very exciting.” And, you know, I was starting to get hundreds of distributions of it. Of course, now, it’s past its plateau and is declining now – other programs have taken over – the number of registered users of PHYLIP is somewhere around 31,000. In those days, I had like 300 users and I was very excited. That seemed like a huge number. So I could see that that was coming. And it seemed then and still seems to me that classifications were not very useful, but phylogenies were. Most systematists even to the present day still think that classification is central; that taxonomy, in terms of classifications, is the core of systematics. I will argue that in order to make inferences about multiple species, you need phylogenies, but it is not necessary, when going from the data to the phylogeny to the inference, to have a classification. So I think classifications above the species-level classifications are a side issue and not very important. And systematists get furious at me when I say that, and refuse to acknowledge that. To this day, I’m the only person, you know – there may be one or two or three people, but I don’t know who the others are really – who are willing to stand up and say classification isn’t that important; phylogenies are what are important. But I think the whole situation is silly because the people in the field are voting with their feet to go do phylogenies, you know, and not so much to spend their time on classification. So, sooner or later that will impress itself on systematists.
HS: You mention one of the difficulties in the method is the reconstruction of the phylogeny. And you mention three sources of information we can use to construct phylogenies – gene frequencies, molecular sequences, and quantitative characters. Today, it’s clearly one of these methods that rules. Would you say that the situation was very different then?
JF: Well, initially in the 1960s, gene frequencies were a major source of information, but they’re near the species-level or within species really, where you don’t always have a tree. The pioneers of work in this area were Anthony Edwards and Luca Cavalli-Sforza, and they worked on gene frequencies within humans. So, you know, I wanted to mention that. For quantitative characters, the problem is, in order to interpret them, you have to understand how they’re correlated with each other. The comparative method machinery, which goes from molecular sequences, gets a tree and then looks at what we can infer about the correlations in evolution of different characters turns out to be something very useful to do. Going the other way, where we start with the quantitative characters and make a tree is really not very viable right now.
HS: In the same year that this paper was published, you published another paper, which has received a huge number of citations, probably among the most cited papers in biology
JF: You mean the bootstrap paper?
HS: Yes
JF: That is my most cited paper. It is one of the most cited papers on phylogenies, but Naruya Saitou and Masatoshi Nei’s paper on the neighbor-joining method is more cited; it has a couple times more citations than mine. And there are some other bioinformatics papers, like the ones that introduced the BLAST search method, and Des Higgins’s papers on Clustal, the alignment method Clustal, those are much more heavily cited than my paper. There’s a list of that. In 2014, Nature published a list of the 100 most cited papers in science, in all of science. And they are almost all methods papers. One of the papers that is not on that list is ‘Watson & Crick 1953’. Now, of course, it’s much more fundamental, but these papers get cited every time anybody uses a method. So they get huge numbers of citations. And when you look through there, you’ll find that bioinformatics methods are very highly cited. My paper is No. 41 in that list. But I think the Clustal and BLAST ones are down around 15 or 20. They are in the top 15 or 20 most cited scientific papers of all time. My paper is the most cited paper ever produced at my university. That, at least comforts me a little bit. So, it’s very heavily cited and much more cited than the comparative methods paper.
HS: Which one came earlier?
JF: Published earlier… well, the comparative methods paper is the first paper of the year in The American Naturalist. So you know, it was published first, yes, but I didn’t work on them sequentially. I was of course, already working on the bootstrap when I published the other one.
HS: Yes, I just remembered you talk about a way to calculate the confidence intervals in the American Naturalist paper.
JF: I do. I talked about confidence intervals. I don’t think I said anything about the bootstrap in there. I talk about the prospects for using both the quantitative characters and the molecular sequences. And I talk a bit about likelihood ratios and I have some discussion of likelihood ratios in there. It’s what would now be called the ‘total evidence approach’, where, I guess, you have to infer the co-variances between the quantitative characters, but then make use of them in making the biology. It’s a method that nobody uses, basically, because there’s so much information coming from the molecular sequences.
HS: How was the paper received when it was published? Did it attract a lot of attention?
JF: No. It was about a year before anybody used it. Of course, there’s a publication delay. So there were a few people who got interested. And the citations gradually grow. But I think if you look at the citations, if you look at Web of Science or something and you look at its citations, you’ll find that, you know, there’s a few people using it every year and the numbers trickle upward. It’s been cited 5100 times in Web of Science. And if I analyze the results – in 1985, two papers cited it, but I think they’d be papers of mine. In 1986, 14 papers cited it; 1987 – 20; then 32, then 38, then 37. So by 1990, the number of papers that had cited the comparative methods paper was about 130. So, five years later, which is, you know, modest interest. And its citations grew… they’ve gone up slowly in the 2000s, and then they continued up. In 2015, there were 313 citations to the paper. That’s the latest completed year. Yeah, the figures I gave you earlier were for the bootstrap, which is silly; we needed to know about this one.
HS: Apart from the citations, was this something that people were talking about? For example, when you met people who were doing comparative analyses, did they talk to you about this?
JF: I remember some people saying, “Oh, that’s interesting.” And there were some people using it. I don’t think I remember much controversy. The people I was fighting with would have completely dismissed it. I mean, like the people in Willi Hennig Society who were very much focused on philosophical approaches, they would regard this as completely irrelevant and completely wrong. So they just wouldn’t pay attention to it. I think people using molecular data and working on measurable characters, you know, tended to find it interesting, but they weren’t all using it right away. The interest kind of gradually grew.
HS: Today, would it be fair to say that what you proposed is more-or-less universally accepted? If yes, do you have a sense of around when that happened?
JF: No, not really. I mean, I just see this gradual growth, and then by the 2000s, you know, everybody is saying … everyone is basically thinking this way. But it happened gradually during the 90s. I think a big influence was Paul Harvey. Paul Harvey and Mark Pagel wrote a book in 1991 on comparative methods, and they publicized this. They did a lot of publicity for it. Mark Pagel was actually a PhD student at my university. Many people think because he’s in Reading, England, that he’s English; but he’s American. And he got a PhD from my university. I was not in contact with him when he did that. He was doing a PhD in our Psychology Department, on animal behaviour. But then one day he came to see me, just as he was leaving for England, and he said, “I’m going to work with Paul Harvey. What would you suggest I do?” And I said, “Persuade him to use this method.” And he must have done so. Because then they put out a book together. I think that helped a lot.
HS: Today, do you think that there is also the danger of going too far in the opposite direction, by which I mean, are there instances where using phylogenies might not be required and might actually lead to wrong conclusions? Or do you think that, as a general rule, comparative analyses are better done with phylogenies? In fact, in the paper, you mentioned a couple of scenarios where it might not be required: if natural selection has happened instantaneously or if you are looking at the correlation between two variables that just reflect the response to a common environmental factor. What are your thoughts on this today?
JF: I think that’s still valid. And I think the reviewers in 1985 – well, 1984 really – raised cases like that. They said, “suppose that there is a response to an immediate environment” basically, “then phylogenies might not be important.” So, you know, I had that in the paper. What you really have to do is use a method that can handle both of those and try to see whether statistically you can rule out the effect of phylogeny. And there have been methods – there’s one by Michael Lynch in 1991 – which tried to have a phylogenetic signal plus a local response to the immediate environment. And that was intended as a test framework where you could look at those. But in order to do that you have to start from the phylogeny and add all this immediate response stuff.
HS: Do you remember how you drew the figures for this paper?
JF: How did I draw them? I plotted them. There was a Calcomp plotter in our computer centre, and you could feed in files of numbers and, you know, there were some function calls you could make – I don’t remember how it works – and I figured out how to draw these axes and plot these figures. And they’re very crude symbols. I mean, the symbols are little squares, and then up here, it’s a rather weird looking little asterisk, in which the diagonal lines are longer than the vertical and horizontal lines. We don’t draw asterisks that way, usually. That was what the plotter produced if you gave it the figure of an asterisk, and there’s some other way you could tell it to put a square. So I had to do all that in the computer and have it computer-plotted.
HS: Were the phylogenies originally hand-drawn?
JF: They originally were hand drawn, but I think I had, I believe, a graphic artist in our medical school redraw them.
HS: I also wanted to ask you about the people you acknowledge, to learn a little more about who these people were and how they helped. First, you mention a student whose name you don’t know, who, after a seminar you gave at University College, London, told you about a method that uses pairs of closely-related species instead of a full phylogeny. After the paper was published, were you able to trace this person?
JF: No, never did. It was when I gave the lecture at University College London, and after the lecture, a young woman who was, I think a post-doc or something, came up and said, “You know, there’s this pair-wise method”. And I said, “Oh, that’s interesting.” But I never got her name. And she went away. Later I contacted Paul Harvey and others. I contacted Paul and said, “Who was that?” And he said, “I don’t know, I’m not sure who that was.” He didn’t remember her. And then when I submitted the paper to The American Naturalist, instead of her name, I just put in brackets “(person’s name to be supplied) has told me”, you know. Because I was still writing letters to Paul Harvey asking who was that, and they hadn’t figured it out. And we never did figure it out. But the reviewers got very upset. They figured there was some very obscure conspiracy, that I had some devious reason for concealing this person’s name. I just didn’t know who it was! I later found out that the method, the pair-wise method, was done by a guy named Salisbury in 1942. That’s what she was referring to – Salisbury. I don’t think she gave me the Salisbury reference, but somebody did. I knew that the method itself was not invented by her, but was developed by Salisbury.
HS: The next person you acknowledge is Ray Huey
JF: Ray Huey is a guy who is in our biology department – very well known physiological ecologist. I was in contact with him and he was interested and I don’t remember what suggestions he made. One of the things I had to do in that paper was, at the beginning of the paper, I give the names of several people. I cite several papers where people don’t do a phylogenetic correction. Most of those people were fairly tolerant about that. I could understand it if they’d gotten mad at me; most of them didn’t. I may have, you know, I suspect that I talked to Ray partly about, you know, who is likely to get upset, if they were used as a negative example. But I don’t really remember the details of that. You know, Ray’s a friend, and I’ve gotten wise advice from him a lot.
HS: John Gittleman
JF: John Gittleman, who more recently was at the University of Tennessee, was a post-doctoral fellow. And John was in Seattle; he did some post-doctoral work here. He might also have worked with Paul Harvey; I’m not sure which.
HS: Did you discuss the method you were proposing with him?
JF: Yeah, these are mostly zoologists and they were interested in the method and, you know, they found it interesting and I had some discussion with them. It was not about the technical machinery.
HS: Robert Martin
JF: Robert Martin again was somebody who was interested in using the method. He was a zoologist in England who was interested in using it on, I believe, primate data. So again, I’d had some discussion with him.
HS: Then, “Mart Ridley”; I’m guessing that is Mark
JF: Yes, I talked to him to try to figure out what his method was, okay.
HS: This paper came at a time when you were fairly well-established in your career already. But did the paper and the attention it attracted have any kind of direct impact on your career?
JF: Um, I think indirectly over the next few years. I think from about 1985, particularly, actually, the bootstrap did a lot more for my career. But this also did, and connected me more to zoologists who were working on comparative biology. But basically, there was a rising tide of awareness that people had, of what I was doing. And it really wasn’t till the middle 90s that this began to really become noticeable. I started getting attention more heavily about 10 years after this, than right away.
HS: Did it also impact the course that your research took after this?
JF: Not really. Most of my work at this time was on things like the bootstrap and molecular phylogenies. I was writing review articles, I wrote a little bit more about comparative biology, I wrote a 1988 review in Annual Review of Ecology and Systematics, which talked about how you use quantitative characters with phylogenies and brought up this and other, you know, problems. I was trying to spread that view of things. But mostly my work was on molecular phylogenies. And then in the 90s, I worked more on coalescence – likelihood methods and Markov chain Monte Carlo methods for coalescence. And it really wasn’t till after – well, after I published my book; I finished the book in 2003 – that I shifted my work more into working on quantitative characters and phylogenies. And that’s been the centre of my work since then. So, I have a couple of papers after that on extensions of this method. One is on how do you handle finite sample size. If you don’t know the exact species mean, but you have, you know, six samples from that species, how do you take that into account? That was in 2008. And then in 2012, I have a paper on how do you use discrete characters with an old model – Sewall Wright’s threshold model – as a model of determination of a discrete state by quantitative characters. That was fairly recent; four years ago. And I think that one will get more widely used. It’s very gradually catching on. So, I’ve kind of come back to that, in the last five or six years. Steve Arnold – I don’t know if you know who Steve Arnold is – Steve and I would talk at Evolution meetings and we would say, “you know, there is this course that goes on at Woods Hole at the Marine Biological Labs”. And I had been lecturing in it for many, many years. I lectured in it like 25 times. But he and I would get together and say, “you know, there ought to be a parallel course on quantitative character methods, within species and in phylogenies. And it should be parallel to the Woods Hole course.” And finally, in 2011, we had the opportunity to run a one week course on that in North Carolina, at NESCent, which was an NSF funded outfit at Duke University. We ran it three times there and then their grant finished. We then ran it three times at NIMBioS, which is National Institute for Mathematical and Biological Synthesis at Knoxville, Tennessee, at the University of Tennessee and then their grant ran out. And now we’re actually running it next summer, next June, around here at Friday Harbor Laboratories at the University of Washington. So, we’ve got this summer course going, basically, and we’ve done six years of it, and we’re going to do more. And we’ve had, you know, about 25 or 30 students each time. So there’s some hundreds of people who have come through this course now, and I think it’s starting to have some impact.
HS: It’s now 31 years since the paper was published. Does your thinking on the mathematical solution you provide to the problem in this paper, remain more-or-less the same, in its essence?
JF: It remains the same except that I see some of the limitations more clearly. I see the problem of sample size, which the ecologist Robert Ricklefs pointed out in 1996. And then I had to make some machinery to solve that. There’s an increasing use also, instead of Brownian Motion models, which I used in this paper, there’s an increasing use of Ornstein-Uhlenbeck models, which are Brownian Motion but attracted to an optimum point. And I can see that sometimes Brownian Motion is not the best model, but the Ornstein-Uhlenbeck models are actually very difficult. There are many possible patterns you can have with Ornstein-Uhlenbeck as to where the optima are, and you don’t really know which of them is going to be operative. It’s simplest to use Brownian Motion but I can see that more complex models are needed. On the other hand, just people waving hands and saying this model isn’t good enough doesn’t really help.
HS: Another thing you say in the paper is that “There is no reason to believe that the normal distribution is particularly plausible as the distribution from which changes in individual branches of the phylogeny are drawn.” What’s your thinking on that today?
JF: Well, Ornstein-Uhlenbeck models do have a normally distributed change, but it’s different. It doesn’t have an expectation of zero change: from the starting point there is an expectation that moves closer to the optimum. People have come along and said there are these other processes. There are what are called Lévy processes, which are a whole class of processes that have very large jumps. So people say, “what if it’s a Lévy process?” And I have to say two things: one, there are many Lévy processes; which one do you want it to be? The other one is, what mechanistic process can produce a Lévy process? What mechanistic forces in biology actually will produce a Lévy process? And they don’t really have easy answers to that. I think if you have forces involving selection and genetic drift on gene frequencies, they will not be able to make these very big jumps that a Lévy process demands. People come along with this mathematical bestiary. And they say, “Here’s a weird process; it could be that one.” And I have to say, “I don’t think so. You know, it’s all very well, but what would actually produce that?” And I think it’s much harder than that. So yeah, there are other possibilities. What I like to say to people is that “Yes, you’re allowed to be sceptical of that, but you will only be admitted to the event if you carry with you an alternative model. Just coming in and saying that it’s more complicated than that will cause me to ignore you.”
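The difference between the two models mentioned here can be made concrete with a small Python sketch. It uses the standard textbook forms of the two processes and is not taken from the paper; the function names and the parameter names (`sigma2` for the rate of random change, `alpha` for the strength of attraction, `theta` for the optimum) are assumptions introduced for illustration. Under both models the change along a branch of length `t` is normally distributed, but only under Brownian motion is the expected change zero.

```python
from math import exp

def brownian_change(x0, sigma2, t):
    """Mean and variance of the trait after time t under Brownian motion."""
    # The expected value stays at the starting point; the variance grows with time.
    return x0, sigma2 * t

def ou_change(x0, theta, alpha, sigma2, t):
    """Mean and variance of the trait after time t under an Ornstein-Uhlenbeck process."""
    # The expected value moves from x0 toward the optimum theta.
    mean = theta + (x0 - theta) * exp(-alpha * t)
    # The variance levels off at sigma2 / (2 * alpha) instead of growing without bound.
    var = sigma2 / (2.0 * alpha) * (1.0 - exp(-2.0 * alpha * t))
    return mean, var

# Made-up numbers: start at 0, optimum at 10, branch length 2.
bm_mean, bm_var = brownian_change(0.0, 1.0, 2.0)
ou_mean, ou_var = ou_change(0.0, 10.0, 0.5, 1.0, 2.0)
```

With these made-up numbers the Brownian expectation stays at the starting value, while the Ornstein-Uhlenbeck expectation lies between the starting value and the optimum, and its variance is smaller than the Brownian variance over the same branch.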
HS: You say, “It should be obvious that there is much statistical work remaining to be done on robust methods of using partial knowledge of phylogenies to make inferences about regressions and correlations of characters.” Today, would you say that that’s no longer an issue because full phylogenies are relatively easily available? Or do you think that, you know, there’s still need for research in that area?
JF: I think that it certainly is easier to build phylogenies, but also people are using methods, not just the bootstrap, but also Bayesian methods to get posteriors, to sample, say, from the posterior of the phylogeny. And I did talk about – I think it was in the 1988 paper – the fact that you could use a cloud of bootstrap samples or a cloud of Bayesian samples and basically do this analysis on every tree. The computing power now makes it rather easy. The contrast method is actually very fast, computationally. And so if somebody gives you 1000 bootstrap samples or 1000 Bayesian samples and says: “infer the co-variances for every one of these”, I will say: “sure, easy to do.” And the software in my package, PHYLIP, will enable you to do that. There are options in the main program Contrast that carry out these kinds of analyses: it can read in a dataset, and then produce multiple trees. It can do that. So I think that’s still very relevant, even though the point estimate of the tree is better than before. You still want to know what the uncertainty does to this, and it’s doable. And so, by the late 1980s, I was already talking about how you could do that.
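The idea of repeating the contrast analysis on every tree in a sample can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not PHYLIP's Contrast program: the tree encoding (a leaf is a species name, an internal node is a tuple `(left, left_length, right, right_length)`), the function names, and the three-species data are all hypothetical. It assumes strictly bifurcating trees with positive branch lengths and Brownian motion along branches.

```python
from math import sqrt
from statistics import mean, stdev


def contrasts(node, trait):
    """Return (node value, extra branch length, list of contrasts).

    Each contrast is the difference between the two daughter values
    divided by the square root of the summed branch lengths.
    """
    if isinstance(node, str):  # leaf: observed species value
        return trait[node], 0.0, []
    left, v_left, right, v_right = node
    x1, extra1, c1 = contrasts(left, trait)
    x2, extra2, c2 = contrasts(right, trait)
    v1 = v_left + extra1  # lengthen branches to account for the
    v2 = v_right + extra2  # uncertainty in estimated node values
    contrast = (x1 - x2) / sqrt(v1 + v2)
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    extra = v1 * v2 / (v1 + v2)
    return value, extra, c1 + c2 + [contrast]


def contrast_covariance(tree, x, y):
    """Covariance of two characters estimated from their contrasts.

    The contrasts have expectation zero, so no mean is subtracted.
    """
    cx = contrasts(tree, x)[2]
    cy = contrasts(tree, y)[2]
    return sum(a * b for a, b in zip(cx, cy)) / len(cx)


# Hypothetical trait values for three species.
x = {"A": 1.0, "B": 3.0, "C": 5.0}
y = {"A": 2.0, "B": 2.5, "C": 6.0}

# A tiny stand-in for a bootstrap or posterior sample of trees.
tree_sample = [
    (("A", 1.0, "B", 1.0), 1.0, "C", 2.0),
    (("A", 1.2, "B", 0.8), 0.9, "C", 2.1),
    (("A", 1.5, "C", 1.5), 0.5, "B", 2.0),
]

# One covariance estimate per tree, then a summary across trees.
covs = [contrast_covariance(t, x, y) for t in tree_sample]
summary = (mean(covs), stdev(covs))
```

The spread of `covs` across the sampled trees gives a rough picture of how much the uncertainty in the phylogeny affects the estimate.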
HS: You say, “One rather serious problem that confronts comparative studies is that the relationship under study may change through time.” Subsequent to this paper, did you research this further, i.e. how to tackle changing relationships in comparative analysis?
JF: I have not really done that. I think that usually lies a little bit beyond what people are able to investigate, partly because if you are using present-day species, what you’re going to see is some sort of average of the co-variances over time. But unless you have fossil data as well, you actually won’t have much power to figure out how the co-variances changed through time. I haven’t heard of very many people doing that. They will, of course, probably when they integrate fossils into this. There’s a lot of interest right now in parametric statistical models that involve both present-day species and fossils. So, I think more of that is coming, but it hasn’t really got here yet.
HS: At the time when you were thinking about this and writing this paper, did you anticipate at all the impact it would have on the field?
JF: I don’t think I knew how large these fields were all going to get. I’d also, a few years before, tried to make available programs and talk about how you could make likelihood inference of phylogenies. I wasn’t, you know, the first person to work on that, but I helped make it practical. So I thought some people would use likelihoods and bootstraps and this, but I don’t think I imagined that it would just keep growing and growing and growing. Basically, what I’ve come to realize is that the phylogeny is the basic structure you need, to say anything sensible about multiple species. And we should have realized that back when I was a graduate student. The people working in population genetics then had all these beautiful equations within species, but they had no idea how you thought about multiple species. It was this area that was mysterious. And now I realize that we should have looked in and said, “Phylogenies will be the key to that. Let’s study phylogenies and figure out how to use them.” But I think a lot of the people working within species back then in the 1960s did not have that, you know, they failed to get interested in phylogenies because they didn’t understand how central they were going to be to comparisons between multiple species, basically.
HS: You earlier spoke about the way in which citations have grown from the time this was published. Do you also have a sense of, you know, what the paper gets cited for?
JF: It’s mostly when someone’s used the method. Someone uses contrasts and then they cite it. They need to have a citation for the method, so they cite.
HS: Another thing I noticed from your recent publications is that you’ve also started working on within-species variation. Is that a recent interest or was this something you have been thinking about from the beginning?
JF: I was trained in theoretical population genetics and also in quantitative genetics. I was a post-doctoral fellow at the Institute of Animal Genetics in Edinburgh with Alan Robertson, who was the most brilliant theoretician in quantitative genetics. And Douglas Falconer was there, who wrote the standard textbook on the field. So I was always very aware of quantitative genetics, animal and plant breeding, and those issues. So I think what I’m trying to do is to bring together these between-species methods, including phylogenies and comparative methods, and put in a little bit more realism of the variation that’s occurring within species, and merge these two areas, basically. So it’s not just a recent interest; it’s been a long-term interest.
HS: Have you ever read this paper after it was published?
JF: Oh, yeah, I always do that. I always reread my papers so that I can find out how they sound. I want to see how I’m coming across, and so I read them again. So yeah, I do that all the time. I probably went all the way through a few times. But I’ve read particular bits of it, or talked to people about it.
HS: Do you notice any striking differences in the way you wrote then from the way you write today?
JF: I think I write more simply now, and I don’t go into as much detail of implications. Maybe it’s just fewer brain cells, but I think my style has shifted a little bit away from seeing all the complications right away, to telling a simpler story. But on the whole, I’m happy when I reread it. I said what needed to be said. I do see some points where I could have mentioned certain simple points that I actually understood then. Other people got papers, published papers, making these points, you know. People talked about, when you have contrasts and you do a regression of one character on another, you actually have to make the regression pass through the origin. Because the expectation of a contrast is zero, so, you know, it goes through zero under the model. I don’t think I said that in the paper, and I think Ted Garland got a paper out of that, you know, talking about how the regression should go through the origin. So I keep looking and say, you know, “If I’d only mentioned these things, I could, you know, I could have gotten some credit for those points too.” I think it’s also just ego. I like to go back and read my papers because I both enjoy looking at a paper and knowing that it’s gotten across and people are paying attention, but I’m also defensive and I go back and read and say, “Is it good enough?” And usually I conclude, “It is good enough.”
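The point about forcing the regression through the origin can be made concrete with a short Python sketch. It is an illustration only, with a hypothetical function name and made-up contrast values. Because each contrast has expectation zero under the model, no intercept is fitted, and the least-squares slope reduces to the sum of cross-products divided by the sum of squares.

```python
def slope_through_origin(cx, cy):
    """Least-squares slope of cy on cx with the intercept fixed at zero."""
    sxx = sum(a * a for a in cx)
    if sxx == 0:
        raise ValueError("all x contrasts are zero; slope is undefined")
    return sum(a * b for a, b in zip(cx, cy)) / sxx


# Hypothetical contrasts for two characters.
cx = [1.0, -2.0, 3.0]
cy = [2.0, -4.0, 6.0]
slope = slope_through_origin(cx, cy)
```

One way to see why the intercept is fixed: the sign of each contrast depends on an arbitrary choice of which daughter is subtracted from which, so flipping both signs of any pair must leave the fit unchanged, and this formula has that property.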
HS: This is a very easy paper to read. This is not my primary area of research, but because of the way it is written, I was able to understand it almost entirely.
JF: Well, I try to write that way. And I think I was influenced a lot by things that I read by other people, including my two mentors, Jim Crow and Dick Lewontin. Jim Crow had a very clear, simple style. Dick Lewontin wrote a little more elegantly, but also very, very clearly. And I think I’m really just trying to see if I can do as well as they did. Over the years people have said that my writing is generally very clear in many of my papers, but I would maybe rather that they said it was literary or elegant or inspiring. They don’t say that; they say clear. And I guess I conclude that if you have to have only one of those adjectives, “clear” is probably the one you want. You probably want your paper to be clear before it’s literary or inspiring or elegant.
HS: Would you count this paper as one of your favourites among the papers you’ve written?
JF: Yes. You know, there are a few papers that I liked, just because they’re very clever, even if they’re not very important. But I think this got the job done. And it made the point clearly and established a whole area of work. And I think the bootstrap did the same thing. I also like the paper on phylogenies for DNA by likelihoods. Those are three of my most cited papers, and I’m pretty happy with all of them. I think there’s at least one other paper that I really love because it’s the most elegant method I ever did. That’s a paper on a method for coalescents called the Bootstrap Monte Carlo. A year or two after I published it, somebody showed that it was wrong; that the method doesn’t work. But it’s such a beautiful insight that I love the method. It’s one of my favourite papers, but nobody should pay attention to it because the method is wrong!
HS: Somewhere else you mention another paper in The American Naturalist. I’m forgetting what about now…
JF: Well, there’s a paper in 1978 on a macro-evolutionary model. It’s got a little thermodynamics and other stuff in it. Now, I do intend to go back and write more on that. I think there’s more to be done on that, and it’s a paper that over the years got about three or four citations up until very recently. It’s a perfect example of how you can do good work but if it doesn’t get noticed it can sink like a stone. That one did. But I still think I will get back to that topic, and there will be more on that topic. I hope to write more soon on that.
HS: What would you say to a student who is about to read this paper today? What should he or she take away from it? Also, would you add any caveats they should keep in mind as they are reading it?
JF: Umm, I think what I might do is tell them to go read the chapter in my book on this. Go read Chapter 25 in my phylogeny book. It’s probably simpler than this paper. And it will also give some more recent citations of complications and things you need to think about. So I might tell the person: “Go read that chapter, then go read the paper, and you could put it into more context.” I might also point people to the review article I did in 1988 on quantitative characters. Now that too can be found in my book in Chapter 24. So I might say, “Maybe read Chapters 24 and 25 in the book, and then you’ll be able to put the whole area into a bigger context without doing as much work as reading these papers.”