Friday, May 29, 2015

About as bad as behavior genetics reporting gets

The occasion is a story in the New York Times a couple of days ago, linked here.  The story was prompted by a paper in Evolution and Human Behavior by Brendan Zietsch et al, linked here. The Times title says it all:  "Infidelity Lurks in Your Genes."  I should say at the outset that the journal article itself is fine.  I'm not sure I buy the argument that the heritabilities they estimate are higher than others in meaningful ways, and there are obvious reasons to be skeptical about the small-sample candidate gene results, but in the paper the authors are perfectly up front and thoughtful about the limitations of their conclusions, and in fact the OXTR and sexuality work has most of what you could want out of this kind of study, especially meaningful animal models.  This post is not about the science.

But the Times article gets just about everything wrong.  Of course, there is the ridiculous overstatement of the psychological meaning of heritability.  OK, infidelity is heritable, but so is everything else, so if infidelity lurks in our genes so does everything, which I suppose is true.  What they mean is that the likelihood of infidelity is not independent of genetic endowment.  The important lesson of complex behavior genetics is about the human condition-- we all create our selves and regulate our behavior in the constant presence of genetic endowment-- and not about anything particular about individual behaviors.  There is no other aspect of sexual behavior to contrast with infidelity that does not lurk in our genes and is therefore under our perfect psychological control.  The world doesn't work that way.

Then there is the confounding of the twin evidence and the candidate gene work.  See the paragraph that begins "He found that 9.8%..." It starts out talking about candidate genes, and then switches to saying that 40% of the variance can be attributed to genes.  The average NYT reader would have absolutely no idea that it isn't OXTR and vasopressin that account for 40% (in fact they account for a couple of percent, and that is almost certainly an over-estimate).

Then there is the evolutionary part, which is like a parody.  Men cheat because there is an evolutionary advantage to reproducing with many women.  But women cheat too.  (I like the old jingle:  Hoggamus higgamus, men are polygamous.  Higgamus hoggamus, so are women.)  Why do women cheat?  Well, because they enjoy it!  But don’t worry, that has a biological explanation too, they enjoy it because dopamine.  Once again, I know there are many interesting evolutionary things to say about fidelity and infidelity, and reward systems or whatever.  But it should be a science reporter's job not to reduce them to nonsense.  It does the field no good.

The article closes with a story of an acquaintance of the reporter who has cheated on her partner repeatedly and compulsively over the years.  (By the way, this seems like a lot of information to reveal.  A bisexual woman, apparently married to a man, who is an acquaintance of the writer.  The guy is a psychiatrist.  He should be careful.) Recently the relationship has been bad, so the writer can write it off to psychological causes.  But she also cheated early on in the relationship, which the writer takes as evidence that her cheating is “innate”.

People, like I say, are partially self-determining organisms who are born into the world with evolved impulses, some of them universal and some of them differing among individuals.  Managing the relation between the evolved impulses that we share with voles and our complex self-regulating psychology (which also evolved, of course, but only exists in primitive forms in voles) is the essential human activity; understanding it is the ultimate goal of psychology.  Nature didn’t do us the favor of giving us some desires that are innate and others that are strictly psychological, although it is always tempting to think that way because the alternative is so daunting.  From the point of view of humans-as-biological-entities these questions are the basis of evolutionary psychology and behavior genetics; from the subjective point of view of living people they are (forgive me) psychoanalytic.

I have gathered recently that there is a movement in meteorology to change the way weather forecasting is discussed in the popular press.  The excellent Capital Weather Gang in the Washington Post no longer scream headlines like, “Blizzard to Bury DC!”.  Instead they talk in terms of probabilities and confidence intervals, discuss how their forecasts might go wrong, weigh the different possible outcomes, consider the limitations of existing weather models.  (Those of you in the mid-Atlantic might also want to check out the WxRisk feed on Facebook.) It’s less thrilling than “Blizzard!” but ultimately more interesting, and leaves readers with a sense of what meteorologists actually do.  We need a similar kind of popular press reform in BG.


Thursday, May 28, 2015

The Heritability of Everything

I am getting asked what I think of the recent paper by Polderman et al in Nature Genetics, linked here (paywall).  The answer is that I am ambivalent about it, and rather than try to squeeze my thoughts into a Facebook post, I thought I might expand a little here.

First, the upside.  I would not have believed it was possible to conduct this meta-analysis.  In fact I literally did not believe it when I read the abstract.  The authors of this paper conducted a meta-analysis of every twin study that has been conducted over the last fifty years, including almost 18,000 traits from 2,700 studies.  Not twin studies of ability, or personality, or behavior, but twin studies of everything.  It represents an inconceivable amount of work.  And the meta-analysis itself is beautifully executed.  The graphs are striking, the numerical analysis is sophisticated.  And to top it all off, the data are available in the form of a web-based analysis tool. (Question-- is there a library of PDFs to go with the analysis tool?  It will be much more useful to future investigators if it is possible to scan the original reports for additional data that were not included in the main analysis.  I imagine there would be copyright problems.)

Nevertheless, the question has to be asked-- was it worth the effort?  Twenty-five years ago, I named the proposition that everything is heritable the "First Law of Behavior Genetics."  When I said that I didn't feel as though my conclusion was awaiting affirmation via meta-analysis, because it was obvious.  No serious person, then or now, questions whether in general rMZ > rDZ, not even the critics.  (I'll get to what they do question in a second.)  So the rock bottom finding of the meta-analysis, that on average 49% (The data did the authors a favor by not coming out to exactly 50%) of the variability in the traits is attributable to A, just isn't news.  It is a massive, overwhelming confirmation of what we already knew.  (For the record, the other two laws of behavior genetics were confirmed as well.)

Moreover, taken as a number, a unit of analysis, heritability coefficients are funny things to aggregate on such a massive level.  What exactly are we supposed to make of the fact that twin studies in the ophthalmology domain produced the highest heritabilities?  Should eye doctors, as opposed to say dermatologists, be rushing to the genetics lab because their trait turns out to be more heritable?  No.  Whatever else a heritability may be, it is not an index of how "genetic" something is.  It is not, for example, a useful indicator of how successful gene-finding efforts are likely to be.  If nothing else, differences in reliability of measurement are confounded with every heritability tallied here.  My point is this-- although it's nice to know that on average everything is 50% heritable, it's hard to attach much meaning to the number itself, or especially to deviations from that number, to the fact that eye conditions have heritabilities around .7 and attitudes around .3.  Having two arms has a heritability of 0.
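To make the reliability point concrete, here is a minimal sketch. The assumptions are mine, not the paper's: classical test theory with measurement error uncorrelated across co-twins (so an observed twin correlation is the true correlation times the reliability of the measure), and heritability computed with Falconer's formula, 2(rMZ - rDZ). The function names and the example numbers are hypothetical.

```python
def falconer_h2(r_mz, r_dz):
    """Falconer's estimate of heritability from twin correlations."""
    return 2 * (r_mz - r_dz)


def observed_h2(true_r_mz, true_r_dz, reliability):
    """Heritability estimated from correlations attenuated by unreliability.

    Assumes errors are uncorrelated across co-twins, so each observed
    correlation equals the true correlation times the reliability.
    """
    return falconer_h2(true_r_mz * reliability, true_r_dz * reliability)


# Two hypothetical traits with identical true twin correlations (true h2 = 0.6),
# differing only in how reliably they are measured.
for label, rel in [("well-measured trait", 0.95), ("noisy trait", 0.60)]:
    print(label, round(observed_h2(0.6, 0.3, rel), 2))
```

Under these assumptions the same underlying twin correlations produce heritability estimates of about .57 and .36, a difference that reflects nothing but measurement quality.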

And as I say, no one really disputes the fact that everything is heritable.  Critics of BG don't say, "It seems to me that if someone tallied the data carefully it would turn out that fraternal twins are just as similar as identical twins."  They say, for example, that the increased similarity of MZs is in fact environmental, the result of violations of the EEA.  Or they say that genetic and environmental contributions to differences can only be separated statistically, not biologically.  Or they say a million other things, none of which I necessarily endorse, but none of which are really refuted by this analysis.

The hard question about twin studies is why MZ twins are more similar than DZ twins.  I take the softest view possible:  that in a very general way genetic similarity is associated with phenotypic similarity, for everything, and that this can occur without there being specific genes that are linked in specific ways to specific outcomes.  Whatever the unimaginably complex pathways there may be to becoming a fan of beach volleyball, more genetically similar people are going to be more similar in their fandom.  This is true both across established levels of genetic relatedness (twin and family studies) and in the low-level relatedness among everyone else (GCTA).  It says nothing about the reality of volleyball-fandom as a phenotype, nothing about the likelihood of finding volleyball genes.  It is the general causal background noise of genetic influence.  I have always said that the three laws aren't really laws, they are null hypotheses.  One way to characterize this study is that it is a massive confirmation that the null hypothesis is true.

The bulk of the analyses in the paper is concerned with an issue related to this argument, the fit of the additive model.  This is too complicated a question to get into very deeply here.  To me, for anything meriting the word "complex", on a biological level the additive model is obviously wrong.  Does anyone really think that when the day comes when we understand why some people are more extroverted than others the explanation is going to be that there are thousands of individual genes with biologically specifiable independent additive effects?  But anyway, the authors argue that most twin studies are not inconsistent with the hypothesis of additivity, in the quantitative genetic sense, basically that rMZ = 2rDZ.  But as the authors explain, the classical twin model actually has very little to say about additivity.  Basically, you can always draw a straight line through two data points.  

I'm not perfectly clear about what they do here.  I think that for each comparison they report, they test the null hypothesis that rMZ = 2rDZ.  For 69% of the effects the null hypothesis cannot be rejected.  All this means is that in general the second law of BG holds up (C=0), and that the additive twin model is not violated by rMZ > 2rDZ.  All problems with statistical power (which matters on the level of the individual comparison) are counting in their favor.  So a study with 20 twin pairs in which rMZ = .7 and rDZ = .2 would count as "consistent" with additivity, because the null hypothesis would not be rejected.  But it seems to me it is a big inference to reach any conclusions about the additivity of developmental biology on this basis.  Just by the way, authors, I would be interested to know the percentage of comparisons for which rMZ > 2rDZ, broken down by domain.  If that is in there somewhere I missed it.
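To see how little a small study can say, here is a rough sketch of the 20-pair example. It is not the authors' procedure. It assumes 20 MZ and 20 DZ pairs, uses the large-sample approximation that the variance of a correlation is about (1 - r^2)^2 / (n - 1), and applies a simple two-sided z-test to rMZ - 2rDZ. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def additivity_z_test(r_mz, r_dz, n_mz, n_dz):
    """Approximate z-test of H0: rMZ = 2 * rDZ.

    Uses the large-sample variance of a correlation, (1 - r^2)^2 / (n - 1),
    which is crude for samples this small; treat the result as illustrative.
    """
    var_mz = (1 - r_mz**2) ** 2 / (n_mz - 1)
    var_dz = (1 - r_dz**2) ** 2 / (n_dz - 1)
    diff = r_mz - 2 * r_dz
    se = sqrt(var_mz + 4 * var_dz)  # Var(a - 2b) = Var(a) + 4 Var(b)
    z = diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, se, z, p


diff, se, z, p = additivity_z_test(0.7, 0.2, 20, 20)
print(f"rMZ - 2rDZ = {diff:.2f}, SE = {se:.2f}, z = {z:.2f}, p = {p:.2f}")
```

With these inputs the departure from additivity is .3 on the correlation scale, but its standard error is larger than that, so the test comes nowhere near rejecting the null. The comparison counts as "consistent" simply because it is uninformative.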

So that is what I think.  The study represents an impressive, massive effort; I don't know that it produced anything we didn't know before.  It represents a new style of behavior genetics that I have come to think of as "maximalist".  The authors of this study are not hereditarian, in fact the article hardly takes a theoretical position at all on basic nature-nurture issues.  Instead they amass enormous amounts of evidence in support of a hypothesis that isn't in itself very surprising.  Everything is heritable.  GCTA showing that intelligence is heritable and polygenic is maximalist.  GCTA is a formidable effort in quantitative genetics, but we already knew that intelligence was heritable and polygenic.  The GWAS of educational attainment showing that with half a million people you can find a SNP significant at 10^-8 that accounts for a quarter of a percent of the variance is maximalist.  The PGC is maximalist.  Maximalist BG takes the hard issues in BG-- that everything is heritable, but it is hard to get from heritability to meaningful understanding of process, that there don't seem to be genes of substantial effect for anything behavioral-- and instead of grappling with them, tries to bury them in an avalanche of data.

Many of my colleagues, I'm sure, do not agree with what I have written here.  Good.  One of my goals for the next few years is to try to get the field talking about the important questions we face at the interface of data collection and theory.  Unfortunately, that involves disagreeing in public, although I hope we can do so collegially.  So I challenge those of you who disagree-- say so in a comment here, or on the BGAnet Facebook page, or better yet start a blog.  Too many theoretical discussions of BG are old arguments between us and our old opponents-- the EEA people-- and those arguments are in my opinion mostly played out.  It will be much more interesting for us to talk to each other.

Tuesday, February 19, 2013

Development and Psychopathology

Chris Beam and I have a new paper out at Development and Psychopathology.  Link here.

Monday, February 18, 2013

Realism and Gloom

Steve Hsu replied to my blog post.  (Almost a week ago!  He has had about ten blog posts since then.  I have never been able to keep up with the pace of blogging.)

Steve wonders why I am so gloomy about the prospects for an explanatory genetic science.  His optimism is based on a "model" linked here.

With all due respect, that isn't much of a model.  All it says is that SOMEHOW, something like height or IQ has to be instantiated by all the genes that make up the identical twin correlation.  All the main effects, plus all the (unspecified) interactions and nonlinear terms.  Well sure, but that isn't saying anything except that identical twins are some kind of existence proof that it is possible for all the information to be added up in an organism to develop into a phenotype.  I joked in my last post that we could predict height or IQ if we could grow an identical twin for each of us.  That is what organisms are:  developmental computations over the near-infinite dimensionality of the gene-and-environment space.  The problem is that we can't figure out how to reproduce that process using any finite combination rule on the actual DNA.  It's like saying that in theory we ought to be able to predict the weather on the first Tuesday in March 2017, if we just get enough data, and use a model that combines all the linear and non-linear combinations.  Except we can't, because a) It's a completely hypothetical argument, and b) There is chaotic non-linearity in between here and there.

Tim Bates also replies, here, mostly in the context of IQ.  Some of the post isn't a response to me, but to hard-core environmentalists who believe "twin studies are fatally flawed" and that kind of thing, which has never been me.

The main point of his post is to wonder what is going to happen as sample sizes get bigger and bigger, allowing us to detect statistically significant effects of alleles with smaller and smaller effects.  Tim expects that the genes that are identified will cluster in understandable biogenetic pathways, leading to cumulative brain science about intelligence.  Maybe, but how does he know this?  He cites height, but a quick glance at the BGAnet Facebook group will show that even the claim that height genes make sense is pretty controversial.  One thing that isn't going to happen as sample sizes increase:  the effect sizes of the SNPs aren't going to go up.  We already have an unbiased estimate of that, and it is gloomy.

But here is the real point.  Suppose you took the entire research program for IQ: twins and adoptees, on out to GWAS and biochemical pathways, and did it instead for marital status.  We already know that marital status is heritable, and given that it is heritable, I don't see any reason that given big enough sample sizes etc etc, we wouldn't find SNPs that exceed 10 minus whatever. (Or is there an alternative, a way for something to be heritable without having significant SNP associations?)  Would divorce SNPs cluster in biochemical pathways and lead to a neuroscience of marriage?  Genetic reductionists have a choice.  Either you have to explain why SOME things (height, IQ) are headed to genetic explanation via twin studies, GWAS, etc, while OTHER things (divorce, how much TV you watch) get the heritability but not the ultimate genetic explanation.  OR you have to anticipate a world in which everything is explained by combinations of SNPs. Everything is heritable, so either everything is ultimately explainable in genetic terms, or some heritable things can't be decomposed into genetic molecules.

A serious math problem underlies all this.  As sample sizes go up, we increase the power to detect significant effects of smaller and smaller SNPs, with diminishing returns on the total percentage of variance explained.  It seems like it ought to be possible to estimate the distribution of SNP effect sizes from existing data, and then calculate how far out in the distribution we would have to go in order to explain, say, half the variance, which is what we can do easily by just predicting from the parents' IQs.  My guess is that we would have to get way way the hell out in the distribution of effect sizes, by which time the marginal effects would be so ridiculously tiny that the sample sizes required would not be in the tens of thousands but the billions.  As I write this I have the feeling that someone must have already done it.
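Here is a back-of-the-envelope version of the sample-size half of that calculation, as a sketch only. It assumes a single-SNP test in unrelated individuals, where the noncentrality parameter is roughly N times the fraction of variance the SNP explains, a genome-wide significance threshold of 5e-8, and 80% power. The effect sizes in the loop are hypothetical, not estimates from any data set, and the function name is mine.

```python
from statistics import NormalDist


def n_required(var_explained, alpha=5e-8, power=0.80):
    """Approximate N needed to detect a SNP explaining `var_explained`
    of the phenotypic variance with a two-sided test at level `alpha`.

    Normal approximation: N * q / (1 - q) must reach (z_alpha/2 + z_power)^2.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    z_power = nd.inv_cdf(power)
    ncp = (z_alpha + z_power) ** 2
    return ncp * (1 - var_explained) / var_explained


# Hypothetical per-SNP fractions of variance explained, from large to tiny.
for q in (2.5e-3, 1e-4, 1e-6, 1e-8):
    print(f"variance explained {q:.1e}: N of about {n_required(q):,.0f}")
```

Under these assumptions a SNP explaining a quarter of a percent of the variance needs a sample on the order of 16,000, while one explaining a millionth of a percent needs a sample in the billions. How many SNPs sit at each effect size is the part that would have to be estimated from real data.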

Wednesday, February 13, 2013

I am prompted to dust off my little (well, never) used blog by a paper that was just published in Molecular Psychiatry.  I have gotten a bunch of emails about it, mostly from people who seem to think it contradicts my outlook on behavior genetics. Link here, though it is behind a paywall unless your University gets you through it.  I don't have any criticisms of the study itself, really.  It is timely, well-done and interesting.  I just don't think it is revolutionary, or even a harbinger of something revolutionary; it is a new way of demonstrating something we have known for a long time.

The research group put together a large consortium of studies with genome-wide SNP data on samples of children with IQ scores.  They then searched for genome-wide significance for the individual SNPs (and didn't find any, although they are getting closer), conducted a gene- (as opposed to SNP) based analysis that identified one gene with a significant association with IQ, used genome-wide complex trait analysis to show that common SNPs jointly account for a substantial proportion of the variation in IQ, and built a multi-SNP predictor based on the SNPs most strongly related to IQ, which predicted 1.2, 3.5 and .5 percent of the variation in IQ in three replication samples.

What does all this mean?  To understand it, you have to place it in context:  the first of the three assertions in the title, that IQ is heritable, has been perfectly well established by twin and adoption studies for seventy-five years.  It's good to show once again without the twins, but it is hardly news.  The second assertion, that it is highly polygenic, has been pretty obvious for a long time also, and has become more so recently.

But what of the GCTA and the predictive composite?  GCTA is more like a twin study than it is like gene-finding.  SNP arrays are used to define pairwise genomic similarity among "unrelated" individuals, and then genomic similarity is compared to phenotypic similarity.  So yes, the heritability that was detected via quantitative genetics exists down in the SNPs somewhere, but where else would it have been?  When the researchers create composites of actual SNPs, instead of just identifying SNP-based variance, they can account for a weighted mean of 1.7% of the variance, which is a correlation of r=.13.  That, to me, is the bottom line:  if we were to start a program tomorrow to take SNPs from newborns and predict their intelligence, we would do so at a level much worse than predicting from the parents' income, for example, never mind from their IQ.  And this part of the story is not one that we expect to improve as samples get bigger.  The 1.7% was based on all the SNPs, not just those reaching some magical level of significance.
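For readers who want to check the arithmetic, the conversion from variance explained to a correlation is just a square root. A small sketch, using the three replication figures and the weighted mean reported above (the variable names are mine):

```python
from math import sqrt

# Percent of IQ variance predicted by the multi-SNP composite
replication_pct = [1.2, 3.5, 0.5]
weighted_mean_pct = 1.7  # as reported; the weights themselves are not given here

for pct in replication_pct + [weighted_mean_pct]:
    r = sqrt(pct / 100)  # correlation implied by that share of variance
    print(f"{pct}% of variance -> r = {r:.2f}")
```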

What we do expect as samples get bigger is that maybe some individual SNPs will reach that magical level.  Steve Hsu predicts so, here.  I say so what.  Sure, if samples reach into the hundreds of thousands, a few SNPs with truly tiny effect sizes will be significant.  Once again:  no one sensible thought that maybe SNPs weren't associated with intelligence; the twin studies demonstrate that SNPs have to be associated with intelligence.  The real question is whether, short of growing everyone an identical twin, we can figure out the combinatorial rules by which bits of DNA combine, so we can build useful scientific explanations or prediction models.  I still see no signs that we are headed in that direction.

Tuesday, January 3, 2012

I just changed the name of this blog to "The Gloomy Prospect Blog," in the hope of actually using it.  The key to blogging, especially intellectual blogging, is reading.  If you read in a structured and regular way, there is plenty to blog about, otherwise not so much.  I do read a lot about behavior genetics, so it ought to be a matter of just keeping track of what I read.  We'll see.

Why The Gloomy Prospect?  I lifted the phrase from Robert Plomin in a paper I published in 2000, linked here.  Plomin and Daniels' original comment has been reproduced here.

Plomin and Daniels said:

One gloomy prospect is that the salient environment might be unsystematic, idiosyncratic, or serendipitous events such as accidents, illnesses, or other traumas . . . . Such capricious events, however, are likely to prove a dead end for research. More interesting heuristically are possible systematic sources of differences between families. (p. 8).

To which I replied:

The gloomy prospect looms larger for the genome project than is generally acknowledged. The question is not whether there are correlations to be found between individual genes and complex behavior-- of course there are-- but instead whether there are domains of genetic causation in which the gloomy prospect does not prevail, allowing the little bits of correlational evidence to cohere into replicable and cumulative genetic models of development. My own prediction is that such domains will prove rare indeed, and that the likelihood of discovering them will be inversely related to the complexity of the behavior under study.

I think events since 2000 have borne me out:  scientific study of the nonshared environment and molecular aspects of the genome has proven much harder than anyone anticipated.  But I still feel bad about harping on it, as though I am spoiling the good vibes of hardworking scientists, who are naturally optimistic about the work they are conducting.  But ever since I was in graduate school, I have felt that biogenetic science has always oversold its contribution, and tried to convince everyone that the next new method is going to be the one that finally turns psychology into a real natural science, drags our understanding of ourselves out of the humanistic muck.  But it never actually happens.  More on that next time I write.

Sunday, June 26, 2011

Visscher et al on psychiatric genetics

Evidence-based psychiatric genetics, AKA the false dichotomy between common and rare variant hypotheses

Someone on the BGAnet Facebook group posted about this paper (I can't find a full online copy, the link is just to the abstract.)

Peter Visscher and colleagues make the point that old-fashioned genetics language borrowed from Mendelizing rare gene disorders doesn't do a good job describing the genetics of more complex disorders like schizophrenia and depression, and of course they are right. Penetrance, heterogeneity, phenocopy make sense for disorders caused by a single gene or a countable handful of genes, but not for the kinds of complex, mega-polygenic disorders and normal traits that psychiatric geneticists study today.

But I don't think they take the argument far enough. They wind up optimistic about the genetics of schizophrenia and depression. Not optimistic that they are heritable-- there is no controversy about that-- but optimistic that studying the genetics of depression via GWAS will sooner or later produce something that looks like the biology of depression. I doubt it. The obstacle, as Visscher et al show, is that right now the SNPs for schizophrenia with the biggest effect sizes account for much less than a percent of the variance of liability, depression even less. Even if you add up all the SNPs that have been reliably associated with schizophrenia you get only about a percent. So Visscher et al draw a comparison with height, which five years ago was at about the same place SNP-explanation-wise, and now, thanks to some truly enormous samples, is all the way up to 10%.

But so what? What is to be learned by examining associations between phenotypes and individual SNPs that account for minuscule percentages of the variance? OK, if you have half a million participants they are statistically significant, but where is the evidence that over the long run finding more and more, tinier and tinier statistically significant SNPs is going to add up to a theory of anything? Maybe at some point something points to some kind of biological pathway which can then be studied using other methods. If so, fine, but I can't get over the impression that the primary motivation for all of the GWAS is to re-document the by now obvious fact that one way or another all these things are heritable, that SOMEHOW all that genetic variation produces a correlation in schizophrenia liability of .7 in identical twins. But we already knew that.

One piece of old-fashioned genetic language that Visscher et al don't give up is "causal variant." SNPs work because they are in linkage disequilibrium with the "causal variants" that actually cause height and schizophrenia. In some very literal sense that has to be true, if all the word causal means is, "is associated with." What if there is a variant that predisposes young children to interact less successfully with their mothers, which causes them to get less food, which causes them to be a little shorter? Is that a "causal variant" for height? The real conclusion about massively polygenic developmental systems is that the idea of causal variants doesn't really apply. Genes are inputs into complex developmental systems, out of which phenotypes emerge on the other end. In an absolute sense, there are no "genes for" anything, and no genes that qualify as "causes" for anything.

One quick test I apply to all bio-genetic accounts of complex behavioral phenotypes. Every time Visscher et al refer to schizophrenia or depression, substitute "marital status." It is heritable, about as heritable as depression. Do you doubt that a GWAS with half a million participants would find a few significant associations? How could it be otherwise? But is there any sense to be found in talk about "causal variants" for divorce? Are there human traits to which this whole line of genetic explanation does not apply? If so, how do we know the difference? Once upon a time, everyone thought quantitative genetics would do it.... some things would turn out to be really genetic, others, well, whatever the opposite of genetic was supposed to be. It didn't work, and it isn't working now.