Monday, February 18, 2013

Realism and Gloom

Steve Hsu replied to my blog post.  (Almost a week ago!  He has had about ten blog posts since then.  I have never been able to keep up with the pace of blogging.)

Steve wonders why I am so gloomy about the prospects for an explanatory genetic science.  His optimism is based on a "model" linked here.

With all due respect, that isn't much of a model.  All it says is that SOMEHOW, something like height or IQ has to be instantiated by all the genes that make up the identical twin correlation.  All the main effects, plus all the (unspecified) interactions and nonlinear terms.  Well sure, but that isn't saying anything except that identical twins are some kind of existence proof that it is possible for all the information to be added up in an organism to develop into a phenotype.  I joked in my last post that we could predict height or IQ if we could grow an identical twin for each of us.  That is what organisms are:  developmental computations over the near-infinite dimensionality of the gene-and-environment space.  The problem is that we can't figure out how to reproduce that process using any finite combination rule on the actual DNA.  It's like saying that in theory we ought to be able to predict the weather on the first Tuesday in March 2017, if we just get enough data, and use a model that combines all the linear and non-linear combinations.  Except we can't, because a) It's a completely hypothetical argument, and b) There is chaotic non-linearity in between here and there.

Tim Bates also replies, here, mostly in the context of IQ.  Some of the post isn't a response to me, but to hard-core environmentalists who believe "twin studies are fatally flawed" and that kind of thing, which has never been me.

The main point of his post is to wonder what is going to happen as sample sizes get bigger and bigger, allowing us to detect statistically significant effects of alleles with smaller and smaller effects.  Tim expects that the genes that are identified will cluster in understandable biogenetic pathways, leading to cumulative brain science about intelligence.  Maybe, but how does he know this?  He cites height, but a quick glance at the BGAnet Facebook group will show that even the claim that height genes make sense is pretty controversial.  One thing that isn't going to happen as sample sizes increase:  the effect sizes of the SNPs aren't going to go up.  We already have an unbiased estimate of that, and it is gloomy.

But here is the real point.  Suppose you took the entire research program for IQ: twins and adoptees, on out to GWAS and biochemical pathways, and did it instead for marital status.  We already know that marital status is heritable, and given that it is heritable, I don't see any reason that given big enough sample sizes etc etc, we wouldn't find SNPs that reach significance at 10 to the minus whatever. (Or is there an alternative, a way for something to be heritable without having significant SNP associations?)  Would divorce SNPs cluster in biochemical pathways and lead to a neuroscience of marriage?  Genetic reductionists have a choice.  Either you have to explain why SOME things (height, IQ) are headed to genetic explanation via twin studies, GWAS, etc, while OTHER things (divorce, how much TV you watch) get the heritability but not the ultimate genetic explanation.  OR you have to anticipate a world in which everything is explained by combinations of SNPs. Everything is heritable, so either everything is ultimately explainable in genetic terms, or some heritable things can't be decomposed into genetic molecules.

A serious math problem underlies all this.  As sample sizes go up, we increase the power to detect significant effects of smaller and smaller SNPs, with diminishing returns on the total percentage of variance explained.  It seems like it ought to be possible to estimate the distribution of SNP effect sizes from existing data, and then calculate how far out in the distribution we would have to go in order to explain, say, half the variance, which is what we can do easily by just predicting from the parents' IQs.  My guess is that we would have to get way way the hell out in the distribution of effect sizes, by which time the marginal effects would be so ridiculously tiny that the sample sizes required would not be in the tens of thousands but the billions.  As I write this I have the feeling that someone must have already done it.
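
For anyone who wants to play with the arithmetic, here is one way the calculation could be set up.  It is only a sketch, and every number in it is made up for illustration:  the exponential shape of the effect-size distribution, the count of 10,000 causal SNPs, and the total heritable variance of 0.8 are all assumptions, not estimates from real data.  The function names are hypothetical too.  The sample-size formula is the usual normal approximation for detecting a single SNP that explains a fraction q of the variance, at the conventional genome-wide threshold of 5e-8 and 80% power.

```python
import math
from statistics import NormalDist


def snp_variances(n_snps, total_h2):
    """Per-SNP variance explained, largest first.

    ASSUMPTION: effects follow an exponential shape, approximated by its
    quantiles and rescaled so the SNPs sum to total_h2.
    """
    raw = [-math.log((i + 0.5) / n_snps) for i in range(n_snps)]
    scale = total_h2 / sum(raw)
    return [r * scale for r in raw]


def snps_needed(variances, target):
    """Return (k, q): how many of the largest SNPs it takes to reach
    `target` variance explained, and the variance q of the smallest one used."""
    total = 0.0
    for k, q in enumerate(variances, start=1):
        total += q
        if total >= target:
            return k, q
    raise ValueError("target exceeds the total variance in the assumed distribution")


def sample_size(q, alpha=5e-8, power=0.8):
    """Approximate N needed to detect a SNP explaining fraction q of variance
    (two-sided test, normal approximation)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return math.ceil(z_sum ** 2 * (1 - q) / q)


# Illustrative run with the made-up parameters described above.
variances = snp_variances(n_snps=10_000, total_h2=0.8)
k, smallest = snps_needed(variances, target=0.5)
print(f"SNPs needed: {k}, smallest variance explained: {smallest:.2e}")
print(f"approximate N to detect that SNP: {sample_size(smallest):,}")
```

The answer depends almost entirely on the assumed shape of the distribution and the assumed number of causal SNPs, which is exactly the part that would have to be estimated from existing data.  Swapping in a heavier-tailed or more polygenic distribution changes the required N by orders of magnitude.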
