The reaction to the analysis has been a little strange. I read the meta-analysis as broadly supportive of the existence of the effect; more details on that below. To the extent I have some personal attachment to the finding, which I admit, I felt good about it.
But to my surprise, the reaction on Twitter sounded as though they had disconfirmed the finding. I think the first tweet I saw was this one....
Boy did he get it wrong, I thought, and (as I end up regretting every time) shot off a reply:
@StuartJRitchie Turkheimer's GxSES study is always cited even though it's an extreme outlier in literature, as shown in recent meta-analysis — Timofey Pnin (@pnin1957) December 7, 2015
To my surprise, Tim Bates said that in fact they were right....
@pnin1957 @StuartJRitchie @tuckerdrob @timothycbates Hey authors of said meta-analysis-- would you say this statement is true? — Eric Turkheimer (@ent3c) December 14, 2015
@StuartJRitchie Yes of course, answer all the comments. Nail Turkheimer (2003). — James Thompson (@JamesPsychol) December 7, 2015
What was going on? Had I completely misread the paper?
It turns out that all of this had started up in response to a more general blog post by Stuart Ritchie (I hope to find the time to comment on that whole argument), in which he described the effect size as an "outlier".
Upon further twittering, Stuart agreed that "extreme outlier" (by the other poster) was an overstatement, but still... what exactly did Tucker-Drob and Bates show? The effect size we reported in 2003 was indeed the largest in the meta-analysis, however:
1) It certainly wasn't an "extreme outlier" and, other than being the largest, wasn't an outlier at all. It was reasonably close to the high side of the other estimates. Plus, TD&B performed a quantitative test of the null hypothesis that all US effect sizes were drawn from the same population, and failed to reject it.
2) Not only was the average effect size non-zero in the US, it remained non-zero if Turkheimer (2003), or even every study I have had anything to do with, was removed from the analysis.
3) Although the mean effect size was smaller than what we reported, it was by no means trivial in the "p<.05 even though it's ridiculously tiny" sense. The TD&B graph of the effect looks almost identical to what we reported, with a somewhat less dramatic y-axis. Still lots of variation in heritability across SES.
4) The whole process here, in which (quasi-, we weren't the first) discovery samples turn out to have unrealistically large effect sizes, is a perfectly normal part of science. One discovers things because they are big; after that, when everyone is looking for the effect, big or small, the estimates shrink. The winner's curse. All one can hope as an early adopter of a finding is that it remains (a) significantly non-zero, and (b) substantively non-trivial, and the Scarr-Rowe interaction accomplished this, at least in the US.
5) The US vs. Europe part is interesting, but there is a perfectly reasonable environmental basis for speculation about why it would happen, which is that there is greater socioeconomic inequality in the US. Speculation, yes, but perfectly good fodder for future studies. The conclusion that it doesn't happen in Europe, by the way, is exactly what I have concluded in writing at least three times.
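For readers who want to see the winner's curse from point 4 in action, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: the true effect of 0.2, the 50 observations per study, the z > 1.96 "discovery" cutoff, and the function names. None of it is taken from the TD&B data or their method.

```python
import random
import statistics


def simulate_study(true_effect, n, rng):
    """Simulate one study: n noisy observations around a fixed true effect.

    Returns (estimate, standard_error). Unit-variance noise is an assumption.
    """
    sample = [rng.gauss(true_effect, 1.0) for _ in range(n)]
    estimate = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / n ** 0.5
    return estimate, std_error


def winners_curse(true_effect=0.2, n=50, n_studies=2000, seed=0):
    """Compare the mean of all estimates with the mean of 'discoveries' only.

    A 'discovery' here is a study whose estimate is significantly above
    zero by a rough z > 1.96 rule (a hypothetical selection rule).
    """
    rng = random.Random(seed)
    studies = [simulate_study(true_effect, n, rng) for _ in range(n_studies)]
    all_estimates = [est for est, _ in studies]
    discoveries = [est for est, se in studies if est / se > 1.96]
    return (
        statistics.fmean(all_estimates),
        statistics.fmean(discoveries),
        len(discoveries),
    )


if __name__ == "__main__":
    mean_all, mean_disc, n_disc = winners_curse()
    print(f"mean of all studies:       {mean_all:.3f}")
    print(f"mean of discoveries only:  {mean_disc:.3f} ({n_disc} studies)")
```

In a setup like this, the studies that clear the significance bar overestimate the effect on average, while the full set of studies centers on the true value. That is the pattern described above: the early, attention-getting estimate shrinks once everyone's results are pooled, yet the pooled effect can still be clearly non-zero.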
It doesn't seem to me that TD&B leaves much for anyone to argue about. The interaction happens, with moderate ES, in the US but not Europe. Period. Why exactly it happens is unknown. I have already confessed to the somewhat unscientific feeling of commitment I have to the effect. I think those on the other side, with more of a hereditarian bent than me, should be big enough to admit that they are actively rooting against it, because it threatens their global assumption that genes always predominate. In fact I think the best thing about the TD&B paper is how even-handed it is.