Apology RE Clark et al.

[Originally posted to my older website on 22 June 2020 https://web.uvic.ca/~dslind/sites/default/files/Lindsay%20Statement%20RE%20Cark%20et%20alia%2022%20June.pdf]

I write regarding a recent Psychological Science article entitled “Declines in religiosity predict increases in violent crime—but not among countries with relatively high average IQ,” by Cory Clark, Bo Winegard, Jordan Beardslee, Roy Baumeister, and Azim Shariff. I was Editor in Chief when the manuscript was submitted and when it was accepted, so I am responsible. I am no longer Editor in Chief; my term ended with 2019. I am writing this statement as a private individual, and the views expressed here are my own.

Over the last few weeks, many have criticized Clark et al. on multiple grounds. The most important charge, as I understand it, is that the thesis of the article is fundamentally racist or at least feeds into racist narratives and undermines social justice causes. I do not know if any of the authors meant to promote racism, but in retrospect I can certainly see that their article does feed into racist narratives. Strong criticisms of the validity of the measures and analyses have also been raised. On 17 June, Clark et al. asked Editor in Chief Patricia Bauer to retract the article. Patricia has granted that request and the article will soon be retracted.

My aim here is to apologize for my role, as then Editor in Chief, in the handling of the Clark et al. submission, and to share my perspective on how the submission was handled. I would have liked to share these thoughts earlier, but the situation was complex and demanded care and time. Moreover, in my judgment it was important that I stay out of the way of Patricia Bauer’s handling of the situation.

I apologize for three shortcomings on my part, briefly enumerated here and elaborated upon below. First, I deeply regret that I failed to think about the racial implications of the manuscript. Second, I am sorry that I did not require revisions to correct problems with the writing, such as blurred distinctions between psychological constructs and their measures, and speculations/extrapolations far removed from the data. Third, I wish I had done more to investigate the validity of the measures. The second and third failings follow from the first: if I had apprehended the racial implications of the manuscript, I believe I would have handled it with greater care.

Let me provide some context about how Psych Science manuscript submissions are handled. We received 1,693 new submissions in 2019 (an average of about 4.6 new submissions per day). As Editor in Chief, I skimmed each submission and (assuming it passed a very low bar, which more than 99% did) assigned it to a Senior Editor or to myself, depending primarily on whose turn it was to get a new batch of assignments. The Senior Editor read each assigned manuscript and then assigned it to an Editor with relevant subject-area expertise (usually an Associate Editor but sometimes another Senior Editor or me). Each of those two editors judged independently whether the manuscript warranted extended review. If either editor judged the submission to have the potential to be accepted for publication in Psych Science, then it was sent for review (with the Editor with relevant-area expertise in the role of Action Editor).

The Clark et al. submission arrived on 25 January 2019. I assigned it (along with five other new submissions) to a Senior Editor. My email to the Senior Editor included a note: “Probably needless to say, if # 102 [Clark et al.] goes out [for extended review] at least one reviewer should be a Stats Adviser (or at least someone who has serious stats chops).”

The Senior Editor selected another Senior Editor, Jamin Halberstadt, as Action Editor. Both editors judged that the manuscript warranted extended review. Most Psych Science submissions that are sent for review get two or three external reviewers, but Jamin wisely recruited four reviewers for the Clark et al. manuscript (including one recommended by the corresponding author and one Jamin selected based on particularly strong statistical expertise). Judging from their CVs, all four reviewers are eminently qualified to assess this research. All four reviewers provided substantive assessments of the initial submission. The reviews and his own reading of the manuscript led Jamin to invite a revision in a multi-page action letter (more than 3,000 words including the reviews). The authors made extensive revisions, detailed in a 17-page cover letter accompanying the R1 version.

Jamin sent the R1 version to the same four reviewers. Each of them again came through with a review. Three of the four reviewers recommended acceptance. The remaining reviewer recommended rejection based on reservations about the use of homicide as the index of violence. Jamin invited a second revision, and the authors again made substantive revisions, including some addressing the use of homicide to index violence. Jamin and I conferred about the R2 version and agreed that it would be good to ask the authors to temper some of their conclusions. Jamin was satisfied with their response and accepted the R3 version (I don’t recall whether I checked it).

Jamin’s handling of this manuscript was extremely careful and thorough. Three of four expert reviewers unambiguously recommended acceptance. The one dissenting reviewer’s concerns focused exclusively on the homicide measure, and the authors provided a counterargument to those concerns. It would be extraordinary for an editor to reject a manuscript with such positive reviews.

The Clark et al. article was published online on 21 January. It did not seem to attract a lot of attention until Andrew Gelman posted a scorching critique of it (crediting an email he received from Keith Donohue). Near the end of that blog post, Gelman wrote:

I’m surprised Psych Science would publish this paper, given its political content and given that academic psychology is pretty left-wing and consciously anti-racist. I’m guessing that it’s some combination of: (a) for the APS editors, support of the in-group is more important than political ideology, and Baumeister’s in the in-group, (b) nobody from the journal ever went to the trouble of reading the article from beginning to end (I know I didn’t enjoy the task!), (c) if they did read the paper, they’re too clueless to have understood its political implications.

It was (c). I saw the Clark et al. submission as likely to be controversial, but race did not enter into my concern. It seems stupid now that others have pointed it out, but race did not cross my mind. I thought that the authors’ thesis would be controversial due to nationalistic/cultural concerns (as in northern versus southern Europeans) and failed to consider its implications for racial issues. I take little comfort in the fact that I am not the only one with such blinders, but will nonetheless note that Action Editor Jamin Halberstadt, too, reports that the racial implications of this manuscript did not occur to him. None of the reviewers mentioned race. Again, this is not an excuse. It is an admission. [See Peggy McIntosh’s work on the invisibility of racism to Whites.]

I am sorry to have been so clueless. All articles in Psych Science should clearly distinguish measures from constructs and be modest in extrapolating from findings, but Editors should take special pains with submissions that intersect with sensitive cultural issues. As Editor in Chief that was my duty and I failed in it.

In terms of science, Clark et al. may not be worse than some other articles published in Psych Science during my editorship. Psychology is a young science and there are trade-offs between different considerations that sometimes make it appropriate to publish reports despite methodological problems. But the combination of an extremely fraught social issue, methodological weaknesses, and problems in the writing makes for a toxic mess. Extraordinary claims demand extraordinary evidence, and so too do claims that are likely to be socially divisive and hurtful.

I strongly believe that racism is wrong and unjust and that it is a sad, pervasive, long-standing reality. I have unfairly benefited from racism. Millions of people of colour have been harmed by it. I am ideologically anti-racist but I am a product of my cultural background and my insensitivity to the racial implications of the Clark et al. manuscript betrays that fact. I am committed to working against racism in myself and in my culture, including within academia and its journals.

When I learned of the criticisms of Clark et al. on 12 June, I contacted APS and the new Editor in Chief of Psychological Science, Patricia Bauer. In my view, Patricia’s response to this crisis has been exemplary. She wrote a thoughtful and insightful editorial that was to be distributed by APS on 19 June. Patricia invested a lot of thought, care, and consultation in crafting that piece, but it was mooted by Cory Clark’s tweet of 17 June requesting retraction. Patricia subsequently wrote a new editorial expressing her views on this matter and the steps she plans to take to address the issues. I thank her for her leadership and wish her the best.

Journals Can and Often Do Enhance Psychological Science

Some research psychologists who champion transparency and replicability have expressed low opinions of journals and journal editors.  I’m enthusiastically on board with efforts to promote transparency and replicability, but I’m also a journal editor.  I have read that journals are vestigial organs that no longer serve valuable functions and that journals are maintained only due to outmoded traditions that make peer-reviewed journal articles the coin of our realm.  Some psychologists have expressed outrage that journals charge people to read their pages.  The argument seems to be that journals add no real value.  In my view, journals can and often do add value.

For example:

->Journals develop and communicate information about standards to scientists (e.g., the APA and the Psychonomic Society have detailed guidelines regarding statistical analyses).  Authors wishing to publish in journals that set such standards are motivated to meet them.  Probably THE quickest way to increase transparency and replicability is to get journal editors to value preregistration, planning for power/precision, and data and materials sharing.

->The review process helps select stronger versus weaker contributions, providing a filter on what is more versus less likely to be worth reading.  It does this imperfectly, but imagine the oceans of content we’d be swimming in if the coin of the realm were the number of preprints posted.

->The review process strengthens good submissions by helping authors improve the exposition, the analyses, the arguments, etc., and sometimes by helping them to design follow-up research that clarifies the meaning of the initially submitted work.

->Copy editing and production enhance readability and make the final product look better.  If folks don’t think that the review and production processes enhance their work, why do they want to post a PDF of the final, post-production version rather than posting the submitted version (which they are always free to do)?

->Journals’ communications/marketing systems help deliver selected and polished works to readers.  For example, This Week in Psychological Science goes out to 30,000+ psychologists.

->For societies such as APS, much of the income generated via journals gets reinvested in psychological science in myriad ways (e.g., defraying the cost of conferences, paying for lobbying efforts, funding grants and awards, etc.).

Some psychologists seem to look forward to a future in which researchers bypass pain-in-the-ass editors, reviewers, copy editors, and all that hassle in favour of posting preprints in central archives. The hope seems to be that the most worthy preprints will naturally attract the most attention and that readers will provide constructive input that leads to ongoing evolution of each paper across multiple versions.  Maybe that will prove to be an effective way of disseminating science.  But I’m not optimistic.

There are problems with the current journal system.  It is my understanding that many large publishers make extraordinarily high profits that are difficult to justify, that strain university library budgets, and that create barriers to access.  That’s deplorable even if professional societies get a big share of the loot and do good things with it.  Journals and publishers vary in the extent to which they add value; the worst of them add very little, and even the best can and should be improved a lot.  I hope that psychologists will direct their efforts toward improving rather than replacing peer-reviewed journals as the primary venue for primary research reports.

SOME REMARKS CONCERNING THE PRESENTATION OF EFFECTIVE SCIENTIFIC TALKS

Professor Nicholas Wheeler, A. A. Knowlton Professor Emeritus of Physics, Reed College, created this document for senior-level undergraduate physics majors, but it generalizes well to all science domains, and scientists with many years of experience stand to learn from it.
Almost at the beginning of my research experience [at Cambridge, in the mid-1930’s] the great articles of Bethe and his collaborators Bacher and Livingston appeared. We had a weekly seminar at which it was decided by the senior members—Dirac, Fowler and Peierls, to mention only four—that we would go through these articles bit by bit. The idea was to allocate part of them to the students, and each week one of us would expound on the blackboard. Our expounding was, I suppose, pretty bad; at any rate the senior people, Darwin and Fowler in particular, thought so. It came ‘round to my turn and sure enough I made a very poor fist of it, and received a terrible pounding for my pains. Fortunately, my exposition was such a shambles that it had to be continued the following week. For the first time in my life I began to think not merely of taking in knowledge but of giving it out. I spent a whole week thinking about expressing myself, about ways of writing on the blackboard, and about illustrative diagrams. The result was that I was able to…
From Fred Hoyle, “Reflections & Reminiscences” in Encounter with the Future (Trident Press, 1965)

INTRODUCTION

In your classwork you have studied the conceptual fundamentals of contemporary physics, and learned something of the analytical and experimental techniques central to the field. Problems of a different order—how to make effective use of the published literature, how to keep a research effort focused and in motion, how to write graceful scientific prose—have been brought vividly to your attention by your thesis activity. The thesis seminar is intended to provide experience in yet another critical area. The oral presentation of scientific material poses problems that, if left unresolved, can very much impede the progress of one’s scientific career. The following remarks are intended to alert you to the fundamentals of effective oral presentation.

Axiom I:           SCIENTIFIC TALKS ARE NOT SIMPLY VERBALIZED TECHNICAL PAPERS.

This distinction is as fundamental as that between painting and music; readers can linger over passages that give them difficulty, but listeners have no such option.

Axiom II:          THE ALLOTTED TIME—WHATEVER IT IS—IS ALWAYS SHORTER THAN YOU THINK.

Speakers and authors labor under distinct constraints, toward distinct goals…and have distinct sets of resources upon which to draw. A speaker’s task is synoptic—to plant the outlines (and nothing more) of one or a few ideas unforgettably in the minds of the individual listeners who comprise the speaker’s momentary audience. A speaker can realistically aspire to no more, and should use every available means to achieve no less.

1st Law:          PROCEED AS QUICKLY AS YOU CAN TO THE SHARPEST POSSIBLE STATEMENT OF THE PROBLEM—OF YOUR MOTIVATION AND GOALS.

Your intent, after all, is to clarify a mystery. A sharp statement of that mystery will serve to engage your listeners’ attention and to provide them with hooks upon which to hang your subsequent remarks.

2nd Law:         DON’T TRY TO PRESENT TOO MANY IDEAS.

“Two ideas per talk” is an excellent average, and one really good idea per talk is actually much better than average. If your subject is complex, phrase your remarks in terms of the simplest illustrative special case: thus (in point of historical fact) was Schrödinger led from the “relativistic Schrödinger equation”—which doesn’t work—to the “non-relativistic Schrödinger equation”—which does!

3rd Law:          OMIT THE COMPUTATIONAL DETAILS THAT LINK YOUR MAJOR POINTS.

Listeners desire to know only about the general strategy of computation, and are—tentatively—willing to take on trust the correctness of your claim that things work out as you state.

4th Law:          STATE—AND RESTATE—YOUR CENTRAL MESSAGE AS SHARPLY, AS VIVIDLY, AS MEMORABLY AS IT IS IN YOUR POWER TO DO.

Redundancy, while often a defect in written work, is indispensable to the effective oral communication of difficult material. If you can reduce your conclusion to a metaphor, a visual image, a point which sharply engages your listeners’ physical intuition, you should not hesitate to do so, for such is the stuff which sticks in the memory. But you should, in the service of honesty, go on to state the sense in which your conclusions are conditional, your metaphor deceptive…your sense of what lies further down the road that you and your listeners have traveled.

What now follows is a list of lesser points, all of which follow as corollaries from the major principles stated above.

MISCELLANEOUS PITFALLS & TRICKS OF THE TRADE

While an author may (at some risk) write for an “abstract reader,” speakers speak to specific individuals. Speak not to the blackboard nor to the back wall but to the faces of your listeners. Imagine yourself to be seated among them. Read their faces: they are a resource that distinguishes your predicament from that of an author, and will tell you instantly when you have lapsed into obscurity.

If knowledge and a deep desire to be understood are your strongest offensive weapons, humility is your strongest defense. Resist the urge to show off. Remember that wit that does not serve your informative purpose is wit misapplied. And cite your sources. The latter point of courtesy will, after all, release you from any obligation to develop details that interested listeners can find in the published literature.

You should expect to expose specific methodological points germane to your central idea, but resist the urge to review standard derivations of standard material. You haven’t time. Historical remarks are—if accurate—useful, but should not be allowed (which is their tendency) to displace your main points.

Since blackboard work takes time—not in itself a bad thing, for it gives your listeners time to reflect—you may find that the tempo and ultimate impact of your talk are improved if references and secondary detail are written out in advance for distribution as handouts. Graphic aids are helpful—they help you to engage not only the ear but also the eye of your listeners, and to introduce an element of variety—but for maximal impact must be thoughtfully designed and tested in advance. Remember that transparencies may tempt you to go faster than any listener can follow with comprehension, and that poor graphics are worse than no graphics; by their use you have, after all, denied listeners the option of saying (for example) “Write larger!”

Scientific lectures take place in theatres, and are themselves a kind of theatre.  It follows that it is appropriate/important to give attention to the overall design of your presentation—to matters of rhythm, tempo and timing, to the sequence in which you pull the rabbits from your hat, to the willful (but invisible) fabrication of an element of surprise, of climax. It is classical sonata form, not the phonebook, that should serve as your model. It is, moreover, entirely appropriate to rehearse your presentation … and when it is over to ask yourself (and your friends!) “In what respects did my talk succeed (and why), in what respects did it fail (and why), how could it have been improved?”

A speaker who does not speak (up) is—by definition—not a speaker. Actors and singers—even those with small voices—accept it as a professional obligation to be “heard ringingly in the back of the house”… and so also must scientific speakers. The “inaudibility problem” (analog of the “writing too small problem”) derives mainly from three circumstances: (1) inattention (no excuse); (2) modesty (misplaced); (3) lack of scientific confidence. In cases of the latter type the speaker is, in point of fact, not yet ready to speak, and should redesign his/her talk until the circumstances that undermine confidence have been identified and either (1) eliminated or (2) frankly confessed. Cases of the former types are more easily dealt with, by practice with the assistance of the above-mentioned friends; it takes conscious effort to fill with sound a room full of acoustically-absorbent protoplasm.

Effective scientific speaking is an acquired skill. But of all the skills that a physicist must possess it is one of the most easily acquired. All that is required is conscious attention to the fundamentals… and practice, much practice.

19 Things Editors of Experimental Psychology Journals Can Do to Increase the Replicability of the Research They Publish

The challenges confronting psychological science can be divided into two categories: hard and easy.  The hard problems have to do with developing measures, methods, theories, and models that will enable the development of a genuinely useful science of psychology.  Solving those problems will take many smart and creative people a long time.  The easy problem is to stop publishing so many crappy, underpowered, p-hacked experiments that don’t replicate.  Replication is not the only criterion for a science, but it is a fundamental one.

I assume that engaged editors read and think about each submission before looking at the reviews, and don’t just tally reviewers’ “votes” but rather make their best judgment as to whether the submission meets (or has decent prospects of meeting via revisions) criteria for publication in their journals. That’s just the basics of doing the job.  Here are some other things that editors can do with the specific aim of increasing the replicability of the findings they publish.

1. Sign on to and endorse the Transparency and Openness Promotion guidelines.

2. Encourage and reward detailed preregistrations (Lindsay, Simons, & Lilienfeld, 2017).

3. Be wary of papers that report a single underpowered study with surprising findings, especially if critical p values are greater than .025 (Lindsay, 2015).

4. If you think that the work reported in a manuscript has potential but you doubt its replicability, consider inviting a revision with a preregistered replication, perhaps under terms of a Registered Report.

5. Encourage (perhaps even require) authors to share de-identified data and materials with reviewers (and more widely after publication) (Lindsay, 2017).

6. If you have Associate Editors, ensure that they have appropriate stats/methods chops and are committed to promoting transparency and replicability.

7. Consider recruiting Statistical Advisers (i.e., psychologists with high-level statistical expertise who are willing to commit to providing consulting service to you and Associate Editors when need arises).

8. Ensure that each submission that is sent for review goes to at least one reviewer who has serious stats/methods chops.

9. Require authors to provide a compelling rationale as to why the sample size of each reported study is appropriate (see, e.g., Anderson, Kelley, & Maxwell, 2017).  Consider not only the number of subjects but also the number of observations per subject per construct.  Precedent is a weak basis for sample-size decisions, because psychologists have a long history of conducting many underpowered experiments and publishing the few that yield statistically significant effects; underpowered studies will yield non-significant results unless they exaggerate the effect size (see the shiny app by Felix Schönbrodt implementing Daniel Lakens’s spreadsheet on this issue at http://shinyapps.org/; see also Brunner and Schimmack, 2016).  There are no hard and fast rules, and statistical power must be considered in the context of other considerations that may limit sample size (e.g., difficulty, rarity, riskiness, urgency).  A minimal power-analysis sketch appears after this list.

10. Require report of an index of precision for averages of dependent variables (e.g., 95% confidence intervals [Cumming, 2014] or credible intervals [Morey, 2015]); when appropriate, require report of effect sizes and a measure of their precision.  Make sure that authors make clear how these things were calculated.  (The sketch after this list includes a worked CI example.)

11. Require fine-grained graphical presentations of results that show distributions (e.g., scatterplots, frequency histograms, violin plots) and look at and think about those distributions.

12. Don’t let authors describe a non-significant NHST result as strong evidence for the null hypothesis (e.g., Lakens, 2016), nor describe a pattern in which an effect is significant in one condition or experiment and not significant in another as if it evidenced an interaction (Gelman & Stern, 2006; a small numeric illustration follows this list).

13. Attend to measurement sensitivity, reliability, validity, manipulation checks, demand characteristics, experimenter bias, confounds, etc.

14. Require authors to address in the manuscript the issue of known/anticipated constraints on the generality of the findings (Simons, Shoda, & Lindsay, 2017).

15. Use tools such as StatCheck to detect errors in statistical reporting.

16. Consider inviting submissions that propose Registered Reports (pre data collection).

17. Consider inviting submissions that report preregistered direct replications of findings previously published in your journal (ideally as RRs) (Lindsay, 2017).

18. Publish the action editor’s name with each article.

19. If you become aware of errors in a work published in your journal, work to correct them in an open, straightforward way, whether that involves an erratum, corrigendum, or retraction.  Reach out to Retraction Watch.
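To make the statistical items above concrete, here are two minimal sketches in Python. They illustrate the general techniques, not anything the list itself prescribes; the effect size, alpha level, power target, and toy data are all hypothetical, and the sketches assume the numpy, scipy, and statsmodels libraries.

The first sketch shows an a priori sample-size calculation of the kind item 9 asks authors to justify, and a 95% confidence interval of the kind item 10 asks authors to report:

    # Hypothetical a priori power analysis (item 9) and 95% CI (item 10).
    import numpy as np
    from scipy import stats
    from statsmodels.stats.power import TTestIndPower

    # Sample size per group for an independent-samples t test to detect a
    # hypothetical effect of d = 0.4 with 80% power at two-tailed alpha = .05.
    n_per_group = TTestIndPower().solve_power(effect_size=0.4, alpha=0.05, power=0.80)
    print(f"n per group: {np.ceil(n_per_group):.0f}")  # about 100 per group

    # 95% CI for the mean of a toy dependent variable (made-up scores).
    scores = np.array([4.2, 5.1, 3.8, 4.9, 5.5, 4.0, 4.7, 5.2])
    m, sem = scores.mean(), stats.sem(scores)
    ci_lo, ci_hi = stats.t.interval(0.95, df=len(scores) - 1, loc=m, scale=sem)
    print(f"M = {m:.2f}, 95% CI [{ci_lo:.2f}, {ci_hi:.2f}]")

The second sketch illustrates the Gelman and Stern point in item 12 with made-up numbers: one estimate is significant, another is not, yet the difference between them is nowhere near significant.

    # Two independent effect estimates, each with standard error 1 (hypothetical).
    from scipy import stats

    z_a, z_b = 2.5, 1.0
    p_a = 2 * stats.norm.sf(z_a)         # ~ .012 (significant)
    p_b = 2 * stats.norm.sf(z_b)         # ~ .317 (not significant)
    z_diff = (z_a - z_b) / (2 ** 0.5)    # SE of the difference = sqrt(1 + 1)
    p_diff = 2 * stats.norm.sf(z_diff)   # ~ .289: the difference itself is n.s.
    print(p_a, p_b, p_diff)

In this hypothetical, reporting "significant in condition A but not in condition B" as if it demonstrated that the conditions differ would be unwarranted.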