Terry Speed 试验设计实践派

"This book is about the planning of experiments in which the effects under investigation tend to be masked by fluctuations outside the experimenter's control. " So begins the book you are reading.

“实验者无法控制的诸多变动往往遮蔽了所观测到的效应,而本书正是关于如何在这种情况下设计试验的。”您正在阅读的这本书就从这句话开始了。

Why planning of experiments? Because in the thirty years leading up to this book, statisticians had created a body of experience, methods and theory showing that with good planning, experimenters could deal with those fluctuations. In other words, with good planning and appropriate analyses, experimenters can unmask the effects they are investigating. This was, and still is, an outstanding success story for Statistics, one that needs to be told to every generation of experimenters and statisticians.

为什么要设计试验?因为在这本书出版之前的30年里,统计学家创造了一系列经验、方法和理论,表明通过良好的设计,实验者可以应对这些波动。换句话说,通过良好的规划和适当的分析,实验者可以揭示他们正在研究的效应。这曾经是,现在仍然是,统计学的一个杰出的成功故事,需要告诉每一代实验者和统计学家。

Why D. R. Cox's book? Firstly, Cox was a superb expositor. The evidence lies in the more than 20 books which he authored, co-authored or edited, and in his many papers. This book was his first. Secondly, although there have been many valuable developments in the field since 1958, the key ideas for the planning of experiments have not changed since he wrote his book. Thirdly, Cox really does focus on the planning of experiments, while most other books before and since place a much greater emphasis on the construction of designs and the analysis of planned experiments. And fourthly, it was and still is rare among such books to see statistical and mathematical technicalities largely avoided in order to keep the book accessible to its primary audience, which is experimenters. Although statisticians and data scientists (see below) may well have the mathematical and statistical knowledge used in the design and analysis of comparative experiments, the more they understand and appreciate the concrete issues facing experimenters, the better they will be at carrying out their role. That is why this book is for them too.

为什么是 D. R. Cox 的书?首先, Cox 是一位出色的“解经家”。证据存在于他撰写、合著抑或编辑的20多本书及他的许多论文中。此书是他的第一本著作。其次,尽管自1958年以来试验设计取得了许多有价值的发展,但自他写书以来,该领域的关键思想并没有改变。第三, Cox 确实专注于试验的设计规划,而在他之前及之后的大多数其他书籍都更加强调设计的构建以及对设计过的试验的分析。第四,在此类书中,为了让主要读者(即实验者)能够理解书籍内容而很大程度上避免统计和数学技术细节的做法,在过去和现在都很少见。尽管统计学家和数据科学家(见下文)很可能拥有用于设计和分析对比试验的数学和统计知识,但他们越了解并重视实验者所面临的具体问题,就越能更好地履行自己的职责。这是为什么这本书也适合他们的原因。

Why read a 1958 book now? It is always good to know the history of the ideas you study and the methods you use. Cox's book was written close enough in time to the foundational work in experimental design, and it cites seminal early work, that you can experience the history. However, the field of Statistics has changed enormously in the almost seventy years since Cox's book was published. Indeed, in many places around the world, Statistics has essentially been replaced by Data Science, with much of the material in this book left unmentioned in undergraduate and graduate courses on this topic. Nevertheless, the issues this book deals with remain fundamental to agricultural, industrial and psychological experiments, and to those in many other fields. Designing for the reduction of error at the planning stage (Cox, Chapter 3), using supplementary observations to reduce error at the analysis stage (Cox, Chapter 4), randomization (Cox, Chapter 5) and factorial experiments (Cox, Chapter 6) are perennial themes, and they are not obvious ones. This material will not be readily rediscovered by modern data scientists, no matter how bright. Modern students of Statistics or Data Science need to know these themes. This is a book of modest length, but it packs a punch and demands close reading. The benefits of doing so are enormous.

为什么现在要读一本1958年的书?了解您所研究的思想及所使用的方法的历史总是有好处的。Cox 的书的写作时间与试验设计的基础性工作的形成时期非常接近,并且它引用了开创性的早期工作,因此您可以体验历史。自 Cox 的书出版以来的近70年里,统计学领域发生了巨大的变化。事实上,在世界上许多地方,统计学基本上已经被数据科学所取代,本书中的大部分内容在本科生和研究生课程中都没有提及。尽管如此,本书讨论的问题对于农业、工业和心理实验以及许多其他领域的实验仍然至关重要。在规划设计阶段进行设计以减少误差(Cox,第3章),在分析阶段使用补充观测来减少误差(Cox,第4章),随机化(Cox,第5章)和析因试验(Cox,第6章)是长期存在的、却并非显而易见的主题。无论这些素材本身多么光芒万丈,现代数据科学家们都不会轻易重新发现它们。学习现代统计学或数据科学的学生需要了解它们。此书篇幅适中,但内容丰富,需要仔细阅读。这样做将会获益良多。

Why a Chinese translation now? First, we should all feel indebted to Dr. Zhou for carrying out the translation. There is no doubt that China is at the forefront of data science, including big data and AI. I am less confident that courses on Applied Statistics in China routinely cover material in the first half of Cox's book. It should be clear from my earlier remarks that I think this is highly desirable. Sixty years ago, Pao-Lu Hsu (Xu Bao-Lu), the first scholar to offer courses on probability and statistics in China, gave informal seminars on the design of experiments from his home in the years before he died in 1970. He published in the area using the pseudonym BanCheng, which was meant to cover himself and his collaborating students. The topic of this publication, though not the particular work, is discussed in section 11.4 of Cox's book. I have little doubt that Hsu would join me in urging all Chinese statisticians and experimenters to read this Chinese translation.

为什么现在要翻译成中文呢?首先,我们都应该感谢周教授完成了这本译作。毫无疑问,中国在数据科学,包括大数据和人工智能方面走在最前沿。我不太相信中国的统计学教学通常会涵盖 Cox 的书前半部分的内容。如上所述,可以清楚地看出,我认为教学内容包含它们是非常可取的。60年前,许宝騄(XuBao-Lu),中国第一位开设概率和统计课程的学者,于1970年去世之前的几年中,在家里举办了关于试验设计的非正式研讨会。他在该领域发表论文时使用笔名“班成”,以包括他自己和他的合作学生。那篇论文(虽然并非特殊的工作)的主题在 Cox 的书的11.4节中进行了讨论。我毫不怀疑许先生会和我一起敦促所有中国统计学家和实验者阅读这份中文译本。

I close with a significant recommendation published soon after the book itself appeared. "In summary," wrote the eminent English-American statistician Colin L. Mallows in his 1959 Biometrika review of Cox's book, "this is a book about real statistics... it should be made required reading for all students of statistics." We are greatly indebted to Dr. Zhou for providing this translation.

以这本书出版后不久所发表的一条重要的推荐作为收尾。总而言之,英裔美国统计学家 Colin L. Mallows 1959年在 Biometrika 上对 Cox这本书的评论中写到,这是一本关于真实统计学的书……它应该成为所有统计学学生的必读读物。

Terence P. Speed

July 2024

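Terry Speed (Terence Paul Speed): Australian statistician, Fellow of the Australian Academy of Science and Fellow of the Royal Society. He is a senior principal researcher at the Walter and Eliza Hall Institute of Medical Research and is known for his contributions to the analysis of variance and to bioinformatics, in particular the analysis of microarray data.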