[Translation] Statistical analysis of football: The once and future king
Posted by TBPOAT
(The latest from Game theory)
RIVALRIES in football can come and go, but there is no surer bet for a high-stakes club match than the biannual Clásico between Real Madrid and FC Barcelona, the two titans of Spain’s La Liga. On March 22nd Barça pulled out a spirited 2-1 victory, increasing their margin in the standings over second-place Real Madrid from one to four points. Meanwhile, in the league’s parallel and almost as closely watched scoring race, Cristiano Ronaldo (pictured, at left), Los Blancos’ biggest star, slipped home his 31st goal of the season in the 31st minute, bringing him within one of the 32 scored so far by Barcelona’s Lionel Messi (right).
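Rivalries in football come and go, but no club match is a surer bet for high stakes than the twice-yearly Clásico between Real Madrid and Barcelona, the two giants of Spain's La Liga. On March 22nd Barça took a spirited 2-1 win, stretching their lead in the standings over second-placed Real Madrid from one point to four. Meanwhile, in the league's parallel and almost equally closely watched scoring race, Cristiano Ronaldo (pictured above, left), Los Blancos' biggest star, scored his 31st goal of the season in the 31st minute, closing to within one of the 32 scored so far by Barça's Messi (right).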
Ever since Mr Ronaldo was transferred to Real Madrid in 2009 for a then-record fee of £80m ($130m at the time), setting the stage for him and Mr Messi to square off at least twice a year, they have been regarded by common consent as the two best players in the sport. Their ranking relative to each other, however, has flip-flopped. Mr Messi won the annual Ballon d’Or award issued by FIFA, football’s international governing body, to the top performer of the year for four straight years from 2009-12. During the past two campaigns, however, Mr Ronaldo has captured the prize. There is little mystery as to the cause of this changing of the guard: in a sport whose currency is goals, Mr Ronaldo was money more often than Mr Messi in both 2013 and 2014.
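Ever since Mr Ronaldo moved to Real Madrid in 2009 for a record fee of £80m (about $130m at the time), he and Mr Messi have faced each other at least twice a year and have been widely regarded as the two best players in the sport (today). Their relative ranking, however, has flipped. Mr Messi won the Ballon d'Or for four straight years from 2009 to 2012. The award is given by FIFA (football's international governing body) to the best player of the year. In the past two seasons, however, Mr Ronaldo has won it. The reason for this changing of the guard is no mystery: in a sport whose currency is goals, Mr Ronaldo earned more than Mr Messi in both 2013 and 2014.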
However, not all goals are created equal: their value depends entirely on the context in which they occur. Some land in the 90th minute of a 1-1 tie and secure a win all by themselves; others are tacked on at the same point in a match once a team is already up 3-1. And while the assumption that goals should be distributed more or less normally between high- and low-leverage situations certainly holds over large sample sizes, it often breaks down at the level of specific players. A proper measure of individual scoring output would consider when shots find the net as well as how often. And upon further scrutiny, it turns out that the Ballon d’Or voters—or at least those who based their decision on scoring—appear to have erred. Even during Mr Messi’s two “down” years, Barcelona benefited more from his comparatively modest 86 goals than Real Madrid did from Mr Ronaldo’s prolific 105.
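However, not all goals have the same value: their value depends entirely on the circumstances. Some arrive in the 90th minute of a 1-1 draw and settle the match with a single strike; others are scored when the team already leads 3-1. And although with a large enough sample one can assume that goals are distributed more or less normally between high- and low-leverage situations, that assumption often breaks down at the level of individual players. A proper measure of individual scoring output should consider not only how many goals were scored but also when they were scored. On this closer analysis, it turns out that the Ballon d'Or voters appear to have erred, or at least those who voted on the basis of goals. Even in Mr Messi's two "down" seasons, Barcelona gained more from his comparatively modest 86 goals than Real Madrid did from Mr Ronaldo's prolific 105.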
The statistic that weights goals according to their context is called Expected Points Added (EPA), an application of the Win Probability Added framework originally developed for baseball. By extrapolating from over 4,000 English Premier League matches played from 2001-13, the analytical website SoccerStatistically.com offers an applet that lists the odds of a team’s win, draw or loss at any point in a match given the venue, time remaining and goal margin. Comparing these probabilities immediately before and after a goal shows how much each score changes the expected outcome.
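This statistic is called Expected Points Added (EPA), and it weights each goal according to its context. It is an application of the Win Probability Added framework, which was originally developed for baseball. The analytics website SoccerStatistically.com studied data from more than 4,000 English Premier League matches played from 2001-13 and offers an applet. Given the venue, the time remaining and the goal margin ("让球数目"?), the applet outputs the team's odds of a win, draw or loss. Comparing these probabilities immediately before and after a goal shows how much each goal changes the expected result.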
For example, take the two situations mentioned above. With the score tied in the 90th minute, a team playing at home has an 11% chance to win, 82% to draw and 8% to lose. Multiplying by three points in the standings for a win, one for a draw and zero for a loss, its expected points (EP) are 1.13. With a one-goal lead, in contrast, its win probability shoots up to 95% and its draw odds fall to 5%, which equates to 2.89 EP. The gap between them of 1.76 points is the EPA associated with such a critical goal. In the alternate situation, with a two-goal lead in the 90th minute, the home team already has a 99.7% chance to win and 0.3% to draw. Adding on a final goal as an exclamation point is worth just 0.007 of EPA—250 times less than the tie-breaker.
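For example, take the two situations mentioned above. With the score level in the 90th minute, a team playing at home has an 11% chance of winning, an 82% chance of drawing and an 8% chance of losing. Expected points follow the standard formula: the win probability times three, the draw probability times one and the loss probability times zero, which gives expected points (EP) of 1.13 (1.15?). With a one-goal lead, its win probability jumps to 95% and its draw probability falls to 5%, which equates to 2.89 EP (2.90? Apparently the formula is not simple arithmetic). The difference between 2.89 and 1.13, 1.76 points, is the EPA of this crucial goal. In the other situation, with the home team already two goals up in the 90th minute, it has a 99.7% chance of winning and a 0.3% chance of drawing, so adding one more goal is worth only 0.007 of EPA. That icing-on-the-cake goal is worth just one 250th of the tie-breaker.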
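The expected-points arithmetic in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `expected_points` is made up here, and the inputs are the rounded percentages quoted in the article. With those rounded inputs the results come out as 1.15, 2.90 and a gap of 1.75, slightly off the published 1.13, 2.89 and 1.76, which were presumably computed from unrounded probabilities.

```python
def expected_points(p_win, p_draw, p_loss=0.0):
    """Expected league points: 3 for a win, 1 for a draw, 0 for a loss.

    Hypothetical helper for illustration; not code from the article.
    """
    return 3 * p_win + 1 * p_draw + 0 * p_loss


# Home team, 90th minute, score level (rounded figures from the article).
ep_tied = expected_points(0.11, 0.82, 0.08)

# Same moment, but with a one-goal lead.
ep_lead = expected_points(0.95, 0.05)

# EPA of the goal is the change in expected points it causes.
epa_goal = ep_lead - ep_tied

print(f"EP tied: {ep_tied:.2f}, EP leading: {ep_lead:.2f}, EPA: {epa_goal:.2f}")
```

The small mismatch with the article's figures is the same one the translator flagged in parentheses; it comes from rounding the inputs, not from a different formula.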
http://cdn.static-economist.com/sites/default/files/imagecache/original-size/images/2015/03/blogs/game-theory/20150328_woc641.png
The only necessary adjustment to this simple calculation comes on penalty kicks. As soon as one team commits a foul, their opponents’ expected points increase by the EPA of a goal multiplied by the league’s average spot-kick conversion rate, which in La Liga has been 78% since 2009-10. If the designated kicker nails it, he is then credited with the remainder, while if he flubs it, his account is debited 78% of the goal’s value for the missed opportunity. As an illustration, take Real Madrid’s home match on March 1st. It remained scoreless in the 52nd minute, giving Los Blancos 1.60 EP. When Villarreal was called for a foul, the home team gained a 78% chance to score a goal that would improve their EP to 2.46, a difference of 0.86 points. Mr Ronaldo duly netted his spot kick, adding 0.19 points of EPA (the remaining 22% of 0.86) to his ledger. Conversely, in a match on February 24th Barcelona led Manchester City 2-1 four minutes into added time, for an EP of 2.90. City fouled in the penalty area, raising Barcelona’s EP to 2.98 by giving them 78% odds of scoring a goal worth 0.10 of EPA. But Mr Messi missed, dropping his team’s EP back down to its previous level, and thus reducing his EPA total by 0.08.
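The only adjustment this simple calculation needs is for penalty kicks. As soon as one team commits a foul, its opponents' expected points rise by the EPA of a goal multiplied by the league's average penalty conversion rate, which in La Liga has been 78% since 2009-10. If the penalty is scored, the taker is credited with the remainder of the goal's value; if it is missed, the taker is debited 78% of the goal's value. An illustration: Real Madrid's home match on March 1st. In the 52nd minute it was still scoreless, giving Los Blancos 1.60 EP. When Villarreal committed a foul, the home team gained a 78% chance of scoring a goal that would lift its EP to 2.46, a difference of 0.86 points. Mr Ronaldo duly converted the penalty, adding 0.19 points of EPA (the remaining 22% of 0.86) to his account. Conversely, on February 24th Barcelona led Manchester City 2-1 four minutes into added time, for an EP of 2.90. City fouled in the penalty area, raising Barça's EP to 2.98 (the 78% conversion rate multiplied by the 0.10 of EPA that a goal at 2-1 would be worth). Mr Messi missed the penalty, dropping his team's EP back to its previous level and reducing his own EPA total by 0.08.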
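The penalty adjustment described above can be written out as a short Python sketch. The names `penalty_split` and `LA_LIGA_CONVERSION` are hypothetical, and the split follows only what the text states: the team's EP moves by 78% of the goal's value when the penalty is awarded, and the taker gets the remaining 22% for scoring or loses the 78% for missing.

```python
LA_LIGA_CONVERSION = 0.78  # league-average conversion rate quoted in the article


def penalty_split(ep_before, ep_if_scored, scored, conversion=LA_LIGA_CONVERSION):
    """Return (team EP shift at the award, taker's EPA) for one penalty.

    Hypothetical helper for illustration; not code from the article.
    """
    goal_value = ep_if_scored - ep_before
    # The team's expected points rise by this much as soon as the foul is given.
    shift_at_award = conversion * goal_value
    if scored:
        # The taker is credited only with the part not already priced in.
        taker_epa = (1 - conversion) * goal_value
    else:
        # A miss gives back everything the award had added.
        taker_epa = -conversion * goal_value
    return shift_at_award, taker_epa


# Ronaldo v Villarreal: EP 1.60 before, 2.46 if scored, penalty converted.
_, ronaldo_epa = penalty_split(1.60, 2.46, scored=True)

# Messi v Manchester City: EP 2.90 before, goal worth 0.10, penalty missed.
city_shift, messi_epa = penalty_split(2.90, 3.00, scored=False)

print(round(ronaldo_epa, 2), round(2.90 + city_shift, 2), round(messi_epa, 2))
```

Run on the two cases in the text, the sketch reproduces the quoted figures: 0.19 for the converted penalty, a team EP of 2.98 after the award, and a debit of 0.08 for the miss.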
So what happens when this analysis is applied to every single goal scored by the game’s two leading lights in La Liga, the pan-European Champions League and the World Cup during 2013 and 2014? Mr Ronaldo’s edge all but vanishes. The pride of Portugal’s 105 goals contributed 41.6 EP to Real Madrid and his national squad, an average of 0.40 EPA per goal. Although he assured himself a second straight Ballon d’Or with three goals in the semi-final and final of last year’s Champions League, all of them were mere pile-ons. In the return leg against Bayern Munich, he scored when Los Blancos already led 3-0 and 4-0 in the aggregate, for a combined EPA of just 0.29. And in the final, his score was a penalty kick in extra time when Real Madrid already enjoyed a 3-1 lead, yielding a trivial 0.004 of EPA. In contrast, the supposedly slumping Mr Messi squeezed 40.3 EPA from his 86 goals, an average of 0.47 each. He showed a remarkable knack for scoring when it counted: on five different occasions in 2013 and 2014, he netted a tie-breaking goal in the final 20 minutes of a contest. In other words, the Argentine’s 20% deficit in raw goals relative to Mr Ronaldo was almost entirely offset by a 20% advantage in the importance of the goals he did score.
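So what happens when this analysis is applied to every goal scored by the two leading men in La Liga, the Champions League and the World Cup during 2013 and 2014? Mr Ronaldo's edge all but vanishes. The Portuguese's proud tally of 105 goals contributed only 41.6 EP to Real Madrid and his national team, an average of just 0.40 EPA per goal. Although he secured a second straight Ballon d'Or with three goals in the semi-final and final of last year's Champions League, every one of them was mere icing on the cake. In the return leg against Bayern Munich, he scored when Real Madrid already led 3-0 and 4-0 on aggregate, for a combined EPA of just 0.29. And in the final, his goal was an extra-time penalty when Real Madrid already held a 3-1 lead, yielding a trivial 0.004 of EPA. By contrast, the supposedly slumping Mr Messi squeezed 40.3 EPA from his 86 goals, an average of 0.47 each. He showed a remarkable knack for scoring when it mattered: on five separate occasions in 2013 and 2014, he scored a tie-breaking goal in the final 20 minutes of a match. In other words, the Argentine's 20% deficit in raw goals relative to Mr Ronaldo was almost entirely offset by a 20% advantage in the importance of the goals he did score.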
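The per-goal averages and the two "20%" figures can be checked directly from the totals quoted above. This is a rough Python check, not the article's own calculation, and the variable names are invented for illustration. On these totals both gaps come out nearer 18%, so the article's 20% looks like a rounded figure.

```python
# Totals for 2013 and 2014 as quoted in the article.
ronaldo_goals, ronaldo_epa = 105, 41.6
messi_goals, messi_epa = 86, 40.3

# Average EPA per goal for each player.
ronaldo_avg = ronaldo_epa / ronaldo_goals
messi_avg = messi_epa / messi_goals

# Messi's shortfall in raw goals, as a fraction of Ronaldo's tally.
goal_deficit = 1 - messi_goals / ronaldo_goals

# Messi's advantage in average value per goal.
value_advantage = messi_avg / ronaldo_avg - 1

print(f"{ronaldo_avg:.2f} {messi_avg:.2f} {goal_deficit:.1%} {value_advantage:.1%}")
```

Because the deficit in goals and the advantage in value per goal are nearly the same size, the two totals (41.6 and 40.3) end up close together, which is the point the paragraph makes.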
Moreover, even this metric fails to do full justice to the superior timing of Mr Messi’s scoring. Just as important as when a player’s goals occur within a match is which matches they occur in. And Mr Messi’s goals—particularly his match-winners—have been heavily concentrated in his teams’ most crucial contests.
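In fact, even this metric does not do full justice to the timing of Mr Messi's goals. When a goal occurs within a match matters, and which match it occurs in matters just as much. Mr Messi's goals, especially his match-winners, have been heavily concentrated in his teams' most important matches.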
There is no straightforward way to assign weights to different matches. Only the most die-hard club supporter would consider a La Liga championship to matter as much as a World Cup title—but who’s to say precisely how much more the Cup is worth? One potential measure is television viewership. Using readily available statistics from Britain, and adjusting for whether England was playing and whether the match was broadcast over the free airwaves or only via satellite, Champions League matches ranged from having an equivalent audience to a domestic-league contest (in the group stages) up to three times as much for the final. And interest in the World Cup never fell below a multiplier of 4.7, soaring up to 15.2 for the championship game. Finally, using a different methodology, Clásicos are often said to be worth twice as much as other La Liga matches, since the victors of these Barcelona-Real Madrid contests both secure three points for themselves and deny three points to their chief rival.
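There is no simple way to assign weights to different matches. Only the most die-hard club supporter would rate a La Liga title equal to a World Cup title, but who can say precisely how much more the World Cup is worth? One possible yardstick is television viewership. Using readily available statistics from Britain, adjusted for whether England was playing and whether the match was broadcast free-to-air or only via satellite, Champions League matches ranged from an audience equivalent to a domestic-league match (in the group stages) up to three times as much for the final. Interest in the World Cup never fell below a multiplier of 4.7 and soared to 15.2 for the final. Finally, by a different method, Clásicos are weighted at twice other La Liga matches, because the winners both secure three points for themselves and deny three points to their chief rival.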
Only when this element of context is taken into account does Mr Messi’s superior performance fully reveal itself. After factoring in the importance of a match, Mr Messi pulls away with a 59.5 weighted EPA in 2013 and 2014 (an average of 0.69 per goal), compared with 50.4 for the Ballon d’Or winner (an average of 0.48). By EPA alone, Mr Messi’s most valuable goal during the past two years was a 91st-minute winner on neutral turf. But he delivered this coup de grace not to a lowly opponent in La Liga, but rather to Iran in the group stage of the World Cup, a match that attracts 4.7 times more interest. He also scored twice, breaking a tie both times, in Argentina’s match against Nigeria four days later. In contrast, Mr Ronaldo only found the net once during the Cup, against Ghana. And even that goal had a marginal impact: although it was the deciding score in the match, Portugal needed a four-goal swing relative to the United States to advance out of its group. As a result, Mr Ronaldo’s yeoman effort did nothing to prevent his countrymen from heading home early. If that match were treated as reducing a four-goal deficit to three, Mr Ronaldo’s EPA for the two years would collapse to 43.6, 27% lower than Mr Messi’s.
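Only when this factor is taken into account does Mr Messi's superior performance fully show. After weighting for the importance of each match, Mr Messi pulls away with a weighted EPA of 59.5 in 2013 and 2014, an average of 0.69 per goal, compared with just 50.4 for the Ballon d'Or winner, an average of 0.48. By EPA alone, Mr Messi's most valuable goal of the past two years was a 91st-minute winner on neutral turf. But he delivered this fatal blow not to a lowly La Liga opponent but to Iran in the group stage of the World Cup, a match that drew 4.7 times the interest. Four days later, against Nigeria, he scored twice, breaking a tie both times. By contrast, Mr Ronaldo found the net only once during the World Cup, against Ghana. Even that goal had only a marginal impact: although it decided the match, Portugal needed a four-goal swing relative to the United States to advance from its group. As a result, Mr Ronaldo's effort did nothing to stop his countrymen going home early. If that match is treated as reducing a four-goal deficit to three, Mr Ronaldo's EPA for the two years collapses to 43.6, a full 27% lower than Mr Messi's.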
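The match-importance weighting can be sketched as a weighted sum: each goal's EPA is multiplied by a weight for the kind of match it came in. The Python below is a minimal sketch, not the article's code. The weights are the multipliers quoted in the text (intermediate rounds are not given, so they are left out), the category names and the `weighted_epa` helper are hypothetical, and the three sample goals are invented purely to show the mechanics.

```python
# Multipliers quoted in the article; the category names are made up for this sketch.
MATCH_WEIGHTS = {
    "league": 1.0,     # ordinary domestic-league match (the baseline)
    "clasico": 2.0,    # Barcelona v Real Madrid league match
    "cl_group": 1.0,   # Champions League group stage
    "cl_final": 3.0,   # Champions League final
    "wc_group": 4.7,   # lowest World Cup multiplier, e.g. Argentina v Iran
    "wc_final": 15.2,  # World Cup final
}


def weighted_epa(goals):
    """Sum of EPA times match weight over (epa, match_kind) pairs."""
    return sum(epa * MATCH_WEIGHTS[kind] for epa, kind in goals)


# Invented sample goals, for illustration only (not real data).
sample_goals = [(1.76, "league"), (0.50, "clasico"), (1.00, "wc_group")]
sample_total = weighted_epa(sample_goals)

# Sanity checks on the aggregate figures quoted in the article.
messi_avg_weighted = 59.5 / 86
ronaldo_avg_weighted = 50.4 / 105
ronaldo_gap = 1 - 43.6 / 59.5

print(round(sample_total, 2), round(messi_avg_weighted, 2),
      round(ronaldo_avg_weighted, 2), round(ronaldo_gap, 2))
```

The aggregate checks agree with the text: 0.69 and 0.48 weighted EPA per goal, and a 27% gap once the Ghana goal is discounted.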
Fans of Mr Messi should be careful not to overinterpret this analysis. First, goal scoring is just one part of football. Half the game is defence, which is excluded entirely from this calculation. And even on offence, playmaking ability—consisting both of passing skill, and of the ability to draw defenders’ attention and open space for teammates—can matter as much or more as the final act of slipping the ball past the keeper.
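Of course, Mr Messi's fans should be careful not to over-interpret this analysis. First, scoring goals is only one part of football. Half the game is defence, which this calculation excludes entirely. And even in attack, playmaking ability, which includes passing skill and the ability to draw defenders' attention and open up space for teammates, can matter as much as or more than the final act of slipping the ball past the keeper. (How biting...)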
Moreover, even when it comes to goals alone, none of this means that Mr Messi has more ability to score when it counts than Mr Ronaldo does. In general, evidence across sports is scant that “clutch” ability, or its opposite “choking”—meaning a tendency to improve or decline in the most important situations—are repeatable skills or deficiencies. In a sufficiently large sample, players’ performance in critical moments usually resembles their contributions overall. Some will always happen to do better than their averages with the game or season on the line and others worse, but these deviations rarely exceed the amount of variance one would expect from chance alone. The best predictor of how Mr Messi and Mr Ronaldo are likely to do in a high-stakes match like a Clásico, or in the final minutes of a must-win contest, is their average output over all matches during the past few years, adjusted for their health and the quality of their opponents—not merely their performance in the subset of past Clásicos or other important matches.
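Moreover, even where goals alone are concerned, none of this means that Mr Messi has more ability than Mr Ronaldo to score when it counts. In general, sport offers little evidence that "clutch" ability or its opposite, "choking", meaning a tendency to improve or decline in the most important situations, is a repeatable skill or deficiency. In a sufficiently large sample, players' performance in critical moments usually resembles their overall contribution. Some will happen to do better than their averages with the game or season on the line and others worse, but these deviations rarely exceed what chance alone would produce. The best predictor of how Mr Messi and Mr Ronaldo will do in a high-stakes match such as a Clásico, or in the final minutes of a must-win match, is their average output over all matches in the past few years, adjusted for their health and the quality of their opponents, not merely their performance in the subset of past Clásicos or other important matches.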
Nonetheless, as a purely backwards-looking, descriptive measure, observers who point to Mr Messi’s return to the top of the goal leaderboard as evidence that he has at last rediscovered his form after a pair of disappointing-by-his-standards campaigns are mistaken. Mr Messi has indeed been the world’s most valuable goal scorer this year. He was also the world’s most valuable goal scorer in 2014, and in 2013, and (as is well-known) during the four years before that. The Ballon d’Or voters may have suffered from “Messi fatigue” for their past two selections. Supporters of Barcelona and Argentina, however, most surely have not.
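Nonetheless, as a purely backward-looking, descriptive measure, this shows that observers who see Mr Messi's return to the top of the scoring chart as evidence that he has finally regained his form after two seasons that were disappointing by his standards are mistaken. Mr Messi has indeed been the world's most valuable goal scorer this year. He was also the world's most valuable goal scorer in 2014, in 2013 and (as is well known) in the four years before that. The Ballon d'Or voters may have suffered from "Messi fatigue" in their past two selections. Supporters of Barcelona and Argentina, however, most surely have not.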
Original article: http://www.economist.com/blogs/gametheory/2015/03/statistical-analysis-football?fsrc=scn/rd_ec/the_once_and_future_king
Translation kindly provided by @徐丽质2001
Original poster: @PauloDybala
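[ This post was edited by YICO_LIU on 2015-03-25 17:59 ]

The Economist analysing football matches and players through mathematical statistics is quite novel, and they clearly put real work into it.
The original article even comes with downloadable data, which is very rigorous.
PS The handicap ("让球数目") is a redefinition of winning and losing that football betting companies set according to the gap between the home and away teams. With a home handicap of 1 goal, the home team must win by more than one goal to count as a win: 2:0 or 3:1 counts as a win, 1:0 or 2:1 as a draw, and 0:0 or 1:1 as a loss.

Duplicate post, edited out.

Thanks to TBPOAT for reposting!
It's frustrating not having posting rights ^^

Quoting #1 @werw323:
The Economist analysing football matches and players through mathematical statistics is quite novel, and they clearly put real work into it.
The original article even comes with downloadable data, which is very rigorous.
PS The handicap ("让球数目") is a redefinition of winning and losing that football betting companies set according to the gap between the home and away teams. With a home handicap of 1 goal, the home team must win by more than one goal to count as a win: 2:0 or 3:1 counts as a win, 1:0 or 2:1 as a draw, and 0:0 or 1:1 as a loss.
Many thanks to werw323 for the explanation. I never understood how football betting works.

Quoting #3 @徐丽质2001:
Thanks to TBPOAT for reposting!
It's frustrating not having posting rights ^^
Why don't you have posting rights? Can't you just borrow a mobile number and bind it to the account?

Quoting #6 @TBPOAT:
Why don't you have posting rights? Can't you just borrow a mobile number and bind it to the account?
Even when I give a mobile number I can't receive the verification code. Maybe my phone has some special blocking set up.

Quoting #7 @徐丽质2001:
I'm not in China. Even when I give a mobile number I can't receive the verification code.
Ouch. Ask a friend in China to help: use their number and have them tell you the verification code.
I can help if you need it.

Quoting #8 @TBPOAT:
Ouch. Ask a friend in China to help: use their number and have them tell you the verification code.
I can help if you need it.
Thanks. One number can only be bound to one account, though. ^^ I'm only here for fun anyway.

The "fancy ways to hype Messi" series (mathematical modelling edition)

Quoting #4 @徐丽质2001:
Many thanks to werw323 for the explanation. I never understood how football betting works.
You're welcome. I only know the basics of football betting myself.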
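Having read the article closely, I find it very professional. Several of its points are worth noting:
1. Not all goals have the same value: their value depends entirely on the circumstances. This holds not only for football; the same reasoning applies to basketball, American football and other sports. Data analysis in American sport is highly professional and comprehensive, and EPA-style analysis is likewise applied to basketball, American football and so on.
2. On the EPA calculation for goals alone, from 2013 to now Messi's individual added value to the matches he played in is higher than anyone else's.
3. Observers who think Messi has only now returned to the top of the scoring chart after two seasons that were disappointing by his own standards are mistaken.
All I can say is that The Economist has plenty of sharp minds. I hope more people get to see this.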
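PS The data attached at the original address includes the goal times and goal counts for the vast majority of Messi's matches from 2013 to now.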