Channel: Programming – Gea-Suan Lin's BLOG

The speedup from Python 3.14's tail-call interpreter came from sidestepping an LLVM regression bug

A topic that stirred up the Python community a few days ago: "Performance of the Python 3.14 tail-call interpreter".

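Python 3.14 implements a tail-call interpreter (an opt-in option), and the official benchmarks showed a large performance improvement (9%~15%). The author of the article found that implausible, and after cross-testing many configurations discovered that a regression bug in LLVM makes computed gotos slower, while the tail-call interpreter implementation happens to avoid that bug: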

Unfortunately, as I will document in this post, these impressive performance gains turned out to be primarily due to inadvertently working around a regression in LLVM 19. When benchmarked against a better baseline (such GCC, clang-18, or LLVM 19 with certain tuning flags), the performance gain drops to 1-5% or so depending on the exact setup.

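After learning about the compiler regression, the CPython team measured the tail-call interpreter's improvement again by other means. The actual gain is roughly 3%~5%, not the 9%~15% originally reported: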

This section previously reported a 9-15% geometric mean speedup. This number has since been cautiously revised down to 3-5%. While we expect performance results to be better than what we report, our estimates are more conservative due to a compiler bug found in Clang/LLVM 19, which causes the normal interpreter to be slower. We were unaware of this bug, resulting in inaccurate results. We sincerely apologize for communicating results that were only accurate for certain versions of LLVM 19 and 20. At the time of writing, this bug has not yet been fixed in LLVM 19-21. Thus any benchmarks with those versions of LLVM may produce inaccurate numbers. (Thanks to Nelson Elhage for bringing this to light.)
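To make the "geometric mean speedup" wording concrete, here is a minimal sketch of how such a figure can be computed, and how a slowed-down baseline inflates it. The function name `geometric_mean_speedup` and all the timings are made up for illustration; they are not measurements from CPython or from the article.

```python
import math


def geometric_mean_speedup(baseline_times, candidate_times):
    """Geometric mean of per-benchmark speedup ratios (baseline / candidate)."""
    if not baseline_times or len(baseline_times) != len(candidate_times):
        raise ValueError("need two non-empty lists of equal length")
    ratios = [b / c for b, c in zip(baseline_times, candidate_times)]
    # Average in log space, which is equivalent to the n-th root of the product.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical timings in seconds (illustrative only, not real data).
candidate = [1.00, 2.00, 4.00]        # the new interpreter
slow_baseline = [1.10, 2.30, 4.40]    # old interpreter built with an affected compiler
good_baseline = [1.03, 2.08, 4.12]    # old interpreter built with an unaffected compiler

inflated = geometric_mean_speedup(slow_baseline, candidate)
realistic = geometric_mean_speedup(good_baseline, candidate)

print(f"vs. slow baseline: {(inflated - 1) * 100:.1f}% faster")
print(f"vs. good baseline: {(realistic - 1) * 100:.1f}% faster")
```

The candidate timings are identical in both comparisons. Only the baseline changes, which is the same shape of problem the revised note describes.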

Not a situation anyone saw coming. I'd really recommend reading the original article for the measurements and the analysis...
