JIT buffer重定位 问题背景 PHP8 JIT Just-In-Time compilation is a hybrid model of interpreter and Ahead-of-Time compilation, that some or all of the code is compiled, often at run-time, without requiring the developer to manually compile it.
JIT也就是边解释执行代码,找到热点代码将其编译,以后就直接去执行编译成机器码的热点代码,提高性能。JIT
JavaScript Performance Penalty. Function calls between embedded builtins and JIT compiled code can come at a considerable performance penalty. With the x86-64 instruction set, we can’t use direct calls. Instead, we need to rely on indirect calls through a register or memory operand. Such calls rely more heavily on prediction since it’s not immediately apparent from the call instruction itself what the target of the call is.
For 64-bit applications, branch prediction performance can be negatively impacted when the target of a branch is more than 4 GB away from the branch.
JIT生成机器码之后,从编译好的code跳转到解释执行的code,这之间是存在性能损失的,因为Intel对这种远距离的分支跳转,分支预测得不准(似乎是有对分支跳转的距离假设,比如函数A跳转到函数B的虚拟地址一般是在4GB以内)。
思考:那PHP里的JIT buffer的位置是在哪呢? 跳转到PHP的解释器执行的代码的情况如何呢? 是否也可以对这个远距离跳转去优化呢?
JIT buffer内存布局 我们跑了wordpress的workload,php.ini的配置中如下两项配置了opcache的最大值和jit buffer的最大值
1 2 3 4 5 6 7 8 opcache.memory_consumption=1024 ; The size of the shared memory storage used by OPcache, in megabytes. opcache.jit_buffer_size=128M ; The amount of shared memory to reserve for compiled JIT code. A zero value disables the JIT. opcache.huge_code_pages=1 ; Enables or disables copying of PHP code (text segment) into HUGE PAGES. This should improve performance
跑workload 用户发出请求由php处理的过程中,查看任意php-fpm 的进程的内存使用maps文件。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 $ cat /proc/1888419/maps 555555400000-5555554f9000 r--p 00000000 08:02 6166492 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 555555600000-555555a00000 r-xp 00000000 00:0f 285761873 /anon_hugepage (deleted) 555555a00000-555556231000 r--p 00600000 08:02 6166492 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 55555655f000-555556600000 r--p 00f5f000 08:02 6166492 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 555556600000-555556604000 rw-p 01000000 08:02 6166492 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 555556604000-555556624000 rw-p 00000000 00:00 0 555556800000-555556ca4000 r-xp 01400000 08:02 6166492 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 555556ca4000-555556ebe000 rw-p 00000000 00:00 0 [heap] 555556ebe000-555556f19000 rw-p 00000000 00:00 0 [heap] 7fffad3f0000-7fffad400000 rwxp 00000000 00:00 0 7fffad400000-7fffad600000 rw-p 00000000 00:0f 285734811 /anon_hugepage (deleted) 7fffad600000-7fffed600000 rw-s 00000000 00:0f 285761874 /anon_hugepage (deleted) 7fffed600000-7ffff5600000 r-xs 40000000 00:0f 285761874 /anon_hugepage (deleted) 7ffff560b000-7ffff565b000 rwxp 00000000 00:00 0 7ffff565b000-7ffff56dc000 rw-p 00000000 00:00 0 7ffff56e0000-7ffff56f0000 rwxp 00000000 00:00 0 7ffff56f0000-7ffff5705000 rw-s 00000000 00:01 135173 /dev/zero (deleted) 7ffff5705000-7ffff571a000 r--p 00000000 08:02 6166503 /opt/pkb/git/hhvm-perf/opcache-wp5.8-php8.1.4-jit.so 7ffff571a000-7ffff57cb000 r-xp 00015000 08:02 6166503 /opt/pkb/git/hhvm-perf/opcache-wp5.8-php8.1.4-jit.so 7ffff57cb000-7ffff57e4000 r--p 000c6000 08:02 6166503 /opt/pkb/git/hhvm-perf/opcache-wp5.8-php8.1.4-jit.so 7ffff57e4000-7ffff57e7000 r--p 000de000 08:02 6166503 /opt/pkb/git/hhvm-perf/opcache-wp5.8-php8.1.4-jit.so 7ffff57e7000-7ffff57f7000 rw-p 000e1000 08:02 6166503 /opt/pkb/git/hhvm-perf/opcache-wp5.8-php8.1.4-jit.so
可以看到 7fffed600000 这里JIT 跳转进入到555555600000 php的.text 段,距离是很远的,far jump会有性能损失。
注意,当 opcache.jit_buffer_size=16M , opcache.memory_consumption=128M 的时候,mmap会分配到<2GB的位置上去。
1 2 3 4 5 6 7 8 9 10 11 12 13 40000000-48000000 rw-s 00000000 00:0f 286759728 /anon_hugepage (deleted) 48000000-49000000 r-xs 08000000 00:0f 286759728 /anon_hugepage (deleted) 555555400000-5555554f9000 r--p 00000000 08:02 6166492 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 555555600000-555555a00000 r-xp 00000000 00:0f 286759727 /anon_hugepage (deleted) 555555a00000-555556231000 r--p 00600000 08:02 6166492 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 55555655f000-555556600000 r--p 00f5f000 08:02 6166492 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 555556600000-555556604000 rw-p 01000000 08:02 6166492 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 555556604000-555556624000 rw-p 00000000 00:00 0 555556800000-555556ca4000 r-xp 01400000 08:02 6166492 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 555556ca4000-555556ebe000 rw-p 00000000 00:00 0 [heap] 555556ebe000-555556f19000 rw-p 00000000 00:00 0 [heap]
代码实现 代码思路在代码注释里写的很清晰了,就是在分配JIT buffer之前(函数create_segments),找一段空闲的内存空间,距离 PHP .text 段在4GB以内的都可以。
https://github.com/stkeke/php-src/commit/7ccc2a9209af156a9da0619b22e0be12f0adc2ab
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 #if defined(MAP_HUGETLB) if (requested_size >= huge_page_size && requested_size % huge_page_size == 0 ) { for (int i = 0 ; i < candidate_count; i++) { res = mmap((void *)candidates[i], requested_size, flags, MAP_SHARED|MAP_ANONYMOUS|MAP_HUGETLB|MAP_FIXED, fd, 0 ); if (MAP_FAILED != res) { return res; } } } #endif
后续maintainer修改了这个patch,更简洁一些。
https://github.com/php/php-src/commit/17aa81a5e22d4b8d1ffd7c89cb641939b4f6b7db
最后JIT Buffer的位置被移动到了heap之后,或者一开始的位置。
1 2 3 4 5 6 7 8 9 10 11 12 13 555513e00000-555553e00000 rw-s 00000000 00:0f 287680313 /anon_hugepage (deleted) 555553e00000-555555200000 r-xs 40000000 00:0f 287680313 /anon_hugepage (deleted) 555555400000-5555554f9000 r--p 00000000 08:02 11403879 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 555555600000-555555a00000 r-xp 00000000 00:0f 287680312 /anon_hugepage (deleted) 555555a00000-555556231000 r--p 00600000 08:02 11403879 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 55555655f000-555556600000 r--p 00f5f000 08:02 11403879 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 555556600000-555556604000 rw-p 01000000 08:02 11403879 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 555556604000-555556624000 rw-p 00000000 00:00 0 555556800000-555556ca4000 r-xp 01400000 08:02 11403879 /opt/pkb/git/hhvm-perf/php-fpm-wp5.8-php8.1.4-jit.bolt 555556ca4000-555556ebe000 rw-p 00000000 00:00 0 [heap] 555556ebe000-555556f19000 rw-p 00000000 00:00 0 [heap] 7ffff53f0000-7ffff5400000 rwxp 00000000 00:00 0 7ffff5400000-7ffff5600000 rw-p 00000000 00:0f 287683296 /anon_hugepage (deleted)
代码已经合并到php-src社区,好久没写C了,这里总结一些写代码时遇到的思考和问题。
用函数指针可以获取代码段所在的位置。void* addr = php_text_lighthouse;
注意ifdef的使用中,不存在的代码最后不会被编译进binary,要注意到hugepage 变量的定义,避免出现未定义的问题。
注意区分数据类型,uintptr_t 表示地址,任何指向void的合法指针都可以转化为这个类型。 ptrdiff_t 有符号整数类型,两个指针相减结果的类型。表示地址之间的距离。 size_t 表示大小,是无符号整数类型,这是sizeof操作符结果的类型。
函数参数是数组指针的时候,最好还传一个数组的max_size,避免越界
减少if的嵌套
用中间变量保存一些计算步骤,让代码更易读
提PR的时候写好注释,所有关键地方都写上,方便maintainer阅读
优化思路 性能优化四个方向:CPU/内存&网卡&磁盘&应用
优化代码段的布局,将常常调用的代码都放到4GB以内(for X64)。
Reference
Short builtin calls
Intel Optimization Manual
PHP: Runtime Configuration - Manual
基于硬件特性的性能调优