Keeping the target address of an instruction in a register until the instruction retires

I'd like to use Precise Event-Based Sampling (PEBS) on a Xeon E5 Sandy Bridge to record all the addresses for specific events (cache misses, for example).

However, the Performance Analysis Guide for Intel® Core™ i7 Processor and Intel® Xeon™ 5500 Processors (p. 24) contains the following warning:

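As the PEBS mechanism captures the values of the registers at completion of the instruction, the dereferenced address for the following type of load instruction (Intel asm convention) cannot be reconstructed.
MOV RAX, [RAX+const]
This kind of instruction is mostly associated with pointer chasing
mystruc = mystruc->next;
This is a significant shortcoming of this approach to capturing memory instruction addresses.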

According to objdump, my program contains a number of load instructions of that form. Is there any way to avoid them?
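For illustration, here is a minimal sketch of the kind of C code that tends to produce such loads. The struct layout and the function name list_length are hypothetical; whether a given compiler actually reuses the pointer register as the load destination depends on its register allocation, so check the output with objdump.

```c
#include <stddef.h>

/* Hypothetical list node, named after the snippet in the Intel guide. */
struct mystruc {
    struct mystruc *next;
    int payload;
};

/* Walks the list and returns its length. The loop body is the
 * pointer-chasing step: the loaded value replaces the pointer it was
 * loaded through, so the compiler is free to emit mov (%reg),%reg. */
size_t list_length(const struct mystruc *p)
{
    size_t n = 0;
    while (p != NULL) {
        p = p->next;
        ++n;
    }
    return n;
}
```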

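Since this is an Intel-specific problem, the solution doesn't have to be portable in any way; it just has to work. My code is written in C, and ideally I'm looking for a compiler-level solution (gcc or icc), but any suggestions are welcome.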


Some examples:

    mov 0x18(%rdi),%rdi
    mov (%rcx,%rax,8),%rax

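In both cases, once the instruction has retired (and therefore by the time I look at the register values to work out where I loaded to/from), the address (%rdi + 0x18 and %rcx + 8 * %rax respectively in these examples) has been overwritten by the result of the mov.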

What you want to do is convert all instructions of the form:

 mov (%rcx,%rax,8),%rax 

into:

    mov (%rcx,%rax,8),%r11
    mov %r11,%rax

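This is easiest to do by modifying the assembler source generated by the compiler. Below is a perl script that performs all the necessary conversions by reading and modifying the .s file.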

Just change the build to produce .s files instead of .o files, apply the script, and then produce the .o files with either as or gcc.
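The core rewrite for a single line can be sketched in C as follows. This is only an illustration of the idea, not the script's logic: the function name split_load is hypothetical, it assumes AT&T syntax with no whitespace after the last comma, it emits a plain mov for the copy, and it assumes the caller has already established that the scratch register (for example %r11) is free in the enclosing function.

```c
#include <stdio.h>
#include <string.h>

/* Rewrites one AT&T-syntax load such as "mov (%rcx,%rax,8),%rax" into the
 * two-instruction form, using tmp (e.g. "%r11") as the scratch register.
 * Returns 0 on success, -1 if the line has no destination operand or the
 * result does not fit in out. */
int split_load(const char *insn, const char *tmp, char *out, size_t outsz)
{
    const char *comma = strrchr(insn, ',');   /* last comma precedes the destination */
    if (comma == NULL)
        return -1;

    /* "<op> <source>,<tmp>" followed by "mov <tmp>,<destination>" */
    int n = snprintf(out, outsz, "%.*s,%s\nmov %s,%s",
                     (int)(comma - insn), insn, tmp, tmp, comma + 1);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

A real tool also has to decide which lines qualify (the destination register must appear in the source operand), which is where most of the work lies.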


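Here is the actual script. I've tested it on a few of my own sources, following the build procedure described in the script's header comments.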

The script has the following features:

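  1. scans for and locates all function definitions
  2. identifies all registers used in a given function
  3. locates all return points of the function
  4. selects a temporary register based on the function's register usage (i.e. it picks a temporary register that the function does not already use)
  5. replaces all "troublesome" instructions with a two-instruction sequence
  6. tries unused temporary registers (e.g. %r11 or unused argument registers) before trying callee-saved registers
  7. if the selected register is callee-saved, adds a push to the function prolog and a pop before each of the function's [possibly multiple] ret statements
  8. maintains a log of all analysis and transformations and appends it as comments to the output .s file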
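Steps 4, 6 and 7 above amount to a first-fit search over an ordered candidate list. The following C sketch illustrates that selection; the names pick_temp_reg and needs_save are hypothetical, and the candidate order is copied from the script's regtmpall routine (scratch registers, then argument registers, then callee-saved registers).

```c
#include <stddef.h>
#include <string.h>

/* Candidate order: scratch registers first, then argument registers,
 * then callee-saved ones (same order as the script's regtmpall). */
static const char *const candidates[] = {
    "%r11", "%r10",
    "%r9", "%r8", "%rcx", "%rdx", "%rsi", "%rdi",
    "%r15", "%r14", "%r13", "%r12",
};

/* Returns the first candidate absent from used[0..nused), or NULL when
 * every candidate is taken. used[] must hold 64-bit base names. */
const char *pick_temp_reg(const char *const used[], size_t nused)
{
    size_t ncand = sizeof candidates / sizeof candidates[0];
    for (size_t i = 0; i < ncand; ++i) {
        int taken = 0;
        for (size_t j = 0; j < nused; ++j) {
            if (strcmp(candidates[i], used[j]) == 0) {
                taken = 1;
                break;
            }
        }
        if (!taken)
            return candidates[i];
    }
    return NULL;
}

/* Callee-saved picks need the push/pop described in step 7. */
int needs_save(const char *reg)
{
    return strcmp(reg, "%r12") == 0 || strcmp(reg, "%r13") == 0 ||
           strcmp(reg, "%r14") == 0 || strcmp(reg, "%r15") == 0;
}
```

The sketch expects the caller to have already mapped sub-registers such as %eax or %r10d to their 64-bit base names, as the script does with its register table.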

    #!/usr/bin/perl
    # pebsfix/pebsfixup -- fix assembler source for PEBS usage
    #
    # command line options:
    #   "-a" -- use only full 64 bit targets
    #   "-l" -- do _not_ use lea
    #   "-D[diff-file]" -- show differences (default output: "./DIFF")
    #   "-n10" -- do _not_ use register %r10 for temporary (default is use it)
    #   "-o" -- overwrite input files (can be multiple)
    #   "-O" -- output file (only one .s input allowed)
    #   "-q" -- suppress warnings
    #   "-T[lvl]" -- debug trace
    #
    # "-o" and "-O" are mutually exclusive
    #
    # command line script test options:
    #   "-N[TPA]" -- disable temp register types [for testing]
    #   "-P" -- force push/pop on all functions
    #
    # command line arguments:
    #   1-- list of .s files to process [or directory to search]
    #       for a given file "foo.s", output is to "foo.TMP"
    #       if (-o is given, "foo.TMP" is renamed to "foo.s")
    #
    # suggested usage:
    #   change build to produce .s files
    #   FROM:
    #     cc [options] -c foo.c
    #   TO:
    #     cc [options] -c -S foo.c
    #     pebsfixup -o foo.s
    #     cc -c foo.s
    #
    # suggested compiler options:
    #   [probably only really needed if push/pop required. use -NP to verify]
    #   (1) use either of
    #       -O2 -fno-optimize-sibling-calls
    #       -O1
    #   (2) use -mno-omit-leaf-frame-pointer
    #   (3) use -mno-red-zone [probably not required in any case]
    #
    # NOTES:
    #   (1) red zones are only really useful for leaf functions (ie if fncA calls
    #       fncB, fncA's red zone would be clobbered)
    #   (2) pushing onto the stack isn't a problem if there is a formal stack frame
    #   (3) the push is okay if the function has no more than six arguments (ie
    #       does _not_ use positive offsets from %rsp to access them)

    #pragma pgmlns
    use strict qw(vars subs);

    our $pgmtail;

    our $opt_a;
    our $opt_T;
    our $opt_D;
    our $opt_l;
    our $opt_n10;
    our $opt_N;
    our $opt_P;
    our $opt_q;
    our $opt_o;
    our $opt_O;
    our $opt_s;

    our @reguse;
    our %reguse_tobase;
    our %reguse_isbase;
    our $regusergx;

    our @regtmplist;
    our %regtmp_type;

    our $diff;

    our $sepflg;
    our $fatal;
    our @cmtprt;

    master(@ARGV);
    exit(0);

    # master -- master control
    sub master
    {
        my(@argv) = @_;
        my($xfsrc);
        my($file,@files);
        my($bf);

        $pgmtail = "pebsfixup";

        optget(\@argv);

        # define all known/usable registers
        regusejoin();

        # define all registers that we may use as a temporary
        regtmpall();

        if (defined($opt_D)) {
            unlink($opt_D);
        }

        # show usage
        if (@argv <= 0) {
            $file = $0;
            open($xfsrc,"<$file") ||
                sysfault("$pgmtail: unable to open '%s' -- $!\n",$file);

            while ($bf = <$xfsrc>) {
                chomp($bf);
                next if ($bf =~ /^#!/);
                last unless ($bf =~ s/^#//);
                $bf =~ s/^# ?//;
                print($bf,"\n");
            }

            close($xfsrc);
            exit(1);
        }

        foreach $file (@argv) {
            if (-d $file) {
                dodir(\@files,$file);
            }
            else {
                push(@files,$file);
            }
        }

        if (defined($opt_O)) {
            sysfault("$pgmtail: -O may have only one input file\n")
                if (@files != 1);
            sysfault("$pgmtail: -O and -o are mutually exclusive\n")
                if ($opt_o);
        }

        foreach $file (@files) {
            dofile($file);
        }

        if (defined($opt_D)) {
            exec("less",$opt_D);
        }
    }

    # dodir -- process directory
    sub dodir
    {
        my($files,$dir) = @_;
        my($file,@files);

        @files = (`find $dir -type f -name '*.s'`);
        foreach $file (@files) {
            chomp($file);
            push(@$files,$file);
        }
    }

    # dofile -- process file
    sub dofile
    {
        my($file) = @_;
        my($ofile);
        my($xfsrc);
        my($xfdst);
        my($bf,$lno,$outoff);
        my($fixoff);
        my($lhs,$rhs);
        my($xop,$arg);
        my($ix);
        my($sym,$val,$typ);
        my(%sym_type);
        my($fnc,$fnx,%fnx_lookup,@fnxlist);
        my($retlist);
        my($uselook,@uselist,%avail);
        my($fixreg,$fixrtyp);
        my($sixlist);
        my($fix,$fixlist);
        my($fixtot);
        my(@fix);
        my(@outlist);
        my($relaxflg);
        my($cmtchr);

        undef($fatal);
        undef(@cmtprt);

        msgprt("\n")
            if ($sepflg);
        $sepflg = 1;
        msgprt("$pgmtail: processing %s ...\n",$file);

        $cmtchr = "#";

        cmtprt("%s\n","-" x 78);
        cmtprt("FILE: %s\n",$file);

        # get the output file
        $ofile = $file;
        sysfault("$pgmtail: bad suffix -- file='%s'\n",$file)
            unless ($ofile =~ s/[.]s$//);
        $ofile .= ".TMP";

        # use explicit output file
        if (defined($opt_O)) {
            $ofile = $opt_O;
            sysfault("$pgmtail: output file may not be input file -- use -o instead\n")
                if ($ofile eq $file);
        }

        open($xfsrc,"<$file") ||
            sysfault("$pgmtail: unable to open '%s' -- $!\n",$file);

        $lno = 0;
        while ($bf = <$xfsrc>) {
            chomp($bf);
            $bf =~ s/\s+$//;

            $outoff = $lno;
            ++$lno;

            push(@outlist,$bf);

            # clang adds comments
            $ix = index($bf,"#");
            if ($ix >= 0) {
                $bf = substr($bf,0,$ix);
                $bf =~ s/\s+$//;
            }

            # look for ".type blah, @function"
            # NOTE: this always comes before the actual label line [we hope ;-)]
            if ($bf =~ /^\s+[.]type\s+([^,]+),\s*(\S+)/) {
                ($sym,$val) = ($1,$2);
                $val =~ s/^\@//;
                $sym_type{$sym} = $val;
                cmtprt("\n");
                cmtprt("TYPE: %s --> %s\n",$sym,$val);
                next;
            }

            # look for "label:"
            if ($bf =~ /^([a-z_A-Z][a-z_A-Z0-9]*):$/) {
                $sym = $1;
                next if ($sym_type{$sym} ne "function");

                $fnc = $sym;
                cmtprt("FUNCTION: %s\n",$fnc);

                $fnx = {};
                $fnx_lookup{$sym} = $fnx;
                push(@fnxlist,$fnx);

                $fnx->{fnx_fnc} = $fnc;
                $fnx->{fnx_outoff} = $outoff;

                $uselook = {};
                $fnx->{fnx_used} = $uselook;

                $retlist = [];
                $fnx->{fnx_retlist} = $retlist;

                $fixlist = [];
                $fnx->{fnx_fixlist} = $fixlist;

                $sixlist = [];
                $fnx->{fnx_sixlist} = $sixlist;
                next;
            }

            # remember all registers used by function:
            while ($bf =~ /($regusergx)/gpo) {
                $sym = ${^MATCH};
                $val = $reguse_tobase{$sym};
                dbgprt(3,"dofile: REGUSE sym='%s' val='%s'\n",$sym,$val);

                $uselook->{$sym} += 1;
                $uselook->{$val} += 1
                    if ($val ne $sym);
            }

            # handle returns
            if ($bf =~ /^\s+ret/) {
                push(@$retlist,$outoff);
                next;
            }
            if ($bf =~ /^\s+rep[az]*\s+ret/) {
                push(@$retlist,$outoff);
                next;
            }

            # split up "movq 16(%rax),%rax" ...
            $ix = rindex($bf,",");
            next if ($ix < 0);

            # ... into "movq 16(%rax)"
            $lhs = substr($bf,0,$ix);
            $lhs =~ s/\s+$//;

            # check for "movq 16(%rsp)" -- this means that the function has/uses
            # more than six arguments (ie we may _not_ push/pop because it
            # wreaks havoc with positive offsets)
            # FIXME/CAE -- we'd have to adjust them by 8 which we don't do
            (undef,$rhs) = split(" ",$lhs);
            if ($rhs =~ /^(\d+)[(]%rsp[)]$/) {
                push(@$sixlist,$outoff);
                cmtprt("SIXARG: %s (line %d)\n",$rhs,$lno);
            }

            # ... and "%rax"
            $rhs = substr($bf,$ix + 1);
            $rhs =~ s/^\s+//;

            # target must be a [simple] register [or source scan will blow up]
            # (eg we actually had "cmp %ebp,(%rax,%r14)")
            next if ($rhs =~ /[)]/);

            # ensure we have the "%" prefix
            next unless ($rhs =~ /^%/);

            # we only want the full 64 bit reg as target
            # (eg "mov (%rbx),%al" doesn't count)
            $val = $reguse_tobase{$rhs};
            if ($opt_a) {
                next if ($val ne $rhs);
            }
            else {
                next unless (defined($val));
            }

            # source operand must contain target [base] register
            next unless ($lhs =~ /$val/);
            ###cmtprt("1: %s,%s\n",$lhs,$rhs);

            # source operand must be of the "right" type
            # FIXME/CAE -- we may need to revise this
            next unless ($lhs =~ /[(]/);
            cmtprt("NEEDFIX: %s,%s (line %d)\n",$lhs,$rhs,$lno);

            # remember the place we need to fix for later
            $fix = {};
            push(@$fixlist,$fix);
            $fix->{fix_outoff} = $outoff;
            $fix->{fix_lhs} = $lhs;
            $fix->{fix_rhs} = $rhs;
        }

        close($xfsrc);

        # get total number of fixups
        foreach $fnx (@fnxlist) {
            $fixlist = $fnx->{fnx_fixlist};
            $fixtot += @$fixlist;
        }
        msgprt("$pgmtail: needs %d fixups\n",$fixtot)
            if ($fixtot > 0);

        # fix each function
        foreach $fnx (@fnxlist) {
            cmtprt("\n");
            cmtprt("FNC: %s\n",$fnx->{fnx_fnc});

            $fixlist = $fnx->{fnx_fixlist};

            # get the fixup register
            ($fixreg,$fixrtyp) = regtmploc($fnx,$fixlist);

            # show number of return points
            {
                $retlist = $fnx->{fnx_retlist};
                cmtprt(" RET: %d\n",scalar(@$retlist));
                last if (@$retlist >= 1);

                # NOTE: we display this warning because we may not be able to
                # handle all situations
                $relaxflg = (@$fixlist <= 0) || ($fixrtyp ne "P");
                last if ($relaxflg && $opt_q);

                errprt("$pgmtail: in file '%s'\n",$file);
                errprt("$pgmtail: function '%s' has no return points\n",
                    $fnx->{fnx_fnc});
                errprt("$pgmtail: suggest recompile with correct options\n");

                if (@$fixlist <= 0) {
                    errprt("$pgmtail: working around because function needs no fixups\n");
                    last;
                }

                if ($fixrtyp ne "P") {
                    errprt("$pgmtail: working around because fixup reg does not need to be saved\n");
                    last;
                }
            }

            # show stats on register usage in function
            $uselook = $fnx->{fnx_used};
            @uselist = sort(keys(%$uselook));
            cmtprt(" USED:\n");
            %avail = %reguse_isbase;
            foreach $sym (@uselist) {
                $val = $uselook->{$sym};

                $typ = $regtmp_type{$sym};
                $typ = sprintf(" (TYPE: %s)",$typ)
                    if (defined($typ));

                cmtprt(" %s used %d%s\n",$sym,$val,$typ);

                $val = $reguse_tobase{$sym};
                delete($avail{$val});
            }

            # show function's available [unused] registers
            @uselist = keys(%avail);
            @uselist = sort(regusesort @uselist);
            if (@uselist > 0) {
                cmtprt(" AVAIL:\n");
                foreach $sym (@uselist) {
                    $typ = $regtmp_type{$sym};
                    $typ = sprintf(" (TYPE: %s)",$typ)
                        if (defined($typ));
                    cmtprt(" %s%s\n",$sym,$typ);
                }
            }

            # skip over any functions that don't need fixing _and_ have a temp
            # register
            if (@$fixlist <= 0 && (! $opt_P)) {
                next if (defined($fixreg));
            }

            msgprt("$pgmtail: function %s\n",$fnx->{fnx_fnc});

            # skip function because we don't have a fixup register but report it
            # here
            unless (defined($fixreg)) {
                $bf = (@$fixlist > 0) ? "FATAL" : "can be ignored -- no fixups needed";
                msgprt("$pgmtail: FIXNOREG (%s)\n",$bf);
                cmtprt(" FIXNOREG (%s)\n",$bf);
                next;
            }

            msgprt("$pgmtail: FIXREG --> %s (TYPE: %s)\n",$fixreg,$fixrtyp);
            cmtprt(" FIXREG --> %s (TYPE: %s)\n",$fixreg,$fixrtyp);

            foreach $fix (@$fixlist) {
                $outoff = $fix->{fix_outoff};

                undef(@fix);
                cmtprt(" FIXOLD %s\n",$outlist[$outoff]);

                # original
                if ($opt_l) {
                    $bf = sprintf("%s,%s",$fix->{fix_lhs},$fixreg);
                    push(@fix,$bf);
                    $bf = sprintf("\tmov\t%s,%s",$fixreg,$fix->{fix_rhs});
                    push(@fix,$bf);
                }

                # use lea
                else {
                    ($xop,$arg) = split(" ",$fix->{fix_lhs});
                    $bf = sprintf("\tlea\t\t%s,%s",$arg,$fixreg);
                    push(@fix,$bf);
                    $bf = sprintf("\t%s\t(%s),%s",$xop,$fixreg,$fix->{fix_rhs});
                    push(@fix,$bf);
                }

                foreach $bf (@fix) {
                    cmtprt(" FIXNEW %s\n",$bf);
                }

                $outlist[$outoff] = [@fix];
            }

            unless ($opt_P) {
                next if ($fixrtyp ne "P");
            }

            # fix the function prolog
            $outoff = $fnx->{fnx_outoff};
            $lhs = $outlist[$outoff];
            $rhs = sprintf("\tpush\t%s",$fixreg);
            $bf = [$lhs,$rhs,""];
            $outlist[$outoff] = $bf;

            # fix the function return points
            $retlist = $fnx->{fnx_retlist};
            foreach $outoff (@$retlist) {
                $rhs = $outlist[$outoff];
                $lhs = sprintf("\tpop\t%s",$fixreg);
                $bf = ["",$lhs,$rhs];
                $outlist[$outoff] = $bf;
            }
        }

        open($xfdst,">$ofile") ||
            sysfault("$pgmtail: unable to open '%s' -- $!\n",$ofile);

        # output all the assembler text
        foreach $bf (@outlist) {
            # ordinary line
            unless (ref($bf)) {
                print($xfdst $bf,"\n");
                next;
            }

            # apply a fixup
            foreach $rhs (@$bf) {
                print($xfdst $rhs,"\n");
            }
        }

        # output all our reasoning as comments at the bottom
        foreach $bf (@cmtprt) {
            if ($bf eq "") {
                print($xfdst $cmtchr,$bf,"\n");
            }
            else {
                print($xfdst $cmtchr," ",$bf,"\n");
            }
        }

        close($xfdst);

        # get difference
        if (defined($opt_D)) {
            system("diff -u $file $ofile >> $opt_D");
        }

        # install fixed/modified file
        {
            last unless ($opt_o || defined($opt_O));
            last if ($fatal);
            msgprt("$pgmtail: installing ...\n");
            rename($ofile,$file);
        }
    }

    # regtmpall -- define all temporary register candidates
    sub regtmpall
    {

        dbgprt(1,"regtmpall: ENTER\n");

        regtmpdef("%r11","T");

        # NOTES:
        # (1) see notes on %r10 in ABI at bottom -- should we use it?
        # (2) a web search on "shared chain" and "x86" only produces 28 results
        # (3) some gcc code uses it as an ordinary register
        # (4) so, use it unless told not to
        regtmpdef("%r10","T")
            unless ($opt_n10);

        # argument registers (a6-a1)
        regtmpdef("%r9","A6");
        regtmpdef("%r8","A5");
        regtmpdef("%rcx","A4");
        regtmpdef("%rdx","A3");
        regtmpdef("%rsi","A2");
        regtmpdef("%rdi","A1");

        # callee preserved registers
        regtmpdef("%r15","P");
        regtmpdef("%r14","P");
        regtmpdef("%r13","P");
        regtmpdef("%r12","P");

        dbgprt(1,"regtmpall: EXIT\n");
    }

    # regtmpdef -- define usable temp registers
    sub regtmpdef
    {
        my($sym,$typ) = @_;

        dbgprt(1,"regtmpdef: SYM sym='%s' typ='%s'\n",$sym,$typ);

        push(@regtmplist,$sym);
        $regtmp_type{$sym} = $typ;
    }

    # regtmploc -- locate temp register to fix problem
    sub regtmploc
    {
        my($fnx,$fixlist) = @_;
        my($sixlist);
        my($uselook);
        my($regrhs);
        my($fixcnt);
        my($coretyp);
        my($reglhs,$regtyp);

        dbgprt(2,"regtmploc: ENTER fnx_fnc='%s'\n",$fnx->{fnx_fnc});

        $sixlist = $fnx->{fnx_sixlist};
        $fixcnt = @$fixlist;
        $fixcnt = 1
            if ($opt_P);

        $uselook = $fnx->{fnx_used};

        foreach $regrhs (@regtmplist) {
            dbgprt(2,"regtmploc: TRYREG regrhs='%s' uselook=%d\n",
                $regrhs,$uselook->{$regrhs});

            unless ($uselook->{$regrhs}) {
                $regtyp = $regtmp_type{$regrhs};
                $coretyp = $regtyp;
                $coretyp =~ s/\d+$//;

                # function uses stack arguments -- we can't push/pop
                if (($coretyp eq "P") && (@$sixlist > 0)) {
                    dbgprt(2,"regtmploc: SIXREJ\n");
                    next;
                }

                if (defined($opt_N)) {
                    dbgprt(2,"regtmploc: TRYREJ opt_N='%s' regtyp='%s'\n",
                        $opt_N,$regtyp);
                    next if ($opt_N =~ /$coretyp/);
                }

                $reglhs = $regrhs;
                last;
            }
        }

        {
            last if (defined($reglhs));

            errprt("regtmploc: unable to locate usable fixup register for function '%s'\n",
                $fnx->{fnx_fnc});

            last if ($fixcnt <= 0);

            $fatal = 1;
        }

        dbgprt(2,"regtmploc: EXIT reglhs='%s' regtyp='%s'\n",$reglhs,$regtyp);

        ($reglhs,$regtyp);
    }

    # regusejoin -- get regex for all registers
    sub regusejoin
    {
        my($reg);

        dbgprt(1,"regusejoin: ENTER\n");

        # rax
        foreach $reg (qw(a b c d)) {
            regusedef($reg,"r_x","e_x","_l","_h");
        }

        # rdi/rsi
        foreach $reg (qw(d s)) {
            regusedef($reg,"r_i","e_i","_i","_il");
        }

        # rsp/rbp
        foreach $reg (qw(b s)) {
            regusedef($reg,"r_p","e_p");
        }

        foreach $reg (8,9,10,11,12,13,14,15) {
            regusedef($reg,"r_","r_d","r_w","r_b");
        }

        $regusergx = join("|",reverse(sort(@reguse)));

        dbgprt(1,"regusejoin: EXIT regusergx='%s'\n",$regusergx);
    }

    # regusedef -- define all registers
    sub regusedef
    {
        my(@argv) = @_;
        my($mid);
        my($pat);
        my($base);

        $mid = shift(@argv);

        dbgprt(1,"regusedef: ENTER mid='%s'\n",$mid);

        foreach $pat (@argv) {
            $pat = "%" . $pat;
            $pat =~ s/_/$mid/;
            $base //= $pat;
            dbgprt(1,"regusedef: PAT pat='%s' base='%s'\n",$pat,$base);

            push(@reguse,$pat);
            $reguse_tobase{$pat} = $base;
        }

        $reguse_isbase{$base} = 1;

        dbgprt(1,"regusedef: EXIT\n");
    }

    # regusesort -- sort base register names
    sub regusesort
    {
        my($symlhs,$numlhs);
        my($symrhs,$numrhs);
        my($cmpflg);

        {
            ($symlhs,$numlhs) = _regusesort($a);
            ($symrhs,$numrhs) = _regusesort($b);

            $cmpflg = $symlhs cmp $symrhs;
            last if ($cmpflg);

            $cmpflg = $numlhs <=> $numrhs;
        }

        $cmpflg;
    }

    # _regusesort -- split up base register name
    sub _regusesort
    {
        my($sym) = @_;
        my($num);

        if ($sym =~ s/(\d+)$//) {
            $num = $1;
            $num += 0;
            $sym =~ s/[^%]/z/g;
        }

        ($sym,$num);
    }

    # optget -- get options
    sub optget
    {
        my($argv) = @_;
        my($bf);
        my($sym,$val);
        my($dft,%dft);

        foreach $sym (qw(a l n10 P q o s T)) {
            $dft{$sym} = 1;
        }
        $dft{"N"} = "T";
        $dft{"D"} = "DIFF";

        while (1) {
            $bf = $argv->[0];
            $sym = $bf;

            last unless ($sym =~ s/^-//);
            last if ($sym eq "-");

            shift(@$argv);

            {
                if ($sym =~ /([^=]+)=(.+)$/) {
                    ($sym,$val) = ($1,$2);
                    last;
                }

                if ($sym =~ /^(.)(.+)$/) {
                    ($sym,$val) = ($1,$2);
                    last;
                }

                undef($val);
            }

            $dft = $dft{$sym};
            sysfault("$pgmtail: unknown option -- '%s'\n",$bf)
                unless (defined($dft));

            $val //= $dft;

            ${"opt_" . $sym} = $val;
        }
    }

    # cmtprt -- transformation comments
    sub cmtprt
    {

        $_ = shift(@_);
        $_ = sprintf($_,@_);
        chomp($_);
        push(@cmtprt,$_);
    }

    # msgprt -- progress output
    sub msgprt
    {

        printf(STDERR @_);
    }

    # errprt -- show errors
    sub errprt
    {

        cmtprt(@_);
        printf(STDERR @_);
    }

    # sysfault -- abort on error
    sub sysfault
    {

        printf(STDERR @_);
        exit(1);
    }

    # dbgprt -- debug print
    sub dbgprt
    {

        $_ = shift(@_);
        goto &_dbgprt
            if ($opt_T >= $_);
    }

    # _dbgprt -- debug print
    sub _dbgprt
    {

        printf(STDERR @_);
    }

Update:

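I've updated the script to fix bugs, add more checks, and add more options. Note: I had to remove the ABI notes at the bottom to fit within the 30,000-character limit.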

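Otherwise strange results appear for other instructions with parentheses, such as cmpl %ebp, (%rax,%r14), which gets split into lhs='cmpl %ebp, (%rax' and rhs='%r14)', which in turn causes /$rhs/ to fail.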

Yes, that was a bug. Fixed.

Your $rhs =~ /%[er](.x|\d+)/ doesn't match byte or word loads into di or ax. That's unlikely, though. Oh, also, I think it fails to match rdi / rsi. So you don't need the trailing d in r10d

Fixed. It now looks for all variants.

Wow, I assumed something like this would have to happen at compile time, and that doing it after the fact would be too messy.

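Shameless plug: Thanks for the "wow!". perl is great for messy jobs like this. I've written assembler "injection" scripts like this before, e.g. back in the day [before compiler support existed] to add profiling calls.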

You could mark %r10 as another call-clobbered register.

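After doing some web searching, I could only find about 84 matches for "static chain" x86. The only relevant one was the x86 ABI, and it offers no explanation beyond mentioning it in a footnote. Also, some gcc code uses r10 without saving it as a callee-saved register. So, the program now uses r10 by default (this can be disabled with a command line option if desired).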

What happens if a function already uses all the registers?

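If it truly uses all of them, then we're out of luck. If the script can't find a spare register, it detects and reports this and suppresses the fixup.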

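And, it will use a "callee must preserve" register by injecting a push as the function's first inst and a corresponding pop just before the ret inst [of which there may be several]. This can be disabled with an option.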

You can't just push/pop, because that steps on the red zone

No, it doesn't, for the following reasons:

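(1) Almost as a side note: the red zone is only useful in leaf functions. Otherwise, if fncA calls fncB, the mere act of making that call would step on fncA's own red zone. See the compile options in the script's top comment block.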

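(2) More importantly, it's because of the way the push/pop are injected. The push occurs before any other inst. The pop occurs after all other insts [just before the ret].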

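The red zone is still there, intact. It is merely offset by -8 from where it would otherwise have been. All red zone activity is preserved because these insts use offsets relative to %rsp.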

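This is different from a push/pop done inside an inline asm block. The usual case is that the red zone code has done (e.g.) mov $23,-4(%rsp). An inline asm block that comes after that and does a push/pop will step on it.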

Some functions to show this:

    # function_original -- original function before pebsfixup
    # RETURNS: 23
    function_original:
        mov     $23,-4(%rsp)            # red zone code generated by compiler
        ...
        mov     -4(%rsp),%rax           # will still have $23
        ret

    # function_pebsfixup -- pebsfixup modified
    # RETURNS: 23
    function_pebsfixup:
        push    %r12                    # pebsfixup injected

        mov     $23,-4(%rsp)            # red zone code generated by compiler
        ...
        mov     -4(%rsp),%rax           # will still have $23

        pop     %r12                    # pebsfixup injected
        ret

    # function_inline -- function with inline asm block and red zone
    # RETURNS: unknown value
    function_inline:
        mov     $23,-4(%rsp)            # red zone code generated by compiler

        # inline asm block -- steps on red zone
        push    %rdx
        push    %rcx
        ...
        pop     %rcx
        pop     %rdx

        ...

        mov     -4(%rsp),%rax           # now -4(%rsp) no longer has $23

        ret

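Where the push/pop does get us into trouble is when the function takes more than six arguments (i.e. args 7+ are on the stack). Accessing those arguments uses a positive offset from %rsp: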

 mov 32(%rsp),%rax 

With our "trick" push, that offset would be incorrect. The correct offset would now be higher by 8:

 mov 40(%rsp),%rax 

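The script will detect this and complain. But it does not [yet] adjust the positive offsets, because this is probably a low-probability case. It would probably take about five more lines of code to fix this. Punting for now ...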
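As a rough illustration of what such an adjustment involves, here is a hedged C sketch. It is not part of the script: the function name bump_rsp_offset is hypothetical, it only handles decimal displacements written as N(%rsp), and it assumes every such positive displacement in the function really addresses a stack-passed argument (which does not hold if the function also lowers %rsp to allocate locals).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copies line to out, adding delta to a non-negative decimal displacement
 * written as "N(%rsp)". Lines without that pattern (including negative
 * red zone offsets and hex displacements) are copied unchanged.
 * Returns 0 on success, -1 if out is too small. */
int bump_rsp_offset(const char *line, long delta, char *out, size_t outsz)
{
    const char *base = strstr(line, "(%rsp)");
    const char *num = base;

    /* walk back over the decimal digits that precede "(%rsp)" */
    while (num != NULL && num > line && num[-1] >= '0' && num[-1] <= '9')
        --num;

    int n;
    if (base == NULL || num == base ||
        (num > line && (num[-1] == '-' || num[-1] == 'x')))
        n = snprintf(out, outsz, "%s", line);      /* nothing to adjust */
    else
        n = snprintf(out, outsz, "%.*s%ld%s", (int)(num - line), line,
                     strtol(num, NULL, 10) + delta, base);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

With delta set to 8, the line mov 32(%rsp),%rax becomes mov 40(%rsp),%rax, while a red zone access such as mov -4(%rsp),%rax is left alone.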

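The only way I can think of at the moment is to use the & (ampersand) assembler constraint. This means I would have to go through my code wherever such instructions appear and replace every pointer dereference such as mystruc = mystruc->next; with something like:
    asm volatile("mov (%1),%0" : "=&r" (mystruc) : "r" (&(mystruc->next)));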

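However, this is a very pedestrian approach, and there may be cases more complicated than a pointer inside a struct. I understand that this essentially increases register pressure, which the compiler actively tries to avoid, but I'm still looking for any other way to do this.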
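One way to make the early-clobber idea less tedious is to wrap it in a small helper, sketched below under some assumptions: the struct layout and the name load_next are hypothetical, the "memory" clobber is a conservative addition, and the asm path is only taken on x86-64 with a GCC-compatible compiler (other targets fall back to a plain load).

```c
#include <stddef.h>

struct mystruc {                  /* hypothetical node type */
    struct mystruc *next;
};

/* Loads p->next into a register guaranteed to differ from the one that
 * holds the address: "=&r" marks the output as early-clobber, so the
 * compiler may not assign it the same register as any input operand. */
static inline struct mystruc *load_next(struct mystruc *p)
{
    struct mystruc *val;
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ volatile("mov (%1),%0"
                     : "=&r"(val)
                     : "r"(&p->next)
                     : "memory");
#else
    val = p->next;                /* plain load elsewhere */
#endif
    return val;
}
```

A traversal would then read p = load_next(p); the copy back into p is a separate register-to-register move, so the address register is still intact when the load itself retires.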