Running into a bug in valgrind itself

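Background

Our company's C++ project uses cppcheck for static analysis and valgrind to check for memory leaks. I have repeatedly stressed that we should get to zero warnings. Jenkins runs these checks automatically as part of CI/CD and emails the results to the project members, but some people still have not fixed their warnings; since that code is outside my remit, there is not much I can say.
Recently, while testing with valgrind, I hit an unrecognised-instruction problem (valgrind treated an instruction in the running program as illegal). It turned out that the installed valgrind version was too old.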

The problem

The command was:

valgrind  --leak-check=full --show-leak-kinds=all  ./a.out

The error output was:

vex amd64->IR: unhandled instruction bytes: 0xF 0xC7 0xF0 0x89 0x6 0xF 0x42 0xC1
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==3562== valgrind: Unrecognised instruction at address 0x4ef1b15.
==3562==    at 0x4EF1B15: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==3562==    by 0x4EF1CB1: std::random_device::_M_getval() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==3562==    by 0x400D8B: std::random_device::operator()() (in /home/latelee/test/mytest/warningtest/a.out)
==3562==    by 0x400FA1: Init() (in /home/latelee/test/mytest/warningtest/a.out)
==3562==    by 0x400DD7: GetRandomC11() (in /home/latelee/test/mytest/warningtest/a.out)
==3562==    by 0x400B55: GetRandomNum() (in /home/latelee/test/mytest/warningtest/a.out)
==3562==    by 0x400D0A: main (in /home/latelee/test/mytest/warningtest/a.out)

==26759== Process terminating with default action of signal 4 (SIGILL)
==26759==  Illegal opcode at address 0x4EF1B15
==26759==    at 0x4EF1B15: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==26759==    by 0x4EF1CB1: std::random_device::_M_getval() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==26759==    by 0x400D8B: std::random_device::operator()() (in /home/latelee/test/mytest/warningtest/a.out)
==26759==    by 0x400FA1: Init() (in /home/latelee/test/mytest/warningtest/a.out)
==26759==    by 0x400DD7: GetRandomC11() (in /home/latelee/test/mytest/warningtest/a.out)
==26759==    by 0x400B55: GetRandomNum() (in /home/latelee/test/mytest/warningtest/a.out)
==26759==    by 0x400D0A: main (in /home/latelee/test/mytest/warningtest/a.out)

Valgrind also printed the following notice:

==2636== Your program just tried to execute an instruction that Valgrind
==2636== did not recognise.  There are two possible reasons for this.
==2636== 1. Your program has a bug and erroneously jumped to a non-code
==2636==    location.  If you are running Memcheck and you just saw a
==2636==    warning about a bad jump, it's probably your program's fault.
==2636== 2. The instruction is legitimate but Valgrind doesn't handle it,
==2636==    i.e. it's Valgrind's fault.  If you think this is the case or
==2636==    you are not sure, please let us know and we'll try to fix it.
==2636== Either way, Valgrind will now raise a SIGILL signal which will
==2636== probably kill your program.

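In short: either the program really has a bug, or valgrind itself has one (in which case the authors would like a report). I read through the code repeatedly and counted the occurrences of new and delete, and found nothing wrong.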
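Later I found a discussion at https://stackoverflow.com/questions/37032339/valgrind-unrecognised-instruction. In short, the older valgrind does not support the instruction executed inside std::random_device::_M_getval() (the leading bytes 0x0F 0xC7 0xF0 in the vex output encode rdrand %eax). The thread includes a patch and also recommends using a newer version. In this post I test with a newer valgrind.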

Building and testing a newer version

First, check the current version:

$ valgrind --version
valgrind-3.11.0

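The official website listed 3.13 as the latest version at the time, so I downloaded the 3.13.0 source tarball from there, then unpacked, built and installed it: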

tar jxf valgrind-3.13.0.tar.bz2
cd valgrind-3.13.0

./configure --prefix=/home/latelee/bin/valgrind
make -j
make install

Test with the new version:

$ /home/latelee/bin/valgrind/bin/valgrind --leak-check=full --show-leak-kinds=all ./a.out

This time the result was:

==30979== Memcheck, a memory error detector
==30979== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==30979== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==30979== Command: ./a.out
==30979== 
==30979== Conditional jump or move depends on uninitialised value(s)
==30979==    at 0x400E36: Uninit() (in /home/latelee/test/mytest/warningtest/a.out)
==30979==    by 0x400B65: GetRandomNum() (in /home/latelee/test/mytest/warningtest/a.out)
==30979==    by 0x400D0A: main (in /home/latelee/test/mytest/warningtest/a.out)
==30979== 
-316985884
==30979== 
==30979== HEAP SUMMARY:
==30979==     in use at exit: 72,704 bytes in 1 blocks
==30979==   total heap usage: 7 allocs, 6 frees, 84,856 bytes allocated
==30979== 
==30979== LEAK SUMMARY:
==30979==    definitely lost: 0 bytes in 0 blocks
==30979==    indirectly lost: 0 bytes in 0 blocks
==30979==      possibly lost: 0 bytes in 0 blocks
==30979==    still reachable: 72,704 bytes in 1 blocks
==30979==         suppressed: 0 bytes in 0 blocks
==30979== Rerun with --leak-check=full to see details of leaked memory
==30979== 
==30979== For counts of detected and suppressed errors, rerun with: -v
==30979== Use --track-origins=yes to see where uninitialised values come from
==30979== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

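As the output shows, there is still one error (a genuine use of an uninitialised value in Uninit()), but the "valgrind: Unrecognised instruction at address" message is gone.

Summary

I recommend building and installing a recent valgrind from source, to reduce false alarms caused by bugs in valgrind itself.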

李迟 (Li Chi), 2018.8.30, Thursday night
