19:02:37 #startmeeting 19:02:37 Meeting started Thu Sep 24 19:02:37 2020 UTC. The chair is wumpus. Information about MeetBot at http://wiki.debian.org/MeetBot. 19:02:37 Useful Commands: #action #agreed #help #info #idea #link #topic. 19:02:41 hi 19:02:43 hi 19:02:43 hi 19:02:45 hi 19:02:46 hi 19:02:50 hi 19:02:51 hi 19:02:53 hi 19:02:53 hi 19:03:09 #bitcoin-core-dev Meeting: wumpus sipa gmaxwell jonasschnelli morcos luke-jr sdaftuar jtimon cfields petertodd kanzure bluematt instagibbs phantomcircuit codeshark michagogo marcofalke paveljanik NicolasDorier jl2012 achow101 meshcollider jnewbery maaku fanquake promag provoostenator aj Chris_Stewart_5 dongcarl gwillen jamesob ken281221 ryanofsky gleb moneyball kvaciral ariard digi_james 19:03:11 amiti fjahr jeremyrubin lightlike emilengler jonatack hebasto jb55 elichai2 19:03:28 hi 19:03:49 hi 19:03:58 one proposed topic: what sipa just said (will have to look up in the log what :-) 19:04:18 wumpus: #20005 19:04:19 https://github.com/bitcoin/bitcoin/issues/20005 | memcmp with constants that contain zero bytes are broken in GCC · Issue #20005 · bitcoin/bitcoin · GitHub 19:04:38 https://dpaste.com/GU4GPSPB8 <-- affected symbols in bitcoind 19:04:42 hi 19:04:42 I see 19:05:11 yeah i just wanted to draw attention to this potentially very scary bug 19:05:26 how did they manage to break such a basic library function 19:05:30 hi 19:05:38 wumpus: they turned it into strcmp sometimes it looks like 19:05:43 wumpus: it's not a libc bug, it's bug in gcc's implementation of __builtin_memcmp 19:05:50 at least the consensus code seems to have dodged it 19:06:08 but otherwise yes, this is really concerning 19:06:20 i believe none of our gitian builds are affected, as we use older GCCs 19:06:27 can someone verify that? 19:06:32 that's at least good to know, let me see 19:06:50 macOS build should use LLVM IIRC 19:06:57 luke-jr: indeed 19:07:06 sipa: is this a generaly class of bugs do you reckon or pretty specific to memcmp? should we be auditing all builtins? 19:07:10 sipa: Gitian should use 7 and 8 19:07:11 I *assume* LLVM didn't reimplement this bug 19:07:13 gcc 8 it seems 19:07:46 jeremyrubin: i don't think we have the manpower to do actual auditing of all of gcc's builtins (and other potential bugs), nor is that our job 19:07:50 bionic uses gcc 7, right? 19:08:06 jonasschnelli: yes 19:08:12 jonasschnelli: but we explicitly install 8 for some platforms 19:08:17 ^ 19:08:28 jeremyrubin: i think it's important to note that while the bug was known, we stumbled upon it in the libsecp256k1 repo, in a test 19:08:48 Ah. Yes. Arm / risc uses gcc-8 19:08:54 hmm 19:08:56 so that's kind of a sign that our processes are at least capable of catching some issues like this 19:09:06 sipa: should I be concerned I *didn't* get any libsecp symbols in my comparison? 19:09:25 I honestly don't now how to prevent us being affected by problems like this 19:09:26 or is the test only in a newer version? 19:09:26 luke-jr: no, i think this is expected (it's in code added in a new test) 19:09:28 computers are broken 19:09:35 indeed 19:10:09 reality is often disappointing 19:10:11 I can understand libc bugs in some cases but this is something so low-level 19:10:40 anyway, i think we should see what performance impact building with -fno-builtin-memcmp has 19:10:51 sipa: yes but only for affected compilers 19:10:58 if that is sufficiently low, i think that's a sufficient solution - perhaps restricted to GCC 9 and 10 19:10:59 sipa: that won't fix C++ afaik 19:11:06 luke-jr: it will 19:11:29 luke-jr: std::lexicographical_compare (in the .h file) just calls __builting_memcmp 19:11:38 Should we also add a test that fails with gcc9/10? 19:11:39 I don't even care about peformance there, just add the option, it needs to be correct, if people want performance don't use a broken compiler 19:11:46 MarcoFalke: definitely 19:11:52 sipa: -fno-builtin-* doesn't affect __builtin_* 19:12:00 luke-jr: hmm! 19:12:09 that's a good point, let's test that 19:12:18 that's the important thing thoug hthat it fixes the issue, a test like MarcoFalke says makes a lot of sense 19:12:35 at least GCC's docs suggest to whitelist builtins, one should use -fno-builtin and call __builtin_* directly ;) 19:13:00 or just fail compiling on affected compilers, they're going to backport a fix right? 19:13:09 wumpus: yes 19:13:12 wumpus: supposedly, but it looks non-trivial and they haven't yet 19:13:28 I'd prefer things to build at all on compilers with a broken memcmp 19:13:33 +not 19:13:44 it's fixed in master, but not in any release 19:14:06 gcc 9.3 and 10.1 seem afficted 19:14:15 (whatever we end up doing, we need a test for this in configure) 19:14:18 personally, I neutered my builtin memcmp to always fail (fallback to libc) and am rebuilding my system 19:14:40 wumpus: my first thought was adding a configure test that exploits the bug... but that won't work when crosscompiling 19:14:43 i think? 19:14:57 well, crosscompiling, we can't run it 19:14:58 sipa: ... ugh 19:15:03 why not add a sanity check? we have others 19:15:09 unless we assume that building for the host platform is always similarly affected as the target platform 19:15:12 yes, you'd need to run it to figure it out right 19:15:14 Is memcmp constexpr? 19:15:21 Maybe it can static_assert... 19:15:26 jeremyrubin: it's a C function constexpr is a C++ concept 19:15:29 holy shit this is *really* bad 19:15:41 sipa: bad assumption 19:15:51 luke-jr: yes, i think so 19:15:52 maybe it could expect the symbols generated or something, if it generates a call to strcpy instead? 19:16:08 wumpus: there are no calls involved 19:16:09 wumpus: it doesn't 19:16:14 it's the autogenerated builtin code that is wrong 19:16:19 wumpus: it's unrolling the memcmp 19:16:22 ok... 19:16:31 which makes it implementation something more or less equal to strcmp rather than memcmp 19:16:52 basically turns it into if (a[0] == "x" && a[1] == "x" && …) 19:17:11 aiui 19:17:24 that's impossible to check in a platform independent way 19:17:44 it's unclear to me if the patch is in the gcc 10.2 release? 19:17:50 jonatack: no 19:17:52 jonatack: it is not in any release 19:18:00 thank you 19:18:03 fwiw, this is (part of) the GCC patch that fixes this: https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=gcc/builtins.c;h=228db78f32bfdcce31e23850e878843d7a45adcc;hp=eb662112b32e5ad377d2a865d9977c64dc12cc93;hb=d5803b9876b3d11c93d1a10fabb3fbb1c4a14bd6;hpb=3e99ed65cbedf7a6c0abb9cd63c191326995fd34 19:18:21 Could we wrap memcmp with a compile time wrapper that static asserts we aren't using a const array with a 0 byte? Not sure if you can enable-if-fu around constexpr args. 19:18:38 it's hard to read but I think it's possible to deduce a few condition under which this bug can't happen 19:18:42 jeremyrubin: that's not a solution 19:18:43 The bug only happens when it's a const array right? 19:19:05 jeremyrubin: if we can identify all potentially affected code paths, we could just replace them with my_memcmp that doesn't have the bug too 19:19:31 ah right you're worried about other stdlib calls and stuff. 19:19:37 but C++ also generates memcmps it's not just our own code 19:19:49 it's a very low-level function 19:19:52 sipa: I just did that? :P 19:20:02 right - "all code paths" also contains std::lexicographical_compare and a few other things 19:20:10 luke-jr: right, but not in a guaranteed automated way 19:20:23 luke-jr: it's a really useful way of finding potential issues now, though 19:20:26 thanks for that 19:21:29 i've tried making configure detect the version of GCC, and so far been unsuccesful 19:22:03 my google-fu led me to 9-year old stackoverflow posts with comments "ah yes, that doesn't work anymore, let's just delete that functionality" from the m4 gcc scripts 19:22:09 (though i'm sure people with more autoconf skills can do better) 19:22:11 >_< 19:22:28 the idea is that in general you should test for compiler features, rather than exact versions 19:22:36 which is absolutely true, except for bug workarounds :p 19:22:37 and as soon as the patch is backported to gcc 9 and 10 it will become more difficult too 19:22:38 sipa: ideally, yes 19:22:52 sipa: even for bug workarounds, you can't use the version number to see that my GCC is patched ;) 19:23:02 that's true, but it's a safe superset 19:23:07 worse, some distributions might backport the single patch without changing the version number 19:23:25 sure, it dpeends on what it does in that case 19:23:26 anyway, what's the plan? 19:23:33 if it only passes an extra command line flag it's ok 19:23:35 i think the first thing to do is add a unit test that catches the bug 19:23:54 sipa: why not sanity test? 19:23:55 though we also need to make sure it doesn't become an "expected failure" thing that people ignore 19:23:56 but it shouldn't fail the build for versions that have been fixed 19:24:08 luke-jr: that's a good point 19:24:24 a sanity check makes sense 19:25:01 the GCC bug tracker has a reduced test (note the first test *didn't* fail on my unpatched GCC) 19:25:29 luke-jr: yeah, it seems the cases you need to catch it differ slightly between gcc 9 and 10 19:25:49 hmm 19:26:01 I wonder if that means different code could be affected by GCC 9? 19:26:09 or is GCC 9's affected a strict subset of GCC 10's? 19:26:15 we also should figure out if -fno-builtin-memcmp fixes std::lexicographical_compare 19:26:33 I very doubt it does 19:26:37 luke-jr: i don't know 19:27:09 i don't want to monopolize the meeting with this; we can also continue discussion on #20005 19:27:11 https://github.com/bitcoin/bitcoin/issues/20005 | memcmp with constants that contain zero bytes are broken in GCC · Issue #20005 · bitcoin/bitcoin · GitHub 19:27:31 but i think this is deserves some priority 19:27:32 we don't have any proposed topics at least 19:27:58 but we can go to high priority for review if this topic is finished 19:28:54 #topic High priority for review 19:29:07 https://github.com/bitcoin/bitcoin/projects/8 13 blockers, 1 bugfix, 2 chasing concept ACK 19:29:14 i'm somewhat curious to add a patch to GCC that makes it emit a message if it applies this optimization wrongly. that may be possible with some effort given that we have the GCC patch that fixes the issue. but not sure if I'll find time 19:29:42 real_or_random: luke modified his compiler with the patch, and then tried to see which symbols differ before and after 19:29:59 (we can continue after the high priority for review topic) 19:30:03 sipa: oh I missed that, cool 19:30:06 yes 19:30:59 patching gcc is not a solution that we can rely on in general as most people don't build their own gcc 19:31:49 right, I was just doing it to quickly audit what was affected 19:31:58 sure, for that it's a great way 19:32:16 sipa: btw, I think it's using __builtin_memcmp for std::equal or == in PSBT stuff? 19:33:13 nothing to discuss about high priority for review PRs? 19:33:20 i guess not... 19:33:23 pls review rwconf :P 19:33:32 high-priority: #19845 might be close 19:33:35 https://github.com/bitcoin/bitcoin/issues/19845 | net: CNetAddr: add support to (un)serialize as ADDRv2 by vasild · Pull Request #19845 · bitcoin/bitcoin · GitHub 19:33:53 was good to see signet get in this week 19:34:08 indeed 19:34:47 wumpus: oh, we could build with -O1 ? 19:35:05 yes 19:35:14 * luke-jr wonders how -O1 works but -fno-builtin-memcmp doesn't 19:35:34 Doesn't that make IBD take twice as long? 19:35:40 probably 19:35:47 still better than broken builtins though 19:35:50 Might be better to refuse to start with the sanity check 19:35:50 maybe there's some more granular thing we can disable 19:35:59 MarcoFalke: both ☺ 19:36:06 luke-jr: could you build with -fno-builtin-memcmp without the gcc patch, and then (for the affected symbols from your previous list) check if which ones become equal to the fixed gcc output? 19:37:12 sipa: I don't have the unpatched compiler anymore :x 19:37:29 on an x86 server, however, -fno-builtin-memcmp did not fix the bug 19:37:36 for int x= __builtin_memcmp (a, "\0\0\0\0", 4); 19:38:00 sure, but we never call __builtin_memcmp directly from our code 19:38:49 speaking of that, what about other dependency libraries that can be assumed to have been built with the broken compiler 19:39:03 boost, libupnp, libevent ... 19:39:24 we can't fix everything 19:39:35 wumpus: is it even remotely possible to check what compiler compiled an already compiled library? 19:39:35 no, we can't 19:39:47 doesn't mean we can't work around it 19:39:49 I suppose that's an argument for just the sanity check, and not trying to workaround it 19:39:50 elichai2: not in gneral 19:40:35 like just not permitting building with known-to-be-affected compilers, which at least in some cases will avoid the situation where dependencies are built with the same thing 19:40:51 sipa: right 19:40:59 but... if your system itself has bugs introduced by this, i think there is very little we can do about it 19:41:15 for many distributions at least, the dependencies will have been built with the same gcc/g++ that is available 19:42:20 i believe my python3 binary is built with GCC 9.3 19:42:30 $ strings python3.8 | fgrep GCC 19:42:30 [GCC 9.3.0] 19:45:18 luke-jr: did the tests in the final patch fail on your unpatched gcc? 19:45:42 didn't try 19:46:49 I'm going to close the meeting, can continue this discussion afterwards and we're out of topics 19:46:51 #endmeeting