19:02:37 <wumpus> #startmeeting
19:02:37 <lightningbot> Meeting started Thu Sep 24 19:02:37 2020 UTC.  The chair is wumpus. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:02:37 <lightningbot> Useful Commands: #action #agreed #help #info #idea #link #topic.
19:02:41 <jonasschnelli> hi
19:02:43 <jonatack> hi
19:02:43 <MarcoFalke> hi
19:02:45 <hebasto> hi
19:02:46 <amiti> hi
19:02:50 <fjahr> hi
19:02:51 <luke-jr> hi
19:02:53 <sipa> hi
19:02:53 <achow101> hi
19:03:09 <wumpus> #bitcoin-core-dev Meeting: wumpus sipa gmaxwell jonasschnelli morcos luke-jr sdaftuar jtimon cfields petertodd kanzure bluematt instagibbs phantomcircuit codeshark michagogo marcofalke paveljanik NicolasDorier jl2012 achow101 meshcollider jnewbery maaku fanquake promag provoostenator aj Chris_Stewart_5 dongcarl gwillen jamesob ken281221 ryanofsky gleb moneyball kvaciral ariard digi_james
19:03:11 <wumpus> amiti fjahr jeremyrubin lightlike emilengler jonatack hebasto jb55 elichai2
19:03:28 <meshcollider> hi
19:03:49 <jnewbery> hi
19:03:58 <wumpus> one proposed topic: what sipa just said (will have to look up in the log what :-)
19:04:18 <sipa> wumpus: #20005
19:04:19 <gribble> https://github.com/bitcoin/bitcoin/issues/20005 | memcmp with constants that contain zero bytes are broken in GCC · Issue #20005 · bitcoin/bitcoin · GitHub
19:04:38 <luke-jr> https://dpaste.com/GU4GPSPB8 <-- affected symbols in bitcoind
19:04:42 <jeremyrubin> hi
19:04:42 <wumpus> I see
19:05:11 <sipa> yeah i just wanted to draw attention to this potentially very scary bug
19:05:26 <wumpus> how did they manage to break such a basic library function
19:05:30 <kanzure> hi
19:05:38 <luke-jr> wumpus: they turned it into strcmp sometimes it looks like
19:05:43 <sipa> wumpus: it's not a libc bug, it's bug in gcc's implementation of __builtin_memcmp
19:05:50 <luke-jr> at least the consensus code seems to have dodged it
19:06:08 <sipa> but otherwise yes, this is really concerning
19:06:20 <sipa> i believe none of our gitian builds are affected, as we use older GCCs
19:06:27 <sipa> can someone verify that?
19:06:32 <wumpus> that's at least good to know, let me see
19:06:50 <luke-jr> macOS build should use LLVM IIRC
19:06:57 <sipa> luke-jr: indeed
19:07:06 <jeremyrubin> sipa: is this a generaly class of bugs do you reckon or pretty specific to memcmp? should we be auditing all builtins?
19:07:10 <MarcoFalke> sipa: Gitian should use 7 and 8
19:07:11 <luke-jr> I *assume* LLVM didn't reimplement this bug
19:07:13 <wumpus> gcc 8 it seems
19:07:46 <sipa> jeremyrubin: i don't think we have the manpower to do actual auditing of all of gcc's builtins (and other potential bugs), nor is that our job
19:07:50 <jonasschnelli> bionic uses gcc 7, right?
19:08:06 <luke-jr> jonasschnelli: yes
19:08:12 <wumpus> jonasschnelli: but we explicitly install 8 for some platforms
19:08:17 <luke-jr> ^
19:08:28 <sipa> jeremyrubin: i think it's important to note that while the bug was known, we stumbled upon it in the libsecp256k1 repo, in a test
19:08:48 <jonasschnelli> Ah. Yes. Arm / risc uses gcc-8
19:08:54 <luke-jr> hmm
19:08:56 <sipa> so that's kind of a sign that our processes are at least capable of catching some issues like this
19:09:06 <luke-jr> sipa: should I be concerned I *didn't* get any libsecp symbols in my comparison?
19:09:25 <wumpus> I honestly don't now how to prevent us being affected by problems like this
19:09:26 <luke-jr> or is the test only in a newer version?
19:09:26 <sipa> luke-jr: no, i think this is expected (it's in code added in a new test)
19:09:28 <wumpus> computers are broken
19:09:35 <jonasschnelli> indeed
19:10:09 <sipa> reality is often disappointing
19:10:11 <wumpus> I can understand libc bugs in some cases but this is something so low-level
19:10:40 <sipa> anyway, i think we should see what performance impact building with -fno-builtin-memcmp has
19:10:51 <wumpus> sipa: yes but only for affected compilers
19:10:58 <sipa> if that is sufficiently low, i think that's a sufficient solution - perhaps restricted to GCC 9 and 10
19:10:59 <luke-jr> sipa: that won't fix C++ afaik
19:11:06 <sipa> luke-jr: it will
19:11:29 <sipa> luke-jr: std::lexicographical_compare (in the .h file) just calls __builting_memcmp
19:11:38 <MarcoFalke> Should we also add a test that fails with gcc9/10?
19:11:39 <wumpus> I don't even care about peformance there, just add the option, it needs to be correct, if people want performance don't use a broken compiler
19:11:46 <sipa> MarcoFalke: definitely
19:11:52 <luke-jr> sipa: -fno-builtin-* doesn't affect __builtin_*
19:12:00 <sipa> luke-jr: hmm!
19:12:09 <sipa> that's a good point, let's test that
19:12:18 <wumpus> that's the important thing thoug hthat it fixes the issue, a test like MarcoFalke says makes a lot of sense
19:12:35 <luke-jr> at least GCC's docs suggest to whitelist builtins, one should use -fno-builtin and call __builtin_* directly ;)
19:13:00 <wumpus> or just fail compiling on affected compilers, they're going to backport a fix right?
19:13:09 <sipa> wumpus: yes
19:13:12 <luke-jr> wumpus: supposedly, but it looks non-trivial and they haven't yet
19:13:28 <wumpus> I'd prefer things to build at all on compilers with a broken memcmp
19:13:33 <wumpus> +not
19:13:44 <sipa> it's fixed in master, but not in any release
19:14:06 <jonatack> gcc 9.3 and 10.1 seem afficted
19:14:15 <wumpus> (whatever we end up doing, we need a test for this in configure)
19:14:18 <luke-jr> personally, I neutered my builtin memcmp to always fail (fallback to libc) and am rebuilding my system
19:14:40 <sipa> wumpus: my first thought was adding a configure test that exploits the bug... but that won't work when crosscompiling
19:14:43 <sipa> i think?
19:14:57 <luke-jr> well, crosscompiling, we can't run it
19:14:58 <wumpus> sipa: ... ugh
19:15:03 <luke-jr> why not add a sanity check? we have others
19:15:09 <sipa> unless we assume that building for the host platform is always similarly affected as the target platform
19:15:12 <wumpus> yes, you'd need to run it to figure it out right
19:15:14 <jeremyrubin> Is memcmp constexpr?
19:15:21 <jeremyrubin> Maybe it can static_assert...
19:15:26 <sipa> jeremyrubin: it's a C function constexpr is a C++ concept
19:15:29 <wumpus> holy shit this is *really* bad
19:15:41 <luke-jr> sipa: bad assumption
19:15:51 <sipa> luke-jr: yes, i think so
19:15:52 <wumpus> maybe it could expect the symbols generated or something, if it generates a call to strcpy instead?
19:16:08 <sipa> wumpus: there are no calls involved
19:16:09 <luke-jr> wumpus: it doesn't
19:16:14 <sipa> it's the autogenerated builtin code that is wrong
19:16:19 <luke-jr> wumpus: it's unrolling the memcmp
19:16:22 <wumpus> ok...
19:16:31 <sipa> which makes it implementation something more or less equal to strcmp rather than memcmp
19:16:52 <luke-jr> basically turns it into if (a[0] == "x" && a[1] == "x" && …)
19:17:11 <luke-jr> aiui
19:17:24 <wumpus> that's impossible to check in a platform independent way
19:17:44 <jonatack> it's unclear to me if the patch is in the gcc 10.2 release?
19:17:50 <luke-jr> jonatack: no
19:17:52 <sipa> jonatack: it is not in any release
19:18:00 <jonatack> thank you
19:18:03 <real_or_random> fwiw, this is (part of) the GCC patch that fixes this: https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=gcc/builtins.c;h=228db78f32bfdcce31e23850e878843d7a45adcc;hp=eb662112b32e5ad377d2a865d9977c64dc12cc93;hb=d5803b9876b3d11c93d1a10fabb3fbb1c4a14bd6;hpb=3e99ed65cbedf7a6c0abb9cd63c191326995fd34
19:18:21 <jeremyrubin> Could we wrap memcmp with a compile time wrapper that static asserts we aren't using a const array with a 0 byte? Not sure if you can enable-if-fu around constexpr args.
19:18:38 <real_or_random> it's hard to read but I think it's possible to deduce a few condition under which this bug can't happen
19:18:42 <sipa> jeremyrubin: that's not a solution
19:18:43 <jeremyrubin> The bug only happens when it's a const array right?
19:19:05 <sipa> jeremyrubin: if we can identify all potentially affected code paths, we could just replace them with my_memcmp that doesn't have the bug too
19:19:31 <jeremyrubin> ah right you're worried about other stdlib calls and stuff.
19:19:37 <wumpus> but C++ also generates memcmps it's not just our own code
19:19:49 <wumpus> it's a very low-level function
19:19:52 <luke-jr> sipa: I just did that? :P
19:20:02 <sipa> right - "all code paths" also contains std::lexicographical_compare and a few other things
19:20:10 <sipa> luke-jr: right, but not in a guaranteed automated way
19:20:23 <sipa> luke-jr: it's a really useful way of finding potential issues now, though
19:20:26 <sipa> thanks for that
19:21:29 <sipa> i've tried making configure detect the version of GCC, and so far been unsuccesful
19:22:03 <sipa> my google-fu led me to 9-year old stackoverflow posts with comments "ah yes, that doesn't work anymore, let's just delete that functionality" from the m4 gcc scripts
19:22:09 <sipa> (though i'm sure people with more autoconf skills can do better)
19:22:11 <luke-jr> >_<
19:22:28 <sipa> the idea is that in general you should test for compiler features, rather than exact versions
19:22:36 <sipa> which is absolutely true, except for bug workarounds :p
19:22:37 <wumpus> and as soon as the patch is backported to gcc 9 and 10 it will become more difficult too
19:22:38 <luke-jr> sipa: ideally, yes
19:22:52 <luke-jr> sipa: even for bug workarounds, you can't use the version number to see that my GCC is patched ;)
19:23:02 <sipa> that's true, but it's a safe superset
19:23:07 <wumpus> worse, some distributions might backport the single patch without changing the version number
19:23:25 <wumpus> sure, it dpeends on what it does in that case
19:23:26 <sipa> anyway, what's the plan?
19:23:33 <wumpus> if it only passes an extra command line flag it's ok
19:23:35 <sipa> i think the first thing to do is add a unit test that catches the bug
19:23:54 <luke-jr> sipa: why not sanity test?
19:23:55 <sipa> though we also need to make sure it doesn't become an "expected failure" thing that people ignore
19:23:56 <wumpus> but it shouldn't fail the build for versions that have been fixed
19:24:08 <sipa> luke-jr: that's a good point
19:24:24 <wumpus> a sanity check makes sense
19:25:01 <luke-jr> the GCC bug tracker has a reduced test (note the first test *didn't* fail on my unpatched GCC)
19:25:29 <sipa> luke-jr: yeah, it seems the cases you need to catch it differ slightly between gcc 9 and 10
19:25:49 <luke-jr> hmm
19:26:01 <luke-jr> I wonder if that means different code could be affected by GCC 9?
19:26:09 <luke-jr> or is GCC 9's affected a strict subset of GCC 10's?
19:26:15 <sipa> we also should figure out if -fno-builtin-memcmp fixes std::lexicographical_compare
19:26:33 <luke-jr> I very doubt it does
19:26:37 <sipa> luke-jr: i don't know
19:27:09 <sipa> i don't want to monopolize the meeting with this; we can also continue discussion on #20005
19:27:11 <gribble> https://github.com/bitcoin/bitcoin/issues/20005 | memcmp with constants that contain zero bytes are broken in GCC · Issue #20005 · bitcoin/bitcoin · GitHub
19:27:31 <sipa> but i think this is deserves some priority
19:27:32 <wumpus> we don't have any proposed topics at least
19:27:58 <wumpus> but we can go to high priority for review if this topic is finished
19:28:54 <wumpus> #topic High priority for review
19:29:07 <wumpus> https://github.com/bitcoin/bitcoin/projects/8  13 blockers, 1 bugfix, 2 chasing concept ACK
19:29:14 <real_or_random> i'm somewhat curious to add a patch to GCC that makes it emit a message if it applies this optimization wrongly. that may be possible with some effort given that we have the GCC patch that fixes the issue. but not sure if I'll find time
19:29:42 <sipa> real_or_random: luke modified his compiler with the patch, and then tried to see which symbols differ before and after
19:29:59 <sipa> (we can continue after the high priority for review topic)
19:30:03 <real_or_random> sipa: oh I missed that, cool
19:30:06 <real_or_random> yes
19:30:59 <wumpus> patching gcc is not a solution that we can rely on in general as most people don't build their own gcc
19:31:49 <luke-jr> right, I was just doing it to quickly audit what was affected
19:31:58 <wumpus> sure, for that it's a great way
19:32:16 <luke-jr> sipa: btw, I think it's using __builtin_memcmp for std::equal or == in PSBT stuff?
19:33:13 <wumpus> nothing to discuss about high priority for review PRs?
19:33:20 <sipa> i guess not...
19:33:23 <luke-jr> pls review rwconf :P
19:33:32 <jonatack> high-priority: #19845 might be close
19:33:35 <gribble> https://github.com/bitcoin/bitcoin/issues/19845 | net: CNetAddr: add support to (un)serialize as ADDRv2 by vasild · Pull Request #19845 · bitcoin/bitcoin · GitHub
19:33:53 <jonatack> was good to see signet get in this week
19:34:08 <sipa> indeed
19:34:47 <luke-jr> wumpus: oh, we could build with -O1 ?
19:35:05 <wumpus> yes
19:35:14 * luke-jr wonders how -O1 works but -fno-builtin-memcmp doesn't
19:35:34 <MarcoFalke> Doesn't that make IBD take twice as long?
19:35:40 <wumpus> probably
19:35:47 <wumpus> still better than broken builtins though
19:35:50 <MarcoFalke> Might be better to refuse to start with the sanity check
19:35:50 <luke-jr> maybe there's some more granular thing we can disable
19:35:59 <luke-jr> MarcoFalke: both ☺
19:36:06 <sipa> luke-jr: could you build with -fno-builtin-memcmp without the gcc patch, and then (for the affected symbols from your previous list) check if which ones become equal to the fixed gcc output?
19:37:12 <luke-jr> sipa: I don't have the unpatched compiler anymore :x
19:37:29 <luke-jr> on an x86 server, however, -fno-builtin-memcmp did not fix the bug
19:37:36 <luke-jr> for    int x= __builtin_memcmp (a, "\0\0\0\0", 4);
19:38:00 <sipa> sure, but we never call __builtin_memcmp directly from our code
19:38:49 <wumpus> speaking of that, what about other dependency libraries that can be assumed to have been built with the broken compiler
19:39:03 <wumpus> boost, libupnp, libevent ...
19:39:24 <luke-jr> we can't fix everything
19:39:35 <elichai2> wumpus: is it even remotely possible to check what compiler compiled an already compiled library?
19:39:35 <wumpus> no, we can't
19:39:47 <sipa> doesn't mean we can't work around it
19:39:49 <luke-jr> I suppose that's an argument for just the sanity check, and not trying to workaround it
19:39:50 <wumpus> elichai2: not in gneral
19:40:35 <sipa> like just not permitting building with known-to-be-affected compilers, which at least in some cases will avoid the situation where dependencies are built with the same thing
19:40:51 <wumpus> sipa: right
19:40:59 <sipa> but... if your system itself has bugs introduced by this, i think there is very little we can do about it
19:41:15 <wumpus> for many distributions at least, the dependencies will have been built with the same gcc/g++ that is available
19:42:20 <sipa> i believe my python3 binary is built with GCC 9.3
19:42:30 <sipa> $ strings python3.8 | fgrep GCC
19:42:30 <sipa> [GCC 9.3.0]
19:45:18 <jonatack> luke-jr: did the tests in the final patch fail on your unpatched gcc?
19:45:42 <luke-jr> didn't try
19:46:49 <wumpus> I'm going to close the meeting, can continue this discussion afterwards and we're out of topics
19:46:51 <wumpus> #endmeeting