19:01:07 <wumpus> #startmeeting
19:01:07 <lightningbot> Meeting started Thu Aug  2 19:01:07 2018 UTC.  The chair is wumpus. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:07 <lightningbot> Useful Commands: #action #agreed #help #info #idea #link #topic.
19:01:12 <jnewbery> hi!
19:01:31 <promag> hi
19:01:33 <cfields> hi
19:01:35 <provoostenator> hi
19:01:42 <jonasschnelli> hi
19:02:03 <wumpus> #bitcoin-core-dev Meeting: wumpus sipa gmaxwell jonasschnelli morcos luke-jr btcdrak sdaftuar jtimon cfields petertodd kanzure bluematt instagibbs phantomcircuit codeshark michagogo marcofalke paveljanik NicolasDorier jl2012 achow101 meshcollider jnewbery maaku fanquake promag provoostenator
19:02:15 <kanzure> hi.
19:02:19 <achow101> hi
19:02:23 <meshcollider> Hi
19:02:34 <instagibbs> ello
19:03:10 <gmaxwell> Hi.
19:03:24 <wumpus> topics?
19:04:08 <luke-jr> crickets
19:04:31 <wumpus> crickets are... good I guess
19:04:52 <wumpus> 0.17 PRs: https://github.com/bitcoin/bitcoin/pulls?q=is%3Aopen+is%3Apr+milestone%3A0.17.0
19:05:22 <wumpus> 0.17 issues: https://github.com/bitcoin/bitcoin/issues?q=is%3Aopen+is%3Aissue+milestone%3A0.17.0
19:05:52 <luke-jr> I guess we could discuss the CXXFLAGS stuff
19:05:56 <wumpus> 0.17 release schedule: #12624
19:05:57 <gribble> https://github.com/bitcoin/bitcoin/issues/12624 | Release schedule for 0.17.0 · Issue #12624 · bitcoin/bitcoin · GitHub
19:06:00 <luke-jr> I don't really have a good solution for it
19:06:28 <wumpus> #topic CXXFLAGS stuff
19:06:35 <gmaxwell> Whats the issue?
19:06:58 <luke-jr> gmaxwell: autotools forces user CXXFLAGS after our own; so when the user builds with -mno-avx2, the build simply fails
19:07:02 <provoostenator> #13789
19:07:04 <gribble> https://github.com/bitcoin/bitcoin/issues/13789 | crypto/sha256: Use pragmas to enforce necessary intrinsics for GCC and Clang by luke-jr · Pull Request #13789 · bitcoin/bitcoin · GitHub
19:07:19 <cfields> eh?
19:07:29 <cfields> it doesn't force it, we do that ourselves
19:07:35 <gmaxwell> can someone please drop the registed users +q for now? sdaftuar is muted.
19:07:35 <luke-jr> ie, autotools calls the compiler with our -mavx2 *followed by* -mno-avx2
19:07:40 <wumpus> well apparaently there's a problem where the build system passes the wrong flags, and luke-jr's PR works around it with pragmas
19:07:43 <gmaxwell> or at least voice sdaftuar
19:07:43 <wumpus> that seems wrong to me
19:07:50 <cfields> the intention is for any user-passed flags to be able to override the ones we add. If that's not the case, it's a bug.
19:07:59 <luke-jr> cfields: that's the problem in this case
19:08:06 <luke-jr> we can't let the user override -mavx2 here
19:08:09 <sdaftuar> hi
19:08:23 <cfields> luke-jr: you mean we shouldn't, or we currently don't?
19:08:43 <luke-jr> cfields: autotools makes it impossible to override user CXXFLAGS, but for these files we must or we fail to compile
19:08:45 <gmaxwell> luke-jr: why would the user ever pass -mno-avx2 in the first place?  We shouldn't be using -mavx except on special files that need it to compile.
19:08:47 <cfields> oh, I see what you mean
19:08:57 <luke-jr> gmaxwell: to avoid AVX2 instructions
19:09:02 <cfields> luke-jr: eh?
19:09:07 <luke-jr> gmaxwell: it's those special files that fail to compile
19:09:16 <sipa> hi!
19:09:26 <cfields> luke-jr: fail how? do we just need to check for more intrinsics?
19:09:31 <provoostenator> (I guess it was crickets and the muffled voice of sdaftuar  in the dinstance)
19:09:39 <gmaxwell> luke-jr: If the user wants to avoid executing avx2 instructions they need do nothing. They don't have to pass any special options.
19:09:59 <gmaxwell> cfields: he is saying that if he builds with CXXFLAGS=-mno-avx2 the compile fails.
19:10:10 <luke-jr> gmaxwell: AVX2 is enabled by default with some -march options
19:10:12 <cfields> gmaxwell: I assume the issue is some failures to compile because of a busted compiler, so there's a desire to be able to avoid them entirely.
19:10:21 <luke-jr> https://github.com/bitcoin/bitcoin/issues/13758
19:10:38 <gmaxwell> luke-jr: why would anyone -march=<thing with avx2> then -mno-avx2?  that just seems busted.
19:10:38 <cfields> ooooooh
19:11:04 <wumpus> it looks like a really contrives scenario to me
19:11:04 <luke-jr> gmaxwell: I'm not sure why laurentb is doing it, but there's no reason they shouldn't be able to
19:11:22 <gmaxwell> in any case, why not detect the -mno-avx2 and then don't even compile the file?
19:11:24 <wumpus> not worth it polluting the code with all kinds of compiler specific pragmas at least
19:11:25 <cfields> ok, there are plenty of ways to solve that. I think we can just discuss on the github issue?
19:11:35 <luke-jr> gmaxwell: autotools says we're supposed to allow changing CXXFLAGS after configure
19:11:43 <wumpus> agree with cfields , there's no hurry with this
19:11:44 <luke-jr> make CXXFLAGS=…
19:11:47 <luke-jr> okay
19:12:03 <gmaxwell> in any case -march=<hardware that has avx2> -mno-avx2 sounds like a misguided thing that we shouldn't take a lot of complexity to support, unless someone knows otherwise.
19:12:09 <cfields> gmaxwell: we turn off all flags except the ones we're testing when doing the autoconf checks so that they don't cause unrelated errors.
19:12:10 <gmaxwell> Esp because there is -mtune
19:12:27 <luke-jr> cfields: no we don't
19:12:38 <MarcoFalke> Unassigned the issues from the 0.17 milestone for now
19:12:42 <cfields> luke-jr: ones that we add, we do
19:13:02 <luke-jr> MarcoFalke: it's a must-fix for 0.17
19:13:22 <wumpus> no, it's not a must-fix for 0.17
19:13:26 <gmaxwell> I don't see how this is a must fix.
19:13:33 <wumpus> agree fully with MarcoFalke
19:13:36 <luke-jr> broken build system..
19:13:48 <wumpus> any other topics?
19:13:51 <gmaxwell> "User sets weird options which seem to make no sense as we know, and then something that arguably should work but fails" is not really blocker material.
19:13:59 <meshcollider> Lol
19:14:00 <provoostenator> Windows
19:14:03 <provoostenator> (topic)
19:14:04 <MarcoFalke> I left the pulls for review in the 0.17 milestone, so if reviewers like them, they can still be merged
19:14:21 <luke-jr> provoostenator: what about Windows? :p
19:14:21 <provoostenator> As in: do we want to fix the Windows unicode stuff, given that there's still two weeks?
19:14:25 <wumpus> right, it's annoying for the specific user, but if you have really specific needs like that you can patch around it
19:14:35 <MarcoFalke> Let's do the unicode stuff for 0.18
19:14:36 <wumpus> #topic Windows (provoostenator)
19:14:37 <provoostenator> I think the opinion in the ticket was no.
19:14:51 <MarcoFalke> It would require a leveldb bump and major changes
19:14:55 <gmaxwell> wumpus: we should find out what the user is doing, might just be some greater confusion... like they want to benchmark each way or something.
19:15:05 <MarcoFalke> Not sure if we can review and test that in such a short time frame
19:15:25 <provoostenator> Ok, and we're not getting tons of reports about this either?
19:15:31 <jonasschnelli> Is the unicode stuff just about filenames (wallet)?
19:15:32 <wumpus> gmaxwell: right!
19:15:43 <MarcoFalke> I am looking into restoring the proper warning for -wallet=non-ascii, but that should be all for 0.17
19:15:51 <MarcoFalke> jonasschnelli: Also datadir
19:15:52 <sipa> only half here, but feel free to ping me
19:15:58 <provoostenator> I think it's mostly filename yes, but also labels.
19:16:30 <jonasschnelli> Yeah. We should fix that. But this seems to be open since forever and I don't see a pressing need for 0.17
19:16:33 <provoostenator> But afaik I know it works if your system locale is set "correctly".
19:16:53 <MarcoFalke> jonasschnelli: Indeed, anything that is not a regression should go into 0.18
19:17:04 <cfields> provoostenator: due to the nature of the bug, I think many of the people who would be reporting it may not speak english. So the significance may be a little under-represented.
19:17:12 <cfields> s/bug/issue/
19:17:16 <gmaxwell> I was trying to say what cfields just said.
19:17:22 <meshcollider> Good point
19:17:29 <jonasschnelli> Yes. Good point.
19:17:35 <luke-jr> provoostenator: wait, labels are broken?
19:17:37 <gmaxwell> We shouldn't take few reports to mean few issues... but it the change is invasive and not ready, and the issue isn't new...
19:17:58 <cfields> gmaxwell: agreed. -1 from me as well. Just wanted to throw that out there.
19:17:59 <provoostenator> luke-jr: not sure, try with an english system locale and then adding Chinese labels, I can't remember if that works.
19:18:14 <luke-jr> provoostenator: last I checked, we had functional tests for non-English labels :/
19:18:17 <provoostenator> But with a Chinese system locale it does work afaik, hence it doesn't seem super urgent.
19:18:24 <wumpus> I think it's a serious issue
19:18:27 <MarcoFalke> We can backport to 0.17.1, if it qualifies as bug fix
19:18:28 <provoostenator> Functional tests that run on linux.
19:18:36 <wumpus> but it's risky to do this last minute
19:18:42 <wumpus> and it's not a regression
19:18:44 <gmaxwell> MarcoFalke: +1
19:19:05 <wumpus> MarcoFalke: agree
19:19:08 <provoostenator> Maybe just merge it into master after the 0.17 split so it gets testing quickly. Then we could always backport it if there's strong demand?
19:19:14 <gmaxwell> it's a bug, maybe one not worth backporting depending on how tidy the fix is.
19:19:37 <wumpus> it's unfortunate that this requires such invasive changes for windows
19:19:52 <wumpus> specific changes not required for other OSes
19:19:59 <ken2812221> This is really hard to fix.
19:20:03 <wumpus> which means most of use cannot test it usefully
19:20:24 <luke-jr> can it be reproduced in WINE?
19:20:43 <provoostenator> luke-jr: I haven't tried, I just use a virtual box with Windows 10
19:20:50 <MarcoFalke> I don't think we should fall back to WINE for testing that issue
19:21:16 <provoostenator> Someone recently offered to run and maintain Windows integration builds
19:21:17 <meshcollider> It doesn't sound like WINE would have the same issue tbh
19:21:27 <gmaxwell> Health professions recommend against trying to use wine to solve your problems.
19:21:49 <cfields> careful we don't spiral to homebrew...
19:21:50 <MarcoFalke> This really needs to be tested on native Windows (Not against testing it in wine additionally, though)
19:22:06 <wumpus> yes
19:22:18 <provoostenator> #12613
19:22:19 <gribble> https://github.com/bitcoin/bitcoin/issues/12613 | [CI] Adding MSVC build to CI check with Appveyor · Issue #12613 · bitcoin/bitcoin · GitHub
19:22:36 <ken2812221> If there is a way to run functionfal test on Window
19:22:43 <wumpus> MSVC is a orthogonal issue, I think, to solving this in the gitian builds
19:22:48 <MarcoFalke> I occasionally run them  on appveyro
19:23:10 <luke-jr> MarcoFalke: it'd be nice to have Travis test it in the meantime
19:23:39 <wumpus> well not orthogonal but I wouldn't be surprised if there are differences mingw versus MSVC in unicoode handling
19:23:40 <MarcoFalke> Sure, if you can get this to work on travis, why not
19:24:07 <MarcoFalke> ken2812221: I think I also got them to run on my windows vm once
19:24:21 <wumpus> gmaxwell: this is one of the cases where you'd recommend stronger liquior instead.
19:24:38 <ken2812221> MSVC adds a unicode version of fstream, but mingw does not have that.
19:24:52 <meshcollider> MarcoFalke: "once" sounds promising ;)
19:25:16 <MarcoFalke> I could try once more and then write down the steps I did, heh
19:25:59 <cfields> ken2812221: huh, we could potentially add it and upstream to mingw, then.
19:26:52 <wumpus> why does utf-8 need a special fstream
19:26:54 <wumpus> I'm confused
19:27:06 <wumpus> why does this have to be so messed up
19:27:16 <ken2812221> wumpus: just filename
19:27:33 <ken2812221> It's fine to read the stream
19:27:34 <wumpus> can't we just drop windows support
19:27:38 * wumpus ducks
19:27:41 <meshcollider> Lol
19:28:02 <provoostenator> Maybe run it inside at little Linux VM? Can't be worse than Electron :-)
19:28:07 <wumpus> yesss
19:28:21 <wumpus> windows 10 already includes ubuntu right
19:28:29 <midnightmagic> it's messed up because windows can still run old windows crap and they have to support old api forever, including all the crappy api they built when the company was on the rocks before they discovered scm
19:28:37 <wumpus> let's dump the win32 garbage and use that...
19:28:41 <gmaxwell> lol. really with the proposed builder stuff, would building a whole linux system really be worse? :P
19:28:50 <cfields> wumpus: Doesn't a brand new win10 update add a bunch of compat stuff that would avoid these issues? I don't think it'd be crazy to consider dropping support for anything less than that reasonably soon.
19:29:04 <wumpus> cfields: it does! that's true
19:29:05 <provoostenator> Not by default, but you can install it. I once did the inception thing: Gitian builder VM inside Ubuntu inside Windows inside Virtual Box on Mac. It crashed.
19:29:27 <gmaxwell> cfields: I dont know the details, but my understanding is that a lot of people don't want to run windows 10 due to integrated adware or something?
19:29:56 <provoostenator> gmaxwell: the latest update, at least for me, even forced me to allow analytics data
19:30:07 <cfields> whoa
19:30:17 <meshcollider> Yeah it's much more invasive
19:30:27 <provoostenator> They're definately going for the get hacked by random people on the internet or get spied on by us sales pitch.
19:30:38 <cfields> gmaxwell: I suppose those people would be used to running outdated versions of things, then :\
19:30:58 <provoostenator> (or both)
19:31:22 <wumpus> ugh
19:31:35 <wumpus> then again, this is not a regression
19:31:40 <wumpus> what works on windows, works.
19:31:56 <gmaxwell> thats a possibility too, and better than dropping support for windows...
19:32:07 <wumpus> the question is whether we should change 10000 lines just to accomodate unicode in it
19:32:38 <gmaxwell> It seems really surprising that we'd need to change more than some wrapper functions.
19:32:45 <luke-jr> ^
19:32:55 <wumpus> yes it seems a design error in the first place if this cannot be wrapped somehow
19:33:27 <cfields> agreed. And that's a reasonable thing to fix, imo. IIRC it also means we have trouble adding filesystem sandboxing.
19:33:28 <wumpus> I do think ken2812221's PR is fairly ok in that regard
19:34:30 <wumpus> yes the reason I did the fs:: stuff is to allow for filesystem sandboxing
19:35:43 <cfields> wumpus: more is required though, right? I assume something along the lines of ken2812221's PR would help?
19:36:55 <wumpus> cfields: I had it down to only changes in fs.h and adding a fs.cpp, afaik
19:37:08 <cfields> ah, ok. nm then.
19:37:38 <wumpus> but maybe more is needed now that the code evolved further...
19:38:07 <gmaxwell> I hope we can find a set of changes that a reasonable independant of windows, so that the windows fix is just a couple files changed.
19:38:25 <wumpus> I think my main drawback in ken2812221 's PR is that it changes .string to stringu8 or something, which seems unnecessary, all strings should be utf-8
19:38:48 <wumpus> if you make sure the wrapper does that you don't need to change it everywhere in the code
19:39:42 <ken2812221> wunpus: That's how boost's path work. It need to pass a utf8 converter.
19:40:04 <wumpus> ken2812221: yes...
19:40:21 <wumpus> but our own wrapper could do that automatically
19:40:23 <ken2812221> Actually I have thought that before, but it need to patch boost.
19:40:36 <luke-jr> didn't we want to get rid of boost anyway?
19:40:39 <wumpus> or wrap boost
19:40:40 <luke-jr> maybe this is a good opportunity
19:40:47 <wumpus> we'll do so, in time
19:41:12 <wumpus> yes
19:41:21 <wumpus> any other topics?
19:41:58 <gmaxwell> Topic suggestion: Leveldb FD usage on x86_64
19:42:59 <wumpus> #topic leveldb FD usage on x86_64 (gmaxwell)
19:43:13 <gmaxwell> There was a recent report from a user hitting the select limit on his x86_64 linux host. Inspection with lsof shows that leveldb is using a lot of FDs on nodes where we expected it to be mostly using mmap.  Apparently leveldb has a number of mmaps limit, as far as I know there isn't any reason we shouldn't increase it.
19:43:47 <gmaxwell> (seperately we should move to using poll ... but increasing the mmap limit should be a ~1 line change, unless someone knows a reason to not do so)
19:44:23 <wumpus> I think this limit was increased recently
19:44:38 <wumpus> specifically for 64-bit OSes, as it was deemed to be no problem
19:44:51 <wumpus> of course, I'm not surprised it turns out to be it is... :/
19:45:27 <wumpus> the reason to increase it was something with performance
19:46:03 <wumpus> #12495 maybe?
19:46:06 <gribble> https://github.com/bitcoin/bitcoin/issues/12495 | Increase LevelDB max_open_files by eklitzke · Pull Request #12495 · bitcoin/bitcoin · GitHub
19:46:21 <wumpus> "utACK ccedbaf - with the comment that I think relying on undocumented behavior of leveldb is risky. "
19:46:35 <wumpus> yeah, of course this is going to bite us in the ass
19:47:25 <wumpus> so revert?
19:47:33 <cfields> upstream has updated a bunch of stuff lately (namely moving to c++11 by default and dropping the need for lots of platform-dependent code). Has this been addressed as well, by any chance?
19:48:08 <gmaxwell> wumpus: hm so reading this it sounds to me that there is a seperate limit of 1000 mmaps. And running into that limit is what is causing more FDs to be used than we expected.
19:48:23 <gmaxwell> wumpus: so it might be possible to not revert that patch, but instead increase the maximum mmaps.
19:48:24 <cfields> (that bump is going to be painful, btw)
19:49:21 <wumpus> gmaxwell: I like switching to anything else than select() though!
19:49:29 <wumpus> we've been stuck with select with no good reason
19:49:32 <gmaxwell> wumpus: yes, I think we should do that too!
19:49:51 <sipa> gmaxwell: that would explain why i saw exactly 999 mmaps
19:49:54 <wumpus> my reason used to be, well, we're going to switch to using libevent for P2P any time now!
19:49:58 <gmaxwell> BlueMatt: and phantomcircuit: both previously had private patches to change to poll.
19:49:58 <wumpus> but that's been years
19:50:01 <wumpus> let's just do it
19:50:17 <sipa> wumpus: any time now!
19:50:19 <wumpus> change to poll() for unix-ish OSes
19:50:24 <gmaxwell> sipa: my comments above about the map limit are based on ossifrage (reporter's) work, and some of the comments in that PR above.
19:50:30 <cfields> wumpus: sorry :(
19:51:05 <gmaxwell> In any case, I think there are two issues:  (1) switch to poll (duh but can we do it for 0.17) and (2) leveldb should probably be allowed more maps unless we know some reason to not do that.
19:51:06 <wumpus> cfields: nooOO sorry I didn't mean it as an attack on you, I could have done it, many other people could have done it, but we didn't
19:51:24 <gmaxwell> and maybe (3) potentially revert that PR; but given my understanding, I don't think we need to.
19:51:49 <gmaxwell> Leveldb being map limited is probably going to end up bad for performance even if we no longer had the FD issues.
19:51:49 <luke-jr> man mmap doesn't talk about a specific mmap limit
19:52:10 <wumpus> there is no mmap limit afaik
19:52:15 <gmaxwell> ^
19:52:38 <gmaxwell> Thats where my concern about increasing leveldb's mmap usage comes from-- I dont understand why the limit is there.
19:52:41 <cfields> there's definitely something that limits it. Some unrelated sysctl ?
19:52:49 <cfields> I know i've seen it documented somewhere.
19:52:49 <gmaxwell> Unless they're trying to reduce TLB thrashing or something.
19:54:41 <wumpus> cfields: for 0.18 we should definitely update leveldb though
19:54:55 <ossifrage> I bumped my mmap file limit up to 4k, currently 1180 ldb files are mapped and it isn't using any fds for ldb files
19:54:57 <gmaxwell> https://github.com/bitcoin/bitcoin/pull/12495#issuecomment-367815005 < eklitzke suggests increasing leveldb's mmap limit here. ossifrage also apparently did that and reported it fixed his problems, though his specific problems are likely better fixed by switching to poll.
19:55:04 <wumpus> for 0.17 I'd propose to keep it like this, not do anything wild...
19:55:13 <sdaftuar> cfields: if i read correctly, the internet says sysctl vm.max_map_count
19:55:40 <gmaxwell> I thought it was probably too late to switch 0.17 to poll, but increasing the leveldb map count seems like an easy mitigation that probably improves performance.
19:56:01 <gmaxwell> [gmaxwell@bean ~]$ sysctl vm.max_map_count
19:56:01 <gmaxwell> vm.max_map_count = 65530
19:56:12 <cfields> is there a per-process one?
19:56:35 <ossifrage> vm.max_map_count = 65530 (same on my box, but that number seems a bit low to be a global count)
19:56:37 <cfields> gmaxwell: I'd be really nervous about that. There are so many specific quirks related to select/poll/epoll/kqueue
19:57:06 <ossifrage> It is per process
19:57:19 <cfields> (not against the change at all, only hesitant for the 0.17 cycle)
19:57:34 <wumpus> vm.max_map_count = 65530
19:57:44 <luke-jr> vm.max_map_count = 65530
19:57:46 <wumpus> checked on 3 linux boxes, same output
19:57:54 <gmaxwell> likewise.
19:57:54 <wumpus> I... strongly doubt this is an issue
19:58:24 <gmaxwell> So it might be prudent for 0.17 to bump the leveldb max to 4k or something.
19:58:32 <wumpus> if you manage to map 65000+ files, well, I"m not surprised you run into trouble
19:58:45 <gmaxwell> I still just wish I better understood why they had that 1000 default.
19:59:01 <luke-jr> what happens if the limit is hit?
19:59:28 <phantomcircuit> gmaxwell, mine only worked on linux and was actually just broken on windows
20:00:05 <wumpus> *dong*
20:00:19 <wumpus> #endmeeting