The Wayback Machine - https://web.archive.org/web/20201126052119/https://github.com/GameFoundry/bsf/issues/139

Broken PCH's on Mac and Broken Unity Builds on Windows and Mac #139

Open
cwfitzgerald opened this issue May 31, 2018 · 21 comments
Comments

@cwfitzgerald (Contributor) commented May 31, 2018

EDITED SUMMARY
PCH implemented on Windows, Linux. Still needs to be fixed for Mac.
Unity implemented on Linux. Still needs to be fixed for Windows and Mac.

ORIGINAL ISSUE
As I'm starting to integrate with bsf, I think precompiled headers (and unity builds) would significantly improve compile times. The idiomatic way to do this with CMake is cotire. I've done some poking around and it seems perfectly doable. If I made these changes properly, would you be interested in having them? I'm thinking of a setup where the build uses cotire if it's detected, and otherwise behaves as if nothing happened.
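The opt-in idea could be sketched roughly like this; this is not bsf's actual build code, and the `CMake/` module path is illustrative:

```cmake
# Pull in cotire only when cotire.cmake is present; do nothing otherwise.
list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/CMake")
include(cotire OPTIONAL RESULT_VARIABLE COTIRE_MODULE)

if(COTIRE_MODULE)  # full path when found, NOTFOUND otherwise
  cotire(bsf)      # adds a precompiled header and a bsf_unity target
endif()
```

`include(... OPTIONAL RESULT_VARIABLE ...)` is what makes the dependency soft: a missing module is simply reported as `NOTFOUND` instead of being a fatal error.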

@jonesmz (Contributor) commented May 31, 2018

What is cotire? I've never heard of it.

@cwfitzgerald (Contributor, Author) commented May 31, 2018

https://github.com/sakra/cotire

Cotire (CMake COmpile TIme REducer) is mature and does both unity builds and PCHs with minimal configuration.

@jonesmz (Contributor) commented May 31, 2018

There's no built-in precompiled header support in CMake already?

@cwfitzgerald (Contributor, Author) commented May 31, 2018

Nope, cotire is the way everyone does it. Honestly, it's the best CMake library I've ever had to use. Remember that CMake only just got LTO support, so it will be a while before anything as fancy as unity builds or precompiled headers comes along natively.

@jonesmz (Contributor) commented May 31, 2018

Weird. I'm surprised CMake doesn't have native support.

Oh well, it's not like I've got any say in the matter. I think it sounds like a reasonable thing to support, though.

@cwfitzgerald (Contributor, Author) commented May 31, 2018

It should only take a day to implement everything; the only real problem is X11/X.h, which defines everything in the world, but you can basically tell cotire not to include it in the precompiled header. I just don't want to go to the trouble of doing it all only for it not to be accepted :)
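That exclusion can be expressed with cotire's `COTIRE_PREFIX_HEADER_IGNORE_PATH` target property; a hedged sketch (the X11 path is illustrative, and listing the source/build dirs preserves cotire's defaults, which the assignment would otherwise replace):

```cmake
# Keep X11 headers out of the generated prefix header.
set_property(TARGET bsf PROPERTY COTIRE_PREFIX_HEADER_IGNORE_PATH
  "${CMAKE_SOURCE_DIR}" "${CMAKE_BINARY_DIR}" "/usr/include/X11")
```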

@BearishSun (Member) commented May 31, 2018

I'd be very interested in speeding up compile times. Reading up on cotire, it sounds good, although its documentation says it automatically sets up a unity build for the specified target. Won't that hurt incremental builds?

Also, it seems to be configurable per target. I'm wondering how well it will work with the main bsf target and all the plugins that depend on it (does only the main target get precompiled headers, or does every plugin get its own?).

@cwfitzgerald (Contributor, Author) commented May 31, 2018

Unity builds are separate from normal builds. When you run `ninja all` you get a normal build with precompiled headers (good for incremental builds); when you run `ninja all_unity` you get a unity build, which is good for just getting the thing built as fast as possible. I was also genuinely surprised by how much faster unity builds are through cotire. Since they aren't true single-file unity builds (they are N builds, N being the number of cores you have), they take advantage of parallelism and finish very quickly. The only issue is that you have to ensure everything still works when merged together, but that's my job :)

All targets with over 3 source files get precompiled headers; with fewer, a PCH doesn't really help (generating it takes longer than it saves in compilation). This default can be changed to whatever works better. The PCHs are made up of all the headers a target includes that are not part of the local tree (so targets don't precompile their own headers).
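The 3-source threshold is cotire's documented default, exposed as a tunable cache variable:

```cmake
# Targets with fewer source files than this get no precompiled header.
set(COTIRE_MINIMUM_NUMBER_OF_TARGET_SOURCES 3
    CACHE STRING "Minimum number of sources before cotire adds a PCH")
```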

If you're willing to use it, I will start working on a PR.

@BearishSun (Member) commented May 31, 2018

Yep, sounds good :) Very interested to see the speedup.

@cwfitzgerald (Contributor, Author) commented May 31, 2018

Preliminary testing is done. I converted everything to use precompiled headers and built the examples (on my machine):

Build Type     Regular   PCH       Difference
GCC-Debug      363 sec   250 sec   1.4x
GCC-Release    455 sec   386 sec   1.17x
Clang-Debug    316 sec   182 sec   1.72x
Clang-Release  422 sec   325 sec   1.30x

There is a problem around Clang's internal forwarding "inttypes.h" header, which really does not want to be included in a PCH; I need to find a clean cross-distro solution for that. Other than that one error, everything should work fine. I still have to tackle unity builds, though; I haven't even tried one yet.

@BearishSun (Member) commented May 31, 2018

Thanks for the benchmarks; nice to see a noticeable speedup.

I've quickly checked out the changes, and they look minimally intrusive, which is really nice. The only thing that worries me is the explicit paths to the clang/llvm folders, especially the versioned llvm ones. Does that mean things will break if a user has Clang installed in a non-default path, or when a newer version is released?

@cwfitzgerald (Contributor, Author) commented Jun 1, 2018

As of right now it does, and that is the "inttypes.h" issue I mentioned; those hard-coded paths were just to get a full proof of concept working. I need to find a clean solution, and I have a couple of ideas I'll look at tomorrow, probably something involving querying Clang's search paths, then searching them for "inttypes.h" to find the directory the offending copy lives in.

@cwfitzgerald (Contributor, Author) commented Jun 1, 2018

My latest update provides a nice generic interface for that. It asks the compilers for their include paths by parsing the verbose log of compiling an empty file, uses those paths to find the "inttypes.h" file for both C and C++, and then bans that file from the precompiled header. Seems to work well.
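A simplified sketch of that approach (the actual parsing in the commit may differ): preprocess an empty file with `-v` and scrape the search directories from the two marker lines GCC and Clang print around them.

```cmake
# Probe the C++ compiler for its system include search path.
file(WRITE "${CMAKE_BINARY_DIR}/include_probe.cpp" "")
execute_process(
  COMMAND "${CMAKE_CXX_COMPILER}" -E -x c++ -v "${CMAKE_BINARY_DIR}/include_probe.cpp"
  OUTPUT_QUIET
  ERROR_VARIABLE _verbose_log)

# The directories are listed between these two marker lines, one per line.
string(REGEX MATCH "#include <\\.\\.\\.> search starts here:\n(.*)\nEnd of search list"
       _ "${_verbose_log}")
string(STRIP "${CMAKE_MATCH_1}" _dirs)
# Collapse the indented lines into a CMake list of directories.
string(REGEX REPLACE "[ \t]*\n[ \t]*" ";" CXX_SYSTEM_INCLUDE_DIRS "${_dirs}")
```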

https://github.com/cwfitzgerald/bsf/commit/eb197186317d88cbdb74274f63808a9a5689a134

Not sure if this is needed on Mac too, as I don't have access to one, but it should work the same.

@BearishSun (Member) commented Jun 1, 2018

Looks like a good solution. It would be nice to be able to just ignore inttypes.h by name, but oh well.

I'd imagine it's also needed on Mac. When you make a PR I'll test it out. Will it break in an obvious way if it's not working properly? One thing that will probably need to change is the check for Clang, since on Mac it's identified as AppleClang.

@cwfitzgerald (Contributor, Author) commented Jun 1, 2018

It will definitely break in a very obvious way if it doesn't work: #include-ing "inttypes.h" will simply stop working, leaving everything it declares undefined. I'm going to enable it on Mac as well; even if Mac doesn't suffer from the bug, re-parsing a tiny header like inttypes.h does no harm.

A related comment about build systems: if the Ninja generator is in use, you really should add -fdiagnostics-color=always. Ninja buffers each compiler's output internally, which makes Clang and GCC think they are being redirected into a file, so they stop producing color. I will fold this change into the PR as well, as it's so minor that a separate PR would be silly.
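The suggestion could look like the following guarded snippet; `-fdiagnostics-color` is the GCC spelling and `-fcolor-diagnostics` the Clang one, both real flags:

```cmake
# Force colored diagnostics when Ninja buffers compiler output.
if(CMAKE_GENERATOR STREQUAL "Ninja")
  if(CMAKE_CXX_COMPILER_ID MATCHES "Clang")       # also matches AppleClang
    add_compile_options(-fcolor-diagnostics)
  elseif(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
    add_compile_options(-fdiagnostics-color=always)
  endif()
endif()
```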

Adding on to that, I think you should run Travis using Ninja. It takes advantage of the parallelism Travis gives you (two cores) while still keeping errors in order thanks to the aforementioned buffering. I also don't think there are any errors that would show up under the Ninja generator but not under the Makefile generator; the Makefile generator is more mature and supports more, so anything that builds with Ninja should build with Makefiles.

Not-so-ninja edit: pushed changes fixing the remaining issues with the PCH. Windows will have to wait until I'm at my house tonight, as I know there are some problems there, but for now Linux/Mac should be good.

@cwfitzgerald (Contributor, Author) commented Jun 1, 2018

Alright, unity builds are finished as well. This is where things get interesting: times did improve, but the size of the improvement changes drastically depending on which linker you use, since there is a long wait on the compilation and linking of libbsf.so. I have only included the Clang times for now, as I don't want to spend too much time benchmarking. Plain ld seems to... really shit the bed.

Build Type   Regular   PCH (ld)   PCH (lld)   Unity (ld)   Unity (gold)   Unity (lld)
Clang-Debug  316 sec   182 sec    176 sec     166 sec      103 sec        100 sec
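For reproducing the linker columns above: with Clang or GCC an alternative linker can be requested via `-fuse-ld` (gold or lld must be installed). `BSF_USE_LLD` below is a hypothetical option name, not an existing bsf flag:

```cmake
# Optionally link with lld instead of the default BFD ld.
option(BSF_USE_LLD "Link with LLVM lld instead of the default ld" OFF)
if(BSF_USE_LLD)
  set(CMAKE_EXE_LINKER_FLAGS    "${CMAKE_EXE_LINKER_FLAGS} -fuse-ld=lld")
  set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -fuse-ld=lld")
endif()
```

`-fuse-ld=gold` would select GNU gold instead.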
@jonesmz (Contributor) commented Jun 1, 2018

Somewhat off topic, but I have to point out that I consider 300-second build times lightning fast :-P. My day job involves 20+ hour builds.

That being said, wow 33%+ decreases on the low end, 66%+ on the high end? That's fantastic.

@BearishSun (Member) commented Jun 1, 2018

Really nice stuff indeed, great work.

Thanks for the Travis suggestion; I'll test it out when I get a chance. I know I ran into out-of-memory errors when I tried running parallel make.

@cwfitzgerald (Contributor, Author) commented Jun 2, 2018

Alright, I hopped onto Windows this weekend at home and fixed most of the PCH problems there (a couple of warnings to go). TL;DR: Windows.h defines far and near, and that messes EVERYTHING up. I had to make my own header, included in the PCH, that undefs far and near after the headers that need them have been processed. I have pushed the changes made so far. Timing is debug-only, because timing on Windows has to be done manually:

Before    PCH       Ratio
175 sec   123 sec   1.42x

I have also run into memory errors, but as long as you keep the job count low (3 on Travis) you should be good. I can also submit a PR with the Travis changes if you wish.

@cwfitzgerald (Contributor, Author) commented Jun 3, 2018

Alright, with all the Windows and Linux issues sorted out, I enabled Travis on the examples repository to see if Mac would work as smoothly... and unfortunately... this happened. I don't have a Mac at all, so I have no way to solve this. I am going to make a PR with the current version so that @BearishSun can push to the branch.

@cwfitzgerald cwfitzgerald changed the title Cotire and Precompiled Headers Broken PCH's on Mac and Broken Unity Builds on Windows and Mac Jun 7, 2018
@BearishSun BearishSun added this to the Longterm milestone Jun 9, 2018