Update on 6-29-97:

Over the past year, people have occasionally asked me to raise one of the internal limits. Usually this was for SuperHeros or KQP, although I recall a couple of others too. Each time I increased the limit in question and mailed off a new copy of the source, but I didn't change the version number, since it was a minor change for one person. However, Aleksi Suhonen (Axu) couldn't get 1.0 to compile PainKeep, while my latest copy could. So, at his prompting, I've changed the version to 1.01, added the newer 1.06 QuakeC source, and added the QuakeWorld 1.64 source. I won't release a new version with each QW source update, so check the v164qw/qwrlnote.txt file to see how to find the latest version. Enjoy!

Update on 9-19-96:

Ok, a few more improvements and I'm calling it quits at last! I changed the pr_punctuation order so the more common hits are at the front, since it is another linear search. I also added a hash for the PR_Expression loop, which did a linear search for opcodes. A few other minor optimizations as well, like cutting 260K strlen() calls in PR_LexPunctuation() by precomputing all the lengths ahead of time. That's about it. On one Sun the compile time dropped from 56 seconds to 5 seconds. On another it went from 15 seconds to under 2 seconds. So I'd guess it is at least 7x and maybe 10x faster now. Enjoy. :)

Update on 9-18-96:

Ok, here is a hashing version. There is one issue: if I hash the constants directly, the progs.dat is still perfectly valid, just a tad larger and not identical to what qcc would make. I can correct for this at the cost of a bit of speed. I decided to compile fastqcc.exe to run as fast as possible, even if the progs.dat isn't identical. The defines are documented in the Makefile. I may work on a faster correction method, which would give identical output with a minimum of extra overhead. But probably not. It's fast enough now that it'd be overkill.
;)

All 'reordering' code, etc, is gone. It's a pure hash now, although I do move hits in the hash to the front of the collision list.

Update on 9-12-96:

Ok, I found the problem. The linear list of symbols has two special symbols that mark the end of the globals and the end of the fields. I can't reorder those or it messes things up. :) So I only reorder what comes after them. One bug I had was that I wasn't limiting reordering to after those globals... now that it is fixed, it is working great. I also installed gcc on my PC to work on it, so I have included a fastqcc.exe binary.

======================

This "fastqcc" version was inspired by Carmack's finger message saying that he had fixed a linear search and got a 4x speedup. I decided to give it a shot myself, and here it is. The key improvements are:

- Precomputed away 260K strlen() calls in PR_LexPunctuation.

- Ordered punctuation symbols by frequency so the linear search ends sooner.

- Hash table for the list of opcodes.

- Hash table for all constants and defined variables. This moved GetDef, the major CPU-burning function, from 50% of the CPU time to under 1%. It now uses practically no CPU time at all.

- Short-circuit strcmp(). Even after some improvements, there were 8.7M strcmp() calls going on in GetDef. I added a macro that tests the first byte for easy detection of non-identical strings. This was a big win for two reasons. First, many compares are 1 byte anyway. Second, in GetDef we expect strcmp() to fail often, so the 1-byte check can frequently abort before the full function call. This reduced strcmp() calls to under 1M. The macro was suggested by Brian Campbell.

- Inline PR_Check and PR_Expect. These were two very simple functions that were called often. I made them inline, and shaved a few more seconds off the runtime. If you don't have gcc you may not be able to do this, and you shouldn't use -DINLINE.

- Other minor improvements.
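
For readers who want to see the shape of the hash table and short-circuit strcmp() ideas from the list above, here is a rough sketch. It is not the actual fastqcc source: the names STR_EQ, def_t, HASH_SIZE, hash_name, def_insert and def_lookup are all made up for illustration, and the hash function is just a simple multiply-and-add that is assumed to be good enough for a demo. It shows a chained hash keyed by name, a first-byte test in front of strcmp(), and moving a hit to the front of its collision list.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical macro: compare the first byte inline and only pay for the
   strcmp() call when it matches. Nonzero means the strings are equal.
   Note that both arguments are evaluated more than once. */
#define STR_EQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)

#define HASH_SIZE 1024              /* hypothetical bucket count */

typedef struct def_s {
    const char   *name;             /* symbol name (not copied) */
    int           value;            /* stand-in for the real definition data */
    struct def_s *next;             /* next entry in the collision list */
} def_t;

def_t *buckets[HASH_SIZE];          /* zero-initialized: all lists start empty */

/* Simple multiply-and-add string hash, reduced to a bucket index. */
unsigned hash_name(const char *s)
{
    unsigned h = 0;
    while (*s)
        h = h * 31 + (unsigned char)*s++;
    return h % HASH_SIZE;
}

/* Add a new entry at the head of its collision list. */
def_t *def_insert(const char *name, int value)
{
    unsigned h = hash_name(name);
    def_t *d = malloc(sizeof *d);
    if (!d)
        return NULL;
    d->name = name;
    d->value = value;
    d->next = buckets[h];
    buckets[h] = d;
    return d;
}

/* Look a name up; on a hit, move the entry to the front of its list. */
def_t *def_lookup(const char *name)
{
    unsigned h = hash_name(name);
    def_t *prev = NULL;
    def_t *d;

    for (d = buckets[h]; d; prev = d, d = d->next) {
        if (STR_EQ(d->name, name)) {
            if (prev) {             /* not already at the front */
                prev->next = d->next;
                d->next = buckets[h];
                buckets[h] = d;
            }
            return d;
        }
    }
    return NULL;                    /* not found */
}
```

A caller would use def_insert() once per new definition and def_lookup() wherever a name has to be resolved. Most failed compares in a collision list stop at the first-byte test and never reach strcmp(), which is the effect described in the short-circuit item above.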
These two things are no longer used at all, because I've pulled out the major linear searches:

- Add to front of list instead of the end.

- Move hits to the front of the list.

That's about it. Good luck using this, and I hope you find the faster compilation to be nicer!

-Jonathan Roy (roy@atlantic.net)