A good compiler with bad defaults
01 Mar 2017

Visual C++ 6 is a venerable compiler, but when I first saw it I was shocked at how badly it dealt with what should be fairly trivial cases. Take hello world, dynamically linked against the CRT, comparing Visual C++ 5 and Visual C++ 6:

    C:\TEMP>type hw.c
    #include <windows.h>
    #include <stdio.h>

    int main(int argc, char * argv[])
    {
        printf("Hello world from C version %i\n", _MSC_VER);
        return 0;
    }

    C:\TEMP>cl /MD hw.c
    Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 11.00.7022 for 80x86
    Copyright (C) Microsoft Corp 1984-1997. All rights reserved.

    hw.c
    Microsoft (R) 32-Bit Incremental Linker Version 5.00.7022
    Copyright (C) Microsoft Corp 1992-1997. All rights reserved.

    /out:hw.exe
    hw.obj

    C:\TEMP>hw.exe
    Hello world from C version 1100

    C:\TEMP>sdir -cw40 hw*|more
    ------------+------------+-------------
    hw.c    158b|hw.exe 3072b|hw.obj  473b
    ------------+------------+-------------
    3 files, 0 dirs, 3703b used, 4094m vol size, 2044m vol free

Visual C++ 5 produces a 3Kb hello world. Now for Visual C++ 6:

    C:\TEMP>cl /MD hw.c
    Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 12.00.8168 for 80x86
    Copyright (C) Microsoft Corp 1984-1998. All rights reserved.

    hw.c
    Microsoft (R) Incremental Linker Version 6.00.8168
    Copyright (C) Microsoft Corp 1992-1998. All rights reserved.

    /out:hw.exe
    hw.obj

    C:\TEMP>hw.exe
    Hello world from C version 1200

    C:\TEMP>sdir -cw40 hw*|more
    ------------+------------+-------------
    hw.c    158b|hw.exe 16.0k|hw.obj  533b
    ------------+------------+-------------
    3 files, 0 dirs, 16.6k used, 4094m vol size, 2044m vol free

The Visual C++ 6 executable is 16Kb - more than five times larger. For what is otherwise a minor upgrade, that seems pretty serious. How did it go so badly wrong?
The answer lies in the layout of the executable file itself. Below is the output of "link /dump /headers" on the two executables. For ease of comparison I'm using the tools from Visual C++ 5 for both, with the program generated by Visual C++ 5 (hw5.exe) shown first and the one generated by Visual C++ 6 (hw6.exe) shown second:

    Microsoft (R) COFF Binary File Dumper Version 5.00.7022
    Copyright (C) Microsoft Corp 1992-1997. All rights reserved.

    Dump of file hw5.exe

    PE signature found

    File Type: EXECUTABLE IMAGE

    FILE HEADER VALUES
         14C machine (i386)
           4 number of sections
    58A01292 time date stamp Sat Feb 11 23:45:22 2017
           0 file pointer to symbol table
           0 number of symbols
          E0 size of optional header
         10F characteristics
               Relocations stripped
               Executable
               Line numbers stripped
               Symbols stripped
               32 bit word machine

    OPTIONAL HEADER VALUES
         10B magic #
        5.00 linker version
         200 size of code
         600 size of initialized data
           0 size of uninitialized data
        1020 address of entry point
        1000 base of code
        2000 base of data
             ----- new -----
      400000 image base
        1000 section alignment
         200 file alignment
           3 subsystem (Windows CUI)
        4.00 operating system version
        0.00 image version
        4.00 subsystem version
        5000 size of image
         400 size of headers
           0 checksum
      100000 size of stack reserve
        1000 size of stack commit
      100000 size of heap reserve
        1000 size of heap commit
           0 [       0] address [size] of Export Directory
        4000 [      28] address [size] of Import Directory
           0 [       0] address [size] of Resource Directory
           0 [       0] address [size] of Exception Directory
           0 [       0] address [size] of Security Directory
           0 [       0] address [size] of Base Relocation Directory
           0 [       0] address [size] of Debug Directory
           0 [       0] address [size] of Description Directory
           0 [       0] address [size] of Special Directory
           0 [       0] address [size] of Thread Storage Directory
           0 [       0] address [size] of Load Configuration Directory
           0 [       0] address [size] of Bound Import Directory
        4064 [      3C] address [size] of Import Address Table Directory
           0 [       0] address [size] of Reserved Directory
           0 [       0] address [size] of Reserved Directory
           0 [       0] address [size] of Reserved Directory

    SECTION HEADER #1
       .text name
         1CC virtual size
        1000 virtual address
         200 size of raw data
         400 file pointer to raw data
           0 file pointer to relocation table
           0 file pointer to line numbers
           0 number of relocations
           0 number of line numbers
    60000020 flags
             Code
             (no align specified)
             Execute Read

    SECTION HEADER #2
      .rdata name
           C virtual size
        2000 virtual address
         200 size of raw data
         600 file pointer to raw data
           0 file pointer to relocation table
           0 file pointer to line numbers
           0 number of relocations
           0 number of line numbers
    40000040 flags
             Initialized Data
             (no align specified)
             Read Only

    SECTION HEADER #3
       .data name
          5C virtual size
        3000 virtual address
         200 size of raw data
         800 file pointer to raw data
           0 file pointer to relocation table
           0 file pointer to line numbers
           0 number of relocations
           0 number of line numbers
    C0000040 flags
             Initialized Data
             (no align specified)
             Read Write

    SECTION HEADER #4
      .idata name
         176 virtual size
        4000 virtual address
         200 size of raw data
         A00 file pointer to raw data
           0 file pointer to relocation table
           0 file pointer to line numbers
           0 number of relocations
           0 number of line numbers
    C0000040 flags
             Initialized Data
             (no align specified)
             Read Write

      Summary

            1000 .data
            1000 .idata
            1000 .rdata
            1000 .text


    Microsoft (R) COFF Binary File Dumper Version 5.00.7022
    Copyright (C) Microsoft Corp 1992-1997. All rights reserved.

    Dump of file hw6.exe

    PE signature found

    File Type: EXECUTABLE IMAGE

    FILE HEADER VALUES
         14C machine (i386)
           3 number of sections
    58A01351 time date stamp Sat Feb 11 23:48:33 2017
           0 file pointer to symbol table
           0 number of symbols
          E0 size of optional header
         10F characteristics
               Relocations stripped
               Executable
               Line numbers stripped
               Symbols stripped
               32 bit word machine

    OPTIONAL HEADER VALUES
         10B magic #
        6.00 linker version
        1000 size of code
        2000 size of initialized data
           0 size of uninitialized data
        101A address of entry point
        1000 base of code
        2000 base of data
             ----- new -----
      400000 image base
        1000 section alignment
        1000 file alignment
           3 subsystem (Windows CUI)
        4.00 operating system version
        0.00 image version
        4.00 subsystem version
        4000 size of image
        1000 size of headers
           0 checksum
      100000 size of stack reserve
        1000 size of stack commit
      100000 size of heap reserve
        1000 size of heap commit
           0 [       0] address [size] of Export Directory
        204C [      28] address [size] of Import Directory
           0 [       0] address [size] of Resource Directory
           0 [       0] address [size] of Exception Directory
           0 [       0] address [size] of Security Directory
           0 [       0] address [size] of Base Relocation Directory
           0 [       0] address [size] of Debug Directory
           0 [       0] address [size] of Description Directory
           0 [       0] address [size] of Special Directory
           0 [       0] address [size] of Thread Storage Directory
           0 [       0] address [size] of Load Configuration Directory
           0 [       0] address [size] of Bound Import Directory
        2000 [      3C] address [size] of Import Address Table Directory
           0 [       0] address [size] of Reserved Directory
           0 [       0] address [size] of Reserved Directory
           0 [       0] address [size] of Reserved Directory

    SECTION HEADER #1
       .text name
         15C virtual size
        1000 virtual address
        1000 size of raw data
        1000 file pointer to raw data
           0 file pointer to relocation table
           0 file pointer to line numbers
           0 number of relocations
           0 number of line numbers
    60000020 flags
             Code
             (no align specified)
             Execute Read

    SECTION HEADER #2
      .rdata name
         186 virtual size
        2000 virtual address
        1000 size of raw data
        2000 file pointer to raw data
           0 file pointer to relocation table
           0 file pointer to line numbers
           0 number of relocations
           0 number of line numbers
    40000040 flags
             Initialized Data
             (no align specified)
             Read Only

    SECTION HEADER #3
       .data name
          5C virtual size
        3000 virtual address
        1000 size of raw data
        3000 file pointer to raw data
           0 file pointer to relocation table
           0 file pointer to line numbers
           0 number of relocations
           0 number of line numbers
    C0000040 flags
             Initialized Data
             (no align specified)
             Read Write

      Summary

            1000 .data
            1000 .rdata
            1000 .text

What this output shows is that Visual C++ 5 generated four sections of 0x200 bytes (512 bytes) each, so 2Kb of sections, plus 0x400 bytes (1Kb) of headers, for a 3Kb executable. Visual C++ 6 generated three sections of 0x1000 bytes (4Kb) plus an extra 0x1000 bytes for headers, resulting in a 16Kb executable. The virtual size values are small and similar between the two - the difference is that by default Visual C++ 6 aligns all sections within the file on a 4Kb boundary.
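
The two alignment values that drive this difference can also be read straight out of the file without a dumper. The following is a minimal sketch, assuming the whole executable has already been loaded into a byte buffer. The helper names pe_alignments and read_le32 are made up for illustration and are not part of any SDK. The offsets used (e_lfanew at 0x3C in the DOS header, then SectionAlignment and FileAlignment at offsets 32 and 36 of the optional header) follow the published PE layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Extract "section alignment" and "file alignment" from a PE image that
 * is held in memory. Returns 0 on success, -1 if the buffer does not look
 * like a PE file. pe_alignments is a hypothetical helper name.
 */
int pe_alignments(const unsigned char *image, size_t len,
                  uint32_t *section_align, uint32_t *file_align)
{
    uint32_t pe_off;
    size_t opt_off;

    /* DOS header: "MZ" magic, with the offset of the PE header at 0x3C. */
    if (len < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return -1;
    pe_off = read_le32(image + 0x3C);

    /* Need the 4-byte signature, the 20-byte file header, and the first
       40 bytes of the optional header to reach the file alignment field. */
    if (pe_off > len || len - pe_off < 4 + 20 + 40)
        return -1;
    if (image[pe_off] != 'P' || image[pe_off + 1] != 'E' ||
        image[pe_off + 2] != 0 || image[pe_off + 3] != 0)
        return -1;

    /* The optional header starts right after signature and file header. */
    opt_off = (size_t)pe_off + 4 + 20;
    *section_align = read_le32(image + opt_off + 32);
    *file_align = read_le32(image + opt_off + 36);
    return 0;
}
```

Run against the two executables above, this should report a section alignment of 0x1000 for both, with a file alignment of 0x200 for hw5.exe and 0x1000 for hw6.exe, matching the dumper output.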
This behavior is optional and can be turned off with the "/OPT:NOWIN98" linker switch. With that switch specified, the result is three 0x200 byte sections plus 0x400 bytes of headers, resulting in a 2.5Kb executable - 512 bytes smaller than the Visual C++ 5 version, and less than 1/6th the size produced by default. The user just needs to make the logical leap that the solution to large executable file sizes is related to Windows 98 optimization in order to discover this switch.
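
The size arithmetic for all three cases can be checked with a few lines of C. This is only a sketch: align_up and estimate_file_size are names invented here, and the model assumes each section's raw data is simply its virtual size rounded up to the file alignment. That holds for the sections in these dumps but is not true in general (for example, when a section contains uninitialized data).

```c
#include <stddef.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment (a power of two). */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/*
 * Estimate the on-disk size of an executable: the headers plus every
 * section, each rounded up to the file alignment.
 * Assumption: raw size == virtual size rounded up (see the note above).
 */
uint32_t estimate_file_size(uint32_t header_bytes,
                            const uint32_t *virtual_sizes, size_t count,
                            uint32_t file_alignment)
{
    uint32_t total = align_up(header_bytes, file_alignment);
    size_t i;

    for (i = 0; i < count; i++)
        total += align_up(virtual_sizes[i], file_alignment);
    return total;
}
```

With the virtual sizes from the dumps, estimate_file_size(0x400, {0x1CC, 0xC, 0x5C, 0x176}, 4, 0x200) gives 3072 bytes for Visual C++ 5, estimate_file_size(0x1000, {0x15C, 0x186, 0x5C}, 3, 0x1000) gives 16384 bytes for the Visual C++ 6 default, and the same three sections with 0x400 bytes of headers and 0x200 alignment give 2560 bytes for the /OPT:NOWIN98 build.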
File alignment matters in Windows executables because of how they are mapped into memory. Each section needs to be laid out on its own page, so when the program is run the result will be three or four 4Kb pages, regardless of how compact the file is on disk. This happens because each section has slightly different page permissions - in the Visual C++ 6 case, one page is executable, one is read-only, and one is readable and writable - and the only way to enforce these permissions is at the page level. So even when the executable is only 2.5Kb, reading it may take more than one 4Kb IO, and laying it out correctly may take more work if the disk representation doesn't match the memory representation.
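
To make the permissions point concrete, the "flags" values in the section headers above can be decoded by hand. The sketch below uses the three memory-protection bits from the PE format (named IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ and IMAGE_SCN_MEM_WRITE in the Windows headers). They are redefined locally under shorter names so the example does not need windows.h, and the function names are invented for illustration.

```c
#include <stdint.h>

/* Local copies of the PE section protection bits. */
#define SCN_MEM_EXECUTE 0x20000000u
#define SCN_MEM_READ    0x40000000u
#define SCN_MEM_WRITE   0x80000000u

/* Render a section's flags as a three-letter "rwx" style string. */
void section_protection(uint32_t flags, char out[4])
{
    out[0] = (flags & SCN_MEM_READ)    ? 'r' : '-';
    out[1] = (flags & SCN_MEM_WRITE)   ? 'w' : '-';
    out[2] = (flags & SCN_MEM_EXECUTE) ? 'x' : '-';
    out[3] = '\0';
}

/* Number of whole pages a section occupies once mapped into memory. */
uint32_t pages_needed(uint32_t virtual_size, uint32_t page_size)
{
    return (virtual_size + page_size - 1) / page_size;
}
```

For the Visual C++ 6 executable, 60000020 (.text) decodes to "r-x", 40000040 (.rdata) to "r--", and C0000040 (.data) to "rw-". Each needs its own protection, and each virtual size (0x15C, 0x186, 0x5C) fits in one 4Kb page, which is where the three pages come from.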
What I don't know (and don't think I ever will know) is why Windows 98 was special. The costs referred to above exist for any platform that can properly execute Windows executables. Why would Windows 98 have costs that Windows 95 did not, or that Windows NT did not? If those costs are substantial, the choice from the Visual C++ team makes sense - the two products shipped at a similar time. But optimizing for best Windows 98 performance was arguably not the correct thing to do within a few years, and Visual C++ 6 lasted much longer than that.
The postscript to this is that Visual C++ 2005 reverted to the same behavior as Visual C++ 5 by default, with 512 byte file alignment.