\\?\ is what you want, despite ugliness
17 Mar 2017
One of the things I've heard a lot over the years is a comment like, "we want to support paths greater than MAX_PATH, but \\?\ seems to change a lot of stuff and we don't know if we really want all that." It doesn't help that the "stuff" isn't particularly well documented, which makes these conversations more about fear than information.
Firstly, let's consider what the escape actually does:
- Suppress relative path evaluation.
- Suppress truncation of trailing periods or spaces.
- Suppress detection of special device names, including CON, PRN, AUX, NUL, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9.
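To make the last two items concrete, here is a rough Python model of what the default (unescaped) handling does to a single file name component. This is only a sketch: the function name and the "<device ...>" marker are made up for illustration, and the real Win32 rules cover more cases than this (for example, device names followed by an extension on older Windows versions).

```python
# Simplified model of default (unescaped) Win32 handling of one path
# component. Illustrative only; the real rules have more cases.

# Device names taken from the list above.
DEVICE_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def default_win32_name(name: str) -> str:
    """Return what an unescaped component is assumed to resolve to."""
    if name in (".", ".."):
        return name                      # relative markers are not modelled here
    trimmed = name.rstrip(". ")          # trailing periods and spaces are dropped
    if trimmed.upper() in DEVICE_NAMES:  # device names match case-insensitively
        return "<device " + trimmed.upper() + ">"
    return trimmed


if __name__ == "__main__":
    for name in ["file", "file.", "notes ", "con", "Com1"]:
        print(repr(name), "->", repr(default_win32_name(name)))
```

With the escape, none of this rewriting happens and each name is passed through exactly as written.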
Depending on your application, relative path evaluation may be useful and even important. The default behavior of the others is almost certainly not what you want. Let's consider some cases.
Case 1: You're writing cross-platform software
If your program can run on a non-Windows platform and that platform is capable of supporting trailing spaces or periods, or names like "con", then your users are likely in for a world of hurt when they try to go between platforms with your software unless you use this escape. Consider a tool that compresses a series of files into an archive - on platform A, users can create an archive with "file" and "file." in the archive, but when it's extracted on Windows without the escape, both names resolve to "file", so one of these files is retained and the other is overwritten. Is that really what the user wanted?
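As a rough illustration of the archive problem, the following Python sketch groups member names that an unescaped extraction would write to the same file. The function name is hypothetical, and it models only the trailing period and space truncation on flat names, not device names or directories.

```python
from collections import defaultdict


def find_extraction_collisions(member_names):
    """Group archive member names that an unescaped extraction would merge.

    Assumption: only truncation of trailing periods and spaces is modelled.
    """
    groups = defaultdict(list)
    for name in member_names:
        key = name.rstrip(". ")   # what the default handling would open
        groups[key].append(name)
    # Keep only the targets that more than one member maps to.
    return {key: names for key, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    members = ["file", "file.", "readme.txt", "notes "]
    for target, names in find_extraction_collisions(members).items():
        print(target, "<-", names)
```

A tool could use a check like this to warn before extracting, but using the escape avoids the collision in the first place.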
In many ways, the biggest flaw with the escape in this scenario is that it doesn't go far enough and still preserves things like case insensitivity.
Case 2: You're writing software that operates on a set of files
Take something like sdir which enumerates files. If a directory contains "file", "file." or even "con", is the intention really to enumerate one file twice and then query the console for metadata? What happens if the directory only contains "file." - is the intention to report metadata for the nonexistent "file" instead? The user asked to enumerate files that exist in the directory, whatever those files are.
This case to me seems like the most typical case in software development. Want to back up files? Want to scan them? Want to burn them to CD, compress them, encrypt them, make them available over a network, share them over a web server? You probably want to operate on the set of files that actually exist, not try to munge the names of something that exists into something that doesn't.
Case 3: You're writing interactive user software
Perhaps the strongest case that exists for not using the escape is for software where the user has entered one file name. Maybe they really meant to refer to a printer? Maybe the trailing space was a mistake? But when held up to scrutiny, it's kind of hard to defend the default behavior.
Try this simple question: do you actually support sending the data to the special device the user chose?
If the answer is "clearly not", then why not let the user do what they requested in the first place? Let's face it, it's hard to even define what it means for your UI application to output to a console, which makes it hard to believe anyone could really be depending on that behavior.
I struggle to think of any software, written today, that desires the default Win32 behavior. Decades ago I'm sure it made sense for command line software to open "prn" or "con" or "com1". I'm sure it made sense to truncate trailing periods or spaces when working with pre-Windows 95 file systems. But those days ended eons ago.
What I've been glossing over here is relative path translation. For a lot of software, this isn't really needed either, or isn't needed as much as it first appears. If software works on a set of files, often the translation is needed once, but all files discovered afterwards really should just be appended to an absolute path and used verbatim. For many situations, calling GetFullPathName to perform this translation, then escaping the result, works fairly well.
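A minimal Python sketch of the "translate once, escape, then append verbatim" pattern might look like the following. It assumes the input is already a full path of the kind GetFullPathName returns. The function names are hypothetical, and the \\?\UNC\ form for network paths is a detail not covered above.

```python
import ntpath  # Windows path rules, importable on any platform

PREFIX = "\\\\?\\"            # the four characters \\?\
UNC_PREFIX = "\\\\?\\UNC\\"   # escaped form used for \\server\share paths
DEVICE_PREFIX = "\\\\.\\"     # \\.\ device paths are left alone in this sketch


def escape_full_path(full_path: str) -> str:
    """Add the escape to a path that is already fully qualified.

    Assumes the relative-to-absolute translation has already happened,
    because the escape suppresses relative path evaluation.
    """
    if full_path.startswith(PREFIX) or full_path.startswith(DEVICE_PREFIX):
        return full_path                    # already escaped, or a device path
    if not ntpath.isabs(full_path):
        raise ValueError("expected a full path, got: " + repr(full_path))
    if full_path.startswith("\\\\"):
        return UNC_PREFIX + full_path[2:]   # \\server\share -> \\?\UNC\server\share
    return PREFIX + full_path


def child_path(escaped_dir: str, name: str) -> str:
    """Append an enumerated name verbatim, with no further translation."""
    return escaped_dir.rstrip("\\") + "\\" + name


if __name__ == "__main__":
    root = escape_full_path("C:\\data\\backup")
    for name in ["file", "file.", "con"]:
        print(child_path(root, name))
```

The point of child_path is that names discovered during enumeration are never reinterpreted: whatever the directory reports is what gets opened.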