I just published a simple crate that performs lexical path cleaning: lexiclean.
Lexical path cleaning simplifies paths by removing `.` and `..` components and collapsing doubled separators (`//`), without querying the filesystem. It is inspired by Go's `Clean` function, and differs from the Go version by not removing `.` if that is the only path component left.
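To illustrate the idea, here is a minimal sketch of lexical cleaning using only the standard library. `lexically_clean` is a hypothetical helper written for this post, not the crate's actual code (the crate exposes this as a `.lexiclean()` method instead), and it ignores Windows path prefixes:

```rust
use std::path::{Component, Path, PathBuf};

// Hypothetical helper sketching lexical path cleaning; not the crate's code.
fn lexically_clean(path: &Path) -> PathBuf {
    let mut out: Vec<Component> = Vec::new();
    for component in path.components() {
        match component {
            // `components()` already collapses `//` and drops interior `.`;
            // this arm only sees a leading `./`, which is dropped as well.
            Component::CurDir => {}
            // `..` pops the preceding normal component if there is one;
            // otherwise (e.g. a leading `../`) it is kept.
            Component::ParentDir => {
                if matches!(out.last(), Some(Component::Normal(_))) {
                    out.pop();
                } else {
                    out.push(component);
                }
            }
            _ => out.push(component),
        }
    }
    if out.is_empty() {
        // Keep `.` when nothing is left, e.g. for `some-file/..`.
        PathBuf::from(".")
    } else {
        out.iter().map(|c| c.as_os_str()).collect()
    }
}

fn main() {
    assert_eq!(lexically_clean(Path::new("foo/./bar//baz")), PathBuf::from("foo/bar/baz"));
    assert_eq!(lexically_clean(Path::new("foo/../bar")), PathBuf::from("bar"));
    assert_eq!(lexically_clean(Path::new("some-file/..")), PathBuf::from("."));
    assert_eq!(lexically_clean(Path::new("../foo")), PathBuf::from("../foo"));
    println!("all assertions passed");
}
```

Note that no system calls happen anywhere in this sketch, which is why this style of cleaning can never fail.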
I implemented this for a command line utility I'm working on, but split it off so others could use it.
There are a few reasons I prefer lexical path cleaning to `fs::canonicalize`:
- If the input is a relative path, the output will be a relative path. This means that if the input is a path the user typed, and the output path is displayed in an error message, the message is more likely to make sense to the user, since it will more obviously relate to the input path.
- It simplifies `some-file/..` to `.` without any fuss, even if `some-file` is not a directory.
- It never returns an error, because it makes no system calls.
There are some reasons you might prefer `fs::canonicalize`:

- `fs::canonicalize` respects symlinks.
- `fs::canonicalize` always returns an absolute path. (Although you can do `std::env::current_dir().join(path).lexiclean()` if you want an absolute path.)
Are there any other reasons to prefer one over the other? I'd love to hear them!
It is very lightly tested! If you intend to use it, I encourage you to submit additional tests containing paths you might encounter, if you think the existing tests don't cover them. In particular, I haven't thought through all the exotic prefixes that Windows paths might be adorned with, so there may be bugs there.
I don't expect to modify the crate or add features to it beyond what I need for my own purposes, so if there are additional features you want, please consider opening a PR! Of course, if you find a bug, I will happily fix it.