One of the things that I personally struggled with when learning Rust was how to organize large programs with multiple modules.
In this post, I'll explain how I organize the codebase of
just, a command runner that I wrote.
just was the first large program I wrote in Rust, and its organization has
gone through many iterations, as I discovered what worked for me and what
didn't. There are some things that could use improvement, and many of the choices I made are somewhat strange, so definitely don't consider the whole project a normative example of how to write Rust.
Users mostly interact with just by running the main binary from the command
line. However, the crate actually consists of an executable target, in
src/main.rs, and a library target in src/lib.rs.
The main function in main.rs is a thin wrapper that calls the run function
in src/run.rs.
just is split into an executable target and a library target because there
is a fuzz tester in
fuzz, and a regression
testing framework at janus, and both of these
use testing functions exposed by the library.
I prefer to keep my module tree flat, so you'll notice that all my source files are directly under src. I find that this makes it easy to remember what's where, since I don't need to remember what subdirectory each source file is in.
I use a fuzzy file searcher, fzf, to switch
between files, so having a ton of files in my
src directory doesn't bother
me. If I used a tree-based file viewer in my editor, I might prefer to group
modules into directories by topic.
Common Use Statements
I prefer to group all my
use statements together in a single file called common.rs.
Then, at the top of every other file, I include them all with use crate::common::*;.
I think this is somewhat uncommon, and most people prefer to put use
statements at the top of every file, with just those things used in that
file.
Both approaches are totally reasonable. I find that grouping use statements
into a single file saves a lot of duplication at the top of every file, and
makes it easy to start a new file, since you can just write
use crate::common::*; and have everything you need in scope.
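The whole arrangement fits in a single-file sketch; the module and item names here are illustrative stand-ins for the real src/lexer.rs and src/common.rs:

```rust
// Each module exposes one primary definition named after the module…
mod lexer {
    pub(crate) struct Lexer;

    impl Lexer {
        pub(crate) fn name(&self) -> &'static str {
            "Lexer"
        }
    }
}

// …and common.rs re-exports it, so one glob import brings everything
// into scope.
mod common {
    pub(crate) use crate::lexer::Lexer;
}

// Every other file in the crate begins with this single line.
use crate::common::*;

fn main() {
    let lexer = Lexer;
    println!("{} is in scope via common", lexer.name());
}
```

New files then need only the one glob import, at the cost of keeping every name in common.rs unique.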
This does require that I pick unique names for everything that I want to put in
common.rs, but I haven't found that to be particularly burdensome.
Submodule Names and Contents
Most modules contain a single definition, either a function, trait, struct, or
enum, that is used in the rest of the codebase. These modules are all named
after the public definition, and that definition is exported in common.rs,
so use crate::common::*; will bring that definition into scope, without needing to
qualify it with the module name.
As an example, just's lexer is called Lexer and lives in src/lexer.rs. In
common.rs, it is exported with pub(crate) use lexer::Lexer;.
Since modules are always named after their sole export, it's pretty easy to
figure out where something comes from just from its name. Since Lexer is a
type from just, and not a dependency, it's probably defined in src/lexer.rs.
A few modules, like src/keyword.rs, contain more than one thing. For modules like that,
common.rs just exports
the module itself, with
pub(crate) use crate::keyword;, and the module name
is used when referring to the definitions inside it.
If a name is from a dependency, then you can see where it comes from in common.rs.
Errors

For testing purposes, I like to use error enums, instead of Box<dyn Error> or
equivalent. I find that this makes it easy to write tests that look for
specific errors. Also,
just has detailed error messages, and this lets me
separate error message formatting from error value generation.
There are two main kinds of errors, CompilationError and RuntimeError.
As you can guess,
CompilationError is for problems related to compilation,
e.g. lexing, parsing, and analyzing a justfile, and
RuntimeError is for
problems that occur when running a justfile, e.g. I/O errors and command failures.
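A minimal sketch of the two error families shows why enums make tests precise: a test can match an exact error value rather than a string. The variants here are made up for illustration, not just's actual variants:

```rust
use std::fmt;

// Errors produced while compiling a justfile.
#[derive(Debug, PartialEq)]
#[allow(dead_code)]
enum CompilationError {
    UnexpectedToken { line: usize },
    UnknownRecipe { name: String },
}

// Errors produced while running a justfile.
#[derive(Debug, PartialEq)]
#[allow(dead_code)]
enum RuntimeError {
    Io { message: String },
    CommandFailed { code: i32 },
}

impl fmt::Display for CompilationError {
    // Formatting lives apart from the code that constructs the values,
    // keeping error generation separate from error messages.
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            CompilationError::UnexpectedToken { line } => {
                write!(f, "Unexpected token on line {line}")
            }
            CompilationError::UnknownRecipe { name } => {
                write!(f, "Unknown recipe `{name}`")
            }
        }
    }
}

// Stand-in for the real compiler: reject one particular input.
fn compile(src: &str) -> Result<(), CompilationError> {
    if src == "!" {
        return Err(CompilationError::UnexpectedToken { line: 1 });
    }
    Ok(())
}

fn main() {
    // A test can assert on the exact error value.
    assert_eq!(
        compile("!"),
        Err(CompilationError::UnexpectedToken { line: 1 })
    );
    println!("got the exact error we expected");
}
```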
I currently don't use any of the many error handling helper crates, but in
other projects I use snafu, and if I
were to rewrite
just, I would definitely use it.
Clippy

I use clippy, the animated-paperclip-themed Rust linter, to automatically check the codebase for issues.
Clippy has many lints that are either pedantic, or which restrict things that are totally reasonable in many contexts. I like a lot of these, but I didn't want to go through all of them and decide which lints to enable, so I turned on all the lints, even the annoying ones, and then just disable lints that I don't like as I encounter them.
You can see this at the top of src/lib.rs.
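The attributes involved look something like this sketch; the lint groups are real clippy groups, but the specific allows are examples rather than just's actual list:

```rust
// Crate-level attributes like these go at the very top of src/lib.rs.
// `clippy::all` and `clippy::pedantic` turn whole lint groups on;
// individual lints are then disabled as they prove annoying.
#![warn(clippy::all, clippy::pedantic)]
#![allow(clippy::missing_errors_doc, clippy::too_many_lines)]

fn main() {
    // Under plain rustc these attributes are inert; `cargo clippy`
    // acts on them.
    println!("lint configuration compiled");
}
```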
How does it work?
just does a lot of stuff! This makes it hard to give a concise overview of
how everything works, but I'll do my best!
The run function is pretty short, so definitely check it out. It's in
src/run.rs. It does
some setup, like initializing Windows terminal color support and logging, then
parses the command line arguments.
just calls the parsed command line arguments a
Config. The command line arguments
are parsed with the venerable clap, and then
stored in a
Config struct, which is passed around the rest of the program.
Everything related to setting up the clap parser, and parsing the command line arguments is in src/config.rs.
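To illustrate the shape of this, here is a hand-rolled sketch; the real Config is built with clap and has many more fields, and the field names here are invented:

```rust
// An illustrative Config: which flags were passed, `name=value`
// overrides, and the remaining positional arguments.
#[derive(Debug, PartialEq)]
struct Config {
    verbose: bool,
    overrides: Vec<(String, String)>, // `name=value` pairs
    arguments: Vec<String>,           // recipe names and their arguments
}

// A toy parser standing in for the clap setup in src/config.rs.
fn parse_config(args: &[&str]) -> Config {
    let mut config = Config {
        verbose: false,
        overrides: Vec::new(),
        arguments: Vec::new(),
    };
    for arg in args {
        if *arg == "--verbose" {
            config.verbose = true;
        } else if let Some((name, value)) = arg.split_once('=') {
            config.overrides.push((name.to_string(), value.to_string()));
        } else {
            config.arguments.push(arg.to_string());
        }
    }
    config
}

fn main() {
    // The parsed Config is then passed around the rest of the program.
    let config = parse_config(&["--verbose", "editor=vim", "build"]);
    assert!(config.verbose);
    println!("{config:?}");
}
```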
just has a few distinct modes it can run in, e.g. actually running a recipe
in a justfile, listing the recipes in a justfile, or evaluating the variables
in a justfile. These are called subcommands, and you can see the different
subcommands in the
Subcommand enum in src/subcommand.rs.
Once a config is parsed, the program dispatches to the correct subcommand
and executes it.
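The dispatch might be sketched like this, with an abridged, illustrative Subcommand enum; each real arm does actual work rather than returning a label:

```rust
// An abridged sketch of the modes just can run in.
enum Subcommand {
    Run { arguments: Vec<String> },
    List,
    Evaluate,
}

impl Subcommand {
    // Each variant maps to one mode of the program.
    fn run(&self) -> &'static str {
        match self {
            Subcommand::Run { .. } => "running recipes",
            Subcommand::List => "listing recipes",
            Subcommand::Evaluate => "evaluating variables",
        }
    }
}

fn main() {
    let subcommand = Subcommand::Run { arguments: vec!["build".into()] };
    println!("{}", subcommand.run());
}
```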
For the rest of this post, I'll cover
Subcommand::Run, the subcommand responsible for actually running a justfile.
The justfile source is read and the compiler is invoked. The Compiler is
defined in src/compiler.rs, and has a single short method that calls the
lexer, the parser, and the analyzer.
There's no particular reason for having a
Compiler struct, since it doesn't
have any fields, so it's really just for organization. I would be totally fine
with having a module at src/compile.rs, and just exporting a single compile
function from that module.
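A sketch of that shape, with stubbed-out phases standing in for the real lexer, parser, and analyzer:

```rust
#[derive(Debug, PartialEq)]
struct Error(String);

// A fieldless struct whose only job is organization.
struct Compiler;

impl Compiler {
    // One short method that chains the phases together.
    fn compile(src: &str) -> Result<String, Error> {
        let tokens = Self::lex(src)?;
        let module = Self::parse(&tokens)?;
        Self::analyze(&module)
    }

    // Toy phase: split on whitespace instead of real lexing.
    fn lex(src: &str) -> Result<Vec<String>, Error> {
        Ok(src.split_whitespace().map(String::from).collect())
    }

    // Toy phase: rejoin the tokens instead of building an AST.
    fn parse(tokens: &[String]) -> Result<String, Error> {
        Ok(tokens.join(" "))
    }

    // Toy phase: pass the "module" through unchanged.
    fn analyze(module: &str) -> Result<String, Error> {
        Ok(module.to_string())
    }
}

fn main() {
    assert_eq!(Compiler::compile("build: test"), Ok("build: test".to_string()));
    println!("compiled");
}
```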
The first step of compilation is to split the source text into tokens, which is
done by the Lexer. The
lexer looks a lot like a recursive descent parser. It has a bunch of different
methods, and those methods call each other to produce the different tokens.
The Lexer is relatively well-commented, so please take a look if you're
interested.
The lexer produces a stream of Tokens. The Token type is defined in
src/token.rs. A Token contains a TokenKind, defined in src/token_kind.rs. A
Token contains a reference to the source program, as well as information
about the offset, length, line, and column of the token. A TokenKind tells
you what kind of token it actually is.
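A simplified sketch of the two types; the real TokenKind has many more variants, and the field set here is paraphrased from the description above:

```rust
// A few illustrative token kinds.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TokenKind {
    Identifier,
    Colon,
    Eol,
}

// The token borrows the source text rather than copying it.
#[derive(Debug, PartialEq)]
struct Token<'src> {
    src: &'src str, // the full source program
    offset: usize,  // byte offset of the token
    length: usize,  // byte length of the token
    line: usize,    // line of the token
    column: usize,  // column of the token
    kind: TokenKind,
}

impl<'src> Token<'src> {
    // The token's text is just a slice of the original source.
    fn lexeme(&self) -> &'src str {
        &self.src[self.offset..self.offset + self.length]
    }
}

fn main() {
    let src = "build:";
    let token = Token {
        src,
        offset: 0,
        length: 5,
        line: 0,
        column: 0,
        kind: TokenKind::Identifier,
    };
    assert_eq!(token.lexeme(), "build");
    println!("{}", token.lexeme());
}
```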
Tokens produced by the
Lexer are passed to a
Parser, defined in src/parser.rs.
The parser is a recursive descent parser that walks over the tokens, figuring out what kind of construct it's parsing as it goes.
The output of the parser is a
Module, defined in src/module.rs. A
Module represents a successful parse, but has not been fully validated. Just
does a lot of static analysis, like resolving names, inter-recipe dependencies,
and inter-variable dependencies, so not every
Module is valid.
You can think of a Module as being like an AST that still hasn't
been statically analyzed for correctness. Inside a Module are the different
source constructs, like recipes and variable assignments.
The next phase of compilation is analysis, performed by the Analyzer. The
Analyzer makes sure that all references to recipes and variables can be
resolved, and that there are no circular dependencies.
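One of those checks, circular dependency detection, can be sketched as a depth-first walk over a made-up dependency map; the real Analyzer does much more than this:

```rust
use std::collections::HashMap;

// Returns a description of a cycle in the dependency graph, if any.
fn find_cycle<'a>(dependencies: &HashMap<&'a str, Vec<&'a str>>) -> Option<String> {
    // Depth-first walk; a name already on the stack means a cycle.
    fn visit<'a>(
        name: &'a str,
        dependencies: &HashMap<&'a str, Vec<&'a str>>,
        stack: &mut Vec<&'a str>,
    ) -> Option<String> {
        if stack.contains(&name) {
            return Some(format!("circular dependency involving `{name}`"));
        }
        stack.push(name);
        for dependency in dependencies.get(name).into_iter().flatten().copied() {
            if let Some(cycle) = visit(dependency, dependencies, stack) {
                return Some(cycle);
            }
        }
        stack.pop();
        None
    }

    for name in dependencies.keys().copied() {
        if let Some(cycle) = visit(name, dependencies, &mut Vec::new()) {
            return Some(cycle);
        }
    }
    None
}

fn main() {
    let mut dependencies = HashMap::new();
    dependencies.insert("build", vec!["test"]);
    dependencies.insert("test", vec!["build"]); // cycle!
    assert!(find_cycle(&dependencies).is_some());
    println!("cycle detected");
}
```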
The output of the
Analyzer is a
Justfile, defined in src/justfile.rs. A
Justfile represents a parsed and analyzed justfile. It contains all the
recipes, variables, and expressions, all resolved and ready to run. It is the
totus porcus, as it were.
A justfile is run with
Justfile::run, which takes
information about where the justfile is and where the working directory is,
variable overrides passed on the command line, and a list of arguments.
The arguments are parsed into recipes and arguments to those recipes, and
finally those recipes are run with
Justfile::run_recipe, which actually
executes each recipe along with all of its dependencies.
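The dependency-first order can be sketched like this; real recipes run shell commands, while this toy version just records the order:

```rust
use std::collections::HashMap;

// A toy justfile: recipe names mapped to their dependencies.
struct Justfile<'src> {
    recipes: HashMap<&'src str, Vec<&'src str>>,
}

impl<'src> Justfile<'src> {
    // Dependencies run before the recipe itself, and nothing runs twice.
    fn run_recipe(&self, name: &'src str, ran: &mut Vec<&'src str>) {
        for dependency in self.recipes.get(name).into_iter().flatten().copied() {
            if !ran.contains(&dependency) {
                self.run_recipe(dependency, ran);
            }
        }
        ran.push(name);
    }
}

fn main() {
    let mut recipes = HashMap::new();
    recipes.insert("build", vec!["fmt", "test"]);
    recipes.insert("fmt", vec![]);
    recipes.insert("test", vec!["fmt"]);
    let justfile = Justfile { recipes };

    let mut ran = Vec::new();
    justfile.run_recipe("build", &mut ran);
    // Dependencies come first, and `fmt` runs only once.
    assert_eq!(ran, vec!["fmt", "test", "build"]);
    println!("{ran:?}");
}
```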
Testing

just takes files, parses commands out of those files, and then runs them. I
am acutely aware of how this might go wrong, and would feel real bad if
just somehow got confused and ran a command that nuked someone's hard drive.
Because of this, I go pretty crazy with testing. There are four kinds of tests: unit tests, integration tests, fuzz testing, and ecosystem-wide regression testing.
Unit tests are spread around the codebase, in submodules named tests. Each
tests submodule contains tests for whatever's in the containing module.
I'm not strict about covering everything with unit tests, but I do cover everything with integration tests. If something doesn't seem to be tested in a unit test, it's probably tested in an integration test.
Integration tests are in the
tests subdirectory, roughly
organized by topic. The vast majority test a full run of the
just binary, supplying standard input, args, and
a justfile, and checking that standard output, standard error, and the exit
code are correct.
Fuzz testing was contributed to just by
@RadicalZephyr, and is located in the
fuzz directory. It generates
random strings and feeds them to the parser. (NOT THE RUNNER, DEFINITELY NOT
THE RUNNER.) If the parser succeeds or returns an error, that's a successful
run. If the fuzzer is able to trigger a panic, then it's found a bug that needs
to be fixed.
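A dependency-free sketch of the idea: feed pseudo-random strings to a stand-in parser, treating Ok and Err as success and only a panic as a bug. Both the parser and the random generator here are toys, not just's actual fuzz setup:

```rust
use std::panic;

// A toy stand-in for the parser under test: it must return Ok or Err,
// never panic. (This one panics on NUL bytes so the "fuzzer" below has
// a bug to find.)
fn parse(input: &str) -> Result<(), String> {
    if input.contains('\u{0}') {
        panic!("lexer invariant violated");
    }
    if input.is_empty() {
        return Err("empty input".to_string());
    }
    Ok(())
}

// A tiny xorshift generator so the sketch needs no external crates.
fn pseudo_random_string(seed: &mut u64) -> String {
    (0..8)
        .map(|_| {
            *seed ^= *seed << 13;
            *seed ^= *seed >> 7;
            *seed ^= *seed << 17;
            char::from((*seed % 128) as u8)
        })
        .collect()
}

fn main() {
    // Silence the default panic message so the loop stays quiet.
    panic::set_hook(Box::new(|_| {}));

    let mut seed = 0x2545F4914F6CDD1D;
    let mut panics = 0;
    for _ in 0..1000 {
        let input = pseudo_random_string(&mut seed);
        // Ok and Err are both successful runs; only a panic is a bug.
        if panic::catch_unwind(|| parse(&input)).is_err() {
            panics += 1;
        }
    }
    println!("found {panics} panicking inputs");
}
```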
Since a lot of people have written
a lot of justfiles,
I want to make sure I don't break them when I update just.
To do this, I wrote a tool called janus. Janus is inspired by Rust's crater.
Janus downloads all the justfiles that it can find on GitHub, and then compares
how two versions of
just compile those justfiles. The two versions of just
are usually the latest release, and a new version with a big, scary change.
Janus compiles all justfiles with both versions, and then compares the result.
Ideally, every valid justfile parses into the same
Justfile with both
versions, and every justfile with an error produces the same error.
That's everything I can think of! Ultimately, much of how you organize your Rust programs comes down to personal preference, so just start mashing the keyboard, see what works, and iterate on whatever doesn't.