Now let’s add a new kind of build step that does code generation. We’ll use a templating engine called Jinja2.
Bazel is not opinionated about which tools you use, and you don’t need to write any custom rules or macros to call an existing tool through its CLI. In fact, it’s quite easy.
Fetch the jinja2 tool
We’ll add it to the requirements/tools.in file, which is where we keep dependencies that are only needed at build time.
echo "jinja2-cli" >> requirements/tools.in
Repin the transitive dependencies again: ./tools/repin
Tell Bazel how to run it
Open the BUILD file in the tools folder. You’ll see there’s already a py_console_script_binary rule, which we used to run the copier tool. We’ll make a nearly identical one for jinja2. Note that rules_python converts the hyphen in the package name to an underscore here!
We also add a visibility attribute, since by default Bazel restricts which packages may depend on a target, which keeps the dependency graph from getting tangled.
py_console_script_binary(
    name = "jinja2",
    pkg = "@pip//jinja2_cli",
    visibility = ["//visibility:public"],
)
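To make the hyphen-to-underscore conversion concrete, here is a minimal Python sketch of how a requirement name might map to a label. The helper name bazel_pip_label is hypothetical, and the regular expression only approximates the normalization rules_python applies; treat it as an illustration rather than the library's actual code.

```python
import re


def bazel_pip_label(requirement_name, hub="pip"):
    """Approximate the label for a pip requirement (hypothetical helper).

    Assumption: runs of "-", "_" and "." collapse to a single underscore
    and the name is lowercased, as in "jinja2-cli" -> "jinja2_cli".
    """
    normalized = re.sub(r"[-_.]+", "_", requirement_name).lower()
    return f"@{hub}//{normalized}"


print(bazel_pip_label("jinja2-cli"))  # @pip//jinja2_cli
```

If a label fails to resolve, checking the normalized name against the entry in the requirements file is a reasonable first step.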
Now check that you can run it:
% bazel run tools:jinja2
...
INFO: Build completed successfully, 5 total actions
INFO: Running command line: bazel-bin/tools/jinja2
Usage: rules_python_entry_point_jinja2.py [options] <input template> <input data>

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  --format=FORMAT       format of input variables: auto, env, ini, json,
                        querystring
  -e EXTENSIONS, --extension=EXTENSIONS
                        extra jinja2 extensions to load
  -D key=value          Define template variable in the form of key=value
  -s SECTION, --section=SECTION
                        Use only this section from the configuration
  --strict              Disallow undefined variables to be used within the
                        template
  -o FILE, --outfile=FILE
                        File to use for output. Default is stdout.
This shows that we can run the tool under Bazel. The help output for the CLI will also be critical for our next step, which is to invoke it with the right arguments.
Declare the code generation
Our library already came with a jinja2 template file, mylib/header.tmpl.txt. You’ll see it has a {{TIMESTAMP}} placeholder. We want to fill this in.
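To illustrate what filling in the placeholder means, here is a toy substitution in plain Python. It is not Jinja2, only a regex stand-in for the simplest {{NAME}} case. The template text is an assumption inferred from the generated output shown later, and the render helper is hypothetical.

```python
import re


def render(template, variables, strict=True):
    """Replace {{NAME}} placeholders with values (toy stand-in, not Jinja2)."""

    def substitute(match):
        key = match.group(1)
        if key in variables:
            return str(variables[key])
        if strict:
            # Loosely mirrors the CLI's --strict flag: undefined variables are errors.
            raise KeyError(f"undefined template variable: {key}")
        return ""

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


# Assumed template content, based on the build output in this section.
template = "-- Built at {{TIMESTAMP}} --"
print(render(template, {"TIMESTAMP": "<unstamped>"}))  # -- Built at <unstamped> --
```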
To help us out, the library also has a run_binary rule in the BUILD file. This is a basic building block which takes a tool and some declared inputs, then runs the tool when needed to produce the declared outputs. You can see the args already match the command-line flags from the jinja2 help output above.
You’ll see the tool = "TODO" line for us to fill in with the jinja2 CLI we just set up. Replace that with the label //tools:jinja2.
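As a rough picture of how the flags from the help output combine into a single invocation, the sketch below assembles an argument list in plain Python. The helper name jinja2_args and the choice of -D to pass TIMESTAMP are assumptions for illustration; the args in the library's run_binary rule may differ.

```python
def jinja2_args(template, outfile, variables):
    """Build a jinja2-cli style argument list (hypothetical helper).

    Only flags that appear in the help output above are used:
    -D key=value for template variables and -o FILE for the output.
    """
    args = []
    for key, value in variables.items():
        args += ["-D", f"{key}={value}"]
    args += ["-o", outfile, template]
    return args


print(jinja2_args(
    "mylib/header.tmpl.txt",
    "mylib/header.txt",
    {"TIMESTAMP": "<unstamped>"},
))
```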
Now we can run a build to check the codegen:
bazel build mylib:header.txt; cat bazel-bin/mylib/header.txt
INFO: Analyzed target //mylib:header.txt (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //mylib:header.txt up-to-date:
bazel-bin/mylib/header.txt
INFO: Elapsed time: 0.146s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
-- Built at <unstamped> --
I'm a cow, I found this on the internet
It worked!
Why does the value say <unstamped> though? This is determinism at work: the output of a build step should depend only on its inputs. If the current timestamp appeared here, the output would be different on every build, causing cache misses for any action that depends on this one, even transitively. That would slow down our builds! We’ll see how to stamp artifacts for production later.
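The effect on caching can be sketched with a toy cache key in Python. This is a simplified model, not Bazel's real action digest: it assumes a key is a hash of the command line plus the content of every input. The downstream command and the timestamps are made up for illustration.

```python
import hashlib


def action_key(command, inputs):
    """Toy cache key: hash of the command line plus every input's content."""
    digest = hashlib.sha256()
    for arg in command:
        digest.update(arg.encode() + b"\0")
    for name in sorted(inputs):
        digest.update(name.encode() + b"\0" + inputs[name] + b"\0")
    return digest.hexdigest()


def render_header(timestamp):
    # Assumed header format, based on the build output above.
    return f"-- Built at {timestamp} --\n".encode()


# A hypothetical downstream action that consumes header.txt.
downstream_cmd = ["package", "mylib"]

unstamped_a = action_key(downstream_cmd, {"header.txt": render_header("<unstamped>")})
unstamped_b = action_key(downstream_cmd, {"header.txt": render_header("<unstamped>")})
stamped_a = action_key(downstream_cmd, {"header.txt": render_header("2024-01-01T00:00:00")})
stamped_b = action_key(downstream_cmd, {"header.txt": render_header("2024-01-01T00:00:01")})

print(unstamped_a == unstamped_b)  # True: same inputs, so the cached result can be reused
print(stamped_a == stamped_b)      # False: the timestamp changed, so the key changed
```

With the placeholder value, two builds produce the same key and the downstream action can be served from cache; with a real timestamp, every build would get a fresh key.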
Depend on the codegen
Finally, let’s use our new templated header file in the application. The mylib library includes a second function, moo_stamped, so let’s change our app/__main__.py to use that instead. When we try to run the application now, we get an error:
Traceback (most recent call last):
  File "execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/app/app_bin.runfiles/_main/app/__main__.py", line 6, in <module>
    say.moo_stamped(x.text)
  File "execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/app/app_bin.runfiles/_main/mylib/say.py", line 14, in moo_stamped
    with open(path.join(FOLDER, "header.txt"), "r") as header:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'mylib/header.txt'
This is just saying that header.txt is missing at runtime because we didn’t declare it as a dependency. And because the reference isn’t an import statement, BUILD file generation is not smart enough to know what to do, so we’ll edit the BUILD file by hand.
Since it’s a runtime dependency, we want to list it as a data dependency. How do we know where it belongs, though? The answer is the “locality principle”: a dependency should be declared next to where the reference appears. There should always be a symmetry between a reference in the code and the BUILD definition for the file that contains it. Since the stack trace points to say.py, the data dependency should go next to the srcs = ["say.py"] declaration:
py_library(
    name = "mylib",
    srcs = ["say.py"],
    data = ["header.txt"],
    visibility = ["//:__subpackages__"],
    deps = ["@pip//cowsay"],
)
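To see why declaring the file under data fixes the error, consider this toy model of a runfiles directory in Python. It assumes, as a simplification, that only declared files are placed next to the program at runtime. The stage_runfiles and read_header helpers are hypothetical, and read_header only mirrors the open call seen in the traceback.

```python
import os
import tempfile


def stage_runfiles(root, package, srcs, data):
    """Toy stand-in for a runfiles tree: only declared files are written."""
    pkg_dir = os.path.join(root, package)
    os.makedirs(pkg_dir, exist_ok=True)
    for name, content in {**srcs, **data}.items():
        with open(os.path.join(pkg_dir, name), "w") as f:
            f.write(content)
    return pkg_dir


def read_header(folder):
    # Mirrors the open(...) call from the traceback above.
    with open(os.path.join(folder, "header.txt"), "r") as header:
        return header.read()


# Without the data declaration, header.txt is never staged.
with tempfile.TemporaryDirectory() as root:
    folder = stage_runfiles(root, "mylib", {"say.py": "# library source\n"}, {})
    try:
        read_header(folder)
        missing = False
    except FileNotFoundError:
        missing = True

# With header.txt declared as data, the file is present at runtime.
with tempfile.TemporaryDirectory() as root:
    folder = stage_runfiles(
        root,
        "mylib",
        {"say.py": "# library source\n"},
        {"header.txt": "-- Built at <unstamped> --\n"},
    )
    text = read_header(folder)

print(missing)     # True
print(repr(text))  # '-- Built at <unstamped> --\n'
```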
Now we run the app and see the header appears: