10.1 Mill Build Pipelines
10.2 Mill Modules
10.3 Revisiting our Static Site Script
10.4 Conversion to a Mill Build Pipeline
10.5 Extending our Static Site Pipeline
import mill.*
def srcs = Task.Source("src")
def concat = Task{
  os.write(Task.dest / "concat.txt", os.list(srcs().path).map(os.read(_)))
  PathRef(Task.dest / "concat.txt")
}
10.1.scala
Snippet 10.1: the definition of a simple Mill build pipeline
Build pipelines are a common pattern, where you have files and assets you want to process but want to do so efficiently, incrementally, and in parallel. This usually means only re-processing files when they change, and re-using the already processed assets as much as possible. Whether you are compiling Scala, minifying Javascript, or compressing tarballs, many of these file-processing workflows can be slow. Parallelizing these workflows and avoiding unnecessary work can greatly speed up your development cycle.
This chapter will walk through how to use the Mill build tool to set up these build pipelines, and demonstrate the advantages of a build pipeline over a naive build script. We will take the simple static site generator we wrote in Chapter 9: Self-Contained Scala Scripts and convert it into an efficient build pipeline that can incrementally update the static site as you make changes to the sources. We will be using the Mill build tool in several of the projects later in the book, starting with Chapter 14: Simple Web and API Servers.
We will be using the Mill Build Tool to define our build pipelines. While Mill can be used to compile/test/package Scala code (which we will see in subsequent chapters), it can also be used as a general-purpose tool for efficiently and incrementally keeping static assets up to date.
In this chapter we will be managing the compilation of markdown files into an HTML static site, but any workflow that you would like to run once and re-use can benefit from Mill. Minifying Javascript files, pre-processing CSS, generating source code, preparing tar or zip archives for deployment: these are all workflows which are slow enough that you do not want to repeat them unnecessarily. Avoiding unnecessary re-processing is exactly what Mill helps you do.
To introduce the core concepts in Mill, we'll first look at a trivial Mill build that takes files in a source folder and concatenates them together:
build.mill
import mill.*

def srcs = Task.Source("src")
def concat = Task{
  os.write(Task.dest / "concat.txt", os.list(srcs().path).map(os.read(_)))
  PathRef(Task.dest / "concat.txt")
}
10.2.scala
You can read this snippet as follows:
- This build.mill file defines one set of sources, named srcs, and one downstream target named concat.
- We make use of srcs in the body of concat via the srcs() syntax, which tells Mill that concat depends on srcs and makes the value of srcs available to concat.
- Inside concat, we list the src/ folder, read all the files, and concatenate their contents into a single file concat.txt.
- The final concat.txt file is wrapped in a PathRef, which tells Mill that we care not just about the name of the file, but also its contents.
This results in the following simple dependency graph:
The Mill build tool is built around the concept of targets. Targets are the nodes within the build graph, and represent individual steps that make up your build pipeline.
Every target you define via the def foo = Task{...} syntax gives you the
following things for free:

- The target can be evaluated from the command line via ./mill foo
- The output of the Task{...} block is made printable via ./mill show foo
- The output of the Task{...} block is cached, and the block only re-evaluates when changes are detected in its inputs

In general, this helps automate a lot of the tedious book-keeping that is normally needed when writing incremental build scripts. Rather than spending your time writing command-line argument parsers or hand-rolling a caching and invalidation strategy, Mill handles all that for you and allows you to focus on the logical structure of your build.
Note that the concat.txt file is created within the concat target's
destination folder Task.dest. Every target has its own destination folder, named
after the fully-qualified path to the target (in this case, out/concat.dest/).
This means that we do not have to worry about the concat target's files being
accidentally over-written by other targets.
In general, any Mill target should only create and modify files within its own
Task.dest, to avoid collisions and interference with other targets. The contents
of Task.dest are deleted before each evaluation of the target (which only
happens when changes are detected), ensuring that the
target always starts each evaluation with a fresh destination folder and isn't
affected by the outcome of previous evaluations.
We can install Mill in the current folder via curl, and create a src/ folder
with some files inside:
$ REPO=https://repo1.maven.org/maven2/com/lihaoyi/mill-dist/1.0.6
$ curl -L "$REPO/mill-dist-1.0.6-mill.sh" -o mill
$ chmod +x mill
$ mkdir src
$ echo "hear me moo" > src/iamcow.txt
$ echo "hello world" > src/hello.txt
10.3.bash
We can now build the concat target, ask Mill to print the path to its
output file, and inspect its contents:
$ ./mill concat
$ ./mill show concat
"ref:fd0201e7:/Users/lihaoyi/test/out/concat.dest/concat.txt"
$ cat out/concat.dest/concat.txt
hear me moo
hello world
10.4.bash
Mill re-uses output files whenever possible: in this case, since the concat
target only depends on srcs, calling ./mill concat repeatedly returns the
already generated concat.txt file. However, if we change the contents of the
srcs by adding a new file to the folder, Mill automatically re-builds
concat.txt to take the new input into account:
$ echo "twice as much as you" > src/iweigh.txt
$ ./mill concat
$ cat out/concat.dest/concat.txt
hear me moo
twice as much as you
hello world
10.5.bash
While our build pipeline above only has one set of sources and one target, we
can also define more complex builds. For example, here is a build with
2 source folders (src/ and resources/) and 3 targets (concat, compress
and zipped):
build.mill
import mill.*
def srcs = Task.Source("src")
+def resources = Task.Source("resources")
def concat = Task{
  os.write(Task.dest / "concat.txt", os.list(srcs().path).map(os.read(_)))
  PathRef(Task.dest / "concat.txt")
}
+def compress = Task{
+  for p <- os.list(resources().path) do
+    val copied = Task.dest / p.relativeTo(resources().path)
+    os.copy(p, copied)
+    os.call(cmd = ("gzip", copied))
+  PathRef(Task.dest)
+}
+def zipped = Task{
+  val temp = Task.dest / "temp"
+  os.makeDir(temp)
+  os.copy(concat().path, temp / "concat.txt")
+  for p <- os.list(compress().path) do
+    os.copy(p, temp / p.relativeTo(compress().path))
+  os.call(cmd = ("zip", "-r", Task.dest / "out.zip", "."), cwd = temp)
+  PathRef(Task.dest / "out.zip")
+}
10.6.scala
In addition to concatenating files, we also gzip compress the contents of our
resources/ folder. We then take the concatenated sources and compressed
resources and zip them all up into a final out.zip file:
Given files in both src/ and resources/:
| src/ | resources/ |
|------|------------|
| iamcow.txt, hello.txt, iweigh.txt | foo.md, thing.py |
We can run ./mill zipped and see the expected concat.txt and *.gz files in
the output out.zip:
$ ./mill show zipped
"ref:a3771625:/Users/lihaoyi/test/out/zipped.dest/out.zip"
$ unzip -l out/zipped.dest/out.zip
Archive: out/zipped.dest/out.zip
Length Date Time Name
--------- ---------- ----- ----
35 11-30-2019 13:10 foo.md.gz
45 11-30-2019 13:10 concat.txt
40 11-30-2019 13:10 thing.py.gz
--------- -------
120 3 files
10.9.bash
As shown earlier, out.zip is re-used as long as none of the inputs (src/ and
resources/) change. However, because our pipeline has two branches, the
concat and compress targets are independent: concat is only re-generated
if the src/ folder changes:
And the compress target is only re-generated if the resources/ folder
changes:
While in these examples our Task{...} targets all returned PathRefs to files or
folders, you can also define targets that return any JSON-serializable data type
compatible with the uPickle library we went through in Chapter 8: JSON and Binary Data Serialization. Mill also supports a -j <n> flag to parallelize
independent targets over multiple threads, e.g. ./mill -j 2 zipped would spin
up 2 threads to work through the two branches of the target graph in parallel.
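Since targets can return more than just PathRefs, here is a small illustrative example (a hypothetical totalSize target, not part of the build above, but following the same pattern) of a target returning a plain number, which Mill caches and prints as JSON:

def totalSize = Task{
  // add up the byte sizes of all the files under src/; a Long is JSON-serializable
  os.list(srcs().path).map(os.size(_)).sum
}

Running ./mill show totalSize should print the summed size as a number, re-computing it only when the src/ folder changes.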
Mill also supports the concept of modules. You can use modules to define repetitive sets of build targets.
It is very common for certain sets of targets to be duplicated within your
build: perhaps for every folder of source files, you want to compile them, lint
them, package them, test them, and publish them. By defining a trait that
extends Module, you can apply the same set of targets to different folders on
disk, making it easy to manage the build for larger and more complex projects.
Here we are taking the set of srcs/resources and
concat/compress/zipped targets we defined earlier and wrapping them in a
trait FooModule so they can be re-used:
build.mill
import mill.*
+trait FooModule extends Module:
  def srcs = Task.Source("src")
  def resources = Task.Source("resources")
  def concat = Task{...}
  def compress = Task{...}
  def zipped = Task{...}
+
+object bar extends FooModule
+object qux extends FooModule
10.10.scala
object bar and object qux extend trait FooModule, and have source paths
(accessible via the inherited moduleDir property) of bar/ and qux/ respectively.
The srcs and resources definitions above thus point to the following folders:
bar/src/
bar/resources/
qux/src/
qux/resources/

You can ask Mill to list out the possible targets for you to build via
./mill resolve __ (that's two _s in a row):
$ ./mill resolve __
bar.compress
bar.concat
bar.resources
bar.srcs
bar.zipped
qux.compress
qux.concat
qux.resources
qux.srcs
qux.zipped
10.11.bash
Any of the targets above can be built from the command line, e.g. via
$ mkdir -p bar/src bar/resources
$ echo "Hello" > bar/src/hello.txt; echo "World" > bar/src/world.txt
$ ./mill show bar.zipped
"ref:efdf1f3c:/Users/lihaoyi/test/out/bar/zipped.dest/out.zip"
$ unzip out/bar/zipped.dest/out.zip
Archive: out/bar/zipped.dest/out.zip
extracting: concat.txt
$ cat concat.txt
Hello
World
10.12.bash
Modules can also be nested to form arbitrary hierarchies:
build.mill
//| mill-version: 1.0.6
import mill.*
trait FooModule extends Module:
  def srcs = Task.Source("src")
  def concat = Task{
    os.write(Task.dest / "concat.txt", os.list(srcs().path).map(os.read(_)))
    PathRef(Task.dest / "concat.txt")
  }
object bar extends FooModule:
  object inner1 extends FooModule
  object inner2 extends FooModule
object wrapper extends Module:
  object qux extends FooModule
10.13.scala
Here we have four FooModules: bar, bar.inner1, bar.inner2, and
wrapper.qux. This exposes the following source folders and targets:
| Source Folders | Targets |
|----------------|---------|
| bar/src/ | bar.concat |
| bar/inner1/src/ | bar.inner1.concat |
| bar/inner2/src/ | bar.inner2.concat |
| wrapper/qux/src/ | wrapper.qux.concat |
Note that wrapper itself is a Module but not a FooModule, and thus does
not itself define a wrapper/src/ source folder or a wrapper.concat target.
In general, every object in your module hierarchy needs to inherit from
Module, although you can inherit from a custom subtype of Module like FooModule if you
want them to have some common targets already defined.
The moduleDir made available within each module differs: while in the
top-level build pipelines we saw earlier moduleDir was always equal to
os.pwd, within a module the moduleDir reflects the module path, e.g.
the moduleDir of bar is bar/, the moduleDir of wrapper.qux
is wrapper/qux/, and so on.
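A quick way to see this for yourself is a target that simply reports its module's folder (a hypothetical whereAmI target, not part of the builds above):

trait FooModule extends Module:
  // e.g. prints a path ending in bar/ for `bar`, and wrapper/qux/ for `wrapper.qux`
  def whereAmI = Task{ moduleDir.toString }

./mill show bar.whereAmI and ./mill show wrapper.qux.whereAmI should then print the two different module folders.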
The last basic concept we will look at is cross modules. These are most useful when the number or layout of modules in your build isn't fixed, but can vary based on e.g. the files on the filesystem:
build.mill
//| mill-version: 1.0.6
import mill.*
import mill.api.BuildCtx, BuildCtx.workspaceRoot

val items = BuildCtx.watchValue{ os.list(workspaceRoot / "foo").map(_.last) }

object foo extends Cross[FooModule](items)
trait FooModule extends Cross.Module[String]:
  def moduleDir = super.moduleDir / crossValue
  def srcs = Task.Source("src")
  def concat = Task{
    os.write(Task.dest / "concat.txt", os.list(srcs().path).map(os.read(_)))
    PathRef(Task.dest / "concat.txt")
  }
10.14.scala
Here, we define a cross module foo that takes a set of items found by
listing the sub-folders in foo/. This set of items is dynamic, and can
change if the folders on disk change, without needing to update the build.mill
file for every change.
Note the BuildCtx.watchValue call; this is necessary to tell Mill to take note
in case the number or layout of modules within the foo/ folder changes.
Without it, we would need to restart the Mill process using ./mill shutdown to
pick up changes in how many entries the cross-module contains.
The Cross class that foo extends is a Mill builtin that automatically
generates a set of Mill modules corresponding to the items we passed in.
Typically, a cross module has the same moduleDir as an ordinary module, but in the example above
it is overridden to include the crossValue as part of the path, giving each module
a unique srcs directory.
As written, given a filesystem layout on the left, it results in the source
folders and concat targets on the right:
| sources | targets |
|---------|---------|
| foo/bar/src/ | foo[bar].concat |
| foo/qux/src/ | foo[qux].concat |
If we then add a new source folder via mkdir -p, Mill picks up the additional
module and concat target:
$ mkdir -p foo/thing/src
$ ./mill resolve __.concat
foo[bar].concat
foo[qux].concat
foo[thing].concat
10.17.bash
We have now gone through the basics of how to use Mill to define simple asset
pipelines to incrementally perform operations on a small set of files. Next, we
will return to the Blog.scala static site script we wrote in Chapter 9, and
see how we can use these techniques to make it incremental: to only re-build the
pages of the static site whose inputs changed since the last time they were
built.
While Blog.scala works fine in small cases, there is one big limitation: the
entire script runs every time. Even if you only change one blog post's .md
file, every file will need to be re-processed. This is wasteful, and can be slow
as the number of blog posts grows. On a large blog, re-processing every post can
take upwards of 20-30 seconds: a long time to wait every time you tweak some
wording!
It is possible to manually keep track of which .md file was converted into
which .html file, and thus avoid re-processing .md files unnecessarily.
However, this kind of book-keeping is tedious and easy to get wrong. Luckily,
this is the kind of book-keeping and incremental re-processing that Mill is good
at!
We will now walk through a step by step conversion of this Blog.scala script file
into a Mill build.mill. First, we must rename Blog.scala into build.mill to
convert it into a Mill build pipeline and add the import mill.* declaration:
Blog.scala -> build.mill
+import mill.*
import scalatags.Text.all.*
10.18.scala
Note that the dependencies on mainargs and os-lib are dropped, as these are automatically available in Mill.
Second, since we can rely on Mill invalidating and deleting stale files and
folders as they fall out of date, we no longer need the os.remove.all and
os.makeDir.all calls:
build.mill
-os.remove.all(os.pwd / "out")
-os.makeDir.all(os.pwd / "out/post")
10.19.scala
We will also remove the @main method wrapper and publishing code for now. Mill build pipelines
use a different syntax for taking command-line arguments than Scala files do,
and porting this functionality to our Mill build pipeline is
left as an exercise at the end of the chapter.
build.mill
-import mainargs.*
-ParserForMethods(this).runOrExit(args) // `args` is available at the top-level
-def main(targetGitRepo: String = "") = ...
-
-if targetGitRepo != "" then
-  os.call(cmd = ("git", "init"), cwd = os.pwd / "out")
-  os.call(cmd = ("git", "add", "-A"), cwd = os.pwd / "out")
-  os.call(cmd = ("git", "commit", "-am", "."), cwd = os.pwd / "out")
-  os.call(cmd = ("git", "push", targetGitRepo, "head", "-f"), cwd = os.pwd / "out")
10.20.scala
Third, we take the for-loop that we previously used to iterate over the
files in the postInfo list, and convert it into a cross module. That will
allow every blog post's .md file to be processed, invalidated, and
re-processed independently, only when the original .md file changes:
build.mill
-for (_, suffix, path) <- postInfo do
+object post extends Cross[PostModule](postInfo.map(_(0)))
+trait PostModule extends Cross.Module[String]:
+  def number = crossValue
+  val Some((_, suffix, markdownPath)) = postInfo.find(_(0) == number)
+  def path = Task.Source(markdownPath)
+  def render = Task{
     val parser = org.commonmark.parser.Parser.builder().build()
-    val document = parser.parse(os.read(path))
+    val document = parser.parse(os.read(path().path))
     val renderer = org.commonmark.renderer.html.HtmlRenderer.builder().build()
     val output = renderer.render(document)
     os.write(
-      os.pwd / "out/post" / mdNameToHtml(suffix),
+      Task.dest / mdNameToHtml(suffix),
       doctype("html")(...)
     )
+    PathRef(Task.dest / mdNameToHtml(suffix))
+  }
10.21.scala
Note how the items in the Cross[PostModule](...) declaration are the numbers corresponding
to each post in our postInfo list. For each item, we define a path source
pointing at the markdown file itself, as well as a def render target which returns a
PathRef to the generated HTML. In the conversion from a hardcoded script to a
Mill build pipeline, all the hardcoded references to os.pwd / "out" used for writing files
have been replaced by the Task.dest of each target.
Fourth, we wrap the generation of the index.html file into a target as well:
build.mill
+def links = Task.Input{ postInfo.map(_(1)) }
+
+def index = Task{
   os.write(
-    os.pwd / "out/index.html",
+    Task.dest / "index.html",
     doctype("html")(
       html(
         head(bootstrapCss),
         body(
           h1("Blog"),
-          for (_, suffix, _) <- postInfo
+          for suffix <- links()
           yield h2(a(href := ("post/" + mdNameToHtml(suffix)))(suffix))
         )
       )
     )
   )
+  PathRef(Task.dest / "index.html")
+}
10.22.scala
Note that we need to define a def links target that is a Task.Input: this tells
Mill that the contents of the postInfo.map expression may change (since it
depends on the files present on disk) and to make sure to re-evaluate it every
time to check for changes. Again, the hardcoded references to os.pwd / "out"
have been replaced by the Task.dest of the individual target.
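To make the Task vs Task.Input distinction concrete, consider this minimal sketch (hypothetical tasks, separate from the blog build): the Task.Input body re-runs on every Mill command, while the downstream target only re-evaluates when the value it reads actually changes:

def buildDate = Task.Input{
  // re-evaluated on every ./mill invocation
  java.time.LocalDate.now().toString
}
def banner = Task{
  // only re-runs when buildDate's value changes, i.e. at most once per day
  "site generated on " + buildDate()
}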
Lastly, we need to aggregate all our individual posts and the index.html file
into a single target, which we will call dist (short for "distribution"):
build.mill
+val posts = Task.sequence(postInfo.map(_(0)).map(post(_).render))
+
+def dist = Task{
+  for post <- posts() do
+    os.copy(post.path, Task.dest / "post" / post.path.last, createFolders = true)
+  os.copy(index().path, Task.dest / "index.html")
+
+  PathRef(Task.dest)
+}
10.23.scala
This is necessary because while previously we created the HTML files for the
individual posts and index "in place", now they are each created in separate
Task.dest folders assigned by Mill so they can be separately invalidated and
re-generated. Thus we need to copy them all into a single folder that we can
open locally in the browser or upload to a static site host.
Note that we need to use the helper method Task.sequence to turn the
Seq[Task[PathRef]] into a Task[Seq[PathRef]] for us to use in def dist.
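In signature terms, Task.sequence behaves roughly as follows (paraphrased for illustration; see the Mill documentation for the exact declaration):

// turns a sequence of tasks into a single task producing a sequence of results
def sequence[T](tasks: Seq[Task[T]]): Task[Seq[T]]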
We now have a static site pipeline with the following shape:
We can now take the same set of posts we used earlier, and build them into a
static website using ./mill. Note that the output is now in the
out/dist.dest/ folder, which is the Task.dest folder for the dist target.
$ find post -type f
post/1 - My First Post.md
post/3 - My Third Post.md
post/2 - My Second Post.md
$ ./mill show dist
"ref:b33a3c95:/Users/lihaoyi/Github/blog/out/dist.dest"
$ find out/dist.dest -type f
out/dist.dest/index.html
out/dist.dest/post/my-first-post.html
out/dist.dest/post/my-second-post.html
out/dist.dest/post/my-third-post.html
10.24.bash
We can then open the index.html in our browser to view the blog. Every time
you run ./mill dist, Mill will only re-process the blog posts that have
changed since you last ran it. You can also use ./mill --watch dist or ./mill -w dist to have Mill watch the filesystem and automatically re-process the
files every time they change.
Now that we've defined a simple pipeline, let's consider two extensions:

- Download the bootstrap.css file at build time and bundle it with the static site, to avoid a dependency on the third party hosting service
- Extract a preview of each blog post and include it on the home page
Bundling bootstrap is simple. We define a bootstrap target to download the file
and include it in our dist:
build.mill
-val bootstrapCss = link(
-  rel := "stylesheet",
-  href := "https://stackpath.bootstrapcdn.com/bootstrap/4.5.0/css/bootstrap.css"
-)
+def bootstrap = Task{
+  os.write(
+    Task.dest / "bootstrap.css",
+    requests.get(
+      "https://stackpath.bootstrapcdn.com/bootstrap/4.5.0/css/bootstrap.css"
+    )
+  )
+  PathRef(Task.dest / "bootstrap.css")
+}
10.25.scala
build.mill
 def dist = Task{
   for post <- posts() do
     os.copy(post.path, Task.dest / "post" / post.path.last, createFolders = true)
   os.copy(index().path, Task.dest / "index.html")
+  os.copy(bootstrap().path, Task.dest / "bootstrap.css")
   PathRef(Task.dest)
 }
10.26.scala
And then update our two bootstrapCss links to use a local URL:
build.mill
-head(bootstrapCss),
+head(link(rel := "stylesheet", href := "../bootstrap.css")),
10.27.scala
build.mill
-head(bootstrapCss),
+head(link(rel := "stylesheet", href := "bootstrap.css")),
10.28.scala
Now, when you run ./mill dist, you can see that the bootstrap.css file is
downloaded and bundled with your dist folder, and we can see in the browser
that we are now using a locally-bundled version of Bootstrap:
$ find out/dist.dest -type f
out/dist.dest/bootstrap.css
out/dist.dest/index.html
out/dist.dest/post/my-first-post.html
out/dist.dest/post/my-second-post.html
out/dist.dest/post/my-third-post.html
10.29.bash

Since it does not depend on any Task.Source, the bootstrap = Task{} target never
invalidates. This is usually what you want when depending on a stable URL like
bootstrap/4.5.0. If you are depending on something unstable that needs to be
regenerated every build, define it as a Task.Input{} task.
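For example, a hypothetical unstableAsset task (illustrative URL, not part of our build) could re-fetch an unversioned asset on every command:

def unstableAsset = Task.Input{
  // a hypothetical "latest" URL whose contents may change between builds
  requests.get("https://example.com/latest/styles.css").text()
}

Targets depending on unstableAsset() would still only re-run when the downloaded text actually differs from the previously cached value.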
We now have the following build pipeline, with the additional bootstrap step:
To render a paragraph preview of each blog post in the index.html page, the
first step is to generate such a preview for each PostModule. We will simply
take everything before the first empty line in the Markdown file, treat that as
the "first paragraph" of the post, and feed it through our Markdown parser:
build.mill
 trait PostModule extends Cross.Module[String]:
   def number = crossValue
   val Some((_, suffix, markdownPath)) = postInfo.find(_(0) == number)
   def path = Task.Source(markdownPath)
+  def preview = Task{
+    val parser = org.commonmark.parser.Parser.builder().build()
+    val firstPara = os.read.lines(path().path).takeWhile(_.nonEmpty)
+    val document = parser.parse(firstPara.mkString("\n"))
+    val renderer = org.commonmark.renderer.html.HtmlRenderer.builder().build()
+    val output = renderer.render(document)
+    output
+  }
   def render = Task{
10.30.scala
Here we are leaving the preview output as a String, rather than writing it to a
file and wrapping it in a PathRef.
Next, we need to aggregate the previews the same way we aggregated the
renders earlier:
build.mill
 def links = Task.Input{ postInfo.map(_(1)) }
+val previews = Task.sequence(postInfo.map(_(0)).map(post(_).preview))
 def index = Task{
10.31.scala
Lastly, in index, we zip the previews together with the links in order to
render them:
build.mill
-for suffix <- links()
-yield h2(a(href := ("post/" + mdNameToHtml(suffix)))(suffix))
+for (suffix, preview) <- links().zip(previews())
+yield frag(
+  h2(a(href := ("post/" + mdNameToHtml(suffix)))(suffix)),
+  raw(preview) // include markdown-generated HTML "raw" without HTML-escaping
+)
10.32.scala
Now we get pretty previews in index.html!

The build pipeline now looks like:
Note how we now have both post[n].preview and post[n].render targets, with
the preview targets being used in index to generate the home page and the
render targets only being used in the final dist. As we saw earlier, any
change to a file only results in that file's downstream targets being
re-generated. This saves time over naively re-generating the entire static site
from scratch. It should also be clear what value Mill
Modules (10.2) bring, in allowing repetitive sets of targets like
preview and render to be defined for all blog posts without boilerplate.
Here's the complete code, with the repetitive
org.commonmark.parser.Parser.builder() code extracted into a shared def renderMarkdown function, and the repetitive HTML rendering code extracted into
a shared def renderHtmlPage function:
build.mill
//| mill-version: 1.0.6
//| mvnDeps:
//| - com.lihaoyi::scalatags:0.13.1
//| - org.commonmark:commonmark:0.26.0
import mill.*
import mill.api.BuildCtx, BuildCtx.workspaceRoot
import scalatags.Text.all.*

def mdNameToHtml(name: String) = name.replace(" ", "-").toLowerCase + ".html"

val postInfo = BuildCtx.watchValue{
  os.list(workspaceRoot / "post")
    .map: p =>
      val s"$prefix - $suffix.md" = p.last
      (prefix, suffix, p)
    .sortBy(_(0).toInt)
}

def bootstrap = Task{
  os.write(
    Task.dest / "bootstrap.css",
    requests.get("https://stackpath.bootstrapcdn.com/bootstrap/4.5.0/css/bootstrap.css")
  )
  PathRef(Task.dest / "bootstrap.css")
}

def renderMarkdown(s: String) = {
  val parser = org.commonmark.parser.Parser.builder().build()
  val document = parser.parse(s)
  val renderer = org.commonmark.renderer.html.HtmlRenderer.builder().build()
  renderer.render(document)
}

def renderHtmlPage(dest: os.Path, bootstrapUrl: String, contents: Frag*) = {
  os.write(
    dest,
    doctype("html")(
      html(head(link(rel := "stylesheet", href := bootstrapUrl)), body(contents))
    )
  )
  PathRef(dest)
}

object post extends Cross[PostModule](postInfo.map(_(0)))
trait PostModule extends Cross.Module[String]:
  def number = crossValue
  val Some((_, suffix, markdownPath)) = postInfo.find(_(0) == number)
  def path = Task.Source(markdownPath)
  def preview = Task{
    renderMarkdown(os.read.lines(path().path).takeWhile(_.nonEmpty).mkString("\n"))
  }
  def render = Task{
    renderHtmlPage(
      Task.dest / mdNameToHtml(suffix),
      "../bootstrap.css",
      h1(a(href := "../index.html")("Blog"), " / ", suffix),
      raw(renderMarkdown(os.read(path().path)))
    )
  }

def links = Task.Input{ postInfo.map(_(1)) }
val posts = Task.sequence(postInfo.map(_(0)).map(post(_).render))
val previews = Task.sequence(postInfo.map(_(0)).map(post(_).preview))

def index = Task{
  renderHtmlPage(
    Task.dest / "index.html",
    "bootstrap.css",
    h1("Blog"),
    for (suffix, preview) <- links().zip(previews())
    yield frag(
      h2(a(href := ("post/" + mdNameToHtml(suffix)))(suffix)),
      raw(preview) // include markdown-generated HTML "raw" without HTML-escaping
    )
  )
}

def dist = Task{
  for post <- posts() do
    os.copy(post.path, Task.dest / "post" / post.path.last, createFolders = true)
  os.copy(index().path, Task.dest / "index.html")
  os.copy(bootstrap().path, Task.dest / "bootstrap.css")
  PathRef(Task.dest)
}
10.33.scala
In this chapter, we have learned how to define simple incremental build pipelines using Mill. We then took the script in Chapter 9: Self-Contained Scala Scripts and converted it into a Mill build pipeline. Unlike a naive script, this pipeline allows fast incremental updates whenever the underlying sources change, along with easy parallelization, all in less than 90 lines of code. We have also seen how to extend the Mill build pipeline, adding additional build steps to do things like bundling CSS files or showing post previews, all while preserving the efficient incremental nature of the build pipeline.
Mill is a general-purpose build tool and can be used to create general-purpose build pipelines for all sorts of data. In later chapters we will be using the Mill build tool to compile Java and Scala source code into executables. For a more thorough reference, you can browse the Mill online documentation:
This chapter marks the end of the second section of this book: Part II Local Development. You should hopefully be confident using the Scala programming language to perform general housekeeping tasks on a single machine, manipulating files, subprocesses, and structured data to accomplish your goals. The next section of this book, Part III Web Services, will explore using Scala in a networked, distributed world: where your fundamental tools are not files and folders, but HTTP APIs, servers, and databases.
Exercise: Mill tasks can also take command-line arguments, by defining def name(...) = Task.Command{...} methods. Similar to @main methods in Scala files, the
arguments to name are taken from the command line. Define a Task.Command in
our build.mill that allows the user to specify a remote git repository from
the command line, and uses os.call operations to push the static site to
that repository.
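One possible starting point, sketched under the assumption that you re-use the dist target from this chapter (the exact git workflow is yours to flesh out):

def push(targetGitRepo: String) = Task.Command{
  // copy the generated site into this command's own Task.dest before pushing
  os.copy(dist().path, Task.dest, mergeFolders = true)
  os.call(cmd = ("git", "init"), cwd = Task.dest)
  os.call(cmd = ("git", "add", "-A"), cwd = Task.dest)
  os.call(cmd = ("git", "commit", "-am", "."), cwd = Task.dest)
  os.call(cmd = ("git", "push", targetGitRepo, "HEAD", "-f"), cwd = Task.dest)
}

This could then be invoked from the command line as ./mill push <repo-url>.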
Exercise: You can use the Puppeteer Javascript library to convert HTML web pages into
PDFs, e.g. for printing or publishing as a book. Integrate Puppeteer into our
static blog, using the subprocess techniques we learned in Chapter 7: Files and Subprocesses, to add a ./mill pdfs target that creates a PDF version of
each of our blog posts.
Puppeteer can be installed via npm, and its docs can be found at:
The following script can be run via node, assuming you have the puppeteer
library installed via NPM, and takes a src HTML file path and dest output
PDF path as command line arguments to perform the conversion from HTML to PDF:
const puppeteer = require('puppeteer');
const [src, dest] = process.argv.slice(2)
puppeteer.launch().then(async function(browser){
const page = await browser.newPage();
await page.goto("file://" + src, {waitUntil: 'load'});
await page.pdf({path: dest, format: 'A4'});
process.exit(0)
})
10.34.javascript
See example 10.9 - PostPdf
Exercise: The Apache PDFBox library is a convenient way to manipulate PDFs from Java or
Scala code, and can easily be added as a dependency with --dep for use with Scala CLI REPL
or scripts via the coordinates org.apache.pdfbox:pdfbox:2.0.18. Add a new target to our
build pipeline that uses the class
org.apache.pdfbox.multipdf.PDFMergerUtility from PDFBox to concatenate the
PDFs for each individual blog post into one long multi-page PDF that contains
all of the blog posts one after another.
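A hedged sketch of the merging step, assuming a postPdfs aggregate (hypothetical name, built with Task.sequence from the per-post PDF targets of the previous exercise):

def mergedPdf = Task{
  import org.apache.pdfbox.multipdf.PDFMergerUtility
  import org.apache.pdfbox.io.MemoryUsageSetting
  val merger = new PDFMergerUtility()
  merger.setDestinationFileName((Task.dest / "all-posts.pdf").toString)
  // add each per-post PDF as a source, in order
  for pdf <- postPdfs() do merger.addSource(pdf.path.toIO)
  merger.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly())
  PathRef(Task.dest / "all-posts.pdf")
}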