Hugo is a great static site generator written in Go; I use it for this blog. Its advantages are that it’s very fast, easy to set up, and flexible, but its disadvantage is that it doesn’t have the mature community support that Jekyll has. One example: Hugo has no particular recommended route for managing a static asset pipeline. In this post, I’d like to explain how my personal pipeline works in the hope that it helps other Hugo users.


What is a pipeline?

But first, what is a “static asset pipeline”? Static assets are bits of code for a webpage that don’t change from one page to the next, such as CSS files, JavaScript files, webfonts, and icons. Often these files should be preprocessed somehow before being sent to a server: Sass files and ES6 files need to be converted to CSS and vanilla JS respectively, all files should be minified, text files should be gzipped, cachebusting hashes should be added, and so on.

“Cachebusting” is the name for giving an asset a different file name whenever its contents change. Let’s say you were to change the CSS on your site but leave the filename the same. A user who returns to your site might still have the old CSS in their browser’s cache and so see strange results due to a mismatch between the new HTML and the old CSS. By changing the filename every time you change an asset, you prevent this problem and gain the further benefit that you can tell browsers to cache the asset forever, since its contents will never change under the same name.

One way to do this is manually, with names like “site-v1.css,” “site-v2.css,” and so on. The disadvantage here is that you have to remember to do it and to change both the filename and the template. Another way is to use the date, which has the disadvantage that your users have to re-download every static file after every deploy even if nothing changed. The best practice for this problem is to use a hash of the file’s contents. With a hash, the same file input always produces the same hash output, and different file inputs are all but guaranteed to produce different outputs (barring an astronomically unlikely “collision”). For example, the MD5 hash of Hello World!\n is 8ddd8be4b179a529afa5f2ffae4b9858, but Hello, World!\n is bea8252ff4e80f41719ea13cdf007273. Even small changes in input produce radically different output.
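You can see this at the command line with md5sum from GNU coreutils (on macOS, the md5 command plays the same role):

```shell
# Two inputs that differ by a single comma hash to completely different values.
printf 'Hello World!\n' | md5sum
printf 'Hello, World!\n' | md5sum
```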

How my pipeline works

So, for my static site, I want several things to happen when I run the deploy command:

  • Clean out the results of prior builds
  • Recompile Sass files
  • Make hashes for static assets
  • Have Hugo build the site
  • Minify results from Hugo
  • Gzip results for Nginx
  • Rsync results to my server

To run all of these commands, I use Task, a simple Make clone written in Go that uses YAML files to describe tasks to run. I’ve put a gist on GitHub of my current Taskfile.yml and related files.
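As a sketch of how those steps fit together, a simplified, hypothetical Taskfile.yml might look something like this (the task names are illustrative, and each subtask would wrap the corresponding command discussed below; my real Taskfile is in the gist):

```yaml
version: '3'

tasks:
  deploy:
    cmds:
      - task: clean      # remove prior build output
      - task: sass       # compile Sass to CSS
      - task: hash       # run scattered to make hashed copies
      - hugo             # build the site itself
      - task: minify     # run minify over public/
      - task: gzip       # precompress for nginx's gzip_static
      - task: rsync      # upload public/ to the server
```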

Let’s talk about some of these steps in depth.

Recompile Sass files

My Sass isn’t so special, but I do use the Autoprefixer PostCSS plugin, so that I don’t have to manually write out vendor prefixes like -webkit- for older browsers.

sassc -m -t compressed src/scss/site.scss static/css/site.css
postcss --use autoprefixer -o static/css/site.css static/css/site.css

Making asset hashes

To make asset hashes I use a tool I wrote called scattered. The actual command is

scattered -output data/assets.json -basepath static '*.css' '*.png' '*.js'

That command tells scattered to copy any CSS, JS, or PNG file in the static directory to a new file with the hash of the asset included in its filename, and to write a file called assets.json into Hugo’s data directory. For example, as I’m writing this, my assets.json contains this bit of JSON (among its other lines):

{
    "css/site.css": "css/site.b3173f80fd05f4d6729d733d076f42b5.css"
}
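Conceptually, the hashed-copy step works something like this shell sketch (a simplified illustration of the idea, not the scattered tool itself; it handles only a single file and assumes md5sum is available):

```shell
# Make an illustrative CSS file, then copy it to a name embedding its MD5 hash.
mkdir -p static/css
echo 'body { color: #222; }' > static/css/site.css
hash=$(md5sum static/css/site.css | cut -d' ' -f1)
cp static/css/site.css "static/css/site.${hash}.css"
ls static/css
```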

Having Hugo build the site

Using Hugo’s data templates feature, I can refer to hashes listed in my assets.json in my templates. In a Hugo template, {{ index .Site.Data.assets "css/site.css" }} will be translated into the name of my latest hashed asset.

(To keep from using the hashes in local development, I actually use a partial template that you can see in my gist.)
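Such a partial might look something like this hypothetical sketch (not the partial from my gist; it assumes Hugo’s global site function, available in newer Hugo versions):

```html
{{/* layouts/partials/asset.html — a hypothetical sketch, not the partial from the gist */}}
{{/* During `hugo server`, emit the plain path; in production builds, the hashed name. */}}
{{ if site.IsServer }}{{ . }}{{ else }}{{ index site.Data.assets . }}{{ end }}
```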

Minify, Gzip, Rsync

For minification, I use minify from tdewolff, which can process HTML, CSS, JS, and XML (RSS). It strips out whitespace and other insignificant parts of a file so that users don’t have to download any fluff.

To speed up my webserver a little bit more, I use Nginx’s gzip_static module. It works by serving a precompressed .gz file that sits next to the regular file to browsers that support gzip compression. I create the gzip files with monterey jack, another tool that I’ve written in Go.
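If you’d rather not add another tool, a hypothetical equivalent using find and gzip looks like this (the -k flag, which keeps the originals in place, needs gzip 1.6 or later; the sample file here is only for illustration):

```shell
# Precompress text files under public/ so nginx can serve e.g. site.css.gz directly.
mkdir -p public/css
echo 'body { color: #222; }' > public/css/site.css
find public -type f \( -name '*.html' -o -name '*.css' -o -name '*.js' -o -name '*.xml' \) \
  -exec gzip -k -9 {} +
ls public/css
```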

Finally, I send the files to my server with rsync:

rsync --verbose --progress --stats --compress \
--rsh=/usr/bin/ssh --recursive --times --perms --links \
--delete-during --exclude "*~" --exclude ".*" \
public/ mysite:/sites/epro

Conclusion

So, that’s my pipeline. There’s room for improvement. I don’t use much JavaScript on my site, so I don’t have an ES2017 transpiler in the pipeline, but it would be relatively simple to extend it to add that. I’m hosting my webfonts through Google Fonts, at least at the moment, so that’s another simplification. If you have ideas for how to improve my pipeline or improve Hugo itself, let me know.