Cleaning unused assets from your static website

As websites and blogs continue to grow, so can the amount of user-uploaded content like images and videos — especially when your site has multiple team members and writers. Over time, it can be tricky to figure out what’s actually used and what’s just collecting dust.

We recently helped a customer clean up 5 gigabytes(!!) of unused assets, and you might be surprised how much digital clutter is lurking on your site too.

Let’s dive in and see how much data you can free from your Jekyll website (and speed up build time while you’re at it!).

TL;DR: You can now use the official Siteleaf Ruby gem to check for unused uploads by running siteleaf clean. Skip to the end of this post to learn more.

How to check for unused assets

While you can use the Siteleaf gem to do this for you, follow along if you are interested in seeing how we got there (or want to write your own script).

First, you’ll want to build your full site (including all drafts and unpublished documents) to get a picture of what assets are in use:

$ bundle exec jekyll build --unpublished --drafts --future

Now we can search the _site folder to check whether an asset (for example /uploads/screenshot.png) is linked or embedded anywhere in the generated HTML files:

$ grep -rl "/uploads/screenshot.png" _site

This command outputs a list of the files that reference the asset, whether it is embedded with an <img> tag or linked some other way. If nothing is printed, the asset is not referenced anywhere in the built site.

However, grep can be a little slow if you have a lot of content, so we can use The Silver Searcher (ag) to greatly speed things up.

To install on a Mac, run the following (see GitHub for other platforms):

$ brew install the_silver_searcher

Now you can search using:

$ ag -lQ "/uploads/screenshot.png" _site

Of course, you won’t want to search for each asset one by one, so we can write a simple shell script that loops through all the files in _uploads.
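Such a script might look something like the minimal sketch below. It assumes files in _uploads are published under /uploads/ (as in the screenshot.png example), and the function name find_unused_uploads and its arguments are made up for illustration, not part of any tool.

```shell
#!/usr/bin/env bash
# Sketch: print every file in an uploads folder that no file in the built site mentions.
# Assumption: _uploads/foo.png is served at /uploads/foo.png.

find_unused_uploads() {
  local uploads_dir="${1:-_uploads}"
  local site_dir="${2:-_site}"
  local file url

  # Drop a trailing slash so the prefix stripping below works either way.
  uploads_dir="${uploads_dir%/}"

  find "$uploads_dir" -type f | sort | while IFS= read -r file; do
    # Turn _uploads/foo.png into the public path /uploads/foo.png.
    url="/uploads/${file#"$uploads_dir"/}"
    # -r: recurse, -q: quiet (exit status only), -F: match the path literally.
    if ! grep -rqF -- "$url" "$site_dir"; then
      echo "$file"
    fi
  done
}

# Usage (after building the site): find_unused_uploads _uploads _site
```

Nothing is deleted here. The function only prints candidates, so you can review the list before removing anything.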

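If you have files with spaces or other special characters, you’ll want to encode the filenames before searching, because the generated HTML typically links to the URL-encoded path (a space becomes %20, for example). We can use Jekyll’s URL.escape_path method to do just that.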
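To see what that encoding does, here is a rough shell stand-in for the escaping step. It is not Jekyll's implementation: escape_path below is a hypothetical helper that percent-encodes every byte outside a small safe set, so its output may differ from Jekyll::URL.escape_path for some characters.

```shell
#!/usr/bin/env bash
# Sketch: percent-encode a path so it matches the form that appears in the built HTML.
# Assumption: only letters, digits, and . _ ~ / - are left unescaped.

escape_path() {
  local LC_ALL=C            # treat the input as raw bytes
  local hex char out=""

  # od prints each byte as two hex digits; tr upper-cases them (%2b -> %2B).
  for hex in $(printf '%s' "$1" | od -An -v -tx1 | tr 'a-f' 'A-F'); do
    char="$(printf "\\x$hex")"
    case "$char" in
      [A-Za-z0-9._~/-]) out+="$char" ;;   # safe byte: keep as-is
      *) out+="%$hex" ;;                  # anything else: percent-encode
    esac
  done

  printf '%s\n' "$out"
}

# Usage: escape_path "/uploads/Logo final final 2.svg"
```

The encoded result is what you would pass to grep or ag in place of the raw filename.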

Using the Siteleaf gem

To save you the hassle of maintaining your own script, we decided to add a simple command to our (open source) Siteleaf gem. It will automatically use The Silver Searcher if you have it installed, and otherwise fall back to grep for searching.
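The fallback idea can be pictured roughly like this in shell. This is a sketch of the behavior described, not the gem's actual code, and search_site is a hypothetical name. The ag flags are the ones used earlier in this post; -F is added to grep so it matches literally, like ag's -Q.

```shell
#!/usr/bin/env bash
# Sketch: list files under the site folder that mention a path,
# preferring ag when it is installed and falling back to grep otherwise.

search_site() {
  local needle="$1"
  local site_dir="${2:-_site}"

  if command -v ag > /dev/null 2>&1; then
    # -l: print matching file names only, -Q: literal (non-regex) search.
    ag -lQ "$needle" "$site_dir"
  else
    # -r: recurse, -l: print matching file names only, -F: literal search.
    grep -rlF -- "$needle" "$site_dir"
  fi
}

# Usage: search_site "/uploads/screenshot.png" _site
```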

First, add the following to your Gemfile (if you haven’t already):

source "https://rubygems.org"

gem 'jekyll'

group :development do
  gem 'siteleaf'
end

After a bundle update you can now run:

$ bundle exec siteleaf clean _uploads

This command will build your site locally, check for any unused files, and ask whether you would like to remove them. Note: the _uploads argument is optional, since it is the default; you can also pass in a different folder, such as assets/images.

Building Jekyll site...

Finding unused files...
_uploads/3D_slice.mp4
_uploads/Logo final final 2.svg
_uploads/Screen Shot 2020-09-16 at 9.42.59 AM.png
=> Delete above 3 file(s)? (y/n)

This all happens locally, and works with any Jekyll site — even if you don’t use Siteleaf. To fully remove the assets from Siteleaf and your published site, just commit and sync your changes with GitHub.

Happy cleaning!