The easier way to use lunr search with Hugo

It might not be immediately obvious, but my blog is a collection of static pages, generated by the Hugo static site generator and updated automatically whenever I push to the GitHub repository. Back when I started using it, I had to decide on a search solution. I ruled out a third-party service (because privacy) and a server-supported one (because security). Instead, I went with lunr.js, which works entirely on the client side.

Now if you want to do the same, you’d better not waste your time on the solution currently proposed by the Hugo documentation. It relies on updating the search index manually using an external tool whenever you update the content. And that tool will often deduce page addresses incorrectly because it only supports some Hugo configurations.

Eventually I realized that Hugo is perfectly capable of generating a search index by itself. I recently contributed the necessary code to the MemE theme, so by using this theme you get search capability “for free.” But in case you don’t want to switch to a new theme right now, I’ll walk you through the necessary changes.

Generating the search index

Hugo can generate the search index the same way it generates RSS feeds, for example: it’s just another output format. You merely need to add a template for it, e.g. layouts/index.searchindex.json:

[
  {{- range $index, $page := .Site.RegularPages -}}
    {{- if gt $index 0 -}} , {{- end -}}
    {{- $entry := dict "uri" $page.RelPermalink "title" $page.Title -}}
    {{- $entry = merge $entry (dict "content" ($page.Plain | htmlUnescape)) -}}
    {{- $entry = merge $entry (dict "description" $page.Description) -}}
    {{- $entry = merge $entry (dict "categories" $page.Params.categories) -}}
    {{- $entry | jsonify -}}
  {{- end -}}
]

This will generate a JSON file containing a list of all pages. A page entry contains its address, title, contents, description and categories. You can easily add more fields if you want them to be searchable, for example tags.
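For orientation, a single entry of the generated file should look roughly like the object in the sketch below. The values are invented for illustration; only the field names come from the template above. The missingFields() helper is hypothetical, a quick sanity check you could run against the parsed file after adding or removing fields.

```javascript
// Illustrative entry: values are made up, field names match the template.
const sampleEntry = {
  uri: "/2020/01/01/example-post/",
  title: "Example post",
  content: "Plain text of the page without any HTML markup.",
  description: "",
  categories: ["hugo"]
};

// Hypothetical helper: returns the names of expected fields an entry lacks.
function missingFields(entry, expected = ["uri", "title", "content", "description", "categories"]) {
  return expected.filter(name => !(name in entry));
}

// No fields are missing from the complete sample entry.
console.log(missingFields(sampleEntry));

// An entry with only uri and title lacks the other three fields.
console.log(missingFields({ uri: "/", title: "" }));
```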

Now you have to make sure the search index is actually generated: the output format needs to be added to the site’s configuration. The following assumes a YAML-formatted configuration and the default outputs for the home page:

outputFormats:
  SearchIndex:
    baseName: search
    mediaType: application/json

outputs:
  home:
    - HTML
    - RSS
    - SearchIndex

After rebuilding the website you should have a search.json file in the root directory of the generated site. It’s not going to be tiny, but with gzip compression enabled the download size should be acceptable for most websites.
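If you want a rough idea of the download size before deploying, a small Node.js script can compare the raw and gzip-compressed sizes. This is only a sketch: indexSizes() is a hypothetical helper and the entries are made-up stand-ins. With a real site you would read the generated search.json (public/search.json, assuming Hugo’s default publish directory) instead.

```javascript
const zlib = require("zlib");

// Returns raw and gzip-compressed byte sizes for a JSON-serializable index.
function indexSizes(entries) {
  const raw = Buffer.from(JSON.stringify(entries), "utf8");
  const compressed = zlib.gzipSync(raw);
  return { raw: raw.length, gzip: compressed.length };
}

// Made-up data standing in for a real search index.
const fakeEntries = Array.from({ length: 200 }, (_, i) => ({
  uri: `/posts/${i}/`,
  title: `Post number ${i}`,
  content: "The quick brown fox jumps over the lazy dog. ".repeat(50)
}));

const sizes = indexSizes(fakeEntries);
console.log(`raw: ${sizes.raw} bytes, gzip: ${sizes.gzip} bytes`);
```

Keep in mind that repetitive sample text like this compresses far better than real articles, so only numbers measured on your actual index are meaningful.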

Adding the necessary elements

Now you need a search form on your page. For me it looks like this:

<form id="search" class="search" role="search">
  <label for="search-input">
    <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512" class="icon search-icon"><path d="M505 442.7L405.3 343c-4.5-4.5-10.6-7-17-7H372c27.6-35.3 44-79.7 44-128C416 93.1 322.9 0 208 0S0 93.1 0 208s93.1 208 208 208c48.3 0 92.7-16.4 128-44v16.3c0 6.4 2.5 12.5 7 17l99.7 99.7c9.4 9.4 24.6 9.4 33.9 0l28.3-28.3c9.4-9.4 9.4-24.6.1-34zM208 336c-70.7 0-128-57.2-128-128 0-70.7 57.2-128 128-128 70.7 0 128 57.2 128 128 0 70.7-57.2 128-128 128z"/></svg>
  </label>
  <input type="search" id="search-input" class="search-input">
</form>

That’s an SVG icon from Font Awesome being used as the search label. I style this form in such a way that the text field only occupies space when it is focused. In addition, there is an animation to make the icon spin when a search operation is in progress:

@keyframes spin {
  100% {
    transform: rotateY(360deg);
  }
}

.search {
  display: flex;
  justify-content: center;
  border: 1px solid black;
  min-width: 1em;
  height: 1em;
  line-height: 1;
  border-radius: 0.75em;
  padding: 0.25em;
}

.search-icon {
  color: black;
  cursor: pointer;
  width: 1em;
  height: 1em;
  margin: 0;
  vertical-align: bottom;
}

.search[data-running] .search-icon {
  animation: spin 1.5s linear infinite;
}

.search-input {
  border-width: 0;
  padding: 0;
  margin: 0;
  width: 0;
  outline: none;
  background: transparent;
  transition: width 0.5s;
}

.search-input:focus {
  margin-left: 0.5em;
  width: 10em;
}

Finally, we need a template for search results. This element is hidden but will be cloned and filled with data for any page found by the search. Mine looks like this:

<template id="search-result" hidden>
  <article class="content post">
    <h2 class="post-title"><a class="summary-title-link"></a></h2>
    <summary class="summary"></summary>
    <div class="read-more-container">
      <a class="read-more-link">Read More »</a>
    </div>
  </article>
</template>

The JavaScript code

And then you need some JavaScript code to make all of this work. Obviously, you will need the lunr.js script itself. If you have non-English texts on your website, you will also need lunr.stemmer.support.js and the right language pack from the lunr-languages package. Finally, you need some code to connect all of this to the search field. In order to conserve bandwidth, my code only loads the search index when it is needed, i.e. the first time a search is performed.

window.addEventListener("DOMContentLoaded", function(event)
{
  var index = null;
  var lookup = null;
  var queuedTerm = null;

  var form = document.getElementById("search");
  var input = document.getElementById("search-input");

  form.addEventListener("submit", function(event)
  {
    event.preventDefault();

    var term = input.value.trim();
    if (!term)
      return;

    startSearch(term);
  }, false);

  function startSearch(term)
  {
    // Start icon animation.
    form.setAttribute("data-running", "true");

    if (index)
    {
      // Index already present, search directly.
      search(term);
    }
    else if (queuedTerm)
    {
      // Index is being loaded, replace the term we want to search for.
      queuedTerm = term;
    }
    else
    {
      // Start loading index, perform the search when done.
      queuedTerm = term;
      initIndex();
    }
  }

  function searchDone()
  {
    // Stop icon animation.
    form.removeAttribute("data-running");

    queuedTerm = null;
  }

  function initIndex()
  {
    var request = new XMLHttpRequest();
    request.open("GET", "/search.json");
    request.responseType = "json";
    request.addEventListener("load", function(event)
    {
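      // The load event also fires for error responses such as 404. In that
      // case the JSON response is missing, so give up instead of throwing.
      if (!Array.isArray(request.response))
      {
        searchDone();
        return;
      }

      lookup = {};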
      index = lunr(function()
      {
        // Uncomment the following line and replace de by the right language
        // code to use a lunr language pack.

        // this.use(lunr.de);

        this.ref("uri");

        // If you added more searchable fields to the search index, list them here.
        this.field("title");
        this.field("content");
        this.field("description");
        this.field("categories");

        for (var doc of request.response)
        {
          this.add(doc);
          lookup[doc.uri] = doc;
        }
      });

      // Search index is ready, perform the search now
      search(queuedTerm);
    }, false);
    request.addEventListener("error", searchDone, false);
    request.send(null);
  }

  function search(term)
  {
    var results = index.search(term);

    // The element where search results should be displayed, adjust as needed.
    var target = document.querySelector(".main-inner");

    while (target.firstChild)
      target.removeChild(target.firstChild);

    var title = document.createElement("h1");
    title.id = "search-results";
    title.className = "list-title";

    if (results.length == 0)
      title.textContent = `No results found for “${term}”`;
    else if (results.length == 1)
      title.textContent = `Found one result for “${term}”`;
    else
      title.textContent = `Found ${results.length} results for “${term}”`;
    target.appendChild(title);
    document.title = title.textContent;

    var template = document.getElementById("search-result");
    for (var result of results)
    {
      var doc = lookup[result.ref];

      // Fill out search result template, adjust as needed.
      var element = template.content.cloneNode(true);
      element.querySelector(".summary-title-link").href =
          element.querySelector(".read-more-link").href = doc.uri;
      element.querySelector(".summary-title-link").textContent = doc.title;
      element.querySelector(".summary").textContent = truncate(doc.content, 70);
      target.appendChild(element);
    }
    title.scrollIntoView(true);

    searchDone();
  }

  // This matches Hugo's own summary logic:
  // https://github.com/gohugoio/hugo/blob/b5f39d23b8/helpers/content.go#L543
  function truncate(text, minWords)
  {
    var match;
    var result = "";
    var wordCount = 0;
    var regexp = /(\S+)(\s*)/g;
    while (match = regexp.exec(text))
    {
      wordCount++;
      if (wordCount <= minWords)
        result += match[0];
      else
      {
        var char1 = match[1][match[1].length - 1];
        var char2 = match[2][0];
        if (/[.?!"]/.test(char1) || char2 == "\n")
        {
          result += match[1];
          break;
        }
        else
          result += match[0];
      }
    }
    return result;
  }
}, false);

This glue code might require a few changes depending on your setup. You need to adjust the initIndex() function if you use a non-English language (uncomment the this.use() call) or have additional fields in your search index. You also need to adjust the search() function if your search result template is different from mine listed above, if the results should be displayed in a different element, or if you want the search title to have a different class name.
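As an illustration of the first kind of adjustment, the field list could be pulled out into a parameter. The sketch below is not part of the original code: configureIndex() is a hypothetical helper, and the "tags" field assumes that you extended the index template with a corresponding entry (for example from $page.Params.tags). It only uses the ref(), field() and add() calls already present in initIndex().

```javascript
// Hypothetical helper: configures a lunr builder (the `this` object inside
// the lunr() callback) with the given fields, adds all documents and returns
// a lookup table from page address to document.
function configureIndex(builder, fields, docs) {
  const lookup = {};
  builder.ref("uri");

  // Fields have to be declared before any documents are added.
  for (const field of fields)
    builder.field(field);

  for (const doc of docs) {
    builder.add(doc);
    lookup[doc.uri] = doc;
  }
  return lookup;
}

// Possible usage inside initIndex(), assuming the template emits "tags":
//
//   index = lunr(function() {
//     lookup = configureIndex(this,
//         ["title", "content", "description", "categories", "tags"],
//         request.response);
//   });
```

With this arrangement, making another field searchable means changing the template and one array rather than editing the callback body.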

The complete code

The code given above has been mildly simplified; the actual code used by the MemE theme considers a bunch more scenarios. If you want to take a look at the “real thing,” here it is:

Comments

  • John

    Thanks a lot! This really helped getting it working on my hugo site too!

  • Marcin

    Hi - I've done typical copy-paste from your article :) index is generating, I have search input and everything on the place. But... end with no results in this message in console: Uncaught TypeError: target is null

    Pointing at this line of code: while (target.firstChild)

    Any ideas?

    Thanks!

    Wladimir Palant

    Yes, you have no element with class main-inner. You need to adjust the line above, change the selector wherever search results should display. As mentioned, the JavaScript code most likely won't work without adjustments for your layout.

  • Pete

    Thank you so much for this.

  • Sd

    Nice tutorial.

  • Tony

    Very helpful, thank you for sharing

  • rgz

    thank u very much

  • JP

    Thanks for writing this up! It's possibly worth noting that this approach might slow down the browser on search for large sites, as a .json file containing all your post content has to be downloaded and processed into a search index before a search can be completed.

    I now use Hugo (and haven't implemented search, yet) but I wrote a Jekyll plugin for Lunr search a long while back, and got around this initialization cost by pre-computing the index at build time, rather than generating an indexable JSON file (here's a sample of that JSON from a site that uses that Jekyll plugin). It's still a large file to download, but it's slightly smaller (stop tokens have been removed and all words have been stemmed), and it also uses almost no client-side compute time to search, even with a very large index.

    I haven't found a hugo-recommended plugin that does the same, so I'll see what I can build — but my ideal case would be to build a sharded index (spread across multiple files), and have the client-side search only download the parts of the index it needs for your search particularly (reducing the download size too!)

    Wladimir Palant

    I think that back when I tried this the pre-compiled index was somewhat larger than just packaging up all the texts, mostly due to JSON syntax overhead. The processing time also isn’t noticeable for me. So I concluded that this wasn’t worth the complexity.

  • JP

    I should have guessed that you'd done the research! Thanks for the heads up, I think I need to figure out where that threshold is before I implement mine, as mine is starting to grow! Thanks :)