Lunr.js as search engine for a static site
Author: Admin
Title: Lunr.js as search engine for a static site
Language: en-US
Number of words: 1477
Created: 02:02 on Wednesday, 01. September 2021
Modified: 02:02 on Wednesday, 01. September 2021
Keywords: lunrjs, lunr, search, jekyll
Excerpt: How to use the lunr.js search script to implement a local site search for a static web site built with Jekyll or similar static page generators.
Tags: Jekyll
Page layout: nonav
Static sites are a great thing and offer many advantages, but generating dynamic content is naturally difficult, and a search feature usually needs a database back end and server-side dynamic content to work. But this is only one side of the coin: there are ways to implement search without any server-side dynamic content creation. As in many other areas, modern JavaScript can rescue the situation by providing powerful and efficient local search solutions that run in the visitor’s browser.
For such a solution, the basic workflow looks as follows, assuming we are using a static site generator like Jekyll or Hugo. The solution works, however, for any site that is based on static content (pure HTML, CSS and JavaScript).
- When building the site, generate a static page containing the search index. This can be an HTML page or a JSON document; it really does not matter. Most JavaScript search engines expect the index to be in JSON format.
- Use a client-side script that reads this index and presents the results.
lunr.js is such a script. It runs in your browser and searches a JSON-formatted search index for keywords. Ideally, the index should contain various metadata and some of the article’s real content. For my solution, an index entry for a single document contains the following:
- The title of the post or article
- The name of its author
- The list of categories it belongs to. On this site, an article can be a member of multiple categories.
- The modification date. Used for sorting the results.
- An excerpt for the article. This is typically a short summary or abstract (a few sentences at most). It must be part of the front matter. If you do not want to maintain an excerpt for all your content, you can also use the first couple of sentences from the actual content or instruct Jekyll to generate excerpts automatically using an excerpt separator.
- The URL for the article. Needed to provide a link in the search result.
- Keywords: A list of words describing the main subject the article is about.
- Tags: The list of tags for the article or post.
This is what a single index entry looks like:
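As a rough illustration, an entry produced by the template further down might look like the following. The field names are taken from that template; the key, the URL and the category are made-up placeholders, not real data from this site.

```javascript
// Illustrative sketch only: in the page, this object is assigned to window.store.
// The key is the slugified URL of the document (hypothetical URL here).
const store = {
  "posts-lunr-search-html": {
    title: "Lunr.js as search engine for a static site",
    author: "Admin",
    category: "Jekyll", // placeholder value
    modified: "02:02 on Wednesday, 01.September 2021",
    excerpt: "How to use the lunr.js search script to implement a local site search.",
    url: "/posts/lunr-search.html", // placeholder value
    keywords: "lunrjs, lunr, search, jekyll",
    words: "1477" // the template emits the word count as a string
  }
};
```

The search script later looks up entries by this key, because lunr returns the key as the ref of each result.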
This is the Liquid template code fragment for building the index on this site.
This code fragment iterates over all articles in all collections to generate one search index entry per document. The code should be easy enough to adapt to your own needs.
Please note lines 5-7 (the check for the sys collection) and 9-11 (the check for the drafts tag) to see how you can exclude whole collections or use tags to exclude specific posts from appearing in the index. I exclude the sys collection, because it does not contain real posts, just support documents. I also exclude all posts tagged with drafts, because they are normally unpublished.
<script>
  {% assign nr_items = 0 %}
  window.store = {
    {% for collection in site.collections %}
      {% if collection.label == "sys" or site[collection.label].size < 1 %}
        {% continue %}
      {% endif %}
      {% for post in site[collection.label] %}
        {% if post.tag contains 'drafts' %}
          {% continue %}
        {% endif %}
        "{{ post.url | slugify }}": {
          "title": "{{ post.title | xml_escape }}",
          "author": "{{ post.author | xml_escape }}",
          "category": {{ post.category | default: "" | strip_html | strip_newlines | jsonify }},
          "modified": "{{ post.modified | date: "%H:%M on %A, %d.%B %Y" | xml_escape }}",
          "excerpt": {{ post.excerpt | default: "No content excerpt available" | strip_html | strip_newlines | jsonify }},
          "url": "{{ site.baseurl }}{{ post.url | xml_escape }}",
          "keywords": {{ post.keywords | default: "" | strip_html | strip_newlines | jsonify }},
          "words": "{{ post.content | number_of_words }}"
        },
        {% assign nr_items = nr_items | plus: 1 %}
        {% comment %} Every entry ends with a comma, so skipped drafts cannot leave a doubled comma. {% endcomment %}
      {% endfor %}
      {% comment %} A trailing comma after the last entry is valid in a JavaScript object literal. {% endcomment %}
    {% endfor %}
  };
  var nr_items = {{ nr_items }};
</script>
<script src="{{ site.baseurl }}/assets/js/lunr.min.js"></script>
<script src="{{ site.baseurl }}/assets/js/search.js"></script>
In our example, we build the JSON object with the window object as its container. While this is a good solution, it’s not mandatory. You could use an independent object for it, but the window object has the simple advantage that you do not need to pass it to your search script. It’s always available and shared by all scripts running in the current browser tab.
So, in our example, the search index is part of the /search.html page, which also offers a search box and a container to present the search results. The index could be put into a separate JSON file, but then your scripts would need to load it, which creates additional and unneeded complexity. It is also possible to pass a search query to /search.html using the query parameter. For example, /search.html?query=fun directly performs a search. This is how the search box in the top right corner works: it simply calls search.html, passing whatever string you typed into the search box as the parameter.
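For comparison, here is a minimal sketch of the separate-file variant that I decided against. The file name /search-index.json and the function name loadSearchIndex are assumptions for illustration; this site does not use them. The fetch function is passed in as a parameter only to keep the sketch easy to try outside a browser.

```javascript
// Sketch: load the search index from a separate JSON file instead of
// embedding it in the page. "/search-index.json" is a hypothetical URL.
async function loadSearchIndex(url = "/search-index.json", fetchFn = fetch) {
  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error("Could not load search index: HTTP " + response.status);
  }
  // Resolves to an object shaped like window.store.
  return response.json();
}
```

With this variant, search.js would have to wait for the promise to resolve before it can build the lunr index, which is exactly the extra complexity the embedded index avoids.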
Search index size considerations
Since the whole index will be part of the search.html page on your site, wouldn’t this page grow very big?
It depends on the site, obviously. On my small site with fewer than 100 documents, search.html isn’t even 100 kB in size. I’ve done the math and found that the average size per index entry is about 450 to 500 bytes. To keep it simple, let’s just assume half a kB per document, so 1000 documents would result in about 500 kB (half a MB), which is still nothing given today’s average bandwidth. With 10,000 documents (already a fairly large site) the JSON index would be about 5 MB in size, still a very manageable dimension.
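If you want to check these numbers for your own site, the arithmetic can be sketched in a few lines. Both function names are made up for this example, and the default of 0.5 kB per entry is just the rule of thumb from above.

```javascript
// Project the index size in kB from the number of documents,
// assuming roughly half a kB per entry (rule of thumb, not a measurement).
function estimateIndexSizeKB(documentCount, kbPerEntry = 0.5) {
  return documentCount * kbPerEntry;
}

// Measure the serialized size in bytes of an actual store object.
function measureStoreBytes(store) {
  return new TextEncoder().encode(JSON.stringify(store)).length;
}
```

In the browser console of the search page, measureStoreBytes(window.store) would report the real size of the embedded index, which you can then compare with the estimate.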
And search performance?
Nothing you should worry about unless you get into the high five-digit numbers of documents. JavaScript in modern browsers is fast enough to perform such searches in less than a second. The memory requirements per tab would be more of a concern, since the JSON index must be read into and kept in memory (as part of the window object). Loading the search.html page could therefore result in performance problems with really large indexes (tens of thousands of entries).
On the average site, performance problems wouldn’t be a concern. Expect a local search using JavaScript to be faster than a remote server-side database search on a server running under high load. The hardware performance found in devices not older than 5-10 years should normally be good enough.
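Rather than taking my word for it, you can measure the search time on your own index. The following is only a sketch; timeSearch is a made-up helper name, and the search function is passed in so that the helper does not depend on how your index is built.

```javascript
// Run a search function once and report how long it took in milliseconds.
function timeSearch(searchFn, term) {
  const start = performance.now();
  const results = searchFn(term);
  return { results: results, ms: performance.now() - start };
}
```

On the search page this could be called as timeSearch(function (t) { return idx.search(t); }, "jekyll"), assuming idx is the lunr index built by the search script.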
Performing the search
Above, we did our homework, that is, building the search index. Now focus on lines 30 and 31 of the code fragment, the two script tags at its end. They include two scripts: lunr.min.js is the search engine you download from the lunr website. It’s also possible to include it from an online resource (CDN) that always gives you the most recent version. Please see the docs on the lunr website for instructions on how to include the script on your site. I prefer to host it locally, but that’s only one of many options.
The second script (search.js) is the interface script that initializes lunr with the search index, performs the search and inserts the results into the DOM.
(function() {
  // Render the lunr results into the #search-results element.
  function displaySearchResults(results, store) {
    var searchResults = document.getElementById('search-results');
    if (results.length) { // Are there any results?
      var appendString = '';
      appendString += '<h4>The query found ' + results.length + ' results.</h4>';
      for (var i = 0; i < results.length; i++) { // Iterate over the results
        var item = store[results[i].ref];
        appendString += '<li class="lunr-searchresult"><a href="' + item.url + '">' + item.title + '</a>';
        appendString += '<br>' + item.excerpt;
        appendString += '<br><span class="author subheader" style="float:right;">' + item.words + ' Words</span><span class="author subheader" style="float:left;"><span class="time_symbol"></span><span class="time" >' + item.modified + '</span></span>';
        appendString += '<div class="clearfix"></div></li>'; // Close the list item
      }
      searchResults.innerHTML = appendString;
    } else {
      searchResults.innerHTML = '<h4>Nothing found.</h4>';
    }
  }
  // Return the decoded value of a URL query parameter, or undefined if it is absent.
  function getQueryVariable(variable) {
    var query = window.location.search.substring(1);
    var vars = query.split('&');
    for (var i = 0; i < vars.length; i++) {
      var pair = vars[i].split('=');
      if (pair[0] === variable) {
        // Forms encode spaces as "+"; a parameter without "=" yields an empty string.
        return decodeURIComponent((pair[1] || '').replace(/\+/g, '%20'));
      }
    }
  }
  var searchTerm = getQueryVariable('query');
  if (searchTerm) {
    document.getElementById('search-box').setAttribute("value", searchTerm);
    // Initialize lunr with the fields it will be searching on. I've given title
    // a boost of 10 to indicate matches on this field are more important.
    var idx = lunr(function () {
      this.field('id');
      this.field('title', { boost: 10 });
      this.field('author');
      this.field('category');
      this.field('keywords');
      this.field('excerpt');
      for (var key in window.store) { // Add the data to lunr
        this.add({
          'id': key,
          'title': window.store[key].title,
          'author': window.store[key].author,
          'category': window.store[key].category,
          'excerpt': window.store[key].excerpt,
          'keywords': window.store[key].keywords
        });
      }
    });
    var results = idx.search(searchTerm); // Get lunr to perform a search
    displaySearchResults(results, window.store); // Render the results into the page
  }
})();
As you can see, it’s quite straightforward. The displaySearchResults() function at the top presents the results of the search and builds the response. This is then inserted into the search-results element, which should normally be a simple list container such as <ul id="search-results"></ul>.
In the if (searchTerm) block near the end, lunr.js is initialized with data from the window.store object (our JSON index). The call to idx.search(searchTerm) executes the search, and the results are then passed to displaySearchResults().
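lunr returns its results ordered by relevance score. The index also carries a modification date per entry, so the results could be re-ordered by date before they are displayed. The following is only a sketch under one big assumption: the display string stored in modified is not reliably machine-parseable, so the sketch expects an additional, hypothetical field modified_iso holding an ISO 8601 date. That field is not part of the index template shown in this article.

```javascript
// Sketch: order lunr results by modification date, newest first.
// Assumes each store entry has a hypothetical "modified_iso" field
// such as "2021-09-01T02:02:00Z".
function sortResultsByDate(results, store) {
  return results.slice().sort(function (a, b) {
    return Date.parse(store[b.ref].modified_iso) - Date.parse(store[a.ref].modified_iso);
  });
}
```

The function works on a copy, so the original relevance order from lunr stays available if you want to offer both orderings.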
The search form on the search page.
A simple form to enter a search string. The page then calls itself, passing the string as a parameter. The JavaScript function getQueryVariable() detects the query parameter, and the script performs the search.
<form id="searchform" style="margin:0;padding: 0 20px 20px 20px;" action="/search.html" method="get">
  <input type="text" id="search-box" name="query">
  <input type="submit" value="Search">
  <span id="total_items"></span>
</form>
<!-- the results go here -->
<ul id="search-results"></ul>
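The getQueryVariable() helper in search.js parses the query string by hand. In current browsers the built-in URLSearchParams class can do the same job, including decoding of the + signs that forms use for spaces. A minimal sketch, with getQueryParam being a name made up for this example:

```javascript
// Sketch: read one parameter from a query string such as "?query=static+site".
// Returns undefined when the parameter is absent, like getQueryVariable().
function getQueryParam(search, name) {
  var value = new URLSearchParams(search).get(name);
  return value === null ? undefined : value;
}
```

On the search page, the call would be getQueryParam(window.location.search, 'query').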
The code for a search box anywhere on your site
This can go wherever you want. The site header would be a good place for it.
<form id="lunr-searchform" action="/search.html" method="get">
  <div class="lunr-wrap">
    <div class="lunr-search">
      <input type="text" class="lunr-searchTerm" name="query" placeholder="Search this site...">
      <button type="submit" value="search" class="lunr-searchButton">
        <span class="search_symbol"></span>
      </button>
    </div>
  </div>
</form>
The CSS for the search form(s)
li.lunr-searchresult a {
  font-weight: bold;
  font-size: 130%;
}
li.lunr-searchresult {
  margin-bottom: 20px;
  font-family: $sans-font !important;
  font-size: 90% !important;
}
div.lunr-search {
  width: 100%;
  position: relative;
  display: flex;
}
input.lunr-searchTerm {
  width: 100%;
  border: 0;
  border-right: none;
  padding: 5px;
  height: 18px;
  border-radius: 0;
  outline: none;
  color: $field_color;
  font-weight: bold;
  background-color: $field_bg;
}
button.lunr-searchButton {
  width: 40px;
  height: 28px;
  border: 0;
  background: #00B4CC;
  text-align: center;
  color: #000;
  border-radius: 0;
  cursor: pointer;
  font-size: 18px;
}
/*Resize the wrap to see the search bar change!*/
div.lunr-wrap {
  width: 20%;
  position: absolute;
  top: 50%;
  right: 20px;
  transform: translate(0, -50%);
}
form#lunr-searchform, form#searchform {
  background-color: $background_color;
}
form#lunr-searchform {
  margin: 0;
  padding: 0;
  float: right;
  margin-right: 20px;
  width: 20%;
  min-width: 40%;
}
form#searchform input {
  color: $field_color;
  background-color: $field_bg;
  border: 1px solid $accent_color;
  padding: 4px;
}
span.search_symbol {
  font: normal normal normal 15px/1 FontAwesome !important;
  color: black;
  margin: 0 4px 0 0;
}
span.search_symbol:before {
  content: "\f002";
}