Making my Static Blog Generator
2023, August 9
Preface
I started writing this blog post before I started building the project. I wrote it to capture my Rust learnings and my thought process as I went from zero to finished. The article begins with me being a little clueless about the Rust language, and it ends with the finished project, which I used to create the page that you are reading right now.
This post is for you if:
- You want to make a blog and you want to make it with Rust.
- You know how to code but you are new to Rust and you are thinking that Rust is overwhelming. Here I show you what it feels like to push through the density in order to create something useful.
- You are already proficient with Rust and you want to make a blog. You may enjoy reading through the article to use it as a reference for your own project. You might skim through some sections that are not relevant to you.
Whichever bullet point you fit into, I recommend that you look through the code on the final project. The repo can be found here:
Preface added on 2024, March 7
A little story about learning Rust for the first time
Introduction
I decided it was time to set up my own blogging configuration. It has been too long since I first thought it would be a good idea to have my own blog, but the right time to create it had not come yet. I was uncertain about what I would write about and which technology I would use, but now my mind has settled on those topics, so let's get started!
Looking at existing solutions
At first I tried to look at existing Static Site Generators (SSG). Every year I would spend a few hours looking at existing solutions, trying them for a moment, and quickly giving up. Existing solutions were either too complicated, written in a programming language that I did not want to invest myself in, or gave me so many theme options that I would get stuck thinking about the perfect theme.
So recently I decided to learn Rust, and I felt the urge to create my blog more strongly than ever. Alright, so let's build my blog with Rust and hit two targets with one toss. Looking at existing SSG solutions for Rust I came across Zola. I looked into it a little bit, but it has too many options, and I am easily overwhelmed when I have too many options: it gives me analysis paralysis. Still, while looking into Zola I found out that it uses Tera, a templating engine.
Perhaps I can just use the templating engine alone and ignore the rest of Zola.
I also want to write my blog in markdown though. I do not want to be writing HTML or similar, it is too complicated just to get some words out there into the net. A quick search for a markdown parser in Rust yielded markdown-rs. It comes with the option to output HTML from markdown files, as well as other goodies such as Front Matter support. This is great because it lets me add metadata to my files in plain text, from my editor, so easily.
Okay cool, so in theory I can write in Markdown, process it into HTML with markdown-rs, and then use this HTML output with Tera in order to get a better looking result. And, in the process of it all, learn some crablang Rust!
Design
The first goal should be simplicity. This is a tough goal, but let's do that so that I can actually finish the project. And what's simpler than a CLI tool? Something like create-html /path/to/files and kaboom, I get my website HTML. I will worry about deploying later, it will probably be some simple bash script.
Now, I want to support some basic things, and leave some things for another day.
Do:
- I pass in a directory to my tool, and it finds all files within subdirectories in the path.
- Articles include a UUID in frontmatter that will be used for their URL.
- Articles include a date in the frontmatter to allow readers to know the date in which an article was published, but also to chronologically sort the articles on the index page of the blog.
- There will be an index page on the blog.
- Articles should be able to link to each other in markdown and it should get translated to proper links in HTML.
- Articles include a value on the frontmatter to hide or publish the article.
- Basic styling such that letters are legible.
- Simple mailing list at the page footer for future engagement.
- Include the directory of an article as metadata, so if an article is in Code/Rust then such information should be on display somewhere for that article. (maybe punt?)
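To make the list above a bit more concrete, here is a rough sketch of how the front matter fields could drive the index page. Everything in it is an assumption for illustration: the `Article` struct, its field names, and the helpers `index_entries` and `article_url` are hypothetical and not taken from the finished project. It also assumes dates are written as YYYY-MM-DD, so that sorting the strings sorts the articles chronologically.

```rust
// Hypothetical shape of the front matter described above; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct Article {
    title: String,
    date: String, // assumed ISO 8601 (YYYY-MM-DD): string order equals chronological order
    uuid: String,
    publish: bool,
}

// Keep only the published articles and put the newest first, for the index page.
fn index_entries(mut articles: Vec<Article>) -> Vec<Article> {
    articles.retain(|a| a.publish);
    articles.sort_by(|a, b| b.date.cmp(&a.date));
    articles
}

// The UUID from the front matter decides the URL of each article page.
fn article_url(article: &Article) -> String {
    format!("articles/{}.html", article.uuid)
}
```

Calling index_entries on all parsed articles would give the list to render on the index page, and article_url gives the link target for each entry.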
Do not (leave it for later):
- Include pictures in the articles.
- Include tags in the articles.
- Add an index page to search through tags.
- Watch for changes on the directory of the articles and automate publishing to the web.
- Make it super stylish.
- Include a contact form. (For now just add a mailto link).
- View analytics.
Other considerations
With time I have learned that keeping it simple and custom-made tends to be the best way to build lasting tools that can be improved and modified over time. This also allows me to be very specific about how I input my data and how it gets built. The idea is to make it as easy as possible to write a new article, to diminish the friction between thinking and publishing as much as possible in order for the habit to stick. So keeping it as a simple CLI tool that crawls through a specific directory lets me leverage my current note-taking tools: Neovim with the Telekasten plugin. Additionally, I sync my notes with a Virtual Private Server through Syncthing.
For these reasons I care a lot about having the directory structure that works best for myself, nothing less and nothing more. This is one of the multiple reasons why pre-built SSG solutions are not my cup of tea: they impose their directory structure and their semantics on the user. In turn you get a lot of features, like a massive gallery of themes that you can use and whatnot, if you are into that. My goal here is to build my own tools that work well with my existing workflow, and to learn a little bit of Rust in the process, which is a lot more useful than learning the specific configuration of some blogging framework.
Anyways, all I am trying to do is my own small blog, not some enterprise CMS; after all, I am a man of hacks, not a man of fanciness, so let's move on.
Building the thing
The time has come to write some code. My background in Rust is minimal: I completed a challenge (singular) from Advent of Code (AoC) and read some parts of the Rust Book. I remember that solving the AoC challenge was a difficult experience, but now some time has passed and I went over Chapter 4 of the Rust Book, and I feel like it made a lot more sense this time around, so here goes nothing!
I either succeeded and you are reading this right now, or these ramblings never left my machine.
Code
It has been a few days, I spent them consuming more tutorials at warp speed. I must stop learning without practice though, any longer will be dangerous. I must overcome the coding block. I decide to start coding at 2023-08-03 19:14.
Matching files with glob
I find Rust code very easy to read. The difficulty with Rust comes when writing it and getting all the types right. I consider this a fantastic circumstance, since reading code is more important than writing it.
So I wrote some 30 lines of Rust: I can pass a directory as a CLI argument and the program outputs the markdown files found in that directory. The hardest part with Rust is refactoring the code to make it nicer, but refactoring is most important for learning.
Let me show you, this is what my code looked like at first:
// not in function, with nested `if let`, does not aggregate values
for entry in glob(&args.input_dir).expect("Failed to read input directory") {
    match entry {
        Ok(path) => {
            if let Some(ext) = Path::new(path.to_str().unwrap()).extension() {
                if let Some("md") = ext.to_str() {
                    println!("{:?}", path.display());
                }
            }
        }
        Err(e) => println!("{:?}", e),
    }
}
Here is after a marginal improvement:
// not in a function, removed nested `if let`,
// does not aggregate values
for entry in glob(input_dir)? {
    match entry {
        Ok(path) => {
            if let Some("md") = Path::new(path.to_str().unwrap())
                .extension()
                .unwrap_or_default()
                .to_str()
            {
                println!("{:?}", path.display());
            }
        }
        Err(e) => println!("{:?}", e),
    }
}
And finally to the much nicer, extracted function:
// extracted!
fn find_md_files(input_dir: &str) -> Vec<PathBuf> {
    glob(input_dir)
        .expect("Failed to read input_dir")
        .into_iter()
        .filter_map(|s| s.ok()) // drop the entries that could not be read
        .filter(|s| s.extension().is_some_and(|x| x == "md")) // keep only .md files
        .collect()
}
So far I learned a few things.
- The differences between how to handle Result and Option still confuse me a little bit, they are so similar.
- Learned that there are variations of unwrap, such as unwrap_or_default, which make handling wrong cases much nicer.
- There is filter(), which is nice to get rid of unwanted items in an iteration.
- There is filter_map(), which is nice to unwrap the Result and Option enums into their inner values, whilst ignoring the ones that are Err or None.
- .ok() and .is_some_and() are very handy.
- An iterator is much nicer to write than a for-loop, it felt particularly useful when there are no elements in the list to iterate because the iterator returns the empty list, whilst a for-loop would not run at all and additional code to return an empty list would be necessary.
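The learnings above can be tried out without the glob crate. The following is a minimal, std-only sketch: `keep_markdown` is a hypothetical helper, and its `Result<PathBuf, String>` input is a made-up stand-in for the entries that glob yields, not the crate's real error type.

```rust
use std::path::PathBuf;

// Hypothetical stand-in for what `glob` yields: each entry is either a path
// or an error (a plain String here, only for illustration).
fn keep_markdown(entries: Vec<Result<PathBuf, String>>) -> Vec<PathBuf> {
    entries
        .into_iter()
        // `.ok()` turns Result<T, E> into Option<T>; `filter_map` drops every `None`.
        .filter_map(|entry| entry.ok())
        // `extension()` returns an Option; `is_some_and` tests it in one step.
        .filter(|path| path.extension().is_some_and(|ext| ext == "md"))
        .collect()
}
```

Passing an empty vector gives back an empty vector, with no special case needed, which is the convenience mentioned in the last bullet.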
Learning efficiently
Something I must mention, for me personally, the rust-analyzer has been essential to be able to write Rust. This is because I am a newbie and I find it really difficult to get the right code in Rust without errors, so it becomes paramount for me to have the integration of the language server with my editor. In contrast, if I had to go to the terminal every time I wanted to try something different, this would have easily frustrated me too much to continue due to the slowness and due to how difficult it is to parse the error messages with my eyes when they are on the terminal; however, with the language server I can make my mistakes at lightspeed, and that is very important! Getting the right result is not about avoiding mistakes, it is more so about completing all the mistakes that you must before you find the answer.
More learnings
Another issue I came across is that some functions do not take reference types: such a function expects a non-reference type that it consumes by taking ownership, and a struct field does the same to the value assigned to it. I was not able to modify this because it comes from a library. For example here:
// structs and functions from markdown-rs, they take some configuration options
let parse_options = ParseOptions {
    constructs: custom(),
    ..ParseOptions::gfm()
};
let options = Options {
    parse: parse_options, // takes ownership of parse_options
    ..Options::gfm()
};
let html = to_html_with_options(md, &options); // OK
let ast = to_mdast(md, &parse_options); // error! parse_options is no longer available
Sometimes you can just use .clone() for similar errors, but in this case the structs do not implement the Clone trait. So a simple solution is turning these into functions. I don't want to extract them into their own functions completely, so I can turn them into closures easily:
let parse_options = || {
    return ParseOptions {
        constructs: custom(),
        ..ParseOptions::gfm()
    };
};
let options = || Options {
    parse: parse_options(),
    ..Options::gfm()
};
let ast = to_mdast(md, &(parse_options()));
let html = to_html_with_options(md, &(options()));
Now the code that creates the configuration is shared, but a new value is created for each function call, solving the issue with ownership.
However! My brother pointed out to me that this is silly and that I could simply re-order the statements and avoid the closures:
let parse_options = ParseOptions {
    constructs: custom,
    ..ParseOptions::gfm()
};
let ast = to_mdast(md, &parse_options)?; // borrows parse_options so we can re-use it below
let options = Options {
    // here `options` that we are declaring takes ownership of `parse_options`
    parse: parse_options,
    ..Options::gfm()
};
let html = to_html_with_options(md, &options)?;
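The same "borrow first, move last" ordering can be reproduced without markdown-rs. This is only a sketch under assumptions: `ParseOpts`, `Opts`, and the two `describe_*` functions are hypothetical stand-ins for the library's types and functions, and neither struct derives Clone, to mirror the situation above.

```rust
// Hypothetical stand-ins for the library's option structs; neither derives Clone.
struct ParseOpts {
    gfm: bool,
}

struct Opts {
    parse: ParseOpts,
    pretty: bool,
}

// Both functions only borrow their argument, like the library functions above.
fn describe_parse(p: &ParseOpts) -> String {
    format!("parse(gfm={})", p.gfm)
}

fn describe_all(o: &Opts) -> String {
    format!("all(gfm={}, pretty={})", o.parse.gfm, o.pretty)
}

fn run() -> (String, String) {
    let parse = ParseOpts { gfm: true };
    // Borrow first: `parse` is still owned by this scope afterwards.
    let first = describe_parse(&parse);
    // Move last: from here on `parse` lives inside `opts` and cannot be named again.
    let opts = Opts { parse, pretty: false };
    let second = describe_all(&opts);
    (first, second)
}
```

Swapping the two halves of run, so that opts is built before describe_parse(&parse) is called, should bring back the "borrow of moved value" compile error.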
Yet another issue I came across is that sometimes I want to .filter_map over some values, run some function, and return the value while filtering out the values that created an error. .filter_map expects the closure to return an Option, that is, either a Some() or a None. This is fine, except sometimes you have a Result instead. Here you can use .ok() to turn the Result into an Option: any Err() will be turned into None, and any Ok() will be turned into Some(), so now you can use .filter_map!
glob(input_dir)
    .expect("Failed to read input_dir")
    .into_iter()
    .filter_map(|s| s.ok()) // all good
    .collect()
However, this becomes an issue if your map function contains a little more logic and you want to use Rust niceties such as ?. Using ? inside your closure returns any Err the moment it happens, so the closure ends up returning a Result and you get no opportunity to call .ok() on it. In this case we can use flat_map() instead of filter_map(), because both Option and Result can be flattened: both implement the IntoIterator trait (ref), yielding the inner value for Some or Ok and nothing for None or Err.
let htmls: Vec<Result<std::string::String, std::string::String>> = files
    .into_iter()
    .flat_map(|f| -> Result<(PathBuf, String), Error> {
        let s = std::fs::read_to_string(f.clone())?; // this could return Error
        Ok((f, s)) // also allows me to return my own type because I can
                   // choose the value that I wrap with `Ok()`, instead of having to return
                   // the value returned by `.ok()`
    })
    .map(|(_f, s)| produce_html_from_md(&s[..]))
    .collect();
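Here is a smaller, std-only sketch of the same flat_map trick, for playing with it in isolation. The helper `doubled_numbers` is hypothetical, and parsing integers stands in for reading files.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parse each string and double it, skipping anything
// that fails to parse.
fn doubled_numbers(inputs: &[&str]) -> Vec<i32> {
    inputs
        .iter()
        .flat_map(|s| -> Result<i32, ParseIntError> {
            let n: i32 = s.trim().parse()?; // `?` leaves this closure early with the Err
            Ok(n * 2)
        })
        // Result implements IntoIterator: Ok yields one item, Err yields none,
        // so flat_map keeps only the successful values.
        .collect()
}
```

One thing to keep in mind with this design is that the errors are discarded silently. If the failures matter, collecting into a Result<Vec<_>, _> or logging inside the closure is probably a better fit.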
Integrating Tera
At this point I have had enough battles with the compiler such that writing code no longer feels like poking the dark with a stick. So it has been more or less straightforward: reading docs and learning about Tera's template syntax, which is based on Jinja2 and Django templates. For this I found it easier to understand the templating by looking at the Django documentation. I have managed to generate HTML with markdown-rs and created HTML files that use the Tera templates in conjunction with the HTML (from markdown-rs) to produce an index.html that links to the given articles/uuid.html. The pages share a header and a footer, and now I am antsy to ship it, I want to get it over with even if some features have not been implemented yet. However, even now I feel like shipping it out without styling is unacceptable, so I will do that and wrap up the project, then write a bash thing that uploads the output to netlify.
Styling
For styling I always feel like tailwindcss gives me the most speed and flexibility, and I am feeling lazy. At first I thought I was going to use some base styles, but adding the tailwind markup to the Tera templates was really easy to do. Tailwind's typography plugin is enough to make it look professional, so I will simply add a little bit of whitespace in some places, and perhaps DaisyUI themes to get some dark/light themes for reading comfort. I had a bit of trouble figuring out how to add code highlighting. I found highlightjs and went with it; highlightjs does not work so well with themes, but ship it.
Streamlining the development process
There were a few little helpful commands to streamline the development.
First, this is the command to run the project with the input directory -i and the output directory -o, this builds the HTML files with Tera by running the project:
cargo run -- -i "${HOME}/Sync/PARK/Area/Publish/**/*" -o `pwd`
That's nice but I wanted this step to occur every time there was a change in the project, jumping to my terminal and running the command every time was too clunky. So I found cargo-watch and that worked great, now the command:
cargo watch -x "run -- -i '${HOME}/Sync/PARK/Area/Publish/**/*' -o `pwd`"
For tailwind the command was simple:
npx tailwindcss -i ./templates/input.css -o ./html/styles.css --watch
I put these into a simple makefile for convenience.
Deploying to netlify
It was simple, netlify comes with a CLI tool which is nice, so I just run this command and the built HTMLs get deployed to the CDN:
npx netlify deploy --dir=html
How to use it
Here is a valid markdown file that you can use with Bloggeroo if you want to run it on your machine:
---
title: Making my Static Blog Generator
date: 2023-08-09
uuid: 202307301408
publish: true
---
# A little story about learning Rust for the first time
Then, after cloning the project (make sure you git clone the v0 tag for this test), you can execute the following at the root of the Bloggeroo project:
cargo run -- -i 'Path/To/Your/File/**/*' -o `pwd`
And you will find the output HTML in your pwd inside an /html directory.
Mailing list
I picked mailchimp for now because I have signed up to mailing lists through their service before and I really like that they make it very easy to unsubscribe from a list. I was also able to use a free tier for this so hooray, I got an HTML embed from their service and changed the markup a little bit so that the styling made sense for my blog.
Reference of useful resources
Finally I could not have done this without the help of kind people who create educational materials. In particular I went through the following:
- Read Chapter 4 of the Rust book.
- Watched Bogdan's series on youtube “Let's get Rusty” which goes over the chapters of the Rust book. This was crucial, I was incapable of reading through the Rust book because I have the brain of a fish, but Bogdan's videos at 2x speed was just perfect to get moving quickly.
- Looked at some Advent of Code solutions by TJ DeVries, because understanding Rust is not enough, watching how other people write code that actually looks good is key.
I still need to learn more about
The difference between returning Option and returning Result from a function. They can be used in similar ways, which is exactly why I find it a little confusing and it makes me uncertain of which to pick. It is funny because the built-in ? operator may return early with either an Option or a Result, depending on the return type of the function. And then I have to remember that sometimes the program should simply panic, instead of returning some value.
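For what it is worth, one common rule of thumb can be sketched like this. It is a hedged illustration and not a definitive guideline, and the three functions (`first_char_upper`, `parse_year`, `percent`) are hypothetical examples made up for this purpose.

```rust
use std::num::ParseIntError;

// Absence is not an error here, so Option fits: a string may simply have no first char.
fn first_char_upper(s: &str) -> Option<char> {
    let c = s.chars().next()?; // in a function returning Option, `?` returns None early
    Some(c.to_ascii_uppercase())
}

// Failure carries a reason that the caller may want to report, so Result fits.
fn parse_year(date: &str) -> Result<i32, ParseIntError> {
    // in a function returning Result, `?` returns the Err early
    let year = date.split('-').next().unwrap_or("").parse::<i32>()?;
    Ok(year)
}

// Panicking fits broken invariants, where continuing would make no sense.
fn percent(part: u32, total: u32) -> u32 {
    assert!(total != 0, "total must not be zero");
    part * 100 / total
}
```

The same ? operator appears in the first two functions, and what it returns early depends only on the declared return type.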
Wrapping up!
I did not implement all the features that I wanted, but the more time that passes since the first day of the project, the less it matters to complete everything, and the more it matters to just ship something that isn't too jarring.
So let's look at how well it went!
- 1. I pass in a directory to my tool, and it finds all files within subdirectories in the path.
- 2. Articles include a UUID in frontmatter that will be used for their URL.
- 3. Articles include a date in the frontmatter to allow readers to know the date in which an article was published, but also to chronologically sort the articles on the index page of the blog.
- 4. There will be an index page on the blog.
- 5. Articles should be able to link to each other in markdown and it should get translated to proper links in HTML.
- 6. Articles include a value on the frontmatter to hide or publish the article.
- 7. Basic styling such that letters are legible.
- 8. Simple mailing list at the page footer for future engagement.
- 9. Include the directory of an article as metadata, so if an article is in Code/Rust then such information should be on display somewhere for that article. (maybe punt?)
7/9? That's better than I had hoped for!
The source code at the time of writing can be found here: Sleepful/Bloggeroo.
I have added a git tag such that you can look at my ugly code even if I improve it in the future.
At first I thought that it would be cool if this tool could be leveraged by other people, but as of this moment that goal is far off. Particularly because the Tera templates inside the /templates directory have a lot of code particular to my blog. But maybe if you want to build something similar you can use my project as reference.
I can't believe you got this far! Perhaps you would like to join my mailing list?