Notes About The Notes Project Copypasta, Archive, Ipse Paideia

Notes About The Notes Project

Updates about what I set up with this project and how to use it, with copypasta for command reminders until I automate further.

TL;DR

  1. cd "D:\dev\personal\md_notes_project"
  2. python build_site.py
    • script will request a global password for encryption.
  3. wsl
  4. rsync -avz --delete -e ssh /mnt/f/jmbsquared.com/ justinmichaelbonanno@vps49412.dreamhostps.com:jmbsquared.com
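Until the automation is further along, the steps above could be wrapped in a small Python helper. This is only a sketch: the paths and host are copied from the steps above, the function names are made up for illustration, and it assumes `python` and `wsl` are on the Windows PATH. The `--` after `wsl` passes the rest of the command line through as-is, so rsync's own `-e ssh` is not read as a `wsl` option.

```python
import subprocess

# Values copied from the steps above; adjust if the folders or host change.
PROJECT_DIR = r"D:\dev\personal\md_notes_project"
WEB_ROOT_WSL = "/mnt/f/jmbsquared.com/"
REMOTE = "justinmichaelbonanno@vps49412.dreamhostps.com:jmbsquared.com"


def build_command():
    """Step 2: run the site builder (it prompts for the password itself)."""
    return ["python", "build_site.py"]


def sync_command(source=WEB_ROOT_WSL, remote=REMOTE):
    """Steps 3 and 4: run rsync inside WSL instead of opening a WSL shell."""
    return ["wsl", "--", "rsync", "-avz", "--delete", "-e", "ssh", source, remote]


def publish():
    """Build, then sync. check=True stops at the first step that fails."""
    subprocess.run(build_command(), cwd=PROJECT_DIR, check=True)
    subprocess.run(sync_command(), check=True)
```

Calling `publish()` runs both steps in order. The password prompt still appears in the terminal because the child process inherits it.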

Overview

Markdown, Zettlr, Pandoc, storage strategy.

I have been working on setting up a Markdown-based notes project. Markdown is easy to read in any code editor or IDE, keeps file sizes small, has room for internal metadata (YAML frontmatter), and, as a plain text file, is portable to every consumer computer I have ever known of. As plain text, it is about as future-proof as a format can be.

I don’t mind managing my notes in an IDE, but I also discovered Zettlr in the process. Zettlr is open-source, cross-platform software that visually formats Markdown as I type. It has Pandoc built in and handles advanced document conversion to PDF, HTML, LaTeX, and a whole bunch of other formats. It automatically includes full-text and tag search of the open “workspace” (the root folder of the notes files). Zettlr does not apply any proprietary formatting to the files and does not create an internal database dependency.

Markdown has a few variant specifications: CommonMark, Pandoc Markdown, GFM (GitHub Flavored Markdown), MDX, wiki-style Markdown, etc. The files are not “incompatible” with each other, but each specification has a target environment and a defined feature set. A feature defined in one flavor doesn’t really cause an issue in another, but an application built for one specification may not recognize formatting from another. Pandoc-flavored Markdown supports LaTeX math notation, multiple table formats, citations, and bibliographies.

Pandoc is the de facto standard for document conversion. It can handle any Markdown flavor, though doing so requires understanding its YAML configuration files, Lua filters, and conversion templates. It has a highly advanced feature set for converting documents between formats. I have used it for LaTeX and HTML conversions in the past. Pandoc is very powerful and, from a programmer’s perspective, very easy to use.

Zettlr’s internal version of Pandoc is stable but out of date. I disabled it and am using the Pandoc CLI separately (also available via the pandoc Python module), which also lets me keep my configurations, filters, and templates in a dedicated directory. At the moment I keep mine in my development folder at D:\dev\personal\md_notes_project. (I should rename this to simply pandoc.)

My Markdown library is organized as follows:

F:\ipse_paideia
├───YYYYMM
└───assets
    ├───css
    ├───images
    │   └───YYYYMM
    └───js

The notes follow the filename pattern YYYYMMDDHHMMSS-label_for_legibility.md and are stored in the related YYYYMM folder. Images (or other media files) referenced in a note are stored in the matching assets/images/YYYYMM folder. Zettlr does not render image links by default, but does have an internal preview capability.
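As a small illustration of the naming scheme, here is a sketch that turns a label and a timestamp into the note path and the matching image folder from the tree above. The function names and the slug rule (lowercase, underscores for anything non-alphanumeric) are my assumptions, not something the build script enforces.

```python
from datetime import datetime
from pathlib import PurePosixPath
import re


def note_path(label, when):
    """Return the relative path YYYYMM/YYYYMMDDHHMMSS-label.md for a note."""
    # Assumed slug rule: lowercase, runs of other characters become one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", label.lower()).strip("_")
    folder = when.strftime("%Y%m")
    stamp = when.strftime("%Y%m%d%H%M%S")
    return PurePosixPath(folder) / f"{stamp}-{slug}.md"


def image_folder(when):
    """Return the image folder under assets for the same month as the note."""
    return PurePosixPath("assets") / "images" / when.strftime("%Y%m")
```

For a new note, `note_path("Label For Legibility", datetime.now())` gives the path to create relative to F:\ipse_paideia.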

Export Formatting

The first goal for me was to create a Pandoc export profile that could save a Markdown file as a standalone/atomic HTML file, with images and fonts embedded in base64 format and the CSS styling embedded in the HTML head. I was still learning the Zettlr interface, but it was very quick to set up. This creates completely portable HTML files, rendered perfectly, that I can send around in text messages, emails, dump on my server, etc.
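Outside of Zettlr, roughly the same export can be sketched as a direct Pandoc CLI call from Python. This is an approximation rather than the actual export profile: the function names are hypothetical, and it assumes Pandoc 2.19 or newer, where `--embed-resources` replaced the older `--self-contained` flag.

```python
import subprocess


def standalone_html_command(source, output, css):
    """Build a pandoc command that writes one self-contained HTML file."""
    return [
        "pandoc",
        str(source),
        "--from", "markdown",
        "--to", "html5",
        "--standalone",       # emit a complete document with head and body
        "--embed-resources",  # inline images, fonts, and CSS as data URIs
        "--css", str(css),
        "--output", str(output),
    ]


def export_note(source, output, css):
    """Run the export; raises CalledProcessError if pandoc fails."""
    subprocess.run(standalone_html_command(source, output, css), check=True)
```

Keeping the command in its own function makes it easy to print and inspect before anything is actually run.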

Static Web Enhancements

Then I wanted to do the same thing with externally linked resources and more website features. I wrote a very quick Python script that crawls my note directory, selects only modified files (or all files via a CLI flag), saves all the resulting HTML in a mirrored folder structure in a static website root folder, and mirrors all the assets as well. (This is why I included the web assets folder in the notes folder. The notes folder is the source; the web folder is just an output.) Then I sync the web root to my VPS using rsync and have an instant, complete website version of my notes as well.
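The "only modified files" selection can be sketched as follows. This is not the actual build_site.py: it assumes a note counts as modified when its mirrored HTML file is missing or has an older modification time, and the function names are made up for illustration.

```python
from pathlib import Path


def html_target(md_file, src_root, dst_root):
    """Map a note under src_root to its mirrored .html path under dst_root."""
    relative = Path(md_file).relative_to(src_root)
    return Path(dst_root) / relative.with_suffix(".html")


def notes_to_build(src_root, dst_root, rebuild_all=False):
    """Return (source, target) pairs whose HTML is missing or older than the note."""
    pairs = []
    for md_file in sorted(Path(src_root).rglob("*.md")):
        target = html_target(md_file, src_root, dst_root)
        stale = (
            rebuild_all
            or not target.exists()
            or target.stat().st_mtime < md_file.stat().st_mtime
        )
        if stale:
            pairs.append((md_file, target))
    return pairs
```

Passing `rebuild_all=True` plays the role of the CLI flag that forces every file to be regenerated.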

As an important note, I am currently using Pico.css (v2) as a CSS framework that relies (mostly) on semantic HTML, with some custom overrides to the Pandoc HTML templates and CSS for compatibility and aesthetic purposes.

Then, of course, come the website considerations. It worked immediately as a simple output, and I could easily link between documents using standard Markdown. This required a small Lua filter, so that internal links between notes are automatically updated from .md to .html. It would have been better to use absolute references to assets; however, as an added bonus, I wanted the static site to run directly from the filesystem. This required two sets of instructions with very slight variations, one for the Markdown root and one for the subfolders, using relative paths so that linking remains stable regardless of the static web root folder (currently F:\jmbsquared.com\).
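The link rewriting itself is simple enough to show. My filter is written in Lua; the sketch below does the same job as a Pandoc JSON filter in Python, purely as an illustration of the idea. It assumes that only relative links ending in .md should change and that external URLs and #fragments must be left intact.

```python
import json
import sys
from urllib.parse import urlsplit, urlunsplit


def md_to_html(url):
    """Rewrite a relative link ending in .md to .html, keeping any #fragment."""
    parts = urlsplit(url)
    # Leave external links (http:, mailto:, //host) and non-.md targets untouched.
    if parts.scheme or parts.netloc or not parts.path.endswith(".md"):
        return url
    return urlunsplit(parts._replace(path=parts.path[:-3] + ".html"))


def rewrite_links(node):
    """Walk a pandoc JSON AST in place and fix the target of every Link."""
    if isinstance(node, list):
        for child in node:
            rewrite_links(child)
    elif isinstance(node, dict):
        if node.get("t") == "Link":
            # Link content is [attributes, inline text, [url, title]].
            target = node["c"][2]
            target[0] = md_to_html(target[0])
        for child in node.values():
            rewrite_links(child)
    return node


def main():
    """Filter entry point: read the AST on stdin, write the result to stdout."""
    document = json.load(sys.stdin)
    json.dump(rewrite_links(document), sys.stdout)
```

Adding a call to `main()` at the bottom of the file would let it run as `pandoc --filter` followed by the script path.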

Domain Configuration

See Domain Set-Up Notes

I have stored the favicons, the necessary .htaccess Apache config file, robots.txt, and a very bad 404.html file in D:\dev\personal\md_static_html_domain_prep.

Future Refinements

Progress Since…

Today I cleaned up the development folder and renamed it. I consolidated the script build_site.py to work off of a config.yaml file, which sets app (the development folder with the Pandoc template and configuration file), src (the Markdown notes root folder), and dst (the distribution static HTML web root folder), and names the assets folder and website address.
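To make the shape of that configuration concrete, here is a sketch of how the parsed file might be validated once it has been loaded into a dictionary (for example with PyYAML's `yaml.safe_load`). The keys app, src, and dst are the ones described above; `assets` and `site_url` are hypothetical names for the assets folder and website address, and the `SiteConfig` class is an illustration rather than what build_site.py actually does.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SiteConfig:
    app: Path      # development folder with the pandoc template and configuration
    src: Path      # markdown notes root
    dst: Path      # static HTML web root
    assets: str    # name of the assets folder (hypothetical key name)
    site_url: str  # website address (hypothetical key name)

    @classmethod
    def from_mapping(cls, data):
        """Validate a parsed config mapping and convert path entries to Path."""
        required = ("app", "src", "dst", "assets", "site_url")
        missing = [key for key in required if key not in data]
        if missing:
            raise KeyError(f"config is missing: {', '.join(missing)}")
        return cls(
            app=Path(data["app"]),
            src=Path(data["src"]),
            dst=Path(data["dst"]),
            assets=str(data["assets"]),
            site_url=str(data["site_url"]).rstrip("/"),
        )
```

Failing early on a missing key is cheaper than discovering it halfway through a build.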

The Markdown folders can now contain nested subdirectories to any level. Relative paths are now worked out from the config.yaml settings and applied to all files.
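With arbitrary nesting, the relative prefix back to the site root depends only on how deep a page sits. A minimal sketch of that calculation, with hypothetical function names and assuming the assets folder lives directly under the web root:

```python
from pathlib import PurePosixPath


def prefix_to_root(page):
    """Return the relative prefix from a page back up to the site root."""
    depth = len(PurePosixPath(page).parent.parts)
    return "../" * depth or "./"


def asset_href(page, asset, assets_dir="assets"):
    """Return a relative URL from `page` to a file inside the assets folder."""
    return f"{prefix_to_root(page)}{assets_dir}/{asset}"
```

Because nothing here is absolute, the same links work on the server and when the pages are opened straight from the filesystem.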

I added the ability (using .htaccess and .htpasswd) to create a secure subdirectory anywhere, which is then password protected. .htaccess forces all connections to HTTPS. .htpasswd must be created and saved via SSH one level above the web root, and is then applied to every subdirectory of the web root named secure. This is so that personal notes containing copyright-protected material can be saved for personal use without allowing public access, as per U.S. copyright laws.

I also added the ability to add the author name to the bottom of any page via the author: YAML frontmatter key, and the keywords associated with each page are now displayed at the bottom left of each page.

I don’t know if there’s any way to automate the domain setup. This needs to be done by hand depending on the host. I wonder what Nginx uses instead of .htaccess?
