Seavreeze

I think the way I make my CV is useful enough to be of general interest. I like to customize its contents for each job application, which was initially a pain, because I had to fiddle with the LaTeX source to show or hide different bits of content depending on their relevance to the job. I also maintained two versions, one for this website and one to make a pretty PDF, which was frustrating. CVs have a pretty consistent structure, so I tried to solve my problem by maintaining one YAML file in which the actual contents live, and then rendering HTML and LaTeX files from it. The HTML went on this website, and the LaTeX was compiled to a PDF.

For example, the input YAML file might look like this (contents are hypothetical):

name: 'Illir Muzden'
date_of_birth: '1989-03-28'
address:
    street: 'Surferstreet 11/1'
    city: 'Amsterdam'
    postcode: '1777 AS'
    country: Netherlands
email: 'illir.muzden@gmail.com'
phone: '+31 65553944'
website: 'illirmuzden.com'
github_name: 'adderbuzzup'
summary: ''
jobs:
    -
        date: 'May 2016 – Present'
        employer: 'Looking.ca'
        role: 'Data Scientist'
        summary: 'Analysis of large data-sets to identify trends to allow appropriate business decisions.'
        points:
            - 'Querying petabyte-scale data using Hadoop and related tools such as Hive and Spark to answer business questions.'
            - ...

I wrote a program that passed these values as the context to two Jinja2 templates, one for LaTeX and one for HTML.
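
A minimal sketch of that substitution step, not the actual program. It assumes the YAML has already been parsed into a dict (a hand-written `cv` dict stands in for it here), and both template strings and the `render_cv` helper are made up for illustration. The `\VAR{...}` and `\BLOCK{...}` delimiters in the TeX template are also an assumption: they are a common way to stop Jinja2's default curly-brace syntax from clashing with TeX's own braces.

```python
from jinja2 import Environment

# Stand-in for the parsed YAML file (a subset of the example above).
cv = {
    "name": "Illir Muzden",
    "jobs": [{"employer": "Looking.ca", "role": "Data Scientist"}],
}

# Hypothetical templates, kept on one line each so whitespace is not an issue.
HTML_TEMPLATE = (
    "<h1>{{ name }}</h1>"
    "{% for job in jobs %}<p>{{ job.role }}, {{ job.employer }}</p>{% endfor %}"
)
TEX_TEMPLATE = (
    r"\section*{\VAR{name}}"
    r"\BLOCK{for job in jobs}\textbf{\VAR{job.role}}, \VAR{job.employer}\BLOCK{endfor}"
)

# Default delimiters for HTML; TeX-friendly delimiters for the LaTeX template.
html_env = Environment()
tex_env = Environment(
    variable_start_string=r"\VAR{", variable_end_string="}",
    block_start_string=r"\BLOCK{", block_end_string="}",
    comment_start_string=r"\#{", comment_end_string="}",
)


def render_cv(cv, target):
    """Render the CV dict with the template for `target` ('tex' or 'html')."""
    if target == "tex":
        env, template = tex_env, TEX_TEMPLATE
    else:
        env, template = html_env, HTML_TEMPLATE
    # Values are inserted as-is: nothing here escapes TeX or HTML specials.
    return env.from_string(template).render(**cv)


print(render_cv(cv, "html"))
print(render_cv(cv, "tex"))
```

Both outputs come from the same `cv` dict, which is the point of the approach: only the templates differ per target.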

This worked for simple content, but I wanted finer control over the output. For example, say I wanted to tell the world that I've used the NoSQL database "MongoDB". This sort of string confuses TeX's hyphenation logic, so it has no problem breaking a line like this:

I'm super duper amazing at using Mon-
goDB, trust me mate, and also at loads
of other NoSQL things as well yeeaah.

which looks horrible. If I were only targeting TeX, I could fix this by preventing hyphenation of the word: \hbox{MongoDB} (let's not get into what this is doing, this is not the place). But how should I tell the HTML renderer what to do with this? I can't have bare TeX commands being passed through to the HTML version.

Because of issues like these, I ended up maintaining two YAML files, one for the HTML and one for the TeX. This let me include all of the markup I wanted, and it let me toggle on or off different sections more easily than in the TeX, but I still had to keep the two versions in sync.

To crack this walnut-sized problem, I made a sledgehammer in the form of a little XML-based markup language, which I call 'CVML'. One might wonder, given that I was already writing my CV in a markup language, YAML, why I wouldn't just use its structure to specify what I want. I got myself confused wondering why this wouldn't work, until I realized that YAML isn't really a markup language; one can't easily use it to mark up continuous text. Indeed, I discovered that YAML doesn't stand for 'Yet Another Markup Language', as I had presumed, but 'YAML Ain't Markup Language'.

Anyway, so I stuck with YAML for specifying the elements in my CV, but allowed CVML in the strings, to mark up the text with presentational details. The simplest CVML tag, [[mu]] (for MarkUp), looks like this:

I'm super duper amazing at using
[[mu]][[tex]]\hbox{MongoDB}[[/tex]][[html]]MongoDB[[/html]][[/mu]],
trust me mate, and also at loads of
other NoSQL things as well yeeaah.

It sure has the verbosity of XML, but I don't know of a better alternative. A [[mu]] tag has children that specify the content to show for each language to be rendered. At render-time, the tag expands to the appropriate section of text. One snag was that, because this section might be a snippet of HTML, and CVML is XML-based, I faced the terror of having to escape all of the angle brackets in the HTML content, so that it wouldn't be mistaken for part of the CVML. I solved this by using characters other than angle brackets for CVML, as you can see, and substituting the characters for angle brackets just before parsing the tag. Between TeX and HTML, most of the good delimiters are taken, so I went with a multi-character delimiter.
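
Roughly, expanding a [[mu]] tag could work like the sketch below. The function name `expand_mu` and the `cvml` wrapper element are hypothetical. Escaping any real angle brackets before swapping in the delimiters is an assumption about how the HTML snippets survive parsing, and the parser is the one from Python's standard library rather than whatever the real pipeline uses.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def expand_mu(text, target):
    """Expand each [[mu]] tag in `text` to its child for `target` ('tex' or 'html')."""
    # Escape real angle brackets and ampersands first, so that HTML snippets
    # are treated as plain text, then turn the CVML delimiters into XML ones.
    xml = escape(text).replace("[[", "<").replace("]]", ">")
    # The string is a fragment, so wrap it in a single root element.
    root = ET.fromstring(f"<cvml>{xml}</cvml>")

    parts = [root.text or ""]
    for child in root:
        if child.tag == "mu":
            chosen = child.find(target)
            if chosen is not None:
                # The parser has already undone the escaping at this point.
                parts.append(chosen.text or "")
        # Text that follows the tag, up to the next tag or the end.
        parts.append(child.tail or "")
    return "".join(parts)


line = (
    r"NoS[[mu]][[tex]]\smaller{QL}[[/tex]]"
    r"[[html]]<small>QL</small>[[/html]][[/mu]] things"
)
print(expand_mu(line, "tex"))   # NoS\smaller{QL} things
print(expand_mu(line, "html"))  # NoS<small>QL</small> things
```

This sketch only handles [[mu]] tags at the top level of a string, and it would trip over content that itself contains doubled square brackets.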

Anyway, so I realized that with this tag I could go crazy with markup. I have the unfortunate combination:

  • I want to solve problems that require tools whose names tend to involve acronyms and initialisms
  • I am distracted by strings of uppercase characters in continuous text

The first means that I feel compelled to boast that I've used various tools whose names are constituted of an unusually high share of characters from the upper case. When combined with the second, I'm at risk of hating the sight of my own life on paper. I avoid this through excesses of SMALL CAPS styling, which maintains semantic and legal correctness, while looking less shouty. Anyway, I started modifying my CV data like so:

I'm super duper amazing at using
[[mu]][[tex]]\hbox{MongoDB}[[/tex]][[html]]MongoDB[[/html]][[/mu]],
trust me mate, and also at loads of other
NoS[[mu]][[tex]]\smaller{QL}[[/tex]][[html]]<small>QL</small>[[/html]][[/mu]]
things as well yeeaah.

(\smaller is a user-defined TeX macro, and <small> is an HTML tag. Don't worry, it sounds presentational but somehow it counts as semantic, so it's allowed in HTML5.)

I realized this was becoming ugly, so I added a [[small]]<text>[[/small]] CVML tag, which generates the correct markup internally, depending on the language:

I'm super duper amazing at using
[[mu]][[tex]]\hbox{MongoDB}[[/tex]][[html]]MongoDB[[/html]][[/mu]],
trust me mate, and also at loads of other
NoS[[small]]QL[[/small]] things as well yeeaah.

I've now added the same sort of tags to emphasise text ([[cite]]<text>[[/cite]]) and generate links ([[link]][[url]]<url>[[/url]][[name]]<name>[[/name]][[/link]]).
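
One way such shorthand tags could be expanded is with a table of per-language templates, as in the sketch below. The names `TEMPLATES` and `render_shorthand` are hypothetical. The [[small]] output matches the markup shown earlier, but the output chosen for [[cite]] and [[link]] (`\emph`, `<cite>`, hyperref's `\href`, and a plain `<a>` element) is a guess for illustration, not necessarily what the real renderer emits.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Output template per tag and per target language.
# Doubled braces are literal braces in str.format templates.
TEMPLATES = {
    "small": {"tex": r"\smaller{{{text}}}", "html": "<small>{text}</small>"},
    # Assumed output for cite and link:
    "cite": {"tex": r"\emph{{{text}}}", "html": "<cite>{text}</cite>"},
    "link": {"tex": r"\href{{{url}}}{{{name}}}", "html": '<a href="{url}">{name}</a>'},
}


def render_shorthand(text, target):
    """Expand [[small]], [[cite]] and [[link]] tags for `target` ('tex' or 'html')."""
    # Same delimiter trick as for any CVML tag: escape, then swap in angle brackets.
    xml = escape(text).replace("[[", "<").replace("]]", ">")
    root = ET.fromstring(f"<cvml>{xml}</cvml>")

    parts = [root.text or ""]
    for child in root:
        template = TEMPLATES[child.tag][target]  # KeyError on unknown tags
        if child.tag == "link":
            fields = {
                "url": child.findtext("url", default=""),
                "name": child.findtext("name", default=""),
            }
        else:
            fields = {"text": child.text or ""}
        parts.append(template.format(**fields))
        parts.append(child.tail or "")
    return "".join(parts)


print(render_shorthand("NoS[[small]]QL[[/small]] things", "tex"))
print(render_shorthand(
    "[[link]][[url]]https://example.com[[/url]][[name]]Example[[/name]][[/link]]",
    "html",
))
```

Keeping the templates in one table means a new shorthand tag is one more entry rather than another branch in the renderer. The sketch does not escape TeX or HTML special characters inside the tag contents.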

The result of this work is that I now only have one file to maintain, instead of two. Just one file, a markup language, and a multi-step document preparation pipeline...