An efficient way to convert Markdown to a Word-readable format

In my previous post I pointed to some of the advantages Markdown has for writing. However, sooner or later you end up in a situation where you need your text in the .doc / .docx / .odt format. Obviously, you google how to convert it and you end up with good old Pandoc. This program works great for converting almost any type of document. But recently I discovered that there is more to it than just:

pandoc -o output_file.odt input_file.md

Here are two ways to spice it up. First, use the --smart argument, which ensures that straight quotes are converted to curly quotes, two dashes to an en-dash, etc. In other words, Pandoc will attempt to produce typographically correct output. Similarly, the --normalize argument removes repeated spaces and makes other corrections.

Second, you can use the --reference-odt or --reference-docx arguments, depending on which format you want to convert to. This points Pandoc to a reference file whose styles should be applied to the converted text. It allows you to pre-define the formatting of headings, paragraphs and other parts of the text so that you don’t have to do this manually after every conversion. All in all, you would use:

pandoc --smart --normalize --reference-odt=file.odt -o output_file.odt input_file.md
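
A note for newer installations: as far as I can tell, Pandoc 2.0 and later dropped --smart and --normalize (smart punctuation became an extension that is on by default for Pandoc's Markdown, and normalization happens automatically) and merged --reference-odt and --reference-docx into a single --reference-doc option. Below is a rough sketch of the same conversion under that assumption. The md2odt helper and the DRY_RUN variable are made up for illustration and are not part of Pandoc.

```shell
# Hypothetical helper, not part of Pandoc: converts Markdown to ODT
# using the option names from Pandoc 2.0 and later.
md2odt() {
    input=$1      # Markdown source, e.g. input_file.md
    reference=$2  # ODT file whose styles should be reused
    output=$3     # resulting ODT file

    # Replace the positional parameters with the full pandoc command line.
    # "markdown+smart" spells out the smart-punctuation extension explicitly.
    set -- pandoc -f markdown+smart --reference-doc="$reference" -o "$output" "$input"

    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"   # dry run: only print the command
    else
        "$@"                 # real run: requires pandoc to be installed
    fi
}

# Show the command without running it.
DRY_RUN=1 md2odt input_file.md file.odt output_file.odt
```

Without DRY_RUN set, the helper runs the conversion itself, so it is worth checking the printed command first.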

With this, using plain text for writing gets a lot easier. You just need to define the styles once and Pandoc will take care of the rest for you. No wonder Pandoc is presented on its homepage as a Swiss Army knife for working with documents. You can even tell it to generate a table of contents based on the headings used in the text. The options are vast; just look into the manual (man pandoc).
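
For instance, the table of contents is switched on with the --toc option. The loop below is only a sketch of how a whole set of chapters could be converted that way. The convert_all name and the chapter file names are invented for the example, and the echo in front of pandoc keeps it a dry run that prints the commands instead of executing them.

```shell
# Hypothetical helper: print one pandoc command per Markdown file.
convert_all() {
    for md in "$@"; do
        out="${md%.md}.odt"   # chapter1.md -> chapter1.odt
        # Remove "echo" to actually run the conversion (requires pandoc).
        echo pandoc --toc -o "$out" "$md"
    done
}

convert_all chapter1.md chapter2.md
```

Each output file simply takes the name of its source file with .md swapped for .odt.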

There are a few more things I would like to note:

  • The Pandoc manual says that for best results, the style reference files should be generated by Pandoc in the first place.

  • Pandoc recognizes MultiMarkdown footnotes (for syntax, search for footnotes here) and converts them correctly, yay!

  • If you need to quickly convert Markdown to HTML or PDF, you can always use Dillinger, an online conversion tool.

  • There is also one other way to make the conversion without Pandoc, though I suspect it has fewer options than Pandoc offers.
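
On the first note above: one way to get a Pandoc-generated reference file is to ask Pandoc for its built-in default and then restyle that in your word processor. Here is a small sketch, assuming the --print-default-data-file option and the bundled reference.odt / reference.docx data files are available in your Pandoc version. The make_reference name and the output file name are made up.

```shell
# Hypothetical helper: write Pandoc's default reference file to disk.
make_reference() {
    format=$1   # "odt" or "docx"
    target=$2   # where to save it, e.g. my-styles.odt

    if ! command -v pandoc >/dev/null 2>&1; then
        echo "pandoc is not installed" >&2
        return 127
    fi

    # Dump the bundled default, which can then be restyled by hand.
    pandoc --print-default-data-file "reference.$format" > "$target"
}
```

Calling make_reference odt my-styles.odt should leave a my-styles.odt that you can open, restyle, and then hand to Pandoc as the reference file.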

Baby steps

It has been quite some time since I entered the open-source world. Years, long years actually. I first installed Linux when I was in high school (wow, Dapper Drake is old!) and since then I have been slowly picking up the tools and the mindset to do things the open-source way.

First things first, I had to take Command Line 101, since I constantly needed to do things like reset my network card, update grub to catch up with my latest distro-hopping endeavor or set up the cool-looking conky I had just found on the web. Some copy-pasting of commands I found on the forums usually did the trick. It was a hassle but it was well worth it. Nowadays, I work with command-line programs like git, grep, pandoc or wget on a daily basis and they are great. Yum!

Second, tinkering with various settings got me to recognize the power of plain text. I have all my writing settled around Markdown now (Markdown FTW!), but it was a looong way, baby! Naturally, the starting point was using some kind of Word alternative, be it OpenOffice, LibreOffice or AbiWord. My first attempt at using plain text was with LaTeX. It felt really good to see my texts ending up looking professional, but after some time it got tiring to constantly deal with the incompatibility with what everyone around me was using, or with the process of setting it up (with all the specifics for my native language). And on top of all that, I’m really not a fan of the enforced hyphenation, which ruins the reliability of full-text search in my texts.

I turned to HTML, a language that seemed able to meet any of my needs. It didn’t take long to realize that it wasn’t such an obviously good choice. Although it was nice that I could load the HTML document directly into MS Word or LO Writer whenever I needed to, it was really overkill, since I had to annotate my texts with a lot more (and longer) tags. Certainly, there are more lightweight solutions which don’t distract from writing that much.

So guess what followed when I learned on Gradhacker that Markdown, in all its simplicity (actually, it is MultiMarkdown, but still), can handle footnotes. I checked that Pandoc could process them, and from there it was green lights all the way. It’s really universal. The GitHub repo tracking my dissertation freaked out a little at the commits converting my files to Markdown, but that’s it. Now I am living my happily ever after, and it seems I will stick with this one at least for a while, but I know that my learning isn’t over.

I started writing user documentation for the Pitivi video editor several months ago and it just never stops. I have learned to manage git remotes, branching, interactive rebase, cherry-picking, etc. The git thingies took some time but the Mallard markup was easy to pick up. And I get to practice my writing skills, especially for clarity. Recently, I also spent some time figuring out regular expressions, with some nice results for my research. Can’t wait to see what comes next.