maphew

Going Acrylic

Recipe for converting an httrack snapshot of a Mindtouch wiki to Markdown for Acrylamid.

Generate the processing list

dir /s/b \www\maphew.com\*.html > process-list.txt
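For anyone repeating this without a Windows prompt, here is a minimal Python sketch of the same listing step. The function name `build_process_list` is made up for illustration. Unlike `dir`, the `*.html` match here is case-sensitive on case-sensitive file systems.

```python
from pathlib import Path


def build_process_list(root, list_file):
    """Write every .html file under root to list_file, one path per line."""
    root = Path(root)
    # rglob descends into subfolders, like the /s switch of dir;
    # sorting only makes the output stable between runs.
    html_files = sorted(p for p in root.rglob("*.html") if p.is_file())
    Path(list_file).write_text(
        "".join(f"{p}\n" for p in html_files), encoding="utf-8"
    )
    return html_files
```

Calling `build_process_list(r"\www\maphew.com", "process-list.txt")` should produce a list comparable to the `dir` output above, assuming the snapshot lives in that folder.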

Edit process-list.txt to remove junk and fix bad filenames (the result of double quotes in page names).

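Copy to Excel and:

  • convert text to table using \ as the delimiter,
  • apply conditional formatting to highlight .html,
  • sort by the html column,
  • remove duplicates,
  • save as tab-delimited text,
  • search and replace tabs with \, removing dupes.

The result is the process-list-cleaned.txt file read by the commands in the next section.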
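The sort and de-duplicate part of the Excel round trip could probably be scripted instead. This is a rough sketch only, assuming "duplicates" means the same full path listed more than once. The name `clean_process_list` is hypothetical, and removing junk and fixing bad filenames still has to be done by hand.

```python
from pathlib import PureWindowsPath


def clean_process_list(lines):
    """Drop blank lines and repeated paths, then sort by file name."""
    seen = set()
    unique = []
    for line in lines:
        path = line.strip()
        if not path:
            continue
        key = path.lower()  # Windows file names are case-insensitive
        if key in seen:
            continue
        seen.add(key)
        unique.append(path)
    # Sort by file name (the "html column"), then by full path to break ties.
    return sorted(
        unique, key=lambda p: (PureWindowsPath(p).name.lower(), p.lower())
    )
```

Feed it the lines of process-list.txt and write the returned list back out, one path per line.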

Scripted Html to Markdown to Acrylamid

for /f "delims=" %a in (process-list-cleaned.txt) do @mkdir ".%~pa"

for /f "delims=" %a in (process-list-cleaned.txt) do ^
    @pandoc --to markdown --standalone --template acrylamid-pandoc-template.txt "%a" -o ".%~pa%~na.md"
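The two loops above can also be sketched in Python, which makes the path mapping easier to see: the drive or root is dropped (like `%~pa`), the extension is swapped for .md (like `%~na.md`), and the result lands under the current folder. The names `pandoc_command` and `convert_all` are made up for this sketch. It assumes the list file is saved as UTF-8 and that `convert_all` runs on Windows with pandoc on the PATH.

```python
import subprocess
from pathlib import Path, PureWindowsPath

TEMPLATE = "acrylamid-pandoc-template.txt"  # template name from the command above


def pandoc_command(html_path):
    """Build the pandoc argument list for one line of the process list."""
    src = PureWindowsPath(html_path)
    # Drop the drive/root so the output is relative to the current folder.
    parts = src.parts[1:] if src.anchor else src.parts
    target = PureWindowsPath(*parts).with_suffix(".md")
    return ["pandoc", "--to", "markdown", "--standalone",
            "--template", TEMPLATE, str(src), "-o", str(target)]


def convert_all(list_file="process-list-cleaned.txt"):
    """Create each output folder, then run pandoc for every listed file."""
    for line in Path(list_file).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        cmd = pandoc_command(line.strip())
        Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(cmd, check=True)  # stop at the first pandoc failure
```

Passing the arguments as a list sidesteps the quoting problems that spaces in file names cause at the command prompt.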

acrylamid init converted
rd /s/q converted\content
move www\maphew.com converted
pushd converted
rename maphew.com content
rd /s/q ..\www

copy \www\acr\confy.py .\conf.py
xcopy \www\acr\theme\* theme\*

acrylamid compile --search
:: fix title collisions as needed

Clean up header & footer crud

Remove Mindtouch leftovers such as "javascript must be enabled" and "Powered by..." with search and replace across all open files.

Vim regexes:

:bufdo %s/^This application requires Javascript to be enabled\.\_.* Table of contents$//e
:bufdo %s/\*No headers\*//e
:bufdo %s/Powered by \[MindTouch Core\_.*maphew)//e
:bufdo %s/---\n\n\n*/---\r\r/e
:bufdo %s/^title:\(.*\) - maphew$/title:\1/e
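The same clean-up can be approximated outside Vim. Below is a sketch of the five substitutions as Python regexes, applied in the same order. It assumes the title rule is meant to anchor at the start of the line. `RULES` and `clean` are names invented here. Python replaces every match, whereas `:s` without the `g` flag replaces only the first match on a line, so results can differ on unusual pages.

```python
import re

# (pattern, replacement) pairs, in the order of the Vim commands above.
# re.S lets "." cross line breaks, like \_. in Vim; re.M makes ^ and $ per-line.
RULES = [
    (re.compile(r"^This application requires Javascript to be enabled\."
                r".* Table of contents$", re.M | re.S), ""),
    (re.compile(r"\*No headers\*"), ""),
    (re.compile(r"Powered by \[MindTouch Core.*maphew\)", re.S), ""),
    # Collapse the run of blank lines after a --- line to a single blank line.
    (re.compile(r"---\n\n\n*"), "---\n\n"),
    # Strip the " - maphew" site suffix from the title line.
    (re.compile(r"^title:(.*) - maphew$", re.M), r"title:\1"),
]


def clean(text):
    """Apply every clean-up rule to the text of one converted page."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

Run it over each .md file by reading the text, passing it through `clean`, and writing it back.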


Fix dates

This marks the end of the automated, repeatable process.

Read through Special_RecentChanges and manually edit each .md file to reflect when it was last touched. We don't have dates for content that was migrated into Mindtouch (prior to 17.10.2009).

Add tags

The folders are the primary tag, so search for *.md and sort by location, then drag and drop each group into a handy text editor and paste the appropriate tag(s) into each file.

I'm sure there's a way to automate this, but I decided it would be faster to brute force my way through (esp. since we've already broken with repeatable process anyway).
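For the record, one way the tagging might be automated. This is an untested-in-anger sketch, not what was actually done. It assumes every file starts with a front-matter block delimited by `---` lines, that tags are written as `tags: [a, b]`, and that lowercased folder names with spaces turned into hyphens make acceptable tags. `tags_for` and `add_tags` are hypothetical names.

```python
from pathlib import PurePosixPath


def tags_for(md_path, content_root="content"):
    """Return the folder names between content_root and the file as tags."""
    rel = PurePosixPath(md_path).relative_to(content_root)
    return [part.lower().replace(" ", "-") for part in rel.parts[:-1]]


def add_tags(text, tags):
    """Insert a tags: line at the end of the front matter if none exists."""
    lines = text.split("\n")
    if not lines or lines[0] != "---":
        return text  # no front matter: leave the file untouched
    try:
        end = lines.index("---", 1)  # closing line of the front matter
    except ValueError:
        return text
    if any(line.startswith("tags:") for line in lines[1:end]):
        return text  # already tagged
    lines.insert(end, "tags: [" + ", ".join(tags) + "]")
    return "\n".join(lines)
```

Files that already have a `tags:` line are left alone, so hand-edited tags would survive a rerun.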


date: 2013-03-16
tags: [web-dev, technical]
category: web-dev