Recipe for converting an httrack snapshot of a Mindtouch wiki to Markdown for Acrylamid
Generate the processing list
dir /s/b \www\maphew.com\*.html > process-list.txt
Edit process-list.txt to remove junk and fix bad filenames (caused by double quotes in the name).
Copy to Excel and:
- convert text to table using
- apply conditional formatting highlighting
- sort by html column
- remove duplicates
- save tab delimited
- search and replace tabs with \, removing dupes
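The spreadsheet round trip above could probably be replaced with a few lines of Python. This is only a sketch under my own assumptions, not what was actually run: it treats "dupes" as both repeated backslashes and repeated paths (compared case-insensitively, as Windows does), and it sorts by the .html file name. The function name clean_process_list is made up for illustration.

```python
import ntpath
import re


def clean_process_list(lines):
    """Return de-duplicated Windows paths, sorted by file name.

    Assumptions (not from the original recipe): duplicates are judged on
    the full path, ignoring case, and runs of backslashes are collapsed
    to a single one. UNC paths (\\\\server\\share) would be mangled by that.
    """
    seen = set()
    cleaned = []
    for raw in lines:
        path = raw.strip()
        if not path:
            continue  # skip blank lines
        # Collapse doubled backslashes and drop any trailing one.
        path = re.sub(r"\\{2,}", r"\\", path).rstrip("\\")
        key = path.lower()
        if key in seen:
            continue  # same path already kept
        seen.add(key)
        cleaned.append(path)
    # Mirror the "sort by html column" step: order by file name only.
    cleaned.sort(key=lambda p: ntpath.basename(p).lower())
    return cleaned
```

Feeding it the lines of the process list and writing the result to process-list-cleaned.txt would stand in for the Excel steps; the junk and bad filenames still need a manual pass first.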
Scripted Html to Markdown to Acrylamid
for /f %a in (process-list-cleaned.txt) do @mkdir .%~pa
for /f %a in (process-list-cleaned.txt) do ^
@pandoc --to markdown --standalone --template acrylamid-pandoc-template.txt "%a" -o ".\%~pa\%~na.md"
acrylamid init converted
rd /s/q converted\content
move www\maphew.com converted
pushd converted
rename maphew.com content
rd /s/q ..\www
copy \www\acr\confy.py .\conf.py
xcopy \www\acr\theme\* theme\*
acrylamid compile --search
:: fix title collisions as needed
Clean up header & footer crud
This marks the end of the automated repeatable process.
Read through Special_RecentChanges and manually edit each .md to reflect when it was last touched. We don't have dates for content that was migrated to Mindtouch (prior to 17.10.2009).
The folders are the primary tag, so search *.md and sort by location, then drag'n'drop each group into a handy text editor and paste the appropriate tag(s) in each.
I'm sure there's a way to automate this, but I decided it would be faster to brute force my way through (esp. since we've already broken with repeatable process anyway).
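For the record, one way the folder-to-tag step might be automated is sketched below. It was not part of the original process and rests on several assumptions: each .md begins with a header block delimited by --- lines (which depends on what acrylamid-pandoc-template.txt emits), tags are written as a line like tags: [name], and the primary tag is the first folder under the content directory. The function name add_folder_tag is hypothetical.

```python
from pathlib import Path


def add_folder_tag(md_path, content_root):
    """Insert 'tags: [<folder>]' into the header of one .md file.

    Returns True if the file was changed, False if it was left alone.
    Assumes a header block delimited by '---' lines at the top of the file.
    """
    md_path = Path(md_path)
    rel = md_path.relative_to(Path(content_root))
    if len(rel.parts) < 2:
        return False  # file sits directly in the root: no folder to use
    tag = rel.parts[0]  # assumption: top-level folder is the primary tag

    lines = md_path.read_text(encoding="utf-8").splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return False  # no header block where we expected one

    # Find the line that closes the header block.
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            end = i
            break
    if end is None:
        return False  # header never closes; do not guess

    # Leave files that already carry a tags line untouched.
    if any(line.lower().startswith("tags:") for line in lines[1:end]):
        return False

    lines.insert(end, f"tags: [{tag}]\n")
    md_path.write_text("".join(lines), encoding="utf-8")
    return True
```

Looping it over Path("content").rglob("*.md") would cover the whole tree. It skips anything that doesn't match the expected header rather than guessing, so a manual pass is still needed for the leftovers and for any secondary tags.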