Going Acrylic
The recipe I used to convert an HTTrack snapshot of a MindTouch wiki to Markdown for Acrylamid.
Generate the processing list
dir /s/b \www\maphew.com\*.html > process-list.txt
Edit process-list.txt: remove junk and fix bad filenames (the result of double quotes in page names).
Copy to Excel and:
- convert text to table using \ as the delimiter
- apply conditional formatting highlighting .html
- sort by html column
- remove duplicates
- save tab delimited
- search and replace tabs with \, removing dupes
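The mechanical part of that Excel pass could probably be scripted. Here is a minimal Python sketch, not what I actually ran. It assumes that "junk" means any line not ending in .html and that "duplicates" means the same full path repeated (compared case-insensitively, as Windows does). The function name `clean_process_list` is made up for illustration, and fixing the filenames mangled by double quotes is still left to hand editing.

```python
def clean_process_list(lines):
    """Keep .html paths only, drop repeats, and return them sorted.

    Assumption: duplicates are identical full paths, ignoring case.
    """
    seen = set()
    kept = []
    for line in lines:
        path = line.strip()
        # Skip blank lines and anything that is not an .html page.
        if not path.lower().endswith(".html"):
            continue
        key = path.lower()
        if key in seen:
            continue
        seen.add(key)
        kept.append(path)
    return sorted(kept, key=str.lower)
```

Reading process-list.txt, passing its lines through this function, and writing the result out one path per line would give a cleaned list file.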
Scripted HTML to Markdown to Acrylamid
for /f %a in (process-list-cleaned.txt) do @mkdir .%~pa
for /f %a in (process-list-cleaned.txt) do ^
@pandoc --to markdown --standalone --template acrylamid-pandoc-template.txt "%a" -o ".\%~pa\%~na.md"
acrylamid init converted
rd /s/q converted\content
move www\maphew.com converted
pushd converted
rename maphew.com content
rd /s/q ..\www
copy \www\acr\confy.py .\conf.py
xcopy \www\acr\theme\* theme\*
acrylamid compile --search
:: fix title collisions as needed
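The %~pa and %~na modifiers in the two for /f loops above are easy to misread, so here is a small Python sketch of what they expand to. %~pa is the path of the file without the drive letter, and %~na is the file name without its extension, so each page is written to the same folder structure under the current directory with an .md extension. The helper names `output_path` and `pandoc_command` are hypothetical; the pandoc options are just the ones from the loop above.

```python
from pathlib import PureWindowsPath

TEMPLATE = "acrylamid-pandoc-template.txt"


def output_path(html_path):
    # Drop the drive and root (C:\), keep the folders, swap .html for .md.
    # This mirrors the ".\%~pa\%~na.md" target used in the batch loop.
    src = PureWindowsPath(html_path)
    relative = src.relative_to(src.anchor)
    return relative.with_suffix(".md")


def pandoc_command(html_path):
    # Same arguments as the batch loop, as a list suitable for subprocess.run.
    return [
        "pandoc", "--to", "markdown", "--standalone",
        "--template", TEMPLATE,
        html_path, "-o", str(output_path(html_path)),
    ]
```

For example, C:\www\maphew.com\gis\index.html maps to www\maphew.com\gis\index.md, relative to wherever the loop is run. The target folder has to exist first, which is what the mkdir loop takes care of.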
Clean up header & footer crud
Remove MindTouch leftovers such as "javascript must be enabled" and "Powered by..." by using search and replace across all open files.
Vim regexes:
:bufdo %s/^This application requires Javascript to be enabled\.\_.* Table of contents$//e
:bufdo %s/\*No headers\*//e
:bufdo %s/Powered by \[MindTouch Core\_.*maphew)//e
:bufdo %s/---\n\n\n*/---\r\r/e
:bufdo %s/^title:\(.*\) - maphew$/title:\1/e
Sources:
- http://vim.wikia.com/wiki/Search_across_multiple_lines
- http://vim.wikia.com/wiki/Search_and_replace_in_multiple_buffers
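For anyone who would rather not open every file in Vim, the same substitutions can be approximated with Python's re module. This is a sketch, not what I ran: `clean_mindtouch` is a made-up name, the patterns are my translation of the Vim ones (\_.* becomes .* with re.DOTALL), and it assumes the title line starts with "title:". One difference to be aware of: re.sub replaces every match, whereas :s without the g flag only replaces the first match on each line.

```python
import re


def clean_mindtouch(text):
    # Header crud: from the Javascript warning through "Table of contents".
    text = re.sub(
        r"^This application requires Javascript to be enabled\..* Table of contents$",
        "", text, flags=re.MULTILINE | re.DOTALL)
    # Placeholder left behind by pages that had no headings.
    text = re.sub(r"\*No headers\*", "", text)
    # Footer crud: from "Powered by [MindTouch Core" through "maphew)".
    text = re.sub(r"Powered by \[MindTouch Core.*maphew\)", "", text,
                  flags=re.DOTALL)
    # Collapse runs of blank lines after a "---" line down to a single one.
    text = re.sub(r"---\n\n\n*", "---\n\n", text)
    # Strip the " - maphew" suffix from the title line.
    text = re.sub(r"^title:(.*) - maphew$", r"title:\1", text,
                  flags=re.MULTILINE)
    return text
```

Like the Vim versions, the two multi-line patterns are greedy, so they run to the last "Table of contents" or "maphew)" in the file. Check a few converted pages by eye before trusting it on everything.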
Fix dates
This marks the end of the automated repeatable process.
Read through Special_RecentChanges and manually edit each .md to reflect the last time it was touched. We don't have dates for content which was migrated to MindTouch (prior to 17.10.2009).
Add tags
The folders are the primary tag, so search for *.md and sort by location, then drag'n'drop each group into a handy text editor and paste the appropriate tag(s) in each.
I'm sure there's a way to automate this, but I decided it would be faster to brute force my way through (esp. since we've already broken with the repeatable process anyway).
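For the record, here is roughly what the automated version might look like in Python. It is an untested-in-anger sketch under a few assumptions: each .md starts with a header delimited by "---" lines, the first folder under content is the tag, and a file that already has a tags: line is left alone. `folder_tag` and `add_tag` are names I made up for this example.

```python
from pathlib import PurePath


def folder_tag(md_path, content_root="content"):
    # The first folder below the content root is taken as the primary tag.
    relative = PurePath(md_path).relative_to(content_root)
    if len(relative.parts) < 2:
        return None  # file sits directly in content/, so no folder tag
    return relative.parts[0].lower()


def add_tag(text, tag):
    lines = text.split("\n")
    # Only touch files that open with a "---" delimited header.
    if not lines or lines[0] != "---":
        return text
    try:
        end = lines.index("---", 1)
    except ValueError:
        return text  # no closing delimiter, leave the file alone
    if any(line.startswith("tags:") for line in lines[1:end]):
        return text  # already tagged
    lines.insert(end, f"tags: [{tag}]")
    return "\n".join(lines)
```

Looping over the .md files under content, calling `folder_tag` on each path and writing back the result of `add_tag`, would cover the primary tag. Any secondary tags would still need adding by hand.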
date: 2013-03-16
tags: [web-dev, technical]
category: web-dev