Blog Content Migration
As you might have noticed, a lot more posts suddenly appeared in the archives of my blog. I finally spent a few evenings migrating the content of my old MovableType blog over to WordPress. Of course not by hand ;)
WordPress has an importer that supports MT …but as soon as you have images and code inside your posts, that’s not enough. For example, I wanted my pictures served by Flickr, as this makes future blog migrations much easier. Preserving the keywords as tags was also a must. So I took on the challenge of playing with a few new libraries to port everything across. Here is how…
The first idea was to improve my Ruby knowledge a bit while implementing this useful converter. Unfortunately, the nicer one of the Ruby Flickr APIs did not support uploading pictures yet …and I was actually too lazy to look into that just to get this job done. Which brought me back to doing it in Java. The only available Java Flickr API, flickrj, is not really nice, but it supports uploads and does the job.
So the first task was to parse the MT export file. Oh, boy! How can you come up with such a poor format! Anyway, I wrote a quick-and-dirty parser (really ugly!) for it that creates an object representation of the entries. It probably would have been easier to extract this information from the database …but anyway.
So the next thing was to get the Flickr upload working. First you have to get all the authentication right.
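To give an idea of what such a parser has to deal with, here is a minimal sketch. It is not the parser from this post. It assumes the usual layout of an MT export: `KEY: value` metadata lines at the top of each entry, multi-line sections such as `BODY:` or `KEYWORDS:` closed by a line of five hyphens, and entries closed by a line of eight hyphens. The class name `MtExportParser` and the map-per-entry representation are made up for illustration, and repeated sections such as comments simply overwrite each other here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of an MT export parser, not the code used for the migration.
final class MtExportParser {

    private static final String SECTION_END = "-----";   // closes a section
    private static final String ENTRY_END = "--------";  // closes an entry

    /** Convenience overload: parses export text held in a string. */
    static List<Map<String, String>> parse(String export) {
        try (BufferedReader in = new BufferedReader(new StringReader(export))) {
            return parse(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Parses the export into one field map per entry, in file order. */
    static List<Map<String, String>> parse(BufferedReader in) throws IOException {
        List<Map<String, String>> entries = new ArrayList<>();
        Map<String, String> entry = new LinkedHashMap<>();
        String section = null;       // multi-line section being collected, if any
        boolean expectName = false;  // true right after a "-----" line
        StringBuilder text = new StringBuilder();

        String line;
        while ((line = in.readLine()) != null) {
            if (line.equals(ENTRY_END)) {
                if (!entry.isEmpty()) {
                    entries.add(entry);
                }
                entry = new LinkedHashMap<>();
                section = null;
                expectName = false;
                text.setLength(0);
            } else if (line.equals(SECTION_END)) {
                if (section != null) {
                    entry.put(section, text.toString().trim());
                }
                section = null;
                expectName = true;
                text.setLength(0);
            } else if (expectName) {
                String name = line.trim();
                if (name.isEmpty()) {
                    continue; // skip blank lines between sections
                }
                // The first line after a separator names the section, e.g. "BODY:"
                section = name.endsWith(":") ? name.substring(0, name.length() - 1) : name;
                expectName = false;
            } else if (section != null) {
                text.append(line).append('\n');
            } else {
                // Metadata line: split at the first colon so dates keep their time part
                int colon = line.indexOf(':');
                if (colon > 0) {
                    entry.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
                }
            }
        }
        if (!entry.isEmpty()) {
            entries.add(entry); // export without a trailing entry separator
        }
        return entries;
    }
}
```

Calling `MtExportParser.parse(exportText)` returns one map per post, so something like `entries.get(0).get("KEYWORDS")` yields the raw keyword string that later becomes the tags. A body line that happens to consist of exactly five hyphens would confuse this sketch, which is part of why the format is so unpleasant to work with.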
// Credentials issued by Flickr for this application
String apiKey = "...";
String sharedSecret = "...";
// Talk to Flickr through the REST transport
Transport transport = new REST();
transport.setHost("www.flickr.com");
Flickr f = new Flickr(apiKey, transport);
Flickr.debugStream = false;
RequestContext requestContext = RequestContext.getRequestContext();
requestContext.setSharedSecret(sharedSecret);
// Request a frob and let the user grant write access in the browser
AuthInterface authInterface = f.getAuthInterface();
String frob = authInterface.getFrob();
URL url = authInterface.buildAuthenticationUrl(Permission.WRITE, frob);
System.out.println("Press return after you granted access at this URL:");
System.out.println(url.toExternalForm());
BrowserLauncher.openURL(url.toExternalForm());
// Wait until the user confirms by pressing return
BufferedReader infile = new BufferedReader(new InputStreamReader(System.in));
infile.readLine();
// Exchange the frob for an auth token and attach it to the request context
Auth auth = authInterface.getToken(frob);
requestContext.setAuth(auth);
Then you can upload the image. The tags are the ones I extracted from the post’s keywords.
// Reuse the post's keywords as Flickr tags and mark the photo as imported
Set tags = new HashSet(entry.getKeywords());
tags.add("imported");
// Upload the image together with its metadata
Uploader u = new Uploader(f.getApiKey());
UploadMetaData meta = new UploadMetaData();
meta.setDescription(description);
meta.setPublicFlag(true);
meta.setTitle(title);
meta.setTags(tags);
String id = u.upload(new FileInputStream(photoFile), meta);
// Look up the uploaded photo by the id the upload returned
PhotosInterface photos = f.getPhotosInterface();
Photo photo = photos.getPhoto(id);
I then wanted to make sure the posts are all proper XHTML, so I passed them through TagSoup, which does an excellent job of parsing HTML and creating a proper DOM representation. (Unfortunately it’s GPL.)
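Once a post is available as a DOM, finding and rewriting image references is a matter of walking the tree. The following is a rough sketch of that step, not the original converter. To stay self-contained it uses the XML parser that ships with the JDK instead of TagSoup, so it assumes the input is already well-formed XHTML. The class name `ImageLinkRewriter` and the URLs in the usage note are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: collects and rewrites <img src="..."> values in a DOM.
final class ImageLinkRewriter {

    /** Parses well-formed XHTML with the JDK parser (a stand-in for TagSoup). */
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XHTML", e);
        }
    }

    /** Returns the src attribute of every img element, in document order. */
    static List<String> imageSources(Document doc) {
        List<String> sources = new ArrayList<>();
        NodeList images = doc.getElementsByTagName("img");
        for (int i = 0; i < images.getLength(); i++) {
            sources.add(((Element) images.item(i)).getAttribute("src"));
        }
        return sources;
    }

    /** Replaces each img src that has an entry in the map; other images stay untouched. */
    static void rewrite(Document doc, Map<String, String> oldToNew) {
        NodeList images = doc.getElementsByTagName("img");
        for (int i = 0; i < images.getLength(); i++) {
            Element img = (Element) images.item(i);
            String replacement = oldToNew.get(img.getAttribute("src"));
            if (replacement != null) {
                img.setAttribute("src", replacement);
            }
        }
    }
}
```

In a migration like this one, `imageSources` would give the list of files to download and upload, and `rewrite` would then be fed a map from each old link, say `/images/a.jpg`, to whatever URL the photo ended up with after the upload.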
So I was able to extract the image links from the DOM, download the images, upload them to Flickr and replace the links with the appropriate Flickr links. The converter also created an SQL script that can be run against the WordPress database to insert the entries, and an .htaccess file to redirect my old MovableType posts to their new home.
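The redirect part is the simplest piece to illustrate. The sketch below assumes the redirects are written with Apache's mod_alias `Redirect permanent` directive, one line per post; the original file may just as well have used mod_rewrite. The class name `RedirectWriter` and the example paths are hypothetical, and the sketch assumes that neither paths nor URLs contain whitespace.

```java
import java.util.Map;

// Hypothetical sketch: turns an old-path-to-new-URL mapping into .htaccess lines.
final class RedirectWriter {

    /** Builds one "Redirect permanent <old path> <new URL>" line per mapping entry. */
    static String htaccess(Map<String, String> oldPathToNewUrl) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> redirect : oldPathToNewUrl.entrySet()) {
            out.append("Redirect permanent ")
               .append(redirect.getKey())
               .append(' ')
               .append(redirect.getValue())
               .append('\n');
        }
        return out.toString();
    }
}
```

Passing in an insertion-ordered map such as a `LinkedHashMap` keeps the lines in the same order as the posts, which makes the generated file easier to check by eye.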
Phew …done!