Categories
Code ruby

Blog Move 1: Getting WordPress data to Ruby using XML

Step 1 in the “Moving my blog” process is “Extract the current site’s data into a manageable format”.

Frankly, that’s easy! WordPress can export the site’s content to a single XML file containing all the published Categories, Tags, Posts, Pages and Comments. To do this (WordPress v2.9.2), click Tools > Export and save the file. In previous versions of the software I believe it’s under the Manage menu.


I’m aware I could import the data directly from the WordPress database (to wherever it goes in the end) but let’s imagine we can’t. Anyway, database access would be tediously slow and inefficient to test against and implement.

A quick google for “import wordpress xml ruby” threw up nothing helpful so I turned to the Ruby XML libraries. John Nunemaker “feverishly posts everything he learns” at railstips.org and has two articles of use here:

The latter compares three different Ruby XML libraries (REXML, Hpricot and libxml-ruby) on their speed, ease of use and how nice their names are to say. I’ll save you the pleasure of reading the article (if you like) and copy and paste John’s summary:

“Libxml is blisteringly fast, [but] Hpricot has cooler name, REXML and Hpricot both feel easier to use out of the box”

And there you go. Hpricot it is!

Now to get the data into Ruby. After a quick glance at the railstips article and the RDocs I put together this code as a starting point:


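require 'hpricot'

# "wordpress.xml" is a placeholder for wherever you saved the export file
doc = Hpricot.XML(File.read("wordpress.xml"))

cats_hierarchy = {}
(doc/"wp:category").each do |category|
    cat_name = category.at("wp:category_nicename").innerHTML
    cat_parent = category.at("wp:category_parent").innerHTML

    if cats_hierarchy.include? cat_parent
        # parent already seen: record this category as its child
        cats_hierarchy[cat_parent] = cat_name
    else
        # top-level (or parent not seen yet): no child recorded so far
        cats_hierarchy[cat_name] = nil
    end
end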

cats = cats_hierarchy.to_a.flatten

That gives me two easy-to-use Ruby objects, each containing all of my category data: a hash which preserves the hierarchy of the structure, and all the names in a linear array.


?> cats = cats_hierarchy.to_a.flatten.uniq
=> ["route66", nil, "rails", "american-2008", "reciprocal-affection", "hope-for-the-future", "code", "blog", "review-blog", "rant", "brands", "projects", "yab_shop", "textpattern", "meaningful-labor", "giants", "accessibility", "root", "charity-project", "apple", "xhtml", "america-2006-route-66", "ruby", "learning", "america-2007", "uncategorized", "iphone", "america-2008"]

?> cats_hierarchy
=> {"route66"=>nil, "rails"=>nil, "american-2008"=>nil, "reciprocal-affection"=>nil, "hope-for-the-future"=>nil, "code"=>nil, "blog"=>"review-blog", "rant"=>nil, "brands"=>nil, "projects"=>nil, "yab_shop"=>nil, "textpattern"=>nil, "meaningful-labor"=>nil, "giants"=>nil, "accessibility"=>nil, "root"=>nil, "charity-project"=>nil, "apple"=>nil, "xhtml"=>nil, "america-2006-route-66"=>nil, "ruby"=>nil, "learning"=>nil, "america-2007"=>nil, "uncategorized"=>nil, "iphone"=>nil, "america-2008"=>nil}
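One thing to watch: the hash stores a single value per parent, so a category with several children would only remember the last one it saw. Here is a minimal sketch of one alternative, which is not what I ran above. It assumes the nicename/parent pairs have already been pulled out of the XML, and the method name `build_hierarchy` and the sample pairs (apart from blog and review-blog) are made up for illustration.

```ruby
# Build a parent => [children] hash from [nicename, parent] pairs.
# An empty parent string is taken to mean "top level" (an assumption
# about how the export represents categories without a parent).
def build_hierarchy(pairs)
  children = Hash.new { |hash, key| hash[key] = [] }

  pairs.each do |name, parent|
    children[name]                                  # every category gets a key, even with no children
    children[parent] << name unless parent.empty?   # attach to the parent if there is one
  end

  children.default_proc = nil                       # later lookups of unknown names return nil
  children
end

# Hypothetical sample data; "example-child" is invented for the sketch.
pairs = [
  ["blog", ""],
  ["review-blog", "blog"],
  ["example-child", "blog"],
  ["code", ""],
]

cats_hierarchy = build_hierarchy(pairs)
cats = cats_hierarchy.keys   # the linear list of names, no flatten/uniq needed

p cats_hierarchy
p cats
```

Because every category gets its own key, the linear list is just the keys, and the order in which parents and children appear in the export no longer matters.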

And so we have a starting point for getting this WordPress exported XML data into a Ruby application.

More soon.

Categories
Uncategorized

Enabling WML on Apache by htaccess or config

If you’re uploading a WML file to a web server to test its validity, etc., there are a few ways to test. The WMLBrowser for Mozilla-based browsers does the job for testing on a PC/Mac. If you want to view the WML file on the live server and see for sure how it treats the whole ‘card’ thing, the server has to send the right MIME types, which you can set up in one of two ways:

You can alter your Apache httpd config file (/etc/httpd/httpd.conf):
# MIME Types for WAP
AddType text/vnd.wap.wml .wml
AddType image/vnd.wap.wbmp .wbmp
AddType application/vnd.wap.wmlc .wmlc
AddType text/vnd.wap.wmlscript .wmls
AddType application/vnd.wap.wmlscriptc .wmlsc
AddType application/vnd.wap.xhtml+xml .xhtml

Alternatively, create or edit the .htaccess file in the directory to include this:
DirectoryIndex index.wml
AddType text/vnd.wap.wml .wml
AddType application/vnd.wap.wmlc .wmlc
AddType text/vnd.wap.wmlscript .wmls
AddType application/vnd.wap.wmlscriptc .wmlsc
AddType image/vnd.wap.wbmp .wbmp
AddType application/vnd.wap.xhtml+xml .xhtml

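Note that in both cases the entire list is *not* required. The text/vnd.wap.wml line is the one that matters for a plain WML file; the others cover compiled WML, WMLScript, WBMP images and XHTML, and DirectoryIndex only makes index.wml the default page for the directory. Try it out and keep what you need.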
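To check that the change has taken effect, you can ask the server which Content-Type it sends for a file. This is a rough sketch using Ruby's standard Net::HTTP library. The script name `check_wml.rb`, the method names and the example URL are all placeholders of mine, and the table is simply copied from the AddType lines above.

```ruby
require "net/http"
require "uri"

# Extension => MIME type, copied from the AddType lines above.
WAP_TYPES = {
  ".wml"   => "text/vnd.wap.wml",
  ".wbmp"  => "image/vnd.wap.wbmp",
  ".wmlc"  => "application/vnd.wap.wmlc",
  ".wmls"  => "text/vnd.wap.wmlscript",
  ".wmlsc" => "application/vnd.wap.wmlscriptc",
  ".xhtml" => "application/vnd.wap.xhtml+xml",
}

# MIME type we expect for a path, or nil if the extension is not in the list.
def expected_type(path)
  WAP_TYPES[File.extname(path).downcase]
end

# Strip parameters such as "; charset=UTF-8" from a Content-Type value.
def bare_type(header_value)
  header_value.to_s.split(";").first.to_s.strip.downcase
end

# Send a HEAD request and return the Content-Type the server reports.
def served_type(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.head(uri.request_uri)
  end
  bare_type(response["Content-Type"])
end

# Only touches the network when run with a URL argument, e.g.
#   ruby check_wml.rb http://example.com/index.wml
if __FILE__ == $0 && ARGV[0]
  url = ARGV[0]
  puts "expected: #{expected_type(URI.parse(url).path).inspect}"
  puts "served:   #{served_type(url).inspect}"
end
```

If the two lines differ, the server is most likely not picking up the config or .htaccess change.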
Bits and pieces sourced from here