Posted by Chris Roos Wed, 03 Oct 2007 23:52:00
I haven’t done any research on this so it may have already been proposed over at the microformats site.
Sometimes I want to communicate with the author of a webpage. I don’t want to have to hunt around to find contact details and I probably don’t want to fill in a form.
If the author’s contact details were semantically marked up, either on the page I was looking at or linked to from elsewhere on the web, then I could use a simple parser to extract those details and manually select the best communication tool to use.
Maybe some code can help explain what I mean.
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
  <head>
    <title>Contact Details</title>
  </head>
  <body>
    <dl class="contactDetails">
      <dt>Skype</dt>
      <dd class="skype identifier">skype-username</dd>
      <dt>MSN</dt>
      <dd class="msn identifier">msn-username</dd>
    </dl>
  </body>
</html>

require 'rubygems'
require 'hpricot'

class ContactParser
  # Build a parser from a string of HTML.
  def self.from_html(html)
    doc = Hpricot(html)
    new(doc)
  end

  def initialize(hpricot_doc)
    @hpricot_doc = hpricot_doc
  end

  # Map each service name (the class that isn't 'identifier') to its identifier.
  def service_identifiers
    (@hpricot_doc/'.contactDetails .identifier').inject({}) do |hash, e|
      service = (e.classes - ['identifier']).first
      identifier = e.inner_text
      hash[service] = identifier
      hash
    end
  end

  # Look up the identifier for one service; empty string if it isn't listed.
  def identifier(service)
    (@hpricot_doc/".contactDetails .#{service}.identifier").inner_text
  end
end

contact_details_file = File.dirname(__FILE__) + '/contact-details.html'
html = File.open(contact_details_file) { |f| f.read }
parser = ContactParser.from_html(html)

p parser.service_identifiers
# => {"skype"=>"skype-username", "msn"=>"msn-username"}
p parser.identifier('skype')
# => "skype-username"
p parser.identifier('blurgh')
# => ""
Anyone have any thoughts?
All of the code is over on Google Code.
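To make the "select the best communication tool" step a bit more concrete, here is a rough sketch of what could happen once the details have been parsed. It assumes a hash shaped like the one returned by service_identifiers, and preferred_contact is a hypothetical helper I've made up for illustration; it isn't part of ContactParser. It needs nothing beyond plain Ruby.

```ruby
# Hypothetical helper, not part of ContactParser: given a hash of
# service => identifier and a list of services in order of preference,
# return the first service the author has published, with its identifier.
# Returns nil when none of the preferred services are listed.
def preferred_contact(service_identifiers, preferences)
  service = preferences.find { |name| service_identifiers.key?(name) }
  service && [service, service_identifiers[service]]
end

# Same shape as the hash returned by service_identifiers.
details = { 'skype' => 'skype-username', 'msn' => 'msn-username' }

p preferred_contact(details, ['jabber', 'msn', 'skype'])
# => ["msn", "msn-username"]
p preferred_contact(details, ['jabber'])
# => nil
```

The preference list stays with the reader rather than the author, so the markup only has to say which services exist and not which one is "best".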
1 I’m thinking of multi-protocol tools, like Adium, in particular.