Posted by Chris Roos Wed, 23 Jan 2008 08:48:00
I’m in the process of migrating this site from typo to a bunch of static pages. I aim to get the static site deployed this evening. I’ll post again when I’ve migrated. If any RSS subscribers out there don’t see a new post this evening, or tomorrow morning, then it means that I’ve failed to get it deployed or that it’s deployed and I’ve screwed up…. That’s all for now.
Oh and the code is in the usual place if you fancy following along.
Posted by Chris Roos Wed, 16 Jan 2008 09:31:00
In order to ensure that I didn’t break anything while setting up my mod_rewrite rules I created a little ruby script to test my expectations. I’ve cut down the rules in the example below but you should get the idea. It’d be quite cool to create a little dsl to better express the intentions of the code.
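A rough sketch of what such a DSL might look like, ahead of the plain script that follows. The `RewriteExpectations` class and its `redirect` and `ok` methods are made-up names for illustration; all they do is build the same hash of expectations that the script iterates over.

```ruby
# Hypothetical DSL: collects expectations into the
# { request_url => { :url => ..., :code => ... } } hash used by the script.
class RewriteExpectations
  attr_reader :expectations

  # Evaluates the block against a fresh builder and returns the hash.
  def self.define(&block)
    builder = new
    builder.instance_eval(&block)
    builder.expectations
  end

  def initialize
    @expectations = {}
  end

  # Expect a redirect (301 unless told otherwise) to options[:to].
  def redirect(request_url, options)
    @expectations[request_url] = {
      :url => options[:to], :code => options.fetch(:code, '301')
    }
  end

  # Expect a plain 200 (OK) response.
  def ok(request_url)
    @expectations[request_url] = { :code => '200' }
  end
end

expectations = RewriteExpectations.define do
  redirect 'http://www.the-local-paper.co.uk/', :to => 'http://the-local-paper.co.uk/'
  ok 'http://the-local-paper.co.uk/'
end
```

The resulting `expectations` hash has the same shape as the hand-written one, so the request loop would not need to change.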
require 'net/http'
expectations = {
  # Redirect www. to .
  'http://www.the-local-paper.co.uk/' => {
    :url => 'http://the-local-paper.co.uk/', :code => '301'
  },
  # Requests for the-local-paper.co.uk should return a 200 (OK) response
  'http://the-local-paper.co.uk/' => {
    :code => '200'
  }
}
expectations.each do |request_url, expected_attributes|
  puts "Requesting: #{request_url}"
  url = URI.parse(request_url)
  request = Net::HTTP::Get.new(url.path)
  response = Net::HTTP.start(url.host, url.port) do |http|
    http.request(request)
  end
  if redirection_url = expected_attributes[:url]
    raise "Expected '#{redirection_url}' in the Location header but got '#{response['Location']}'." unless redirection_url == response['Location']
  end
  if status_code = expected_attributes[:code]
    raise "Expected status code of (#{status_code}) but got (#{response.code})." unless status_code == response.code
  end
end
Tags apache, mod_rewrite, ruby, test,
Posted by Chris Roos Mon, 14 Jan 2008 09:59:00
I’ve often felt the need to search through our subversion repository to find, for example, when a method was removed/renamed. As far as I’m aware there’s no easy way to do this using the standard subversion, or trac, tools. A few days ago the need arose once more: I wanted to know when a particular class had stopped being used within our codebase. It struck me that if I had a diff file for each changeset then I’d be able to grep for the change I was looking for. I hacked together a script to produce a diff of each changeset (actually, as I knew when the class was added, I only produced a diff for a subset of the changesets) and was able to find the information I needed. The script is pasted below and on google code. It assumes that you are running it from within an svn working directory. It took a while to produce the diff files (I’m afraid I don’t have any stats) but the result was a really fast way of finding otherwise hard-to-find information. I guess it’d be quite easy to automate the diff creation process so that you had an always up-to-date bunch of diff files…
Is this useful, or have I missed something obvious in Subversion or Trac that does exactly this?
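For the search step itself, plain grep over the patches directory does the job. If you'd rather stay in ruby, a minimal sketch might look like the following. It assumes the `patches/<revision>.patch` layout that the script below produces; `search_patches` is a made-up helper name.

```ruby
# Returns [[revision, matching_line], ...] for every line in the patch
# files that mentions +needle+, oldest revision first.
def search_patches(needle, patches_dir = 'patches')
  paths = Dir.glob(File.join(patches_dir, '*.patch'))
  paths = paths.sort_by { |path| File.basename(path, '.patch').to_i }
  paths.flat_map do |path|
    revision = File.basename(path, '.patch').to_i
    File.readlines(path).
      map { |line| line.scrub.chomp }. # scrub guards against odd bytes in diffs
      select { |line| line.include?(needle) }.
      map { |line| [revision, line] }
  end
end

search_patches('OldClassName').each do |revision, line|
  puts "r#{revision}: #{line}"
end
```

Lines beginning with `-` in the output are the ones where the name was removed, which is usually what you're hunting for.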
require 'fileutils'
stop_at_revision = ARGV[0]
raise "Please specify the revision that you wish to go back to as the only argument to this script." unless stop_at_revision
stop_at_revision = Integer(stop_at_revision)
class Time
  def friendly_format
    strftime("%Y-%m-%d %H:%M:%S")
  end
end
def msg(message)
  puts "#{Time.now.friendly_format} - #{message}"
end
def produce_diff(revisions, earlier_revision = nil)
  if revisions.empty?
    msg "All finished."
    exit
  end
  later_revision = earlier_revision ? earlier_revision : revisions.shift
  earlier_revision = revisions.shift
  msg "Creating diff from revision #{earlier_revision} to #{later_revision}."
  `svn diff -r#{earlier_revision}:#{later_revision} > patches/#{later_revision}.patch`
  produce_diff(revisions, earlier_revision)
end
FileUtils.mkdir_p 'patches'
msg "Scanning output of svn log to find all revisions that we care about..."
log = `svn log -rHEAD:#{stop_at_revision} -q`
revisions = log.scan(/^r(\d+)/)
revisions = revisions.flatten.collect { |revision| Integer(revision) }
produce_diff(revisions)
Tags ruby, script, search, subversion, svn, trac,
Posted by Chris Roos Fri, 14 Dec 2007 08:46:00
I wasted quite a bit of time over the last couple of days trying to get a bookmarklet (or favelet) working in Internet Explorer 6. It worked perfectly in Firefox and Safari but wasn’t doing anything at all in IE. Through lots of trial and error, I came to the conclusion that it was because of the length of the bookmarklet: it seems that Microsoft reduced the maximum length of a bookmarklet in IE 6.
Although my original bookmarklet wasn’t all that long, I had it spread over multiple lines for readability. The fix in my case was to remove all the spaces and place the bookmarklet on one line.
I started to think that it might be nice to have a bookmarklet rails helper. The bookmarklet helper would allow you to format your javascript for readability but shorten it all in the resulting html (it could even warn if the bookmarklet is too long for IE). I couldn’t find anything similar, and I probably won’t get around to doing it myself, but I did decide to investigate the limit in IE (which could form the basis of such a helper).
I used the html page below to test the limits of Internet Explorer. This is the longest bookmarklet that I could get to work: one more character (a 2 on the end of those numbers, for example) and the bookmarklet won’t work (it still works if you click the link in the page – it only fails when used as an actual bookmarklet).
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<title></title>
</head>
<body>
<a href="javascript:
var msg = 'hello world, this is my message. i\'m trying to see how long i can be until internet explorer blows up in my face.';
var msg = msg + '\n';
var msg = msg + 'this is the second line of my long message';
var msg = msg + '\n';
var msg = msg + '12345679012345678901';
alert(msg);">bookmarklet</a>
</body>
</html>
I noticed that, once added to Internet Explorer, the bookmarklet appeared to be URL encoded (my assumption was based on the conversion of spaces to %20).
I tried to replicate this encoded string in ruby but failed: A simple URI.escape in ruby was escaping too much. Inspecting the bookmarklet again seemed to suggest that the only encoding performed is to remove newlines and convert spaces to %20. Replicating this in irb gave me the same string that Internet Explorer had in its favourites.
irb) str = "javascript:
var msg = 'hello world, this is my message. i\'m trying to see how long i can be until internet explorer blows up in my face.';
var msg = msg + '\n';
var msg = msg + 'this is the second line of my long message';
var msg = msg + '\n';
var msg = msg + '12345679012345678901';
alert(msg);"
irb) p str.gsub("'\n'", "'--NEW-LINE--'").gsub("\n", '').gsub("'--NEW-LINE--'", "'\n'").gsub(' ', "%20").length
=> 505
Everything I’d read about these bookmarklets in Internet Explorer suggested that the limit was 508 characters: the length of the bookmarklet above (once encoded) is only 505 characters. Adding one more character makes it fail as a bookmarklet. Hmm, maybe I’m doing something dumb?
I’ve cheated a bit in the ‘encoding’ above. IE removes all newlines (I wonder if this happens in the DOM or just when you add it as a favourite?) so that the bookmarklet appears on one line. I needed to do the same but also needed those newlines in the msg to remain (hence the NEW-LINE replacement stuff). This is a very naive implementation: if you have ’\n\n’, for example, then it breaks. The ‘encoding’ would need to be a little more intelligent in an actual helper (but only if you wanted to report on the size of the bookmarklet).
I couldn’t find any reference to the source of this 508 character limit in Internet Explorer: all the blog articles just state it as fact. So, maybe it’s not 508. Maybe it’s actually 505…
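A minimal sketch of the size check such a helper might do. Everything here is an assumption for illustration: the names `compact_bookmarklet` and `bookmarklet_too_long?` are made up, the limit of 505 is just the longest length that worked in the experiment above, and the 'encoding' only mimics what IE appeared to do (drop newlines, turn spaces into %20). It sidesteps the NEW-LINE trick by taking the javascript from a single-quoted heredoc, so a `\n` inside a javascript string stays as a backslash and an n rather than becoming a real newline.

```ruby
# Assumed limit: the longest bookmarklet that worked in the experiment above.
IE6_BOOKMARKLET_LIMIT = 505

# Joins the lines of a readable, multi-line bookmarklet into one line
# (stripping indentation) and encodes spaces the way IE appeared to.
# Statements need their semicolons, as the newlines disappear.
def compact_bookmarklet(source)
  source.lines.map(&:strip).reject(&:empty?).join.gsub(' ', '%20')
end

# True if the compacted bookmarklet is longer than the assumed limit.
def bookmarklet_too_long?(source, limit = IE6_BOOKMARKLET_LIMIT)
  compact_bookmarklet(source).length > limit
end

source = <<'JS'
javascript:
  var msg = 'hello world';
  msg = msg + '\n' + 'this is the second line';
  alert(msg);
JS

puts compact_bookmarklet(source)
puts "Too long for IE 6? #{bookmarklet_too_long?(source)}"
```

A rails helper could wrap the first method to emit the link and use the second to log a warning.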
Tags bookmarklet, browser, explorer, internet, internet-explorer, javascript, ruby, url,
Posted by Chris Roos Fri, 07 Dec 2007 17:23:00
I needed to use an external SMTP server to get mail through to certain providers (notably, hotmail).
I was sending my action mailer email but couldn’t be sure that I’d configured the settings correctly as I didn’t get any detailed log output (or at least not where I was looking).
I dug into the code and found the lines that were responsible for sending the email. Replaying that code in the console allowed me to identify the problem immediately (a missing domain in my case). I thought I’d paste those few lines here in case they’re of some use to someone else.
message, from, to = 'test message', 'sender-email', 'recipient-email'
s = ActionMailer::Base.server_settings
Net::SMTP.start(s[:address], s[:port], s[:domain], s[:user_name], s[:password], s[:authentication]) do |smtp|
  smtp.sendmail(message, from, to)
end
Tags action, action-mailer, mailer, rails, ruby,
Posted by Chris Roos Fri, 07 Dec 2007 07:26:00
So, not being able to sleep does have some benefits. I managed to get started on my latest little pet project.
I had been using apache and mod_rewrite to redirect “chrisroos.co.uk/amazonwishlist” to my actual wishlist. I had something like this in my apache config.
<VirtualHost *:80>
  ServerAdmin webmaster@seagul.co.uk
  ServerName chrisroos.co.uk
  ServerAlias www.chrisroos.co.uk
  RewriteEngine On
  # <anything>.chrisroos.co.uk -> chrisroos.co.uk
  RewriteCond %{HTTP_HOST} !^chrisroos.co.uk$ [NC]
  RewriteRule ^/(.*)$ http://chrisroos.co.uk/$1 [R=301,L]
  # Amazon Wishlist
  RewriteRule ^/amazonwishlist http://www.amazon.co.uk/gp/registry/IO9HVNCPEWGD [R]
</VirtualHost>
I’ve replaced that with a much simpler apache config that proxies requests to my new mongrel redirection handler thing (source – hey, even that URL passes through my redirection service).
<VirtualHost *:80>
  ServerName www.chrisroos.co.uk
  ServerAlias chrisroos.co.uk
  RewriteEngine On
  RewriteRule (.*) http://localhost:4010$1 [P]
</VirtualHost>
Feel free to go run this on your own server if you’re so inclined. At the moment you’ll have to generate the redirection rules by hand. In irb, you can do something like:
# require 'yaml'
# # For www.example.com
# rules = { 'www.example.com' => 'example.com' }
# File.open('PATH_TO_RULES_FOLDER' + '/www.example.com', 'w') { |file| file.puts(rules.to_yaml) }
# # For example.com
# rules = { '/google' => 'www.google.com' }
# File.open('PATH_TO_RULES_FOLDER' + '/example.com', 'w') { |file| file.puts(rules.to_yaml) }
Start the redirection server.
$ REDIRECTION_PORT=XXXX ruby mongrel-redir.rb &
$# Leaving out REDIRECTION_PORT will mean that the server starts on port 4000
$# Leaving out the final ampersand (&) will mean that the server runs in the foreground
Visit www.example.com/google[1] and watch[2] as you’re redirected first to example.com/google and then onto www.google.com. Cool huh.
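To make the two hops concrete, here is a guess at the lookup those rules imply. This is not the code from mongrel-redir.rb: `redirect_location` is a made-up function, and the behaviour (a rule keyed by the host name moves the whole request to another host, a rule keyed by a path points at the final destination) is only inferred from the example rules above.

```ruby
# all_rules maps each host name to the rules hash stored for that host,
# i.e. what you'd get from loading each YAML file in the rules folder.
# Returns the URL to redirect to, or nil if no rule applies.
def redirect_location(all_rules, host, path)
  rules = all_rules.fetch(host, {})
  if rules.key?(host)
    # Host rule: same path on a different host
    "http://#{rules[host]}#{path}"
  elsif rules.key?(path)
    # Path rule: straight to the destination
    "http://#{rules[path]}"
  end
end

all_rules = {
  'www.example.com' => { 'www.example.com' => 'example.com' },
  'example.com'     => { '/google' => 'www.google.com' }
}

puts redirect_location(all_rules, 'www.example.com', '/google')
puts redirect_location(all_rules, 'example.com', '/google')
```

The first call gives the example.com/google hop and the second gives the final hop to www.google.com.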
Lots to do – notably a user interface for managing rules but it’s not a bad start.
Oh, and this is specifically so that you can redirect from your own domain. If you don’t care about the domain then you could try tinyurl or, for human friendly URLs, decent url.
[1] Assuming, of course, that www.example.com and example.com both point to the server that this redirection server is running on.
[2] You probably won’t be able to watch as it’ll redirect too quickly in the browser – your best bet is to use curl or the Live HTTP Headers firefox extension to see what’s actually going on.
Tags apache, decenturl, mod_rewrite, mongrel, ruby, tinyurl, url,
Posted by Chris Roos Thu, 06 Dec 2007 16:26:00
First up – I don’t know an awful lot about the magic that is Rake. As such, this is probably common knowledge to most people.
I discovered the difference when trying to execute the db:migrate task from within another rake task (I already knew it worked when declared as a dependency).
# Rakefile
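task :foo do
  Rake::Task['db:migrate'].execute
end
#$ rake foo
#=> rake aborted!
#=> uninitialized constant ActiveRecord
I changed Task#execute to Task#invoke and, voila, it all worked fine: execute runs only the task’s own body, whereas invoke runs its prerequisites first (for db:migrate that includes the environment task that loads Rails, hence the missing ActiveRecord constant). The rdoc for those methods is actually pretty self explanatory, having seen the differences in action. Oh well.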
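One related wrinkle, sketched here separately from the example that follows: invoke also remembers that a task has already run, so a second invoke does nothing until you call reenable, whereas execute runs the body every time. This is a minimal sketch using Rake's programmatic API rather than a Rakefile; the task names and the `invoke_vs_execute_log` wrapper are made up for illustration.

```ruby
require 'rake'

# Defines two throwaway tasks and records the order in which their
# bodies run under execute, invoke and reenable.
def invoke_vs_execute_log
  Rake::Task.clear # start from an empty task list
  log = []
  Rake::Task.define_task(:prerequisite) { log << 'prerequisite' }
  Rake::Task.define_task(:main => [:prerequisite]) { log << 'main' }

  main = Rake::Task[:main]
  main.execute  # body only, prerequisites skipped
  main.invoke   # prerequisite first, then the body
  main.invoke   # nothing happens: already invoked
  main.reenable # forget that main has been invoked
  main.invoke   # body again; the prerequisite is still marked as invoked
  log
end

p invoke_vs_execute_log
```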
I put together a simple example to demonstrate the differences.
# The task (task_1) and its dependency (to_be_run_before_task_1)
task :to_be_run_before_task_1 do
  puts "to_be_run_before_task_1"
end
task :task_1 => ['to_be_run_before_task_1'] do
  puts "task_1"
end
# Three tasks that 'run' task_1
task :invoke_task_1 do
  Rake::Task['task_1'].invoke
end
task :execute_task_1 do
  Rake::Task['task_1'].execute
end
task :run_task_1_using_dependencies => ['task_1']
# 'Running' the tasks
#$ rake task_1
#=> to_be_run_before_task_1
#=> task_1
#$ rake invoke_task_1
#=> to_be_run_before_task_1
#=> task_1
#$ rake execute_task_1
#=> task_1 #*** Note that the dependencies are not run
#$ rake run_task_1_using_dependencies
#=> to_be_run_before_task_1
#=> task_1
Posted by Chris Roos Wed, 03 Oct 2007 23:52:00
I haven’t done any research on this so it may have already been proposed over at the microformats site.
Sometimes I want to communicate with the author of a webpage. I don’t want to have to hunt around to find contact details and I probably don’t want to fill in a form.
If the author’s contact details were semantically marked up, either on the page I was looking at or linked to elsewhere on the web, then I could use a simple parser to extract those details and select the best communication tool to use manually.
Maybe some code can help explain what I mean.
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
  <head>
    <title>Contact Details</title>
  </head>
  <body>
    <dl class="contactDetails">
      <dt>Skype</dt>
      <dd class="skype identifier">skype-username</dd>
      <dt>MSN</dt>
      <dd class="msn identifier">msn-username</dd>
    </dl>
  </body>
</html>
require 'rubygems'
require 'hpricot'
class ContactParser
  def self.from_html(html)
    doc = Hpricot(html)
    new(doc)
  end
  def initialize(hpricot_doc)
    @hpricot_doc = hpricot_doc
  end
  def service_identifiers
    (@hpricot_doc/'.contactDetails .identifier').inject({}) do |hash, e|
      service = (e.classes - ['identifier']).first
      identifier = e.inner_text
      hash[service] = identifier
      hash
    end
  end
  def identifier(service)
    (@hpricot_doc/".contactDetails .#{service}.identifier").inner_text
  end
end
contact_details_file = File.dirname(__FILE__) + '/contact-details.html'
html = File.open(contact_details_file) { |f| f.read }
parser = ContactParser.from_html(html)
p parser.service_identifiers
# => {"skype"=>"skype-username", "msn"=>"msn-username"}
p parser.identifier('skype')
# => 'skype-username'
p parser.identifier('blurgh')
# => ''
Anyone have any thoughts?
Code is all over on google code.
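The "select the best communication tool" step isn't covered by the parser above. A minimal sketch of how it might work, assuming you keep your own ordered list of preferred services; `preferred_contact` and the preference list are made up for illustration, and the hash has the shape returned by `service_identifiers`.

```ruby
# identifiers: hash of service name => identifier, as returned by
#              ContactParser#service_identifiers
# preferences: my own services, most preferred first
# Returns [service, identifier] for the first service we both use, or nil.
def preferred_contact(identifiers, preferences)
  service = preferences.find { |name| !identifiers[name].to_s.empty? }
  service ? [service, identifiers[service]] : nil
end

identifiers = { 'skype' => 'skype-username', 'msn' => 'msn-username' }
p preferred_contact(identifiers, %w[jabber msn skype])
# => ["msn", "msn-username"]
```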
1 I’m thinking of multi protocol tools, like adium, in particular.
Tags aim, contact, email, im, messaging, microformat, msn, ruby, skype, twitter, yim,
Posted by Chris Roos Sat, 29 Sep 2007 09:11:00
On Thursday, James and I spent some time investigating remote pairing options. One tiny part of that investigation was finding a way to keep two remote filesystems in sync. I initially thought of using rsync at regular intervals with cron. Unless I’ve missed something, the smallest interval between cron jobs is one minute which is potentially too long. We wanted a way to rsync (or sync in general) whenever the filesystem changed. A little searching brought this fslogger tool to my attention. It simply writes some event details to STDOUT every time something changes on the filesystem. I figured that we could use this, with a little bit of ruby glue, to trigger rsync. Our glue script continuously reads from STDIN, tries to match some pre-defined patterns and triggers rsync if those patterns are matched. It’s very brittle but it gets the job done. Well, actually, it only partially gets the job done: I’m very unfamiliar with rsync but it seemed to take between 10 and 30 seconds to sync even if only one file had changed. I suspect that I may have been doing something wrong though.
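The pattern matching at the heart of the glue can be pulled out and tried on its own. This is a sketch rather than the script itself: `changed_path` and `sync_needed?` are made-up names, and the sample lines are invented to fit the patterns in the script below (they are not real fslogger output).

```ruby
# Returns the path mentioned on an fslogger line, or nil if the line
# isn't one of the two kinds we care about.
def changed_path(line)
  raw = if line =~ /FSE_ARG_VNODE/     # create file or dir
          line[/path\s+=(.*)/, 1]
        elsif line =~ /FSE_ARG_STRING/ # remove file or dir
          line[/string\s+=(.*)/, 1]
        end
  raw && raw.strip
end

# True if the line mentions a path matching the directory pattern.
def sync_needed?(line, directory_pattern)
  path = changed_path(line)
  path ? path.match?(directory_pattern) : false
end

# Invented sample lines, shaped to fit the regular expressions above
p sync_needed?('FSE_ARG_VNODE: path = /Users/me/project/app.rb', /project/)
p sync_needed?('FSE_ARG_STRING: string = /Users/me/elsewhere/old.rb', /project/)
```

Returning nil for lines without a path also avoids calling strip on nil, which is one of the ways the script below can fall over.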
#!/usr/bin/env ruby
REMOTE_SERVER = 'YOUR_REMOTE_SERVER'
REMOTE_PATH = 'YOUR_REMOTE_PATH'
DIRECTORY_PATTERN = ARGV.delete_at(0)
raise "You must specify a directory_pattern" unless DIRECTORY_PATTERN
def rsync!
  `rsync -rz #{File.join(DIRECTORY_PATTERN, '*')} #{REMOTE_SERVER}:#{REMOTE_PATH}`
end
while line=gets
  if line =~ /FSE_ARG_VNODE/ # CREATE FILE OR DIR
    rsync! if line[/path\s+=(.*)/, 1].strip =~ Regexp.new(DIRECTORY_PATTERN)
  elsif line =~ /FSE_ARG_STRING/ # REMOVE FILE OR DIR
    rsync! if line[/string\s+=(.*)/, 1].strip =~ Regexp.new(DIRECTORY_PATTERN)
  end
end
Tags cron, file, file-system, fslogger, rsync, ruby, script, sync, system,
Posted by Chris Roos Thu, 27 Sep 2007 07:36:00
I’ve started looking into my *cough* promise *cough* to do that local paper site I mentioned sometime ago. The code is all in my google code repository if you want to see how it’s going (or finish it when I inevitably get bored again…).
Anyway, the point of this post is that I realised something about testing that I feel I may have missed in the past. I’ve got used to stating the behavior of the objects in my system through tests. So, when I wanted to ensure that an article couldn’t be created without a title, I added a test like so:
class ArticleTest < Test::Unit::TestCase
  def test_should_validate_presence_of_title
    ...
  end
end
In fact, for an article comprised of a title, an edition, a page number and an author, I wanted to ensure that all were there apart from the author. It was only once I’d added tests to validate the presence of the other three attributes that I realised it was probably important to prove that an author wasn’t required. So, I added the relevant test.
class ArticleTest < Test::Unit::TestCase
  def test_should_not_validate_presence_of_author
    ...
  end
end
That’s all well and good, but I’ve lost some information: WHY it shouldn’t validate the presence of an author. I know at the moment why it shouldn’t (because short articles don’t have an author listed), but will I know at some point in the future? I suspect not. The same obviously applies for the things that I do want to ensure are present. Why do I care that an article has a title, edition and page number? At the moment I care because I’m planning on making them part of the URL, but what if I decide not to go down that route, yet leave the tests there? There’ll be no easy way for someone in the future to determine whether those things are actually important or not.
I wonder if this issue was highlighted because of my lack of SVN access. I generally like really small checkins to my version control software which would’ve allowed me to have added and committed those validation additions one at a time, and with messages that would have stated the reasons for that change[1]. Being disconnected means that I end up with bigger checkins that inevitably lose some information. Actually, the more I think about it, the more I think that the repository isn’t the right place for this information anyway – it’s too far away from the code that’s solving the problems.
How do other people solve this potential loss of knowledge? Am I missing something obvious, or have I really just not noticed this before?
P.S. I spent a few minutes chatting with James about it today and he had some interesting suggestions, although I’ll leave it up to him to share those if he wants (no pressure James, I’m just running out of time right now I’m afraid).
[1] Paul suggested a while back that we use the commit note to explain the problem that we’ve solved, rather than just describing the changeset. Unfortunately, I can’t find the source of that suggestion so no link I’m afraid.
Tags agiledox, bdd, development, ruby, tdd, test, testing,
Posted by Chris Roos Tue, 04 Sep 2007 23:23:00
I’ve wondered in the past about sending trackbacks (or pingbacks) to bookmarked resources on del.icio.us. As with most things, you can either wait for someone to do it for you, or you can do it yourself… I started to investigate last Thursday afternoon and soon got sucked into the much more interesting issue of the missing permalinks for bookmarks. You see, every time you bookmark something on del.icio.us, you are creating a new resource. Unfortunately, it’s a very sad :-( resource. Sad because it doesn’t have a URL and so can’t be linked to. An anonymous resource floating around the world wide web looking for love. Boo hoo. Anyway, enough of that. I figured that if we could assign a unique tag to each of these resources then they would, as a side effect, get a permalink (you can already get your bookmarks containing a given tag by using http://del.icio.us/USERNAME/MY_TAG). I experimented with using the url as a tag. Although there seem to be some limits on what you can use, I found that if you strip the protocol and trailing slash from a url you are generally OK. So, if we bookmark BBC News and tag it with news.bbc.co.uk then we can access that individual bookmark at http://del.icio.us/chrisjroos/news.bbc.co.uk.
As you can only bookmark the same URL once, it should guarantee that the url-tag is unique. I can, however, see a problem with the trailing slash. Although it’s possible to bookmark URLs that differ only in the trailing slash (http://example.com/article1 and http://example.com/article1/ for example), it doesn’t seem possible to create a tag that contains that same trailing slash. OK, so we could always ensure that we remove the trailing slashes from the URLs we bookmark but that doesn’t seem like a very robust solution.
While experimenting, I found that I could use forward slashes within my tags, www.foo.com/bar/baz for example. So, I started prepending url/ to my url-tag so that I could easily recognise posts that already had permalinks. It was at this time that I noticed the similarity between the del.icio.us url history page del.icio.us/url/MD5_OF_URL (e.g. BBC News) and my permalinked bookmark, del.icio.us/chrisjroos/url/URL. If I was to hash the bookmarked url in the same way as del.icio.us then I would end up with a url like del.icio.us/chrisjroos/url/MD5_OF_URL. You can see the similarity in the URLs below.
It looks as though it’s part of del.icio.us itself doesn’t it? Cool huh. It’d be really cool to extend the del.icio.us firefox extension to automatically add this tag when bookmarking a site.
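The two tagging schemes described above, as a small sketch. The function names are made up, and hashing the url exactly as bookmarked is an assumption about how del.icio.us builds its own url history pages.

```ruby
require 'digest/md5'

# First approach: the url itself as a tag, minus protocol and trailing slash.
def url_tag(url)
  url.sub(%r{\A[a-z]+://}i, '').sub(%r{/\z}, '')
end

# Second approach: url/MD5_OF_URL, mirroring del.icio.us/url/MD5_OF_URL.
def url_hash_tag(url)
  "url/#{Digest::MD5.hexdigest(url)}"
end

# The permalink that a bookmark gains as a side effect of carrying the tag.
def bookmark_permalink(username, url)
  "http://del.icio.us/#{username}/#{url_hash_tag(url)}"
end

puts url_tag('http://news.bbc.co.uk/') # news.bbc.co.uk
puts bookmark_permalink('chrisjroos', 'http://news.bbc.co.uk/')
```

Because the trailing slash is part of what gets hashed, the second scheme also gives http://example.com/article1 and http://example.com/article1/ different tags, which the first scheme couldn't.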
I’ve created a crappy script that will add this url/MD5_OF_URL tag to each of your posts. It’s hosted on google code and pasted below. Err, use at your own risk by the way.
USERNAME = 'YOUR_USERNAME'
PASSWORD = 'YOUR_PASSWORD'
POSTS_CACHE = 'FILE_TO_STORE_YOUR_EXISTING_DELICIOUS_POSTS_IN'
# GET ALL POSTS
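unless File.exist?(POSTS_CACHE)
  puts "Downloading all posts..."
  # The trailing backslashes join these lines into a single shell command,
  # so the redirect on the last line applies to curl's output
  curl_cmd = <<-EndCurl
    curl "https://api.del.icio.us/v1/posts/all" \
    -u"#{USERNAME}:#{PASSWORD}" \
    -s \
    > #{POSTS_CACHE}
  EndCurl
  `#{curl_cmd}`
end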
require 'hpricot'
require 'digest/md5' # the old top-level 'md5' library is gone from current ruby
require 'cgi'
def add_url_hash_to_post(post)
  url = CGI.unescapeHTML(post['href'])
  url_hash = Digest::MD5.hexdigest(url)
  url = CGI.escape(url)
  tags = post['tag'].split(' ').collect { |tag| CGI.unescapeHTML(tag) }
  tags << "url/#{url_hash}" unless tags.include?("url/#{url_hash}")
  tags = tags.collect { |tag| CGI.escape(tag) }
  tags = tags.join(' ')
  shared = post['shared']
  description = CGI.escape(CGI.unescapeHTML(post['description']))
  extended = CGI.escape(CGI.unescapeHTML(post['extended']))
  curl_cmd = <<-EndCurl
    curl "https://api.del.icio.us/v1/posts/add" \
    -u"#{USERNAME}:#{PASSWORD}" \
    -d"url=#{url}" \
    -d"description=#{description}" \
    -d"extended=#{extended}" \
    -d"tags=#{tags}" \
    -d"shared=#{shared}" \
    -s
  EndCurl
  `#{curl_cmd}`
end
# ADD THE URL/<md5_hash> TAG TO EACH POST
posts_xml = File.open(POSTS_CACHE) { |f| f.read }
posts_doc = Hpricot(posts_xml)
count = 1
(posts_doc/'posts'/'post').each do |post_xml|
  puts "Bookmark: #{count}" if (count % 5) == 0
  add_url_hash_to_post(post_xml.attributes)
  count += 1
end
Tags bookmark, cool, curl, del.icio.us, permalink, pingback, ruby, trackback,
Posted by Chris Roos Fri, 13 Jul 2007 14:41:00
Using the data supplied in this article, by Nik Sargent, I’ve created a simple web service that returns JSON formatted data about UK Outcodes (the first bit of the postcode).
The idea is that you can request, for example, /postcodes/se1 and get some json in return. For se1, we receive the following data.
{"latitude":"51.498","x":"532600","postcode":"SE1","y":"179500","longitude":"51.498"}
The service is currently running at http://seagul.co.uk/postcodes (se1) but I make no guarantees as to its reliability. I’d suggest that if anyone actually finds this useful that they go host it somewhere for themselves in order to control the availability.
The code is, as always, on google code and pasted below.
I’ve created it as a proof of concept but I figure that, given a little love, it may prove useful to some folks.
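The lookup at the heart of the handler below can be tried without mongrel. This sketch swaps the array index for a hash keyed by the lowercased outcode, which is my variation rather than what the service does; `build_index` and `lookup` are made-up names, and the sample record uses only fields from the se1 response above.

```ruby
require 'json'

# Builds a hash of lowercased outcode => record from the parsed JSON array.
def build_index(postcodes)
  postcodes.each_with_object({}) do |record, index|
    index[record['postcode'].downcase] = record
  end
end

# path_info is whatever follows /postcodes in the request, e.g. "/se1".
# Returns the matching record, or nil (which the handler would turn into a 404).
def lookup(index, path_info)
  index[path_info.sub(/^\//, '').downcase]
end

postcodes = JSON.parse('[{"postcode":"SE1","x":"532600","y":"179500"}]')
index = build_index(postcodes)
puts lookup(index, '/se1').to_json
p lookup(index, '/zz99')
```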
require 'rubygems'
require 'mongrel'
require 'json'
postcodes_file = File.dirname(__FILE__) + '/uk-postcodes.json'
Postcodes = JSON.parse(File.open(postcodes_file) { |f| f.read })
Index = Postcodes.inject([]) { |index, postcode| index << postcode['postcode'].downcase }
class PostcodeHandler < Mongrel::HttpHandler
  def process(request, response)
    outcode = request.params['PATH_INFO'].sub(/^\//, '').downcase
    postcode_index = Index.index(outcode)
    if postcode_index
      response.start do |head, out|
        head["Content-Type"] = "text/plain"
        out << Postcodes[postcode_index].to_json
      end
    else
      response.start(404) do |head,out|
        out << "Postcode not found\n"
      end
    end
  end
end
config = Mongrel::Configurator.new :host => 'localhost', :port => '4000' do
  listener do
    uri "/postcodes", :handler => PostcodeHandler.new
  end
  trap("INT") { stop }
  run
end
config.join
Tags mongrel, outcodes, postcodes, rest, ruby, service, web,
Posted by Chris Roos Sat, 23 Jun 2007 18:00:00
I wanted to automate the download of transaction data for my ing direct account. Instead of just pasting up the finished result, I’ve tried to capture the process I went through, on the off chance it’s of interest to anyone…
Open the homepage
Click the login link. (goes to https://secure.ingdirect.co.uk/InitialINGDirect.html?command=displayLogin&device=web&locale=en_GB)
Type our Customer Number and Last Name into the login form.
Before submitting the form, we open the Live Http headers firefox extension.
We see that the following information gets POSTed to InitialINGDirect.html
command=enterCustomerNumber&locale=en_GB&device=web&ACN=<YOUR_CUSTOMER_NUMBER>&LNAME=<YOUR_LAST_NAME>&GO.x=23&GO.y=12
We are redirected to a second security page.
The first use of curl is to see whether we can get this far from the command line.
curl -X"POST" -d"command=enterCustomerNumber" -d"locale=en_GB" -d"device=web" -d"ACN=YOUR_CUSTOMER_NUMBER" -d"LNAME=YOUR_LAST_NAME" -d"GO.x=23" -d"GO.y=12" "https://secure.ingdirect.co.uk/InitialINGDirect.html" -o"<FILENAME_FOR_OUTPUT>"
The actual output in this instance is
<SCRIPT>
location.replace("/INGDirect.html?command=displayValidateCustomer&fill=1");
</SCRIPT>
So we amend the curl command to see exactly what’s going on (the removal of -o and the addition of -v)
Right, so this is interesting – I was expecting to see an HTTP Location header directing us to the second security page. Instead the body of the response is the javascript seen above. I guess they want to ensure that we have javascript enabled eh.
Let’s just see what happens if we request the page returned in the javascript (https://secure.ingdirect.co.uk/INGDirect.html?command=displayValidateCustomer&fill=1)
Ok, so that’s not unexpected. How about, if we store the cookies sent to us in the original response, and then request the page returned in the javascript.
curl -X"POST" -c"/Users/chrisroos/Desktop/ing-cookie" -d"command=enterCustomerNumber" -d"locale=en_GB" -d"device=web" -d"ACN=YOUR_CUSTOMER_NUMBER" -d"LNAME=YOUR_LAST_NAME" -d"GO.x=23" -d"GO.y=12" "https://secure.ingdirect.co.uk/InitialINGDirect.html"
curl -b"/Users/chrisroos/Desktop/ing-cookie" "https://secure.ingdirect.co.uk/INGDirect.html?command=displayValidateCustomer&fill=1" -o"/Users/chrisroos/Desktop/ing.html"
Ok, this looks promising. We can now proceed to the second security page.
Let’s use firefox to log in and get to the second security page again. Fill in the details as requested, open live http headers and get hold of the data we are sending. Right, so we POST the following details to InitialINGDirect.html
command=validateCustomer&locale=en_GB&device=web&PIN_A=5&PIN_B=1&PIN_C=4&DAYS=65&MONTHS=65&YEARS=66&GO.x=41&GO.y=9
Ah, so this looks interesting. It seems that each key on the keypad is assigned a number (potentially different to the number displayed on the key). It is that number that gets sent. The numbers on the keypad appear in a random order, so I wonder if the same number is always assigned to the same key, or whether something else is required to tell the server the order of the keys. Let’s log in one more time and see if the results differ. The assumption being that if they do then the server must know what order the keys have been displayed in.
Bugger – now it won’t let me log in anymore (too many failed attempts) and I have to call the call centre. Ooops.
Right, so it’s now about 5 weeks after starting this investigation. I’m going to attempt to pick up where I left off and see if we can’t get it finished. Previously, before I got locked out of my account, I was going to test whether each digit on the keypad got assigned a different number each time. In order to test, I logged into my account twice. Each time, I saved the content of the page containing the keypad, and the output from Live Http Headers. A quick examination of the differences between the output seems to suggest that we only ever send the position of a number on the keypad. A made-up example might help to make this clear, starting with a keypad in no particular order.
5 1 2 4 7 3 0 8 9 6
If our pin number is 1234 and we are asked for the first digit then, although we would push the button labeled 1 on the keypad, we would actually send the value 2 to the server. The buttons (but not the values they represent) all have a constant position, where the top-left button is position 1, the bottom-right is position 9 and the bottom button is position 0 (same layout as phone keypads). For the above example, the button labeled 1 is in position 2, hence us sending a 2 to the server.
Right, so unless I’m missing something, the server must ‘know’ which three digits we’re being asked for, along with the order of the keypad. If we’re to log in automatically then we’re going to need to parse the text that asks us for certain digits and determine the order of the keypad. A little experimentation with hpricot yielded the following code.
require 'rubygems'
require 'hpricot'
html = File.open('/Users/chrisroos/Desktop/ingdirect-b1.html') { |f| f.read }
# The request for our PIN digits appears within the following text
# <b>Using the Key Pad, please enter the 3rd, 5th and 2nd digits from your PIN</b>
pin_a, pin_b, pin_c = html.scan(/Using the Key Pad, please enter the (\d).+?(\d).+?(\d).*?<\/b>/).flatten
# Ok, now let's find the order of the keypad (hey, it's in a div with id pin-pad - cool)
# This has turned out quite trivial. The keypad is rendered within an html table, where each cell contains a button (input)
# that has a value of the number displayed on the button. Hpricot makes it very easy to say 'get me all buttons (input)
# that appear in cells within the table that is in the div with an id of pin-pad'. As hpricot will parse the html table
# 'in order', i.e. from top-left to bottom right, we just have to append each button value into an array, preserving the
# order of the keypad.
doc = Hpricot(html)
keypad = (doc/'div#pin-pad/table/tr/td/input').inject([]) { |array, btn| array << btn.attributes['value'].to_i; array }
# The last value read from the html keypad table will appear at index 9 (our array is 0 based), where we actually want it to
# appear at index 0 (keypad is in phone keypad order, i.e. 0 at bottom). Luckily, we can just pop it off the end and
# prepend it to the array
keypad.unshift(keypad.pop)
# So, we can now convert a keypad like this
# 5 1 2
# 4 7 3
# 0 8 9
# 6
# into the equivalent ruby array [6, 5, 1, 2, 4, 7, 3, 0, 8, 9]
# We can then use Array#index to dive in and grab the keypad position of a number
# keypad.index(5) #=> 1
# keypad.index(7) #=> 5
Cool, so we should be able to string together what we have and automatically log in to our account. We need to dip into ruby so that we can parse the keypad page using the code above.
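Stringing the pieces together might look something like this. It's only a sketch: the six-digit PIN and the keypad_positions helper are made up for illustration, and the keypad is the example one from the comments above.

```ruby
# Hypothetical helper: turn the requested PIN digits into the keypad
# positions that would be sent to the server.
def keypad_positions(pin, requested_digits, keypad)
  requested_digits.map do |nth|
    digit = pin[nth - 1, 1].to_i # the prompt counts digits from 1
    keypad.index(digit)          # position of the button showing that digit
  end
end

prompt = '<b>Using the Key Pad, please enter the 3rd, 5th and 2nd digits from your PIN</b>'
requested = prompt.scan(/please enter the (\d).+?(\d).+?(\d).*?<\/b>/).flatten.map { |d| d.to_i }

keypad = [6, 5, 1, 2, 4, 7, 3, 0, 8, 9] # example keypad from the comments above
pin = '123456'                          # made-up PIN, purely for illustration

positions = keypad_positions(pin, requested, keypad)
puts positions.inspect # prints [6, 1, 3]
```

With this keypad the button labeled 1 sits in position 2, so a request for the first digit of a PIN starting with 1 would send a 2, as described earlier.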
Right, I’ve cobbled the code together but instead of seeing an account overview page, I’m actually receiving a page with more javascript relocation magic.
<SCRIPT>
location.replace("/InitialINGDirect.html?command=displayLoggedOutError&locale=en_GB&device=web");
</SCRIPT>
I’m wondering if I’m missing some cookies – maybe some more cookies get added to the jar on the keypad page. Using the web developer extension this is trivial to check. Two cookies are set when we first visit the login page… Hmm, and there are still two when we get to the keypad page. So something else is different. Ok, so I found it. Somehow, I’d managed to miss out some data that was required for the login to work correctly. I was not sending the command=validateCustomer key/value pair even though I’d already pasted it once above. Oh well. With this ‘command’ added, we still get a page with some javascript relocation but this time we get relocated to the account summary.
<SCRIPT>
location.replace("/INGDirect.html?command=accountSummary&locale=en_GB&device=web&method=fetchClientAccountSummary");
</SCRIPT>
If we finally GET this page then we should be able to ‘see’ our account overview on the command line. Of course it wasn’t that simple. This just yields another javascript redirect.
<SCRIPT>
location.replace("/INGDirect.html?command=displayClientAccountSummary&fill=1");
</SCRIPT>
Let’s try to GET this page then. Woohoo, we’re finally at the account overview page. Wow that was hard work, so, so much harder than it should be for a web application.
The final step is to GET and save the transaction page. The url to the account page in the browser is:
https://secure.ingdirect.co.uk/INGDirect.html?command=accountSummary&method=fetchClientAccountSummary&account=0&stepName=saving
GETting that url on the command line yields yet another javascript redirect (really, you do surprise me).
<SCRIPT>
location.replace("/INGDirect.html?command=displayAccountDetails");
</SCRIPT>
Fine, let’s GET that url then (starting to get bored now). And, wait for it… Yeah, we have downloaded transactions. Phew. Right then, just a little code tidying and I’ll get it uploaded to google code and this article can be posted. Yay.
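Since every step seems to answer with one of these location.replace pages, a small helper to pull out the target saves some copy and paste. A minimal sketch (javascript_redirect_target is a hypothetical name, and the regex assumes the redirect always looks like the examples above):

```ruby
# Return the path passed to location.replace("..."), or nil if the page
# doesn't contain a javascript redirect of that shape.
def javascript_redirect_target(html)
  html[/location\.replace\("([^"]+)"\)/, 1]
end

page = %[<SCRIPT>\nlocation.replace("/INGDirect.html?command=displayAccountDetails");\n</SCRIPT>]

target = javascript_redirect_target(page)
puts target # prints /INGDirect.html?command=displayAccountDetails
```

A page without a redirect gives nil, which would be the cue to stop following and save the html.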
Ok, so we’re uploaded to google code. Now to set this little fella free… And we’re done.
Tags curl, direct, finance, ing, ing-direct, ofx, ruby, transaction, wesabe,
Posted by Chris Roos Sat, 23 Jun 2007 12:34:00
I’ve followed in my previous footsteps and automated the download of my egg credit card transactions and statements. Using this in combination with my ofx convertor makes it easier for me to upload my financial data to wesabe.
As always, the code is in google code and pasted below for your viewing pleasure…
require 'egg-credentials'
def execute_curl(cmd)
`#{cmd}`
end
COOKIE_LOCATION = "/tmp/egg.cookie"
STATEMENT_DATE = '18 January 2007'
RECENT_TRANSACTIONS_FILE = '/Users/chrisroos/Desktop/egg-recent-transactions.html'
STATEMENT_FILE = '/Users/chrisroos/Desktop/egg-statement.html'
# Logging in requires a cookie that is set when we first visit the login page
curl_cmd = %[curl -s -c"#{COOKIE_LOCATION}" "https://new.egg.com/security/customer/logon?URI=https://new.egg.com/customer/youraccounts"]
execute_curl curl_cmd
# POST our login credentials
curl_cmd = %[curl -s -b"#{COOKIE_LOCATION}" -c"#{COOKIE_LOCATION}" -L "https://logon.egg.com/LoginWebServer/services/CustomerLogin" -d"firstName=#{FIRST_NAME}" -d"lastName=#{LAST_NAME}" -d"dobDay=#{DOB_DAY}" -d"dobMonth=#{DOB_MONTH}" -d"dobYear=#{DOB_YEAR}" -d"postcode=#{POSTCODE}" -d"mmn=#{MOTHERS_MAIDEN_NAME}" -d"password=#{PASSWORD}" -d"HiddenURI=https%3A%2F%2Fnew.egg.com%2Fcustomer%2Fyouraccounts"]
execute_curl curl_cmd
# List and save recent transactions
curl_cmd = %[curl -s -o"#{RECENT_TRANSACTIONS_FILE}" -b"#{COOKIE_LOCATION}" "https://your.egg.com/customer/eggcard/recenttransactions.aspx"]
execute_curl curl_cmd
# There is no sensible way to get your statements. I'd hope for something like /statements/18-jun-2007.html but no such luck
# Instead, there is a zero indexed array of statement dates. Selecting one of these statement dates allows us to construct
# a uri using the index of the date in the list. So, if 18 June 2007 is the first date in the list (i.e. index 0) then we can view that
# statement at https://your.egg.com/customer/eggcard/statements.aspx?index=0
# Interestingly, this is how it works from the recent transactions page, from any actual statement page, it appears that this is
# too hard (it's not) and instead the developers have resorted to using some magical .net/javascript voodoo to obtain a different
# statement.
require 'rubygems'
require 'hpricot'
recent_transactions = File.open(RECENT_TRANSACTIONS_FILE) { |f| f.read }
doc = Hpricot(recent_transactions)
statement_dates = []
(doc/'select#ctl01_content_recenttransactions_lstPreviousStatements/option').each { |e| statement_dates << e.inner_text }
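statement_index = statement_dates.index(STATEMENT_DATE)
# Bail out early rather than request a statement with an empty index
raise "No statement found for '#{STATEMENT_DATE}'" if statement_index.nil?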
curl_cmd = %[curl -s -o"#{STATEMENT_FILE}" -b"#{COOKIE_LOCATION}" "https://your.egg.com/customer/eggcard/statements.aspx?index=#{statement_index}"]
execute_curl curl_cmd
Tags curl, egg, finance, ofx, ruby, wesabe,
Posted by Chris Roos Fri, 04 May 2007 12:53:00
I’ve updated my previous curl/ruby script to automatically download my most recent transactions from Hsbc. I created the original script as I had a couple of hours spare and wanted to see if I could do it. The update has been driven by my continued use of wesabe. Not only does it make my life a bit easier (which is, after all, the point of software) but the wesabe guys have suggested that they may be able to roll it into their client. Cool.
As an aside. It really, really, really shouldn’t be as hard as it is to automatically get a downloaded list of my transactions. At least Hsbc offer transactions in ofx format, unlike egg and ing direct. Come on, The Banks, sort it out eh.
Tags curl, egg, hsbc, ingdirect, ofx, ruby, wesabe,
Posted by Chris Roos Mon, 30 Apr 2007 09:08:00
require 'test/unit'
class CandidateTest < Test::Unit::TestCase
def setup
@candidate = Candidate.new
end
def test_should_love_learning
assert @candidate.loves_learning?
end
def test_should_love_ruby
assert @candidate.loves_ruby?
end
def test_should_love_problem_solving
assert @candidate.loves_problem_solving?
end
def test_should_not_be_a_job_agency
assert !@candidate.is_a_job_agency?
end
def test_should_know_the_secret_password
secret_password = 'hmm, so you think you know the secret password huh?'
candidate = Candidate.new(secret_password)
i_know_it_so_please_show_me_the_money(candidate)
end
end
__END__
Y2xhc3MgQ2FuZGlkYXRlCiAgZGVmIGluaXRpYWxpemUodG9wX3NlY3JldF9w
YXNzd29yZCA9IG5pbCkKICAgIEBzZWNyZXRfbWVzc2FnZSA9IG5pbAogICAg
aWYgdG9wX3NlY3JldF9wYXNzd29yZCA9PSAncnVieSBpcyBjb29sJwogICAg
ICBAc2VjcmV0X21lc3NhZ2UgPSAiQ2lvcUtpb3FLaW9xS2lvcUtpb3FLaW9x
S2lvcUtpb3FLaW9xS2lvcUtpb3FLaW9xS2lvcUtpb3FLaW9xCktpb3FLaW9x
S2lvcUtpb3FLaW9xS2lvcUtpb3FLaW9xS2lvcUtpb3FLaW9xS2lvcUtpb3FL
aW9xQ2xkdgpiMmh2Ynl3Z2VXOTFJR1p2ZFc1a0lIUm9hWE1nZG1WeWVTQndi
Mjl5YkhrZ2FHbGtaR1Z1TENCdFpXZGgKTFhObFkzSmxkQ3dnYldWemMyRm5a
UzRLQ2trZ2MyRjVJRzFsWjJFdGMyVmpjbVYwTENCaWRYUWdkRzhnClltVWdh
Rzl1WlhOMExDQjBhR2x6SUdseklHcDFjM1FnWVNCMGFHbHViSGtnZG1WcGJH
VmtJR3B2WWlCaApaSFpsY25RZ1ptOXlDbkpsWlhadmJ5NWpiMjB1SUNCVGIz
SnllU3dnWW5WMElFa2dkMkZ6SUdKdmNtVmsKSUc5dUlIUm9aU0IwY21GcGJp
NHVMZ29LUVc1NWFHOXZMQ0JwWmlCNWIzVWdabUZ1WTNrZ2QyOXlhMmx1Clp5
QjNhWFJvSUhWeklIUm9aVzRnWTJobFkyc2dRbVZ1SjNNZ2NHOXpkRnN4WFNC
aGJtUWdZWEJ3YkhrcwpJR0Z3Y0d4NUxDQmhjSEJzZVM0S0Nsc3hYU0JvZEhS
d09pOHZkM2QzTG5KbFpYWnZieTVqYjIwdllteHYKWjNNdlltVnVaM0pwWm1a
cGRHaHpMekl3TURjdk1EUXZNRE12Y21GcGJITXRaR1YyWld4dmNHVnlMV3B2
CllpOEtLaW9xS2lvcUtpb3FLaW9xS2lvcUtpb3FLaW9xS2lvcUtpb3FLaW9x
S2lvcUtpb3FLaW9xS2lvcQpLaW9xS2lvcUtpb3FLaW9xS2lvcUtpb3FLaW9x
S2lvcUtpb3FLaW9xS2lvcUtpb3FLaW9xS2lvcUtpb0sKQ2c9PQoiCiAgICBl
bmQKICBlbmQKICBkZWYga25vd3NfdGhlX3NlY3JldF9wYXNzd29yZD8KICAg
IEBzZWNyZXRfbWVzc2FnZQogIGVuZAogIGRlZiBsb3Zlc19sZWFybmluZz8K
ICAgIHRydWUKICBlbmQKICBkZWYgbG92ZXNfcnVieT8KICAgIHRydWUKICBl
bmQKICBkZWYgbG92ZXNfcHJvYmxlbV9zb2x2aW5nPwogICAgdHJ1ZQogIGVu
ZAogIGRlZiBpc19hX2pvYl9hZ2VuY3k/CiAgICBmYWxzZQogIGVuZAogIGRl
ZiBzZWNyZXRfbWVzc2FnZQogICAgcmFpc2UgIlVoIG9oLCBsb29rcyBsaWtl
IHlvdSBoYXZlIG9uZSBvdGhlciBsaXR0bGUgbWV0aG9kIHRvIGltcGxlbWVu
dC4uLiIKICBlbmQKZW5kCgpjbGFzcyBUZXN0OjpVbml0OjpUZXN0Q2FzZQog
IGRlZiBpX2tub3dfaXRfc29fcGxlYXNlX3Nob3dfbWVfdGhlX21vbmV5KGNh
bmRpZGF0ZSkKICAgIHJhaXNlICJVaCBvaCwgdGhhdCdzIG5vdCB0aGUgcmln
aHQgcGFzc3dvcmQiIHVubGVzcyBjYW5kaWRhdGUua25vd3NfdGhlX3NlY3Jl
dF9wYXNzd29yZD8KICAgIHB1dHMgQ2FuZGlkYXRlLm5ldygncnVieSBpcyBj
b29sJykuc2VjcmV0X21lc3NhZ2UKICBlbmQKZW5kCg==
Posted by Chris Roos Sat, 14 Apr 2007 00:44:00
I’ve created a simple mongrel server that allows you to upload text files for subsequent deletion and retrieval. If it sounds pretty pointless, that’s because it is…currently. I wonder though, if the basic premise couldn’t be used to create some text-transformation tools. I’m thinking of really simple tools that might spell-check a text file, or convert text from textile to html. My end goal would be to have lots of super simple tools that I can chain together to use as a blogging platform, removing my current dependence on typo.
I’ve made the server slightly more complex than it needs to be by imposing some artificial constraints (examples). The basic premise is that you post a text file, between 1 and 5000 bytes in size, with a content-type of plain/text. All being well, the server responds with 201 Created and the location of the new resource. The resource is actually retrieved by deleting it – the contents of the file are returned in the body with a “200 OK” response. I’ve tried to use the relevant HTTP error when resource creation fails. The whole ‘post’ and ‘delete’ pattern for these transformations feels kinda restful, but I may be way off the mark.
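To make the post-then-delete flow a bit more concrete, here's a toy, in-memory model of it. This is not the mongrel server: the TextStore class, the /files/N locations and the specific error codes (400, 404, 413, 415) are all assumptions made for illustration; only the 201, the 200, the size limits and the content-type come from the description above.

```ruby
# Toy model of the flow: POST a small text file, then retrieve it by DELETEing it.
class TextStore
  MAX_BYTES = 5000

  def initialize
    @resources = {}
    @next_id = 0
  end

  # Returns [status, location]. The error codes are assumptions.
  def post(body, content_type)
    return [415, nil] unless content_type == 'plain/text' # content-type as described above
    return [400, nil] if body.empty?                      # assumed code for an empty body
    return [413, nil] if body.bytesize > MAX_BYTES        # assumed code for an oversized body
    @next_id += 1
    location = "/files/#{@next_id}"                       # hypothetical url scheme
    @resources[location] = body
    [201, location]
  end

  # Returns [status, body]. Deleting hands back the contents, after which
  # the resource is gone.
  def delete(location)
    return [404, nil] unless @resources.key?(location)
    [200, @resources.delete(location)]
  end
end

store = TextStore.new
status, location = store.post('hello world', 'plain/text')
puts status                          # prints 201
puts store.delete(location).inspect  # prints [200, "hello world"]
```

A transformation tool would presumably do its work between the two calls, so the body that comes back from the delete is the transformed text.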
I’m wondering how much of yahoo pipes could be replicated with lots of small, specific web servers.
Anyone have any thoughts on any of this?
Tags http, mongrel, rest, restful, ruby, servers, web,
Posted by Chris Roos Wed, 11 Apr 2007 22:27:00
Having used pwdhash for a while, I decided it was about time I dived into the implementation to see what was actually going on. I figured that the best way for me to get an understanding of the library was to translate it from javascript to ruby.
The first step was to create a basic environment within which I could explore the existing javascript functionality. Relying on Firebug to provide a javascript console, this was as simple as creating a very basic html page.
<html>
<head>
<script type="text/javascript" src="md5.js"></script>
<script type="text/javascript" src="hashed-password.js"></script>
<title>pwdhash client</title>
</head>
<body>
</body>
</html>
It actually took me a good few minutes to get up and running with this environment. It seems that hashed-password.js relies on a variable, SPH_kPasswordPrefix, being defined. Although defined in pwdhash.js, it turns out to be our only dependency on that file. In wanting to keep the environment as simple as possible, I chose to copy the definition of that variable to hashed-password.js, allowing us to remain independent of pwdhash.js.
Knowing that md5.js implemented pretty standard md5 (rfc) and hmac (rfc) routines, for which there are equivalents in ruby, I chose to concentrate on translating hashed-password.js. Although the code is pretty simple, it took me a while to understand because of my limited javascript knowledge. We essentially construct an object (SPH_HashedPassword), supplying it with password and realm (domain), that generates, and represents, the hashed password.
The very first thing that a newly created SPH_HashedPassword does, is to use the b64_hmac_md5 function, supplied in the md5.js library, to generate the initial hash. I really didn’t expect to be tripped up quite this early on, but I just couldn’t replicate the value obtained from b64_hmac_md5 (using digest/md5 and ruby-hmac.) I now realise that I didn’t fully understand what I was actually trying to replicate – I really should have paid more attention to those three little letters (b64 / base64) at the beginning of the function name…
Without thinking about it as much as I should have, I decided that maybe I needed to implement the functionality provided in md5.js in ruby (code for reference). I dived in and started replicating one function at a time. All was going OK until I came across the zero fill right shift bitwise operator in javascript. There is no standard equivalent in ruby, but a little help from the mailing list allowed me to come up with this implementation.
class Integer
def zero_fill_right_shift(count)
(self >> count) & ((2 ** ((size * 8) - count)) - 1) # size * 8 is the width of the integer in bits
end
end
Although this allowed me to move a little further, progress completely ground to a halt when trying to replicate the bit_rol function. It seems that bitwise left-shifting by a negative number offers a different result in javascript and ruby.
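The difference comes down to javascript coercing bitwise operands to 32-bit integers and only using the low five bits of the shift count, whereas ruby integers are arbitrary precision and a negative count simply shifts the other way. A rough sketch of emulating the javascript behaviour in ruby (js_left_shift and js_zero_fill_right_shift are made-up names for illustration and aren't part of the pwdhash translation):

```ruby
# Emulate javascript's `x << count`: the count is masked to 5 bits and the
# result is wrapped to a signed 32-bit integer.
def js_left_shift(x, count)
  result = (x << (count & 31)) & 0xFFFFFFFF
  result >= 0x80000000 ? result - 0x100000000 : result
end

# Emulate javascript's `x >>> count`: treat x as an unsigned 32-bit integer,
# then shift right, filling with zeros.
def js_zero_fill_right_shift(x, count)
  (x & 0xFFFFFFFF) >> (count & 31)
end

puts js_left_shift(1, -1)             # prints -2147483648
puts js_zero_fill_right_shift(-1, 28) # prints 15
```

Masking to 32 bits up front also sidesteps the question of how wide a ruby integer happens to be on a given machine.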
javascript: 1 << -1 = -2147483648
ruby: 1 << -1 = 0
This time, I decided to break the b64_hmac_md5 function down into its component parts and see which, if any, differed between ruby and javascript. The first step was to compare the hex output of a simple md5 of a string.
javascript: hex_md5(text) => "1cb251ec0d568de6a929b520c4aed8d1"
ruby: Digest::MD5.hexdigest(text) => "1cb251ec0d568de6a929b520c4aed8d1"
Next up was a comparison of an HMAC MD5 of a key and some data.
javascript: hex_hmac_md5(key, data) => "9d5c73ef85594d34ec4438b7c97e51d8"
ruby: HMAC::MD5.hexdigest(key, data) => "9d5c73ef85594d34ec4438b7c97e51d8"
Although we can easily replicate the result of hex_hmac_md5 in ruby, we actually want to replicate the result of b64_hmac_md5 (i.e. the hashed data encoded in base64 representation). With a bit of thinking, and some trial and error, I realised that I needed to calculate the binary digest of the key and data, and then base64 encode the result. Note: my original problems arose when comparing the binary string output, what with javascript being able to display unicode characters and ruby just displaying their code points in octal representation.
javascript: b64_hmac_md5(key, data) => "nVxz74VZTTTsRDi3yX5R2A"
ruby: # This requires both ruby-hmac and base64
ruby: Base64.encode64(HMAC::MD5.digest(key, data)) => "nVxz74VZTTTsRDi3yX5R2A=="
This is the first time we see any difference in the values obtained from javascript and ruby. It turns out that there’s a clue to this difference in the md5.js library:
var b64pad = ""; /* base-64 pad character. "=" for strict RFC compliance */
Ok, so it seems I should just be able to remove the pad characters (=) from my generated hash and I’ll achieve the same value. Happy that this was a pretty easy difference to resolve, I could move forward with actually implementing the pwdhash magic in ruby…
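In ruby, the unpadded version might look something like the sketch below. Note the assumptions: it uses OpenSSL::HMAC from the standard library rather than the ruby-hmac gem used above, it relies on the 'm0' pack directive (ruby 1.9 or later) to base64 encode without line breaks, and the b64_hmac_md5 method name simply mirrors the javascript function.

```ruby
require 'openssl'

# HMAC-MD5 of data using key, base64 encoded with the '=' padding removed
# (mirroring b64pad = "" in md5.js).
def b64_hmac_md5(key, data)
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new('MD5'), key, data)
  [digest].pack('m0').delete('=')
end

puts b64_hmac_md5('some key', 'some data')
```

An MD5 digest is 16 bytes, so the padded base64 form is always 24 characters ending in ==, and the unpadded form is always 22.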
The pwdhash specific stuff is actually very simple. It alters the initial base 64 encoded hash to ensure that we have:
So, having got over the first hurdle, the remainder of the translation was pretty simple. In fact, in its current incarnation (revision 62), there are only a few small syntactical differences between the javascript and ruby versions.
Rubify the code. Although I’ve copied it almost directly from the javascript I think I’ve actually managed to make it slightly less readable.
Investigate the whole world of unicode a bit more – I definitely don’t know as much about it as I should.
Tags hashing, hmac, javascript, md5, pwdhash, ruby,
Posted by Chris Roos Tue, 03 Apr 2007 21:33:00
It turns out that we can use streetmap.co.uk to convert a UK postcode to latitude/longitude. It also turns out that a combination of curl, ruby and hpricot make it very easy to bypass the streetmap site to perform the conversion.
postcode = 'SW1A 2AJ'
html