Posted in Ruby on June 22nd, 2009 by alex / No Comments »
HTML parsing in JRuby seems to be going through a slightly odd patch. Nokogiri and Hpricot both seem to have problems. There’s one project I’m working on at the moment which needs XPath support, and by chance I happen to be using Celerity, which wraps HtmlUnit. If I need an HTML parser, I thought, there must be one hidden somewhere within that I can use. For extra bonus points, I wouldn’t even need to package any native code, as Celerity already has that covered…
And so it came to pass. celerity_parser is an almost trivially thin wrapper around HtmlUnit’s HTMLParser class that’s got just enough functionality to do what I need, which is to search for elements by XPath and extract text and XHTML structure. When I say “trivially thin”, I really mean it: there’s a grand total of 2 Ruby classes, and 5 methods you might want to use.
Here’s how it works, taken from the README:
root_node = CelerityParser.parse(html_content)
found_elements = root_node.search("//html/head/title")
found_elements.first.text # => "Html page title"
That’s pretty much it. The only dependencies are jarib-celerity and JRuby itself. Enjoy, and I’m open to pull requests and suggestions if you need more than this. I’ve not done any speed tests, but it’s native Java, so it might be quite nippy.
Tags: html,jruby,Ruby,rubygems.
Posted in Ruby on June 3rd, 2009 by alex / No Comments »
It’s slightly more involved than you might think to make a custom Rails environment that is based on another. In my case, I wanted to have a staging environment that was as close as possible to production. So, I thought require 'config/environments/production' should do the trick.
Not so.
Because of the config.foo magic and the fact that it requires binding tomfoolery, environments aren’t loaded, or loadable, with require. They’re read and eval’d, so that the code in the file can see the initializer’s local config variable. Here’s what I’ve got at the top of config/environments/staging.rb at the moment:
production_environment_path = File.join(File.dirname(configuration.environment_path), 'production.rb')
eval(IO.read(production_environment_path), binding, production_environment_path)
So far so good. I’ll update here if that turns out not to be the whole story.
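To see why require falls over while eval works, here is a minimal sketch in plain Ruby, with no Rails involved. FakeConfig, load_environment_file and demo are made-up names standing in for the Rails configuration object and initializer; the only assumption is that the environment file refers to a local variable called config, as Rails environment files do.

```ruby
require 'tmpdir'

# Hypothetical stand-in for the Rails configuration object.
FakeConfig = Struct.new(:cache_classes)

# Reads the file and evaluates it inside this method's binding, so the
# code in the file can see the local variable `config`.
def load_environment_file(path, config)
  eval(IO.read(path), binding, path)
  config
end

def demo
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'production.rb')
    File.write(path, "config.cache_classes = true\n")

    # require runs the file at the top level, where no `config` exists,
    # so the reference to it raises NameError.
    require_failed =
      begin
        require path
        false
      rescue NameError
        true
      end

    evaluated = load_environment_file(path, FakeConfig.new(false))
    [require_failed, evaluated.cache_classes]
  end
end

require_failed, cache_classes = demo
puts "require failed: #{require_failed}, eval set cache_classes: #{cache_classes}"
```

Passing the path as the third argument to eval means any error raised inside the file is reported against that file rather than against the eval call.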
Posted in Ruby on May 14th, 2009 by alex / No Comments »
I stumbled on this today. I don’t know why, but I’ve always had a block over what the word “monad” actually means, and how the bind and return operations map to that meaning.
In the linked Stack Overflow post, there is a single sentence that fixes the problem:
An alternative term is computation builder which is a bit more descriptive of what they are actually useful for.
Ah-ha! The rest of the post is made up of some great examples. Go read if you’re as confused as I was.
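For the Rubyists, here is a rough sketch of the “computation builder” idea using a hand-rolled Maybe type. None of this comes from the linked post: the Maybe class, safe_div and compute are illustrative names, and unit plays the role that Haskell calls return.

```ruby
# A tiny Maybe type: it either holds a value or is empty.
class Maybe
  attr_reader :value

  def initialize(value, present)
    @value = value
    @present = present
  end

  # "return": wrap a plain value so it can enter the computation.
  def self.unit(value)
    new(value, true)
  end

  def self.nothing
    new(nil, false)
  end

  def present?
    @present
  end

  # "bind": hand the wrapped value to the next step, or skip the
  # remaining steps entirely if an earlier one came up empty.
  def bind
    present? ? yield(value) : self
  end
end

# A step that can fail: integer division that refuses to divide by zero.
def safe_div(a, b)
  b.zero? ? Maybe.nothing : Maybe.unit(a / b)
end

# Builds a two-step computation out of unit and bind.
def compute(x, y, z)
  Maybe.unit(x).bind { |a| safe_div(a, y) }.bind { |b| safe_div(b, z) }
end

result = compute(100, 5, 2)
puts result.present? ? "got #{result.value}" : 'got nothing'
```

The chain in compute never checks for failure itself. That bookkeeping lives in bind, which is the sense in which the monad “builds” the computation out of smaller steps.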
Tags: haskell,programming.
Posted in Ruby on May 5th, 2009 by alex / No Comments »
As part of the spangly new and exciting project I’m working on, I’ve got a dumb JRuby client app that runs on the user’s desktop, which I need to have upload data to my S3 buckets. I pondered and read for a bit. S3 is new to me, so I was wandering off down the path of “touch the key with a public-writeable ACL, wait for a completion notification callback, then close the ACL.” This, obviously, is madness.
Luckily, the fine folks at Amazon have already thought of this, and provided a POST mechanism designed for browsers. It’s got a slightly strange gotcha, though: the uploaded file must be the last element in the POST body. This and Ruby’s default Net::HTTP API sent me looking for alternatives to building the POST by hand.
The solution is quite neat. I’ve got a teeny Sinatra app sitting on the server whose only purpose is to serve prevalidated upload forms to authenticated clients. Auth is provided by Rack. The client just uses 6 lines of Mechanize code to fill in the form details and submit it to S3. It’s rather a library-heavy solution, but as a concept it doesn’t get much simpler.
And simpler is better.
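For the curious, the server side of “prevalidated upload form” boils down to something like the sketch below. It is not the code from my Sinatra app: s3_post_fields is a made-up helper, the bucket, key and credentials in the usage lines are placeholders, and it assumes the policy-plus-HMAC-SHA1 signing scheme that the S3 browser POST documentation described at the time of writing. Newer AWS signature versions use different field names.

```ruby
require 'json'
require 'openssl'
require 'time'

# Builds the hidden form fields for a browser-style POST upload to S3.
# Hypothetical helper; every argument value is supplied by the caller.
def s3_post_fields(bucket:, key:, access_key_id:, secret_key:, expires_at:)
  policy_document = {
    'expiration' => expires_at.getutc.iso8601(3),
    'conditions' => [
      { 'bucket' => bucket },
      { 'key' => key },
      { 'acl' => 'private' }
    ]
  }

  # The policy is Base64-encoded JSON; 'm0' encodes without line breaks.
  policy = [JSON.generate(policy_document)].pack('m0')

  # The signature is the Base64-encoded HMAC-SHA1 of the encoded policy.
  hmac = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha1'), secret_key, policy)
  signature = [hmac].pack('m0')

  # An array of pairs keeps the order explicit. The client appends the
  # "file" field after these, because S3 requires it to come last.
  [
    ['key', key],
    ['AWSAccessKeyId', access_key_id],
    ['acl', 'private'],
    ['policy', policy],
    ['signature', signature]
  ]
end

fields = s3_post_fields(
  bucket: 'example-bucket',            # placeholder
  key: 'uploads/data.bin',             # placeholder
  access_key_id: 'EXAMPLEACCESSKEY',   # placeholder
  secret_key: 'example-secret',        # placeholder
  expires_at: Time.now + 600
)
puts fields.map(&:first).join(', ')
```

The secret key never leaves the server. The client only ever sees the encoded policy and its signature, which is what makes it safe to hand the form to a dumb desktop app.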
Posted in Ruby on May 5th, 2009 by alex / No Comments »
For reference, here’s my standard Rails kit at the moment:
- Authlogic for session handling.
- Paperclip for uploads.
- RestClient for remote service handling.
- Mocha for mocking.
- Webrat for integration testing.
- Scaffolding_extensions for laziness.
- Passenger/Apache for serving.
- Vlad for deployment.
I want to add Cucumber and Thin to this, but Cucumber has given me problems in the past and I just haven’t got round to trying Thin out yet.
Posted in Ruby on April 28th, 2009 by alex / No Comments »
The more I think about it, the more I think that we need controller tests in Rails that don’t render. Functional tests just don’t cut it if you’ve got anything non-trivial in either the controller actions or the view. I know we should be aiming for simplicity in precisely those places, but sometimes you just can’t distill everything into a fat model.
Posted in Ruby on April 25th, 2009 by alex / No Comments »
Looks like I’m not going to be using restful_authentication on this project. The instructions are convoluted, and what you end up with doesn’t test cleanly. Can’t be bothered with that. While I’m in learn-mode, authlogic looks like a better bet; I also like the fact that someone else has realised that model > ORM model.
Posted in Ruby on April 25th, 2009 by alex / No Comments »
Yup, that pretty much sums it up.
I’ve been fairly thoroughly exposed to Darcs, Mercurial, Git and Subversion now. The problems I’ve had have been hashed over in a thousand blog posts by others who tried all this out before me, so I won’t bother with tales of slowness, real slowness, or the revelation of git branching and stashing. Nor will I complain about git’s learning curve.
I just don’t think I’ll be using any of the others except for legacy codebases now. What finally sealed the deal was the realisation that git’s submodules were just svn:externals in disguise. Oh, and the fact that pretty much all the third party code I’m interfacing with these days is on Github.
Posted in Coding, Internet, Projects, Ruby on April 24th, 2009 by alex / No Comments »
So… first greenfield Major Project in a while. It’s a Rails app, but I’m shifting to PostgreSQL and Amazon EC2/S3 for a bunch of it, so there’s going to be a fair amount of new learning here.
It also feels slightly odd to be jumping back into Rails again. I’ve not done any new Rails work for a little while; the majority of my consulting has been on apps frozen at 2.1, so it’ll be good to be working on the fresh code-base.
I’ve shut down Other People Work for the next couple of weeks to get this project out of the door, although I reckon that what with various travel and visiting plans, I’ve only got about two-thirds of that time to play with.
No time to hang around here blogging, there’s work to be done!
Tags: amazon,aws,ec2,rails,Ruby,s3.
Posted in Ruby on April 24th, 2009 by alex / No Comments »
I know I shouldn’t leave it so long, but I’ve just updated to WP 2.7.1.
The new admin interface rocks. I could never see the point of the old dashboard, whereas this is actually useful.
Tags: meta,wordpress.