I could fetch all the data in one query that had a large group by clause, which would return all the data I needed in 60 seconds instead of 8 hours.
But then I had a pile of data in rows. That's not very handy when you have to slice it by seven different characteristics.
So I wrote a recursive Ruby hash structure with semantics that allow me to retrieve by arbitrary attributes:
h=RecursiveHash.new([:business, :product_line, :format, :country, :state, :count], 0)
h.insert ['ConsumerBusiness', 'Product A', 'HTML', "UnitedStates", 1, 10]
h.retrieve { :product_line => 'ConsumerBusiness' }
h.retrieve { :product_line => 'ConsumerBusiness', :doc_format => 'HTML' }
The default mode is to sum the responses, but it can also return an array of matches.
It's really fast, and has a simple interface that, to me, is highly readable.
Here it is - RecursiveHash.rb:
class Array; def sum; inject( nil ) { |sum,x| sum ? sum+x : x }; end; end class RecursiveHash def initialize(attributes, default_value, mode=:sum) @attribute_name = attributes[0] @child_attributes=attributes[1..-1] @default_value=default_value @master=Hash.new @mode=mode end def insert(values) if values.size > 2 #puts "Inserting child hash at key #{values[0]}, child: #{values[1..-1].join(',')}" if @master[values[0]]==nil @master[values[0]]=RecursiveHash.new(@child_attributes, @default_value) end @master[values[0]].insert(values[1..-1]) else puts "Inserting value at key #{values[0]}, value: #{values[1]}" @master[values[0]]=values[1] end end def return_val(obj, attributes) if obj.is_a? RecursiveHash return obj.retrieve(attributes) elsif obj==nil return @default_value else return obj end end def retrieve(attributes) if attributes[@attribute_name]==nil or attributes[@attribute_name]=='*' or attributes[@attribute_name].is_a? Array keys=nil if attributes[@attribute_name].is_a? Array keys=attributes[@attribute_name] else keys=@master.keys end v=keys.collect { |key| return_val(@master[key], attributes) } #puts "v: #{v.join(',')}" return @mode==:sum ? v.sum : v else return return_val(@master[attributes[@attribute_name]], attributes) end end def pprint(n=0, parent_key="N/A") indent = " " * n puts "#{indent}#{parent_key} (holds #{@attribute_name})" @master.each_key { |key| if @master[key].is_a? RecursiveHash @master[key].pprint(n+1, key) else puts "#{indent} #{key}: #{@master[key]}" end } end end
No comments:
Post a Comment