Ruby Recursive Hash with SQL-like capabilities

I was recently working on a Ruby project that had to query a SQL server where each round trip (connection setup, query, response, close) took about 5 seconds. I needed to generate many unique data slices for a reporting project, and the naive implementation issued some 6,000 database queries, which at 5 seconds each would take about eight hours to run.

Instead, I could fetch all the data in one query with a large GROUP BY clause, which returned everything I needed in 60 seconds instead of 8 hours.

But then I had a pile of data in rows. That's not very handy when you have to slice it by seven different characteristics.

So I wrote a recursive Ruby hash structure with semantics that allow me to retrieve by arbitrary attributes:

h=RecursiveHash.new([:business, :product_line, :format, :country, :state, :count], 0)

h.insert ['ConsumerBusiness', 'Product A', 'HTML', "UnitedStates", 1, 10]

h.retrieve(:business => 'ConsumerBusiness')
h.retrieve(:business => 'ConsumerBusiness', :format => 'HTML')

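The default mode is to sum the matching values, but it can also return an array of matches (pass any mode other than :sum as the third constructor argument, for example :array). Leaving an attribute out of the query, or passing '*', matches every value of that attribute; passing an Array matches any of the listed values.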
It's really fast, and has a simple interface that, to me, is highly readable.
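
The lookup rule can be sketched with plain nested Hashes. This is a condensed stand-in, not the class listed below: ATTRS, nest, total and the sample rows are hypothetical names made up for illustration, it covers only the summing mode, and it assumes Ruby 2.4 or later for the built-in Array#sum with a block.

```ruby
# Condensed sketch of the lookup rule using plain nested Hashes.
# ATTRS, nest, total and the sample rows are hypothetical, for illustration only.
ATTRS = [:business, :format, :country].freeze

# Build one Hash level per attribute; the last column is the stored count.
def nest(rows)
  rows.each_with_object({}) do |row, root|
    *path, leaf_key, count = row
    node = path.inject(root) { |hash, key| hash[key] ||= {} }
    node[leaf_key] = count
  end
end

# Sum every count matching the query. A missing attribute or '*' matches all
# keys at that level, an Array matches the listed keys, anything else one key.
def total(node, attrs, query)
  return node unless node.is_a?(Hash) # reached a stored count
  wanted = query[attrs.first]
  keys = case wanted
         when nil, '*' then node.keys
         when Array    then wanted
         else               [wanted]
         end
  # Keys that were never inserted contribute 0, like a default value.
  keys.sum { |key| node.key?(key) ? total(node[key], attrs.drop(1), query) : 0 }
end

rows = [
  ['ConsumerBusiness',   'HTML', 'UnitedStates', 10],
  ['ConsumerBusiness',   'PDF',  'UnitedStates', 4],
  ['ConsumerBusiness',   'HTML', 'Canada',       3],
  ['EnterpriseBusiness', 'HTML', 'UnitedStates', 7]
]
tree = nest(rows)

consumer_total = total(tree, ATTRS, { business: 'ConsumerBusiness' })   # 17
html_total     = total(tree, ATTRS, { format: 'HTML' })                 # 20
canada_mexico  = total(tree, ATTRS, { country: ['Canada', 'Mexico'] })  # 3
puts consumer_total, html_total, canada_mexico
```

When the leading attributes are pinned to a single value, a lookup only walks the branches under those keys; attributes left out fan out over every key at that level.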

Here it is - RecursiveHash.rb:

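# Sum by folding with +, so it works for any values that respond to +.
# Note: Ruby 2.4+ ships its own Array#sum; this definition replaces it globally.
class Array
 def sum
  inject(nil) { |sum, x| sum ? sum + x : x }
 end
end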

class RecursiveHash
 def initialize(attributes, default_value, mode=:sum)
  @attribute_name = attributes[0]
  @child_attributes=attributes[1..-1]
  @default_value=default_value
  @master=Hash.new
  @mode=mode
 end
 
 def insert(values)
  if values.size > 2
   #puts "Inserting child hash at key #{values[0]}, child: #{values[1..-1].join(',')}"
   if @master[values[0]]==nil
    # Pass @mode down so that child levels honor a non-sum mode as well.
    @master[values[0]]=RecursiveHash.new(@child_attributes, @default_value, @mode)
   end
   @master[values[0]].insert(values[1..-1])
  else
   #puts "Inserting value at key #{values[0]}, value: #{values[1]}"
   @master[values[0]]=values[1]
  end
 end
 
 def return_val(obj, attributes)
  if obj.is_a? RecursiveHash
   return obj.retrieve(attributes)
  elsif obj==nil
   return @default_value
  else
   return obj
  end
 end
 
 def retrieve(attributes)
  if attributes[@attribute_name]==nil or attributes[@attribute_name]=='*' or attributes[@attribute_name].is_a? Array
   keys=nil
   if attributes[@attribute_name].is_a? Array
    keys=attributes[@attribute_name]
   else
    keys=@master.keys
   end
   
   v=keys.collect { |key| return_val(@master[key], attributes) }
   #puts "v: #{v.join(',')}"
   return @mode==:sum ? v.sum : v
  else
   return return_val(@master[attributes[@attribute_name]], attributes)
  end   
 end
 
 def pprint(n=0, parent_key="N/A")
  indent = "  " * n
  puts "#{indent}#{parent_key} (holds #{@attribute_name})"
  @master.each_key { |key| 
   if @master[key].is_a? RecursiveHash 
    @master[key].pprint(n+1, key)
   else
    puts "#{indent}    #{key}: #{@master[key]}"
   end
  }
 end
end
