Ruby performance: rewrite class extension to compare array elements in C?
I have code that extends Ruby's Array class:
    # Extends the Array class to have methods giving the same RSpec functionality of
    # checking if an array's elements equal another one's.
    class Array
      def self.same_elements?(actual, expected)
        extra_items = difference_between_arrays(actual, expected)
        missing_items = difference_between_arrays(expected, actual)
        extra_items.empty? & missing_items.empty?
      end

      def has_same_elements?(expected)
        extra_items = self.class.difference_between_arrays(self, expected)
        missing_items = self.class.difference_between_arrays(expected, self)
        extra_items.empty? & missing_items.empty?
      end

      protected

      def self.difference_between_arrays(array_1, array_2)
        difference = array_1.dup
        array_2.each do |element|
          if (index = difference.index(element))
            difference.delete_at(index)
          end
        end
        difference
      end
    end
And the spec:
    describe Array do
      before(:each) do
        @a = [1,2,3]
        @b = [3,2,1]
        @c = [1,2,3,4]
        @d = [4,1,3]
        @e = [1,2,3]
      end

      it "should respond to .same_elements?" do
        Array.should respond_to(:same_elements?)
      end

      it "should respond to #has_same_elements?" do
        Array.new.should respond_to(:has_same_elements?)
      end

      describe ".same_elements?" do
        it "should return correct values" do
          Array.same_elements?(@a,@a).should eq(true)
          Array.same_elements?(@a,@b).should eq(true)
          Array.same_elements?(@a,@c).should eq(false)
          Array.same_elements?(@a,@d).should eq(false)
          Array.same_elements?(@a,@e).should eq(true)

          Array.same_elements?(@b,@a).should eq(true)
          Array.same_elements?(@b,@b).should eq(true)
          Array.same_elements?(@b,@c).should eq(false)
          Array.same_elements?(@b,@d).should eq(false)
          Array.same_elements?(@b,@e).should eq(true)

          Array.same_elements?(@c,@a).should eq(false)
          Array.same_elements?(@c,@b).should eq(false)
          Array.same_elements?(@c,@c).should eq(true)
          Array.same_elements?(@c,@d).should eq(false)
          Array.same_elements?(@c,@e).should eq(false)

          Array.same_elements?(@d,@a).should eq(false)
          Array.same_elements?(@d,@b).should eq(false)
          Array.same_elements?(@d,@c).should eq(false)
          Array.same_elements?(@d,@d).should eq(true)
          Array.same_elements?(@d,@e).should eq(false)

          Array.same_elements?(@e,@a).should eq(true)
          Array.same_elements?(@e,@b).should eq(true)
          Array.same_elements?(@e,@c).should eq(false)
          Array.same_elements?(@e,@d).should eq(false)
          Array.same_elements?(@e,@e).should eq(true)
        end
      end

      describe "#has_same_elements?" do
        it "should return correct values" do
          @a.has_same_elements?(@a).should eq(true)
          @a.has_same_elements?(@b).should eq(true)
          @a.has_same_elements?(@c).should eq(false)
          @a.has_same_elements?(@d).should eq(false)
          @a.has_same_elements?(@e).should eq(true)

          @b.has_same_elements?(@a).should eq(true)
          @b.has_same_elements?(@b).should eq(true)
          @b.has_same_elements?(@c).should eq(false)
          @b.has_same_elements?(@d).should eq(false)
          @b.has_same_elements?(@e).should eq(true)

          @c.has_same_elements?(@a).should eq(false)
          @c.has_same_elements?(@b).should eq(false)
          @c.has_same_elements?(@c).should eq(true)
          @c.has_same_elements?(@d).should eq(false)
          @c.has_same_elements?(@e).should eq(false)

          @d.has_same_elements?(@a).should eq(false)
          @d.has_same_elements?(@b).should eq(false)
          @d.has_same_elements?(@c).should eq(false)
          @d.has_same_elements?(@d).should eq(true)
          @d.has_same_elements?(@e).should eq(false)

          @e.has_same_elements?(@a).should eq(true)
          @e.has_same_elements?(@b).should eq(true)
          @e.has_same_elements?(@c).should eq(false)
          @e.has_same_elements?(@d).should eq(false)
          @e.has_same_elements?(@e).should eq(true)
        end
      end
    end
This code apparently becomes slow with large arrays.
Is it worth porting these methods to C for performance reasons? How should I tackle this issue? Can you suggest a (recent) article on how to build such functionality, given that I'm using Ruby 2.0.0-p0?
Update
Benchmark between the 2 proposed solutions:
                                                  user     system      total        real
    (sort) instance method arr_1 vs arr_2      1.910000   0.030000   1.940000 (  1.935651)
    (set) instance method arr_1 vs arr_2       7.010000   0.360000   7.370000 (  7.377319)
    (sort) class method arr_1 vs arr_2         1.920000   0.030000   1.950000 (  1.952080)
    (set) class method arr_1 vs arr_2          6.610000   0.320000   6.930000 (  6.919273)
    (sort) instance method arr_1 vs arr_3      2.520000   0.090000   2.610000 (  2.620047)
    (set) instance method arr_1 vs arr_3       7.620000   0.330000   7.950000 (  7.951402)
    (sort) class method arr_1 vs arr_3         1.920000   0.030000   1.950000 (  1.943820)
    (set) class method arr_1 vs arr_3          8.130000   0.390000   8.520000 (  8.523959)
Quick benchmark code:
    require 'benchmark'
    require 'set'

    class Array
      def self.same_elements_with_sort?(actual, expected)
        actual.has_same_elements_with_sort?(expected)
      end

      def has_same_elements_with_sort?(expected)
        self.sort == expected.sort
      end

      # -----

      def self.same_elements?(actual, expected)
        actual.to_set == expected.to_set
      end

      def has_same_elements?(expected)
        Array.same_elements?(self, expected)
      end
    end

    Benchmark.bm(40) do |b|
      arr_1 = (1..8000000).to_a.sample(40000)
      arr_2 = (1..8000000).to_a.sample(40000)
      arr_3 = arr_1

      b.report('(sort) instance method arr_1 vs arr_2') do
        150.times { arr_1.has_same_elements_with_sort?(arr_2) }
      end

      b.report('(set) instance method arr_1 vs arr_2') do
        150.times { arr_1.has_same_elements?(arr_2) }
      end

      b.report('(sort) class method arr_1 vs arr_2') do
        150.times { Array.same_elements_with_sort?(arr_1, arr_2) }
      end

      b.report('(set) class method arr_1 vs arr_2') do
        150.times { Array.same_elements?(arr_1, arr_2) }
      end

      b.report('(sort) instance method arr_1 vs arr_3') do
        150.times { arr_1.has_same_elements_with_sort?(arr_3) }
      end

      b.report('(set) instance method arr_1 vs arr_3') do
        150.times { arr_1.has_same_elements?(arr_3) }
      end

      b.report('(sort) class method arr_1 vs arr_3') do
        150.times { Array.same_elements_with_sort?(arr_1, arr_3) }
      end

      b.report('(set) class method arr_1 vs arr_3') do
        150.times { Array.same_elements?(arr_1, arr_3) }
      end
    end
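One caveat about the comparison above: the two variants are not strictly equivalent. Converting to a Set discards duplicates, so the set-based check can report a match where the sort-based check (and the original implementation) does not. The sampled benchmark arrays contain no duplicates, so the timings themselves are unaffected. A small illustration, using only the standard library (the variable names are made up for this example):

```ruby
require 'set'

with_dupes_a = [1, 1, 2]
with_dupes_b = [1, 2, 2]

# Set-based comparison ignores how often an element occurs.
set_result = with_dupes_a.to_set == with_dupes_b.to_set   # => true

# Sort-based comparison keeps duplicates, so the arrays differ.
sort_result = with_dupes_a.sort == with_dupes_b.sort      # => false
```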
Your code comes from RSpec's MatchArray. RSpec computes extra_items and missing_items to provide a meaningful error message.
If you don't need that information, just sort the arrays:
    class Array
      def self.same_elements?(actual, expected)
        actual.has_same_elements?(expected)
      end

      def has_same_elements?(expected)
        self.sort == expected.sort
      end
    end
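Sorting requires the elements to be mutually comparable with <=> and costs O(n log n). If either is a concern, counting occurrences is one possible alternative. The sketch below is an assumption rather than part of the answer above: the method name has_same_elements_by_count? and the helper element_counts are hypothetical, and it relies on the elements having consistent hash and eql? implementations (so 1 and 1.0 count as different elements, unlike with sort and ==). It sticks to methods available in Ruby 2.0.

```ruby
class Array
  # Hypothetical alternative to the sort-based version: compares element
  # counts instead of sorted copies, so duplicates are respected and the
  # elements do not need to implement <=>.
  def has_same_elements_by_count?(expected)
    return false unless size == expected.size
    element_counts(self) == element_counts(expected)
  end

  private

  # Builds a hash mapping each element to its number of occurrences.
  def element_counts(array)
    array.each_with_object(Hash.new(0)) { |element, counts| counts[element] += 1 }
  end
end

[1, 2, 3].has_same_elements_by_count?([3, 2, 1])     # => true
[1, 2, 3].has_same_elements_by_count?([1, 2, 3, 4])  # => false
[1, 1, 2].has_same_elements_by_count?([1, 2, 2])     # => false
```

Whether this beats the sort-based version for a given data set is something to measure with the benchmark above rather than assume.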