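When you need to check membership against a large number of Strings, remember to use a Set: lookup is O(1) on average, versus O(n) for an Array.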
require 'set'  # needed before Ruby 3.2; harmless afterwards
platforms = Set.new %w[bash Chai D3JS Go Javascript Ruby]
foo if platforms.include? platform
Set implements a collection of unordered values with no duplicates. This is a hybrid of Array's intuitive inter-operation facilities and Hash's fast lookup.
Set is easy to use with Enumerable objects (implementing each). Most of the initializer methods and binary operators accept generic Enumerable objects besides sets and arrays. An Enumerable object can be converted to Set using the to_set method.
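As a small illustration of the paragraph above, the following sketch converts a Range and an Array with to_set and passes plain Enumerable objects to the binary operators. The variable names are made up for this example.

```ruby
require 'set'

# Any Enumerable can be converted with to_set.
from_range = (1..3).to_set
from_array = %w[go ruby go].to_set   # the duplicate "go" is dropped

# Binary operators accept generic Enumerable objects, not only sets.
union        = Set[1, 2] | [2, 3]
intersection = Set[1, 2, 3] & (2..5)
difference   = Set[1, 2, 3] - [1]

p from_range, from_array, union, intersection, difference
```

The right-hand operands here are an Array and a Range; neither has to be turned into a Set first.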
Set uses Hash as storage, so you must note the following points:
- Equality of elements is determined according to Object#eql? and Object#hash.
- Set assumes that the identity of each element does not change while it is stored. Modifying an element of a set will render the set to an unreliable state.
- When a string is to be stored, a frozen copy of the string is stored instead unless the original string is already frozen.
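The first point in the list above can be shown with a minimal sketch. The Point class is hypothetical and exists only for this example; the idea is that two objects count as the same element only when eql? and hash agree.

```ruby
require 'set'

# 1 == 1.0 is true, but 1.eql?(1.0) is false, so both values are kept.
numbers = Set[1, 1.0]

# Hypothetical value class, defined only for this illustration.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Set relies on eql? and hash, so both are defined consistently.
  def eql?(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
  alias_method :==, :eql?

  def hash
    [x, y].hash
  end
end

points = Set[Point.new(1, 2), Point.new(1, 2)]

puts numbers.size
puts points.size
puts points.include?(Point.new(1, 2))
```

Without the eql? and hash overrides, the two Point objects would be stored as separate elements, because the defaults inherited from Object are based on object identity.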
I verified this with a short benchmark:
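require 'benchmark'
require 'set'
# One million numeric strings, stored both as an Array and as a Set.
arr = (1..1000000).map { |e| e.to_s }
set = Set.new arr
i = rand(1000000)  # an Integer, so it never equals any String element (a miss)
j = i.to_s         # the same number as a String (a hit, unless i happens to be 0)
Benchmark.bmbm(10) do |t|
  t.report("arr_not_include") { 10.times { arr.include? i } }
  t.report("set_not_include") { 10.times { set.include? i } }
  t.report("arr_include") { 10.times { arr.include? j } }
  t.report("set_include") { 10.times { set.include? j } }
end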
The results:
Rehearsal ---------------------------------------------------
arr_not_include   1.109000   0.000000   1.109000 (  1.105745)
set_not_include   0.000000   0.000000   0.000000 (  0.000014)
arr_include       0.079000   0.000000   0.079000 (  0.071470)
set_include       0.000000   0.000000   0.000000 (  0.000084)
------------------------------------------ total: 1.188000sec

                      user     system      total        real
arr_not_include   1.203000   0.000000   1.203000 (  1.211598)
set_not_include   0.000000   0.000000   0.000000 (  0.000010)
arr_include       0.062000   0.000000   0.062000 (  0.064577)
set_include       0.000000   0.000000   0.000000 (  0.000015)