Mindon.IDEA

Air off, Mind on ~ / Javascript+Golang, Sci, Health… /

Find Out Unique Elements in a Javascript Array

To remove duplicate elements in a array, there are a few algorithms to implement.

Algorithm 1

function unique1(d) {
  var o = {}, i, l = d.length, r = [];
  for(i=0; i<l;i+=1) o[d[i]] = d[i];
  for(i in o) r.push(o[i]);
  return r;
};

This method has 2 loops, that’s a big time-cost problem.

Algorithm 2

reduce one loop from the algorithm 1, improve a litter bit

function unique2(d) {
  var r = [], i = {}, j = 0;
  for(var k=0, kmax=d.length; k<kmax; k++) {
    if(!i[d[k]]) {
      i[d[k]] = 1;
      r[j++]=d[k];
    }
  }
  return r;
}

Algorithm 3

In ECMA-262 standard, there’s a indexOf method for Array object, we use it to improve a lot.

function unique3(d) {
  var r = [], j = 0;
  for(var k=0, kmax=d.length; k<kmax; k++) {
    if(r.indexOf(d[k]) < 0) {
      r[j++]=d[k];
    }
  }
  return r;
}

and it could be better…

Algorithm 4

jQuery.unique source ( Sizzle.uniqueSort ), it’s the fastest one.

function unique4( d ) {
  d.sort();
  for (var i = 1; i < d.length; i++ ) {
    if ( d[i] === d[ i - 1 ] ) {
      d.splice( i--, 1 );
    }
  }

  return d;
};

Compare these 4 algorithms:

Speed(Performance): 4 > 3 > 2 > 1

Testing code:

var d = [1, 3, 2, '2'];

function test(fn, count) {
  var t = new Date().getTime();
  for(var k =0 ; k < count; k++) {
    fn(d);
  }

  return new Date().getTime() - t;
}

var t1 = test(unique1, 100000);
var t2 = test(unique2, 100000);
var t3 = test(unique3, 100000);
var t4 = test(unique4, 100000);

Time consume result sample:

IE 9(in Editplus):

[t1,t2,t3,t4] = [93, 55, 25, 25]

IE 9(Browser):

[t1,t2,t3,t4] = [508, 449, 432, 271]

Chrome 18:

[t1,t2,t3,t4] = [141, 59, 43, 43]

Firefox 11:

[t1,t2,t3,t4] = [129, 118, 25, 37]

(use == instead of === in unique4 will make it a little bit faster.)

Problems

handling a array with different data types:

var d = [1, 3, 2, '2'];

results:

unique1(d) : [1, '2', 3]

unique2(d) : [1, 3, 2]

unique3(d) : [1, 3, 2, '2']

unique4(d) : [1, 2, '2', 3]

if we change the === in unique4 into ==, then the new result will be

unique4(d) : [1, 2, 3]

Another issue is the order problem:

unique1 and unique4 will re-order elements.

unique2 and unique3 will keep the original order.

Algorithm 3: unique3 depends on indexOf of ECMA-262 standard implement, and it cannot handle different data types.

Conclusion

  • Same data-type elements

orgianl order: unique3

sorted: unique4

  • Different data-type elements ( thinking ‘2’ is the same as 2 )

orgianl order: unique2

sorted: unique4 (!NOTICE: change === into == )

  • Different data-type elements ( thinking ‘2’ is different from 2 )

orgianl order: unique3

sorted: unique4 (!NOTICE: keep === )

以上为去除数组中的重复元素的各种算法,性能及问题。

— Mindon(麦盾) Apri 18, 2012 Shenzhen(深圳)

(整理这样一篇东西还挺耗时的,子时了zZzZZz… )

Comments