We wrote a program to compare the performance of Array_set with the current implementation of Set(3C++), as described in the Set tutorial elsewhere in this volume. The latter uses a relatively complicated digital trie data structure. We actually used Set_of_p in the comparison because, of the three unordered collection classes defined in Set(3C++), Set_of_p has the best performance characteristics.
Here are the first few lines of our program:
    #include <Stopwatch.h>      // see Stopwatch(3C++)
    #include <Array_alg.h>      // see shuffle(3C++)
    #include <stdlib.h>         // atoi(), rand(); see rand(3C)
    #include <assert.h>         // assert()
    #include <iostream.h>       // cout, cerr

    Stopwatch w;
    typedef double* DOUBLEP;

    #ifdef SETH
    #include <Set.h>
    typedef Set_of_p<double> SET;
    #else
    #include <Array_set.h>
    typedef Array_set<DOUBLEP> SET;
    #endif

    SET p, q;
The program uses a Stopwatch(3C++) for timing. It uses the Array Algorithm shuffle(), described in shuffle(3C++), to obtain good averages (see below).
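For readers unfamiliar with these components, here is a minimal sketch of how they are used on their own. The demo() wrapper is purely illustrative; the member functions start(), stop(), reset(), and user(), and the two-pointer range form of shuffle(), are exactly the calls exercised in the timing loops shown later.

    #include <Stopwatch.h>      // see Stopwatch(3C++)
    #include <Array_alg.h>      // see shuffle(3C++)

    Stopwatch w;

    void demo(double** a, int n){
        shuffle(a, a + n);      // apply a random permutation to a[0..n-1]
        w.reset();              // clear any previously accumulated time
        w.start();
        // ... code to be timed ...
        w.stop();
        // w.user() now reports the accumulated user time
    }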
The data used for the comparison consisted of N unique pseudo-random pointers:
    main(int, char* argv[]){
        int N = atoi(argv[1]);
        DOUBLEP* data = new DOUBLEP[N];
        for(int i = 0; i < N; i++){
            do{
                data[i] = (DOUBLEP)rand();      // see rand(3C)
            }while(p.contains(data[i]));
            p.insert(data[i]);
        }
        ...
    }
Note that uniqueness is guaranteed by testing each pointer for containment before inserting it into p. For our first test, we repeatedly emptied p into q, then q into p, and so on. We timed the repetitions using the Stopwatch and took an average over the number of repetitions. To get a good average, we used shuffle() to apply a random permutation to the data at the end of each pass:
    int NREPS = 10000/N;
    w.start();
    for(i = 0; i < NREPS; i++){
        for(int ii = 0; ii < N; ii++){
            assert(p.remove(data[ii]));
            assert(q.insert(data[ii]));
        }
        for(ii = 0; ii < N; ii++){
            assert(q.remove(data[ii]));
            assert(p.insert(data[ii]));
        }
        shuffle(data, data+N);
    }
    w.stop();
    cout << "insert/remove time = " << w.user()/NREPS << "\n";
This test showed that, for insertion and removal, Array_set outperforms Set_of_p for values of N up to about 8. In terms of order estimates, the O(1) complexity of the single-element Set_of_p operations quickly overtakes the O(N) complexity of the corresponding Array_set operations.
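One way to pin down this break-even point more precisely is simply to rerun the driver over a range of set sizes. The sketch below is not part of the original program; time_one_size() is a hypothetical helper assumed to build a set of the given size, run the insert/remove loop above, and return the averaged time.

    // Hypothetical sweep to locate the Array_set / Set_of_p crossover.
    extern double time_one_size(int N);     // assumed wrapper around the loop above

    void sweep(){
        for(int N = 2; N <= 32; N *= 2){
            cout << "N = " << N
                 << ", insert/remove time = " << time_one_size(N) << "\n";
        }
    }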
Next, we timed the contains operation for a variety of elements known to be either members (data) or non-members (data2):
    DOUBLEP* data2 = new DOUBLEP[N];
    for(i = 0; i < N; i++){
        do{
            data2[i] = (DOUBLEP)rand();
        }while(p.contains(data2[i]));
    }

    w.reset();
    w.start();
    for(int n = 0; n < NREPS; n++){
        for(i = 0; i < N; i++){
            assert(p.contains(data[i]));        // known members
        }
        for(i = 0; i < N; i++){
            assert(!p.contains(data2[i]));      // known non-members
        }
    }
    w.stop();
    cerr << "contains (yes/no) time = " << w.user()/NREPS << "\n";
This test showed that Array_set outperforms Set_of_p for values of N up to 9.
The story changes dramatically when we compare operations that are O(N) in both implementations. Consider iteration over all elements. Array_set iteration is little more than a for-loop over an array; in terms of order estimates, the constant multiplier for iteration is very close to 1. Set_of_p iteration is a walk over a complicated N-ary tree structure; in terms of order estimates, the constant multiplier is relatively large. We would therefore expect Array_set iteration to outperform Set_of_p iteration even for small values of N, and to become progressively better as N grows. We would expect the same for the algebraic operations (union, difference, and so on) and the relational operations (equality, subset, superset), which are also O(N) in both implementations; a sketch of how such timings can be gathered follows the tables below. Actual measurements bear this out (times are in milliseconds):
    N = 100          Array_set    Set_of_p
    equality             0.5         11.2
    union                2.5         14.5
    difference           3.3         28.2

    N = 1000         Array_set    Set_of_p
    equality             5.0         71.7
    union               11.6         89.0
    difference          20.0        161.7
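Timings like these can be gathered with loops of the same shape as the earlier ones; a sketch follows. It is illustrative only: the operators used here for union, difference, and equality (|, -, ==) are assumptions made for the sketch, not necessarily the interfaces documented in Set(3C++) and Array_set(3C++).

    // Illustrative sketch only; the operators |, - and == are assumed.
    w.reset();
    w.start();
    for(i = 0; i < NREPS; i++){
        SET u = p | q;          // union of the two sets
        SET d = p - q;          // difference
        assert(!(p == q));      // equality test; p and q differ at this point
    }
    w.stop();
    cout << "algebraic/relational time = " << w.user()/NREPS << "\n";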
In summary, for applications emphasizing O(1) operations (insertion, removal, membership testing), Set_of_p overtakes Array_set at around 10 elements, and becomes progressively better as the number of elements increases. For applications emphasizing O(N) operations (iteration, algebraic, relational operations), Array_set is faster than Set_of_p for all values of N and becomes progressively better as the size increases.
A similar comparison of the current implementation of Map(3C++) against a Block(3C++) and Array_alg(3C++) implementation of the same specification gave even more dramatic results: insertion breaks even at 155 elements, membership testing at 80 elements, and iteration is always faster in the array-based implementation.
If your application runs on a machine with limited memory, executable size is probably just as important to you as execution speed. To give an idea of relative sizes of programs using Array_set and the current Set implementation, we compiled and linked the following two programs:
    // main1, compiled and linked with:  CC -o main1 -l++
    #include <Set.h>
    main(){ }

    // main2, compiled and linked with:  CC -o main2 -l++
    #include <Array_set.h>
    main(){ }
On a Sun 3, the stripped executable sizes of main1 and main2 were 82K bytes and 41K bytes, respectively.