It is known that a single IndexReader can be a limiting factor in a multithreaded environment, so I decided to run the same tests with 1, 2, 4, and 8 IndexReaders (more precisely, I create the IndexReaders, wrap each in an IndexSearcher, and share those IndexSearchers among the query threads).
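For concreteness, here is a minimal sketch of how a fixed set of searchers might be shared among query threads. The `SearcherPool` class and the round-robin policy are my own assumptions (the evaluation code is not released yet), and the Lucene `IndexSearcher` is replaced by a generic type so the sketch is self-contained; with Lucene, each element would be built as `new IndexSearcher(IndexReader.open(directory))`.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: share N searchers among many query threads.
// S stands in for org.apache.lucene.search.IndexSearcher.
class SearcherPool<S> {
    private final S[] searchers;
    private final AtomicInteger next = new AtomicInteger();

    SearcherPool(S[] searchers) {
        this.searchers = searchers;
    }

    // Round-robin: each call hands out the next searcher in turn,
    // spreading query load evenly across the underlying IndexReaders.
    // floorMod keeps the index valid even after the counter overflows.
    S acquire() {
        int i = Math.floorMod(next.getAndIncrement(), searchers.length);
        return searchers[i];
    }
}
```

Each query thread would call `acquire()` once per query; because `IndexSearcher` is thread-safe, no searcher is ever checked back in.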
Below are the results. The test environment is the same as in my previous evaluation, except:
- It goes up to 8192 threads instead of the original 4096 threads
- I had to pass -Xmx4000m to the Java VM because it was running out of heap with 8 readers
- I've made the graph larger
As you can see, 2, 4, and 8 readers significantly improve the query rate over a single shared reader between roughly 10 and 512 threads. The overall winner is 4 readers, which shows a marked improvement over 2 and 8 readers in the range from 16 threads to about 512 threads. Beyond that point, all reader counts perform effectively the same.
I am not sure why 4 readers appears to be the sweet spot for this particular configuration. I will have to re-run this experiment with a finer granularity of readers (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16 readers). However, remember that this machine is a dual-CPU, dual-core configuration (4 physical cores, hyper-threaded to 8 virtual cores): it may be that matching the number of readers to the number of physical cores helps, perhaps by reducing contention or context switching. I am not an expert; with more evaluations we may be able to comment more intelligently on this.
This is just one data point, but I hope it will be helpful.
I am still planning to release the code for the evaluation, along with the gnuplot scripts used to plot the results.
I would appreciate any feedback.
This work was done as part of a Lucene evaluation for my employer, CISTI, the National Research Council Canada. I work with Lucene as it is relevant to my research, with an example of some of my Lucene-based research here: Ungava.