Why is the HBase Java client slow compared to REST/Thrift?


I am running performance tests on the HBase Java client, Thrift, and REST interfaces. I have a table called "airline" that has 500k rows. I am fetching all 500k rows from the table through 4 different Java programs (using the Java client, Thrift, Thrift2, and REST).

Following are the performance numbers for various fetch sizes. For all of these, the batch size is set to 100000.


[Table showing the performance numbers; times are in ms]


I see that there is a performance improvement as I increase the fetch size in the case of REST, Thrift, and Thrift2.

But with the Java API, I am seeing consistent performance irrespective of the fetch size. Why is the fetch size not having an impact in the Java client?

Here is a snippet of my Java program:


    Table table = conn.getTable(TableName.valueOf("airline"));
    Scan scan = new Scan();
    ResultScanner scanner = table.getScanner(scan);
    for (Result[] result = scanner.next(fetchSize); result.length != 0; result = scanner.next(fetchSize)) {
        // process rows
    }


Can anyone help me with this? Am I using the wrong methods/classes for fetching data through the Java client?

Your scanner is not set up to fetch the number of rows you want in a timely manner. In other words, you are tuning the ResultScanner, not the thing that actually does the scan: the Scan object.

I believe the functions you want are, at least in part, the following:

    scan.setCaching
    scan.setCacheBlocks

https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/scan.html

You call these functions on the Scan object before the loop, that is, before passing it to table.getScanner(scan).

Source: Pig's HBaseStorage#initScan function
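
To illustrate why the argument to scanner.next(n) may have little effect while the caching value on the Scan does, here is a minimal toy model in plain Java. It is not the HBase client: ToyScanner, roundTripsFor, and the "round trip" counter are hypothetical names made up for this sketch. It assumes that next(n) simply calls next() up to n times against a client-side cache, and that the cache is refilled with "caching" rows per simulated server round trip. It also ignores other limits a real client may apply, such as a maximum result size.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy model only (hypothetical names); this is NOT the HBase client. */
class ScanCachingModel {

    /** Simulated scanner over rows 0..totalRows-1 with a client-side cache. */
    static class ToyScanner {
        private final int totalRows;
        private final int caching;              // rows fetched per simulated round trip
        private final Queue<Integer> cache = new ArrayDeque<>();
        private int nextRowOnServer = 0;
        int roundTrips = 0;                     // number of simulated server calls

        ToyScanner(int totalRows, int caching) {
            this.totalRows = totalRows;
            this.caching = caching;
        }

        /** Returns the next row, or null when the scan is exhausted. */
        Integer next() {
            if (cache.isEmpty()) {
                if (nextRowOnServer >= totalRows) {
                    return null;
                }
                roundTrips++;                   // one round trip refills the cache
                int end = Math.min(totalRows, nextRowOnServer + caching);
                while (nextRowOnServer < end) {
                    cache.add(nextRowOnServer++);
                }
            }
            return cache.poll();
        }

        /** Assumed shape of next(n): call next() up to n times. */
        List<Integer> next(int n) {
            List<Integer> out = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                Integer row = next();
                if (row == null) {
                    break;
                }
                out.add(row);
            }
            return out;
        }
    }

    /** Drains the scanner with the loop shape from the question. */
    static int roundTripsFor(int totalRows, int caching, int fetchSize) {
        ToyScanner scanner = new ToyScanner(totalRows, caching);
        for (List<Integer> batch = scanner.next(fetchSize);
             !batch.isEmpty();
             batch = scanner.next(fetchSize)) {
            // process rows
        }
        return scanner.roundTrips;
    }

    public static void main(String[] args) {
        int rows = 500_000;
        // Vary only the fetch size passed to next(n); caching stays fixed.
        for (int fetchSize : new int[] {100, 1_000, 10_000}) {
            System.out.println("caching=100 fetchSize=" + fetchSize
                    + " roundTrips=" + roundTripsFor(rows, 100, fetchSize));
        }
        // Vary only the caching value; fetch size stays fixed.
        for (int caching : new int[] {100, 1_000, 10_000}) {
            System.out.println("caching=" + caching + " fetchSize=1000"
                    + " roundTrips=" + roundTripsFor(rows, caching, 1_000));
        }
    }
}
```

In this model the round-trip count stays at 5000 for every fetch size when caching is 100, and drops to 500 and 50 when caching is raised to 1000 and 10000. If the real client behaves similarly, that would be consistent with the flat Java numbers in the question and with the advice to tune the Scan object rather than the ResultScanner.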

