Google BigQuery - Delta between query execution time and Java query call to finish
Context
- Our container cluster is located in us-east1-c.
- We are using the following Java library: google-cloud-bigquery, 0.9.2-beta.
- Our dataset has around 26M rows, which represents ~10 GB.
- All of our queries return fewer than 100 rows, grouping on a specific column.
Question
We analyzed the last 100 queries executed in BigQuery; these executed in 2-3 seconds (we measured this by calling bq --format=prettyjson show -j jobid and taking end time - creation time).
In our Java logs, though, the calls to bigquery.query are blocking for 5-6 seconds (and 10 seconds is not out of the ordinary). What could explain the systematic gap between the query finishing in the BigQuery cluster and the results being available in Java? I know 5-6 seconds isn't astronomical, but I'm curious to see if this is normal behaviour when using the Java BigQuery cloud library.
I didn't dig to the point where I analyzed the outbound call using Wireshark. Our tests were executed in our container cluster (Kubernetes).
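To make the comparison concrete, here is a minimal sketch of the arithmetic behind the gap. It is only an illustration: the class `JobTimings`, its method names, and all the numbers are hypothetical, and it assumes the creation and end times are epoch milliseconds as reported in the job statistics.

```java
import java.time.Duration;

final class JobTimings {
    /** Server-side duration: job end time minus creation time (both assumed epoch millis). */
    static Duration serverSide(long creationTimeMillis, long endTimeMillis) {
        return Duration.ofMillis(endTimeMillis - creationTimeMillis);
    }

    /** Time the client spent blocked beyond the server-side duration. */
    static Duration clientOverhead(Duration clientBlocked, Duration serverSide) {
        return clientBlocked.minus(serverSide);
    }

    public static void main(String[] args) {
        // Hypothetical values, chosen to resemble the figures in the question.
        long creationTime = 1_490_000_000_000L;
        long endTime = 1_490_000_002_500L;
        Duration server = serverSide(creationTime, endTime);
        Duration clientBlocked = Duration.ofMillis(5_600); // as measured in the Java logs

        System.out.println("server-side: " + server.toMillis() + " ms");   // 2500 ms
        System.out.println("client overhead: "
                + clientOverhead(clientBlocked, server).toMillis() + " ms"); // 3100 ms
    }
}
```

The overhead figure is what we are trying to explain; it lumps together network time, result fetching, and any waiting done inside the client library.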
Code
QueryRequest request = QueryRequest.newBuilder(sql)
    .setMaxWaitTime(30000L)
    .setUseLegacySql(false)
    .setUseQueryCache(false)
    .build();
QueryResponse response = bigquery.query(request);
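For completeness, this is roughly how the blocking time around such a call can be measured. It is a sketch using only the JDK: `CallTimer` and `Timed` are hypothetical helper names, and the lambda in `main` is a stand-in. In our case the supplier would be `() -> bigquery.query(request)`.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class CallTimer {
    /** Result of a call together with the wall-clock time spent blocked in it. */
    static final class Timed<T> {
        final T value;
        final long elapsedMillis;

        Timed(T value, long elapsedMillis) {
            this.value = value;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Runs the call and measures how long it blocked, using the monotonic clock. */
    static <T> Timed<T> time(Supplier<T> call) {
        long start = System.nanoTime();
        T value = call.get();
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        return new Timed<>(value, elapsedMillis);
    }

    public static void main(String[] args) {
        // Stand-in for the real call; replace the lambda with the query call being measured.
        Timed<String> timed = time(() -> "stand-in result");
        System.out.println("blocked for " + timed.elapsedMillis + " ms, got: " + timed.value);
    }
}
```

Using System.nanoTime rather than System.currentTimeMillis avoids skew from wall-clock adjustments during the measurement.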
Thank you.
Just looking at the code briefly here: https://github.com/googlecloudplatform/google-cloud-java/blob/master/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/bigqueryimpl.java
It appears that there are multiple potential sources of delay:
- Getting the query results
- Restarting (there are automatic restarts in there that can explain the delay spikes)
- The frequency of checking for new results
It sounds like looking at Wireshark would give you a precise answer of what is happening.
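To illustrate the last point in the list, here is a simplified model of how a fixed polling interval alone can add latency. This is not the library's actual polling logic: the class `PollingDelay`, the fixed-interval assumption, and the numbers are all hypothetical, and network round trips are ignored.

```java
final class PollingDelay {
    /**
     * Time (ms after submission) at which a client that polls every
     * pollIntervalMillis, starting at t = pollIntervalMillis, first sees
     * a job that finished at jobDoneMillis.
     */
    static long observedCompletionMillis(long jobDoneMillis, long pollIntervalMillis) {
        if (jobDoneMillis < 0 || pollIntervalMillis <= 0) {
            throw new IllegalArgumentException("times must be non-negative, interval positive");
        }
        // Number of polls needed: ceil(jobDone / interval), but at least one poll.
        long polls = Math.max(1, (jobDoneMillis + pollIntervalMillis - 1) / pollIntervalMillis);
        return polls * pollIntervalMillis;
    }

    public static void main(String[] args) {
        long jobDone = 2_500; // hypothetical: query finishes server-side after 2.5 s
        for (long interval : new long[] {1_000, 2_000}) {
            long seenAt = observedCompletionMillis(jobDone, interval);
            System.out.println("interval " + interval + " ms: seen at " + seenAt
                    + " ms, added delay " + (seenAt - jobDone) + " ms");
        }
        // interval 1000 ms: seen at 3000 ms, added delay 500 ms
        // interval 2000 ms: seen at 4000 ms, added delay 1500 ms
    }
}
```

Under this model the added delay is always less than one polling interval, so polling alone would not account for a 3-second gap unless the interval is long or combined with the other sources above.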