The Impala Shell is a great tool for quickly running exploratory queries, or testing
new features in Impala. While Impala is pretty fast, some queries can still take several
seconds or longer to complete. It’s therefore useful to be able to see how much progress
the query has made and to get an idea of how long the query will take. You can get at a
lot of this information through Impala’s debug webpages (http://<impalad>:25000), but not
every user has access to these pages, and besides, it’s more useful to have this feedback
directly in the tool that you’re using to issue queries.
A better way to track query progress in the Impala shell ships in Impala 2.3 and was implemented as part of IMPALA-80. It gives you live updates on query progress, either as a simple progress bar or as a detailed breakdown of the progress of each operator in the query plan.
There are two new command line flags for the Impala Shell, and two corresponding new options for the shell’s “SET” command. The two new command line flags are:
--live_summary   Print a query summary every 1s while the query is running. [default: False]
--live_progress  Print a query progress every 1s while the query is running. [default: False]
When you want to change the variables from within the Impala shell you can use the new shell options. The shell options are similar to query options, but are only evaluated in the context of the shell.
LIVE_PROGRESS: False
LIVE_SUMMARY: False
Both options can be controlled by setting them to True or False. For example, running SET LIVE_PROGRESS=True; in the shell turns on the progress bar.
The percentage shown by the live progress bar is based on the number of completed versus issued scan ranges. If your query is dominated by non-scan operations (such as joins or aggregations), the bar can therefore show 100% while the query continues to run. In that case, the live query summary gives a better indication of query progress.
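To make the scan-range arithmetic concrete, here is a minimal Python sketch of how such a percentage and bar could be computed. It is not the Impala shell's actual code: the function names, the bar width, and the handling of a query with zero scan ranges are all assumptions made for illustration.

```python
def progress_percent(completed_ranges: int, total_ranges: int) -> float:
    """Percentage of issued scan ranges that have completed (hypothetical helper)."""
    if total_ranges <= 0:
        # Assumption: report 0% for a query with no scan ranges
        # instead of dividing by zero.
        return 0.0
    # Clamp so the result never exceeds 100%.
    return 100.0 * min(completed_ranges, total_ranges) / total_ranges


def render_bar(completed_ranges: int, total_ranges: int, width: int = 20) -> str:
    """Render a fixed-width text progress bar (hypothetical layout)."""
    pct = progress_percent(completed_ranges, total_ranges)
    filled = int(width * pct / 100)
    return "[" + "#" * filled + " " * (width - filled) + "] " + f"{pct:.0f}%"


if __name__ == "__main__":
    # 5 of 20 scan ranges finished: the bar is a quarter full.
    print(render_bar(5, 20))
    # All scan ranges finished: the bar reads 100%, even if a join or
    # aggregation downstream of the scans is still doing work.
    print(render_bar(20, 20))
```

Because only scan ranges enter the calculation, the second call reports 100% no matter how much work remains in the rest of the plan. That gap is what the per-operator live summary fills.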