Saturday, April 22, 2017

Apache Hive—Hive CLI vs Beeline

Lineage of Apache Hive
  1. Original model 
    • A heavyweight command-line tool that accepted queries and executed them using MapReduce
  2. Client-server model
    1. Hive CLI + HiveServer1
    2. Beeline + HiveServer2 (HS2)
In this article, we will examine the differences between Hive CLI and Beeline, with particular attention to the new Hive CLI implementation (i.e., Beeline + embedded HS2).

Hive CLI vs Beeline

Unlike Hive CLI, which is an Apache Thrift-based client, Beeline is a JDBC client based on the SQLLine CLI, although the JDBC driver it uses communicates with HiveServer2 through HiveServer2’s Thrift APIs.
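
To make the JDBC side of this concrete, here is a minimal sketch of how a Beeline connection string for HiveServer2 is typically assembled. The helper name hs2_jdbc_url and the host, port, and database values are assumptions for illustration only; 10000 is the stock HiveServer2 port, but your cluster may use a different one.

```shell
#!/bin/sh
# Hypothetical helper: build a HiveServer2 JDBC URL.
# Defaults (localhost, 10000, default) are placeholders, not values
# taken from the article.
hs2_jdbc_url() {
    host=${1:-localhost}
    port=${2:-10000}
    db=${3:-default}
    printf 'jdbc:hive2://%s:%s/%s\n' "$host" "$port" "$db"
}

# Print the URL that Beeline would be given for the default placeholders.
hs2_jdbc_url

# Example use against a running HiveServer2 (not executed here):
#   beeline -u "$(hs2_jdbc_url hs2.example.com 10000 default)" -n "$USER"
```

Here -u passes the JDBC URL and -n the user name. Hive CLI takes no such URL, which is the practical difference between the two clients described above.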

In the latest Apache Hive, both "Hive CLI" and Beeline are supported through a single launcher invocation:
exec "${HIVE_HOME}/bin/hive.distro" "$@"
For example, each command-line interface can be invoked as follows:

Hive CLI
$ hive --service cli --help

Beeline
$ hive --service beeline --help

Using Hive (version: 1.2.1000.) as an example, here is the list of available services:
beeline cleardanglingscratchdir cli help hiveburninclient hiveserver2 hiveserver hwi jar lineage metastore metatool orcfiledump rcfilecat schemaTool version
Note that the "beeline" command is equivalent to "hive --service beeline".
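
As a small illustration of the service names listed above, the sketch below checks a name against that list before printing the long-form command. Both function names are hypothetical helpers, and the list is copied from the output shown above, so it may not match other Hive versions.

```shell
#!/bin/sh
# Service names copied from the listing above; other Hive versions
# may expose a different set.
HIVE_SERVICES="beeline cleardanglingscratchdir cli help hiveburninclient hiveserver2 hiveserver hwi jar lineage metastore metatool orcfiledump rcfilecat schemaTool version"

# Hypothetical helper: succeed if $1 is one of the known service names.
is_hive_service() {
    case " $HIVE_SERVICES " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the long-form command for a service,
# or fail with a message on stderr for an unknown name.
hive_service_cmd() {
    if is_hive_service "$1"; then
        printf 'hive --service %s\n' "$1"
    else
        echo "unknown Hive service: $1" >&2
        return 1
    fi
}

# Long form of the plain "beeline" command.
hive_service_cmd beeline
```

Validating the name first avoids passing a mistyped service to the launcher.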

Hive CLI (New)

Because of the wide use of Hive CLI, the Hive community is replacing Hive CLI's implementation with a new Hive CLI on top of Beeline plus embedded HiveServer2 (HIVE-10511) so that the Hive community only needs to maintain a single code path.[2]

In this way, the new Hive CLI is just an alias to Beeline at two levels:
  • Shell script level
  • High code level

Using JMH to measure the average time cost of retrieving a data set, the community has reported no clear performance gap between the new Hive CLI and Beeline when retrieving data.

Interactive Shell Commands Support

When $HIVE_HOME/bin/hive is run without either the -e or -f option, it enters interactive shell mode. To learn more, see the Hive CLI documentation [4].
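
The rule above can be sketched as a small decision function. The helper hive_mode is hypothetical and deliberately simplified: it only looks for -e or -f among the arguments and does not parse the other Hive CLI options.

```shell
#!/bin/sh
# Hypothetical helper: report which mode `hive` would start in for a
# given argument list. "batch" when -e or -f is present, otherwise
# "interactive". Simplified: it does not parse other options.
hive_mode() {
    for arg in "$@"; do
        case "$arg" in
            -e|-f)
                echo batch
                return 0
                ;;
        esac
    done
    echo interactive
}

hive_mode                        # no -e or -f: interactive shell
hive_mode -e 'SHOW DATABASES;'   # inline query: batch
hive_mode -f script.hql          # query file (placeholder name): batch
```

With a real installation, the corresponding invocations would be hive alone for the interactive shell, hive -e '<query>' for an inline query, and hive -f <file> for a script file.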


With HiveServer2 (HS2), Beeline is the recommended command-line interface. To learn more, read the following references:


  1. Migrating from Hive CLI to Beeline: A Primer
  2. Replacing the Implementation of Hive CLI Using Beeline
  3. Setting up HiveServer2 (Apache Hive)
  4. Hive CLI
  5. HiveServer2 Clients (Apache) 
  6. SQLLine Manual
  7. Beeline—Command Line Shell
  8. Embedded mode
    • Running Hive client tools with embedded servers is a convenient way to test a query or debug a problem. While both Hive CLI and Beeline can embed a Hive server instance, you would start them in embedded mode in slightly different ways.
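
As a rough sketch of that difference, the helper below prints the commands commonly used to start each client in embedded mode: Hive CLI with no connection details, and Beeline with a JDBC URL that has no host or port. The function name is hypothetical, and the commands should be checked against the documentation for your Hive version.

```shell
#!/bin/sh
# Hypothetical helper: print the usual embedded-mode launch command
# for a client name ("cli" or "beeline").
embedded_launch_cmd() {
    case "$1" in
        cli)
            # Hive CLI: start the client with no connection details.
            echo 'hive'
            ;;
        beeline)
            # Beeline: a JDBC URL with no host or port selects an
            # embedded HiveServer2.
            echo 'beeline -u jdbc:hive2://'
            ;;
        *)
            echo "unknown client: $1" >&2
            return 1
            ;;
    esac
}

embedded_launch_cmd cli
embedded_launch_cmd beeline
```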