Why do we need Beeline?

Dec 18, 2017 in Data Analytics by Learner
4,050 views

1 answer to this question.

0 votes

Hey Learner,

Hope you're doing great.

Please take a look at the following:

Introduction: 

As Hive development has shifted from the original Hive server (HiveServer1) to the new server (HiveServer2), users and developers accordingly need to switch to the new client tool. However, there’s more to this process than simply switching the executable name from “hive” to “beeline”.

HiveServer2 is a service which needs to be started manually. 

Beeline is the command-line interface for HiveServer2, the newer Hive server.

We do not have a ready-made document on Beeline, but we have tried to frame an answer for you and have included some links that will help you understand it.

In its original form, Apache Hive was a heavyweight command-line tool that accepted queries and executed them utilizing MapReduce. Later, the tool was split into a client-server model, in which HiveServer1 is the server (responsible for compiling and monitoring MapReduce jobs) and Hive CLI is the command-line interface (sends SQL to the server).

Recently, the Hive community introduced HiveServer2 which is an enhanced Hive server designed for multi-client concurrency and improved authentication that also provides better support for clients connecting through JDBC and ODBC.

Now HiveServer2, with Beeline as the command-line interface, is the recommended solution; HiveServer1 and Hive CLI are deprecated and the latter won’t even work with HiveServer2.

Please refer to the links below, which give a brief overview of Beeline and how it differs from the Hive CLI.

Link: http://blog.cloudera.com/blog/2014/02/migrating-from-hive-cli-to-beeline-a-primer/

Link: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients

HiveServer1:

HiveServer1 is a client-server service.

It allows users to connect through the Hive CLI and through Thrift clients.

It supports remote client connections, but only one client can connect at a time.

It has no session management support.

It has no concurrency control, a limitation of its Thrift API.

Difference between the Hive CLI and Beeline (HiveServer2):

The primary difference between the two involves how the clients connect to Hive. The Hive CLI connects directly to the Hive Driver and requires that Hive be installed on the same machine as the client. However, Beeline connects to HiveServer2 and does not require the installation of Hive libraries on the same machine as the client. Beeline is a thin client that also uses the Hive JDBC driver but instead executes queries through HiveServer2, which allows multiple concurrent client connections and supports authentication.
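As a rough illustration of the connection model described above, the sketch below builds the JDBC URL that Beeline uses to reach HiveServer2 and the matching one-shot Beeline command. It assumes HiveServer2 is listening on its default port 10000 and uses Beeline's -u (JDBC URL), -n (user name) and -e (query) options. The host name, user name and helper function names are hypothetical and only there for illustration.

```python
import shlex


def hive2_jdbc_url(host, port=10000, database="default"):
    """Build a HiveServer2 JDBC URL: jdbc:hive2://host:port/database.

    Port 10000 is assumed here because it is HiveServer2's default port.
    """
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(host, user, query, port=10000, database="default"):
    """Return the argument list for running a single query through Beeline.

    -u gives the JDBC URL, -n the user name, -e the query to execute.
    """
    url = hive2_jdbc_url(host, port, database)
    return ["beeline", "-u", url, "-n", user, "-e", query]


if __name__ == "__main__":
    # Hypothetical host and user; replace with your own HiveServer2 details.
    cmd = beeline_command("hs2.example.com", "learner", "SHOW TABLES;")
    # Print a shell-safe version of the command instead of running it,
    # since running it needs Beeline and a reachable HiveServer2.
    print(shlex.join(cmd))
```

The printed command can be pasted into a shell on any machine that has the Beeline client available. The query itself runs on the HiveServer2 side, which is why several such clients can be connected at the same time.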

Cloudera's Sentry security is enforced through HiveServer2, not through the Hive CLI, so queries run through the Hive CLI will not follow the policies defined in Sentry. According to the Cloudera docs, you should not use the Hive CLI or WebHCat; use Beeline or impala-shell instead.

Please let us know if this helps or the issue is still there.

We're eagerly waiting for your response.

answered Dec 18, 2017 by Sudhir
• 1,610 points
