Why do we need Beeline?

0 votes
Why do we need Beeline?
Dec 18, 2017 in Data Analytics by Learner
15,812 views

2 answers to this question.

0 votes

Hey Learner,

Hope you're doing great.

Please take a look at the following.

Introduction: 

As Hive development has shifted from the original Hive server (HiveServer1) to the new server (HiveServer2), users and developers accordingly need to switch to the new client tool. However, there’s more to this process than simply switching the executable name from “hive” to “beeline”.

HiveServer2 is a service which needs to be started manually. 

Beeline is the command-line interface for HiveServer2, the newer Hive server.
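
To make the two steps above concrete (start HiveServer2, then point Beeline at it), here is a minimal sketch in Python that only builds the command lines and does not run them. The host name `hs2.example.com` and the user `learner` are placeholders chosen for illustration, and port 10000 is assumed because it is the usual default for HiveServer2; please adjust them for your cluster.

```python
import shlex

# Usual way to launch HiveServer2 from a Hive installation.
HIVESERVER2_START = ["hive", "--service", "hiveserver2"]


def beeline_command(host, port=10000, database="default", user=None):
    """Build the argv for a Beeline session against HiveServer2.

    host, port, database and user are illustrative parameters; the
    defaults are assumptions, not values taken from any real cluster.
    """
    # Beeline reaches HiveServer2 through a JDBC URL with the hive2 scheme.
    url = f"jdbc:hive2://{host}:{port}/{database}"
    argv = ["beeline", "-u", url]
    if user is not None:
        # -n passes the user name to connect as.
        argv += ["-n", user]
    return argv


if __name__ == "__main__":
    # Print the commands so they can be copied into a terminal.
    print(shlex.join(HIVESERVER2_START))
    print(shlex.join(beeline_command("hs2.example.com", user="learner")))
```

Building the commands as lists keeps quoting explicit, and the same lists could be handed to `subprocess.run` on a machine where Hive and Beeline are installed.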

We do not have a ready-made document on Beeline, but we have tried to frame an answer for you and have included some links that will help you understand it.

In its original form, Apache Hive was a heavyweight command-line tool that accepted queries and executed them utilizing MapReduce. Later, the tool was split into a client-server model, in which HiveServer1 is the server (responsible for compiling and monitoring MapReduce jobs) and Hive CLI is the command-line interface (sends SQL to the server).

Later, the Hive community introduced HiveServer2, an enhanced Hive server designed for multi-client concurrency and improved authentication, which also provides better support for clients connecting through JDBC and ODBC.

Now HiveServer2, with Beeline as the command-line interface, is the recommended solution; HiveServer1 and Hive CLI are deprecated and the latter won’t even work with HiveServer2.

Please refer to the links below, which give a brief insight into Beeline and how it differs from the Hive CLI.

Link: http://blog.cloudera.com/blog/2014/02/migrating-from-hive-cli-to-beeline-a-primer/

Link: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients

HiveServer1:

HiveServer1 is a client-server service.

It allows users to connect through the Hive CLI and through Thrift clients.

It supports remote client connections, but only one client can connect at a time.

It has no session management support.

It has no concurrency control, which is a limitation of its Thrift API.

Difference between Hive CLI and Beeline (HiveServer2):

The primary difference between the two involves how the clients connect to Hive. The Hive CLI connects directly to the Hive Driver and requires that Hive be installed on the same machine as the client. However, Beeline connects to HiveServer2 and does not require the installation of Hive libraries on the same machine as the client. Beeline is a thin client that also uses the Hive JDBC driver but instead executes queries through HiveServer2, which allows multiple concurrent client connections and supports authentication.
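
As a rough illustration of that difference, the sketch below translates a simple Hive CLI invocation into a Beeline one. The only structural change is that Beeline needs a JDBC URL for HiveServer2, while the Hive CLI needs none because it runs against the local installation. The function name `hive_cli_to_beeline` is hypothetical, and only the `-e`, `-f`, `--hiveconf` and `-S` options are covered, so please treat it as a starting point and not a full migration tool.

```python
def hive_cli_to_beeline(hive_argv, jdbc_url):
    """Translate a simple `hive ...` argv into an equivalent `beeline ...` argv.

    Hypothetical helper for illustration: handles -e, -f, --hiveconf and -S
    only, and raises ValueError for anything else.
    """
    if not hive_argv or hive_argv[0] != "hive":
        raise ValueError("expected an argv starting with 'hive'")

    # Beeline always needs to be told which HiveServer2 to connect to.
    out = ["beeline", "-u", jdbc_url]

    args = iter(hive_argv[1:])
    for arg in args:
        if arg in ("-e", "-f", "--hiveconf"):
            # These options take one value and keep the same spelling in Beeline.
            try:
                value = next(args)
            except StopIteration:
                raise ValueError(f"{arg} needs a value") from None
            out += [arg, value]
        elif arg == "-S":
            # Silent mode is spelled differently in Beeline.
            out.append("--silent=true")
        else:
            raise ValueError(f"unsupported Hive CLI option: {arg}")
    return out


if __name__ == "__main__":
    # localhost:10000 is an assumed HiveServer2 address, not a real one.
    url = "jdbc:hive2://localhost:10000/default"
    print(hive_cli_to_beeline(["hive", "-e", "SHOW TABLES"], url))
```

Rejecting unknown options keeps the sketch from silently producing a Beeline command that behaves differently from the original Hive CLI call.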

Cloudera's Sentry security works through HiveServer2 and not through HiveServer1, which is what the Hive CLI uses. So queries run through the Hive CLI will not follow the policy from Sentry. According to the Cloudera docs, you should not use Hive CLI and WebHCat. Use Beeline or impala-shell instead.

Please let us know if this helps or the issue is still there.

We're eagerly waiting for your response.

answered Dec 18, 2017 by Sudhir
• 1,610 points
0 votes
With Beeline we can connect to Hive remotely over multiple concurrent connections.

Through Beeline we can connect to the Hive environment even if we don't have the Hive libraries installed.
answered Jul 24, 2020 by K. Raja Yasodhar
