[LIVY-616] Livy Server discovery by o-shevchenko · Pull Request #189 · apache/livy

o-shevchenko · 2019-07-31T17:30:13Z

What changes were proposed in this pull request?

JIRA ticket https://issues.apache.org/jira/browse/LIVY-616.
Implement Livy Server discovery mechanism based on ZooKeeper. And provide convenient API for clients to get LivyServer URL without explicitly set it manually by "livy.server.host" property. Then we also can use it in LivyClientBuilder to set URL automatically. It will provide a more easy way for configuration. We don't need to set "livy.server.host" or "zeppelin.livy.url" manually in a case when we don't know where Livy Server will be started.

I have separated ZooKeeper logic from StateStore logic, as a result, we can use new ZooKeeperManager just to communicate with ZooKeeper and build new logic based on it (as I did for DiscoveryManager and ZooKeeperStateStore)

How was this patch tested?

Added unit tests for new functionality.
Tested manually:
After starting LivyServer we can get Livy Server URI from ZooKeeper

Also verified on a secured cluster.

What the best way to add new API?
I see the couple of ways to do that:

Add it to LivyConf - in this case, we need to create DiscoveryManager in LivyConf which is not good.
Add it to LivyClient interface and set some variable during creating client via LivyClientBuilder.

Please, advise.

o-shevchenko · 2019-08-01T08:54:56Z

What IDEA plugin do you use for import ordering?

o-shevchenko · 2019-08-01T11:03:02Z

BinaryThriftServerSuite."fetch different data types" looks flaky

https://travis-ci.org/apache/incubator-livy/jobs/566398488

 BinaryThriftServerSuite:
 2328- Reuse existing session (30 seconds, 746 milliseconds)
 2329- fetch different data types *** FAILED *** (4 seconds, 270 milliseconds)
 2330 java.sql.SQLException: java.util.concurrent.ExecutionException: java.io.IOException: RSCClient instance stopped.
 2331 at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:401)
 2332 at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:266)
 2333 at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:497)
 2334 at org.apache.livy.thriftserver.CommonThriftTests$class.dataTypesTest(ThriftServerSuites.scala:31)
 2335 at org.apache.livy.thriftserver.BinaryThriftServerSuite.dataTypesTest(ThriftServerSuites.scala:74)
 2336 at org.apache.livy.thriftserver.BinaryThriftServerSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(ThriftServerSuites.scala:98)
 2337 at org.apache.livy.thriftserver.BinaryThriftServerSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(ThriftServerSuites.scala:97)
 2338 at org.apache.livy.thriftserver.ThriftServerBaseTest$$anonfun$withJdbcStatement$1.apply(ThriftServerBaseTest.scala:102)
 2339 at org.apache.livy.thriftserver.ThriftServerBaseTest$$anonfun$withJdbcStatement$1.apply(ThriftServerBaseTest.scala:99)
 2340 at org.apache.livy.thriftserver.ThriftServerBaseTest.withJdbcConnection(ThriftServerBaseTest.scala:92)

Created separated ticket https://issues.apache.org/jira/browse/LIVY-618

codecov-io · 2019-08-01T11:05:24Z

Codecov Report

Merging #189 into master will decrease coverage by 0.04%.
The diff coverage is 87.5%.

@@             Coverage Diff              @@
##             master     #189      +/-   ##
============================================
- Coverage     68.71%   68.66%   -0.05%     
- Complexity      908      909       +1     
============================================
  Files           100      103       +3     
  Lines          5674     5710      +36     
  Branches        854      854              
============================================
+ Hits           3899     3921      +22     
- Misses         1219     1231      +12     
- Partials        556      558       +2

Impacted Files	Coverage Δ	Complexity Δ
...he/livy/server/recovery/FileSystemStateStore.scala	`63.04% <ø> (ø)`	`11 <0> (ø)`	⬇️
...c/main/java/org/apache/livy/LivyClientBuilder.java	`83.05% <ø> (ø)`	`17 <0> (ø)`	⬇️
...a/org/apache/livy/server/recovery/StateStore.scala	`78.94% <ø> (-4.39%)`	`1 <0> (ø)`
...c/src/main/java/org/apache/livy/rsc/RSCClient.java	`75.15% <ø> (+1.24%)`	`20 <0> (ø)`	⬇️
...n/java/org/apache/livy/client/http/HttpClient.java	`86.15% <100%> (+0.66%)`	`16 <1> (+1)`	⬆️
...ala/org/apache/livy/scalaapi/LivyScalaClient.scala	`83.87% <100%> (+0.53%)`	`8 <1> (+1)`	⬆️
...rver/src/main/scala/org/apache/livy/LivyConf.scala	`96.01% <100%> (+0.16%)`	`21 <0> (ø)`	⬇️
...org/apache/livy/server/recovery/SessionStore.scala	`80% <100%> (ø)`	`10 <1> (ø)`	⬇️
...che/livy/server/recovery/BlackholeStateStore.scala	`100% <100%> (ø)`	`5 <1> (ø)`	⬇️
.../org/apache/livy/server/discovery/JsonMapper.scala	`100% <100%> (ø)`	`0 <0> (?)`
... and 16 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1ce266d...18c10ed. Read the comment docs.

o-shevchenko · 2019-08-01T16:16:33Z

I have added API for getting Livy Server address to LivyClient, HttpClient, RSCClient and LivyScalaClient. I don't see a way to forward Livy address from DiscoveryManager to HttpClient or LivyScalaClient (we need to add a dependency on livy-server to client-http in this case, I'm not sure that this is acceptable).

So, from client code we can use:

val address = DiscoveryManager.getServerUri() // here we can use it in REST requests
val client = LivyClientBuilder.setURI(s"http://$address").build // or create client 
val futureUri = client.getServerUri // and get address from client if needed

val scalaClient = new LivyScalaClient(client)
val futureUri = scalaClient.getServerUri

Added scaladocs, small refactoring.

o-shevchenko · 2019-08-02T10:01:44Z

Refactored configuration to provide a more flexible mechanism for configuration, simplified code

# Livy Server discovery
# ZooKeeper quorum URLs, e.g. host1:port1,host2:port2
# livy.zookeeper.url =
# Name of base Livy znode. Default livy
# livy.zookeeper.namespace = livy
# Name of Livy Server znode. Uses livy.zookeeper.namespace as parent.
# By default will be /livy/server.uri
# livy.server.zookeeper.namespace = server.uri
# Number of trials to establish the connection to ZooKeeper quorum
# livy.server.zookeeper.connection.max.retries = 3
# Sleep time between connection retries to ZooKeeper quorum
# livy.server.zookeeper.connection.retry.interval.ms = 500

jerryshao · 2019-08-02T12:14:12Z

Hi @o-shevchenko thanks a lot for your contribution. It would be better to have a design doc to describe what you're trying to resolve and what is your solution. It's really hard for the reviewer to understand your code without any documentation.

o-shevchenko · 2019-08-02T12:51:39Z

Thanks a lot @jerryshao for your comment. I'll prepare documentation about the solution.
I'm just adding storing of Livy Server address/uri to ZooKeeper to be able to get it via some Livy API.

o-shevchenko · 2019-08-02T15:25:52Z

@jerryshao you can find documentation in the Jira ticket: https://issues.apache.org/jira/browse/LIVY-616.
Please, let me know if something is unclear and I need to add more info.
Thanks!

… as mixin it instead of creating as field

o-shevchenko · 2019-08-28T18:53:10Z

@jerryshao Any updates?
As far as I see, there are other PRs (#212 and #193) which adding similar functionality and trying to solve similar problems. We need to unify approaches and reuse the same properties and functionality. Because contributors do the same work in parallel (for example do some refactoring of code and configurations). Want to save time for all of us.

yiheng · 2019-09-02T08:34:50Z

+  private def resolvedSeverHost(livyConf: LivyConf) = {
+    val host = livyConf.get(LivyConf.SERVER_HOST)
+    if (host.equals(livyConf.get(LivyConf.SERVER_HOST.dflt.toString))) {
+      InetAddress.getLocalHost.getHostName


I do not quite understand the if condition here. Could you please explain it a bit? Thanks.

And another question is why we return the hostname instead of IP here? IMHO, IP should be more convenient, as hostname user needs to update their hosts file or set up the name service.

You can find more detailed info in the specification which attached to the Jira ticket

To enable Livy Server discovery we need to set livy.zookeeper.url. During start Livy Server we save URI to ZooKeeper. If we specify host by livy.server.host then we will use this value. if we use the default value “0.0.0.0” the local host name will be stored(InetAddress.getLocalHost.getHostName) to be able to communicate from other nodes. We will store address as java.net.URI (in format schema://host:port, e.g.http://livyhostname:8998)

So, basically it's needed to resolve Livy Server address by replacing 0.0.0.0 default value with the correct IP address if user didn't specify any Livy Server address in the conf file. In this case, we can use InetAddress.getLocalHost.getHostName to get the address from the node where LivyServer was started.

See your point, InetAddress.getLocalHost.getHostAddress might be the better choice here.
Thanks!

From your description, should it be

if (host.equals(LivyConf.SERVER_HOST.dflt.toString)) { ...

oh, I see
it's my mistake.
good catch!
livyConf.get(LivyConf.SERVER_HOST.dflt.toString) not make a sense in this case, looks like it's typical copy problem

PR was updated

yiheng · 2019-09-02T09:10:51Z

+  * @param livyConf - Livy configurations
+  * @param mockCuratorClient - used for testing
+  */
+class LivyDiscoveryManager(val livyConf: LivyConf,


Should we consider providing a generic method to publishing configuration key/value to zookeeper?

In some other scenario, e.g. this PR #193 , a bunch of hive related configurations need to be published.

If we have a generic method publish configuration key/value, it may be easier to let other components to leverage the code.

We already have one in ZooKeeperManager
LivyDiscovery is a hight level of abstraction which is used for better understanding, single responsibility and API simplification

I just refactored StateStore to move functionality for communication with ZK from ZooKeeperStateStore.scala to separated trait ZooKeeperManager.scala. In ZooKeeperStateStore we will use ZooKeeperManager to store data in ZK. ZooKeeperManager code was not changed except couple small improvements for configurations to make it more flexible.

We can use this trait everywhere when we need to store something to ZooKeeper since it's generic:

def setData(key: String, value: Object): Unit = { val prefixedKey = prefixKey(key) val data = serializeToBytes(value) if (exist(prefixedKey)) { curatorClient.setData().forPath(prefixedKey, data) } else { curatorClient.create().creatingParentsIfNeeded().forPath(prefixedKey, data) } } def getData[T: ClassTag](key: String): Option[T] = { val prefixedKey = prefixKey(key) if (exist(prefixedKey)) { Option(deserialize[T](curatorClient.getData().forPath(prefixedKey))) } else { None } }

ZooKeeperManager should be used every time by design when we need to store something to ZK. I think we need to have one logic for ZK interaction for LivyServer discovery, ZK state store, Livy HA, Livy Thrift Server HA. That's why I moved this functionality to the separated trait, to make it reuseable for someone else.

Also, I have tried to make all configurations more generic and configurable on the top level to use it everywhere. As far as I see, for now, we have a couple of PRs with different ZK logic which can't be reused a lot and I see too specific confs like livy.server.thrift.zookeeper.quorum instead of simple common livy.zookeeper.url which can be reused. In this PR I want to not only add LivyServer discovery but also create init structure for subsequent work with ZooKeeper.

codecov-commenter · 2026-05-23T01:54:13Z

Codecov Report

❌ Patch coverage is 87.50000% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.51%. Comparing base (1ce266d) to head (5c47b09).
⚠️ Report is 144 commits behind head on master.

Files with missing lines	Patch %	Lines
...pache/livy/server/discovery/ZooKeeperManager.scala	85.29%	2 Missing and 3 partials ⚠️
...main/scala/org/apache/livy/server/LivyServer.scala	83.33%	1 Missing and 1 partial ⚠️
...e/livy/server/discovery/LivyDiscoveryManager.scala	77.77%	2 Missing ⚠️
...che/livy/server/recovery/ZooKeeperStateStore.scala	85.71%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##             master     #189      +/-   ##
============================================
- Coverage     68.71%   68.51%   -0.20%     
- Complexity      908      919      +11     
============================================
  Files           100      103       +3     
  Lines          5674     5746      +72     
  Branches        854      866      +12     
============================================
+ Hits           3899     3937      +38     
- Misses         1219     1243      +24     
- Partials        556      566      +10

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

o-shevchenko added 2 commits July 31, 2019 20:16

[LIVY-616] Livy Server discovery

18254ff

scalastyle import order and line lengths

6526052

o-shevchenko added 2 commits August 1, 2019 13:10

reorganized imports to fit project scalastyle requirements

5f30372

Trigger, flaky tests

9731aae

added Java/Scala API, scaladocs, small refactoring

64dabaf

o-shevchenko added 7 commits August 1, 2019 19:23

scalastyle

a7fd7d8

Trigger, flaky tests

aacfd9f

Trigger, flaky tests

d693ea3

Trigger, flaky tests

729beda

Trigger, flaky tests

27a7ffc

Trigger, flaky tests

104f66b

make more configurable

5baf435

o-shevchenko added 2 commits August 2, 2019 13:04

fix typo

5d7ea28

Trigger, flaky tests

728deb9

honor livy.server.host

a71c0ef

remove package access

1d0add4

jerryshao mentioned this pull request Aug 13, 2019

[LIVY-621]add dynamic service discovery for thrift server #193

Open

o-shevchenko added 4 commits August 14, 2019 11:03

update scaladoc

7e57010

Trigger, flaky tests

627e455

refactored, changed class ZooKeeperManager to trait to be able to add…

df5a395

… as mixin it instead of creating as field

Logging already mixed to ZooKeeperManager

18c10ed

yantzu mentioned this pull request Aug 27, 2019

[LIVY-11] Enable HA support #212

Open

yiheng reviewed Sep 2, 2019

View reviewed changes

o-shevchenko added 6 commits September 2, 2019 12:19

fixed problem with host resolving

7d655fc

fixed test

e8dc350

log message improved

5e88b2d

fix scalastyle

47d8852

fix scalastyle

1250fea

Trigger, flaky

5c47b09

Conversation

o-shevchenko commented Jul 31, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changes were proposed in this pull request?

How was this patch tested?

Uh oh!

o-shevchenko commented Aug 1, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

o-shevchenko commented Aug 1, 2019

Uh oh!

codecov-io commented Aug 1, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

o-shevchenko commented Aug 1, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

o-shevchenko commented Aug 2, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jerryshao commented Aug 2, 2019

Uh oh!

o-shevchenko commented Aug 2, 2019

Uh oh!

o-shevchenko commented Aug 2, 2019

Uh oh!

o-shevchenko commented Aug 28, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yiheng Sep 2, 2019

Choose a reason for hiding this comment

Uh oh!

o-shevchenko Sep 2, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yiheng Sep 2, 2019

Choose a reason for hiding this comment

Uh oh!

o-shevchenko Sep 2, 2019

Choose a reason for hiding this comment

Uh oh!

o-shevchenko Sep 2, 2019

Choose a reason for hiding this comment

Uh oh!

yiheng Sep 2, 2019

Choose a reason for hiding this comment

Uh oh!

o-shevchenko Sep 2, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

o-shevchenko Sep 2, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

o-shevchenko Sep 2, 2019

Choose a reason for hiding this comment

Uh oh!

o-shevchenko Sep 2, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

codecov-commenter commented May 23, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

o-shevchenko commented Jul 31, 2019 •

edited

Loading

o-shevchenko commented Aug 1, 2019 •

edited

Loading

codecov-io commented Aug 1, 2019 •

edited

Loading

o-shevchenko commented Aug 1, 2019 •

edited

Loading

o-shevchenko commented Aug 2, 2019 •

edited

Loading

o-shevchenko commented Aug 28, 2019 •

edited

Loading

o-shevchenko Sep 2, 2019 •

edited

Loading

o-shevchenko Sep 2, 2019 •

edited

Loading

o-shevchenko Sep 2, 2019 •

edited

Loading

o-shevchenko Sep 2, 2019 •

edited

Loading