Skip to content

[LIVY-616] Livy Server discovery#189

Open
o-shevchenko wants to merge 26 commits into
apache:masterfrom
o-shevchenko:LIVY-616
Open

[LIVY-616] Livy Server discovery#189
o-shevchenko wants to merge 26 commits into
apache:masterfrom
o-shevchenko:LIVY-616

Conversation

@o-shevchenko

@o-shevchenko o-shevchenko commented Jul 31, 2019

Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

JIRA ticket https://issues.apache.org/jira/browse/LIVY-616.
Implement Livy Server discovery mechanism based on ZooKeeper. And provide convenient API for clients to get LivyServer URL without explicitly set it manually by "livy.server.host" property. Then we also can use it in LivyClientBuilder to set URL automatically. It will provide a more easy way for configuration. We don't need to set "livy.server.host" or "zeppelin.livy.url" manually in a case when we don't know where Livy Server will be started.

I have separated ZooKeeper logic from StateStore logic, as a result, we can use new ZooKeeperManager just to communicate with ZooKeeper and build new logic based on it (as I did for DiscoveryManager and ZooKeeperStateStore)

How was this patch tested?

Added unit tests for new functionality.
Tested manually:
After starting LivyServer we can get Livy Server URI from ZooKeeper
image

image
Also verified on a secured cluster.

What the best way to add new API?
I see the couple of ways to do that:

  1. Add it to LivyConf - in this case, we need to create DiscoveryManager in LivyConf which is not good.
  2. Add it to LivyClient interface and set some variable during creating client via LivyClientBuilder.

Please, advise.

@o-shevchenko

o-shevchenko commented Aug 1, 2019

Copy link
Copy Markdown
Contributor Author

What IDEA plugin do you use for import ordering?

@o-shevchenko

Copy link
Copy Markdown
Contributor Author

BinaryThriftServerSuite."fetch different data types" looks flaky

https://travis-ci.org/apache/incubator-livy/jobs/566398488

 BinaryThriftServerSuite:
 2328- Reuse existing session (30 seconds, 746 milliseconds)
 2329- fetch different data types *** FAILED *** (4 seconds, 270 milliseconds)
 2330 java.sql.SQLException: java.util.concurrent.ExecutionException: java.io.IOException: RSCClient instance stopped.
 2331 at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:401)
 2332 at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:266)
 2333 at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:497)
 2334 at org.apache.livy.thriftserver.CommonThriftTests$class.dataTypesTest(ThriftServerSuites.scala:31)
 2335 at org.apache.livy.thriftserver.BinaryThriftServerSuite.dataTypesTest(ThriftServerSuites.scala:74)
 2336 at org.apache.livy.thriftserver.BinaryThriftServerSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(ThriftServerSuites.scala:98)
 2337 at org.apache.livy.thriftserver.BinaryThriftServerSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(ThriftServerSuites.scala:97)
 2338 at org.apache.livy.thriftserver.ThriftServerBaseTest$$anonfun$withJdbcStatement$1.apply(ThriftServerBaseTest.scala:102)
 2339 at org.apache.livy.thriftserver.ThriftServerBaseTest$$anonfun$withJdbcStatement$1.apply(ThriftServerBaseTest.scala:99)
 2340 at org.apache.livy.thriftserver.ThriftServerBaseTest.withJdbcConnection(ThriftServerBaseTest.scala:92)

Created separated ticket https://issues.apache.org/jira/browse/LIVY-618

@codecov-io

codecov-io commented Aug 1, 2019

Copy link
Copy Markdown

Codecov Report

Merging #189 into master will decrease coverage by 0.04%.
The diff coverage is 87.5%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #189      +/-   ##
============================================
- Coverage     68.71%   68.66%   -0.05%     
- Complexity      908      909       +1     
============================================
  Files           100      103       +3     
  Lines          5674     5710      +36     
  Branches        854      854              
============================================
+ Hits           3899     3921      +22     
- Misses         1219     1231      +12     
- Partials        556      558       +2
Impacted Files Coverage Δ Complexity Δ
...he/livy/server/recovery/FileSystemStateStore.scala 63.04% <ø> (ø) 11 <0> (ø) ⬇️
...c/main/java/org/apache/livy/LivyClientBuilder.java 83.05% <ø> (ø) 17 <0> (ø) ⬇️
...a/org/apache/livy/server/recovery/StateStore.scala 78.94% <ø> (-4.39%) 1 <0> (ø)
...c/src/main/java/org/apache/livy/rsc/RSCClient.java 75.15% <ø> (+1.24%) 20 <0> (ø) ⬇️
...n/java/org/apache/livy/client/http/HttpClient.java 86.15% <100%> (+0.66%) 16 <1> (+1) ⬆️
...ala/org/apache/livy/scalaapi/LivyScalaClient.scala 83.87% <100%> (+0.53%) 8 <1> (+1) ⬆️
...rver/src/main/scala/org/apache/livy/LivyConf.scala 96.01% <100%> (+0.16%) 21 <0> (ø) ⬇️
...org/apache/livy/server/recovery/SessionStore.scala 80% <100%> (ø) 10 <1> (ø) ⬇️
...che/livy/server/recovery/BlackholeStateStore.scala 100% <100%> (ø) 5 <1> (ø) ⬇️
.../org/apache/livy/server/discovery/JsonMapper.scala 100% <100%> (ø) 0 <0> (?)
... and 16 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1ce266d...18c10ed. Read the comment docs.

@o-shevchenko

o-shevchenko commented Aug 1, 2019

Copy link
Copy Markdown
Contributor Author

I have added API for getting Livy Server address to LivyClient, HttpClient, RSCClient and LivyScalaClient. I don't see a way to forward Livy address from DiscoveryManager to HttpClient or LivyScalaClient (we need to add a dependency on livy-server to client-http in this case, I'm not sure that this is acceptable).

So, from client code we can use:

val address = DiscoveryManager.getServerUri() // here we can use it in REST requests
val client = LivyClientBuilder.setURI(s"http://$address").build // or create client 
val futureUri = client.getServerUri // and get address from client if needed

val scalaClient = new LivyScalaClient(client)
val futureUri = scalaClient.getServerUri

Added scaladocs, small refactoring.

@o-shevchenko

o-shevchenko commented Aug 2, 2019

Copy link
Copy Markdown
Contributor Author

Refactored configuration to provide a more flexible mechanism for configuration, simplified code

# Livy Server discovery
# ZooKeeper quorum URLs, e.g. host1:port1,host2:port2
# livy.zookeeper.url =
# Name of base Livy znode. Default livy
# livy.zookeeper.namespace = livy
# Name of Livy Server znode. Uses livy.zookeeper.namespace as parent.
# By default will be /livy/server.uri
# livy.server.zookeeper.namespace = server.uri
# Number of trials to establish the connection to ZooKeeper quorum
# livy.server.zookeeper.connection.max.retries = 3
# Sleep time between connection retries to ZooKeeper quorum
# livy.server.zookeeper.connection.retry.interval.ms = 500

@jerryshao

Copy link
Copy Markdown
Contributor

Hi @o-shevchenko thanks a lot for your contribution. It would be better to have a design doc to describe what you're trying to resolve and what is your solution. It's really hard for the reviewer to understand your code without any documentation.

@o-shevchenko

Copy link
Copy Markdown
Contributor Author

Thanks a lot @jerryshao for your comment. I'll prepare documentation about the solution.
I'm just adding storing of Livy Server address/uri to ZooKeeper to be able to get it via some Livy API.

@o-shevchenko

Copy link
Copy Markdown
Contributor Author

@jerryshao you can find documentation in the Jira ticket: https://issues.apache.org/jira/browse/LIVY-616.
Please, let me know if something is unclear and I need to add more info.
Thanks!

@o-shevchenko

o-shevchenko commented Aug 28, 2019

Copy link
Copy Markdown
Contributor Author

@jerryshao Any updates?
As far as I see, there are other PRs (#212 and #193) which adding similar functionality and trying to solve similar problems. We need to unify approaches and reuse the same properties and functionality. Because contributors do the same work in parallel (for example do some refactoring of code and configurations). Want to save time for all of us.

private def resolvedSeverHost(livyConf: LivyConf) = {
val host = livyConf.get(LivyConf.SERVER_HOST)
if (host.equals(livyConf.get(LivyConf.SERVER_HOST.dflt.toString))) {
InetAddress.getLocalHost.getHostName

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not quite understand the if condition here. Could you please explain it a bit? Thanks.

And another question is why we return the hostname instead of IP here? IMHO, IP should be more convenient, as hostname user needs to update their hosts file or set up the name service.

@o-shevchenko o-shevchenko Sep 2, 2019

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can find more detailed info in the specification which attached to the Jira ticket

To enable Livy Server discovery we need to set ​livy.zookeeper.url​. During start Livy Server we save URI to ZooKeeper. If we specify host by ​livy.server.host​ then we will use this value. if we use the default value “0.0.0.0” the local host name will be stored(InetAddress.getLocalHost.getHostName) to be able to communicate from other nodes. We will store address as java.net.URI (in format schema://host:port, e.g.http://livyhostname:8998​)

So, basically it's needed to resolve Livy Server address by replacing 0.0.0.0 default value with the correct IP address if user didn't specify any Livy Server address in the conf file. In this case, we can use InetAddress.getLocalHost.getHostName to get the address from the node where LivyServer was started.

See your point, InetAddress.getLocalHost.getHostAddress might be the better choice here.
Thanks!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From your description, should it be

if (host.equals(LivyConf.SERVER_HOST.dflt.toString)) {
...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I see
it's my mistake.
good catch!
livyConf.get(LivyConf.SERVER_HOST.dflt.toString) not make a sense in this case, looks like it's typical copy problem

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR was updated

* @param livyConf - Livy configurations
* @param mockCuratorClient - used for testing
*/
class LivyDiscoveryManager(val livyConf: LivyConf,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider providing a generic method to publishing configuration key/value to zookeeper?

In some other scenario, e.g. this PR #193 , a bunch of hive related configurations need to be published.

If we have a generic method publish configuration key/value, it may be easier to let other components to leverage the code.

@o-shevchenko o-shevchenko Sep 2, 2019

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have one in ZooKeeperManager
LivyDiscovery is a hight level of abstraction which is used for better understanding, single responsibility and API simplification

@o-shevchenko o-shevchenko Sep 2, 2019

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just refactored StateStore to move functionality for communication with ZK from ZooKeeperStateStore.scala to separated trait ZooKeeperManager.scala. In ZooKeeperStateStore we will use ZooKeeperManager to store data in ZK. ZooKeeperManager code was not changed except couple small improvements for configurations to make it more flexible.

We can use this trait everywhere when we need to store something to ZooKeeper since it's generic:

def setData(key: String, value: Object): Unit = {
    val prefixedKey = prefixKey(key)
    val data = serializeToBytes(value)
    if (exist(prefixedKey)) {
      curatorClient.setData().forPath(prefixedKey, data)
    } else {
      curatorClient.create().creatingParentsIfNeeded().forPath(prefixedKey, data)
    }
  }

  def getData[T: ClassTag](key: String): Option[T] = {
    val prefixedKey = prefixKey(key)
    if (exist(prefixedKey)) {
      Option(deserialize[T](curatorClient.getData().forPath(prefixedKey)))
    } else {
      None
    }
  }

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ZooKeeperManager should be used every time by design when we need to store something to ZK. I think we need to have one logic for ZK interaction for LivyServer discovery, ZK state store, Livy HA, Livy Thrift Server HA. That's why I moved this functionality to the separated trait, to make it reuseable for someone else.

@o-shevchenko o-shevchenko Sep 2, 2019

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I have tried to make all configurations more generic and configurable on the top level to use it everywhere. As far as I see, for now, we have a couple of PRs with different ZK logic which can't be reused a lot and I see too specific confs like livy.server.thrift.zookeeper.quorum instead of simple common livy.zookeeper.url which can be reused. In this PR I want to not only add LivyServer discovery but also create init structure for subsequent work with ZooKeeper.

@codecov-commenter

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 87.50000% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.51%. Comparing base (1ce266d) to head (5c47b09).
⚠️ Report is 144 commits behind head on master.

Files with missing lines Patch % Lines
...pache/livy/server/discovery/ZooKeeperManager.scala 85.29% 2 Missing and 3 partials ⚠️
...main/scala/org/apache/livy/server/LivyServer.scala 83.33% 1 Missing and 1 partial ⚠️
...e/livy/server/discovery/LivyDiscoveryManager.scala 77.77% 2 Missing ⚠️
...che/livy/server/recovery/ZooKeeperStateStore.scala 85.71% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master     #189      +/-   ##
============================================
- Coverage     68.71%   68.51%   -0.20%     
- Complexity      908      919      +11     
============================================
  Files           100      103       +3     
  Lines          5674     5746      +72     
  Branches        854      866      +12     
============================================
+ Hits           3899     3937      +38     
- Misses         1219     1243      +24     
- Partials        556      566      +10     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants