This repository was archived by the owner on Dec 15, 2025. It is now read-only.

Commit ea92553

Refactor Travis CI to support multiple Java Builds
* Moved all xml files into new folders containing the artifacts for either Hadoop 2.6 or 3.2; the right set is picked up in travis.yml depending on the testing needs.
* Moved the spark-env file into new folders containing the artifacts for either Spark 1.6 or 2.4; the right one is picked up in travis.yml depending on the testing needs.
* Removed hardcoded values from hadoop.conf and spark.conf; these are now filled in depending on the testing needs.
* Added an `install_hadoop_spark` script that downloads the Hadoop and Spark binaries depending on the testing needs.
* Added a `config_hadoop_spark` script that sets up Hadoop, Spark, and HiBench depending on the testing needs.
* Added a `jdk_ver` script that detects the Java version currently installed in the Travis CI environment.
* Modified the `restart_hadoop_spark` script to be agnostic to the binaries required for testing.
* travis/config_hadoop_spark.sh: for Java 8 and 11, the `sql` tests are skipped, since Hive is no longer used to perform queries; newer Spark versions run queries through `SparkSession` rather than `import org.apache.spark.sql`.
* .travis.yml:
  * Added `dist: trusty` to keep using this distro; Travis defaults to xenial when no dist is specified, and Ubuntu releases newer than trusty on Travis do not support openjdk7.
  * Refactored the CI flow to download, set up, run, and test Hadoop and Spark depending on the required JDK (version 7, 8, or 11).
  * HiBench is configured depending on the required JDK (7, 8, or 11).
  * HiBench is built depending on the required JDK (7, 8, or 11).
  * Benchmarks are run for every JDK version in the build matrix.

Signed-off-by: Luis Ponce <luis.f.ponce.navarro@linux.intel.com>
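The `jdk_ver` helper is referenced by the commit but not shown in this excerpt. As a hedged sketch, a script like it could map the version string reported by `java -version` to a major version number; the function name and parsing below are illustrative assumptions, not the actual travis/jdk_ver.sh:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: extract the major Java version from a version string
# such as "1.7.0_95", "1.8.0_212", or "11.0.2". The real travis/jdk_ver.sh
# may obtain and parse `java -version 2>&1` differently.
jdk_major_from_string() {
  local ver="$1"
  case "$ver" in
    1.*) ver="${ver#1.}" ;;   # pre-Java-9 scheme: 1.8.0_212 -> 8.0_212
  esac
  echo "${ver%%.*}"           # drop everything after the first dot
}

jdk_major_from_string "11.0.2"     # prints 11
jdk_major_from_string "1.8.0_212"  # prints 8
```

In the refactored .travis.yml, this detected value is what selects the Hadoop/Spark versions to download and the Maven profiles to build.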
1 parent d113389 commit ea92553

17 files changed

Lines changed: 448 additions & 35 deletions

.travis.yml

Lines changed: 62 additions & 23 deletions
@@ -1,6 +1,9 @@
+dist: trusty
 sudo: required
 language: java
 jdk:
+- openjdk11
+- openjdk8
 - openjdk7
 before_install:
 - cat /etc/hosts # optionally check the content *before*
@@ -10,32 +13,68 @@ before_install:
 - cat /proc/cpuinfo | grep cores | wc -l
 - free -h
 install:
-- hibench=$(pwd)
-- cd /opt/
-- wget http://d3kbcqa49mib13.cloudfront.net/spark-1.6.0-bin-hadoop2.6.tgz
-- tar -xzf spark-1.6.0-bin-hadoop2.6.tgz
-- wget https://archive.apache.org/dist/hadoop/core/hadoop-2.6.5/hadoop-2.6.5.tar.gz
-- tar -xzf hadoop-2.6.5.tar.gz
-- cd ${hibench}
-- cp ./travis/spark-env.sh /opt/spark-1.6.0-bin-hadoop2.6/conf/
-- cp ./travis/core-site.xml /opt/hadoop-2.6.5/etc/hadoop/
-- cp ./travis/hdfs-site.xml /opt/hadoop-2.6.5/etc/hadoop/
-- cp ./travis/mapred-site.xml /opt/hadoop-2.6.5/etc/hadoop/
-- cp ./travis/yarn-site.xml /opt/hadoop-2.6.5/etc/hadoop/
-- cp ./travis/hibench.conf ./conf/
-- cp ./travis/benchmarks.lst ./conf/
+- |
+  export java_ver=$(./travis/jdk_ver.sh)
+  if [[ "$java_ver" == 11 ]]; then
+    export HADOOP_VER=3.2.0
+    export SPARK_VER=2.4.3
+    export SPARK_PACKAGE_TYPE=without-hadoop-scala-2.12
+  elif [[ "$java_ver" == 8 ]]; then
+    export HADOOP_VER=3.2.0
+    export SPARK_VER=2.4.3
+    export SPARK_PACKAGE_TYPE=without-hadoop
+  elif [[ "$java_ver" == 7 ]]; then
+    export HADOOP_VER=2.6.5
+    export SPARK_VER=1.6.0
+    export SPARK_PACKAGE_TYPE=hadoop2.6
+  else
+    exit 1
+  fi
+
+  # Folders where the Spark and Hadoop binaries are stored, depending on the required version
+  export SPARK_BINARIES_FOLDER=spark-$SPARK_VER-bin-$SPARK_PACKAGE_TYPE
+  export HADOOP_BINARIES_FOLDER=hadoop-$HADOOP_VER
+  export HADOOP_CONF_DIR=/opt/$HADOOP_BINARIES_FOLDER/etc/hadoop/
+
+  if [[ "$HADOOP_VER" =~ "3.2" ]]; then
+    export HADOOP_IDENT_STRING=root
+    export HDFS_NAMENODE_USER=root
+    export HDFS_DATANODE_USER=root
+    export HDFS_SECONDARYNAMENODE_USER=root
+    export YARN_RESOURCEMANAGER_USER=root
+    export YARN_NODEMANAGER_USER=root
+    export SPARK_CONF_DIR=/opt/$SPARK_BINARIES_FOLDER/conf/
+    export HADOOP_HOME=/opt/$HADOOP_BINARIES_FOLDER
+    export SPARK_HOME=/opt/$SPARK_BINARIES_FOLDER
+    export HADOOP_MAPRED_HOME=$HADOOP_HOME
+    export HADOOP_COMMON_HOME=$HADOOP_HOME
+    export HADOOP_HDFS_HOME=$HADOOP_HOME
+    export YARN_HOME=$HADOOP_HOME
+    export HADOOP_INSTALL=$HADOOP_HOME
+    export SPARK_DIST_CLASSPATH=$(/opt/$HADOOP_BINARIES_FOLDER/bin/hadoop classpath)
+  fi
+
+  sudo -E ./travis/install_hadoop_spark.sh
+  sudo -E ./travis/config_hadoop_spark.sh
 before_script:
 - "export JAVA_OPTS=-Xmx512m"
 cache:
   directories:
   - $HOME/.m2
 script:
-- mvn clean package -q -Dmaven.javadoc.skip=true -Dspark=2.2 -Dscala=2.11
-- mvn clean package -q -Dmaven.javadoc.skip=true -Dspark=2.0 -Dscala=2.11
-- mvn clean package -q -Dmaven.javadoc.skip=true -Dspark=1.6 -Dscala=2.10
-- sudo -E ./travis/configssh.sh
-- sudo -E ./travis/restart_hadoop_spark.sh
-- cp ./travis/hadoop.conf ./conf/
-- cp ./travis/spark.conf ./conf/
-- /opt/hadoop-2.6.5/bin/yarn node -list 2
-- sudo -E ./bin/run_all.sh
+- |
+  if [[ "$java_ver" == 11 ]]; then
+    mvn clean package -q -Psparkbench -Phadoopbench -Dmaven.javadoc.skip=true -Dhadoop=3.2 -Dspark=2.4 -Dscala=2.12 -Dexclude-streaming
+  elif [[ "$java_ver" == 8 ]]; then
+    mvn clean package -q -Dmaven.javadoc.skip=true -Dhadoop=3.2 -Dspark=2.4 -Dscala=2.11
+  elif [[ "$java_ver" == 7 ]]; then
+    mvn clean package -q -Dmaven.javadoc.skip=true -Dspark=2.2 -Dscala=2.11
+    mvn clean package -q -Dmaven.javadoc.skip=true -Dspark=2.0 -Dscala=2.11
+    mvn clean package -q -Dmaven.javadoc.skip=true -Dspark=1.6 -Dscala=2.10
+  else
+    exit 1
+  fi
+
+  sudo -E ./travis/configssh.sh
+  sudo -E ./travis/restart_hadoop_spark.sh
+  sudo -E ./bin/run_all.sh
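The travis/install_hadoop_spark.sh script itself is not part of this hunk. As a hedged sketch of the kind of download step it performs, assuming the Apache archive layout used elsewhere in this diff (the dry-run structure and URL patterns below are illustrative only, not the real script):

```shell
#!/usr/bin/env bash
# Illustrative dry run only: compute the tarball names and archive URLs from
# the variables exported by the `install:` block above. The real script
# presumably also fetches and unpacks them (wget + tar); here we just echo
# what would be downloaded.
set -eu

HADOOP_VER=${HADOOP_VER:-3.2.0}
SPARK_VER=${SPARK_VER:-2.4.3}
SPARK_PACKAGE_TYPE=${SPARK_PACKAGE_TYPE:-without-hadoop}

hadoop_tarball="hadoop-${HADOOP_VER}.tar.gz"
spark_tarball="spark-${SPARK_VER}-bin-${SPARK_PACKAGE_TYPE}.tgz"

echo "https://archive.apache.org/dist/hadoop/core/hadoop-${HADOOP_VER}/${hadoop_tarball}"
echo "https://archive.apache.org/dist/spark/spark-${SPARK_VER}/${spark_tarball}"
```

Deriving the names from HADOOP_VER/SPARK_VER/SPARK_PACKAGE_TYPE is what lets one script serve all three JDK configurations.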
core-site.xml

Lines changed: 26 additions & 0 deletions

@@ -0,0 +1,26 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+<!-- Put site-specific property overrides in this file. -->
+
+<configuration>
+
+  <property>
+    <name>fs.default.name</name>
+    <value>hdfs://localhost:9000</value>
+  </property>
+
+</configuration>
hdfs-site.xml

Lines changed: 36 additions & 0 deletions

@@ -0,0 +1,36 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+<!-- Put site-specific property overrides in this file. -->
+
+<configuration>
+  <property>
+    <name>dfs.replication</name>
+    <value>1</value>
+  </property>
+  <property>
+    <name>dfs.namenode.name.dir</name>
+    <value>/usr/local/hdfs/namenode</value>
+  </property>
+  <property>
+    <name>dfs.datanode.data.dir</name>
+    <value>/usr/local/hdfs/datanode</value>
+  </property>
+  <property>
+    <name>dfs.client.use.datanode.hostname</name>
+    <value>true</value>
+  </property>
+</configuration>
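CI scripts sometimes need to read values back out of site files like the one above. A small illustrative helper (not part of this commit) that extracts a property's value with plain grep/sed, assuming the one-property-per-line layout shown here:

```shell
#!/usr/bin/env bash
# Illustrative helper, not from this commit: pull the <value> for a given
# <name> out of a Hadoop *-site.xml file. Relies on name and value sitting
# on adjacent lines; a real script should prefer an XML-aware tool.
extract_property() {  # usage: extract_property <property-name> <file>
  grep -A1 "<name>$1</name>" "$2" | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Hypothetical sample file mirroring the hdfs-site.xml layout above.
cat > /tmp/hdfs-site-sample.xml <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

extract_property dfs.replication /tmp/hdfs-site-sample.xml  # prints 1
```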
mapred-site.xml

Lines changed: 53 additions & 0 deletions

@@ -0,0 +1,53 @@
+<?xml version="1.0"?>
+<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+<!-- Put site-specific property overrides in this file. -->
+
+<configuration>
+  <property>
+    <name>mapreduce.framework.name</name>
+    <value>yarn</value>
+  </property>
+
+  <property>
+    <name>mapreduce.application.classpath</name>
+    <value>$HADOOP_HOME/share/hadoop/common/*,
+      $HADOOP_HOME/share/hadoop/common/lib/*,
+      $HADOOP_HOME/share/hadoop/hdfs/*,
+      $HADOOP_HOME/share/hadoop/hdfs/lib/*,
+      $HADOOP_HOME/share/hadoop/yarn/*,
+      $HADOOP_HOME/share/hadoop/yarn/lib/*,
+      $HADOOP_HOME/share/hadoop/mapreduce/*,
+      $HADOOP_HOME/share/hadoop/mapreduce/lib/*
+    </value>
+  </property>
+
+  <property>
+    <name>yarn.app.mapreduce.am.env</name>
+    <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
+  </property>
+
+  <property>
+    <name>mapreduce.map.env</name>
+    <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
+  </property>
+
+  <property>
+    <name>mapreduce.reduce.env</name>
+    <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
+  </property>
+
+</configuration>
spark-env.sh

Lines changed: 71 additions & 0 deletions

@@ -0,0 +1,71 @@
+#!/usr/bin/env bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# This file is sourced when running various Spark programs.
+# Copy it as spark-env.sh and edit that to configure Spark for your site.
+
+# Options read when launching programs locally with
+# ./bin/run-example or ./bin/spark-submit
+# - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files
+# - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node
+# - SPARK_PUBLIC_DNS, to set the public dns name of the driver program
+# - SPARK_CLASSPATH, default classpath entries to append
+
+# Options read by executors and drivers running inside the cluster
+# - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node
+# - SPARK_PUBLIC_DNS, to set the public DNS name of the driver program
+# - SPARK_CLASSPATH, default classpath entries to append
+# - SPARK_LOCAL_DIRS, storage directories to use on this node for shuffle and RDD data
+# - MESOS_NATIVE_JAVA_LIBRARY, to point to your libmesos.so if you use Mesos
+
+# Options read in YARN client mode
+# - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files
+# - SPARK_EXECUTOR_INSTANCES, Number of executors to start (Default: 2)
+# - SPARK_EXECUTOR_CORES, Number of cores for the executors (Default: 1).
+# - SPARK_EXECUTOR_MEMORY, Memory per Executor (e.g. 1000M, 2G) (Default: 1G)
+# - SPARK_DRIVER_MEMORY, Memory for Driver (e.g. 1000M, 2G) (Default: 1G)
+# - SPARK_YARN_APP_NAME, The name of your application (Default: Spark)
+# - SPARK_YARN_QUEUE, The hadoop queue to use for allocation requests (Default: 'default')
+# - SPARK_YARN_DIST_FILES, Comma separated list of files to be distributed with the job.
+# - SPARK_YARN_DIST_ARCHIVES, Comma separated list of archives to be distributed with the job.
+
+# Options for the daemons used in the standalone deploy mode
+export SPARK_MASTER_IP=localhost
+export SPARK_DIST_CLASSPATH=$(/opt/$HADOOP_BINARIES_FOLDER/bin/hadoop classpath)
+# - SPARK_MASTER_IP, to bind the master to a different IP address or hostname
+# - SPARK_MASTER_PORT / SPARK_MASTER_WEBUI_PORT, to use non-default ports for the master
+# - SPARK_MASTER_OPTS, to set config properties only for the master (e.g. "-Dx=y")
+# - SPARK_WORKER_CORES, to set the number of cores to use on this machine
+# - SPARK_WORKER_MEMORY, to set how much total memory workers have to give executors (e.g. 1000m, 2g)
+# - SPARK_WORKER_PORT / SPARK_WORKER_WEBUI_PORT, to use non-default ports for the worker
+# - SPARK_WORKER_INSTANCES, to set the number of worker processes per node
+# - SPARK_WORKER_DIR, to set the working directory of worker processes
+# - SPARK_WORKER_OPTS, to set config properties only for the worker (e.g. "-Dx=y")
+# - SPARK_DAEMON_MEMORY, to allocate to the master, worker and history server themselves (default: 1g).
+# - SPARK_HISTORY_OPTS, to set config properties only for the history server (e.g. "-Dx=y")
+# - SPARK_SHUFFLE_OPTS, to set config properties only for the external shuffle service (e.g. "-Dx=y")
+# - SPARK_DAEMON_JAVA_OPTS, to set config properties for all daemons (e.g. "-Dx=y")
+# - SPARK_PUBLIC_DNS, to set the public dns name of the master or workers
+
+# Generic options for the daemons used in the standalone deploy mode
+# - SPARK_CONF_DIR      Alternate conf dir. (Default: ${SPARK_HOME}/conf)
+# - SPARK_LOG_DIR       Where log files are stored. (Default: ${SPARK_HOME}/logs)
+# - SPARK_PID_DIR       Where the pid file is stored. (Default: /tmp)
+# - SPARK_IDENT_STRING  A string representing this instance of spark. (Default: $USER)
+# - SPARK_NICENESS      The scheduling priority for daemons. (Default: 0)
