Merge lp:~bigdata-dev/charms/bundles/apache-hadoop-spark/trunk into lp:~charmers/charms/bundles/apache-hadoop-spark/bundle

Proposed by Kevin W Monroe
Status: Merged
Merged at revision: 5
Proposed branch: lp:~bigdata-dev/charms/bundles/apache-hadoop-spark/trunk
Merge into: lp:~charmers/charms/bundles/apache-hadoop-spark/bundle
Diff against target: 264 lines (+142/-16)
6 files modified
README.md (+1/-1)
bundle-dev.yaml (+1/-1)
bundle-local.yaml (+7/-7)
bundle.yaml (+7/-7)
tests/01-bundle.py (+123/-0)
tests/tests.yaml (+3/-0)
To merge this branch: bzr merge lp:~bigdata-dev/charms/bundles/apache-hadoop-spark/trunk
Reviewer: Kevin W Monroe, Status: Approve
Review via email: mp+286959@code.launchpad.net

Description of the change

Updates from bigdata-dev:
- bump the compute-slave memory constraint from 3G to 7G
- move the bundle.yaml charms to the latest promulgated versions (the ppc64le PoC is complete)
- add bundle tests (see the Amulet sketch below)
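
The added test follows the standard Amulet pattern: load bundle.yaml into an amulet.Deployment, wait for the units to report ready, then run commands on the unit sentries. A minimal sketch of that pattern (it mirrors what tests/01-bundle.py already does, nothing beyond it):

    import yaml
    import amulet

    # Load the bundle definition and hand it to Amulet, which drives the deployment.
    with open('bundle.yaml') as f:
        bundle = yaml.safe_load(f)

    d = amulet.Deployment(series='trusty')
    d.load(bundle)
    d.setup(timeout=1800)  # deploy the services and wait for the units to come up
    d.sentry.wait_for_messages({'spark': 'Ready'}, timeout=1800)

    # Each sentry exposes run() for executing a command on its unit, e.g. to
    # verify that the expected Java daemons are present.
    output, retcode = d.sentry['spark'][0].run('pgrep -a java')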

Kevin W Monroe (kwmonroe) wrote:

+1, AWS tests successful

review: Approve

Preview Diff

1=== modified file 'README.md'
2--- README.md 2015-06-26 17:04:43 +0000
3+++ README.md 2016-02-23 21:38:47 +0000
4@@ -14,7 +14,7 @@
5 ## Usage
6 Deploy this bundle using juju-quickstart:
7
8- juju quickstart u/bigdata-dev/apache-hadoop-spark
9+ juju quickstart apache-hadoop-spark
10
11 See `juju quickstart --help` for deployment options, including machine
12 constraints and how to deploy a locally modified version of the
13
14=== modified file 'bundle-dev.yaml'
15--- bundle-dev.yaml 2015-09-17 13:29:50 +0000
16+++ bundle-dev.yaml 2016-02-23 21:38:47 +0000
17@@ -5,7 +5,7 @@
18 annotations:
19 gui-x: "300"
20 gui-y: "200"
21- constraints: mem=3G
22+ constraints: mem=7G
23 hdfs-master:
24 charm: cs:~bigdata-dev/trusty/apache-hadoop-hdfs-master
25 num_units: 1
26
27=== modified file 'bundle-local.yaml'
28--- bundle-local.yaml 2015-09-17 13:29:50 +0000
29+++ bundle-local.yaml 2016-02-23 21:38:47 +0000
30@@ -1,39 +1,39 @@
31 services:
32 compute-slave:
33- charm: trusty/apache-hadoop-compute-slave
34+ charm: apache-hadoop-compute-slave
35 num_units: 3
36 annotations:
37 gui-x: "300"
38 gui-y: "200"
39- constraints: mem=3G
40+ constraints: mem=7G
41 hdfs-master:
42- charm: trusty/apache-hadoop-hdfs-master
43+ charm: apache-hadoop-hdfs-master
44 num_units: 1
45 annotations:
46 gui-x: "600"
47 gui-y: "350"
48 constraints: mem=7G
49 plugin:
50- charm: trusty/apache-hadoop-plugin
51+ charm: apache-hadoop-plugin
52 annotations:
53 gui-x: "900"
54 gui-y: "200"
55 secondary-namenode:
56- charm: trusty/apache-hadoop-hdfs-secondary
57+ charm: apache-hadoop-hdfs-secondary
58 num_units: 1
59 annotations:
60 gui-x: "600"
61 gui-y: "600"
62 constraints: mem=7G
63 spark:
64- charm: trusty/apache-spark
65+ charm: apache-spark
66 num_units: 1
67 annotations:
68 gui-x: "1200"
69 gui-y: "200"
70 constraints: mem=3G
71 yarn-master:
72- charm: trusty/apache-hadoop-yarn-master
73+ charm: apache-hadoop-yarn-master
74 num_units: 1
75 annotations:
76 gui-x: "600"
77
78=== modified file 'bundle.yaml'
79--- bundle.yaml 2015-09-17 13:29:50 +0000
80+++ bundle.yaml 2016-02-23 21:38:47 +0000
81@@ -1,39 +1,39 @@
82 services:
83 compute-slave:
84- charm: cs:trusty/apache-hadoop-compute-slave-8
85+ charm: cs:trusty/apache-hadoop-compute-slave-9
86 num_units: 3
87 annotations:
88 gui-x: "300"
89 gui-y: "200"
90- constraints: mem=3G
91+ constraints: mem=7G
92 hdfs-master:
93- charm: cs:trusty/apache-hadoop-hdfs-master-8
94+ charm: cs:trusty/apache-hadoop-hdfs-master-9
95 num_units: 1
96 annotations:
97 gui-x: "600"
98 gui-y: "350"
99 constraints: mem=7G
100 plugin:
101- charm: cs:trusty/apache-hadoop-plugin-7
102+ charm: cs:trusty/apache-hadoop-plugin-10
103 annotations:
104 gui-x: "900"
105 gui-y: "200"
106 secondary-namenode:
107- charm: cs:trusty/apache-hadoop-hdfs-secondary-6
108+ charm: cs:trusty/apache-hadoop-hdfs-secondary-7
109 num_units: 1
110 annotations:
111 gui-x: "600"
112 gui-y: "600"
113 constraints: mem=7G
114 spark:
115- charm: cs:trusty/apache-spark-3
116+ charm: cs:trusty/apache-spark-6
117 num_units: 1
118 annotations:
119 gui-x: "1200"
120 gui-y: "200"
121 constraints: mem=3G
122 yarn-master:
123- charm: cs:trusty/apache-hadoop-yarn-master-6
124+ charm: cs:trusty/apache-hadoop-yarn-master-7
125 num_units: 1
126 annotations:
127 gui-x: "600"
128
129=== added directory 'tests'
130=== added file 'tests/01-bundle.py'
131--- tests/01-bundle.py 1970-01-01 00:00:00 +0000
132+++ tests/01-bundle.py 2016-02-23 21:38:47 +0000
133@@ -0,0 +1,123 @@
134+#!/usr/bin/env python3
135+
136+import os
137+import unittest
138+
139+import yaml
140+import amulet
141+
142+
143+class TestBundle(unittest.TestCase):
144+ bundle_file = os.path.join(os.path.dirname(__file__), '..', 'bundle.yaml')
145+
146+ @classmethod
147+ def setUpClass(cls):
148+ cls.d = amulet.Deployment(series='trusty')
149+ with open(cls.bundle_file) as f:
150+ bun = f.read()
151+ bundle = yaml.safe_load(bun)
152+ cls.d.load(bundle)
153+ cls.d.setup(timeout=1800)
154+ cls.d.sentry.wait_for_messages({'spark': 'Ready'}, timeout=1800)
155+ cls.hdfs = cls.d.sentry['hdfs-master'][0]
156+ cls.yarn = cls.d.sentry['yarn-master'][0]
157+ cls.slave = cls.d.sentry['compute-slave'][0]
158+ cls.secondary = cls.d.sentry['secondary-namenode'][0]
159+ cls.spark = cls.d.sentry['spark'][0]
160+
161+ def test_components(self):
162+ """
163+ Confirm that all of the required components are up and running.
164+ """
165+ hdfs, retcode = self.hdfs.run("pgrep -a java")
166+ yarn, retcode = self.yarn.run("pgrep -a java")
167+ slave, retcode = self.slave.run("pgrep -a java")
168+ secondary, retcode = self.secondary.run("pgrep -a java")
169+ spark, retcode = self.spark.run("pgrep -a java")
170+
171+ # .NameNode needs the . to differentiate it from SecondaryNameNode
172+ assert '.NameNode' in hdfs, "NameNode not started"
173+ assert '.NameNode' not in yarn, "NameNode should not be running on yarn-master"
174+ assert '.NameNode' not in slave, "NameNode should not be running on compute-slave"
175+ assert '.NameNode' not in secondary, "NameNode should not be running on secondary-namenode"
176+ assert '.NameNode' not in spark, "NameNode should not be running on spark"
177+
178+ assert 'ResourceManager' in yarn, "ResourceManager not started"
179+ assert 'ResourceManager' not in hdfs, "ResourceManager should not be running on hdfs-master"
180+ assert 'ResourceManager' not in slave, "ResourceManager should not be running on compute-slave"
181+ assert 'ResourceManager' not in secondary, "ResourceManager should not be running on secondary-namenode"
182+ assert 'ResourceManager' not in spark, "ResourceManager should not be running on spark"
183+
184+ assert 'JobHistoryServer' in yarn, "JobHistoryServer not started"
185+ assert 'JobHistoryServer' not in hdfs, "JobHistoryServer should not be running on hdfs-master"
186+ assert 'JobHistoryServer' not in slave, "JobHistoryServer should not be running on compute-slave"
187+ assert 'JobHistoryServer' not in secondary, "JobHistoryServer should not be running on secondary-namenode"
188+ assert 'JobHistoryServer' not in spark, "JobHistoryServer should not be running on spark"
189+
190+ assert 'NodeManager' in slave, "NodeManager not started"
191+ assert 'NodeManager' not in yarn, "NodeManager should not be running on yarn-master"
192+ assert 'NodeManager' not in hdfs, "NodeManager should not be running on hdfs-master"
193+ assert 'NodeManager' not in secondary, "NodeManager should not be running on secondary-namenode"
194+ assert 'NodeManager' not in spark, "NodeManager should not be running on spark"
195+
196+ assert 'DataNode' in slave, "DataNode not started"
197+ assert 'DataNode' not in yarn, "DataNode should not be running on yarn-master"
198+ assert 'DataNode' not in hdfs, "DataNode should not be running on hdfs-master"
199+ assert 'DataNode' not in secondary, "DataNode should not be running on secondary-namenode"
200+ assert 'DataNode' not in spark, "DataNode should not be running on spark"
201+
202+ assert 'SecondaryNameNode' in secondary, "SecondaryNameNode not started"
203+ assert 'SecondaryNameNode' not in yarn, "SecondaryNameNode should not be running on yarn-master"
204+ assert 'SecondaryNameNode' not in hdfs, "SecondaryNameNode should not be running on hdfs-master"
205+ assert 'SecondaryNameNode' not in slave, "SecondaryNameNode should not be running on compute-slave"
206+ assert 'SecondaryNameNode' not in spark, "SecondaryNameNode should not be running on spark"
207+
208+ assert 'spark' in spark, 'Spark should be running on spark'
209+
210+ def test_hdfs_dir(self):
211+ """
212+ Validate a few admin Hadoop activities on the HDFS cluster.
213+ 1) This test validates mkdir on hdfs cluster
214+ 2) This test validates change hdfs dir owner on the cluster
215+ 3) This test validates setting hdfs directory access permission on the cluster
216+
217+ NB: These are order-dependent, so must be done as part of a single test case.
218+ """
219+ output, retcode = self.spark.run("su hdfs -c 'hdfs dfs -mkdir -p /user/ubuntu'")
220+ assert retcode == 0, "Created a user directory on hdfs FAILED:\n{}".format(output)
221+ output, retcode = self.spark.run("su hdfs -c 'hdfs dfs -chown ubuntu:ubuntu /user/ubuntu'")
222+ assert retcode == 0, "Assigning an owner to hdfs directory FAILED:\n{}".format(output)
223+ output, retcode = self.spark.run("su hdfs -c 'hdfs dfs -chmod -R 755 /user/ubuntu'")
224+ assert retcode == 0, "Setting directory permission on hdfs FAILED:\n{}".format(output)
225+
226+ def test_yarn_mapreduce_exe(self):
227+ """
228+ Validate yarn mapreduce operations:
229+ 1) validate mapreduce execution - writing to hdfs
230+ 2) validate successful mapreduce operation after the execution
231+ 3) validate mapreduce execution - reading and writing to hdfs
232+ 4) validate successful mapreduce operation after the execution
233+ 5) validate successful deletion of mapreduce operation result from hdfs
234+
235+ NB: These are order-dependent, so must be done as part of a single test case.
236+ """
237+ jar_file = '/usr/lib/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar'
238+ test_steps = [
239+ ('teragen', "su ubuntu -c 'hadoop jar {} teragen 10000 /user/ubuntu/teragenout'".format(jar_file)),
240+ ('mapreduce #1', "su hdfs -c 'hdfs dfs -ls /user/ubuntu/teragenout/_SUCCESS'"),
241+ ('terasort', "su ubuntu -c 'hadoop jar {} terasort /user/ubuntu/teragenout /user/ubuntu/terasortout'".
242+ format(jar_file)),
243+ ('mapreduce #2', "su hdfs -c 'hdfs dfs -ls /user/ubuntu/terasortout/_SUCCESS'"),
244+ ('cleanup', "su hdfs -c 'hdfs dfs -rm -r /user/ubuntu/teragenout'"),
245+ ]
246+ for name, step in test_steps:
247+ output, retcode = self.spark.run(step)
248+ assert retcode == 0, "{} FAILED:\n{}".format(name, output)
249+
250+ def test_spark(self):
251+ output, retcode = self.spark.run("su ubuntu -c 'bash -lc /home/ubuntu/sparkpi.sh 2>&1'")
252+ assert 'Pi is roughly' in output, 'SparkPI test failed: %s' % output
253+
254+
255+if __name__ == '__main__':
256+ unittest.main()
257
258=== added file 'tests/tests.yaml'
259--- tests/tests.yaml 1970-01-01 00:00:00 +0000
260+++ tests/tests.yaml 2016-02-23 21:38:47 +0000
261@@ -0,0 +1,3 @@
262+reset: false
263+packages:
264+ - amulet
