Merge lp:~bigdata-dev/charms/trusty/apache-hadoop-compute-slave/readme into lp:~bigdata-dev/charms/trusty/apache-hadoop-compute-slave/trunk

Proposed by Cory Johns
Status: Merged
Merged at revision: 42
Proposed branch: lp:~bigdata-dev/charms/trusty/apache-hadoop-compute-slave/readme
Merge into: lp:~bigdata-dev/charms/trusty/apache-hadoop-compute-slave/trunk
Diff against target: 278 lines (+110/-110)
3 files modified
README.dev.md (+72/-0)
README.md (+36/-108)
resources.yaml (+2/-2)
To merge this branch: bzr merge lp:~bigdata-dev/charms/trusty/apache-hadoop-compute-slave/readme
Reviewer: amir sanjar (community)
Review status: Approve
Review via email: mp+252617@code.launchpad.net

Description of the change

New READMEs and minor relation cleanups

42. By Cory Johns

Disambiguate the interface in README.dev

amir sanjar (asanjar) wrote:

We will also need to return the hostname of compute nodes in the future.

review: Approve
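
For context on the interface documented in the new README.dev.md, the keys it lists map directly onto Juju's relation primitives. Below is a hypothetical sketch of the `datanode` relation hooks written against charmhelpers' `hookenv` API; it is not the charm's actual hook code, and the hook layout is an assumption. The `hostname` key mentioned in the review would be one more entry in the `relation_set` call.

    #!/usr/bin/env python
    # Hypothetical hooks for the datanode (dfs-slave) relation, sketched
    # from the keys documented in README.dev.md. The charm's real hooks
    # may be organized differently.
    import os
    import sys

    from charmhelpers.core import hookenv


    def joined():
        # Sent to hdfs-master: this unit's address, to be registered as
        # a DataNode. A `hostname` key could be added here later, per
        # the review comment above.
        hookenv.relation_set(relation_settings={
            'private-address': hookenv.unit_get('private-address'),
        })


    def changed():
        # Received from hdfs-master: the NameNode address and a flag
        # indicating that HDFS is ready to register DataNodes.
        namenode = hookenv.relation_get('private-address')
        if not hookenv.relation_get('ready'):
            hookenv.log('hdfs-master not yet ready; waiting')
            return
        hookenv.log('Registering DataNode with NameNode at %s' % namenode)


    if __name__ == '__main__':
        # Juju invokes the same script under different hook names.
        hook = os.path.basename(sys.argv[0])
        if hook == 'datanode-relation-joined':
            joined()
        elif hook == 'datanode-relation-changed':
            changed()

The nodemanager (mapred-slave) relation exchanges the same key names with yarn-master, so its hooks would follow the same shape.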

Preview Diff

=== added file 'README.dev.md'
--- README.dev.md 1970-01-01 00:00:00 +0000
+++ README.dev.md 2015-03-11 18:02:10 +0000
@@ -0,0 +1,72 @@
+## Overview
+
+This charm provides computation and storage resources for an Apache Hadoop
+deployment, and is intended to be used only as a part of that deployment.
+This document describes how this charm connects to and interacts with the
+other components of the deployment.
+
+
+## Provided Relations
+
+### datanode (interface: dfs-slave)
+
+This relation connects this charm to the apache-hadoop-hdfs-master charm.
+It is a bi-directional interface, with the following keys being exchanged:
+
+* Sent to hdfs-master:
+
+  * `private-address`: Address of this unit, to be registered as a DataNode
+
+* Received from hdfs-master:
+
+  * `private-address`: Address of the HDFS master unit, to provide the NameNode
+  * `ready`: A flag indicating that HDFS is ready to register DataNodes
+
+Ports will soon be added to both of these.
+
+
+### nodemanager (interface: mapred-slave)
+
+This relation connects this charm to the apache-hadoop-yarn-master charm.
+It is a bi-directional interface, with the following keys being exchanged:
+
+* Sent to yarn-master:
+
+  * `private-address`: Address of this unit, to be registered as a NodeManager
+
+* Received from yarn-master:
+
+  * `private-address`: Address of the YARN master unit, to provide the ResourceManager
+  * `ready`: A flag indicating that YARN is ready to register NodeManagers
+
+Ports will soon be added to both of these.
+
+
+## Required Relations
+
+*There are no required relations for this charm.*
+
+
+## Manual Deployment
+
+The easiest way to deploy the core Apache Hadoop platform is to use one of
+the [apache-core-batch-processing-* bundles](https://jujucharms.com/q/bigdata-dev/apache?type=bundle).
+However, to manually deploy the base Apache Hadoop platform without using one of the
+bundles, you can use the following:
+
+    juju deploy apache-hadoop-hdfs-master hdfs-master
+    juju deploy apache-hadoop-hdfs-secondary secondary-namenode
+    juju deploy apache-hadoop-yarn-master yarn-master
+    juju deploy apache-hadoop-compute-slave compute-slave -n3
+    juju deploy apache-hadoop-client client
+    juju add-relation yarn-master hdfs-master
+    juju add-relation secondary-namenode hdfs-master
+    juju add-relation compute-slave yarn-master
+    juju add-relation compute-slave hdfs-master
+    juju add-relation client yarn-master
+    juju add-relation client hdfs-master
+
+This will create a scalable deployment with separate nodes for each master,
+and a three unit compute slave (NodeManager and DataNode) cluster. The master
+charms also support co-locating using the `--to` option to `juju deploy` for
+more dense deployments.

=== modified file 'README.md'
--- README.md 2015-02-13 22:34:21 +0000
+++ README.md 2015-03-11 18:02:10 +0000
@@ -1,118 +1,39 @@
 ## Overview
 
-This charm is a component of the Apache Hadoop platform. It is intended
-to be deployed with the other components using the bundle:
-`bundle:~bigdata-charmers/apache-hadoop`
-
-**What is Apache Hadoop?**
-
 The Apache Hadoop software library is a framework that allows for the
 distributed processing of large data sets across clusters of computers
 using a simple programming model.
 
-It is designed to scale up from single servers to thousands of machines,
-each offering local computation and storage. Rather than rely on hardware
-to deliver high-avaiability, the library itself is designed to detect
-and handle failures at the application layer, so delivering a
-highly-availabile service on top of a cluster of computers, each of
-which may be prone to failures.
-
-Apache Hadoop 2.4.1 consists of significant improvements over the previous stable
-release (hadoop-1.x).
-
-Here is a short overview of the improvments to both HDFS and MapReduce.
-
-  - **HDFS Federation**
-    In order to scale the name service horizontally, federation uses multiple
-    independent Namenodes/Namespaces. The Namenodes are federated, that is, the
-    Namenodes are independent and don't require coordination with each other.
-    The datanodes are used as common storage for blocks by all the Namenodes.
-    Each datanode registers with all the Namenodes in the cluster. Datanodes
-    send periodic heartbeats and block reports and handles commands from the
-    Namenodes.
-
-    More details are available in the HDFS Federation document:
-    <http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/Federation.html>
-
-  - **MapReduce NextGen aka YARN aka MRv2**
-    The new architecture introduced in hadoop-0.23, divides the two major functions of the
-    JobTracker: resource management and job life-cycle management into separate components.
-    The new ResourceManager manages the global assignment of compute resources to
-    applications and the per-application ApplicationMaster manages the application's
-    scheduling and coordination.
-    An application is either a single job in the sense of classic MapReduce jobs or a DAG of
-    such jobs.
-
-    The ResourceManager and per-machine NodeManager daemon, which manages the user
-    processes on that machine, form the computation fabric.
-
-    The per-application ApplicationMaster is, in effect, a framework specific
-    library and is tasked with negotiating resources from the ResourceManager and
-    working with the NodeManager(s) to execute and monitor the tasks.
-
-    More details are available in the YARN document:
-    <http://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/YARN.html>
+This charm deploys a compute / slave node running the NodeManager
+and DataNode components of
+[Apache Hadoop 2.4.1](http://hadoop.apache.org/docs/r2.4.1/),
+which provides computation and storage resources to the platform.
 
 ## Usage
 
-This charm manages the compute slave nodes, which include the DataNode and
-NodeManager components. It is intended to be used with `apache-hadoop-hdfs-master`
-and `apache-hadoop-yarn-master`.
-
-### Simple Usage: Single YARN / HDFS master deployment
-
-In this configuration, the YARN and HDFS master components run on the same
-machine. This is useful for lower-resource deployments::
-
-    juju deploy apache-hadoop-hdfs-master hdfs-master
-    juju deploy apache-hadoop-hdfs-checkpoint secondary-namenode --to 1
-    juju deploy apache-hadoop-yarn-master yarn-master --to 1
-    juju deploy apache-hadoop-compute-slave compute-slave
-    juju deploy apache-hadoop-client client
-    juju add-relation yarn-master hdfs-master
-    juju add-relation secondary-namenode hdfs-master
-    juju add-relation compute-slave yarn-master
-    juju add-relation compute-slave hdfs-master
-    juju add-relation client yarn-master
-    juju add-relation client hdfs-master
-
-Note that the machine number (`--to 1`) should match the machine number
-for the `hdfs-master` charm. If you previously deployed other services
-in your environment, you may need to adjust the machine number appropriately.
-
-
-### Scale Out Usage: Separate HDFS, YARN, and compute nodes
-
-In this configuration the HDFS and YARN deployments operate on
-different service units as separate services::
-
-    juju deploy apache-hadoop-hdfs-master hdfs-master
-    juju deploy apache-hadoop-hdfs-checkpoint secondary-namenode
-    juju deploy apache-hadoop-yarn-master yarn-master
-    juju deploy apache-hadoop-compute-slave compute-slave -n 3
-    juju deploy apache-hadoop-client client
-    juju add-relation yarn-master hdfs-master
-    juju add-relation secondary-namenode hdfs-master
-    juju add-relation compute-slave yarn-master
-    juju add-relation compute-slave hdfs-master
-    juju add-relation client yarn-master
-    juju add-relation client hdfs-master
-
-The `-n 3` option can be adjusted according to the number of compute nodes
-you need. You can also add additional compute nodes later::
-
-    juju add-unit compute-slave -n 2
-
-
-### To deploy a Hadoop service with elasticsearch service::
-    # deploy ElasticSearch locally:
-    **juju deploy elasticsearch elasticsearch**
-    # elasticsearch-hadoop.jar file will be added to LIBJARS path
-    # Recommanded to use hadoop -libjars option to included elk jar file
-    **juju add-unit -n elasticsearch**
-    # deploy hive service by any senarios mentioned above
-    # associate Hive with elasticsearch
-    **juju add-relation {hadoop master}:elasticsearch elasticsearch:client**
+This charm is intended to be deployed via one of the
+[bundles](https://jujucharms.com/q/bigdata-dev/apache?type=bundle).
+For example:
+
+    juju quickstart u/bigdata-dev/apache-analytics-sql
+
+This will deploy the Apache Hadoop platform with Apache Hive available to
+perform SQL-like queries against your data.
+
+You can also manually load and run map-reduce jobs via the client:
+
+    juju scp my-job.jar client/0:
+    juju ssh client/0
+    hadoop jar my-job.jar
+
+
+### Scaling
+
+The compute-slave node is the "workhorse" of the Apache Hadoop platform.
+To scale your deployment's performance, you can simply add more compute-slave
+units. For example, to add three more units:
+
+    juju add-unit compute-slave -n 3
 
 
 ## Deploying in Network-Restricted Environments
@@ -121,12 +42,14 @@
 access. To deploy in this environment, you will need a local mirror to serve
 the packages and resources required by these charms.
 
+
 ### Mirroring Packages
 
 You can setup a local mirror for apt packages using squid-deb-proxy.
 For instructions on configuring juju to use this, see the
 [Juju Proxy Documentation](https://juju.ubuntu.com/docs/howto-proxies.html).
 
+
 ### Mirroring Resources
 
 In addition to apt packages, the Apache Hadoop charms require a few binary
@@ -144,14 +67,19 @@
 
 You can fetch the resources for all of the Apache Hadoop charms
 (`apache-hadoop-hdfs-master`, `apache-hadoop-yarn-master`,
-`apache-hadoop-compute-slave`, `apache-hadoop-client`, etc) into a single
+`apache-hadoop-hdfs-secondary`, `apache-hadoop-client`, etc) into a single
 directory and serve them all with a single `juju resources serve` instance.
 
 
 ## Contact Information
-amir sanjar <amir.sanjar@canonical.com>
+
+* Amir Sanjar <amir.sanjar@canonical.com>
+* Cory Johns <cory.johns@canonical.com>
+* Kevin Monroe <kevin.monroe@canonical.com>
+
 
 ## Hadoop
+
 - [Apache Hadoop](http://hadoop.apache.org/) home page
 - [Apache Hadoop bug trackers](http://hadoop.apache.org/issue_tracking.html)
 - [Apache Hadoop mailing lists](http://hadoop.apache.org/mailing_lists.html)
 
=== modified file 'resources.yaml'
--- resources.yaml 2015-03-06 22:28:48 +0000
+++ resources.yaml 2015-03-11 18:02:10 +0000
@@ -8,8 +8,8 @@
   six:
     pypi: six
   charmhelpers:
-    pypi: http://bazaar.launchpad.net/~bigdata-dev/bigdata-data/trunk/download/cory.johns%40canonical.com-20150306222841-0ibvrfaungfdtkn8/charmhelpers0.2.2.ta-20150304033309-4fa7ewnosqavnwms-1/charmhelpers-0.2.2.tar.gz
-    hash: 787fc1cc70fc89e653b08dd192ec702d844709e450ff67577b7c5e99b6bbf39b
+    pypi: http://bazaar.launchpad.net/~bigdata-dev/bigdata-data/trunk/download/cory.johns%40canonical.com-20150310214330-f2bk32gk92iinrx8/charmhelpers0.2.2.ta-20150304033309-4fa7ewnosqavnwms-1/charmhelpers-0.2.2.tar.gz
+    hash: a1cafa5e315d3a33db15a8e18f56b4c64d47c2c1c6fcbdba81e42bb00642971c
     hash_type: sha256
 optional_resources:
   hadoop-aarch64:
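
A note on the resources.yaml hunk above: it points at a rebuilt charmhelpers tarball and updates its sha256 checksum to match. Conceptually, validating a fetched resource against such an entry reduces to a streaming digest comparison; the following is a minimal Python sketch of that check, not the juju-resources implementation, and the local filename is illustrative.

    import hashlib

    def resource_matches(path, expected_hash, hash_type='sha256'):
        # Stream the file through the digest named by hash_type in
        # resources.yaml and compare against the recorded checksum.
        digest = hashlib.new(hash_type)
        with open(path, 'rb') as f:
            for chunk in iter(lambda: f.read(65536), b''):
                digest.update(chunk)
        return digest.hexdigest() == expected_hash

    # Example using the new checksum from the diff above:
    ok = resource_matches(
        'charmhelpers-0.2.2.tar.gz',
        'a1cafa5e315d3a33db15a8e18f56b4c64d47c2c1c6fcbdba81e42bb00642971c')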
