Building a 64 Node Hadoop Cluster
Last Updated: August 1, 2016
Cisco Validated Design
The CVD program consists of systems and solutions designed, tested, and documented to facilitate faster, more reliable, and more predictable customer deployments. For more information visit
http://www.cisco.com/go/designzone.
ALL DESIGNS, SPECIFICATIONS, STATEMENTS, INFORMATION, AND RECOMMENDATIONS (COLLECTIVELY, "DESIGNS") IN THIS MANUAL ARE PRESENTED "AS IS," WITH ALL FAULTS. CISCO AND ITS SUPPLIERS DISCLAIM ALL WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OR ARISING FROM A COURSE OF DEALING, USAGE, OR TRADE PRACTICE. IN NO EVENT SHALL CISCO OR ITS SUPPLIERS BE LIABLE FOR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, OR INCIDENTAL DAMAGES, INCLUDING, WITHOUT LIMITATION, LOST PROFITS OR LOSS OR DAMAGE TO DATA ARISING OUT OF THE USE OR INABILITY TO USE THE DESIGNS, EVEN IF CISCO OR ITS SUPPLIERS HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
THE DESIGNS ARE SUBJECT TO CHANGE WITHOUT NOTICE. USERS ARE SOLELY RESPONSIBLE FOR THEIR APPLICATION OF THE DESIGNS. THE DESIGNS DO NOT CONSTITUTE THE TECHNICAL OR OTHER PROFESSIONAL ADVICE OF CISCO, ITS SUPPLIERS OR PARTNERS. USERS SHOULD CONSULT THEIR OWN TECHNICAL ADVISORS BEFORE IMPLEMENTING THE DESIGNS. RESULTS MAY VARY DEPENDING ON FACTORS NOT TESTED BY CISCO.
CCDE, CCENT, Cisco Eos, Cisco Lumin, Cisco Nexus, Cisco StadiumVision, Cisco TelePresence, Cisco WebEx, the Cisco logo, DCE, and Welcome to the Human Network are trademarks; Changing the Way We Work, Live, Play, and Learn and Cisco Store are service marks; and Access Registrar, Aironet, AsyncOS, Bringing the Meeting To You, Catalyst, CCDA, CCDP, CCIE, CCIP, CCNA, CCNP, CCSP, CCVP, Cisco, the Cisco Certified Internetwork Expert logo, Cisco IOS, Cisco Press, Cisco Systems, Cisco Systems Capital, the Cisco Systems logo, Cisco Unity, Collaboration Without Limitation, EtherFast, EtherSwitch, Event Center, Fast Step, Follow Me Browsing, FormShare, GigaDrive, HomeLink, Internet Quotient, IOS, iPhone, iQuick Study, IronPort, the IronPort logo, LightStream, Linksys, MediaTone, MeetingPlace, MeetingPlace Chime Sound, MGX, Networkers, Networking Academy, Network Registrar, PCNow, PIX, PowerPanels, ProConnect, ScriptShare, SenderBase, SMARTnet, Spectrum Expert, StackWise, The Fastest Way to Increase Your Internet Quotient, TransPath, WebEx, and the WebEx logo are registered trademarks of Cisco Systems, Inc. and/or its affiliates in the United States and certain other countries.
All other trademarks mentioned in this document or website are the property of their respective owners. The use of the word partner does not imply a partnership relationship between Cisco and any other company. (0809R)
© 2016 Cisco Systems, Inc. All rights reserved.
Table of Contents
Lambda Architecture: Merging Batch and Real-time Data
Cisco UCS Integrated Infrastructure for Big Data and Analytics with HDP and HDF
Cisco UCS 6200 Series Fabric Interconnects
Cisco UCS 6300 Series Fabric Interconnects
Cisco UCS C-Series Rack Mount Servers
Cisco UCS Virtual Interface Cards (VICs)
Benefits of Hortonworks DataFlow
Features of Hortonworks DataFlow
Port Configuration on Fabric Interconnects
Server Configuration and Cabling for Cisco UCS C-Series M4
Software Distributions and Versions
Hortonworks Data Platform (HDP 2.4)
Red Hat Enterprise Linux (RHEL)
Performing Initial Setup of Cisco UCS 6296 Fabric Interconnects
Configure Fabric Interconnect A
Configure Fabric Interconnect B
Logging Into Cisco UCS Manager
Upgrading UCSM Software to Version 3.1(1g)
Adding a Block of IP Addresses for KVM Access
Creating Pools for Service Profile Templates
Creating Policies for Service Profile Templates
Creating Host Firmware Package Policy
Creating the Local Disk Configuration Policy
Creating a Service Profile Template
Configuring the Storage Provisioning for the Template
Configuring Network Settings for the Template
Configuring the vMedia Policy for the Template
Configuring Server Boot Order for the Template
Configuring Server Assignment for the Template
Configuring Operational Policies for the Template
Installing Red Hat Enterprise Linux 7.2
Setting Up Password-less Login
Creating a Red Hat Enterprise Linux (RHEL) 7.2 Local Repo
Creating the Red Hat Repository Database.
Set Up all Nodes to use the RHEL Repository
Upgrading the Cisco Network driver for VIC1227
Disable Transparent Huge Pages
Configuring Data Drives on Name Node and Other Management Nodes
Configuring Data Drives on Data Nodes
Configuring the Filesystem for NameNodes and Datanodes
Install and Setup Ambari Server on rhel1
External Database PostgreSQL Installation
Setting up the Ambari Server on the Admin Node (rhel1)
Logging into the Ambari Server
Summary of the Installation Process
Configuring Disk Drives for the Operating System on HDF Nodes
Configuring Disk Drives for Data on HDF Nodes
Configuring the Filesystem for HDF Data Disks.
Post OS configuration of HDF Nodes
Appendix B – A Sample Application with HDF Using Live Twitter Stream with Apache Solr and Banana
Configuring Apache Nifi Project with Solr
Twitter Dashboard of the Demo Application
In the early days, most enterprises focused on batch processing to unlock the value of big data. Although enterprises benefited from batch processing, companies today want to extract value from their increasing volumes of data in real-time. Sensors, Internet of Things (IoT) devices, social networking and online transactions are all generating data that needs to be captured, monitored and rapidly processed to make data-based decisions instantly.
This strong demand for processing both data-in-motion and data-at-rest has been the main factor promoting the development of the Lambda architecture, which supports both batch and real-time processing of data. It includes support for complex event processing with applications such as Kafka and Storm, near real-time analytics with Spark Streaming, interactive SQL with Hive and HAWQ, as well as persistence and batch analytics with the Hadoop Distributed File System (HDFS) and MapReduce.
The Hortonworks Data Platform (HDP) is the industry’s enterprise-ready open-source Apache Hadoop framework that completely addresses the needs of data-at-rest processing, powers real-time customer applications and accelerates decision-making and innovation.
Hortonworks Dataflow (HDF) accelerates deployment of big data infrastructure and enables real-time analysis via an intuitive graphical user interface. HDF simplifies, streamlines and secures real-time data collection from distributed, heterogeneous data sources and provides a coding-free, off-the-shelf UI for on-time big data insights. HDF provides a better approach to data collection which is simple, secure, flexible and easily extensible.
Building next-generation big data architecture requires simplified and centralized management, high performance, and a linearly-scaling infrastructure and software platform. Cisco UCS® Integrated Infrastructure for Big Data and Analytics with HDP and HDF powers the next-generation architecture for big data systems, spanning a myriad of use cases including IoT, fraud analytics, and precision medicine via genome sequencing, among others. The configuration detailed in this document can be extended to clusters of various sizes depending on what the application demands. Up to 80 servers (5 racks) can be supported with no additional switching in a single UCS domain. Scaling beyond 5 racks (80 servers) can be implemented by interconnecting multiple UCS domains using Nexus 9000 Series switches or Application Centric Infrastructure (ACI), scalable to thousands of servers and to hundreds of petabytes of storage, all managed from a single pane using UCS Central.
Big Data is now all about data-in-motion, data-at-rest and analytic applications. This solution unlocks the value of big data while maximizing existing investments.
The speed of today’s processing systems has moved from classical data warehousing batch reporting to the realm of real-time processing and analytics. Real-time means near-zero latency, and access to information as it is generated to enable real-time decision-making from streaming data sources.
Big data, data science and analytics now have the potential to radically change how industries deliver services. They now empower consumers to make informed decisions with new business insights.
Apache Hadoop is the most popular big data framework. The technology is evolving rapidly to enable decisions for business strategy, technology architecture and implementation approaches for rapid success.
This solution is a simple and linearly scalable architecture that provides data processing on the HDP that caters to both batch and real-time processing with a centrally managed automated Hadoop deployment, providing all the benefits of the UCS Integrated Infrastructure for Big Data and Analytics.
With this solution you can deploy HDF on an existing Hadoop cluster or on a completely new cluster. This implementation addresses batch processing and stream processing combined with other technologies like NiFi, Kafka, etc.
Some of the features of this solution include:
· Infrastructure for both big data and agile analytics.
· Simplified infrastructure management via the Cisco UCS Manager.
· Flexible big data platform which works for both batch and real time processing.
· Architectural scalability - linear scaling based on data requirements.
· Usage of Hortonworks Data Platform (HDP) for comprehensive cluster monitoring and management.
This solution is based on the Cisco UCS Integrated Infrastructure for Big Data and Analytics which includes compute, storage, network and unified management capabilities to help companies manage the immense amount of data they collect today. It is built on the Cisco Unified Computing System (Cisco UCS) infrastructure using Cisco UCS 6200 Series Fabric Interconnects and Cisco UCS C-Series Rack Servers. This architecture is specifically designed for performance and linear scalability for big data workloads.
This document describes the architecture and deployment procedures for Hortonworks Data Platform (HDP) and Hortonworks Data Flow (HDF) on a 64 Cisco UCS C240 M4 node cluster based on Cisco UCS Integrated Infrastructure for Big Data and Analytics. The intended audience of this document includes, but is not limited to, sales engineers, field consultants, professional services, IT managers, partner engineering and customers who want to deploy the Hortonworks Connected Data Platform on Cisco UCS Integrated Infrastructure for Big Data and Analytics.
This CVD describes in detail the process for installing Hortonworks 2.4.2 with Apache Spark, Kafka, Storm and NiFi including the configuration details of the cluster. It also details application configuration for the HDF libraries. The current version of Cisco UCS Integrated Infrastructure for Big Data and Analytics offers the following configurations depending on the compute and storage requirements as shown in Table 1.
Table 1 Cisco UCS Integrated Infrastructure for Big Data and Analytics Configuration Details
Performance Optimized Option 1 (UCS-SL-CPA4-P1) |
Performance Optimized Option 2 (UCS-SL-CPA4-P2) |
Performance Optimized Option 3 UCS-SL-CPA4-P3 |
Capacity Optimized Option 1 UCS-SL-CPA4-C1 |
Capacity Optimized Option 2 UCS-SL-CPA4-C2 |
2 Cisco UCS 6296 UP, 96 ports Fabric Interconnects. 16 Cisco UCS C240 M4 Rack Servers (SFF), each with: 2 Intel Xeon processors E5-2680 v4 CPUs (14 cores on each CPU) 256 GB of memory Cisco 12-Gbps SAS Modular Raid Controller with 2-GB flash-based write cache (FBWC) 24 1.2-TB 10K SFF SAS drives (460 TB total) 2 240-GB 6-Gbps 2.5-inch Enterprise Value SATA SSDs for Boot Cisco UCS VIC 1227 (with 2 10 GE SFP+ ports) |
2 Cisco UCS 6296 UP, 96 ports Fabric Interconnects. 16 Cisco UCS C240 M4 Rack Servers (SFF), each with: 2 Intel Xeon processors E5-2680 v4 CPUs (14 cores on each CPU) 256 GB of memory Cisco 12-Gbps SAS Modular Raid Controller with 2-GB flash-based write cache (FBWC) 24 1.8-TB 10K SFF SAS drives (691 TB total) 2 240-GB 6-Gbps 2.5-inch Enterprise Value SATA SSDs for Boot Cisco UCS VIC 1227 (with 2 10 GE SFP+ ports) |
2 Cisco UCS 6332 Fabric Interconnects. 16 Cisco UCS C240 M4 Rack Servers (SFF), each with: 2 Intel Xeon processors E5-2680 v4 CPUs (14 cores on each CPU) 256 GB of memory Cisco 12-Gbps SAS Modular Raid Controller with 2-GB flash-based write cache (FBWC) 24 1.8-TB 10K SFF SAS drives (460 TB total) 2 240-GB 6-Gbps 2.5-inch Enterprise Value SATA SSDs for Boot Cisco UCS VIC 1387 (with 2 40 GE SFP+ ports) |
2 Cisco UCS 6296 UP, 96 ports Fabric Interconnects. 16 Cisco UCS C240 M4 Rack Servers (LFF), each with: 2 Intel Xeon processors E5-2620 v4 CPUs (8 128 GB of memory Cisco 12-Gbps SAS Modular Raid Controller with 2-GB flash-based write cache (FBWC) 12 6-TB 7.2K LFF SAS drives (1152 TB total) 2 240-GB 6-Gbps 2.5-inch Enterprise Value SATA SSDs for Boot Cisco UCS VIC 1227 (with 2 10 GE SFP+ ports) |
2 Cisco UCS 6296 UP, 96 ports Fabric Interconnects. 16 Cisco UCS C240 M4 Rack Servers (LFF), each with: 2 Intel Xeon processors E5-2620 v4 CPUs (8 256 GB of memory Cisco 12-Gbps SAS Modular Raid Controller with 2-GB flash-based write cache (FBWC) 12 8-TB 7.2K LFF SAS drives (1536 TB total) 2 240-GB 6-Gbps 2.5-inch Enterprise Value SATA SSDs for Boot Cisco UCS VIC 1227 (with 2 10 GE SFP+ ports) |
Companies have realized the power of big data and are now collecting more data than ever before. They need to get value from this data, often in real-time. Sensors, IoT devices, social network data and online transactions are just a few examples. They are all generating data continuously, 24x7. This data needs to be captured, monitored and processed quickly in order to make informed, data-driven decisions in real-time.
In addition, real-time streaming data needs to be sent to the company’s enterprise-wide data store where it can be used for traditional analysis and reporting, data discovery and as input to sophisticated machine-learning algorithms.
Real-time data processing at scale has very specialized requirements and one of the most popular tools in use today is Apache Spark. By moving the computation into memory Spark enables a wide variety of processing, including: traditional batch jobs, interactive analysis, real-time streaming and machine learning. Spark accomplishes this through a powerful set of built-in libraries: Spark Core, Spark Streaming, Spark SQL and MLlib. The internal details of these libraries are described below in the Technology Overview section of this document.
Spark enables applications in Hadoop clusters to run faster by caching datasets. With the data now being available in RAM instead of on disk, performance is improved dramatically, especially for iterative algorithms that access the same dataset repeatedly. In addition to traditional Map and Reduce operations, Spark’s unique features enable interactive and batch operations with SQL-like queries, machine learning and graph data processing. Spark allows programmers to develop complex, multi-step data pipelines using directed acyclic graph (DAG) patterns. It also supports in-memory data sharing across DAGs so that different jobs can work with the same data.
Analyzing streaming data at the edge using fog nodes for data collection, Apache Kafka for transport and Spark for analysis are becoming very common in the industry. Dashboards and visualization software on top of these analysis platforms enable enterprises to visualize and monitor their business in real-time.
The data can also be ingested into the enterprise distributed data lake, traditional SQL databases and NoSQL databases where it can be used to power dashboards, reporting, interactive analysis, data mining and machine learning.
The Hortonworks Connected Data Platform enables the creation of systems that implement the Lambda Architecture, a data-processing architecture designed to handle massive quantities of data by taking advantage of both batch- and stream-processing methods. This approach to architecture balances latency, throughput and fault-tolerance by using batch processing to provide comprehensive and accurate views of batch data, while simultaneously using real-time stream processing to provide views of online data. The two view outputs may be joined before presentation providing historical, interactive and real-time views of the data.
Figure 1 below depicts the merging of batch and real-time data. Data-in-motion is handled by HDF which collects the real-time data from the source, then filters, analyzes and delivers it to the target data store. HDF efficiently collects, aggregates and transports large amounts of streaming event data, processing it in real-time as necessary before sending it on. HDP is the industry’s enterprise-ready open-source Apache Hadoop framework that completely addresses the needs of data-at-rest processing, powers real-time customer applications and accelerates decision-making and innovation. Note the complete integration of dynamic, disparate and distributed data sources all with different formats, schemas, protocols, processing speeds and data sizes.
Figure 1 Typical Data Flow for Real Time Processing
The Cisco UCS Integrated Infrastructure for Big Data and Analytics, Hortonworks Data Platform and Hortonworks DataFlow are designed to accelerate the return-on-investment from big data. This solution delivers data from anywhere it originates to anywhere it needs to go.
With the ability to react quickly and make informed decisions based on real-time data, more businesses are moving beyond traditional batch-processing methods to faster, targeted, more informed approaches. Now processing can happen in seconds to minutes instead of hours to days.
The next section details the relevant reference architectures for meeting these real world challenges.
Figure 2 shows the base configuration of 64 nodes with SFF (1.8TB) drives. This also offers HA of the cluster with 3 management nodes.
Figure 2 Reference Architecture for HDP
Note: This CVD describes the installation process of HDP 2.4.2 for a 64 node (3 Management Nodes for HA + 61 Data Nodes) on the Performance Optimized Option 2 Cluster configuration. It also has details on how to add in HDF if needed as part of the same cluster.
Note: If a customer decides to use the 6300 series FI (40 G connectivity) for the configuration instead of the 6200 series FI in Performance Optimized Option 2, the only change will be to add in the Cisco VIC 1387, the rest of the configuration will be the same.
Figure 3 shows the complete reference architecture with HDF included for streaming data into the Hadoop cluster. In this architecture additional Cisco UCS C220 M4 nodes are added to the HDP architecture shown above.
Figure 3 Reference Architecture for Spark Streaming with HDF & HDP
Table 2 Configuration Details
Component |
Description |
Connectivity |
2 Cisco UCS 6296UP 96-Port Fabric Interconnects Up to 80 servers with no additional switching infrastructure |
HDP Nodes |
64 Cisco UCS C240 M4 Rack Servers Hadoop NameNode/Secondary NameNode and Resource Manager and Data Nodes. Spark Executors are collocated on a Data Node. *Please refer to the Service Assignment section for specific service assignment and configuration details. |
HDF Nodes |
8 Cisco UCS C220 M4 Rack Servers |
The Cisco UCS Integrated Infrastructure for Big Data and Analytics solution for Hortonworks, is based on Cisco UCS Integrated Infrastructure for Big Data and Analytics, a highly scalable architecture designed to meet a variety of scale-out application demands with seamless data integration and management integration capabilities built using the following components:
Cisco UCS 6200 Series Fabric Interconnects, as shown in Figure 4, provide high-bandwidth, low-latency connectivity for servers, with integrated, unified management provided for all connected devices by Cisco UCS Manager. Deployed in redundant pairs, Cisco Fabric Interconnects offer the full active-active redundancy, performance and exceptional scalability needed to support the large number of nodes that are typical in clusters serving big data applications. Cisco UCS Manager enables rapid and consistent server configuration using service profiles, automating ongoing system maintenance activities such as firmware updates across the entire cluster as a single operation. Cisco UCS Manager also offers advanced monitoring with options to raise alarms and send notifications about the health of the entire cluster.
Figure 4 Cisco UCS 6296UP 96-Port Fabric Interconnect
Cisco UCS 6300 Series Fabric Interconnects is the new series of Fabric Interconnects that Cisco introduced. The Cisco UCS 6300 Series Fabric Interconnects as shown in Figure 5, are a core part of Cisco UCS, providing low-latency, lossless 10 and 40 Gigabit Ethernet, Fiber Channel over Ethernet (FCoE), and Fiber Channel functions with management capabilities for the system. All servers attached to Fabric Interconnects become part of a single, highly available management domain.
Figure 5 Cisco UCS 6332 UP 32 -Port Fabric Interconnect
Cisco UCS C-Series Rack Mount C220 M4 High-Density Rack Servers (Small Form Factor Disk Drive Model), and Cisco UCS C240 M4 High-Density Rack Servers (Small Form Factor Disk Drive Model), are enterprise-class systems that support a wide range of computing, I/O and storage-capacity demands in compact designs, as shown in Figure 6 and Figure 7. Cisco UCS C-Series Rack-Mount Servers are based on the Intel Xeon E5-2600 v4 series processor family that delivers the best combination of performance, flexibility and efficiency gains with 12-Gbps SAS throughput. The Cisco UCS C240/C220 M4 servers provide 24 DIMM (PCIe) 3.0 slots and can support up to 1.5 TB of main memory (128 or 256 GB is typical for big data applications). It can support a range of disk drive and SSD options. Specifically, Cisco UCS C240 M4 supports twenty-four Small Form Factor (SFF) disk drives plus two (optional) internal SATA boot drives for a total of 26 internal drives in the Performance Optimized option or twelve Large Form Factor (LFF) disk drives option plus two (optional) internal SATA boot drives for a total of 14 internal drives are supported in the Capacity Optimized option. Cisco UCS Virtual Interface cards 1227 (VICs) are designed for the M4 generation of Cisco UCS C-Series Rack Servers (both C240 and C220 servers) and are optimized for high-bandwidth and low-latency cluster connectivity, with support for up to 256 virtual devices that are configured on demand through Cisco UCS Manager.
Figure 6 Cisco UCS C240 M4 Rack Server
Figure 7 Cisco UCS C220 M4 Rack Server (Small Form Factor Disk Drive Model)
Cisco UCS Virtual Interface Cards (VICs) are unique to Cisco. Cisco UCS Virtual Interface Cards as shown in Figure 8, incorporate next-generation converged network adapter (CNA) technology from Cisco and offer dual 10-Gbps ports designed for use with Cisco UCS C-Series Rack-Mount Servers. Optimized for virtualized networking, these cards deliver high performance and bandwidth utilization, and support up to 256 virtual devices. The Cisco UCS Virtual Interface Card (VIC) 1227 is a dual-port, Enhanced Small Form-Factor Pluggable (SFP+), 10 Gigabit Ethernet, and Fiber Channel over Ethernet (FCoE)-capable, PCI Express (PCIe) modular LAN on motherboard (mLOM) adapter.
The Cisco UCS Virtual Interface Card 1387 offers dual-port Enhanced Quad Small Form-Factor Pluggable (QSFP+) 40 Gigabit Ethernet and Fiber Channel over Ethernet (FCoE) in a modular-LAN-on-motherboard (mLOM) form factor. The mLOM slot can be used to install a Cisco VIC without consuming a PCIe slot providing greater I/O expandability. Shown in Figure 9, below.
Cisco UCS Manager resides within the Cisco UCS 6200 Series Fabric Interconnect, (shown in Figure 10). It makes the system self-aware and self-integrating, managing all of the system components as a single logical entity. Cisco UCS Manager can be accessed through an intuitive graphical user interface (GUI), a command-line interface (CLI) or an XML application-programming interface (API). Cisco UCS Manager uses service profiles to define the personality, configuration, and connectivity of all resources within Cisco UCS, radically simplifying provisioning of resources so that the process takes minutes instead of days. This simplification allows IT departments to shift their focus from constant maintenance to strategic business initiatives.
The Hortonworks Data Platform (HDP) delivers essential capabilities in a completely open, integrated and tested platform that is ready for enterprise usage. With Hadoop YARN at its core, HDP provides flexible enterprise data processing across a range of data processing engines, paired with comprehensive enterprise capabilities for governance, security and operations.
All the integration of the entire solution is thoroughly tested and fully documented. By taking the guesswork out of building out a Hadoop deployment, HDP gives a streamlined path to success in solving real business problems.
With YARN at its foundation, HDP provides a range of processing engines that allow users to interact with data in multiple and parallel ways, without the need to stand up individual clusters for each data set/application. Some applications require batch while others require interactive SQL or low-latency access with NoSQL. Other applications require search, streaming or in-memory analytics. Apache Solr, Storm and Spark fulfill those needs respectively.
To function as a true data platform, the YARN-based architecture of HDP enables the widest possible range of access methods to coexist within the same cluster avoiding unnecessary and costly data silos.
As shown in Figure 11, HDP Enterprise natively provides for the following data access types:
· Batch – Apache MapReduce has served as the default Hadoop processing engine for years. It is tested and relied upon by many existing applications.
· Interactive SQL Query - Apache Hive™ is the de facto standard for SQL interactions at petabyte scale within Hadoop. Hive delivers interactive and batch SQL querying across the broadest set of SQL semantics.
· Search - HDP integrates Apache Solr to provide high-speed indexing and sub-second search times across all your HDFS data.
· Scripting - Apache Pig is a scripting language for Hadoop that can run on MapReduce or Apache Tez, allowing you to aggregate, join and sort data.
· Low-latency access via NoSQL - Apache HBase provides extremely fast access to data as a columnar format, NoSQL database. Apache Accumulo also provides high-performance storage and retrieval, but with fine-grained access control to the data.
· Streaming - Apache Storm processes streams of data in real time and can analyze and take action on data as it flows into HDFS.
HDP delivers a comprehensive set of completely open operational capabilities that provide both visibilities into cluster health as well as the ability to manage, monitor and configure resources.
· Apache Ambari – is a completely open framework to provision, manage and monitor Apache Hadoop clusters. It provides a simple, elegant UI that allows you to image a Hadoop cluster.
· Apache Oozie - provides a critical scheduling capability to organize and schedule jobs within Enterprise Hadoop across all data access points.
Hortonworks DataFlow is a single combined platform for data acquisition, simple event processing, transport and delivery, designed to accommodate the highly diverse and complicated data flows generated by a world of connected people, systems and things. HDF enables simple, fast data acquisition, secure data transport, prioritized data flow and clear traceability of data from the very edge of your network all the way to the core data center. Through a combination of an intuitive visual interface, a high fidelity access and authorization mechanism and an “always on” chain of custody (data provenance) framework, HDF is the perfect complement to HDP to bring together historical and perishable insights for your business. A single integrated platform for data acquisition, simple event processing, transport and delivery mechanism from source to storage. Figure 12 shows the dataflow.
Figure 12 Hortonworks DataFlow Process
HDF was designed specifically to meet the practical challenges of collecting data from a wide range of disparate data sources securely, efficiently and over a geographically disperse and possibly fragmented network.
Hortonworks Dataflow enables enterprises to
· Accelerate big data ROI through a single data-source agnostic collection platform.
· Reduce cost and complexity through an intuitive, real-time visual user interface.
· Unprecedented yet simple to implement data security from source to storage.
· Better business decisions with highly granular data sharing policies.
· React in real time by leveraging bi-directional data flows and prioritized data feeds.
· Adapt to new data sources through an extremely scalable, extensible platform.
· Data collection – Integrated collection from dynamic, disparate and distributed sources of differing formats, schemas, protocols, speeds and sizes such as machines, geo location devices, click streams, files, social feeds, log files and videos.
· Real time decisions - Real-time evaluation of perishable insights at the edge as being pertinent or not, and executing upon consequent decisions to send, drop or locally store data as needed.
· Operational efficiency - Fast, effective drag and drop interface for creation, management, tuning and troubleshooting of dataflows, enabling coding free creation and adjustments of dataflows in five minutes or less.
· Security and provenance- Secure end-to-end routing from source to destination, with discrete user authorization and detailed real-time visual chain of custody and metadata (data provenance).
· Bi-directional dataflow - Reliably prioritize and transport data in real-time leveraging bi-directional dataflows to dynamically adapt to fluctuations in data volume, network connectivity and source and endpoint capacity.
· Command and control - Immediate ability to create, change, tune, view, start, stop, trace, parse, filter, join, merge, transform, fork, clone or replay dataflows through a visual user interface with real time operational visibility and feedback.
Figure 13 below displays a Hortonworks DataFlow Workflow Diagram.
Figure 13 Hortonworks DataFlow Workflow Diagram
Traditional servers are not designed to support the massive scalability, performance and efficiency requirements of big data solutions. These outdated and siloed computing solutions are difficult to integrate with network and storage resources, and are time-consuming to deploy and expensive to operate. Cisco UCS Integrated Infrastructure for Big Data and Analytics with Apache Spark takes a different approach, combining computing, networking, storage access and management capabilities into a unified, fabric-based architecture that is optimized for big data workloads.
Apache Spark enhances the existing big data environments by adding new capabilities to Hadoop or other big data deployments. The platform unifies a broad range of capabilities—batch processing, real-time stream processing, advanced analytic capabilities, and interactive exploration—that can intelligently optimize applications. Spark’s key advantage is speed, with an advanced DAG execution engine that supports cyclic data flow and in-memory computing. It can run programs much faster than Hadoop/Map-Reduce. Applications can be developed using built-in, high-level Apache Spark operations or they can be interactive applications with Python, R, and Scala shells or they can be in Java. These various options allow users to quickly and easily build new applications and explore data faster.
Spark provides programmers with an application interface centered on a data structure called the resilient distributed dataset (RDD), a read-only set of data items distributed over a cluster of machines that is maintained in a fault-tolerant way. Calculations are performed and results are delivered only when needed, and results can be configured to persist in memory which allows Apache Spark to deliver a new level of computing efficiency and computation performance to big data deployments.
Figure 14 Apache Spark Libraries
As shown in Figure 14 above, Apache Spark has a number of libraries:
· Apache Spark SQL/DataFrame API for querying structured data inside Spark programs.
· Apache Spark streaming offers Spark’s core API enabling real-time processing of streaming data, including web server log files, social media, and messaging queues.
· MLLib to take advantage of machine- learning algorithms and accelerate application performance across clusters.
Spark can access diverse data sources including HDFS, Cassandra, HBase, and S3. Spark with YARN is an optimal way to schedule and run Spark jobs on a Hadoop cluster alongside a variety of other data processing frameworks, leveraging existing clusters using queue-based placement policies, and enabling security by running on Kerberos enabled clusters.
Some common use cases that are popular in the field with Apache Spark:
Insurance |
Optimize claims reimbursement processes by using Spark’s machine learning capabilities to process and analyze all claims. |
Healthcare |
Build a Patient Care System using Spark Core, Streaming and SQL. |
Retail |
Use Spark to analyze point-of-sale data and coupon usage. |
Internet |
Use Spark’s ML capability to identify fake profiles and enhance product matches to show their customers. |
Banking |
Use a machine learning model to predict the profile of retail banking customers for certain financial products. |
Government |
Analyze spending across geography, time and category. |
Scientific Research |
Analyze earthquake events by time, depth and geography to predict future events. |
Investment Banking |
Analyze intra-day stock prices to predict future price movements. |
Geospatial Analysis |
Analyze Uber trips by time and geography to predict future demand and pricing. |
Twitter Sentiment Analysis |
Analyze large volumes of Tweets to determine positive, negative or neutral sentiment for specific organizations and products. |
Airlines |
Build a model for predicting airline travel delays. |
Devices |
Predict likelihood of a building exceeding threshold temperatures. |
Spark Streaming brings Spark's language-integrated API to stream processing. The API is provided in Java, Scala and Python. Spark’s single execution engine and unified programming model for batch and streaming lead to some unique benefits over other traditional streaming systems.
In Spark Streaming, batches of Resilient Distributed Datasets (RDDs) are passed to Spark Streaming which processes these batches using the Spark Engine and returns a processed stream of batches. This processed stream can be written to the file system.
Each batch of data is a Resilient Distributed Dataset (RDD), which is the basic abstraction of a fault-tolerant dataset in Spark. This common representation allows batch and streaming workloads to interoperate seamlessly. Users can apply arbitrary Spark functions on each batch of streaming data: for example, it’s easy to join a DStream (key programming abstraction in Spark Streaming) with a precomputed static dataset (an RDD). Spark interoperability extends to rich libraries like MLlib (machine learning), SQL and DataFrames.
Machine learning models generated offline with MLlib can be applied on streaming data. Fault tolerance in Spark Streaming is similar to fault tolerance in Spark. Spark Streaming is a streaming platform that allows reaching sub-second latency. The processing capability scales linearly with the size of the cluster. As a result, it is being used in production by many organizations.
Spark SQL allows users to query structured data inside Spark programs, using either SQL or DataFrame API. DataFrames and SQL provide a common way to access a variety of data sources, including Hive, Avro, Parquet, ORC, JSON and JDBC. Spark SQL brings native support for SQL to Spark and streamlines the process of querying data stored both in RDDs and in external sources. Spark SQL conveniently blurs the lines between RDDs and relational tables.
Spark SQL provides, a programming abstraction DataFrame that acts as a distributed SQL query engine. In addition to the data sources API, Spark SQL now makes it easier to compute over structured data stored in a wide variety of formats, including Parquet, JSON and the Apache Avro library. A built-in JDBC server makes it easy to connect to the structured data stored in relational database tables and perform big data analytics using the traditional BI tools.
Apache Kafka
Apache Kafka is a distributed publish-subscribe messaging system that is designed to be fast, scalable and durable. Kafka maintains feeds of messages in topics. Producers write data to topics and consumers read from topics. Since Kafka is a distributed system, topics are partitioned and replicated across multiple nodes. Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of coordinated consumers.
· Messages are simply byte arrays and developers can use them to store any object in any format with string, JSON, and Avro being the most common. It is possible to attach a key to each message, in which case the producer guarantees that all messages with the same key will arrive at the same partition.
· Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact. When consuming from a topic, it is possible to configure a consumer group with multiple consumers. Each consumer in a consumer group will read messages from a unique subset of partitions in each topic they subscribe to, so each message is delivered to one consumer in the group and all messages with the same key arrive at the same consumer.
Common Use Cases include:
· Stream Processing
· Website Activity Tracking
· Metrics Collection and Monitoring
· Log Aggregation
Some of the important characteristics that make Kafka such an attractive option for these use cases include the following:
Feature |
Description |
Scalability |
Distributed system scales easily with no downtime |
Durability |
Persists messages on disk, and provides intra-cluster replication |
Reliability |
Replicates data, supports multiple subscribers and automatically balances consumers in case of failure |
Performance |
High throughput for both publishing and subscribing, with disk structures that provide constant performance even with many terabytes of stored messages |
This CVD describes architecture and deployment procedures for Hortonworks Data Platform (HDP) 2.4.2 on a 64 Cisco UCS C240 M4SX node cluster based on Cisco UCS Integrated Infrastructure for Big Data and Analytics. The solution goes into detail configuring HDP 2.4.2 on the Cisco UCS Integrated infrastructure for Big Data. In addition it also details the configuration for Hortonworks Dataflow for various use cases.
The Performance cluster configuration consists of the following:
· Two Cisco UCS 6296UP Fabric Interconnects
· 64 Cisco UCS C240 M4 Rack-Mount servers (16 per rack)
· Four Cisco R42610 standard racks
· Eight Vertical Power distribution units (PDUs), (Country Specific)
Each rack consists of two vertical PDUs. The master rack consists of two Cisco UCS 6296UP Fabric Interconnects, sixteen Cisco UCS C240 M4 Servers connected to each of the vertical PDUs for redundancy; thereby, ensuring availability during power source failure. The expansion racks consists of sixteen Cisco UCS C240 M4 Servers connected to each of the vertical PDUs for redundancy; thereby, ensuring availability during power source failure, similar to the master rack. The Cisco UCS C220 M4 is included if installing the HDF with HDP.
Note: Please contact your Cisco representative for country specific information.
Table 3 describes the rack configurations of rack 1 (master rack) and racks 2-4 (expansion racks).
Table 3 Rack 1 (Master Rack) Racks 2-4 (Expansion Racks)
Cisco |
Master Rack |
Cisco |
Expansion Rack |
42URack |
|
42URack |
|
42 |
Cisco UCS FI 6296UP |
42 |
Unused |
41 |
41 |
Unused |
|
40 |
Cisco UCS FI 6296UP |
40 |
Unused |
39 |
39 |
Unused |
|
38 |
Cisco UCS C220 M4 |
38 |
Cisco UCS C220 M4 |
37 |
Cisco UCS C220 M4 |
37 |
Cisco UCS C220 M4 |
36 |
Unused |
36 |
Unused |
35 |
35 |
Unused |
|
34 |
Unused |
34 |
Unused |
33 |
33 |
Unused |
|
32 |
Cisco UCS C240 M4 |
32 |
Cisco UCS C240 M4 |
31 |
31 |
|
|
30 |
Cisco UCS C240 M4 |
30 |
Cisco UCS C240 M4 |
29 |
29 |
|
|
8 |
Cisco UCS C240 M4 |
28 |
Cisco UCS C240 M4 |
27 |
27 |
|
|
26 |
Cisco UCS C240 M4 |
26 |
Cisco UCS C240 M4 |
25 |
25 |
|
|
24 |
Cisco UCS C240 M4 |
24 |
Cisco UCS C240 M4 |
23 |
23 |
|
|
22 |
Cisco UCS C240 M4 |
22 |
Cisco UCS C240 M4 |
21 |
21 |
|
|
20 |
Cisco UCS C240 M4 |
20 |
Cisco UCS C240 M4 |
19 |
19 |
|
|
18 |
Cisco UCS C240 M4 |
18 |
Cisco UCS C240 M4 |
17 |
17 |
Cisco UCS C240 M4 |
|
16 |
Cisco UCS C240 M4 |
16 |
|
15 |
15 |
|
|
14 |
Cisco UCS C240 M4 |
14 |
Cisco UCS C240 M4 |
13 |
13 |
|
|
12 |
Cisco UCS C240 M4 |
12 |
Cisco UCS C240 M4 |
11 |
11 |
|
|
10 |
Cisco UCS C240 M4 |
10 |
Cisco UCS C240 M4 |
9 |
9 |
|
|
8 |
Cisco UCS C240 M4 |
8 |
Cisco UCS C240 M4 |
7 |
7 |
|
|
6 |
Cisco UCS C240 M4 |
6 |
Cisco UCS C240 M4 |
5 |
5 |
|
|
4 |
Cisco UCS C240 M4 |
4 |
Cisco UCS C240 M4 |
3 |
3 |
|
|
2 |
Cisco UCS C240 M4 |
2 |
Cisco UCS C240 M4 |
1 |
1 |
|
Port Type |
Port Number |
Network |
1 |
Server |
2 to 65 |
The Cisco C-Series M4 rack server is equipped with Intel Xeon E5-2680 v4 processors; 256 GB of memory, Cisco UCS Virtual Interface Card 1227, Cisco 12-Gbps SAS Modular Raid Controller with 2-GB FBWC, Cisco UCS C240 M4 servers equipped with 24 1.8-TB 10K SFF SAS drives, 2 240-GB SATA SSD for Boot. The Cisco UCS C220 M4 servers have 8 960GB SFF SSDs
Figure 15 illustrates the port connectivity between the Fabric Interconnect, and Cisco UCS C240 M4 server. Sixteen Cisco UCS C240 M4 servers are used in Master rack configurations. Server configuration with Cisco UCS C220 M4 is similar as in Cisco UCS C240 M4.
Figure 15 Fabric Topology for Cisco C240 M4
For more information on physical connectivity and single-wire management see:
For more information on physical connectivity illustrations and cluster setup, see:
Figure 16 depicts a 64-node cluster. Each rack has 16 Cisco UCS C240 M4 servers. Each link in the figure represents 16 x 10 Gigabit Ethernet links from each of the 16 servers connecting to a Fabric Interconnect as a Direct Connect. Every server is connected to both Fabric Interconnects represented with a dual link.
Figure 16 64 Nodes Cluster Configuration
The software distributions required versions are listed below.
The Hortonworks Data Platform supported is HDP 2.4. For more information go to http://www.hortonworks.com.
The operating system supported is Red Hat Enterprise Linux 7.2. For more information visit http://www.redhat.com.
The software versions tested and validated in this document are shown in Table 4.
Layer |
Component |
Version or Release |
Compute |
Cisco UCS C240-M4 |
C240M4.2.0.10c |
Network |
Cisco UCS 6296UP |
UCS 3.1(1g) A |
Cisco UCS VIC1227 Firmware |
4.1.1(d) |
|
Cisco UCS VIC1227 Driver |
2.3.0.20 |
|
Storage |
LSI SAS 3108 |
24.9.1-0011 |
|
LSI MegaRAID SAS Driver |
06.810.10.00 |
Software |
Red Hat Enterprise Linux Server |
7.2 (x86_64) |
Cisco UCS Manager |
3.1(1g) |
|
HDP |
2.4 |
Note: The latest drivers can be downloaded from the link below:
https://software.cisco.com/download/release.html?mdfid=283862063&flowid=25886&softwareid=283853158&release=1.5.7d&relind=AVAILABLE&rellifecycle=&reltype=latest
Note: The Latest Supported RAID controller Driver is already included with the RHEL 7.2 operating system.
Note: Cisco C240 M4 Rack Servers with Broadwell (E5 -2600 v4) CPUs are supported from Cisco UCS firmware 3.1(1g) onwards.
This section provides details for configuring a fully redundant, highly available Cisco UCS 6296 fabric configuration.
· Initial setup of the Fabric Interconnect A and B.
· Connect to UCS Manager using virtual IP address of using the web browser.
· Launch UCS Manager.
· Enable server, uplink and appliance ports.
· Start discovery process.
· Create pools and polices for service profile template.
· Create Service Profile template and 64 Service profiles.
· Associate Service Profiles to servers.
This section describes the initial setup of the Cisco UCS 6296 Fabric Interconnects A and B.
1. Connect to the console port on the first Cisco UCS 6296 Fabric Interconnect.
2. At the prompt to enter the configuration method, enter console to continue.
3. If asked to either perform a new setup or restore from backup, enter setup to continue.
4. Enter y to continue to set up a new Fabric Interconnect.
5. Enter y to enforce strong passwords.
6. Enter the password for the admin user.
7. Enter the same password again to confirm the password for the admin user.
8. When asked if this fabric interconnect is part of a cluster, answer y to continue.
9. Enter A for the switch fabric.
10. Enter the cluster name for the system name.
11. Enter the Mgmt0 IPv4 address.
12. Enter the Mgmt0 IPv4 netmask.
13. Enter the IPv4 address of the default gateway.
14. Enter the cluster IPv4 address.
15. To configure DNS, answer y.
16. Enter the DNS IPv4 address.
17. Answer y to set up the default domain name.
18. Enter the default domain name.
19. Review the settings that were printed to the console, and if they are correct, answer yes to save the configuration.
20. Wait for the login prompt to make sure the configuration has been saved.
1. Connect to the console port on the second Cisco UCS 6296 Fabric Interconnect.
2. When prompted to enter the configuration method, enter console to continue.
3. The installer detects the presence of the partner Fabric Interconnect and adds this fabric interconnect to the cluster. Enter y to continue the installation.
4. Enter the admin password that was configured for the first Fabric Interconnect.
5. Enter the Mgmt0 IPv4 address.
6. Answer yes to save the configuration.
7. Wait for the login prompt to confirm that the configuration has been saved.
For more information on configuring Cisco UCS 6200 Series Fabric Interconnect, see: http://www.cisco.com/en/US/docs/unified_computing/ucs/sw/gui/config/guide/2.0/b_UCSM_GUI_Configuration_Guide_2_0_chapter_0100.html.
To login to Cisco UCS Manager, complete the following steps:
1. Open a Web browser and navigate to the Cisco UCS 6296 Fabric Interconnect cluster address.
2. Click the Launch link to download the Cisco UCS Manager software.
3. If prompted to accept security certificates, accept as necessary.
4. When prompted, enter admin for the username and enter the administrative password.
5. Click Login to log in to the Cisco UCS Manager.
This document assumes the use of UCS 3.1(1g) Refer to Cisco UCS 3.1 Release (upgrade the Cisco UCS Manager software and Cisco UCS 6296 Fabric Interconnect software to version 3.1(1g). Also, make sure the Cisco UCS C-Series version 3.1(1g) software bundles is installed on the Fabric Interconnects.
To create a block of KVM IP addresses for server access in the Cisco UCS environment, complete the following steps.
1. Select the LAN tab at the top of the left window.
2. Select Pools > IpPools > Ip Pool ext-mgmt.
3. Right-click IP Pool ext-mgmt.
4. Select Create Block of IPv4 Addresses.
Figure 17 below shows the initial screen for adding the block of IP addresses.
Figure 17 Adding a Block of IPv4 Addresses for KVM Access Part 1
5. Enter the starting IP address of the block and number of IPs needed, as well as the subnet and gateway information. Figure 18 below shows the window.
Figure 18 Adding a Block of IPv4 Addresses for KVM Access Part 2
6. Click OK to create the IP block.
7. Click OK in the message box.
To enable uplinks ports, complete the following steps:
1. Select the Equipment tab on the top left of the window.
2. Select Equipment > Fabric Interconnects > Fabric Interconnect A (primary) > Fixed Module.
3. Expand the Unconfigured Ethernet Ports section.
4. Select port 1 that is connected to the uplink switch, right-click, then select Reconfigure > Configure as Uplink Port.
5. Select Show Interface and select 10GB for Uplink Connection.
6. A pop-up window appears to confirm your selection. Click Yes then OK to continue.
7. Select Equipment > Fabric Interconnects > Fabric Interconnect B (subordinate) > Fixed Module.
8. Expand the Unconfigured Ethernet Ports section.
9. Select port number 1, which is connected to the uplink switch, right-click, then select Reconfigure > Configure as Uplink Port.
10. Select Show Interface and select 10GB for Uplink Connection.
11. A pop-up window appears to confirm your selection. Click Yes then OK to continue.
Figure 19 shows the access screen selections.
Figure 19 Enabling Uplink Ports
VLANs are configured as in shown in Table 5.
VLAN |
NIC Port |
Function |
VLAN19 |
eth0 |
Data |
The NIC will carry the data traffic from VLAN19. A single vNIC is used in this configuration and the Fabric Failover feature in Fabric Interconnects will take care of any physical port down issues. It will be a seamless transition from an application perspective.
To configure VLANs in the Cisco UCS Manager GUI, complete the following steps:
1. Select the LAN tab in the left pane in the UCSM GUI.
2. Select LAN > LAN Cloud > VLANs.
3. Right-click the VLANs under the root organization.
4. Select Create VLANs to create the VLAN.
Figure 20 shows the VLAN selection window.
5. Enter vlan19 for the VLAN Name.
6. Keep multicast policy as <not set>.
7. Select Common/Global for vlan19.
8. Enter 19 in the VLAN IDs field for the Create VLAN IDs.
9. Click OK and then, click Finish.
10. Click OK in the success message box.
Figure 21 below shows the Create VLANs window.
Figure 21 Creating VLAN for Data
11. Click OK and then, click Finish.
To enable server ports, complete the following steps:
1. Select the Equipment tab on the top left of the window.
2. Select Equipment > Fabric Interconnects > Fabric Interconnect A (primary) > Fixed Module.
3. Expand the Unconfigured Ethernet Ports section.
4. Select all the ports that are connected to the Servers right-click them, and select Reconfigure > Configure as a Server Port.
5. A pop-up window appears to confirm your selection. Click Yes then OK to continue.
6. Select Equipment > Fabric Interconnects > Fabric Interconnect B (subordinate) > Fixed Module.
7. Expand the Unconfigured Ethernet Ports section.
8. Select all the ports that are connected to the Servers right-click them, and select Reconfigure > Configure as a Server Port.
9. A pop-up window appears to confirm your selection. Click Yes, then OK to continue.
The screen for selecting and enabling server ports is shown in Figure 22 below.
Figure 22 Enabling Server Ports
After the Server Discovery, Port 1 will be a Network Port and 2-65 will be Server Ports.
Figure 23 below shows the list of Ethernet Ports created.
Figure 23 List of Ethernet Ports created.
Noy
Organizations are used to arrange and restrict access to various groups within the IT organization, thereby enabling multi-tenancy of the compute resources. This document does not assume the use of Organizations; however the necessary steps are provided for future reference.
To configure an organization within the Cisco UCS Manager GUI, complete the following steps:
1. Click New on the top left corner in the right pane in the UCS Manager GUI.
2. Select Create Organization from the options
3. Enter a name for the organization.
4. (Optional) Enter a description for the organization.
5. Click OK.
6. Click OK in the success message box.
To create MAC address pools, complete the following steps:
1. Select the LAN tab on the left of the window.
2. Select Pools > root.
3. Right-click MAC Pools under the root organization.
4. Select Create MAC Pool to create the MAC address pool. Enter ucs for the name of the MAC pool.
5. (Optional) Enter a description of the MAC pool.
6. Select Assignment Order Sequential.
7. Click Next.
8. Click Add.
9. Specify a starting MAC address.
10. Specify a size of the MAC address pool, which is sufficient to support the available server resources.
11. Click OK.
Figure 24 and Figure 25 show the windows for setting up the MAC Pool.
Figure 25 Specifying first MAC Address and Size
12. Click Finish.
Figure 26 below shows the screen to add the MAC address.
Figure 26 Adding MAC Addresses
13. When the message box displays, click OK.
A server pool contains a set of servers. These servers typically share the same characteristics. Those characteristics can be their location in the chassis, or an attribute such as server type, amount of memory, local storage, type of CPU, or local drive configuration. You can manually assign a server to a server pool, or use server pool policies and server pool policy qualifications to automate the assignment
To configure the server pool within the Cisco UCS Manager GUI, complete the following steps:
1. Select the Servers tab in the left pane in the UCS Manager GUI.
2. Select Pools > root.
3. Right-click the Server Pools.
4. Select Create Server Pool.
5. Enter your required name (ucs) for the Server Pool in the name text box.
6. (Optional) enter a description for the organization.
7. Click Next > to add the servers.
Figure 27 below show the Server Name and Description input window.
Figure 27 Server Name and Description
8. Select all the Cisco UCS C240M4SX servers to be added to the server pool that was previously created (ucs), then Click >> to add them to the pool.
9. Click Finish.
10. Click OK and then click Finish.
Figure 28 below displays the screen for adding servers.
Firmware management policies allow the administrator to select the corresponding packages for a given server configuration. These include adapters, BIOS, board controllers, FC adapters, HBA options, and storage controller properties as applicable.
To create a firmware management policy for a given server configuration using the Cisco UCS Manager GUI, complete the following steps:
1. Select the Servers tab in the left pane in the Cisco UCS Manager GUI.
2. Select Policies > root.
3. Right-click Host Firmware Packages.
4. Select Create Host Firmware Package.
5. Enter the required Host Firmware package name (ucs).
6. Select Simple radio button to configure the Host Firmware package.
7. Select the appropriate Rack package that has been installed.
8. Click OK to complete creating the management firmware package
9. Click OK.
Figure 29 shows the Create Host Firmware window.
Figure 29 Creating a Host Firmware Package
To create the QoS policy for a given server configuration using the Cisco UCS Manager GUI, complete the following steps:
1. Select the LAN tab in the left pane in the UCS Manager GUI.
2. Select Policies > root.
3. Right-click QoS Policies.
4. Select Create QoS Policy.
Figure 30 shows the selection process to create a QoS policy.
5. Enter Platinum as the name of the policy.
6. Select Platinum from the drop down menu.
7. Keep the Burst(Bytes) field set to default (10240).
8. Keep the Rate(Kbps) field set to default (line-rate).
9. Keep Host Control radio button set to default (none).
10. Once the pop-up window appears (Figure 31), click OK to complete the creation of the Policy.
Figure 31 QoS Policy Confirmation screen
To set Jumbo Frames and enable QoS, complete the following steps:
1. Select the LAN tab in the left pane in the Cisco UCSM GUI, as shown in Figure 32, below.
2. Select LAN Cloud > QoS System Class.
3. In the right pane, select the General tab.
4. In the Platinum row, enter 9216 for MTU.
5. Check the Enabled Check box next to Platinum.
6. In the Best Effort row, select none for weight.
7. In the Fiber Channel row, select none for weight.
8. Click Save Changes.
9. Click OK.
To create local disk configuration in the Cisco UCS Manager GUI, complete the following steps:
1. Select the Servers tab on the left pane in the UCS Manager GUI, as shown below in Figure 33.
2. Go to Policies > root.
3. Right-click Local Disk Config Policies.
4. Select Create Local Disk Configuration Policy.
5. Enter ucs as the local disk configuration policy name.
6. Change the Mode to Any Configuration. Check the Protect Configuration box.
7. Keep the FlexFlash State field as default (Disable).
8. Keep the FlexFlash RAID Reporting State field as default (Disable).
9. Click OK to complete the creation of the Local Disk Configuration Policy.
10. Click OK.
Figure 33 Cisco UCSM Servers Tab
The BIOS policy feature in Cisco UCS automates the BIOS configuration process. The traditional method of setting the BIOS is manually, and is often error-prone. By creating a BIOS policy and assigning the policy to a server or group of servers, can enable transparency within the BIOS settings configuration.
Note: BIOS settings can have a significant performance impact, depending on the workload and the applications. The BIOS settings listed in this section is for configurations optimized for best performance which can be adjusted based on the application, performance, and energy efficiency requirements.
To create a server BIOS policy using the Cisco UCS Manager GUI, complete the following steps:
1. Select the Servers tab in the left pane in the UCS Manager GUI, as shown in Figure 34 below.
2. Select Policies > root.
3. Right-click BIOS Policies.
4. Select Create BIOS Policy.
5. Enter your preferred BIOS policy name (ucs).
6. Change the BIOS settings as shown in the following figures.
7. Changes need to be made only to the Processor and RAS Memory settings.
Figure 34 Create Server BIOS Policy Processor screen
8. Set RAS Memory to Maximum Performance/enabled, and DRAM refresh rate to 1x, as shown in Error! Reference source not found. below.
Figure 35 RAS Memory screen
To create boot policies within the Cisco UCS Manager GUI, complete the following steps:
1. Select the Servers tab in the left pane in the UCS Manager GUI.
2. Select Policies > root.
3. Right-click the Boot Policies.
4. Select Create Boot Policy.
Figure 36 shows the screen.
Figure 36 Create Server Boot Policy
5. Enter ucs as the boot policy name.
6. (Optional) enter a description for the boot policy.
7. Keep the Reboot on Boot Order Change check box unchecked.
8. Keep Enforce vNIC/vHBA/iSCSI Name check box checked.
9. Keep Boot Mode Default (Legacy).
10. Expand Local Devices > Add CD/DVD and select Add Local CD/DVD.
11. Expand Local Devices and select Add Local Disk.
12. Expand vNICs and select Add LAN Boot and enter eth0.
13. Click OK to add the Boot Policy.
14. Click OK.
Figure 37 shows the Create Boot Policy screen to add the LAN Boot.
Figure 37 Create Boot Policy/Add LAN Boot screen
To create Power Control policies within the Cisco UCS Manager GUI, complete the following steps:
1. Select the Servers tab in the left pane in the UCS Manager GUI.
2. Select Policies > root.
3. Right-click the Power Control Policies.
4. Select Create Power Control Policy.
Figure 38 displays the Create Power Control Policy screen.
Figure 38 Create Power Control Policy
5. Enter ucs as the Power Control policy name.
6. (Optional) enter a description for the boot policy.
7. Select Performance for Fan Speed Policy.
8. Select No cap for Power Capping selection.
9. Click OK to create the Power Control Policy.
10. Click OK.
Figure 39 shows the Create Power Control Policy, Power Capping selection.
Figure 39 Create Power Control Policy
To create a Service Profile Template, complete the following steps:
1. Select the Servers tab in the left pane in the UCSM GUI.
2. Right-click Service Profile Templates.
3. Select Create Service Profile Template.
Figure 40 shows the Create Service Profile Template selection process.
Figure 40 Create Service Profile Template
The Create Service Profile Template window appears.
To identify the service profile template, complete the following steps as in Figure 41:
1. Name the service profile template ucs. Select the Updating Template radio button.
2. In the UUID section, select Hardware Default as the UUID pool.
3. Click Next to continue to the next section.
Figure 41 Identifying the Service Profile Template
To configure storage policies, complete the following steps:
1. Go to the Local Disk Configuration Policy tab, and select ucs for the Local Storage as shown in Figure 42.
2. Click Next to continue to the next section.
Figure 42 Storage Provisioning
3. Click Next. The Networking window opens.
To configure network settings for the template, complete the following steps as shown in Figure 43.
1. Keep the Dynamic vNIC Connection Policy field at the default.
2. Select Expert radio button for the option how would you like to configure LAN connectivity?
3. Click Add to add a vNIC to the template.
The Create vNIC window opens, as shown in Figure 44.
4. Name the vNIC eth0.
5. Select ucs in the Mac Address Assignment pool.
6. Select the Fabric A radio button.
7. Check the Enable failover check box for the Fabric ID.
8. Check the VLAN19 check box for VLANs, and select the Native VLAN radio button.
9. Select MTU size as 9000.
Perform the next steps as shown in Figure 45 below.
10. Select Adapter Policy as Linux.
11. Select QoS Policy as Platinum.
12. Keep the Network Control Policy as Default.
13. Click OK.
Figure 45 Operational Parameters/Adapter Performance Profile
Figure 46 Networking LAN Configuration
Figure 46 above shows the Network vLAN 19 is set up.
Note: Optionally Network Bonding can be setup on the vNICs for each host for redundancy as well as for increased throughput; steps for this are captured in the Appendix 1.
14. Click Next to continue with SAN Connectivity as shown in Figure 47.
15. Select no vHBAs for How would you like to configure SAN Connectivity?
16. Click Next to continue with Zoning as shown in Figure 48.
No changes need to be made to the Zoning screen.
17. Click Next to continue with vNIC/vHBA placement and shown in Figure 49.
No changes are made to the vNIC/vHBA placement screen.
18. Click Next to configure vMedia Policy.
1. The vMedia Policy window appears as shown in Figure 50, click next to go to the next section.
To set the boot order for the servers, complete the following steps, as shown in Figure 51 below:
1. Select ucs in the Boot Policy name field.
2. Review to make sure that all of the boot devices were created and identified.
3. Verify that the boot devices are in the correct boot sequence.
4. Click OK.
5. Click Next to continue to the next section.
Figure 51 Server Boot Order Screen
6. In the Maintenance Policy window, (Figure 52), apply the maintenance policy.
7. Keep the Maintenance policy set to no policy used by default.
8. Click Next to continue to the next section.
Figure 52 Maintenance Policy Window
In the Server Assignment window (Figure 53), to assign the servers to the pool, complete the following steps:
1. Select ucs for the Pool Assignment field.
2. Select the power state to be Up.
3. Keep the Server Pool Qualification field set to <not set>.
4. Check the Restrict Migration check box.
5. Select ucs in Host Firmware Package.
Figure 53 Server Assignment Window
In the Operational Policies Window (Figure 54), complete the following steps:
1. Select ucs in the BIOS Policy field.
2. Select ucs in the Power Control Policy field.
Figure 54 Operational Policies Window
3. Click Finish to create the Service Profile template.
4. Click OK in the pop-up window to proceed.
5. Select the Servers tab in the left pane of the UCS Manager GUI as shown in Figure 55.
6. Go to Service Profile Templates > root.
7. Right-click Service Profile Templates ucs.
8. Select Create Service Profiles From Template.
Figure 55 Servers Tab/Create Service Profiles From Template
The Create Service Profiles from Template window appears (Figure 56).
Figure 56 Create Service Profiles from Template Window
Association of the Service Profiles will take place automatically.
9. Click OK in the pop-up window to proceed.
The final Cisco UCS Manager window is shown in below in Figure 57.
Figure 57 Cisco UCS Manager Window
The following section provides detailed procedures for installing Red Hat Enterprise Linux 7.2 using Software RAID (OS based Mirroring) on Cisco UCS C240 M4 servers. There are multiple ways to install the Red Hat Linux operating system. The installation procedure described in this deployment guide uses KVM console and virtual media from Cisco UCS Manager.
Note: This requires RHEL 7.2 DVD/ISO for the installation.
To install the Red Hat Linux 7.2 operating system, complete the following steps:
1. Log in to the Cisco UCS 6296 Fabric Interconnect and launch the Cisco UCS Manager application.
2. Select the Equipment tab as shown in Figure 58.
3. In the navigation pane expand Rack-Mounts and then Servers.
4. Right click on the server and select KVM Console.
5. In the KVM window, select the Virtual Media tab.
6. Click the Activate Virtual Devices found in the Virtual Media tab. (Figure 59 below.)
7. In the KVM window (Figure 60), select the Virtual Media tab and click the Map CD/DVD.
8. Browse to the Red Hat Enterprise Linux Server 7.2 installer ISO image file.
Note: The Red Hat Enterprise Linux 7.2 DVD is assumed to be on the client machine.
9. Click Open to add the image to the list of virtual media.
10. In the KVM window, select the KVM tab to monitor during boot.
11. In the KVM window, select the Macros > Static Macros > Ctrl-Alt-Del button in the upper left corner.
12. Click OK.
13. Click OK to reboot the system.
14. On reboot, the machine detects the presence of the Red Hat Enterprise Linux Server 7.2 install media.
15. Select the Install or Upgrade an Existing System.
16. Skip the Media test and start the installation.
17. Select language of installation (Figure 61), and click Continue.
Figure 61 Select Language Window
18. Select Date and time as shown in Figure 62.
Figure 62 Date and Time Window
19. Select the location on the map, set the time and click Done.
Figure 63 Installation Summary Window
20. Click on Installation Destination, shown above in Figure 63.
Figure 64 Installation Summary Window
A Caution symbols appears next to Installation Destination as shown in Figure 64 above.
21. This opens the Installation Destination window displaying the boot disks. This is shown in Figure 65 below.
22. Make the selection, and choose I will configure partitioning. Click Done.
Figure 65 Installation Destination Window
This opens the new window for creating the partitions, as shown in Figure 66.
23. Click on the + sign to add a new partition as shown below, boot partition of size 2048 MB.
24. Click Add Mount Point to add the partition.
The screen refreshes to show the added Mount Point (Figure 67).
Figure 67 Manual Partitioning/Change Device Type
25. Change the Device type to RAID and make sure the RAID Level is RAID1 (Redundancy).
26. Click on Update Settings to save the changes.
27. Click on the + sign to create the swap partition of size 2048 MB as shown in Figure 68 below.
Figure 68 Manual Partitioning/Swap
28. Change the Device type to RAID and RAID level to RAID1 (Redundancy) and click on Update Settings.
Figure 69 Manual Partitioning/Swap
29. Click + to add the / partition. The size can be left empty so it uses the remaining capacity and click Add Mountpoint. (Figure 70).
Figure 70 Manual Partitioning/Add A New Mount Point
In the next window (Figure 71):
30. Change the Device type to RAID and RAID level to RAID1 (Redundancy). Click Update Settings.
Figure 71 / Partition/Change Device Type to RAID
31. Click Done to go back to the main screen and continue the Installation.
The Installation screen opens (Figure 72).
32. Click on Software Selection.
Figure 72 Installation Summary Window
The Software Selection screen opens (Figure 73).
33. Select Infrastructure Server and select the Add-Ons as noted below. Click Done.
The Installation Summary window returns (Figure 72).
34. Click on Network and Hostname.
Figure 74 Network and Host Name Window
Configure Hostname and Networking for the Host ().
35. Type in the hostname as shown below.
Figure 75 Network and Host Name
36. Click on Configure to open the Network Connectivity window (Figure 76).
37. Click on IPV4Settings.
Figure 76 Network Connectivity Window
38. Change the Method to Manual and click Add.
Figure 77 shows the Add Details pop up window.
39. Enter the IP Address, Netmask and Gateway details. Click Add after each addition.
Figure 77 Add IP Address, Netmask and Gateway Details
40. Click Save.
The Ethernet window will open (Figure 78).
41. Update the hostname and turn Ethernet ON. Click Done to return to the main menu.
The Installation Summary window opens (Figure 79).
42. Click Begin Installation in the main menu.
Figure 79 Installation Summary Window
A new window opens (Figure 80).
43. Select Root Password in the User Settings.
Figure 80 Select Root Password
44. On the next screen (Figure 81), enter the Root Password and click done.
Figure 81 Enter the Root Password
A progress window will open (Figure 82).
45. Once the installation is complete reboot the system.
46. Repeat steps 1 to 45 to install Red Hat Enterprise Linux 7.2 on Servers 2 through 64.
Note: The OS installation and configuration of the nodes that is mentioned above can be automated through PXE boot or third party tools.
The hostnames and their corresponding IP addresses are shown in Table 6.
Table 6 Hostnames and IP Addresses
Hostname |
eth0 |
rhel1 |
10.4.1.31 |
rhel2 |
10.4.1.32 |
rhel3 |
10.4.1.33 |
rhel4 |
10.4.1.34 |
rhel1 |
10.4.1.35 |
rhel6 |
10.4.1.36 |
rhel7 |
10.4.1.37 |
rhel8 |
10.4.1.38 |
rhel9 |
10.4.1.39 |
rhel10 |
10.4.1.40 |
rhel11 |
10.4.1.41 |
rhel12 |
10.4.1.42 |
rhel13 |
10.4.1.43 |
rhel14 |
10.4.1.44 |
rhel15 |
10.4.1.45 |
rhel16 |
10.4.1.46 |
… |
… |
rhel64 |
10.4.1.94 |
Note: Hortonworks Data Platform (HDP) does not recommend multi-homing configurations, so please assign only one network to each node.
Choose one of the nodes of the cluster or a separate node as the Admin Node for management such as HDP installation, cluster parallel shell, creating a local Red Hat repo and others. In this document, we use rhel1 for this purpose.
To manage all of the clusters nodes from the admin node password-less login needs to be setup. It assists in automating common tasks with clustershell (clush, a cluster wide parallel shell), and shell-scripts without having to use passwords.
Once Red Hat Linux is installed across all the nodes in the cluster, follow the steps below in order to enable password-less login across all the nodes.
1. Login to the Admin Node (rhel1).
#ssh 10.4.1.31
2. Run the ssh-keygen command to create both public and private keys on the admin node.
3. Then run the following command from the admin node to copy the public key id_rsa.pub to all the nodes of the cluster. ssh-copy-id appends the keys to the remote-host’s .ssh/authorized_keys.
#for IP in {31..94}; do echo -n "$IP -> "; ssh-copy-id -i ~/.ssh/id_rsa.pub 10.4.1.$IP; done
4. Enter yes for Are you sure you want to continue connecting (yes/no)?
5. Enter the password of the remote host.
Setup /etc/hosts on the Admin node; this is a pre-configuration to setup DNS as shown in the next section.
To create the host file on the admin node, complete the following steps:
1. Populate the host file with IP addresses and corresponding hostnames on the Admin node (rhel1) and other nodes as follows:
2. On Admin Node (rhel1)
#vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 \ localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 \ localhost6.localdomain6
10.4.1.31 rhel1
10.4.1.32 rhel2
10.4.1.33 rhel3
10.4.1.34 rhel4
10.4.1.35 rhel5
10.4.1.36 rhel6
10.4.1.37 rhel7
10.4.1.38 rhel8
10.4.1.39 rhel9
10.4.1.40 rhel10
10.4.1.41 rhel11
10.4.1.42 rhel12
10.4.1.43 rhel13
10.4.1.44 rhel14
10.4.1.45 rhel15
10.4.1.46 rhel16
...
10.4.1.94 rhel64
To create a repository using RHEL DVD or ISO on the admin node (in this deployment rhel1 is used for this purpose), create a directory with all the required RPMs, run the createrepo command and then publish the resulting repository.
1. Log on to rhel1. Create a directory that would contain the repository.
2. #mkdir -p /var/www/html/rhelrepo
3. Copy the contents of the Red Hat DVD to /var/www/html/rhelrepo
4. Alternatively, if you have access to a Red Hat ISO Image, Copy the ISO file to rhel1.
5. And login back to rhel1 and create the mount directory.
#scp rhel-server-7.2-x86_64-dvd.iso rhel1:/root/
#mkdir -p /mnt/rheliso
#mount -t iso9660 -o loop /root/rhel-server-7.2-x86_64-dvd.iso /mnt/rheliso/
6. Copy the contents of the ISO to the /var/www/html/rhelrepo directory.
#cp -r /mnt/rheliso/* /var/www/html/rhelrepo
7. Now on rhel1 create a .repo file to enable the use of the yum command.
#vi /var/www/html/rhelrepo/rheliso.repo
[rhel7.2]
name=Red Hat Enterprise Linux 7.2
baseurl=http://10.4.1.31/rhelrepo
gpgcheck=0
enabled=1
8. Now copy rheliso.repo file from /var/www/html/rhelrepo to /etc/yum.repos.d on rhel1.
#cp /var/www/html/rhelrepo/rheliso.repo /etc/yum.repos.d/
Note: Based on this repo file yum requires httpd to be running on rhel1 for other nodes to access the repository.
9. To make use of repository files on rhel1 without httpd, edit the baseurl of repo file /etc/yum.repos.d/rheliso.repo to point repository location in the file system.
Note: This step is needed to install software on Admin Node (rhel1) using the repo (such as httpd, create-repo, etc.)
#vi /etc/yum.repos.d/rheliso.repo
[rhel7.2]
name=Red Hat Enterprise Linux 7.2
baseurl=file:///var/www/html/rhelrepo
gpgcheck=0
enabled=1
1. Install the createrepo package on admin node (rhel1). Use it to regenerate the repository database(s) for the local copy of the RHEL DVD contents.
#yum -y install createrepo
2. Run createrepo on the RHEL repository to create the repo database on admin node
#cd /var/www/html/rhelrepo
#createrepo .
ClusterShell (or clush) is the cluster-wide shell that runs commands on several hosts in parallel.
1. From the system connected to the Internet download Cluster shell (clush) and install it on rhel1. Cluster shell is available from EPEL (Extra Packages for Enterprise Linux) repository.
#wget http://rpm.pbone.net/index.php3/stat/4/idpl/31529309/dir/redhat_el_7/com/clustershell-1.7-1.el7.noarch.rpm.html
#scp clustershell-1.7-1.el7.noarch.rpm rhel1:/root/
2. Login to rhel1 and install cluster shell.
#yum –y install clustershell-1.7-1.el7.noarch.rpm
3. Edit /etc/clustershell/groups.d/local.cfg file to include hostnames for all the nodes of the cluster. This set of hosts is taken when running clush with the ‘-a’ option.
4. For 64 node cluster as in our CVD, set groups file as follows,
#vi /etc/clustershell/groups.d/local.cfg
all: rhel[1-64]
Note: For more information and documentation on ClusterShell, visit https://github.com/cea-hpc/clustershell/wiki/UserAndProgrammingGuide.
Note: clustershell will not work if not ssh to the machine earlier (as it requires to be in known_hosts file), for instance, as in the case below for rhel<host>.
Setting up RHEL repo on the admin node requires httpd. To set up RHEL repository on the admin node, complete the following steps:
1. Install httpd on the admin node to host repositories.
The Red Hat repository is hosted using HTTP on the admin node, this machine is accessible by all the hosts in the cluster.
#yum –y install httpd
2. Add ServerName and make the necessary changes to the server configuration file.
#vi /etc/httpd/conf/httpd.conf
ServerName 10.4.1.31:80
3. Start httpd
#service httpd start
#chkconfig httpd on
Note: Based on this repo file yum requires httpd to be running on rhel1 for other nodes to access the repository.
1. Copy the rheliso.repo to all the nodes of the cluster.
#clush –w rhel[2-64] -c /var/www/html/rhelrepo/rheliso.repo --dest=/etc/yum.repos.d/
2. Also copy the /etc/hosts file to all nodes.
#clush –w rhel[2-64] –c /etc/hosts –-dest=/etc/hosts
3. Purge the yum caches after this
#clush -a -B yum clean all
#clush –a –B yum repolist
Note: While suggested configuration is to disable SELinux as shown below, if for any reason SELinux needs to be enabled on the cluster, then ensure to run the following to make sure that the httpd is able to read the Yum repofiles.
#chcon -R -t httpd_sys_content_t /var/www/html/
This section details setting up DNS using dnsmasq as an example based on the /etc/hosts configuration setup in the earlier section.
To create the host file across all the nodes in the cluster, complete the following steps:
1. Disable Network manager on all nodes:
#clush -a -b service NetworkManager stop
#clush -a -b chkconfig NetworkManager off
2. Update /etc/resolv.conf file to point to Admin Node:
#vi /etc/resolv.conf
nameserver 10.4.1.31
Note: This step is needed if setting up dnsmasq on Admin node. Otherwise this file should be updated with the correct nameserver.
Note: Alternatively #systemctl start NetworkManager.service can be used to start the service. #systemctl stop NetworkManager.service can be used to stop the service. Use #systemctl disable NetworkManager.service to stop a service from being automatically started at boot time.
3. Install and Start dnsmasq on Admin node:
#service dnsmasq start
#chkconfig dnsmasq on
4. Deploy /etc/resolv.conf from the admin node (rhel1) to all the nodes via the following clush command:
#clush -a -B -c /etc/resolv.conf
Note: A clush copy without –dest copies to the same directory location as the source-file directory
5. Ensure DNS is working fine by running the following command on Admin node and any data-node:
[root@rhel2 ~]# nslookup rhel1
Server: 10.4.1.31
Address: 10.4.1.31#53
Name: rhel1
Address: 10.4.1.31 ç
Note: yum install –y bind-utils will need to be run for nslookup to utility to run.
The latest Cisco Network driver is required for performance and updates. The latest drivers can be downloaded from the link below:
https://software.cisco.com/download/release.html?mdfid=286281356&reltype=latest&relind=AVAILABLE&dwnld=true&softwareid=283853158&rellifecycle=&atcFlag=N&release=2.0%289b%29&dwldImageGuid=84C2FF3BB579A1BF32F7227C59F6DF886CEDBE99&flowid=71443
1. In the ISO image, the required driver kmod-enic-2.3.0.20-rhel7u2.el7.x86_64.rpm can be located at \Linux\Network\Cisco\VIC\RHEL\RHEL7.2.
2. From a node connected to the Internet, download, extract and transfer kmod-enic-2.3.0.20-rhel7u2.el7.x86_64.rpm to rhel1 (admin node).
3. Install the rpm on all nodes of the cluster using the following clush commands. For this example the rpm is assumed to be in present working directory of rhel1.
[root@rhel1 ~]# clush -a -b -c kmod-enic-2.3.0.20-rhel7u2.el7.x86_64.rpm
[root@rhel1 ~]# clush -a -b "rpm –ivh kmod-enic-2.3.0.20-rhel7u2.el7.x86_64.rpm"
4. Ensure that the above installed version of kmod-enic driver is being used on all nodes by running the command "modinfo enic" on all nodes:
[root@rhel1 ~]# clush -a -B "modinfo enic | head -5"
5. Also it is recommended to download the kmod-megaraid driver for higher performance , the RPM can be found in the same package at \Linux\Storage\LSI\Cisco_Storage_12G_SAS_RAID_controller\RHEL\RHEL7.2
From the admin node rhel1 run the command below to Install xfsprogs on all the nodes for xfs filesystem.
#clush -a -B yum -y install xfsprogs
HDP 2.4 requires JAVA 8.
1. Download jdk-7u75-linux-x64.rpm from oracle.com and scp the rpm to (http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html ) to admin node (rhel1).
2. Run the following commands on admin node (rhel1) to install and setup java on all nodes.
3. Copy JDK rpm to all nodes.
clush -a -b -c /root/jdk-8u91-linux-x64.rpm --dest=/root/
4. Extract and Install JDK on all nodes.
clush -a -b rpm -ivh /root/jdk-8u91-linux-x64.rpm
5. Create the following files java-set-alternatives.sh and java-home.sh on admin node (rhel1).
vi java-set-alternatives.sh
#!/bin/bash
for item in java javac javaws jar jps javah javap jcontrol jconsole jdb; do
rm -f /var/lib/alternatives/$item
alternatives --install /usr/bin/$item $item /usr/java/jdk1.8.0_91/bin/$item 9
alternatives --set $item /usr/java/jdk1.8.0_91/bin/$item
done
vi java-home.sh
export JAVA_HOME=/usr/java/jdk1.8.0_91
6. Make the two java scripts created above executable.
chmod 755 ./java-set-alternatives.sh ./java-home.sh
7. Copying java-set-alternatives.sh to all nodes.
clush -b -a -c ./java-set-alternatives.sh --dest=/root/
8. Setup Java Alternatives.
clush -b -a ./java-set-alternatives.sh
9. Ensure correct java is setup on all nodes (should point to newly installed java path).
clush -b -a "alternatives --display java | head -2"
10. Setup JAVA_HOME on all nodes.
clush -b -a -c ./java-home.sh --dest=/etc/profile.d
11. Display JAVA_HOME on all nodes.
clush -a -b "echo \$JAVA_HOME"
12. Display current java –version.
clush -B -a java -version
The Network Time Protocol (NTP) is used to synchronize the time of all the nodes within the cluster. The Network Time Protocol daemon (ntpd) sets and maintains the system time of day in synchronism with the timeserver located in the admin node (rhel1). Configuring NTP is critical for any Hadoop Cluster. If server clocks in the cluster drift out of sync, serious problems will occur with HBase and other services.
#clush –a –b "yum –y install ntp"
Note: Installing an internal NTP server keeps your cluster synchronized even when an outside NTP server is inaccessible. |
1. Configure /etc/ntp.conf on the admin node only with the following contents:
#vi /etc/ntp.conf
driftfile /var/lib/ntp/drift
restrict 127.0.0.1
restrict -6 ::1
server 127.127.1.0
fudge 127.127.1.0 stratum 10
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
2. Create /root/ntp.conf on the admin node and copy it to all nodes:
#vi /root/ntp.conf
server 10.4.1.31
driftfile /var/lib/ntp/drift
restrict 127.0.0.1
restrict -6 ::1
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
3. Copy ntp.conf file from the admin node to /etc of all the nodes by executing the following command in the admin node (rhel1):
#for SERVER in {32..94}; do scp /root/ntp.conf 10.4.1.$SERVER:/etc/ntp.conf; done
Note: Instead of the above for loop, this could be run as a clush command with "–w"option.
#clush -w rhel[2-94] –b –c /root/ntp.conf --dest=/etc
4. Run the following to synchronize the time and restart NTP daemon on all nodes.
#clush -a -b "service ntpd stop"
#clush -a -b "ntpdate rhel1"
#clush -a -b "service ntpd start"
5. Ensure restart of NTP daemon across reboots:
#clush –a –b "systemctl enable ntpd"
Alternatively, the new Chrony service can be installed, that is quicker to synchronize clocks in mobile and virtual systems.
1. Install the Chrony service:
# yum install -y chrony
2. Activate the Chrony service at boot:
# systemctl enable chronyd
3. Start the Chrony service:
# systemctl start chronyd
The Chrony configuration is in the /etc/chrony.conf file, configured similar to /etc/ntp.conf.
Syslog must be enabled on each node to preserve logs regarding killed processes or failed jobs. Modern versions such as syslog-ng and rsyslog are possible, making it more difficult to be sure that a syslog daemon is present. One of the following commands should suffice to confirm that the service is properly configured:
#clush -B -a rsyslogd –v
#clush -B -a service rsyslog status
On each node, ulimit -n specifies the number of inodes that can be opened simultaneously. With the default value of 1024, the system appears to be out of disk space and shows no inodes available. This value should be set to 64000 on every node.
Higher values are unlikely to result in an appreciable performance gain.
1. For setting the ulimit on Redhat, edit /etc/security/limits.conf on admin node rhel1 and add the following lines:
root soft nofile 64000
root hard nofile 64000
2. Copy the /etc/security/limits.conf file from admin node (rhel1) to all the nodes using the following command.
#clush -a -b -c /etc/security/limits.conf --dest=/etc/security/
3. Check that the /etc/pam.d/su file contains the following settings:
#%PAM-1.0
auth sufficient pam_rootOK.so
# Uncomment the following line to implicitly trust users in the "wheel" group.
#auth sufficient pam_wheel.so trust use_uid
# Uncomment the following line to require a user to be in the "wheel" group.
#auth required pam_wheel.so use_uid
auth include system-auth
account sufficient pam_succeed_if.so uid = 0 use_uid quiet
account include system-auth
password include system-auth
session include system-auth
session optional pam_xauth.so
Note: The ulimit values are applied on a new shell, running the command on a node on an earlier instance of a shell will show old values.
SELinux must be disabled during the install procedure and cluster setup. SELinux can be enabled after installation and while the cluster is running.
1. SELinux can be disabled by editing /etc/selinux/config and changing the SELINUX line to SELINUX=disabled. The following command will disable SELINUX on all nodes.
#clush -a -b "sed –i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config"
#clush –a –b "setenforce 0"
Note: The above command may fail if SELinux is already disabled.
2. Reboot the machine, if needed for SELinux to be disabled incase it does not take effect. It can checked using:
#clush –a –b sestatus
Adjusting the tcp_retries parameter for the system network enables faster detection of failed nodes. Given the advanced networking features of UCS this is a safe and recommended change (failures observed at the operating system layer are most likely serious rather than transitory). On each node, set the number of TCP retries to 5 can help detect unreachable nodes with less latency.
1. Edit the file /etc/sysctl.conf and on admin node rhel1 and add the following lines:
net.ipv4.tcp_retries2=5
2. Copy the /etc/sysctl.conf file from admin node (rhel1) to all the nodes using the following command:
#clush -a -b -c /etc/sysctl.conf --dest=/etc/
3. Load the settings from default sysctl file /etc/sysctl.conf by running.
#clush -B -a sysctl -p
The default Linux firewall settings are far too restrictive for any Hadoop deployment. Since the UCS Big Data deployment will be in its own isolated network there is no need for that additional firewall.
#clush -a -b " firewall-cmd --zone=public --add-port=80/tcp --permanent"
#clush -a -b "firewall-cmd --reload"
#clush –a –b “systemctl disable firewalld”
1. In order to reduce Swapping, run the following on all nodes. Variable vm.swappiness defines how often swap should be used, 60 is default.
#clush -a -b " echo 'vm.swappiness=1' >> /etc/sysctl.conf"
2. Load the settings from default sysctl file /etc/sysctl.conf.
#clush –a –b "sysctl –p"
Disabling Transparent Huge Pages (THP) reduces elevated CPU usage caused by THP.
#clush -a -b "echo never > /sys/kernel/mm/transparent_hugepage/enabled”
#clush -a -b "echo never > /sys/kernel/mm/transparent_hugepage/defrag"
1. The above commands must be run for every reboot, so copy this command to /etc/rc.local so they are executed automatically for every reboot.
2. On the Admin node, run the following commands
#rm –f /root/thp_disable
#echo "echo never > /sys/kernel/mm/transparent_hugepage/enabled" >>
/root/thp_disable
#echo "echo never > /sys/kernel/mm/transparent_hugepage/defrag " >>
/root/thp_disable
3. Copy file to each node:
#clush –a –b –c /root/thp_disable
4. Append the content of file thp_disable to /etc/rc.local:
#clush -a -b “cat /root/thp_disable >> /etc/rc.local”
1. Disable IPv6 as the addresses used are IPv4.
#clush -a -b "echo 'net.ipv6.conf.all.disable_ipv6 = 1' >> /etc/sysctl.conf"
#clush -a -b "echo 'net.ipv6.conf.default.disable_ipv6 = 1' >> /etc/sysctl.conf"
#clush -a -b "echo 'net.ipv6.conf.lo.disable_ipv6 = 1' >> /etc/sysctl.conf"
2. Load the settings from default sysctl file /etc/sysctl.conf.
#clush –a –b "sysctl –p"
This section describes steps to configure non-OS disk drives as RAID1 using StorCli command as described below. All the drives are going to be part of a single RAID1 volume. This volume can be used for staging any client data to be loaded to HDFS. This volume won’t be used for HDFS data.
1. From the website download storcli http://www.lsi.com/downloads/Public/RAID%20Controllers/RAID%20Controllers%20Common%20Files/1.14.12_StorCLI.zip
2. Extract the zip file and copy storcli-1.14.12-1.noarch.rpm from the linux directory.
3. Download storcli and its dependencies and transfer to Admin node.
#scp storcli-1.14.12-1.noarch.rpm rhel1:/root/
4. Copy storcli rpm to all the nodes using the following commands:
#clush -a -b -c /root/storcli-1.14.12-1.noarch.rpm --dest=/root/
5. Run the below command to install storcli on all the nodes:
#clush -a -b “rpm -ivh storcli-1.14.12-1.noarch.rpm”
6. Run the below command to copy storcli64 to root directory.
#cd /opt/MegaRAID/storcli/
#cp storcli64 /root/
7. Copy storcli64 to all the nodes using the following commands:
#clush -a -b -c /root/storcli64 --dest=/root/
8. Run the following script as root user on rhel1 to rhel3 to create the virtual drives for the management nodes.
#vi /root/raid1.sh
./storcli64 -cfgldadd r1[$1:1,$1:2,$1:3,$1:4,$1:5,$1:6,$1:7,$1:8,$1:9,$1:10,$1:11,$1:12,$1:13,$1:14,$1:15,$1:16,$1:17,$1:18,$1:19,$1:20,$1:21,$1:22,$1:23,$1:24] wb ra nocachedbadbbu strpsz1024 -a0
The script above requires an enclosure ID as a parameter.
9. Run the following command to get enclosure id.
#./storcli64 pdlist -a0 | grep Enc | grep -v 252 | awk '{print $4}' | sort | uniq -c | awk '{print $2}'
#chmod 755 raid1.sh
10. Run MegaCli script as follows:
#./raid1.sh <EnclosureID> obtained by running the command above
WB: Write back
RA: Read Ahead
NoCachedBadBBU: Do not write cache when the BBU is bad.
Strpsz1024: Strip Size of 1024K
Note: The command above will not override any existing configuration. To clear and reconfigure existing configurations refer to Embedded MegaRAID Software Users Guide available at www.lsi.com.
This section describes steps to configure non-OS disk drives as individual RAID0 volumes using StorCli command as described below. These volumes are going to be used for HDFS Data.
1. Issue the following command from the admin node to create the virtual drives with individual RAID 0 configurations on all the data nodes.
#clush –w rhel[3-64] -B ./storcli64 -cfgeachdskraid0 WB RA direct NoCachedBadBBU strpsz1024 -a0
WB: Write back
RA: Read Ahead
NoCachedBadBBU: Do not write cache when the BBU is bad.
Strpsz1024: Strip Size of 1024K
Note: The command above will not override existing configurations. To clear and reconfigure existing configurations refer to Embedded MegaRAID Software Users Guide available at www.lsi.com.
The following script will format and mount the available volumes on each node whether it is Namenode or Data node. OS boot partition is going to be skipped. All drives are going to be mounted based on their UUID as /data/disk1, /data/disk2, and so on.
1. On the Admin node, create a file containing the following script.
2. To create partition tables and file systems on the local disks supplied to each of the nodes, run the following script as the root user on each node.
Note: The script assumes there are no partitions already existing on the data volumes. If there are partitions, delete them before running the script. This process is documented in the "Note" section at the end of the section.
#vi /root/driveconf.sh
#!/bin/bash
#disks_count=`lsblk -id | grep sd | wc -l`
#if [ $disks_count -eq 24 ]; then
# echo "Found 24 disks"
#else
# echo "Found $disks_count disks. Expecting 24. Exiting.."
# exit 1
#fi
[[ "-x" == "${1}" ]] && set -x && set -v && shift 1
count=1
for X in /sys/class/scsi_host/host?/scan
do
echo '- - -' > ${X}
done
for X in /dev/sd?
do
echo "========"
echo $X
echo "========"
if [[ -b ${X} && `/sbin/parted -s ${X} print quit|/bin/grep -c boot` -ne 0
]]
then
echo "$X bootable - skipping."
continue
else
Y=${X##*/}1
echo "Formatting and Mounting Drive => ${X}"
/sbin/mkfs.xfs -f ${X}
(( $? )) && continue
#Identify UUID
UUID=`blkid ${X} | cut -d " " -f2 | cut -d "=" -f2 | sed 's/"//g'`
/bin/mkdir -p /data/disk${count}
(( $? )) && continue
echo "UUID of ${X} = ${UUID}, mounting ${X} using UUID on /data/disk${count}"
/bin/mount -t xfs -o inode64,noatime,nobarrier -U ${UUID} /data/disk${count}
(( $? )) && continue
echo "UUID=${UUID} /data/disk${count} xfs inode64,noatime,nobarrier 0 0" >> /etc/fstab
((count++))
fi
done
3. Run the following command to copy driveconf.sh to all the nodes:
#chmod 755 /root/driveconf.sh
#clush –a -B –c /root/driveconf.sh
4. Run the following command from the admin node to run the script across all data nodes:
#clush –a –B /root/driveconf.sh
5. Run the following from the admin node to list the partitions and mount points:
#clush –a -B df –h
#clush –a -B mount
#clush –a -B cat /etc/fstab
Note: In-case there is a need to delete any partitions, it can be done so using the following.
6. Run the mount command (‘mount’) to identify which drive is mounted to which device /dev/sd<?>.
7. umount the drive for which partition is to be deleted and run fdisk to delete as shown below.
Note: Care should be taken not to delete the OS partition as this will wipe out the OS.
#mount
#umount /data/disk1 ç (disk1 shown as example)
#(echo d; echo w;) | sudo fdisk /dev/sd<?>
This section describes the steps to create the script cluster_verification.sh that helps to verify the CPU, memory, NIC, and storage adapter settings across the cluster on all nodes. This script also checks additional prerequisites such as NTP status, SELinux status, ulimit settings, JAVA_HOME settings and JDK version, IP address and hostname resolution, Linux version and firewall settings.
1. Create the script cluster_verification.sh as shown, on the Admin node (rhel1).
#vi cluster_verification.sh
#!/bin/bash
shopt -s expand_aliases,
# Setting Color codes
green='\e[0;32m'
red='\e[0;31m'
NC='\e[0m' # No Color
echo -e "${green} === Cisco UCS Integrated Infrastructure for Big Data and Analytics \ Cluster Verification === ${NC}"
echo ""
echo ""
echo -e "${green} ==== System Information ==== ${NC}"
echo ""
echo ""
echo -e "${green}System ${NC}"
clush -a -B " `which dmidecode` |grep -A2 '^System Information'"
echo ""
echo ""
echo -e "${green}BIOS ${NC}"
clush -a -B " `which dmidecode` | grep -A3 '^BIOS I'"
echo ""
echo ""
echo -e "${green}Memory ${NC}"
clush -a -B "cat /proc/meminfo | grep -i ^memt | uniq"
echo ""
echo ""
echo -e "${green}Number of Dimms ${NC}"
clush -a -B "echo -n 'DIMM slots: '; `which dmidecode` |grep -c \ '^[[:space:]]*Locator:'"
clush -a -B "echo -n 'DIMM count is: '; `which dmidecode` | grep \ "Size"| grep -c "MB""
clush -a -B " `which dmidecode` | awk '/Memory Device$/,/^$/ {print}' |\ grep -e '^Mem' -e Size: -e Speed: -e Part | sort -u | grep -v -e 'NO \ DIMM' -e 'No Module Installed' -e Unknown"
echo ""
echo ""
# probe for cpu info #
echo -e "${green}CPU ${NC}"
clush -a -B "grep '^model name' /proc/cpuinfo | sort -u"
echo ""
clush -a -B "`which lscpu` | grep -v -e op-mode -e ^Vendor -e family -e\ Model: -e Stepping: -e BogoMIPS -e Virtual -e ^Byte -e '^NUMA node(s)'"
echo ""
echo ""
# probe for nic info #
echo -e "${green}NIC ${NC}"
clush -a -B "`which ifconfig` | egrep '(^e|^p)' | awk '{print \$1}' | \ xargs -l `which ethtool` | grep -e ^Settings -e Speed"
echo ""
clush -a -B "`which lspci` | grep -i ether"
echo ""
echo ""
# probe for disk info #
echo -e "${green}Storage ${NC}"
clush -a -B "echo 'Storage Controller: '; `which lspci` | grep -i -e \ raid -e storage -e lsi"
echo ""
clush -a -B "dmesg | grep -i raid | grep -i scsi"
echo ""
clush -a -B "lsblk -id | awk '{print \$1,\$4}'|sort | nl"
echo ""
echo ""
echo -e "${green} ================ Software ======================= ${NC}"
echo ""
echo ""
echo -e "${green}Linux Release ${NC}"
clush -a -B "cat /etc/*release | uniq"
echo ""
echo ""
echo -e "${green}Linux Version ${NC}"
clush -a -B "uname -srvm | fmt"
echo ""
echo ""
echo -e "${green}Date ${NC}"
clush -a -B date
echo ""
echo ""
echo -e "${green}NTP Status ${NC}"
clush -a -B "ntpstat 2>&1 | head -1"
echo ""
echo ""
echo -e "${green}SELINUX ${NC}"
clush -a -B "echo -n 'SElinux status: '; grep ^SELINUX= \ /etc/selinux/config 2>&1"
echo ""
echo ""
clush -a -B "echo -n 'CPUspeed Service: '; `which service` cpuspeed \ status 2>&1"
clush -a -B "echo -n 'CPUspeed Service: '; `which chkconfig` --list \ cpuspeed 2>&1"
echo ""
echo ""
echo -e "${green}Java Version${NC}"
clush -a -B 'java -version 2>&1; echo JAVA_HOME is ${JAVA_HOME:-Not \ Defined!}'
echo ""
echo ""
echo -e "${green}Hostname LoOKup${NC}"
clush -a -B " ip addr show"
echo ""
echo ""
echo -e "${green}Open File Limit${NC}"
clush -a -B 'echo -n "Open file limit(should be >32K): "; ulimit -n'
2. Change Permissions to executable.
chmod 755 cluster_verification.sh
3. Run the Cluster Verification tool from the admin node. This can be run before starting Hadoop to identify any discrepancies in Post OS Configuration between the servers or during troubleshooting of any cluster / Hadoop issues.
#./cluster_verification.sh
HDP is an enterprise grade, hardened Hadoop distribution. HDP combines Apache Hadoop and its related projects into a single tested and certified package. HPD 2.4 components are depicted in the figure below. The following section goes in to detail on how to install HDP 2.4 on the cluster.
This section details the pre-requisites for HDP Installation such as setting up of HDP Repositories.
1. From a host connected to the Internet, download the Hortonworks repositories as shown below and transfer it to the admin node.
mkdir -p /tmp/Hortonworks
cd /tmp/Hortonworks/
2. Download the Hortonworks HDP Repo:
wget http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.4.2.0/HDP-2.4.2.0-centos7-rpm.tar.gz
3. Download Hortonworks HDP-Utils Repo:
wget http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos7/HDP-UTILS-1.1.0.20-centos7.tar.gz
4. Download Ambari Repo:
wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.2.2.0/ambari-2.2.2.0-centos7.tar.gz
5. Copy the repository directory to the admin node:
scp -r /tmp/Hortonworks/ rhel1:/var/www/html
6. Extract the files:
login to rhel1
cd /var/www/html/Hortonworks
tar -zxvf HDP-2.4.2.0-centos7-rpm.tar.gz
tar -zxvf HDP-UTILS-1.1.0.20-centos7.tar.gz
tar -zxvf ambari-2.2.2.0-centos7.tar.gz
7. Create the hdp.repo file with following contents:
vi /etc/yum.repos.d/hdp.repo
[HDP-2.4.2.0]
name= Hortonworks Data Platform Version - HDP-2.4.2.0
baseurl= http://rhel1/Hortonworks/HDP/centos7/2.x/updates/2.4.2.0
gpgcheck=0
enabled=1
priority=1
[HDP-UTILS-1.1.0.20]
name=Hortonworks Data Platform Utils Version - HDP-UTILS-1.1.0.20
baseurl= http://rhel1/Hortonworks/HDP-UTILS-1.1.0.20/repos/centos7
gpgcheck=0
enabled=1
priority=1
8. Create the Ambari repo file with following contents:
vi /etc/yum.repos.d/ambari.repo
[Updates-ambari-2.2.2.0]
name=ambari-2.2.2.0 - Updates
baseurl=http://rhel1/Hortonworks/AMBARI-2.2.2.0/centos7/2.2.2.0-460
gpgcheck=0
enabled=1
priority=1
9. From the admin node copy the repo files to /etc/yum.repos.d/ of all the nodes of the cluster.
clush -a -b -c /etc/yum.repos.d/hdp.repo --dest=/etc/yum.repos.d/
clush -a -b -c /etc/yum.repos.d/ambari.repo --dest=/etc/yum.repos.d/
Downgrade snappy on all data nodes by running this command from admin node.
clush -a -b yum -y downgrade snappy
To install HDP, complete the following the steps:
yum -y install ambari-server
The Postgresql database will be used by Ambari, Hive and Oozie services.
In this installation rhel2 will host the Hive and Oozie services and Ambari will be on rhel1.
1. Login to rhel2 and perform the following steps.
yum -y install postgresql-*
postgresql-setup initdb
/bin/systemctl start postgresql.service Or Service postgresql start
systemctl enable postgresql
chkconfig postgresql on
service postgresql status
2. Update these files on rhel2 in the location chosen to install the databases for Hive, Oozie and Ambari, using the host ip addresses.
vi /var/lib/pgsql/data/pg_hba.conf
vi /var/lib/pgsql/data/postgresql.conf
search for listen_address and replace with (*)
sudo -u postgres psql
http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_ambari_reference_guide/content/_using_ambari_with_postgresql.html
To set up PostgreSQL to be used with Ambari, complete the following steps:
1. Create a user for Ambari and grant it permissions.
2. Using the PostgreSQL database admin utility:
3. Create the Ambari database.
4. Create an Ambari user with the password "Cisco_123".
5. Grant all privileges on the Ambari database to the Ambari user
\connect ambari
6. Create the Ambari schema authorization to Ambari user.
7. Change the Ambari schema owner to Ambari user.
8. Alter the Ambari role, set search_path to 'ambari', 'public'.
9. Log out using \q and log back in using this command:
[root@rhel2 ~]# service postgresql restart
[root@rhel2 ~]# psql -U ambari -d ambari
10. Please enter the password.
11. Load the Ambari Server database schema.
Pre-load the Ambari database schema into your PostgreSQL database using the schema script.
12. Find the Ambari-DDL-Postgres-CREATE.sql file in the /var/lib/ambari-server/resources/ directory of the Ambari Server host after you have installed Ambari Server.
13. Copy /var/lib/ambari-server/resources/ from rhel1 to rhel2:/tmp/.
Cd /tmp
# psql -U <AMBARIUSER> -d <AMBARIDATABASE>
\connect <AMBARIDATABASE>;
\i Ambari-DDL-Postgres-CREATE.sql;
14. Check the table is created by running \dt command.
15. Log out with \q.
16. Find the Ambari-DDL-Postgres-CREATE.sql file in the /var/lib/ambari-server/resources/ directory of the Ambari Server host after installing the Ambari Server.
[root@rhel2 ~]# service postgresql restart
For Hive
sudo -u postgres psql
1. Create the hive database.
2. Create a user hive with password "Cisco_123".
3. Grant all privileges on the hive database to the hive user.
Oozie
sudo -u postgres psql
1. Create the database oozie.
2. Create the user oozie with password "Cisco_123".
3. Grant all privileges on database oozie to oozie.
4. Connect to the admin node (rhel1) and run the commands described below.
yum -y install postgresql-jdbc*
ambari-server setup --jdbc-db=postgres --jdbc-driver=/usr/share/java/postgresql-jdbc.jar
Ambary-server setup -j $JAVA_HOME
Note: Enter the advanced database configuration option – Yes and choose option 4 for the existing PostgreSQL Database selection.
ambari-server start
ps –aef | grep ambari-server
Once the Ambari service has been started, access the Ambari Install Wizard through the browser.
1. Point the browser to http://<ip address for rhel1>:8080
The Ambari Login screen will open (Figure 83).
2. Log in to the Ambari Server using the default username/password: admin/admin. This can be changed at a later period of time.
Once logged in, the “Welcome to Apache Ambari” window appears (Figure 84).
To create a cluster, complete the following steps:
1. Click the Create a Cluster button to launch the install wizard as shown in Figure 84 above.
2. On the Get started page (Figure 85) type “Cisco_HDP” as the name for the cluster.
3. Click Next.
1. In the next screen (Figure 86), select the HDP 2.4 stack.
2. Expand “Advanced Repository Options”.
3. Under the advanced repository option:
4. Select the RedHat 7 checkbox.
5. Uncheck the rest of the checkboxes.
6. Update the Redhat 7 HDP-2.4 URL to http://rhel1/Hortonworks/HDP/centos7/2.x/updates/2.4.2.0.
7. Update the Redhat 7 HDP-UTILS-1.1.0.20 URL to http://rhel1/Hortonworks/HDP-UTILS-1.1.0.20/repos/centos7.
Note: Make sure there are no trailing spaces after the URLs.
To build up the cluster, the install wizard needs to know general information about how the cluster is to be set up. This requires providing the Fully Qualified Domain Name (FQDN) of each of the hosts. The wizard also needs to access the private key file that was created in Set Up Password-less SSH. It uses these to locate all the hosts in the system and to access and interact with them securely.
Figure 87 below shows the install wizard window.
1. Use the Target Hosts text box to enter the list of host names, one per line. Ranges inside brackets can also be used to indicate larger sets of hosts.
2. Select the option Provide your SSH Private Key in the Ambari cluster install wizard.
3. Copy the contents of the file /root/.ssh/id_rsa on rhel1 and paste it in the text area provided by the Ambari cluster install wizard.
Note: Make sure there is no extra white space after the text-----END RSA PRIVATE KEY-----
4. Click the Register and Confirm button to continue.
5. Click OK on the Host Name Pattern Expressions popup.
Figure 88 Host Name Pattern Expressions
Figure 89 shows the Confirm Hosts screen This helps ensure that Ambari has located the correct hosts for the cluster and checks those hosts to make sure they have the correct directories, packages, and processes to continue the install.
1. If any host was selected in error, it can be removed by selecting the appropriate checkboxes and clicking the grey Remove Selected button.
2. To remove a single host, click the small white Remove button in the Action column.
3. When the lists of hosts are confirmed, click Next.
HDP is made up of a number of components. See Understand the Basics for more information. The services are listed in Figure 90 below.
1. Select all to preselect all items.
2. When you have made your selections, click Next.
The Ambari install wizard attempts to assign the master nodes for various services that have been selected to appropriate hosts in the cluster, as shown in Figure 91. The right column shows the current service assignments by host, with the hostname and its number of CPU cores and amount of RAM indicated.
1. Reconfigure the service assignments to match Table 7 shown below.
Table 7 Reconfigure the service assignments
Service Name |
Host |
NameNode |
rhel1 |
SNameNode |
rhel2 |
History Server |
rhel2 |
App Timeline Server |
rhel2 |
Resource Manager |
rhel2 |
Hive Metastore |
rhel2 |
WebHCat Server |
rhel2 |
HiveServer2 |
rhel2 |
HBase Master |
rhel2 |
Oozie Server |
rhel1 |
Zookeeper |
rhel1, rhel2, rhel3 |
Falcon Server |
rhel2 |
DRPC Server |
rhel2 |
Nimbus |
rhel2 |
Storm UI Server |
rhel2 |
|
|
Spark History Server |
rhel2 |
Accumulo Master |
rhel2 |
Accumulo Monitor |
rhel1 |
Accumulo Tracer |
rhel1 |
SmartSense HST Server |
rhel1 |
Grafana |
rhel1 |
Kafka Broker |
rhel1 |
Accumulo GC |
rhel1 |
Atlas Metadata Server |
rhel2 |
Knox Gateway |
rhel1 |
Metrics Collector |
rhel1 |
Note: On a small cluster (<16 nodes), consolidate all master services to run on a single node. For large clusters (> 64 nodes), deploy master services across 3 nodes.
2. Click Next.
The Ambari install wizard attempts to assign the slave components (DataNodes, NFSGateway, NodeManager, RegionServers, Phoenix Query Server, Supervisor, Flume, Accumulo TServer, Spark Thrift Server and Client) to appropriate hosts in the cluster as shown in Figure 92.
3. Reconfigure the service assignment to match the values shown in Table 8 below:
4. Assign DataNode, NodeManager, RegionServer, Supervisor and Flume on nodes rhel3- rhel64.
5. Assign Client to all nodes.
6. Click the Next button.
Table 8 Services and Hostnames
Client Service Name |
Host |
DataNode |
rhel3-rhel64 |
NFSGateway |
rhel1 |
NodeManager |
rhel3-rhel64 |
RegionServer |
rhel3-rhel64 |
Phoenix Query Server |
rhel1 |
Supervisor |
rhel3-rhel64 |
Flume |
rhel3-rhel64 |
Accumulo TServer |
rhel3-rhel64 |
Spark Thrift Server |
rhel1 |
Client |
All nodes, rhel1-rhel64 |
Figure 92 Assign Slaves and Clients
This section as shown in Figure 93 shows the tabs that manage configuration settings for Hadoop components. The wizard attempts to set reasonable defaults for each of the options here, but this can be modified to meet specific requirements. The following sections provide configuration guidance that should be refined to meet specific use case requirements.
The following changes are to be made:
· Memory and service level settings for each component and service level tuning.
· Customize the log locations of all the components to ensure growing logs do not cause the SSDs to run out of space.
1. In Ambari, choose the HDFS Service tab and use the “Search” box on top to filter for the properties mentioned in Table 9 and update their values.
1. Update the following HDFS configurations in Ambari.
Table 9 HDFS Configurations in Ambari
Property Name |
Value |
NameNode Java Heap Size |
4096 |
Hadoop maximum Java heap size |
4096 |
DataNode maximum Java heap size |
4096 |
Datanode Volumes Failure Toleration |
3 |
1. Change the default log location by finding the Log Dir property and modifying the /var prefix to /data/disk1 (Figure 93).
Figure 94 shows the MapReduce2 Tab.
1. In Ambari, choose the MapReduce Service tab and use the “Search” box on top to filter for the properties mentioned in Table 10 and update their values.
2. Update the following MapReduce configurations.
Table 10 MapReduce Configurations
Property Name |
Value |
Default virtual memory for a job's map-task |
4096 |
Default virtual memory for a job's reduce-task |
8192 |
Map-side sort buffer memory |
1638 |
yarn.app.mapreduce.am.resource.mb |
4096 |
mapreduce.map.java.opts |
-Xmx3276m |
mapreduce.reduce.java.opts |
-Xmx6552m |
yarn.app.mapreduce.am.command-opts |
-Xmx6552m |
3. Under the MapReduce2 tab, change the default log location by finding the Log Dir property and modifying the /var prefix to /data/disk1.
Figure 95 MapReduce2 Tab
1. In Ambari, choose the YARN Service from the tab as shown in Figure 96, and use the “Search” box on top to filter for the properties mentioned in Table 11 below to update their values.
2. Update the following YARN configurations.
Table 11 YARN Configuration Values
Property Name |
Value |
ResourceManager Java heap size |
4096 |
NodeManager Java heap size |
2048 |
yarn.nodemanager.resource.memory-mb |
184320 |
YARN Java heap size |
4096 |
yarn.scheduler.minimum-allocation-mb |
4096 |
yarn.scheduler.maximum-allocation-mb |
184320 |
3. Under YARN tab, change the default log location by finding the Log Dir property and modifying the /var prefix to /data/disk1.
No changes are required.
Figure 97 Tez Tab
Choose Hive Service from the tab, as shown in Figure 98. Select the advanced tab and make the changes below:
1. Select Existing PostgreSQL Database.
2. Enter Database Host to 10.4.1.32.
3. Database Name hive.
4. Database Username hive.
5. Enter the Hive database password as per organizational policy.
6. Database password Cisco_123.
7. Please test connection.
8. Change the default log location by finding the Log Dir property and modifying the /var prefix to /data/disk1.
9. Change the WebHCat log directory by finding the Log Dir property and modifying the /var prefix to /data/disk1.
In Ambari, choose HBASE Service from the tab (Figure 99) and use the “Search” box on top to filter for the properties mentioned in Table 12 to update their values.
1. Update the following HBASE configurations:
Property Name |
Value |
HBase Master Maximum Java Heap Size |
4096 |
HBase RegionServers Maximum Java Heap Size |
16384 |
Note: If you are not running HBase, keep the default value of 1024 for Java Heap size for HBase RegionServers and HBase Master
2. Under the HBase tab, change the default log location by finding the Log Dir property and modifying the /var prefix to /data/disk1.
No changes are required.
Figure 100 Pig Services
No changes are required.
Figure 101 Scoop Services
Similarly, under the Oozie tab, (Figure 102), change the default log location by finding the Log Dir property and modifying the /var prefix to /data/disk1.
1. Select Existing PostgreSQL Database.
2. Enter Database Host to 10.4.1.32.
3. Database Name oozie.
4. Database Username oozie. Enter the oozie database password as per organizational policy.
5. Database password is Cisco_123.
6. Please test the connection.
Figure 102 Oozie Tab
Under the Zookeeper tab, (Figure 103), change the default log location by finding the Log Dir property and modifying the /var prefix to /data/disk1.
Figure 103 ZooKeeper Tab
Under the Falcon tab, (Figure 104), change the default log location by finding the Log Dir property and modifying the /var prefix to /data/disk1.
Figure 104 Falcon Tab
1. Under the Storm tab, (Figure 105), change the default log location by finding the Log Dir property and modifying the /var prefix to /data/disk1.
Figure 105 Storm Tab
1. Choose the Ambari Metrics Service, (Figure 106), from the tab and expand the general tab and make the changes below:
2. Enter the Grafana Admin password as per organizational policy.
3. Change the default log location for Metrics Collector, Metrics Monitor and Metrics Grafana by finding the Log Dir property and modifying the /var prefix to /data/disk1.
4. Change the default data dir location for Metrics Grafana by finding the data Dir property and modifying the /var prefix to /data/disk1.
Figure 106 Ambari Metrics
Under the Flume tab, change the default log location by finding the Log Dir property and modifying the /var prefix to /data/disk1.
Figure 107 Flume Tab
Choose Accumulo Service (Figure 108), from the tab and expand the general tab and make the changes below:
1. Enter the Accumulo root password as per organizational policy.
2. Enter the Accumulo instance Secret password as per organizational policy.
3. change the default log location by finding the Log Dir property and modifying the /var prefix to /data/disk1.
Figure 108 Accumulo Service
Under the Atlas tab, (Figure 109), change the default log location by finding the Log Dir property and modifying the /var prefix to /data/disk1.
Figure 109 Atlas Tab
1. Under the Kafka tab, change the default log location by finding the Log Dir property and modifying the /var prefix to /data/disk1.
Figure 110 Kafka Tab
1. Choose Knox Service, (Figure 111), from the tab and expand the Knox gateway tab and make the changes below:
2. Enter the Knox Master Secret password as per organizational policy.
3. For Knox, change the gateway port to 8444 to ensure no conflicts with local HTTP server.
Figure 111 Knox Service
No changes are required.
Figure 112 Mahout Tab
No changes are required.
Figure 113 Slider
Figure 114 shows the SmartSense tab. This requires the Hortonworks support subscription. Subscribers can populate the properties below.
Figure 114 SmartSense
1. Select the Spark tab, (Figure 115), change the default log location by finding the Log Dir property and modifying the /var prefix to /data/disk1.
Figure 115 Spark Tab
No changes are required.
Figure 116 Misc Tab
The assignments that have been made are displayed, (Figure 117). Check to ensure everything is correct before clicking on the Deploy button. If any changes are to be made, use the left navigation bar to return to the appropriate screen.
1. Once the review is complete, click the Deploy button.
Figure 117 Review the Configuration
The progress of the install is shown on the screen as shown in Figure 118. Each component is installed and started and a simple test is run on the component. The next screen displays the overall status of the install in the progress bar at the top of the screen and a host-by-host status in the main section.
2. To see specific information on what tasks have been completed per host, click the link in the Message column for the appropriate host.
3. In the Tasks pop-up, click the individual task to see the related log files.
4. Select filter conditions by using the Show dropdown list.
5. To see a larger version of the log contents, click the Open icon or to copy the contents to the clipboard, use the Copy icon.
Depending on which components are installing, the entire process may take 10 or more minutes.
6. When successfully installed and started the service appears, click Next.
Figure 118 displays the install progress screen.
Figure 118 Install Progress Screen
Figure 119 shows a summary of the accomplished tasks.
7. Click Complete.
Figure 119 Summary Screen
Cisco UCS Integrated Infrastructure for Big Data and Analytics offers several configurations to meet a variety of computing and storage requirements for HDF. Ideally these include the Cisco UCS C220 M4 servers with the following configuration shown in Figure 120 and Table 13 below.
Figure 120 Reference Architecture for Spark Streaming with HDF & HDP
Table 13 Cisco UCS Reference Architecture for HDF
Starter |
High Performance |
8 Cisco UCS C220 M4 Rack Servers, each with: 2 Intel Xeon processor E5- 2620 v4 CPUs (8 cores each) 128 GB of memory 8 x 1.2-TB 10,000-RPM SFF SAS drives Total of 10 TB of storage capacity 1.4 GBps of I/O bandwidth Cisco UCS VIC 1227 Cisco 12-Gbps SAS Modular RAID Controller with 2-GB FBW |
8 Cisco UCS C220 M4 Rack Servers, each with: 2 Intel Xeon processor E5- 2680 v4 CPUs (14 cores each) 256 GB of memory 8 x 960-GB SFF SSDs Total of 7.5 TB of flash storage 4 GBps of I/O bandwidth Cisco UCS VIC 1227 Cisco 12-Gbps SAS Modular RAID Controller with 2-GB FBWC |
This section describes in detail the RAID configuration of disk drives for OS on HDF nodes. The first two disk drives are configured as RAID1, read ahead cache is enabled and write cache is enabled while battery is present and as OS partition.
The remaining disk drives, 6 (in case of Cisco UCS C220 M4) are used for data and is configured as 3 two disk RAID1 volumes with read ahead cache is enabled and write cache is enabled while battery is present is described in the next section.
To configure Disk Drives for Operating System on Cisco UCS C220 M4 servers, complete the following steps.
1. Log in to the Cisco UCS 6296 Fabric Interconnect and launch the Cisco UCS Manager application.
2. Select the Equipment tab as in Figure 121.
3. In the navigation pane expand Rack-Mounts and then Servers.
4. Right click on the server and select KVM Console.
Figure 121 KVM Console
5. Restart the server by using KVM Console, Macros > Static Macros > Ctrl-Alt-Del as shown in Figure 122.
Figure 122 Restart The Server
6. Press <Ctrl>R to enter the Cisco SAS Modular Raid Controller BIOS Configuration Utility. This is shown in Figure 123 below.
7. Select the controller and press F2.
Clear the Configuration if previous configurations are present.
8. Select Create Virtual Drive.
Figure 123 Cisco SAS Modular Raid Controller BIOS Configuration Utility
9. Select RAID level RAID-1, select the first two drives and choose Advanced.
Figure 124 Cisco SAS Modular Raid Controller BIOS Configuration Utility
10. Select the following:
Strip Size is 64KB.
Read Policy is Normal.
Write Policy is Write Through.
I/O Policy is Direct.
Disk Cache Policy is unchanged.
Emulation is Default.
11. Select Initialize.
Figure 125 Create Virtual Drive
Initialization will destroy data on the virtual drives.
12. Select OK to continue.
Figure 126 Cisco SAS Modular Raid Controller BIOS Configuration Utility
Figure 127 Cisco SAS Modular Raid Controller BIOS Configuration Utility
13. Select OK to continue.
14. Press the ESC key, then select OK to exit the utility.
15. Press Ctrl-N to move to the Ctrl Mgmt tab.
16. Go to the Boot Device option and select the VD 0 930.39 GB for C220 M4.
17. Apply the settings and exit.
The remaining 6 disk drives are used for data and are configured as 3 two disk RAID1 volumes with read ahead cache is enabled and write back cache is enabled while battery is present is described in the next section.
1. Repeat the steps above 3 times to create 3 RAID1 volumes of two drives each.
2. Select the following:
Strip Size is 64KB.
Read Policy is Normal.
Write Policy is Write Back.
I/O Policy is Direct.
Disk Cache Policy is unchanged.
Emulation is Default.
The three separate volumes are needed for
· Flowfile
· Content Repository
· Provenance Repository
Follow the same steps to install RHEL 7.2 on the boot drives as shown in the C240 M4 server section.
The following script will format and mount the available volumes on each node. OS boot partition is going to be skipped. All drives are going to be mounted based on their UUID as flowfile_repo, cont_repo, and prov_repo.
1. On the Admin node, create a file containing the following script.
2. To create partition tables and file systems on the local disks supplied to each of the nodes, run the following script as the root user on each node.
Note: The script assumes there are no partitions already existing on the data volumes. If there are partitions, delete them before running the script. This process is documented in the "Note" section at the end of the section.
#vi /root/driveconf_nifi.sh
#!/bin/bash
#disks_count=`lsblk -id | grep sd | wc -l`
#if [ $disks_count -eq 24 ]; then
# echo "Found 24 disks"
#else
# echo "Found $disks_count disks. Expecting 24. Exiting.."
# exit 1
#fi
[[ "-x" == "${1}" ]] && set -x && set -v && shift 1
count=1
for X in /sys/class/scsi_host/host?/scan
do
echo '- - -' > ${X}
done
for X in /dev/sd?
do
echo "========"
echo $X
echo "========"
if [[ -b ${X} && `/sbin/parted -s ${X} print quit|/bin/grep -c boot` -ne 0
]]
then
echo "$X bootable - skipping."
continue
else
Y=${X##*/}1
echo "Formatting and Mounting Drive => ${X}"
/sbin/mkfs.xfs -f ${X}
(( $? )) && continue
#Identify UUID
UUID=`blkid ${X} | cut -d " " -f2 | cut -d "=" -f2 | sed 's/"//g'`
if [[ $count -eq 1 ]]; then
folder="flowfile_repo";
elif [[ $count -eq 2 ]]; then
folder="cont_repo";
else
folder="prov_repo";
fi
echo ${folder}
/bin/mkdir -p /${folder}
(( $? )) && continue
echo "UUID of ${X} = ${UUID}, mounting ${X} using UUID on /${folder}"
/bin/mount -t xfs -o inode64,noatime -U ${UUID} /${folder}
(( $? )) && continue
echo "UUID=${UUID} /${folder} xfs inode64,noatime 0 0" >> /etc/fstab
((count++))
fi
done
3. Run the following command to copy driveconf_nifi.sh to all the nodes:
#chmod 755 /root/driveconf_nifi.sh
#clush –a -B –c /root/driveconf_nifi.sh
4. Run the following command from the admin node to run the script across all data nodes
#clush –a –B /root/driveconf_nifi.sh
5. Run the following from the admin node to list the partitions and mount points
#clush –a -B df –h
#clush –a -B mount
#clush –a -B cat /etc/fstab
Please follow the similar steps as HDP nodes post OS configuration mentioned above with additional steps as described below.
Note: For more information please check https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#configuration-best-practices
NiFi will at any one time potentially have a very large number of file handles open. Increase the limits by editing /etc/security/limits.conf to add something like:
* hard nofile 50000
* soft nofile 50000
NiFi may be configured to generate a significant number of threads.
1. To increase the allowable number, edit /etc/security/limits.conf.
* hard nproc 10000
* soft nproc 10000
2. The distribution may require an edit to /etc/security/limits.d/90-nproc.conf by adding:
* soft nproc 10000
3. Increase the number of TCP socket ports available. This is particularly important if the flow will be setting up and tearing down a large number of sockets in a small period of time.
sudo sysctl -w net.ipv4.ip_local_port_range="10000 65000"
4. Set how long sockets stay in a TIMED_WAIT state when closed. Don’t let the sockets sit and linger too long. The flow will be setting up and tearing down a large number of sockets in a small period of time. Read more about it, but to adjust, try the following:
sudo sysctl -w net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait="1"
5. Set Linux never to NiFi swap.
Swapping is fantastic for some applications. It is not good for NiFi that always wants to be running.
6. To turn swapping off, edit /etc/sysctl.conf and add the following line:
vm.swappiness = 0
1. On an Internet connected node, create directory called HDF.
mkdir –p /tmp/hdf
cd /tmp/hdf
We are using site to site connectivity for this exercise.
Node 1- distribution node (Internet connectivity)
Node 2-7 – HDF cluster
2. Based on Post OS settings, have the clustershell installed on a Distribution node (edge node) to run the clush command.
To setup HDF, complete the following steps:
1. Running as root, download HDF-1.2.0.1-1.zip using wget below command.
wget http://public-repo-1.hortonworks.com/HDF/centos6/1.x/updates/1.2.0.1/HDF-1.2.0.1-1.zip
2. Copy HDF zip file to /root/distribution on Distribution node and unzip.
mkdir distribution
3. Create user nifi on all nodes (distribution and HDF nodes).
clush -a -b adduser nifi
4. Open the following ports on the firewall using these commands.
firewall-cmd --zone=public --add-port=80/tcp --permanent
firewall-cmd --zone=public --add-port=8080/tcp --permanent
firewall-cmd --zone=public --add-port=9090/tcp --permanent
firewall-cmd --zone=public --add-port=9080/tcp --permanent
firewall-cmd --zone=public --add-port=9088/tcp --permanent
firewall-cmd --zone=public --add-port=7088/tcp --permanent
firewall-cmd --zone=public --add-port=6088/tcp --permanent
firewall-cmd --reload
5. From the distribution node run, create the following directories needed for HDF.
[root@hdf ~]# clush -a -b mkdir -p /opt/configuration-resources
[root@hdf ~]# clush -a -b mkdir -p /opt/database_repo
[root@hdf ~]# clush -a -b mkdir -p /nifi-log
[root@hdf ~]# clush -a -b chown -R nifi:nifi /opt/database_repo
[root@hdf ~]# clush -a -b chown -R nifi:nifi /opt/configuration-resources
[root@hdf ~]# clush -a -b chown -R nifi:nifi /nifi-log
6. Copy the files from the distribution node to all other nodes as follows:
cd /home/nifi/distribution/HDF-1.2.0.1-1/nifi/conf
[root@hdf conf]# clush -a -b -c authority-providers.xml --dest=/opt/configuration-resources
[root@hdf conf]# clush -a -b -c authorized-users.xml --dest=/opt/configuration-resources
[root@hdf conf]# clush -a -b -c bootstrap-notification-services.xml --dest=/opt/configuration-resources
[root@hdf conf]# clush -a -b -c login-identity-providers.xml --dest=/opt/configuration-resources
[root@hdf conf]# clush -a -b -c state-management.xml --dest=/opt/configuration-resources
[root@hdf conf]# clush -a -b -c zookeeper.properties --dest=/opt/configuration-resources
[root@hdf conf]# clush -a -b chown -R nifi:nifi /opt/configuration-resources/*
[root@hdf conf]# clush -a -b ls -l /opt/configuration-resources/
7. Create the following directories needed for HDF as follows:
[root@hdf conf]# clush -a -b mkdir /opt/configuration-resources/templates
[root@hdf conf]# clush -a -b mkdir /opt/configuration-resources/archive
[root@hdf conf]# clush -a -b chown -R nifi:nifi /opt/configuration-resources/*
8. On the distribution-node, delete the following files in the nifi conf directory:
cd /home/nifi/distribution/HDF-1.2.0.1-1/nifi/conf
[root@hdf conf]# rm -f authority-providers.xml authorized-users.xml bootstrap-notification-services.xml login-identity-providers.xml state-management.xml
zookeeper.properties
[root@hdf conf]# ls
bootstrap.conf , logback.xml, nifi.properties
[nifi@hdf conf]$ pwd
9. Modify Nifi.properties.
10. Update file nifi.properties with settings based on the directories created above.
cd /home/nifi/distribution/HDF-1.2.0.1-1/nifi/conf
vi nifi.properties
11. Modify file/directory location for the following as shown in the screenshots.
/conf in nifi.properties are updated to /opt/configuration-resources
12. Update the flowfile repo as /flowfile_repo.
13. Update the content repo directory as /cont_repo.
14. Set the http host to Change_host, this will be updated later as a script. Update the port number.
15. Set the http host to Change_host, this will be updated later as a script. Update the port number.
Modify Bootstrap.conf
16. Update bootstrap.conf as shown in the screenshot.
Modify logback.xml
1. Modify logback.xml to point to the right log file as shown below:
<file>logs/nifi-app.log</file> to
<file>/nifi-log/nifi-app.log</file>
2. Search for <fileNamePattern> and change.log to .log.gz.
<fileNamePattern>./logs/nifi-app_%d{yyyy-MM-dd_HH}.%i.log</fileNamePattern>
3. Now copy the entire distribution directory to all the nifi nodes under the same location.
[root@hdf distribution]# clush -a -b -c HDF-1.2.0.1-1 --dest=/opt/
[root@hdf distribution]# clush -a -b ls /opt/
4. Change owner on all files in /opt/HDF-<x> to nifi:nifi as shown below
[root@hdf distribution]# clush -a -b chown -R nifi:nifi /opt/HDF-1.2.0.1-1
5. Change owner on all the following mount points as well
6. Run df –h command on all nodes to check mount points
[root@hdf ~]# clush -a -b chown -R nifi:nifi /flowfile_repo/
[root@hdf ~]# clush -a -b chown -R nifi:nifi /prov_repo/
[root@hdf ~]# clush -a -b chown -R nifi:nifi /cont_repo/
7. Write this script to make changes in nifi.properties on all nodes:
[root@hdf ~]# vi nifi-changes.sh
host=`hostname -f`
echo $host
sed -i.bak "s/change_host/$host/g" /opt/HDF-1.2.0.1-1/nifi/conf/nifi.properties
[root@hdf ~]# clush -a -b -c ./nifi-changes.sh
[root@hdf ~]# clush -a -b ./nifi-changes.sh
8. Run this command to identify changes are applied
[root@hdf ~]# clush -a -b "diff /opt/HDF-1.2.0.1-1/nifi/conf/nifi.*"
There are 3 types of Nifi nodes:
1. Standalone - distribution node ( internet connectivity) Edge node
2. Master
3. Slave
To bring up each of these nodes, complete the following steps:
The Standalone system is ready, no changes are required.
1. Start the service on the standalone system:
[root@hdf ~]$ /opt/HDF-1.2.0.1-1/nifi/bin/nifi.sh start
1. Connect to the master node and perform the following steps:
2. In nifi properties, update manager to true:
vi nifi.properites
nifi.cluster.is.manager = true
3. Start the service on the master system:
[root@hdf ~]$ /opt/HDF-1.2.0.1-1/nifi/bin/nifi.sh start
1. In nifi properties, update node to true to make this a slave node
nifi.cluster.is.node = true
2. Run the following command on all slave nodes:
sed -i.bak "s/nifi.cluster.is.node=false/nifi.cluster.is.node=true/g" /opt/HDF-1.2.0.1-1/nifi/conf/nifi.properties
3. Start service on slave nodes:
[root@hdf ~]$ /opt/HDF-1.2.0.1-1/nifi/bin/nifi.sh start
At this point the NiFi cluster should be up and running.
This section provides the BOM for the 64 nodes Performance Optimized Cluster. See Table 14 for BOM for the master rack, Table 15 for BOM for expansion racks (racks 2 to 4), Table 16 and Table 17 for software components. Table 18 lists Hortonworks Data Platform (HDP) SKUs available from Cisco. Table 19 and Table 20 list Hortonworks Dataflow Starter Configuration SKUs.
If UCSD-SL-CPA4-P2 is added to the BOM all the required components for 16 servers only are automatically added. If not customers can pick each of the individual components that are specified after this and build the BOM manually.
Table 14 Bill of Materials for C240M4SX Base Rack
If using the FI 6332 please refer to Table 1 for the SKU information.
Table 15 Bill of Materials for Expansion Racks
Table 16 Red Hat Enterprise Linux License
Red Hat Enterprise Linux |
||
RHEL-2S2V-3A |
Red Hat Enterprise Linux |
64 |
CON-ISV1-EL2S2V3A |
3 year Support for Red Hat Enterprise Linux |
64 |
Table 17 SKUS for Hortonworks Subscription
Cisco PID (TOP level) |
Cisco Subscription PID |
Description |
UCS-BD-HDP-JSS= |
UCS-BD-HDP-JSS-6M |
HDP Data Platform Jumpstart Subscription - Up to 16 Nodes – 1 Business Day Response - 6 Months |
UCS-BD-HDP-ENT-ND= |
UCS-BD-ENT-ND-1Y |
HDP Enterprise Subscription - 4 Nodes - 24x7 Sev 1 Response - 1 Year |
UCS-BD-HDP-ENT-ND= |
UCS-BD-ENT-ND-2Y |
HDP Enterprise Subscription - 4 Nodes - 24x7 Sev 1 Response - 2 Year |
UCS-BD-HDP-ENT-ND= |
UCS-BD-ENT-ND-3Y |
HDP Enterprise Subscription - 4 Nodes - 24x7 Sev 1 Response - 3 Year |
UCS-BD-HDP-EPL-ND= |
UCS-BD-EPL-ND-1Y |
HDP Enterprise Plus Subscription - 4 Nodes - 24x7 Sev 1 Response - 1 Year |
UCS-BD-HDP-EPL-ND= |
UCS-BD-EPL-ND-2Y |
HDP Enterprise Plus Subscription - 4 Nodes - 24x7 Sev 1 Response - 2 Year |
UCS-BD-HDP-EPL-ND= |
UCS-BD-EPL-ND-3Y |
HDP Enterprise Plus Subscription - 4 Nodes - 24x7 Sev 1 Response - 3 Year |
Hortonworks SKU |
Description |
HDF-JSS-6M |
Hortonworks DataFlow Jumpstart Subscription - 32 Cores and 10 Edge Devices – Business Day Response - 6 Months |
HDF-ENT-1Y |
Hortonworks DataFlow Enterprise Subscription - 16 Cores – 24x7 Sev 1 Response - 1 Year |
HDF-EDG-1Y |
Hortonworks DataFlow Edge Pack Subscription - 100 Edge Devices – 24x7 Sev 1 Response - 1 Year |
Note: For the Hortonworks Dataflow, the following SKU’s are available, each with 8 x Cisco UCS C220M4 in the configuration.
1. Starter Configuration for test and development, and smaller deployments.
Table 19 Hortonworks Dataflow Starter Configuration SKUs
Base Solution SKU |
UCS-SL-CPA4-S |
Server SKU (C220 M4) |
UCS-SPBD-C220M4-S1 |
2. High Performance Configuration with SSDs.
Table 20 Hortonworks Dataflow High Performance Configuration
Base Solution SKU |
UCS-SL-CPA4-H |
Server SKU (C220 M4) |
UCS-SPBD-C220M4-H1 |
Manan Trivedi is a Big Data Solutions Architect in the Data Center Solutions Group, Cisco Systems Inc. Manan is part of the Big Data solution engineering team focusing on big data infrastructure and performance.
Ali Bajwa, Principal Partner Solutions Engineer, Technology Alliances Team, Hortonworks, Inc.
Ali is a Senior Partner Solutions Engineer at Hortonworks and works as part of the Technology Alliances team. His focus is to evangelize and assist partners integrate with Hortonworks Data Platform.
· Karthik Kulkarni, Big Data Solutions Architect, Data Center Solutions Group, Cisco Systems Inc.
· Barbara Dixon, Technical Writer, Data Center Solutions Group, Cisco Systems, Inc.
Network bonding can be setup on the vNICs on the hosts for redundancy as well as for increased throughput. Below is example of how bonding can be configured on a 16-node cluster. See Figure 128 below.
Figure 128 vNIC Bonding Set Up diagram
Enabling bonding will need the Fabric Interconnects to connect to any L2/L3 switch in this example the N9K-9372PX switch is used. All the ports that will be used will need to be bundled in a port channel. In this example we bundled 8 ports into the port channel. In this configuration 3 vNICs are used on each FI, while one vNIC on each is sufficient the reason for this is multiple vNIC bonding works much better with bonding mode 6, because both sending and receiving frames are load balanced with different MAC addresses.
By the very definition of bond mode 6 (balance-alb),“when a link is reconnected or a new slave joins the bond the received traffic is redistributed among all active slaves in the bond by initiating ARP Replies with the selected MAC address to each of the clients.” This explains the performance improvement that is observed with addition of multiple vNICS.
VLANs are configured as shown in Table 21 below:
VLAN |
Fabric |
NIC Port |
Function |
Vlan19 |
|
bond0 |
Data |
|
A |
eth0 |
Data |
|
B |
eth1 |
Data |
|
A |
eth2 |
Data |
|
B |
eth3 |
Data |
|
A |
eth4 |
Data |
|
B |
eth5 |
Data |
All NICs are carrying data traffic in Vlan19.
16 upstream ports bundled in Port channel created between Fabric Interconnect and upstream switch (Nexus N9K-C9372PX)
1. Configure eth0 as shown below in Figure 129, and click Apply to enable the changes.
Figure 129 Configure eth0
All vNICs (eth0-eth5) are the same as the above configuration. The only change will be the Fabric ID as noted below:
· eth0, eth2, and eth4 are on the link going to Fabric A
· eth1, eth3, and eth5 are on the link going to Fabric B
· All 6 vNICs are bonded together in Mode 6 to form bond0 interface.
To configure bonding on the OS Host machines, complete the following steps:
1. Configure /etc/modules.conf as follows:
alias bond0 bonding
Sample ifcfg-eth0 file
DEVICE="eth0"
BOOTPROTO="none"
MTU="9000"
ONBOOT="yes"
TYPE="Ethernet"
MASTER=bond0
SLAVE=yes
2. All the vNICs that are being bonded will need to be configured as SLAVES as shown above on all the hosts, with bond0 as the master.
Sample ifcfg-bond0 file
DEVICE=bond0
TYPE=Bond
BONDING_MASTER=yes
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
NETMASK=255.255.255.0
GATEWAY=$GATEWAYIP
BONDING_OPTS="mode=6 miimon=100 xmit_hash_policy=0"
BOOTPROTO="none"
ONBOOT="yes"
MTU="9000"
IPADDR=”$HOSTIP”
This section explains how to create a sample application which continuously feeds Twitter live -streams through Hortonworks DataFlow, and sends this to Apache Solr for indexing.
To create the sample application, complete the following steps:
· Create a Twitter Application (Twitter Credentials)
· Install Apache Solr and Banana dashboard
· Configure HDF to pull data from Twitter and store in Solr
To skip having to register a personal Twitter application, and use previous data, go to the next section to download the sample dataset.
To pull live data from Twitter in this tutorial, register a personal Twitter application.
1. Go to the Twitter Apps Website and sign in using a personal Twitter account
2. Click Create a New App.
3. Fill in details about the application.
4. Click Create Your Twitter Application at the bottom of the screen after reading the developer agreement.
Note: Add your mobile phone to a personal Twitter account before creating the application
Shows the dashboard for the Twitter application.
5. Click the permissions tab and select the Read Only Option and Update the application.
6. Generate the OAuth key.
7. Click Test OAuth on the top of the permissions page, or go to Keys and Access Tokens and find the option to generate the OAuth tokens.
The keys and access tokens should look similar to the following:
Make note of the Consumer Key, Consumer Secret, Access Token, and Access Token Secret. These are used to create the data flow in NiFi.
1. To enable search, install the hdpsearch package as show below on the existing Hadoop cluster nodes.
clush –a –b yum install -y lucidworks-hdpsearch
2. From the internet connected node, download the default.json files and copy them to the admin node of the Hadoop cluster.
mkdir –p /tmp/solr
cd /tmp/solr
wget https://raw.githubusercontent.com/abajwa-hw/ambari-nifi-service/master/demofiles/default.json
3. Connect to rhel1 (Hadoop admin node) and perform the following steps to copy the default.json file from root to the banana dashboard.
[root@rhel1 ~]# cp default.json /opt/lucidworks-hdpsearch/solr/server/solr-webapp/webapp/banana/app/dashboards/
cp: overwrite ‘/opt/lucidworks-hdpsearch/solr/server/solr-webapp/webapp/banana/app/dashboards/default.json’? y
4. Edit the Solr configuration file to add the date format compatible with twitter data.
vi /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs/conf/solrconfig.xml
5. Search for ParseDateFieldUpdateProcessorFactory and the add the following line at the beginning:
<str>EEE MMM d HH:mm:ss Z yyyy</str>
6. Backup the solrconfig.xml file to root on all the nodes.
[root@rhel1 ~]# clush -a -b cp /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs/conf/solrconfig.xml /root/solrconfig.xml-bkp
7. Copy the solrconfig.xml file from rhel1 to the all nodes.
[root@rhel1 ~]# clush -a -b -c /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs/conf/solrconfig.xml
8. Add zookeeper node details to the solr configuration with the following commands
cd /opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts
9. Edit this line with lists of the nodes which zookeeper is running on.
/zkcli.sh -zkhost rhel1,rhel2,rhel3 -cmd makepath /solr
10. Start Solr on all the nodes.
clush -a -b /opt/lucidworks-hdpsearch/solr/bin/solr start -c -z rhel1,rhel2,rhel3:2181/solr
11. Create Solr collection with 2 shards and replication factor 2.
/opt/lucidworks-hdpsearch/solr/bin/solr create -c tweets -d data_driven_schema_configs -s 2 -rf 2
1. From your Hadoop cluster, copy core-site.xml and put the HDF cluster on all the nodes under /opt/configuration-resources.
clush -a -b -c /root/core-site.xml –dest=/opt/ configuration-resources/
2. Go to http://10.4.1.240:8080/nifi/ (Distribution Node)
The top left icon corresponds to a “processor” which is the building block used to create NiFi flows on the workspace.
3. Drag/drop the processor icon (top left of page) onto the workspace to display a list of available processors. To find the GetTwitter processor, type “twitter” in the search box (edit screenshot and show some pointer)
Note: To create a twitter application or credentials, please see information in Appendix B.
4. Right click on the GetTwitter Property, and update the twitter endpoint to the filter endpoint and provide the twitter application credentials. Allow terms to filter to the section with the tweet selected.
5. Drag the Process group icon from the left tab (the 5th from the left) to the workspace.
6. Connect to the cluster node NIFI :
7. Go to the http://10.4.1.241:8080/nifi/ ( HDF cluster IP)
1. Click the link below to download the NiFi data flow template for the Twitter Dashboard:
2. Make note of the download location of this file. It is used in the next step.
3. Open up the NiFi user interface found at http://10.4.1.241:8080/nifi/ .
4. Import the downloaded template into NiFi.
5. Import the template by clicking the Templates icon on the top right corner of the screen (Third from the right).
6. Click Browse and navigate to the CVD-Cluster-Twitter-Demo-Template.xml file that was previously downloaded.
7. Click Import.
The template appears as shown below.
Now that the template is imported into NiFi, to instantiate it:
8. Drag the template icon (the 7th from the left) onto the workspace.
9. A dialog box should appear. Make sure that CVD-Cluster-Twitter-Demo-Template is selected and click Add.
10. A screen similar to the following should appear:
The NiFi flow has been set up. The boxes are what NiFi calls processors. Each of the processors can be connected to one another to help make the data flow. Each processor can perform specific tasks and are at the very heart of NiFi’s functionality.
Note: To make the data flows look clean, double click a connection arrow to create a vertex.
Make the connections between the processors 90 degree angles with respect to one another.
Right-click on a few of the processors and look at their configuration. This shows how the Twitter flow works.
Note: Change the Solr location to the Hadoop zookeeper nodes IP addresses.