Hadoop Architecture & Administration Training for Big Data Solutions

Level: Intermediate
Average rating: 4.72/5, based on 36 reviews

In this Hadoop Architecture and Administration big data training course, you gain the skills to install, configure, and manage the Apache Hadoop platform and its associated ecosystem, and build a Hadoop big data solution that satisfies your business and data science requirements. You will learn to install and build a Hadoop cluster capable of processing very large data sets, then configure and tune the Hadoop environment to ensure high throughput and availability.

Additionally, this course will teach attendees how to allocate, distribute and manage resources; monitor the Hadoop file system, job progress and overall cluster performance; as well as exchange information with relational databases.

Key features:

  • After-course instructor coaching benefit
  • Learning Tree end-of-course exam included
  • After-course computing sandbox included

You will learn how to:

  • Architect a Hadoop solution to satisfy your business requirements
  • Install and build a Hadoop cluster capable of processing very large data sets
  • Configure and tune the Hadoop environment to ensure high throughput and availability
  • Allocate, distribute, and manage resources
  • Monitor the file system, job progress, and overall cluster performance

Choose the training format that suits you best

LIVE, INSTRUCTOR-LED

In-class and live online training

  • 4-day instructor-led training course
  • After-course instructor coaching benefit
  • Learning Tree end-of-course exam included

IN-HOUSE TRAINING

Team training

  • Use this or any other course within your company
  • Full-scale program development
  • Delivered when, where, and how you want
  • Blended learning models
  • Tailored content
  • Coaching by a team of experts

Customize the course and content to your team's needs

Contact us

Develop yourself and your team with customized or open courses, or with e-learning

Learning Tree offers customized training at your site, open courses in Stockholm, London, or Washington, the option to attend through our Anywhere centers (Malmö, Göteborg, Linköping, Stockholm, or Borlänge), and various forms of instructor-supported e-learning. Read more at www.learningtree.se/priser .

In-class and live online training

Note: This course runs for 4 days *

*Events with the Partial Day Event clock icon run longer than normal but provide the convenience of half-day sessions.

  • 13 - 16 Oct, 9:00 - 4:30 EDT, Online (AnyWare)

  • 12 - 15 Jan, 9:00 - 4:30 EST, New York / Online (AnyWare)

  • 29 Mar - 1 Apr, 9:00 - 4:30 EDT, Ottawa / Online (AnyWare)

  • 20 - 23 Jul, 9:00 - 4:30 EDT, New York / Online (AnyWare)

Guaranteed-to-run courses

When you see the “Guaranteed to Run” symbol next to a course date, you know the course will run. Guaranteed.

Partial Day Event

Learning Tree offers a flexible schedule program. If you cannot attend full-day sessions, this option provides four-hour sessions each day instead.

Hadoop Administration Course Information

  • Recommended Experience

    • Knowledge of Linux at the level of:
    • Knowledge of Java at the level of:

Hadoop Administration Course Outline

  • Introduction to Data Storage and Processing

    Installing the Hadoop Distributed File System (HDFS)

    • Defining key design assumptions and architecture
    • Configuring and setting up the file system
    • Issuing commands from the console
    • Reading and writing files

    Setting the stage for MapReduce

    • Reviewing the MapReduce approach
    • Introducing the computing daemons
    • Dissecting a MapReduce job
  • Defining Hadoop Cluster Requirements

    Planning the architecture

    • Selecting appropriate hardware
    • Designing a scalable cluster

    Building the cluster

    • Installing Hadoop daemons
    • Optimising the network architecture
  • Configuring a Cluster

    Preparing HDFS

    • Setting basic configuration parameters
    • Configuring block allocation, redundancy and replication

    Deploying MapReduce

    • Installing and setting up the MapReduce environment
    • Delivering redundant load balancing via Rack Awareness
  • Maximising HDFS Robustness

    Creating a fault-tolerant file system

    • Isolating single points of failure
    • Maintaining High Availability
    • Triggering manual failover
    • Automating failover with ZooKeeper

    Leveraging NameNode Federation

    • Extending HDFS resources
    • Managing the namespace volumes

    Introducing YARN

    • Critiquing the YARN architecture
    • Identifying the new daemons
  • Managing Resources and Cluster Health

    Allocating resources

    • Setting quotas to constrain HDFS utilization
    • Prioritising access to MapReduce using schedulers

    Maintaining HDFS

    • Starting and stopping Hadoop daemons
    • Monitoring HDFS status
    • Adding and removing data nodes

    Administering MapReduce

    • Managing MapReduce jobs
    • Tracking progress with monitoring tools
    • Commissioning and decommissioning compute nodes
  • Maintaining a Cluster

    Employing the standard built-in tools

    • Managing and debugging processes using JVM metrics
    • Performing Hadoop status checks

    Tuning with supplementary tools

    • Assessing performance with Ganglia
    • Benchmarking to ensure continued performance
  • Extending Hadoop

    Simplifying information access

    • Enabling SQL-like querying with Hive
    • Installing Pig to create MapReduce jobs

    Integrating additional elements of the ecosystem

    • Imposing a tabular view on HDFS with HBase
    • Leveraging memory with Spark
  • Implementing Data Ingress and Egress

    Facilitating generic input/output

    • Moving bulk data into and out of Hadoop
    • Transmitting HDFS data over HTTP with WebHDFS

    Acquiring application-specific data

    • Collecting multi-sourced log files with Flume
    • Importing and exporting relational information with Sqoop
  • Planning for Backup, Recovery and Security

    • Coping with inevitable hardware failures
    • Securing your Hadoop cluster


Hadoop Administration Training FAQs

  • Can I learn Hadoop Architecture and Administration online?

    Yes! We know your busy work schedule may prevent you from getting to one of our classrooms, which is why we offer convenient online training to meet your needs wherever you are.

  • Where does MongoDB fit in my data science training?

    A data science algorithm will ingest data from an appropriate storage technology, such as a relational database, MongoDB, or the Hadoop Distributed File System (HDFS), into R or Python for data wrangling and model building. If the amount of data is large, execution is performed in parallel using Spark. The results will often be visualised by the end user on dashboards.

Questions about which training is right for you?

Call 08-506 668 00




100% Satisfaction Guaranteed

Your Training Comes with a 100% Satisfaction Guarantee!*

*Partner-delivered courses may have different terms that apply. Ask for details.
