Hadoop Java Programming Training for Big Data Solutions

Level: Intermediate
Average rating: 4.54/5, based on 54 reviews

In this Hadoop Java Programming course, you will implement a strategy for developing Hadoop jobs and extracting business value from large and varied data sets. This Apache Hadoop development training is essential for programmers who want to extend their programming skills and use Hadoop for a variety of big data solutions. You will learn to write, customise and deploy MapReduce jobs that summarise data, and to load and retrieve unstructured data from HDFS and HBase. In addition, you will develop Hive and Pig queries to simplify data analysis, and test and debug jobs using MRUnit.

Key features:

  • After-course instructor coaching benefit
  • Learning Tree end-of-course exam included
  • After-course computing sandbox included

You will learn to:

  • Write, customise, and deploy Java MapReduce jobs to summarise data
  • Develop Hive and Pig queries to simplify data analysis
  • Test and debug jobs using MRUnit
  • Monitor task execution and cluster health

Choose the training format that suits you best

LIVE, INSTRUCTOR-LED

Classroom and self-study

  • 4-day instructor-led training course
  • After-course instructor coaching benefit
  • Learning Tree end-of-course exam included

IN-HOUSE CORPORATE TRAINING

Team training

  • Use this or any other course within your company
  • Full-scale program development
  • Delivered when, where and how you want
  • Blended learning models
  • Tailored content
  • Coaching by a team of experts

Tailor the course and content to your team's needs

Contact us

Develop yourself and your team with customised or open courses, or with e-learning

Learning Tree offers customised training at your premises, open courses in Stockholm, London or Washington, the option to attend via our Anywhere centers (Malmö, Göteborg, Linköping, Stockholm or Borlänge), and various forms of instructor-supported e-learning. Read more at www.learningtree.se/priser.

Classroom and self-study

Note: This course runs for 4 days

  • 21 - 24 Jan, 9:00 - 4:30 EST, New York / Online (AnyWare)

  • 18 - 21 Feb, 9:00 - 4:30 EST, Herndon, VA / Online (AnyWare)

  • 23 - 26 Jun, 9:00 - 4:30 EDT, New York / Online (AnyWare)

  • 4 - 7 Aug, 9:00 - 4:30 EDT, Herndon, VA / Online (AnyWare)

Guaranteed to Run courses

When you see the “Guaranteed to Run” symbol next to a course date, you know the course will take place. Guaranteed.

Hadoop Java Programming Course Information

  • Requirements

    • Java experience at the level of:
      • Course 471, Java Programming Introduction, or at least six months of Java programming experience

Hadoop Java Programming Course Outline

  • Introduction to Hadoop

    • Identifying the business benefits of Hadoop
    • Surveying the Hadoop ecosystem
    • Selecting a suitable distribution
  • Parallelising Program Execution

    Meeting the challenges of parallel programming

    • Investigating parallelisable challenges: algorithms, data and information exchange
    • Estimating the storage and complexity of Big Data

    Parallel programming with MapReduce

    • Dividing and conquering large-scale problems
    • Uncovering jobs suitable for MapReduce
    • Solving typical business problems
  • Implementing Real-World MapReduce Jobs

    Applying the Hadoop MapReduce paradigm

    • Configuring the development environment
    • Exploring the Hadoop distribution
    • Creating the components of MapReduce jobs
    • Introducing the Hadoop daemons
    • Analysing the stages of MapReduce processing: splitting, mapping, shuffling and reducing
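
As a rough illustration of the four stages named above, the following plain-Java sketch runs a word count in a single process. It deliberately uses no Hadoop classes; the class name `MapReduceStagesSketch` and its method names are hypothetical, chosen only to mirror the stage names, and a real job would express the map and reduce steps through Hadoop's own Mapper and Reducer types.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Single-process illustration of splitting, mapping, shuffling and reducing.
// Not Hadoop code: every name here is hypothetical.
class MapReduceStagesSketch {

    // Splitting: treat each non-empty input line as one record.
    static List<String> split(String input) {
        List<String> records = new ArrayList<>();
        for (String line : input.split("\n")) {
            if (!line.trim().isEmpty()) {
                records.add(line.trim());
            }
        }
        return records;
    }

    // Mapping: emit a (word, 1) pair for every word in a record.
    static List<Map.Entry<String, Integer>> map(String record) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : record.toLowerCase().split("\\s+")) {
            pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
        }
        return pairs;
    }

    // Shuffling: group all emitted values by key, with keys in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reducing: collapse the values collected for one key into a single total.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Chains the four stages together for a whole input.
    static Map<String, Integer> wordCount(String input) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String record : split(input)) {
            emitted.addAll(map(record));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(emitted).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {big=2, clusters=1, data=1}
        System.out.println(wordCount("big data\nbig clusters"));
    }
}
```

On a cluster the same stages run across many machines, with the shuffle moving data over the network between map and reduce tasks; the sketch only shows how the data changes shape from one stage to the next.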

    Building complex MapReduce jobs

    • Selecting and employing multiple mappers and reducers
    • Leveraging built-in mappers, reducers and partitioners
    • Analysing time series data with secondary sort
    • Streaming tasks through various programming languages
  • Customising MapReduce

    Solving common data manipulation problems

    • Executing algorithms: parallel sorts, joins and searches
    • Analysing log files, social media data and e-mails

    Implementing partitioners and comparators

    • Identifying network-bound, CPU-bound and disk I/O-bound parallel algorithms
    • Dividing the workload efficiently using partitioners
    • Controlling grouping and sort order with comparators
    • Collecting metrics with counters
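
To make the roles of partitioners and comparators more concrete, here is a minimal plain-Java sketch of the secondary-sort idea for time series data. It does not use the Hadoop API: `PartitionAndSortSketch`, `ReadingKey`, the sensor identifiers and the timestamps are all hypothetical. In a real job these roles would be filled by a Hadoop Partitioner subclass plus the sort and grouping comparators registered on the job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java illustration of partitioning, sorting and grouping on a composite key.
// Not Hadoop code: every name and value here is hypothetical.
class PartitionAndSortSketch {

    // Composite key: the natural key (sensor id) plus the field to sort by (timestamp).
    static final class ReadingKey {
        final String sensorId;
        final long timestamp;

        ReadingKey(String sensorId, long timestamp) {
            this.sensorId = sensorId;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return sensorId + "@" + timestamp;
        }
    }

    // Partitioner role: hash only the sensor id, so all readings for one sensor
    // land on the same reducer. The mask clears the sign bit so the index is never negative.
    static int partition(ReadingKey key, int numReducers) {
        return (key.sensorId.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Sort comparator role: order by sensor id, then by ascending timestamp.
    static final Comparator<ReadingKey> SORT_ORDER =
            Comparator.comparing((ReadingKey k) -> k.sensorId)
                      .thenComparingLong(k -> k.timestamp);

    // Grouping comparator role: keys with the same sensor id count as one group.
    static final Comparator<ReadingKey> GROUPING =
            Comparator.comparing((ReadingKey k) -> k.sensorId);

    public static void main(String[] args) {
        List<ReadingKey> keys = new ArrayList<>(Arrays.asList(
                new ReadingKey("s2", 30L),
                new ReadingKey("s1", 20L),
                new ReadingKey("s1", 10L)));
        keys.sort(SORT_ORDER);

        // Prints [s1@10, s1@20, s2@30]
        System.out.println(keys);
        // Prints true: both s1 readings map to the same partition
        System.out.println(partition(keys.get(0), 4) == partition(keys.get(1), 4));
        // Prints 0: both s1 readings belong to the same reduce group
        System.out.println(GROUPING.compare(keys.get(0), keys.get(1)));
    }
}
```

The design point is that the three pieces look at different parts of the key: partitioning and grouping use only the sensor id, while sorting also uses the timestamp, so a reducer would see each sensor's readings together and already in time order.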
  • Persisting Big Data with Distributed Data Stores

    Making the case for distributed data

    • Achieving high performance data throughput
    • Recovering from media failure through redundancy

    Interfacing with Hadoop Distributed File System (HDFS)

    • Breaking down the structure and organisation of HDFS
    • Loading raw data and retrieving results
    • Reading and writing data programmatically
    • Manipulating Hadoop SequenceFile types
    • Sharing reference data with DistributedCache

    Structuring data with HBase

    • Migrating from structured to unstructured storage
    • Applying NoSQL concepts with schema on read
    • Connecting to HBase from MapReduce jobs
    • Comparing HBase to other types of NoSQL data stores
  • Simplifying Data Analysis with Query Languages

    Unleashing the power of SQL with Hive

    • Structuring databases, tables, views and partitions
    • Integrating MapReduce jobs with Hive queries
    • Querying with HiveQL
    • Accessing Hive servers through JDBC
    • Extending HiveQL with User-Defined Functions (UDF)

    Executing workflows with Pig

    • Developing Pig Latin scripts to consolidate workflows
    • Integrating Pig queries with Java
    • Interacting with data through the grunt console
    • Extending Pig with User-Defined Functions (UDF)
  • Managing and Deploying Big Data Solutions

    Testing and debugging Hadoop code

    • Logging significant events for auditing and debugging
    • Debugging in local mode
    • Validating requirements with MRUnit

    Deploying, monitoring and tuning performance

    • Deploying to a production cluster
    • Optimising performance with administrative tools
    • Monitoring job execution through web user interfaces


Hadoop Java Programming Training FAQs

  • Is Java required to learn Hadoop?

    This course requires Java experience at the level of Course 471, Java Programming Introduction, or at least six months of Java programming experience.

  • Can I learn Hadoop Java Programming online?

    Yes! We know your busy work schedule may prevent you from getting to one of our classrooms, which is why we offer convenient online training to meet your needs wherever you are.

Questions about which training is right for you?

Call 08-506 668 00

100% Satisfaction Guaranteed

Your Training Comes with a 100% Satisfaction Guarantee!*

  • If you are not 100% satisfied, you pay no tuition fee!
  • No advance payment required for most products.
  • Tuition fee can be paid later by invoice - OR - at the time of checkout by credit card.

*Partner-delivered courses may have different terms that apply. Ask for details.
