Introduction to Big Data Training

Course 1250

  • Duration: 3 days
  • Labs: Yes
  • Language: English
  • Level: Foundation

This hands-on Introduction to Big Data training helps you act on data for real business gain. The focus is not on what a tool can do, but on what you can do with the output from the tool. By integrating Big Data training with your data science training, you gain the skills you need to store, manage, process, and analyze massive amounts of structured and unstructured data to extract meaningful insights.

Attend this Introduction to Big Data course in one of three formats: live instructor-led, on-demand, or a blended on-demand/instructor-led version.

Introduction to Big Data Training Delivery Methods

  • In-Person
  • Online
  • On-Demand

Introduction to Big Data Training Course Benefits

  • Store, manage, and analyze unstructured data
  • Select the correct big data stores for disparate data sets
  • Process large data sets using Hadoop and Spark to extract value
  • Query large data sets in near real time with Pig and Hive
  • Plan and implement a big data strategy for your organization

Leverage continued support with after-course one-on-one instructor coaching and computing sandbox

Introduction to Big Data Course Outline

Defining Big Data

  • The four dimensions of Big Data: volume, velocity, variety, veracity
  • Introducing the Storage, MapReduce and Query Stack

Delivering business benefit from Big Data

  • Establishing the business importance of Big Data
  • Addressing the challenge of extracting useful data
  • Integrating Big Data with traditional data

Analyzing your data characteristics

  • Selecting data sources for analysis
  • Eliminating redundant data
  • Establishing the role of NoSQL

Overview of Big Data stores

  • Data models: key-value, graph, document, column-family
  • Hadoop Distributed File System
  • HBase
  • Hive
  • Cassandra
  • Amazon S3
  • Bigtable
  • DynamoDB
  • MongoDB
  • Redis
  • Riak
  • Neo4j
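
As a rough illustration of the four data models named at the top of this list, the sketch below expresses one hypothetical customer record in each model using plain Python structures. It is not taken from the course materials; the record, the field names, and the `city_of` helper are all assumptions made for the example, and real stores add indexing, distribution, and query languages on top.

```python
# Illustrative only: one hypothetical customer record expressed in each
# of the four data models, using plain Python structures.

# Key-value: an opaque value looked up by a single key.
key_value = {"customer:42": '{"name": "Ada", "city": "London"}'}

# Document: a nested, self-describing record that can be queried by field.
document = {"_id": 42, "name": "Ada", "address": {"city": "London"}}

# Column-family: row key -> column family -> column -> value.
column_family = {
    "42": {
        "profile": {"name": "Ada"},
        "location": {"city": "London"},
    }
}

# Graph: nodes plus labelled edges between them.
nodes = {42: {"name": "Ada"}, "london": {"kind": "city"}}
edges = [(42, "LIVES_IN", "london")]


def city_of(customer_id):
    """Follow the LIVES_IN edge from a customer node to a city node."""
    for source, label, target in edges:
        if source == customer_id and label == "LIVES_IN":
            return target
    return None
```

The same fact (where the customer lives) is reached by a key lookup, a field path, a column path, or an edge traversal, which is one way to think about matching a store to the shape of your data.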

Selecting Big Data stores

  • Choosing the correct data stores based on your data characteristics
  • Moving code to data
  • Messaging with Kafka
  • Implementing polyglot data store solutions
  • Aligning business goals to the appropriate data store

Integrating disparate data stores

  • Mapping data to the programming framework
  • Connecting and extracting data from storage
  • Transforming data for processing
  • Subdividing data in preparation for Hadoop MapReduce

Employing Hadoop MapReduce

  • Creating the components of Hadoop MapReduce jobs
  • Executing Hadoop MapReduce jobs
  • Monitoring the progress of job flows
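
For readers new to the model, the components of a MapReduce job can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce phases for a word count, not Hadoop code and not taken from the course materials; the function names are assumptions chosen for the example.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: collapse the list of counts for one word into a total."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in order over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling `word_count(["big data", "Big value"])` counts "big" twice and the other words once. In Hadoop the mappers and reducers run in parallel across the cluster, and the framework performs the shuffle between them.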

The building blocks of Hadoop MapReduce

  • Distinguishing Hadoop daemons
  • Investigating the Hadoop Distributed File System
  • Selecting appropriate execution modes: local, pseudo-distributed and fully distributed
  • Accelerating processing with Spark

Handling streaming data

  • Comparing real-time processing models
  • Leveraging Storm to extract live events
  • Leveraging Spark Streaming to extract live events
  • Combining streaming and batch processing in a Lambda architecture
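
The idea behind a Lambda architecture can be sketched in plain Python: a batch layer periodically recomputes a view over the full data set, a speed layer keeps an incremental view of events that arrived since the last batch run, and queries merge the two. The example below is a minimal single-process sketch under those assumptions, not Storm or Spark Streaming code; the page-view events and all names are hypothetical.

```python
from collections import Counter


def batch_view(master_dataset):
    """Batch layer: recompute page-view counts from the full data set."""
    return Counter(event["page"] for event in master_dataset)


class SpeedLayer:
    """Speed layer: count events that arrived after the last batch run."""

    def __init__(self):
        self.realtime_view = Counter()

    def on_event(self, event):
        # Update the real-time view incrementally, one event at a time.
        self.realtime_view[event["page"]] += 1


def query(page, batch, speed):
    """Serving layer: merge the batch view with the real-time view."""
    return batch[page] + speed.realtime_view[page]
```

A usage example: build `batch = batch_view(history)`, feed new events to `speed.on_event(...)`, and call `query("home", batch, speed)` to get a total that reflects both. When the next batch run completes, the speed layer's view for the covered period would be discarded.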

Abstracting Hadoop MapReduce jobs with Pig

  • Communicating with Hadoop in Pig Latin
  • Executing commands using the Grunt Shell
  • Streamlining high-level processing

Performing ad hoc Big Data querying with Hive

  • Persisting metadata in the Hive Metastore
  • Performing queries with HiveQL
  • Investigating Hive file formats

Creating business value from extracted data

  • Visualizing processed results with reporting tools
  • Querying in real time with Impala

Defining a Big Data strategy for your organization

  • Establishing your Big Data needs
  • Meeting business goals with timely data
  • Evaluating commercial Big Data tools
  • Managing organizational expectations

Enabling analytic innovation

  • Focusing on business importance
  • Framing the problem
  • Selecting the correct tools
  • Achieving timely results
  • Selecting suitable vendors and hosting options
  • Balancing costs against business value
  • Keeping ahead of the curve

Introduction to Big Data Training Bundle

This product offers access to:

  • 2 on-demand courses and 5 eBooks that have been mapped directly to the objectives of the 3-day introduction course.
  • At any time during your annual access to this offering, you may attend one of our 1-day course events focused specifically on Big Data Technologies, Trends & Insights Training.

On-Demand Courses

  • Mastering Big Data Analytics with PySpark
  • Master Big Data Ingestion and Analytics with Flume, Sqoop, Hive and Spark

eBooks

  • Artificial Intelligence for Big Data
  • Big Data Architect's Handbook
  • Modern Big Data Processing with Hadoop
  • Big Data Processing with Apache Spark
  • Practical Big Data Analytics

Unlimited Access Introduction to Big Data Premium Blended Training

The Premium Blended Training offers access to:

On-Demand Courses

  • Mastering Big Data Analytics with PySpark
  • Master Big Data Ingestion and Analytics with Flume, Sqoop, Hive and Spark

eBooks

  • Artificial Intelligence for Big Data
  • Big Data Architect's Handbook
  • Modern Big Data Processing with Hadoop
  • Big Data Processing with Apache Spark
  • Practical Big Data Analytics

Introduction to Big Data Training FAQs

Big data is a term used to define data sets that have the potential to rapidly grow so large that they become unmanageable. The Big Data movement includes new tools and ways of storing information that allow efficient processing and analysis for informed business decision-making.

Understanding how to work with big data can help you glean useful insights from large amounts of data, which can help you and your organization make better business decisions.

Big data refers to data sets whose huge and growing volume can quickly become unwieldy. Machine learning is a subfield of Artificial Intelligence (AI) that can help you extract value from big data to solve problems.

Working knowledge of the Microsoft Windows platform and basic database concepts.

Typical job roles include: Project and IT Managers, Database Administrators & Data Architects, Developers & SQL Developers, Data Scientists & Business Intelligence.

Data science is about obtaining insights from data using methods from several disciplines, such as statistics, linear algebra, differential equations, and machine learning. Many of the activities in data science involve large amounts of data, especially for training models. Some use unstructured data that cannot be stored efficiently in a relational fashion. Often data needs to be processed and conclusions reached extremely quickly. These requirements for large volume, variety and velocity create an overlap between big data (how to store, secure, manage and access the data) and data science (how best to use the data to draw conclusions and gain insights from it).

Schedules are busy, but big data training online makes it easy to level up your career. If you need Big Data online training, we’ve got you covered. Our AnyWare course delivery option gives you the advantages of a live classroom right from the comfort of your computer screen, no matter where you are.

No. While the content selected does map to the objectives of the instructor-led course, it does not include a recorded version of the instructor-led class. The objectives have been re-imagined to be presented in digital, self-guided formats.

An outline of the content you will receive can be seen above. You will also get access to any new on-demand content that becomes available during your annual enrollment period.

Yes! Each book and video begins with a step by step guide for you to set up a coding environment on your personal computer. The course content is full of examples and practical advice, followed up by the chance to embed your learning through real world tasks. All example code is available to download, copy and use - giving you the chance to work and practice as you read and watch.

Once payment is received, you will receive an email from Learning Tree with all the links and information you need to get started.

Once you are enrolled in the program, we will email you specific dates and details. You can also easily sign up through your My Learning Tree dashboard!

Once payment is received, you will receive details for your Unlimited Access Training Bundle via email. At that time, you may call or email our customer service team for assistance in enrolling in the event date of your choice. You can also easily sign up through your My Learning Tree dashboard!