Course Outline

Pre-Requisites

No specific technical experience or prerequisites are needed.

Lessons

This course is a survey of big data: the landscape, the technology behind it, business drivers, and strategic possibilities. “Big data” is a hot buzzword, but most organizations are struggling to put it to practical use. Without assuming any prior knowledge of Apache Hadoop or big data management, this course teaches professionals in a wide range of roles how to tap and manage the potential benefits of big data, including:

Discovering customer insights buried in your existing data
Uncovering product opportunities from data insights
Pinpointing decision points and criteria
Scaling your existing workflows and operations
Learning to ask questions that drive tangible business value from big data tools

Navigate the technology stacks and tools used to work with big data
Establish a common vocabulary on your teams for applying big data practices
Get an overview of how big data technologies work: Apache Hadoop, Spark, Pig, Hive, Sqoop, Oozie, and Flume
Design both functional and non-functional requirements for working with big data
Understand common business cases for big data
Differentiate between hype and what’s truly possible
Look at examples of real-world big data use cases
Select initiatives and projects that have high potential to benefit from big data applications
Understand what types of staffing, technical skills, and training are required for projects that incorporate or focus on big data

Course Outline

Part 1: Introduction to Big Data

Academic
Early web
Web-scale
- 1994 – 2012
- 2016
- 2020

Part 2: Sources (Examples)

Internet
Transport systems
Medical, healthcare
Insurance
Military and others

Part 3: Hadoop – the free platform for working with big data

History
Yahoo
Platform fragmentation
What usage looks like in the enterprise

Part 4: The concepts

Load data how you find it
Process it when you can
Project it into various schemas on the fly
Push it back to where you need it
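
As a rough illustration of the four ideas above, the following Python sketch keeps raw records exactly as they arrived and applies a schema only when the data is read. The sample records, the field names, and the `project` helper are all hypothetical and chosen for illustration; they are not taken from the course material or from any Hadoop API.

```python
import json

# Raw records are stored exactly as found (hypothetical sample data).
RAW_LINES = [
    '{"user": "ann", "page": "/home", "ms": 120}',
    '{"user": "bob", "page": "/cart", "ms": 340, "referrer": "ad"}',
    '{"user": "ann", "page": "/cart"}',
]


def project(raw_lines, fields, defaults=None):
    """Apply a schema at read time: keep only `fields`, filling gaps from `defaults`."""
    defaults = defaults or {}
    for line in raw_lines:
        record = json.loads(line)
        yield {name: record.get(name, defaults.get(name)) for name in fields}


# Two different schemas projected from the same untouched raw data.
page_views = list(project(RAW_LINES, ["user", "page"]))
timings = list(project(RAW_LINES, ["page", "ms"], defaults={"ms": 0}))

if __name__ == "__main__":
    print(page_views)
    print(timings)
```

The raw lines are never rewritten, so a new question later only needs a new projection rather than a reload of the data.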

Part 5: The basics

What it’s good for
What it can’t do / disadvantages
Most common use cases for big data

Part 6: Introduction to HDFS

Robustness
Data Replication
Gotchas

Part 7: MapReduce – the core big data function

Map explained
Sort and shuffle explained
Reduce explained
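
The three phases listed above can be sketched in a few lines of plain Python. This is a single-process toy that counts words, assuming an in-memory list of lines as input; it does not use Hadoop, and the function names are illustrative rather than part of any real API.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_and_sort(pairs):
    """Sort and shuffle: group values by key, then order the keys."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped}


def word_count(lines):
    """Run the three phases in order on a small in-memory input."""
    return reduce_phase(shuffle_and_sort(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["big data is big", "data is everywhere"]))
```

On a real cluster the map and reduce steps run in parallel on many machines, and the shuffle moves data between them over the network; the sketch only shows the order of the steps and the shape of the data at each one.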

Demonstration: Hadoop, HDFS, and MapReduce - Let’s try it!

Part 8: YARN

How it fits
How it works
Resource Manager
Application Master

Part 9: PIG

What it is
How it works
Compatibilities
Advantages
Disadvantages

Demonstration: YARN and PIG - Let’s try it!

Part 10: Processing Data

The Piggy Bank
Loading and Illustrating the data
Writing a Query
Storing the Result

Part 11: HIVE

Data warehousing
What it is, what it’s not
Language compatibilities
Advantages

Demonstration: HIVE - Let’s try it!

Example demo walkthrough: Contextual advertising

Part 12: OOZIE

What it is
Complex workflow environments
Reducing time-to-market
Frequency execution
How it works with other big data tools

Example demo walkthrough: How to run a job

Part 13: FLUME – stream, collect, store and analyze high-volume log data

How it works: Event, source, sink, channel, agent and client
How it works illustrated
How it works demonstrated
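
To make the vocabulary above concrete, here is a minimal Python model of how an event might travel from a client through a source, a channel, and a sink inside one agent. It is a conceptual sketch only: the `Channel` and `Agent` classes and their methods are hypothetical and are not Flume's actual API or configuration format.

```python
from collections import deque


class Channel:
    """Buffers events between a source and a sink."""

    def __init__(self):
        self._queue = deque()

    def put(self, event):
        self._queue.append(event)

    def take(self):
        # Return the oldest buffered event, or None when the buffer is empty.
        return self._queue.popleft() if self._queue else None


class Agent:
    """Wires one source, one channel, and one sink together."""

    def __init__(self, sink):
        self.channel = Channel()
        self.sink = sink  # any callable that accepts an event

    def receive(self, body, headers=None):
        # Source side: wrap the client's payload in an event and buffer it.
        self.channel.put({"headers": headers or {}, "body": body})

    def drain(self):
        # Sink side: deliver buffered events in arrival order.
        delivered = 0
        while (event := self.channel.take()) is not None:
            self.sink(event)
            delivered += 1
        return delivered


if __name__ == "__main__":
    stored = []
    agent = Agent(sink=stored.append)
    agent.receive("GET /home 200", {"host": "web01"})
    agent.receive("GET /cart 500", {"host": "web02"})
    print(agent.drain(), stored)
```

The channel is what decouples the two sides: the source can keep accepting log lines even when the sink is temporarily slower.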

Part 14: SPARK

Move over, 2012 big data tools: Apache Spark is the new power tool
The new open source cluster framework
When Spark can perform up to 100 times faster
Performance comparison of Spark and Hadoop
What else can it do?

Part 15: HBASE

What it is
Common use cases

Part 16: Using External Tools

Who should attend

This class is for anyone involved in project, product, or IT work who is actively consuming or considering big data services. No specific technical experience or prerequisites are needed.
•   Software Engineers and Team Leads
•   Project Managers
•   Business Analysts
•   DBAs and Data Engineering teams
•   Business Customers
•   System Analysts

Cancellation Policy

If you need to change your public course registration (cancellation, transfer, or substitution), ASPE must receive written notice by email at customerservice@aspeinc.com or by fax at 919-816-1710. If a cancellation or transfer request is made fewer than 15 business days before the class start date, payment will still be due, no refunds will be issued, and you will be charged a $200 change fee. Your paid tuition will be available for one year as a credit towards another course of equal value; only one re-enrollment is allowed. Failure to attend the course without written notification will result in forfeiture of the full course price. Student substitutions may be made free of charge at any time before the start of class. If ASPE is forced to cancel a course for any reason, its liability is limited to the registration fee only.

Training Location

Virtual

About ASPE Training Inc.

Training Provider Rating

This vendor has an overall average rating of 4.53 out of 5 based on 5 reviews.

Lea W.

No comment

Victor V.

No comment

Nathan L.

No comment

Samad S.

No comment

Malinga D.

The course was professionally conducted, and the opportunity to practice the concepts with hands-on exercises was very useful.

Big Data Fundamentals
