Data Science and Big Data Analytics

This course provides practical, foundation-level training that enables immediate and effective participation in big data and other analytics projects. It introduces big data and the Data Analytics Lifecycle as a way to address business challenges that leverage big data. The course provides grounding in basic and advanced analytic methods and an introduction to big data analytics technology and tools, including MapReduce and Hadoop. Labs give students the opportunity to see how a practicing data scientist applies these methods and tools to real-world business challenges. The course takes an “Open”, or technology-neutral, approach and includes a final lab that addresses a big data analytics challenge by applying the concepts taught in the course in the context of the Data Analytics Lifecycle. The course prepares the student for the EMC Proven™ Professional Data Scientist Associate (EMCDSA) certification exam.

Course Contents

• Introduction and Course Agenda
• Introduction to Big Data Analytics
• Data Analytics Lifecycle
• Review of Basic Data Analytic Methods Using R
• Advanced Analytics – Theory and Methods
• Advanced Analytics – Technologies and Tools
• The Endgame, or Putting it All Together

In this course, each student will receive the original EMC course documentation.

Target Group

• Managers of teams of business intelligence, analytics, and big data professionals
• Current business and data analysts looking to add big data analytics to their skills
• Data and database professionals looking to exploit their analytic skills in a big data environment
• Recent college graduates and graduate students with academic experience in a related discipline looking to move into the world of Data Science and big data
• Individuals seeking to take advantage of the EMC Proven™ Professional Data Scientist Associate (EMCDSA) certification

Knowledge Prerequisites

• A strong quantitative background with a solid understanding of basic statistics.
• Experience with a programming or scripting language, such as Java, Perl, Python, or R. Many of the lab examples taught in the course use R (run in RStudio), an open source statistical tool and programming language.
• Experience with SQL