-
This course provides practice-oriented knowledge for projects that process large volumes of data (big data) and place special demands on analysis functions. In addition to an introduction to big data and the data analytics lifecycle, it covers basic and advanced analysis methods and introduces the technologies and tools used to analyze big data, such as MapReduce and Hadoop. Extensive lab exercises give the material practical relevance. The course follows a technology-neutral approach and includes a final lab exercise in which participants apply the concepts learned in the course in the context of the data analytics lifecycle. It prepares participants for certification as an EMC Proven™ Professional Data Scientist Associate (EMCDSA) and teaches the basics of data science, which can be deepened through further training and practical experience.
-
Course Contents
-
- Introduction to Big Data Analytics
- Data Analytics Lifecycle
- Basic Data Analytic Methods Using R
- Advanced Analytics - Theory and Methods
- Advanced Analytics - Technologies and Tools
- The Endgame, or Putting it All Together
-
Target Group
-
- Team leaders whose responsibilities include business information services and data analytics, as well as anyone professionally involved with big data.
- Analysts who deal with business processes and data and want to expand their analysis skills for big data
- Data and database experts who want to expand their analysis skills for big data
- Career changers who want to familiarize themselves with data science and the field of big data
- People who want to obtain certification as an EMC Proven Professional Data Scientist Associate (EMCDSA).
-
Knowledge Prerequisites
-
- Comprehensive foundational knowledge of quantitative analysis and statistics as taught in a Statistics 101 level course.
- Experience with a scripting language, e.g. Java, Perl, or Python (or R).
- Experience with SQL (some examples used in the course use PSQL)
Introduction to Big Data Analytics
- Big Data Overview
- State of the Practice in Analytics
- The Data Scientist
- Big Data Analytics in Industry Verticals
Data Analytics Lifecycle
- Discovery
- Data Preparation
- Model Planning
- Model Building
- Communicating Results
- Operationalizing
Review of Basic Data Analytic Methods Using R
- Using R to Look at Data - Introduction to R
- Analyzing and Exploring the Data
- Statistics for Model Building and Evaluation
Advanced Analytics - Theory and Methods
- K-Means Clustering
- Association Rules
- Linear Regression
- Logistic Regression
- Naive Bayesian Classifier
- Decision Trees
- Time Series Analysis
- Text Analysis
Advanced Analytics - Technologies and Tools
- Analytics for Unstructured Data - MapReduce and Hadoop
- The Hadoop Ecosystem
- In-database Analytics - SQL Essentials
- Advanced SQL and MADlib for In-database Analytics
The Endgame, or Putting it All Together
- Operationalizing an Analytics Project
- Creating the Final Deliverables
- Data Visualization Techniques
- Final Lab Exercise on Big Data Analytics
-
Classroom training
- Do you prefer the classic training method: a course in one of our Training Centers, with a competent trainer and direct exchange among all course participants? Then book one of our classroom training dates!
-
Online training
- Do you wish to attend a course online? We offer online dates for this course topic. To attend these seminars, you need a PC with Internet access (minimum data rate 1 Mbps), a headset if you connect via VoIP, and optionally a camera. For further information and technical recommendations, please refer to.
-
Tailor-made courses
-
Do you need a special course for your team? In addition to our standard offering, we can help you create customized courses that precisely meet your individual requirements. We will be glad to advise you and prepare an individual offer.

