Statistics for Data Science
Understand the first steps toward machine learning.
At a Glance
- Enrollment: Open Enrollment
- Duration: Eight weeks
- Format: Online
- Total CEUs: 6.8 CEUs
- Investment: $2,500
Upcoming Dates
February Start
Gain an advantage in today’s competitive job market by learning to code and to understand data science.
The University of Chicago’s eight-week Statistics for Data Science course will prepare you to solve complex challenges with data and drive important decision-making processes. You will learn to code at an introductory level and take the first steps to becoming a data scientist.
Designed For
This course is for aspiring data scientists who would like to learn to code, people with computer science backgrounds, and anyone looking to begin a career in data science.
Statistics for Data Science Curriculum
Statistics for Data Science is a highly practical course that will provide you with the foundational tools to solve data science problems and prepare you to take the next steps in the world of machine learning.
After completing the course, you will be able to:
- Build a classification model and interpret results
- Understand the concept of hypothesis testing
- Explain the intricacies of logistic regression, evaluate its outputs, and describe how a link function works
- Handle a data set to produce a specified set of results
- Perform a Principal Components Analysis (PCA) to provide meaningful insights on the original data set
- Perform multiple pairwise comparisons and analyze models with multiple categorical predictors
- Present a start-to-finish analysis with meaningful insights on a data set using exploratory analysis, dimension reduction, linear models, and classification models
Learning Objectives
- Solve data science problems and prepare to take the next steps in the world of machine learning
- Understand RStudio and its application
- Gain confidence handling and manipulating data
- Interpret data and be able to communicate it effectively
Course format
- Eight weeks in length
- Weekly interactive learning modules and assignments are self-paced but time-sensitive and should be completed by the set deadlines
- Synchronous sessions and live question-and-answer sessions
- Mentors will provide continuous support and encourage a dynamic and positive learning environment
Weekly course schedule
Familiarize yourself with the basics of modeling, learn how to define an objective function for evaluating model performance, and understand how to carry out supervised and unsupervised analysis. You will also be introduced to the bias/variance tradeoff and to different model types.
Understand the concepts of random variables and hypothesis testing. Gain exposure to different data distributions and methods for hypothesis testing.
Explore distance-based and density-based clustering methods for exploratory analysis (k-means, hierarchical clustering, and DBSCAN), and learn to select the appropriate clustering method to expand your knowledge of data sets.
Discover dimension reduction and learn how to apply principal components analysis (PCA). Cover the fundamentals of PCA so that you can interpret its results in terms of the original data and create meaningful features from exploratory analysis that will help you perform supervised modeling.
Examine the method of moments and learn how to use it to estimate linear model parameters, understand the assumptions and restrictions of a linear model, and evaluate the estimates and the suitability of a linear model.
Learn to perform variable transformations and include interaction terms to improve model quality, discover and address issues with multicollinearity, and incorporate features from exploratory analysis when building a linear model.
Delve into the concept of a classification model while learning about the intricacies of logistic regression, its outputs, and link functions. Understand how binary logistic regression extends to multinomial logistic regression.
Examine the process of modeling categorical independent variables; evaluate the outputs of ANOVA, both from a traditional linear model and for determining whether group means are significantly different; and perform multiple pairwise comparisons and analyze models with multiple categorical predictors.
Career Outlook
Data is a commodity, and statisticians who know how to code and who understand data science are in high demand across industries. Statistics, the art of finding structure in and gleaning deeper insights from data, is among the most vital means of analyzing and quantifying uncertainty, and statistical methods are crucial to data science. The overall employment market for mathematicians and statisticians is expected to grow by 33% over the next decade.
The average annual pay for a statistician in the US
The ranking of statistician in US News and World Report’s 2021 Best Business Jobs
The expected CAGR of the global data science platform size from 2020 to 2027
Potential job titles for Data Science-Applied Statistics
- Analytics Consultant
- Data Insight Analyst
- Data Scientist
- Machine Learning Specialist
- Statistician