The best things in life are free. And meant to be shared. This is especially true when it comes to data science learning resources on the web. Really, you don’t need to invest a single dime to learn to be a data scientist. You just need to spend countless hours scouring the internet for these free data science resources.

Image source: www.datasciencecentral.com

Mercifully, we have combed the web for you and put together the mother lode of free data science resources. They are categorized into the following:

- Algorithms
- Artificial Intelligence
- Bayesian Networks
- Big Data Analytics
- Business
- Computer Vision
- Data Journalism
- Data Mining
- Data Scientists
- Data Structures
- Data Visualization
- General Data Science
- Hadoop, MapReduce
- Information Retrieval
- Linear Regression
- Linguistic
- Machine Learning, Predictive Analytics
- Math
- Metadata
- Natural Language Processing
- Network Science
- Open Data
- Other Sites with Free Data Science Resources
- Probability
- Publishers
- Python
- R
- Statistics
- Text Mining

We hope you find the list helpful. Please let us know of any free data science sites, books and videos we failed to spot. Please also report any dead links so we may keep this list as fresh and updated as possible.

Algorithms

- Data Structures and Algorithms with Object-Oriented Design Patterns in Python, Fundamentals of data structures and algorithms

Artificial Intelligence

- The Intelligent Web: Search, Smart Algorithms and Big Data, An attempt at explaining a few important and exciting advances in computer science and artificial intelligence (AI) in a manner accessible to all

Bayesian Networks

- Advances in Scalable Bayesian Computation, 18 workshop videos during the 5-day workshop
- Gaussian Processes for Machine Learning, One of the most important Bayesian machine learning approaches
- Think Bayes: Bayesian Statistics Made Simple, Bayesian statistics with Python and discrete approximations
- Probabilistic Programming & Bayesian Methods for Hackers, Intro to Bayesian methods and probabilistic programming
- Bayesian Reasoning and Machine Learning, Unified treatment via graphical models, a marriage between graph and probability theory, facilitating the transference of Machine Learning concepts

Big Data Analytics

- Data Analysis by SlideRule, Intensive introduction to data analysis. Consists of free online lectures, homework assignments, quizzes and projects, and will take around 350-400 hours
- Run Cloudera in the Public Cloud: A Live Demo
- Analytics University, Videos on Analytical techniques
- Implementing Big Data Analysis, Learn how to use Windows Azure HDInsight to process big data and to generate results for analysis and reporting with Microsoft data tools
- Working with big data on Azure, Explores the impact of big data on businesses, and shows how to deploy Hadoop clusters and run MapReduce on Azure to turn this data into insights
- Frontiers in Massive Data Analysis, The promise and perils of massive data
- The Free Hive Book, A free electronic book about Apache Hive. The book is geared towards SQL-knowledgeable business users with some advanced tips for devops
- Big Data Sourcebook: Your Guide to the Data Revolution
- No More Secrets with Big Data Analytics
- Harnessing The Power Of Big Data Through Education And Data-Driven Decision Making
- Planning for Big Data, A CIO’s Handbook to the Changing Data Landscape
- The Fourth Paradigm: Data-Intensive Scientific Discovery, This book is about a new, fourth paradigm for science based on data-intensive computing
- Big Data Now: Current Perspective from O’Reilly Media, Top big data posts from late fall 2012 through late fall 2013
- Big Data Analytics with Twitter, Taught in close collaboration with Twitter, focuses on the tools and algorithms for data analysis as applied to Twitter’s data
- Modeling with Data: Tools and Techniques for Scientific Computing, Executing computationally intensive analyses on very large data sets
- Business Analytics in Retail for Dummies, Using business analytics to discover insights that, when acted on, drive revenue growth and improve customer relations
- Big Data Analytics Infrastructure for Dummies, The emphasis is on hardware infrastructure: processing, storage, systems software, and internal networks
- Real-Time Big Data Analytics: Emerging Architecture, Using real-time big data analytics (RTBDA) as a ticket to improved sales, higher profits and lower marketing costs
- Getting Started with Microsoft Big Data, An overview of Microsoft Big Data tools as part of the Windows Azure HDInsight and Storage services
- Advanced Data Analysis from an Elementary Point of View, Modern methods of data analysis and the considerations which go into choosing the right method for the job at hand
- Time Series Databases: New Ways to Store and Access Data, A new world of time series data calls for new approaches and new tools to store and access it
- School of Data, Teaching people how to gain powerful insights and create compelling stories using data
- Field Guide to Hadoop, An introduction to Hadoop, its ecosystem and aligned technologies
- The Promise and Peril of Big Data, New challenges and questions facing big data: from ethics to scientific methodologies to evolution of knowledge
- Harvard’s Statistical Computing and Visualization, A graduate class on analyzing data without losing scientific rigor, and communicating your work
- Big Data Analytics for Dummies, How data analysts can use powerful analytic tools to take advantage of Big Data and create powerful analytic applications
- Disruptive Possibilities: How Big Data Changes Everything, Evolution of commodity supercomputing and the simplicity of big data technology, to the ways conventional clouds differ from Hadoop analytics clouds
- Big Data University by IBM, An online educational site run by Hadoop, Big Data and DB2 users who want to learn, contribute with course materials, or look for job opportunities
- Fundamental Numerical Methods and Data Analysis, A wide range of courses that discuss numerical methods used in science
- SQL School by Mode Analytics, Made up of 2 modules: a) SQL skill building b) Analytics training – learning to think like an analyst

Business

- Business Models for the Data Economy, Using data to advance your business

Computer Vision

- Computer Vision: Models, Learning, and Inference, A principled model-based approach to computer vision that unifies disparate algorithms, approaches, and topics under the guiding principles of probabilistic models, learning, and efficient inference algorithms

Data Journalism

- Data Journalism, For anyone who thinks that they might be interested in becoming a data journalist, or dabbling in data journalism

Data Mining

- Data Mining with STATISTICA Video Series, The Data Mining with STATISTICA series is 35 videos covering concepts, process, and hands-on data mining
- A Programmer’s Guide to Data Mining, Tool for learning basic data mining techniques
- Data Discovery for Dummies, Data discovery, what it does and how it makes dealing with big data simpler
- Data Mining and Analysis: Fundamental Concepts and Algorithms, Fundamental algorithms in data mining and analysis that form the basis for the emerging field of data science
- Google Tech Talks: Data Mining, Data Mining by David Mease that is split into 13 YouTube lectures
- An Introduction to Data Mining, A creative way of learning about data mining
- Data Mining and Knowledge Discovery in Real Life Applications, Four perspectives on theoretical and practical advances and applications of data mining in promising areas such as industry, biology, and social science
- Knowledge-Oriented Applications in Data Mining, In-depth description of novel mining algorithms and many useful applications
- Data Mining with Weka, A collection of machine learning algorithms for data mining tasks that can either be applied directly to a dataset or called from your own Java code
- New Fundamental Technologies in Data Mining, In-depth description of novel mining algorithms and many useful applications
- R and Data Mining: Examples and Case Studies, Various data mining functionalities in R and three case studies of real world applications
- The Elements of Statistical Learning (Data Mining, Inference and Prediction), Bringing together many new ideas in learning and explaining them in a statistical framework
- Mining of Massive Datasets, Data mining of very large amounts of data, that is, data so large it does not fit in main memory

Data Scientists

- How to Build and Lead a Winning Data Team, Understanding the difference between a traditional analytics team and one that’s set up to exploit big data
- Analyzing the Analyzers: An introspective survey of data scientists and their work, Survey of several hundred data science practitioners in mid-2012
- Building Data Science Teams, What data scientists add to an organization, how they fit in, and how to hire and build effective data science teams

Data Structures

- Data Structures and Algorithms with Object-Oriented Design Patterns in Python, Fundamentals of data structures and algorithms

Data Visualization

- How to Build Dashboards that Persuade, Inform, and Engage
- Data Visualization with JavaScript
- Storytelling with Data, Using visualization to share the human impact of numbers
- Data+Design: A simple introduction to preparing and visualizing information, The actual set of steps that need to be accomplished before data can be visualized, from the design of the survey to the collection of the data to ultimately its visualization
- Harvard’s Statistical Computing and Visualization, A graduate class on analyzing data without losing scientific rigor, and communicating your work
- Interactive Data Visualization for the Web, Visualizing data is the fastest way to communicate it to others
- Interactive Data Visualization for the Web, Dynamic, interactive visualizations to empower people to explore the data for themselves
- Visual Analytics Best Practices, Easy-to-read booklet that shows you simple yet extraordinary techniques for making every data visualization useful and beautiful
- 5 Great Ways to Tell Great Stories with Data, The ability to tell terrific stories with data makes you a God-like analyst
- Data Visualization for All, This book helps everyone create interactive charts, maps, and simple web apps to tell stories about your data
- School of Data, Teaching people how to gain powerful insights and create compelling stories using data

General Data Science

- Joseph Sirosh Keynote: “A New Data Science Economy”, In the emerging new Data Science Economy, data scientists are able to monetize their skills – at scale, in the cloud – just like app developers have been able to do for several years now
- Foundations of Data Science
- Why Most Big Data Projects Fail, Learning from Common Mistakes to Transform Big Data into Insights
- The Data Science Handbook (Free Pre-Release), Advice and Insights from 25 Amazing Data Scientists
- Data Driven: Creating a Data Culture, The steps needed for your company to be truly data-driven, including the questions you should ask and the methods you should adopt
- Applied Data Science, Taking people with strong mathematical / statistical knowledge and teaching them software development fundamentals
- Data Science with MIT Open Courseware, Chock-full of data science courses, from data mining to data visualization to social networks
- Introduction to Data Science, A series of data problems of increasing complexity to illustrate the skills and capabilities needed by data scientists
- The Field Guide to Data Science, Appreciating the insights data can provide us today
- Data Science on Coursera, Coursera has a plethora of free online data science classes. Search with keywords like data science, data mining, text mining, data visualization, machine learning, statistics
- What is Data Science?, A book on data science and how it enables the creation of data products
- Data Science on edX, edX does not have as many free online data science courses as Coursera, but it’s still an applaudable collection
- Learn Data Science, A collection of Data Science materials written in the form of IPython Notebooks
- Harvard Data Science Course, A fantastic series of lectures pertaining to Data Science
- The Evolution of Data Products, Discusses what happens when data becomes a product, specifically, a consumer product, and where these data products are headed
- Data Science with CMU’s Open Learning Initiative, As of today, the only data science related MOOCs offered by CMU’s Open Learning Initiative are those related to statistics
- Data Science: An Introduction by Wikibooks, Basic introduction to data science designed for the advanced high school student or average college freshman
- Data Jujitsu: The Art of Turning Data into Product, Using multiple data elements in clever ways to solve iterative problems that, when combined, solve a data problem that might otherwise be intractable

Hadoop, MapReduce

- The Free Hive Book, A free electronic book about Apache Hive. The book is geared towards SQL-knowledgeable business users with some advanced tips for devops
- Data-Intensive Text Processing with MapReduce, Scalable approaches to processing large amounts of text with MapReduce
- Field Guide to Hadoop, An introduction to Hadoop, its ecosystem and aligned technologies
- Hadoop Illuminated, Gentle Introduction of Hadoop and Big Data
- Hadoop Interview Questions HDFS, A list of Hadoop interview questions covering HDFS
- 15-Minute Guide to Install a Hadoop Cluster, Apache Hadoop Cluster 2.0 with single data node configuration on Ubuntu

Information Retrieval

- An Introduction to Information Retrieval, Scientific underpinnings of information retrieval

Linear Regression

- A Practitioner’s Guide to Generalized Linear Models, Written for the practising actuary who would like to understand generalized linear models (GLMs) and use them to analyze insurance data

Linguistic

- Analyzing Linguistic Data: A practical introduction to statistics, Statistical analysis of language, designed for linguists with a non-mathematical background

Machine Learning, Predictive Analytics

- Statistical foundations of machine learning, Statistical foundations of machine learning, understood as the discipline that deals with the automatic design of models from data
- A Deep Learning Tutorial: From Perceptrons to Deep Networks
- Deep Learning for Natural Language Processing, A discussion of NLP-oriented issues in modeling, interpretation, representational power, and optimization
- Unsupervised Feature Learning and Deep Learning #2
- Unsupervised Feature Learning and Deep Learning #1
- Deep Learning Video Series, A series of 18 deep learning videos
- Recent Developments in Deep Neural Networks
- Machine Learning and AI via Brain Simulations, Deep Learning, Self-Taught Learning and Unsupervised Feature Learning
- How To Create A Mind: Ray Kurzweil at TEDxSiliconAlley
- Deep Learning Course, Online video and slides on deep learning taught at New York University
- Neural networks class – Université de Sherbrooke
- Deep Learning: Methods and Applications, An overview of general deep learning methodology and its applications to a variety of signal and information processing tasks
- Deep Learning
- Machine Learning by Andrew Ng, In this course, you’ll learn about some of the most widely used and successful machine learning techniques
- Neural Networks and Deep Learning, Neural networks and deep learning currently provide the best solutions to many problems in image recognition, speech recognition, and natural language processing
- Unsupervised Feature Learning and Deep Learning
- Deep Learning Tutorial
- Advances in Scalable Bayesian Computation, 18 workshop videos during the 5-day workshop
- An Introduction to Unsupervised Learning via Scikit Learn
- Deep Learning: Intelligence from Big Data
- Predictive Modeling Education, A series of educational sessions to help take the mystery out of predictive modeling. These are all short 5-10 minute videos that discuss a specific predictive modeling concept
- Machine Learning, Probability and Graphical Models
- Embracing Uncertainty: The New Machine Intelligence
- Regression Analysis by Analytics University, A set of 16 videos on regression analysis
- Machine Learning CMU 2013 10-701
- Machine Learning CMU Fall 2013 10-701x
- Machine Learning Summer School 2014
- Mu Li Parameter Server Tutorial, Machine Learning Summer School 2014
- Alex Smola Scalable Machine Learning, Machine Learning Summer School 2014
- Probabilistic Programming & Bayesian Methods for Hackers, Intro to Bayesian methods and probabilistic programming
- A First Encounter with Machine Learning, A first read to whet the appetite so to speak, a prelude to the more advanced machine learning topics
- Mega Collection of Machine Learning Video Lectures & Tutorials, A collection of over 1000 free machine learning video lectures and tutorials that you can watch until your eyes pop
- Bayesian Reasoning and Machine Learning, Unified treatment via graphical models, a marriage between graph and probability theory, facilitating the transference of Machine Learning concepts
- The Elements of Statistical Learning (Data Mining, Inference and Prediction), Bringing together many new ideas in learning and explaining them in a statistical framework
- A Course in Machine Learning, Covers most major aspects of modern machine learning
- Machine Learning Video Library, Lots of free machine learning videos from Professor Yaser Abu-Mostafa of Caltech: Bayesian Learning, Data Snooping, Linear Classification and Logistic Regression
- Gaussian Processes for Machine Learning, One of the most important Bayesian machine learning approaches
- Forecasting: principles and practice, Comprehensive introduction to forecasting methods
- Regression Analysis using R, Using regression analysis to solve prediction problems
- Computer Vision: Models, Learning, and Inference, A principled model-based approach to computer vision that unifies disparate algorithms, approaches, and topics under the guiding principles of probabilistic models, learning, and efficient inference algorithms
- Practical Machine Learning: Innovations in Recommendations, Innovations that make machine learning practical for business production settings
- Deep Learning by Samy Bengio, Tom Dean and Andrew Ng, Learn about widely used and successful machine learning techniques, with the opportunity to implement these algorithms
- Think Bayes: Bayesian Statistics Made Simple, Bayesian statistics with Python and discrete approximations
- Predictive Analytics for Dummies, Providing decision makers and analysts with the capability to make accurate predictions about future events based on complex statistical algorithms
- In-depth introduction to machine learning in 15 hours of expert videos, An excellent course in statistical learning (also known as “machine learning”), by Stanford University professors Trevor Hastie and Rob Tibshirani
- Introduction to Machine Learning, Very comprehensive overview of machine learning
- Introduction to Machine Learning, Iain Murray from the University of Edinburgh walks you through face detection, speech recognition, oranges & lemons, stars & galaxies, and linear regression

Math

- Mathematics for Computer Science, Mathematical models and methods to analyze problems that arise in computer science
- Foundation of Data Science, The book focuses on the mathematical foundations rather than dwelling on particular applications that are only briefly described

Metadata

- Introduction to Metadata, Overview of metadata and its current trends, especially that of metadata created by users

Natural Language Processing

- Deep Learning for Natural Language Processing, A discussion of NLP-oriented issues in modeling, interpretation, representational power, and optimization
- Natural Language Processing with Python, A book about Natural Language Processing

Network Science

- Network Science, Concepts of network science and the tools that can be used to study real networks and interpret the obtained results
- Networks, Crowds & Markets: Reasoning about a highly connected world, Links that connect us and the ways in which each of our decisions can have subtle consequences for the outcomes of everyone else
- The Wealth of Networks: How Social Production Transforms Markets and Freedom, How information, knowledge and culture are produced and exchanged in our society
- Introduction to Social Network Methods, Building an understanding of a social network with complete and rigorous descriptions of social relationship patterns

Open Data

- Open Data Handbook, Guides, case studies and resources for government & civil society on the “what, why & how” of open data

Other Sites with Free Data Science Resources

- Free Data Science Books from Intech, Lots of Data Science books you can read online. Search for them using keywords like data science, data mining, machine learning, etc

Probability

- Probabilistic Programming & Bayesian Methods for Hackers, Intro to Bayesian methods and probabilistic programming
- Introduction to Probability, Introductory probability course
- Computer Vision: Models, Learning, and Inference, A principled model-based approach to computer vision that unifies disparate algorithms, approaches, and topics under the guiding principles of probabilistic models, learning, and efficient inference algorithms
- Think Stats: Probability and Statistics for Beginners, Emphasizes the use of statistics and a computational approach to explore large datasets

Publishers

- O’Reilly: Free data science and big data books

Python

- Think Bayes: Bayesian Statistics Made Simple, Bayesian statistics with Python and discrete approximations
- Data Structures and Algorithms with Object-Oriented Design Patterns in Python, Fundamentals of data structures and algorithms
- Natural Language Processing with Python, A book about Natural Language Processing
- Codecademy: Python, An interactive “Python for Beginners” tutorial that allows you to code while learning the fundamentals

R

- simpleR – Using R for Introductory Statistics, How to use R while learning introductory statistics
- Regression Analysis using R, Using regression analysis to solve prediction problems
- Statistics with R, Notes the author took while discovering and using the statistical environment R
- An Introduction to Statistical Learning (with applications in R), For those wishing to use statistical learning tools to analyze their data
- R and Data Mining: Examples and Case Studies, Various data mining functionalities in R and three case studies of real world applications
- An Introduction to R, R is an integrated suite of software facilities for data manipulation, calculation and graphical display
- RStatistics.net, An educational resource for all things related to the R language and its applications in advanced statistical computing and machine learning

Statistics

- Statistical foundations of machine learning, Statistical foundations of machine learning, understood as the discipline that deals with the automatic design of models from data
- Statistics for Big Data, Multivariate Data Analysis using R
- STATISTICS Methods and Applications, A comprehensive, nearly encyclopedic presentation of statistical methods and analytic approaches used in science, industry, business, and data mining, written from the perspective of the real-life practitioner
- Introduction to Statistical Thought, A focus on ideas that statisticians care about as opposed to technical details of how to put those ideas into practice
- Statistics by Wikibooks.org, Modern statistics and some practical applications of statistics
- The Elements of Statistical Learning (Data Mining, Inference and Prediction), Bringing together many new ideas in learning and explaining them in a statistical framework
- A Practitioner’s Guide to Generalized Linear Models, Written for the practising actuary who would like to understand generalized linear models (GLMs) and use them to analyze insurance data
- Concepts and Applications of Inferential Statistics, A full-length and occasionally interactive statistics textbook.
- OpenIntro Statistics, Foundation of statistical thinking and methods
- An Introduction to Statistical Learning (with applications in R), For those wishing to use statistical learning tools to analyze their data
- Collaborative Statistics, An introductory statistics course for students majoring in fields other than engineering or math. The only prerequisite is intermediate algebra
- simpleR – Using R for Introductory Statistics, How to use R while learning introductory statistics
- Statistics with R, Notes the author took while discovering and using the statistical environment R
- Analyzing Linguistic Data: A practical introduction to statistics, Statistical analysis of language, designed for linguists with a non-mathematical background
- Think Stats: Probability and Statistics for Beginners, Emphasizes the use of statistics and a computational approach to explore large datasets
- Harvard’s Statistical Computing and Visualization, A graduate class on analyzing data without losing scientific rigor, and communicating your work

Text Mining

- Text Mining with STATISTICA Video Series, The intent of this series is to familiarize you with text mining, so that you are empowered to start your own text mining projects and can carry out the projects successfully
- Data-Intensive Text Processing with MapReduce, Scalable approaches to processing large amounts of text with MapReduce
- Theory and Applications for Advanced Text Mining, Advanced text mining techniques