All resources
Foundations
-
Introduction to Scientific ML
Online textbook · Intermediate
Online lecture-book on scientific machine learning with interactive coding tutorials for data analytics and uncertainty quantification.
Prior knowledge: Python, basic calculus, linear algebra
Estimated time: TODO
scientific machine learning · textbook · notebooks
-
Machine Learning Introduction (Coursera)
Online course · Beginner
Coursera specialization introducing basic machine learning concepts with a video component and practical assignments.
Prior knowledge: Basic coding (for loops, functions, if/else), high-school math (algebra)
Estimated time: 2 months part-time (~10 hours/week)
machine learning · course · video
-
scikit-learn
Software · Beginner
General-purpose machine learning library in Python with classical models, preprocessing utilities, and metrics.
Prior knowledge: Basic Python programming
Estimated time: -
machine learning · python · library
-
PyTorch
Software · Beginner
Deep learning framework with dynamic computation graphs and extensive support for neural network research and applications.
Prior knowledge: Basic Python programming
Estimated time: -
deep learning · python · framework
-
TensorFlow
Software · Intermediate
End-to-end open-source platform for machine learning and deep learning, with support for large-scale training and deployment.
Prior knowledge: TODO
Estimated time: TODO
deep learning · python · framework
-
Hugging Face
Dataset · Beginner
Ecosystem of pretrained models, datasets, and Python libraries for transformers, diffusion models, and other modern ML architectures.
Prior knowledge: -
Estimated time: -
transformers · LLMs · model hub
-
Bayesian Modeling and Computation in Python
Online textbook · Intermediate
Applied textbook that walks through Bayesian computation using PyMC, ArviZ, and TensorFlow Probability. Includes practical code examples, notebooks, and tutorials.
Prior knowledge: Probability theory, linear algebra, basic Python or R
Estimated time: 20–40 hours
bayesian statistics · probabilistic programming · python notebooks
-
Dive into Deep Learning
Online textbook · Beginner
Interactive textbook on deep learning with Jupyter notebooks and runnable code (PyTorch, MXNet). A good hands-on starting point.
Prior knowledge: Basic Python programming, linear algebra
Estimated time: 30–60 hours
deep learning · notebooks · python
-
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges
Online textbook · Advanced
Advanced textbook covering geometric deep learning topics (manifolds, graphs, gauge theory) with some code examples. Few notebooks are provided.
Prior knowledge: Deep learning, differential geometry, graph theory
Estimated time: 20–40 hours
geometric deep learning · graph neural networks · manifolds
-
Interpretable Machine Learning
Online textbook · Intermediate
Book for practitioners wanting to understand model interpretability. Includes Python notebook examples for many methods.
Prior knowledge: Machine learning basics, Python
Estimated time: 20–30 hours
interpretable ML · machine learning · python
-
Understanding Deep Learning
Online textbook · Intermediate
Textbook covering up-to-date deep learning topics (transformers, diffusion models) with exercises and interactive slides; some notebooks included.
Prior knowledge: Basic deep learning concepts, Python
Estimated time: 15–30 hours
deep learning · transformers · visual explanations
-
3Blue1Brown Neural Networks Video Series
Tutorial · Beginner
Visual and intuitive video series explaining neural networks and deep learning concepts with minimal math.
Prior knowledge: Basic calculus, linear algebra
Estimated time: 5 hours
neural networks · video · visual explanations
-
Agents4Science: Agentic Scientific Discovery Platforms
Online course · Advanced
Course on AI agents in scientific discovery platforms, with slides and reading materials; covers sense-plan-act-learn loops and scientific workflows. Light on notebooks, but with substantial reading and slides.
Prior knowledge: Machine learning, AI agents, scientific workflows
Estimated time: ≈10–15 hours (lecture slides + assignments)
AI agents · scientific discovery · workflow automation
-
Deep Neural Networks Video Course
Online course · Beginner
Video course playlist covering deep neural network fundamentals. Mostly video lectures, with limited notebook assignments.
Prior knowledge: Basic calculus, linear algebra, Python
Estimated time: 8–10 hours (video playlist)
neural networks · video lectures · deep learning
-
Mathematics for Machine Learning Specialization
Online course · Intermediate
Coursera specialization covering linear algebra, calculus, and probability for machine learning. Includes interactive Jupyter assignments.
Prior knowledge: High-school algebra, Python basics
Estimated time: 4 weeks part-time (~10 hours/week)
mathematical foundations · linear algebra · probability & statistics
-
Machine Learning Refined – Course Materials
Online course · Beginner To Intermediate
Accompanying course materials for the textbook Machine Learning Refined. Includes Jupyter/Colab notebooks, chapter notes, exercises, and slides, emphasizing geometric intuition and building classic ML methods from scratch in Python.
Prior knowledge: Basic Python, matrix algebra, introductory calculus
Estimated time: 40–60 hours (online notes, exercises, and slides)
machine learning · python notebooks · optimization
-
Machine Learning for Beginners (Microsoft)
Online course · Beginner
Project-based introductory curriculum on classic machine learning using scikit-learn. Provides lesson notebooks, quizzes, assignments, and solutions, organized as a 12-week, 26-lesson sequence designed for classroom use or self-study.
Prior knowledge: Introductory Python, high-school algebra
Estimated time: 40–60 hours (12-week, 26-lesson curriculum)
machine learning · beginner curriculum · scikit-learn
-
Practical Deep Learning for Coders (fast.ai)
Online course · Intermediate
Hands-on deep learning course focusing on practical applications (vision, NLP, tabular, recommender systems, and diffusion models). Uses the fastai and PyTorch libraries with free compute options, combining video lectures with Jupyter notebooks and exercises.
Prior knowledge: Comfortable with Python coding, basic math (algebra and simple calculus)
Estimated time: 30+ hours (video lessons plus notebooks)
deep learning · practical course · pytorch
-
LLM Visualization
Tutorial · Intermediate
Interactive 3D visualisation of a GPT-style large language model, showing every layer and operation during inference. You can step through the full computation pipeline, which makes the inner workings of LLMs much more tangible.
Prior knowledge: Basic understanding of neural networks and transformers is helpful, plus some linear algebra intuition
Estimated time: 1–2 hours
large language models · 3D visualization · interactive tutorial
-
A Gentle Introduction to Graph Neural Networks
Tutorial · Beginner To Intermediate
Interactive Distill article that introduces graph neural networks from first principles, with animations, visual explanations, and code snippets. Walks through graph data, message passing, and the components of a modern GNN in an intuitive, hands-on way.
Prior knowledge: Familiarity with graphs and adjacency matrices
Estimated time: 2–4 hours
graph neural networks · interactive article · visual explanation
-
Machine Learning for Everyone (In Simple Words)
Tutorial · Beginner
Long-form blog tutorial explaining machine learning in the simplest possible terms, with real-world analogies and no formal math. Focuses on intuition, everyday examples, and plain language rather than equations, making it accessible to non-technical readers.
Prior knowledge: None; general curiosity about machine learning
Estimated time: 1–2 hours
machine learning basics · non-technical · real-world examples
-
An Introduction to Statistical Learning
Online textbook · Intermediate
Accessible textbook on statistical learning, covering regression, classification, resampling methods, regularization, tree-based methods, SVMs, clustering, survival analysis, and more. Free PDF versions and video lectures are available, with R and Python labs at the end of each chapter for hands-on practice.
Prior knowledge: Basic probability and statistics, linear algebra, and some R or Python experience
Estimated time: 30–60 hours (full book with labs)
statistical learning · regression & classification · R/Python labs
-
The Python Tutorial (Official Documentation)
Tutorial · Beginner
The official Python tutorial, maintained as part of the core Python documentation. Covers the essential Python concepts, from basic syntax to modules, classes, I/O, and error handling. Because it is versioned with the documentation, it stays current with the latest Python release.
Prior knowledge: None; suitable for first-time programmers
Estimated time: 10–20 hours (full read-through with exercises)
python basics · official docs · programming fundamentals
-
Software Carpentry Lessons
Tutorial · Beginner
Collection of core Software Carpentry lessons teaching essential research computing skills: the Unix shell, version control with Git, and programming with Python or R. Designed as hands-on workshop material with exercises and instructor notes.
Prior knowledge: Basic familiarity with files/folders; no prior coding experience required
Estimated time: 10–20 hours (Unix shell, Git, and Python/R lessons)
research computing · shell/git/python · hands-on lessons
-
BoTorch
Software · Intermediate
Bayesian optimization library built on PyTorch, supporting Gaussian process models and acquisition functions for global optimization.
Prior knowledge: Basic Python programming, PyTorch
Estimated time: -
bayesian optimization · gaussian processes · pytorch
-
Homemade Machine Learning
Tutorial · Beginner To Intermediate
Collection of popular machine learning algorithms implemented from scratch in Python, with the underlying mathematics explained. Each algorithm is accompanied by interactive Jupyter Notebook demos so you can tweak data and hyperparameters and immediately see predictions and visualisations.
Prior knowledge: Python (NumPy), basic calculus, linear algebra, and introductory ML concepts
Estimated time: 15–30 hours (working through demos and notebooks)
from-scratch implementations · jupyter notebooks · classic ML algorithms
-
Python for Physicists
Course · Beginner
Software Carpentry-style course aimed at teaching Python to physicists.
Prior knowledge: Basic Python programming
Estimated time: 12 hours
python · scientific programming · physicists
Chemistry
-
EPFL AI for Chemistry course
Online course · Intermediate
Lecture notes, slides, and notebooks for AI in chemistry, focusing on reaction prediction and synthesis planning.
Prior knowledge: Undergrad chemistry, basic ML (supervised learning)
Estimated time: 10–20 hours
reaction prediction · cheminformatics · course
-
ML4Chem
Software · Intermediate
Open-source machine learning library for atomistic models in chemistry and materials science with a PyTorch backend.
Prior knowledge: Python, PyTorch, basic atomistic simulations
Estimated time: 2–4 hours (tutorials)
atomistic ML · materials · python · library
-
Scientific Computing for Chemists with Python
Online textbook · Beginner To Intermediate
Online textbook on programming and scientific computing for chemists, featuring Python-based coding tutorials.
Prior knowledge: None; starts from the basics of Python programming
Estimated time: 10–15 hours (basics) + 15–30 hours (advanced topics)
python · scientific computing · chemistry · textbook
-
Reinforcement Learning for ChemEng
Tutorial · Intermediate To Advanced
Educational reinforcement learning implementation with tutorial notebooks aimed at chemical engineering applications.
Prior knowledge: Chemical engineering, reinforcement learning basics, Python
Estimated time: 5–10 hours (notebook tutorials)
reinforcement learning · chemical engineering · notebooks
-
RDKit
Software · -
Open-source toolkit for cheminformatics, enabling construction, manipulation, and analysis of molecular structures and fingerprints.
Prior knowledge: Basic Python programming
Estimated time: -
cheminformatics · molecules · descriptors · fingerprints
-
STK
Software · Beginner
Python library for the construction and manipulation of complex molecules, supramolecular assemblies, and molecular databases.
Prior knowledge: Basic Python programming
Estimated time: -
supramolecular · molecules · python · library
-
STKO
Software · Beginner
Collection of molecular optimisers and property calculators designed for use with stk and supramolecular systems.
Prior knowledge: Basic Python programming, familiarity with stk
Estimated time: -
supramolecular · optimisation · properties · python
-
MORDRED
Software · Beginner
Molecular descriptor calculator capable of generating a wide range of descriptors for cheminformatics applications.
Prior knowledge: Basic Python programming
Estimated time: -
molecular descriptors · cheminformatics · python
-
GAUCHE
Software · Intermediate
Library for Gaussian processes in chemistry, enabling probabilistic modeling and surrogate models for chemical problems.
Prior knowledge: Basic Python programming, Gaussian processes, chemistry
Estimated time: -
gaussian processes · chemistry · bayesian modelling
-
BayBE
Software · Beginner
Bayesian optimization package focused on chemistry applications, with tools for experiment planning and optimization.
Prior knowledge: Basic Python programming, chemistry
Estimated time: -
bayesian optimization · chemistry · experiments
-
DScribe
Software · Beginner
Library for computing advanced descriptors for molecules and materials, including SOAP, MBTR, and other atomistic representations. Includes tutorials and examples.
Prior knowledge: Basic Python programming, atomistic simulations
Estimated time: -
descriptors · materials · molecules · python
-
Deep Learning for Molecules and Materials
Online textbook · Beginner To Intermediate
Textbook focused on deep learning approaches for molecules and materials. Contains Jupyter Book-style chapters and notebook examples for hands-on learning.
Prior knowledge: Chemistry fundamentals, Python, basic ML
Estimated time: 15–30 hours
molecules · deep learning · materials informatics
-
Data-Driven Chemistry (University of Edinburgh)
Online course · Beginner
Introductory Python/data-analysis course for chemistry students. Contains Jupyter notebooks for each unit.
Prior knowledge: Undergraduate chemistry, basic Python
Estimated time: 12–20 hours (10 workshop units)
chemistry programming · data-analysis · Python notebooks
-
Intro to Machine Learning in Chemistry (ML4chemArg)
Online course · Beginner To Intermediate
Course designed for chemistry students without prior programming experience, built around Python notebooks and real chemical data.
Prior knowledge: Basic Python or none
Estimated time: 10–15 hours (notebook-based course)
machine learning chemistry · Python notebooks · introductory course
-
Data Analytics in Chemistry (CHEM70012 — Imperial College)
Online course · Beginner To Intermediate
Workshop-based course introducing statistical learning, data visualisation, and model building for chemical datasets. Contains Jupyter notebooks for each workshop session. Designed for master's-level chemistry undergraduates.
Prior knowledge: Familiarity with Python; high-school level maths/statistics
Estimated time: ≈8–12 hours (workshop notebooks + data-analysis modules)
chemistry data analysis · statistical learning · workshop notebooks
-
Is Life Worth Living? — Cheminformatics Blog by @iwatobipen
Tutorial · Beginner To Intermediate
Blog covering cheminformatics topics such as RDKit, molecular similarity, data pipelines, and workflows. Includes code snippets and practical examples, and explains tools in clear terms.
Prior knowledge: Basic chemistry and Python; interest in cheminformatics
Estimated time: Varies (many short posts, individual topics)
cheminformatics · RDKit · python workflows
-
Introduction to Python for Chemists (Imperial College)
Tutorial · Beginner
Introductory Python course tailored to chemists, starting from the very basics: basic syntax, data handling, and chemical-data examples. Contains Jupyter notebooks and worked examples to help chemists get started with coding.
Prior knowledge: High-school chemistry, basic mathematics
Estimated time: ≈5–10 hours (notebooks + exercises)
chemistry · python basics · notebooks
Materials
-
MatChem Dataset Repository
Dataset · Beginner
Curated list of datasets for machine learning with materials, including links to data resources and related projects.
Prior knowledge: -
Estimated time: -
datasets · materials · curated list
-
Materials Informatics Video Tutorials (Taylor Sparks)
Tutorial · Beginner
Video tutorials on materials informatics by Taylor Sparks, covering various topics in the field.
Prior knowledge: Basic materials science, Python
Estimated time: -
materials informatics · tutorials · video
-
Automated Experiment (UTK Spring 2023)
Online course · Intermediate
Course material repository for automated experiment design incorporating Gaussian processes and physics discovery. Contains Jupyter notebooks and assignments.
Prior knowledge: Statistics, machine learning, Python
Estimated time: 8–12 hours (lecture slides + notebooks)
automated experimentation · Gaussian processes · Bayesian optimisation
-
Machine Learning for Materials: From PCA to ChatGPT (UTK MSE Fall 2023)
Online course · Intermediate To Advanced
Semester-length course on machine learning for materials, from PCA and classical methods to modern deep learning and large language models. Includes Jupyter notebooks, module-based materials, and project-style content focused on real materials-science problems.
Prior knowledge: Undergraduate materials science, basic Python, linear algebra
Estimated time: 30–40 hours (selected modules, readings, and notebooks)
materials science · machine learning · course notebooks
-
Materials Informatics (MSE5540/6640, University of Utah)
Online course · Intermediate
Full course on materials informatics covering data repositories, featurization, best practices, and ML workflows for materials discovery. Repository includes lecture slides, Jupyter notebooks, homework assignments, and reading lists, plus a linked YouTube lecture playlist.
Prior knowledge: Undergraduate materials science, basic Python, basic statistics
Estimated time: 30–50 hours (lectures, homeworks, and worked examples)
materials informatics · jupyter notebooks · course
-
Machine Learning for Materials (MATE70026 — Imperial College)
Online course · Intermediate
Course module from Imperial College London introducing the representation of composition–structure–property data for materials; building, training, and evaluating ML models; and recent AI-for-science topics. Includes Jupyter notebooks for module work.
Prior knowledge: Basic Python programming, undergraduate materials science
Estimated time: ≈12–16 hours (lectures + notebook modules + assignments)
materials science · machine learning · Jupyter notebooks
-
Digital Materials Foundry – Experimental Materials Data Library (Henry Royce Institute)
Dataset · -
Library of experimental materials data repositories curated by the Henry Royce Institute. Includes device-performance, stress–strain, thermoelectric, and optical-property databases, among others. Useful resource for ML in materials. License varies per dataset.
Prior knowledge: Materials science fundamentals, interest in data workflows
Estimated time: -
experimental materials data · materials discovery · data-library
-
Materials Project
Dataset · -
Open-access computational materials database providing predicted and known properties of inorganic materials (e.g., formation energy, band gap, structure), built via DFT and high-throughput workflows. Widely used in ML for materials.
Prior knowledge: Materials science (crystallography, DFT basics) or willingness to explore APIs
Estimated time: -
inorganic materials dataset · DFT computed properties · materials ML
-
MatBench – Benchmark Datasets for Materials Property Prediction
Dataset · -
Benchmark dataset suite curated by the Materials Project for ML-based materials property prediction. Tasks span electronic, thermal, and mechanical properties; includes APIs and a leaderboard. License: MIT.
Prior knowledge: Machine learning basics, materials science background helpful
Estimated time: -
benchmark datasets · materials ML · property prediction
-
Porous Material AI Gym: Open Datasets for Machine Learning on Porous Materials
Dataset · -
Collection of open datasets for machine learning on porous materials (MOFs, COFs, zeolites). Includes thousands of labelled examples (adsorption, band gaps, charges) for supervised learning, ready to use in ML workflows.
Prior knowledge: Materials science (crystallography/porous materials), Python, ML basics
Estimated time: -
porous materials · machine learning dataset · MOFs/COFs