This page collects common issues and setup tips encountered while working through the book.
The chapter notebooks use Markdown image links hosted at https://sebastianraschka.com/images/LLMs-from-scratch-images/.... This keeps the repository download size manageable, but it also means the images depend on the image host and your network connection.
If images in the .ipynb notebooks do not render:
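One way to narrow this down is to check whether the image host is reachable from your machine. The following is a minimal sketch, not part of the repository: it assumes the notebooks follow the standard notebook JSON layout (a top-level "cells" list with "cell_type" and "source" fields), and the helper names image_urls, is_reachable, and check_notebook are hypothetical.

```python
import json
import re
import urllib.request

# Matches Markdown image links ![alt](url) and HTML <img src="url"> tags.
IMAGE_URL = re.compile(
    r'!\[[^\]]*\]\((https?://[^)\s]+)|<img[^>]*\bsrc="(https?://[^"]+)"'
)


def image_urls(notebook):
    """Collect remote image URLs from the Markdown cells of a parsed notebook."""
    urls = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # "source" may be stored as a list of lines or as a single string.
        text = "".join(source) if isinstance(source, list) else source
        for md_url, html_url in IMAGE_URL.findall(text):
            urls.append(md_url or html_url)
    return urls


def is_reachable(url, timeout=10):
    """Return True if a HEAD request to the URL succeeds.

    Some servers reject HEAD requests, so a False result is a hint, not proof.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False


def check_notebook(path):
    """Map each remote image URL in the notebook at `path` to its reachability."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    return {url: is_reachable(url) for url in image_urls(notebook)}
```

Calling check_notebook("ch02.ipynb") returns a dictionary from image URL to True or False. If every entry is False, the problem is more likely the network connection or the image host than the notebook itself.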
If you want to modify notebooks while also receiving repository updates, fork the repository first, then clone your fork. The main book notebooks are kept in sync with the printed book and are generally not changed, except for critical fixes. Most repository updates add bonus material instead.
Notebook files are JSON files, so Git diffs and merge conflicts can be hard to read. To avoid unnecessary conflicts, I recommend keeping your experiments separate from the tracked book notebooks:
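Copy a notebook before editing it, for example ch02.ipynb to ch02_experiments.ipynb.
Pull updates from the original repository through an upstream remote, then merge or rebase only when you need those updates.
To create a fork and clone it, fork the repository on GitHub, then clone your fork, replacing YOUR-USERNAME with your GitHub username:
git clone https://github.com/YOUR-USERNAME/LLMs-from-scratch.git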
cd LLMs-from-scratch
Then add the original repository as upstream so you can fetch future updates:
git remote add upstream https://github.com/rasbt/LLMs-from-scratch.git
git fetch upstream
git merge upstream/main
If you do need to merge edited notebooks, consider installing nbdime to get notebook-aware diffs and merge tools:
pip install nbdime
nbdime config-git --enable
For more context, see #1015.
Some notebooks and scripts use cuda when available and otherwise fall back to cpu, without selecting Apple's mps backend. This omission of mps support is intentional in many places because earlier PyTorch/MPS versions produced unstable or different results in several examples, especially during training and finetuning.
If you are using an Apple Silicon Mac and see diverging losses, sharp loss spikes, poor generated text, or results that do not match the book, rerun the example on cpu first. For faster training with book-matching behavior, I recommend using cuda on a local NVIDIA GPU or a cloud GPU.
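When you rerun an example on cpu, it can help to compare the two runs step by step instead of eyeballing the curves. The sketch below assumes you have recorded the per-step losses of both runs as plain Python lists. The function name compare_losses and the 5% relative tolerance are illustrative assumptions, not values from the book.

```python
import math


def compare_losses(reference, candidate, rel_tol=0.05):
    """Return (step, reference_loss, candidate_loss) for steps that disagree.

    A step is flagged when the candidate loss is NaN or infinite, or when it
    differs from the reference loss by more than the relative tolerance.
    """
    mismatches = []
    for step, (ref, cand) in enumerate(zip(reference, candidate)):
        if not math.isfinite(cand) or not math.isclose(ref, cand, rel_tol=rel_tol):
            mismatches.append((step, ref, cand))
    return mismatches


# Example with made-up numbers: the third step of the second run spikes.
cpu_losses = [10.0, 8.0, 6.0]
mps_losses = [10.1, 8.2, 9.5]
print(compare_losses(cpu_losses, mps_losses))  # [(2, 6.0, 9.5)]
```

Small differences between devices are expected, so treat the tolerance as something to tune. A growing list of mismatches or any non-finite loss is the stronger signal.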
Newer PyTorch versions may improve MPS behavior, and you can experiment with mps locally if you validate the results carefully. However, if you add mps support to a script yourself, keep in mind that CUDA-specific options such as pin_memory=True, torch.compile, and DDP/multi-GPU code may need separate guards.
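If you do experiment with mps, one option is to make it an explicit opt-in and derive the CUDA-specific options from the selected device. This is a minimal sketch under stated assumptions, not the repository's code: the names choose_device, runtime_options, and detect_device, and the option keys use_compile and use_ddp, are hypothetical. torch.backends.mps.is_available() requires a PyTorch version that ships the MPS backend.

```python
def choose_device(cuda_available, mps_available, allow_mps=False):
    """Prefer cuda, use mps only when explicitly allowed, else fall back to cpu."""
    if cuda_available:
        return "cuda"
    if allow_mps and mps_available:
        return "mps"
    return "cpu"


def runtime_options(device, world_size=1):
    """Turn on CUDA-specific options only when running on cuda."""
    on_cuda = device == "cuda"
    return {
        "pin_memory": on_cuda,                   # DataLoader(pin_memory=...)
        "use_compile": on_cuda,                  # whether to call torch.compile
        "use_ddp": on_cuda and world_size > 1,   # multi-GPU training
    }


def detect_device(allow_mps=False):
    """Query PyTorch for the available backends and pick a device name."""
    import torch  # imported here so the helpers above work without PyTorch

    return choose_device(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
        allow_mps=allow_mps,
    )
```

With this layout, detect_device() reproduces the cuda-else-cpu default, and detect_device(allow_mps=True) selects mps only on machines where it is available and cuda is not. The returned string can be passed to torch.device, and runtime_options keeps pin_memory, torch.compile, and DDP disabled off cuda.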
For more context, see #977, #625, #644, #442, and #846.
For other issues, please feel free to open a new GitHub Issue.