Clearing up confusion around IPython, ipykernel, and Jupyter notebooks
One of my big recurring time sinks while doing data science work used to be trying to get my colleagues’ jupyter notebooks to run on my machine. The main contributing factors:
- My team generally uses poetry environments to improve reproducibility, but sometimes dependencies aren’t specified.
- I use VS Code and its Jupyter extension to edit notebooks, which requires more configuration than running the Jupyter web UI.
- I lacked a clear understanding of the differences between IPython, ipykernel, and jupyter, and of which Python environments are being used when running a notebook.
So this is my attempt at a friendcatcher. I hope this saves you a few minutes the next time you run into similar issues.
The different components
Let’s distinguish the components that play a role in running a jupyter notebook:
- Jupyter Notebook platform: A web-based interactive computing platform that supports different languages via different kernels, e.g. for python (ipykernel), Julia (IJulia), R (IRKernel)
- Jupyter notebook (extension .ipynb): A document in JSON format that holds metadata and cell code.
- IPython command shell: The shell has two components:
  - An interactive Python shell. You can start it with ipython. It’s like the default Python REPL but with enhanced features, e.g. object introspection, tab completion, input history, magic commands, etc.
  - A Jupyter kernel, ipykernel. This is the backend process where user Python code runs and which can be connected to different frontends. One frontend is indeed the IPython shell, another one a Jupyter notebook. You can install ipykernel as a standalone package into your Python environment.
- jupyter Python package: This is a metapackage which installs the notebook, qtconsole, and ipykernel.
graph TD;
  colab(Google Colab UI) <--> ipykernel(ipykernel)
  vs(VS Code UI) <--> ipykernel
  ui(jupyter notebook UI) <--> server
  server(jupyter server) <--> ipykernel
  ipykernel <--> ipython[IPython]
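To make the "a notebook is just JSON" point concrete, here is a minimal sketch that builds a tiny notebook by hand and reads it back. The kernel name and cell contents are made up for illustration; a real .ipynb file typically carries more metadata, but the kernelspec entry is how the file records which kernel it expects.

```python
import json

# A hand-written, minimal notebook (nbformat 4). The kernelspec values below
# are illustrative assumptions, not taken from a real project.
notebook_text = json.dumps({
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["import sys\n", "sys.executable"],
        }
    ],
    "metadata": {
        "kernelspec": {
            "display_name": "Python (envname)",
            "language": "python",
            "name": "envname",
        }
    },
    "nbformat": 4,
    "nbformat_minor": 4,
})

# Reading a notebook is plain JSON parsing.
notebook = json.loads(notebook_text)

# The notebook metadata names the kernel the document was saved with.
kernel_name = notebook["metadata"]["kernelspec"]["name"]

# Cell code is stored as a list of source lines per cell.
code_cells = [
    "".join(cell["source"])
    for cell in notebook["cells"]
    if cell["cell_type"] == "code"
]

print(kernel_name)
print(code_cells[0])
```

To inspect one of your own notebooks, replace notebook_text with the contents of the file, e.g. via pathlib.Path("mynotebook.ipynb").read_text().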
Jupyter kernels vs. shell environment
One source of confusion is that a Jupyter kernel can point to a different Python executable than your shell environment.
To get an overview of the available Jupyter executables and kernels, you can use:
- List all available jupyter executables in your system:
$ type -a jupyter
- List all available Jupyter kernels:
$ jupyter kernelspec list
Every Jupyter kernel folder includes a kernel.json file that links to the Python executable that is being used. Note that this can be different from the Python executable referenced by your current shell. Moreover, the shell environment of a Jupyter notebook uses the Python executable used to launch the notebook.
- Print the path of the currently used Python executable:
$ type python
or in a notebook cell:
!type python
- Print the path of the Python executable of the current kernel, in a notebook cell:
import sys
sys.executable
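The two checks above can be combined into one notebook cell. This is a rough sketch: same_environment is a hypothetical helper of mine, and comparing the parent directories of the two paths is only a heuristic (it deliberately does not resolve symlinks, since a venv's python is often a symlink to the system interpreter).

```python
import shutil
import sys
from pathlib import Path
from typing import Optional


def same_environment(kernel_python: str, shell_python: Optional[str]) -> bool:
    """Heuristic: both executables live in the same bin/Scripts directory."""
    if shell_python is None:
        return False
    return Path(kernel_python).parent == Path(shell_python).parent


# Interpreter running this code (the kernel, when executed in a notebook).
kernel_python = sys.executable

# First `python` found on PATH of this process, or None if there is none.
shell_python = shutil.which("python")

print("kernel:", kernel_python)
print("shell: ", shell_python)
print("same environment:", same_environment(kernel_python, shell_python))
```

If the last line prints False, packages installed with a bare pip in the shell may not end up in the environment that the kernel actually uses.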
You can create new kernels using the ipykernel package:
$ python -m ipykernel install --user --name envname --display-name "Python (envname)"
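Each kernel folder listed by jupyter kernelspec list, including one created this way, contains the kernel.json mentioned earlier. As a sketch of how that file ties a kernel to an interpreter, the snippet below writes a made-up kernel.json into a temporary folder and reads the executable back from the first entry of argv. The helper name kernel_interpreter and the interpreter path are hypothetical; point the function at a real kernel folder to check your own setup.

```python
import json
import tempfile
from pathlib import Path


def kernel_interpreter(kernel_dir: Path) -> str:
    """Return the executable a kernel spec launches (first entry of argv)."""
    spec = json.loads((kernel_dir / "kernel.json").read_text())
    return spec["argv"][0]


# Demo with a made-up kernel folder; the interpreter path is hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    kernel_dir = Path(tmp) / "envname"
    kernel_dir.mkdir()
    (kernel_dir / "kernel.json").write_text(json.dumps({
        "argv": [
            "/home/me/notebook_project/.venv/bin/python",
            "-m",
            "ipykernel_launcher",
            "-f",
            "{connection_file}",
        ],
        "display_name": "Python (envname)",
        "language": "python",
    }))
    interpreter = kernel_interpreter(kernel_dir)

print(interpreter)
```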
Dependency management
Since I use VS Code as my frontend, I just need to add the ipykernel package to the virtual environment that I use to manage all other dependencies needed to run the notebook. This ensures that the same Python executable is used for the kernel and the shell environment. This is well explained here.
These are the steps to create a new environment for a jupyter notebook:
- Create project folder:
$ mkdir notebook_project
- Create a new virtual environment in the folder, then activate it:
$ cd notebook_project
$ python3 -m venv .venv
$ source .venv/bin/activate
- Install ipykernel (and other dependencies) using pip (make sure the venv is activated):
$ python3 -m pip install ipykernel
$ python3 -m pip install pandas
- Create a new notebook:
$ touch mynotebook.ipynb
- Open the notebook in VS Code and, in the top right corner, choose Select Kernel -> Python Environment -> .venv (.venv/bin/python)
- You should now be able to run the notebook and use the pandas package inside the notebook.
- To add new dependencies:
  - Use the terminal:
    $ python3 -m pip install <package_name>
  - Install from within a notebook cell:
    import sys
    !{sys.executable} -m pip install <package_name>
  - Specify your dependencies in a requirements.txt
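The reason for going through sys.executable when installing from a notebook cell is that it pins pip to the interpreter the kernel is running, whatever pip happens to be first on PATH. Here is a small sketch of the same idea in plain Python; pip_install_command and is_importable are hypothetical helper names, and the snippet only builds the command rather than running it.

```python
import importlib.util
import sys
from typing import List


def pip_install_command(package: str) -> List[str]:
    """Build a pip command bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", package]


def is_importable(module: str) -> bool:
    """Check whether a top-level module can be found in the current environment."""
    return importlib.util.find_spec(module) is not None


print(pip_install_command("pandas"))
print("pandas importable:", is_importable("pandas"))
```

To actually install, the command list can be passed to subprocess.run(cmd, check=True); running is_importable afterwards in the notebook is a quick way to confirm the package landed in the kernel's environment.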
I use poetry for virtual environments and dependency management. So in step 2 I would instead use:
$ cd notebook_project
$ poetry init
and install packages via:
$ poetry add ipykernel
$ poetry add pandas
If I want to use the default Jupyter UI, I can install the jupyter metapackage into my environment and then start the UI with:
$ poetry add jupyter
$ poetry run jupyter notebook
If you have any thoughts, questions, or feedback about this post, I would love to hear it. Please reach out to me via email.