sqrtminusone.github.io/org/2021-04-07-org-python.org

Replacing Jupyter Notebook with Org Mode

Why?

Basic setup

There are multiple ways of doing literate programming with Python & Org Mode, ein being one of the notable alternatives.

I go with the emacs-jupyter package. Installing it is pretty straightforward; I use use-package with straight.el:

(use-package jupyter
  :straight t)

Then we have to enable languages for org-babel. The following isn't the best practice for startup time, but it has been the least problematic approach in my experience.

(org-babel-do-load-languages
 'org-babel-load-languages
 '((emacs-lisp . t) ;; Other languages
   (shell . t)
   ;; Python & Jupyter
   (python . t)
   (jupyter . t)))

That adds Org source blocks with names like jupyter-LANG, e.g. jupyter-python. To use just LANG src blocks, call the following function after org-babel-do-load-languages:

(org-babel-jupyter-override-src-block "python")

That overrides the built-in python block with jupyter-python.

If you use ob-async, you have to tell it to ignore jupyter-LANG blocks, because emacs-jupyter has asynchronous execution of its own.

(setq ob-async-no-async-languages-alist '("python" "jupyter-python"))

Environments

So, we've set up a basic emacs-jupyter configuration.

The catch here is that Jupyter has to be available on Emacs startup (at the time the emacs-jupyter package is evaluated, to be precise). That means that if you launch Emacs with something like an application launcher, the global Python & Jupyter will be used.

import sys
sys.executable

/usr/bin/python3

Which is probably not what we want. To resolve that, we have to make the right Python available at the required time.
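
To see at a glance which interpreter and environment a kernel ended up in, a slightly longer check can be evaluated in a source block. This is only a sketch: the helper name describe_interpreter is made up for illustration, and whether CONDA_DEFAULT_ENV is visible depends on the environment the kernel process inherited.

```python
import os
import sys


def describe_interpreter():
    """Collect a few facts that identify the running Python."""
    return {
        # Full path of the interpreter that runs this code
        "executable": sys.executable,
        # True inside a venv/virtualenv (conda environments report False here)
        "in_venv": sys.prefix != sys.base_prefix,
        # Set by conda when an environment is activated, otherwise None
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

If "executable" points at the system Python, the kernel was started before the desired environment was made available.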

Anaconda

If you were using Jupyter Lab or Notebook before, there is a good chance you used it via Anaconda. If not, in a nutshell, it is a package & environment manager that specializes in Python & R but also supports a whole lot of other stuff, such as Node.js. In my opinion, it is the easiest way to manage multiple Python installations if you don't use some advanced package manager like Guix.

As one may expect, there is an Emacs package called conda.el that helps with conda environments in Emacs. We have to put it somewhere before the emacs-jupyter package and call conda-env-activate:

(use-package conda
  :straight t
  :config
  (setq conda-anaconda-home (expand-file-name "~/Programs/miniconda3/"))
  (setq conda-env-home-directory (expand-file-name "~/Programs/miniconda3/"))
  (setq conda-env-subdirectory "envs"))

(unless (getenv "CONDA_DEFAULT_ENV")
  (conda-env-activate "base"))

If you have Anaconda installed in a custom path, as I do, you have to add these 3 setq forms to the :config section. Also, there is no point in activating an environment if Emacs was already launched inside one.

That'll give us Jupyter from the base conda environment.
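
For orientation, the settings above determine where each environment's interpreter lives. The sketch below assumes the usual conda layout on Unix-like systems (base in the installation root, other environments under the "envs" subdirectory); the function name conda_python is hypothetical, and Windows installations are laid out differently.

```python
from pathlib import PurePosixPath


def conda_python(home, env="base", subdir="envs"):
    """Return the expected interpreter path for a conda environment."""
    root = PurePosixPath(home)
    if env != "base":
        # Named environments live in <home>/<subdir>/<env>
        root = root / subdir / env
    return root / "bin" / "python"


print(conda_python("/home/pavel/Programs/miniconda3"))
print(conda_python("/home/pavel/Programs/miniconda3", "ann"))
```

Comparing these paths with sys.executable is a quick way to tell which environment a kernel actually runs in.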

virtualenv

TODO

Switching an environment

However, as you may have noticed, emacs-jupyter will always use the Python kernel found on startup. So if you switch to a new environment, the code will still be run in the old one, which is not too convenient.

Fortunately, to fix that we only have to refresh the Jupyter kernelspecs:

(defun my/jupyter-refresh-kernelspecs ()
  "Refresh Jupyter kernelspecs"
  (interactive)
  (jupyter-available-kernelspecs t))

Calling M-x my/jupyter-refresh-kernelspecs after a switch will give you a new kernel. Just keep in mind that the kernelspec seems to be attached to a session, so you'd also have to change the session name to get a new kernel.

import sys
sys.executable

/home/pavel/Programs/miniconda3/bin/python

(conda-env-activate "ann")

import sys
sys.executable

/home/pavel/Programs/miniconda3/bin/python

(my/jupyter-refresh-kernelspecs)

import sys
sys.executable

/home/pavel/Programs/miniconda3/envs/ann/bin/python
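
When it is unclear which interpreter a kernelspec would launch, the JSON printed by the jupyter kernelspec list --json command can be summarized outside Emacs. The following is a sketch under stated assumptions: the SAMPLE listing is made up for illustration (its paths only mimic the ones in this post), and the helper name kernel_launchers is hypothetical.

```python
import json

# Illustrative stand-in for the output of `jupyter kernelspec list --json`
SAMPLE = """
{"kernelspecs": {"python3": {
    "resource_dir": "/home/pavel/Programs/miniconda3/envs/ann/share/jupyter/kernels/python3",
    "spec": {
        "argv": ["/home/pavel/Programs/miniconda3/envs/ann/bin/python",
                 "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": "Python 3",
        "language": "python"}}}}
"""


def kernel_launchers(listing_json):
    """Map each kernelspec name to the program that starts its kernel."""
    specs = json.loads(listing_json)["kernelspecs"]
    # argv[0] is the executable; some kernelspecs use a bare "python" here
    return {name: info["spec"]["argv"][0] for name, info in specs.items()}


for name, launcher in kernel_launchers(SAMPLE).items():
    print(f"{name}: {launcher}")
```

To inspect a real installation, the string passed to kernel_launchers can come from running that command, for example via subprocess.run with capture_output=True and text=True.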

Programming

To test if everything is working correctly, run M-x jupyter-run-repl, which should give you a REPL with the chosen kernel. If so, we can finally start using Python in Org Mode.

#+begin_src python :session hello :async yes
print('Hello, world!')
#+end_src

#+RESULTS:
: Hello, world!

To avoid repeating similar arguments for the src block, we can set the header-args property at the start of the file:

#+PROPERTY: header-args:python :session hello
#+PROPERTY: header-args:python+ :async yes

When a kernel is initialized, an associated REPL buffer is also created with a name like *jupyter-repl[python 3.9.2]-hello*. That may also come in handy, although I prefer running a standalone REPL, which will be discussed later.

One advantage of emacs-jupyter is that kernel requests for input are queried through the minibuffer. So, you can run code like this:

#+begin_src python
name = input('Name: ')
print(f'Hello, {name}!')
#+end_src

#+RESULTS:
: Hello, Pavel!

without any additional setup.
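
Since every input() call turns into its own prompt, a block can also keep asking until the answer is usable. A minimal sketch: the helper ask_int and its ask parameter are made up for illustration, and the scripted replies at the bottom merely stand in for what would be typed into the minibuffer.

```python
def ask_int(prompt, ask=input):
    """Keep asking until the reply parses as an integer."""
    while True:
        reply = ask(prompt)
        try:
            return int(reply)
        except ValueError:
            print(f"{reply!r} is not a whole number, try again")


# Scripted replies so the example runs without a prompt;
# in an Org buffer one would simply call ask_int('Age: ')
replies = iter(["abc", "42"])
print(ask_int("Age: ", ask=lambda prompt: next(replies)))
```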

Code output

Images

If you want to display inline images right after the code execution, add the following hook:

(add-hook 'org-babel-after-execute-hook 'org-redisplay-inline-images)

Otherwise, you'd have to call org-redisplay-inline-images every time you want to see the output image.

Tables

HTML

Widgets

Remote kernels

Export

HTML

LaTeX

ipynb