Replacing Jupyter Notebook with Org Mode
Why?
Basic setup
There are multiple ways of doing literate programming with Python & Org Mode, ein being one of the notable alternatives.
I go with the emacs-jupyter package. Installing it is pretty straightforward; I use use-package with straight.el:
(use-package jupyter
:straight t)
Then, we have to enable languages for org-babel. Loading everything eagerly like this isn't the best practice for startup time, but it has been the least problematic approach in my experience.
(org-babel-do-load-languages
'org-babel-load-languages
'((emacs-lisp . t) ;; Other languages
(shell . t)
;; Python & Jupyter
(python . t)
(jupyter . t)))
That adds Org source blocks with names like jupyter-LANG, e.g. jupyter-python. To use just LANG src blocks, call the following function after org-babel-do-load-languages:
(org-babel-jupyter-override-src-block "python")
That overrides the built-in python block with jupyter-python.
If you use ob-async, you have to set jupyter-LANG blocks as ignored by this package, because emacs-jupyter has async execution of its own.
(setq ob-async-no-async-languages-alist '("python" "jupyter-python"))
Environments
So, we've set up a basic emacs-jupyter configuration.
The catch here is that Jupyter should be available on Emacs startup (at the time the emacs-jupyter package is evaluated, to be precise). That means that if you launch Emacs from something like an application launcher, the global Python & Jupyter will be used.
import sys
sys.executable
/usr/bin/python3
Which is probably not what we want. To resolve that, we have to make the right Python available at the required time.
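A slightly more detailed check than sys.executable can be handy here. The following is only a sketch: describe_interpreter is a name I made up, and it assumes that conda sets the CONDA_DEFAULT_ENV variable for the process and that a virtualenv shows up as a difference between sys.prefix and sys.base_prefix.

```python
import os
import shutil
import sys


def describe_interpreter():
    """Collect facts about the Python process that is executing this code."""
    return {
        # Full path of the interpreter behind the kernel.
        "executable": sys.executable,
        # Root directory of the active environment.
        "prefix": sys.prefix,
        # True inside a venv/virtualenv, where prefix and base_prefix differ.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # Set when a conda environment is active; None otherwise.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # First `jupyter` executable found on PATH, or None if there is none.
        "jupyter_on_path": shutil.which("jupyter"),
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

Running it in a src block shows at a glance whether the kernel lives in the global installation or in an environment.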
Anaconda
If you were using Jupyter Lab or Notebook before, there is a good chance you used it via Anaconda. If not, in a nutshell, it is a package & environment manager that specializes in Python & R but also supports a whole lot of other stuff, like Node.js. In my opinion, it is the easiest way to manage multiple Python installations if you don't use some advanced package manager like Guix.
As one may expect, there is an Emacs package called conda.el that helps with conda environments in Emacs. We have to put it somewhere before the emacs-jupyter package and call conda-env-activate:
(use-package conda
:straight t
:config
(setq conda-anaconda-home (expand-file-name "~/Programs/miniconda3/"))
(setq conda-env-home-directory (expand-file-name "~/Programs/miniconda3/"))
(setq conda-env-subdirectory "envs"))
(unless (getenv "CONDA_DEFAULT_ENV")
(conda-env-activate "base"))
If you have Anaconda installed on a custom path, as I do, you have to add these three setq forms to the :config section. Also, there is no point in activating an environment if Emacs was somehow already launched inside one.
That'll give us Jupyter from the base conda environment.
virtualenv
TODO
Switching an environment
However, as you may have noticed, emacs-jupyter will always use the Python kernel found on startup. So if you switch to a new environment, the code will still be run in the old one, which is not too convenient.
Fortunately, to fix that we only have to refresh the Jupyter kernelspecs:
(defun my/jupyter-refresh-kernelspecs ()
"Refresh Jupyter kernelspecs"
(interactive)
(jupyter-available-kernelspecs t))
Calling M-x my/jupyter-refresh-kernelspecs after a switch will give you a new kernel. Just keep in mind that the kernelspec seems to be attached to a session, so you'd also have to change the session name to get a new kernel.
import sys
sys.executable
/home/pavel/Programs/miniconda3/bin/python
(conda-env-activate "ann")
import sys
sys.executable
/home/pavel/Programs/miniconda3/bin/python
(my/jupyter-refresh-kernelspecs)
import sys
sys.executable
/home/pavel/Programs/miniconda3/envs/ann/bin/python
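If you'd rather see the environment name than read the whole path, the name can be guessed from the interpreter path. This is a sketch under an assumption: it relies on the usual conda layout, where named environments live in an "envs" subdirectory (the same value as conda-env-subdirectory above). conda_env_name is a hypothetical helper, not part of any package, and it will also answer "base" for a Python that has nothing to do with conda.

```python
from pathlib import PurePosixPath


def conda_env_name(executable, envs_dir="envs"):
    """Guess the conda environment name from an interpreter path.

    Assumes <root>/bin/python for the base environment and
    <root>/envs/<name>/bin/python for named environments.
    """
    parts = PurePosixPath(executable).parts
    if envs_dir in parts:
        index = parts.index(envs_dir)
        # The path component right after "envs" is the environment name.
        if index + 1 < len(parts):
            return parts[index + 1]
    return "base"


print(conda_env_name("/home/pavel/Programs/miniconda3/bin/python"))
print(conda_env_name("/home/pavel/Programs/miniconda3/envs/ann/bin/python"))
```

With the two paths from the outputs above, this prints base and then ann. In a real block you would pass sys.executable instead of a literal path.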
Programming
To test if everything is working correctly, run M-x jupyter-run-repl, which should give you a REPL with a chosen kernel. If so, we can finally start using Python in org mode.
#+begin_src python :session hello :async yes
print('Hello, world!')
#+end_src
#+RESULTS:
: Hello, world!
To avoid repeating similar arguments for the src block, we can set the header-args property at the start of the file:
#+PROPERTY: header-args:python :session hello
#+PROPERTY: header-args:python+ :async yes
When a kernel is initialized, an associated REPL buffer is also created with a name like *jupyter-repl[python 3.9.2]-hello*. That may also come in handy, although you may prefer running a standalone REPL, which will be discussed later.
Also, one advantage of emacs-jupyter is that kernel requests for input are queried through the minibuffer. So, you can run code like this:
#+begin_src python
name = input('Name: ')
print(f'Hello, {name}!')
#+end_src
#+RESULTS:
: Hello, Pavel!
without any additional setup.
Code output
Images
Image output should work out of the box. Run M-x org-toggle-inline-images (C-c C-x C-v) after the execution to see the image inline.
#+begin_src python
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4], [1, 4, 2, 3])
pass
#+end_src
#+RESULTS:
[[file:./.ob-jupyter/86b3c5e1bbaee95d62610e1fb9c7e755bf165190.png]]
However, there is some room for improvement. First, you can add the following hook if you don't want to press this awkward keybinding every time:
(add-hook 'org-babel-after-execute-hook 'org-redisplay-inline-images)
Second, we may override the image save path like this:
#+begin_src python :file img/hello.png
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4], [1, 4, 2, 3])
pass
#+end_src
#+RESULTS:
[[file:img/hello.png]]
That can save you a savefig call if the image has to be used somewhere further.
Finally, by default the image has a transparent background and a ridiculously small size. That can be fixed with some matplotlib settings:
import matplotlib as mpl
mpl.rcParams['figure.dpi'] = 200
mpl.rcParams['figure.facecolor'] = '1'
Then, we can set the image width to prevent images from becoming too large. I prefer to do it inside an emacs-lisp code block in the same org file:
(setq-local org-image-actual-width '(1024))
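To see why the width cap is worth having after raising the DPI, here is the arithmetic as a sketch. It assumes matplotlib's stock figure size of 6.4 x 4.8 inches; the inline backend used by the kernel may apply a different size, and tight bounding boxes crop the saved image a bit, so treat the numbers as approximate. figure_pixels is a name made up for this example.

```python
def figure_pixels(figsize_inches=(6.4, 4.8), dpi=100):
    """Return the (width, height) in pixels of a figure rendered at `dpi`."""
    width, height = figsize_inches
    # Pixels = inches * dots per inch, rounded to whole pixels.
    return round(width * dpi), round(height * dpi)


print(figure_pixels(dpi=100))  # (640, 480)
print(figure_pixels(dpi=200))  # (1280, 960)
```

At 200 DPI the figure is wider than 1024 pixels, so without org-image-actual-width it would be displayed at its full size.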
Tables
If you are evaluating something like a pandas DataFrame, it will be output in HTML format, wrapped in a begin_export block. To view the data in text format, you can set :display plain:
#+begin_src python :display plain
import pandas as pd
pd.DataFrame({"a": [1, 2], "b": [3, 4]})
#+end_src
#+RESULTS:
:    a  b
: 0  1  3
: 1  2  4
Another solution is to use the tabulate package:
#+begin_src python
import pandas as pd
import tabulate
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
print(tabulate.tabulate(df, headers=df.columns, tablefmt="orgtbl"))
#+end_src
#+RESULTS:
: |    |   a |   b |
: |----+-----+-----|
: |  0 |   1 |   3 |
: |  1 |   2 |   4 |
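If tabulate isn't installed in the kernel's environment, similar text can be produced with a few lines of plain Python. This is a minimal sketch, not a replacement for tabulate: org_table is a hypothetical helper, it left-aligns every cell, and it assumes each row has as many cells as there are headers.

```python
def org_table(headers, rows):
    """Format headers and rows as an Org table string."""
    cells = [[str(h) for h in headers]]
    cells += [[str(c) for c in row] for row in rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in cells) for i in range(len(headers))]

    def fmt(row):
        padded = (cell.ljust(width) for cell, width in zip(row, widths))
        return "| " + " | ".join(padded) + " |"

    # Separator line between the header and the body, e.g. |---+---|.
    rule = "|" + "+".join("-" * (width + 2) for width in widths) + "|"
    return "\n".join([fmt(cells[0]), rule] + [fmt(row) for row in cells[1:]])


print(org_table(["", "a", "b"], [[0, 1, 3], [1, 2, 4]]))
```

For a DataFrame, the rows could come from something like df.itertuples(), with the index as the first cell.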
HTML & other rich output
Yet another solution is to use emacs-jupyter's option :pandoc t, which invokes pandoc to convert HTML, LaTeX and Markdown to Org. Predictably, this is slower than the options above.
#+begin_src python :pandoc t
import pandas as pd
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df
#+end_src
#+RESULTS:
:RESULTS:
| | a | b |
|---+---+---|
| 0 | 1 | 3 |
| 1 | 2 | 4 |
:END:
Finally, every once in a while I have to view actual HTML in a browser, e.g. when using folium. To do that, I've written a small function, which runs xdg-open on the HTML export block under the cursor:
(setq my/org-view-html-tmp-dir "/tmp/org-html-preview/")
(use-package f
:straight t)
(defun my/org-view-html ()
(interactive)
(let ((elem (org-element-at-point))
(temp-file-path (concat my/org-view-html-tmp-dir (number-to-string (random (expt 2 32))) ".html")))
(cond
((not (eq 'export-block (car elem)))
(message "Not in an export block!"))
((not (string-equal (plist-get (car (cdr elem)) :type) "HTML"))
(message "Export block is not HTML!"))
(t (progn
(f-mkdir my/org-view-html-tmp-dir)
(f-write (plist-get (car (cdr elem)) :value) 'utf-8 temp-file-path)
(start-process "org-html-preview" nil "xdg-open" temp-file-path))))))
f.el is used by a lot of packages, including the above-mentioned conda.el, so you probably already have it installed.
Put the cursor on an export block and run M-x my/org-view-html.
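The same idea also works from the Python side, which may be enough when the kernel runs on the same machine as the browser. The sketch below writes an HTML string to a temporary file and opens it with the standard webbrowser module. preview_html is a name invented for this example, and the HTML could be any string, for instance the result of an object's _repr_html_() method where one exists.

```python
import tempfile
import webbrowser
from pathlib import Path


def preview_html(html, open_browser=True):
    """Write HTML to a temporary file and optionally open it in a browser."""
    directory = Path(tempfile.gettempdir()) / "org-html-preview"
    directory.mkdir(parents=True, exist_ok=True)
    # delete=False keeps the file on disk so the browser can read it later.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".html", dir=directory, delete=False, encoding="utf-8"
    ) as handle:
        handle.write(html)
    path = Path(handle.name)
    if open_browser:
        webbrowser.open(path.as_uri())
    return path


page = preview_html("<h1>Hello from the kernel</h1>", open_browser=False)
print(page)
```

Like the Emacs Lisp version, it never cleans up the temporary files, so the directory has to be emptied by hand or by a reboot.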
There also seems to be widget support in emacs-jupyter, but I wasn't able to make it work.