#+HUGO_SECTION: posts #+HUGO_BASE_DIR: ../ #+TITLE: Running Gource with Emacs #+DATE: 2023-01-02 #+HUGO_TAGS: emacs #+HUGO_DRAFT: false [[./static/images/gource/gource.png]] [[https://gource.io/][Gource]] is a program that draws an animated graph of users changing the repository over time. Although it can work without extra effort (just run =gource= in a [[https://git-scm.com/][git]] repo), there are some tweaks that can be done: - Gource supports using custom pictures for users. [[https://en.gravatar.com/][Gravatar]] is an obvious place to get these. - Occasionally, the same people have different names and/or emails in history.\\ It may happen when people use forges like [[https://gitlab.com/][GitLab]] or just have different settings on different machines. It would be nice to merge these names. - Visualizing the history of multiple repositories (e.g. frontend and backend) requires combining multiple gource logs. So, why not try doing that with Emacs? * Gravatars Much to my surprise, Emacs turned out to have a built-in package called [[https://github.com/emacs-mirror/emacs/blob/master/lisp/image/gravatar.el][gravatar.el]]. So, let's make a function to retrieve a gravatar and save it: #+begin_src emacs-lisp (defun my/gravatar-retrieve-sync (email file-name) "Get gravatar for EMAIL and save it to FILE-NAME." (let ((gravatar-default-image "identicon") (gravatar-size nil) (coding-system-for-write 'binary) (write-region-annotate-functions nil) (write-region-post-annotation-function nil)) (write-region (image-property (gravatar-retrieve-synchronously email) :data) nil file-name nil :silent))) #+end_src To use these images, we need to save them to some folder and use usernames as file names. The folder: #+begin_src emacs-lisp (setq my/gravatar-folder "/home/pavel/.cache/gravatars/") #+end_src And the function that downloads a gravatar if necessary: #+begin_src emacs-lisp (defun my/gravatar-save (email author) "Download gravatar for EMAIL. AUTHOR is the username." (let ((file-name (concat my/gravatar-folder author ".png"))) (mkdir my/gravatar-folder t) (unless (file-exists-p file-name) (message "Fetching gravatar for %s (%s)" author email) (my/gravatar-retrieve-sync email file-name)))) #+end_src * Merging authors Now to merging authors. Gource itself uses only usernames (without emails), but we can use =git log= to get both. The required information can be extracted like that: #+begin_src bash git log --pretty=format:"%ae|%an" | sort | uniq -c | sed "s/^[ \t]*//;s/ /|/" #+end_src The output is a list of pipe-separated strings, where the values are: - Number of occurrences for this combination of username and email - Email - Username Of course, that part would have to be changed appropriately for other version control systems if you happen to use one. So, below is one hell of a function that wraps this command and tries to merge emails and usernames belonging to one author: #+begin_src emacs-lisp (defun my/git-get-authors (repo &optional authors-init) "Extract and merge all combinations of authors & emails from REPO. REPO is the path to a git repository. AUTHORS-INIT is the previous output of `my/git-get-authors'. It can be used to extract that information from multiple repositories. The output is a list of alists with following keys: - emails: list of ( . ) - authors: list of ( . ) - email: the most popular email - author: the most popular username I.e. one alist is all emails and usernames of one author." (let* ((default-directory repo) (data (shell-command-to-string "git log --pretty=format:\"%ae|%an\" | sort | uniq -c | sed \"s/^[ \t]*//;s/ /|/\"")) (authors (cl-loop for string in (split-string data "\n") if (= (length (split-string string "|")) 3) collect (let ((datum (split-string string "|"))) `((count . ,(string-to-number (nth 0 datum))) (email . ,(downcase (nth 1 datum))) (author . ,(nth 2 datum))))))) (mapcar (lambda (datum) (setf (alist-get 'author datum) (car (cl-reduce (lambda (acc author) (if (> (cdr author) (cdr acc)) author acc)) (alist-get 'authors datum) :initial-value '(nil . -1)))) (setf (alist-get 'email datum) (car (cl-reduce (lambda (acc email) (if (> (cdr email) (cdr acc)) email acc)) (alist-get 'emails datum) :initial-value '(nil . -1)))) datum) (cl-reduce (lambda (acc val) (let* ((author (alist-get 'author val)) (email (alist-get 'email val)) (count (alist-get 'count val)) (saved-value (seq-find (lambda (cand) (or (alist-get email (alist-get 'emails cand) nil nil #'string-equal) (alist-get author (alist-get 'authors cand) nil nil #'string-equal) (alist-get email (alist-get 'authors cand) nil nil #'string-equal) (alist-get author (alist-get 'emails cand) nil nil #'string-equal))) acc))) (if saved-value (progn (if (alist-get email (alist-get 'emails saved-value) nil nil #'string-equal) (cl-incf (alist-get email (alist-get 'emails saved-value) nil nil #'string-equal) count) (push (cons email count) (alist-get 'emails saved-value))) (if (alist-get author (alist-get 'authors saved-value) nil nil #'string-equal) (cl-incf (alist-get author (alist-get 'authors saved-value) nil nil #'string-equal) count) (push (cons author count) (alist-get 'authors saved-value)))) (setq saved-value (push `((emails . ((,email . ,count))) (authors . ((,author . ,count)))) acc))) acc)) authors :initial-value authors-init)))) #+end_src Despite the probable we-enjoy-typing-ness of the implementation, it's actually pretty simple: - The output of =git log= is parsed into a list of alists with =count=, =email= and =author= as keys. - This list is reduced by =cl-reduce= into a list of alists with =emails= and =authors= as keys and the respective counts as values, e.g. =(( . 1) ( . 3))=.\\ I've seen a couple of cases where people would swap their username and email (lol), so =seq-find= also looks for an email in the list of authors and vice versa. - The =mapcar= call determines the most popular email and username for each authors. The output is another list of alists, now with the following keys: - =emails= - list of elements like =( . )= - =authors= - list of elements like =( . )= - =email= - the most popular email - =author= - the most popular username. * Running for multiple repos This section was mostly informed by [[https://github.com/acaudwell/Gource/wiki/Visualizing-Multiple-Repositories][this page]] in the [[https://github.com/acaudwell/Gource/wiki][gource wiki]]. As I said above, by default =gource= just creates a visualization for the current repo. To change something in it, we need to invoke the program like that: =gource --output-custom-log PATH=, where =PATH= is either the path to the log file or =-= for stdout. The log consists of lines of pipe-separated strings, e.g.: #+begin_example 1600769568|dsofronov|A|/studentor/.dockerignore 1600769568|dsofronov|A|/studentor/.editorconfig 1600769568|dsofronov|A|/studentor/.flake8 1600769568|dsofronov|A|/studentor/.gitignore #+end_example where the values of one line are: - UNIX timestamp - Author name - =A= for add, =M= for modify, and =D= for delete - Path to file The file has to be sorted by the timestamp in ascending order. So, the function that prepares the log for one repository: #+begin_src emacs-lisp (defun my/gource-prepare-log (repo authors) "Create gource log string for REPO. AUTHORS is the output of `my/git-get-authors'." (let ((log (shell-command-to-string (concat "gource --output-custom-log - " repo))) (authors-mapping (make-hash-table :test #'equal)) (prefix (file-name-base repo))) (cl-loop for author-datum in authors for author = (alist-get 'author author-datum) do (my/gravatar-save (alist-get 'email author-datum) author) do (cl-loop for other-author in (alist-get 'authors author-datum) unless (string-equal (car other-author) author) do (puthash (car other-author) author authors-mapping))) (cl-loop for line in (split-string log "\n") concat (let ((fragments (split-string line "|"))) (when (> (length fragments) 3) (when-let (mapped-author (gethash (nth 1 fragments) authors-mapping)) (setf (nth 1 fragments) mapped-author)) (setf (nth 3 fragments) (concat "/" prefix (nth 3 fragments)))) (string-join fragments "|")) concat "\n"))) #+end_src This function: - Downloads a gravatar for each author - Replaces all usernames of one author with the most frequent one - Prepends the file path with the repository name. The output is a string in the gource log format as described above. Finally, as we need to invoke all of this for multiple repositories, why not do that with [[https://www.gnu.org/software/emacs/manual/html_node/emacs/Dired.html][dired]]: #+begin_src emacs-lisp (defun my/gource-dired-create-logs (repos log-name) "Create combined gource log for REPOS. REPOS is a list of strings, where a string is a path to a git repo. LOG-NAME is the path to the resulting log file. This function is meant to be invoked from `dired', where the required repositories are marked." (interactive (list (or (dired-get-marked-files nil nil #'file-directory-p) (user-error "Select at least one directory")) (read-file-name "Log file name: " nil "combined.log"))) (let ((authors (cl-reduce (lambda (acc repo) (my/git-get-authors repo acc)) repos :initial-value nil))) (with-temp-file log-name (insert (string-join (seq-filter (lambda (line) (not (string-empty-p line))) (seq-sort-by (lambda (line) (if-let (time (car (split-string line "|"))) (string-to-number time) 0)) #'< (split-string (mapconcat (lambda (repo) (my/gource-prepare-log repo authors)) repos "\n") "\n"))) "\n"))))) #+end_src This function extracts authors from each repository and merges the logs as required by gource, that is sorting the result by time in ascending order. * Using the function To use the function above, mark the required repos in a dired buffer and run =M-x my/gource-dired-create-logs=. This also works nicely with [[https://github.com/Fuco1/dired-hacks][dired-subtree]], in case your repos are located in different folders. The function will create a combined log file (by default =combined.log=). To visualize the log, run: #+begin_src bash gource --user-image-dir #+end_src Check the [[https://github.com/acaudwell/Gource][README]] for possible parameters, such as the speed of visualization, different elements, etc. That's it! I thought about making something like a [[https://github.com/magit/transient][transient.el]] wrapper around the =gource= command but figured it wasn't worth the effort for something that I run just a handful of times in a year.