mirror of
https://github.com/SqrtMinusOne/sqrtminusone.github.io.git
synced 2025-12-10 15:53:03 +03:00
deploy: 9b858d019a
parent
6c53072b13
commit
2609c4ea86
6 changed files with 318 additions and 239 deletions
|
|
@ -1374,7 +1374,11 @@ Emacs is also particularly great at writing Lisp code, e.g. Clojure, Common Lisp
|
|||
</span></span></code></pre></div><h4 id="consult">consult</h4>
|
||||
<p><a href="https://github.com/minad/consult">consult</a> provides various commands based on the <code>completing-read</code> API.</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">use-package</span> <span style="color:#19177c">consult</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:straight</span> <span style="color:#800">t</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:straight</span> <span style="color:#800">t</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:config</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq</span> <span style="color:#19177c">consult-preview-excluded-files</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#666">`</span>(<span style="color:#ba2121">"\\`/[^/|:]+:"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#666">,</span>(<span style="color:#008000">rx</span> <span style="color:#ba2121">"html"</span> <span style="color:#19177c">eos</span>))))
|
||||
</span></span></code></pre></div><h4 id="marginalia">marginalia</h4>
|
||||
<p><a href="https://github.com/minad/marginalia">marginalia</a> provides annotations in the completion interface.</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">use-package</span> <span style="color:#19177c">marginalia</span>
|
||||
|
|
@ -5660,6 +5664,10 @@ Emacs is also particularly great at writing Lisp code, e.g. Clojure, Common Lisp
|
|||
</span></span></code></pre></div><h4 id="bibliography">Bibliography</h4>
|
||||
<p>I use <a href="https://www.zotero.org/">Zotero</a> to manage my bibliography.</p>
|
||||
<p>There is a Zotero extension called <a href="https://retorque.re/zotero-better-bibtex/">Better BibTeX</a>, which can keep a single BibTeX file always synchronized with the library. That comes in quite handy for Emacs integration.</p>
|
||||
<p>Resources:</p>
|
||||
<ul>
|
||||
<li><a href="https://blog.tecosaur.com/tmio/2021-07-31-citations.html">Introducing citations!</a></li>
|
||||
</ul>
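<p>As a rough sketch of what that integration can look like, the exported file can be registered as the global Org bibliography. The path below is hypothetical; it assumes Better BibTeX is set to auto-export the library to that location.</p>
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp">;; Assumption: Better BibTeX auto-exports the library to this
;; (hypothetical) path; adjust it to wherever your export lives.
(setq org-cite-global-bibliography '("~/library.bib"))
</code></pre></div>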
|
||||
<h5 id="citar">citar</h5>
|
||||
<p><a href="https://github.com/emacs-citar/citar">citar</a> is a package that works with citations.</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">use-package</span> <span style="color:#19177c">citar</span>
|
||||
|
|
@ -5674,6 +5682,10 @@ Emacs is also particularly great at writing Lisp code, e.g. Clojure, Common Lisp
|
|||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">org-cite-follow-processor</span> <span style="color:#19177c">'citar</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">org-cite-activate-processor</span> <span style="color:#19177c">'citar</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">citar-bibliography</span> <span style="color:#19177c">org-cite-global-bibliography</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq</span> <span style="color:#19177c">org-cite-export-processors</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#666">'</span>((<span style="color:#19177c">latex</span> <span style="color:#19177c">bibtex</span> <span style="color:#ba2121">"numeric"</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq</span> <span style="color:#19177c">citar-library-paths</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#666">'</span>(<span style="color:#ba2121">"~/30-39 Life/33 Library/33.01 Documents/"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">add-hook</span> <span style="color:#19177c">'latex-mode-hook</span> <span style="color:#00f">#'</span><span style="color:#19177c">citar-capf-setup</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">add-hook</span> <span style="color:#19177c">'org-mode-hook</span> <span style="color:#00f">#'</span><span style="color:#19177c">citar-capf-setup</span>))
|
||||
</span></span><span style="display:flex;"><span>
|
||||
|
|
@ -5689,6 +5701,8 @@ Emacs is also particularly great at writing Lisp code, e.g. Clojure, Common Lisp
|
|||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">use-package</span> <span style="color:#19177c">org-ref</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:straight</span> (<span style="color:#008000">:files</span> (<span style="color:#008000">:defaults</span> <span style="color:#ba2121">"citeproc"</span> (<span style="color:#008000">:exclude</span> <span style="color:#ba2121">"*helm*"</span>)))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:if</span> (<span style="color:#19177c">not</span> <span style="color:#19177c">my/remote-server</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:commands</span> (<span style="color:#19177c">org-ref-insert-link-hydra/body</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">org-ref-bibtex-hydra/body</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:init</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq</span> <span style="color:#19177c">bibtex-dialect</span> <span style="color:#19177c">'biblatex</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">add-hook</span> <span style="color:#19177c">'bibtex-mode-hook</span> <span style="color:#19177c">'smartparens-mode</span>)
|
||||
|
|
@ -8000,220 +8014,6 @@ Didn’t work out as I expected, so I’ve made <code>org-journal-tags</
|
|||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq-local</span> <span style="color:#19177c">subed-mpv-video-file</span> (<span style="color:#19177c">elfeed-entry-link</span> <span style="color:#19177c">entry</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">subed-mpv--play</span> <span style="color:#19177c">subed-mpv-video-file</span>))
|
||||
</span></span></code></pre></div><p>Keep in mind that this function has to be launched inside the buffer opened by the <code>my/elfeed-youtube-subtitles</code> function.</p>
|
||||
<h4 id="podcast-transcripts">Podcast transcripts</h4>
|
||||
<p>In my experience, finding something in a podcast can be particularly troublesome. For instance, at times, I want to refer to a specific line in the podcast to make an <a href="https://github.com/org-roam/org-roam">org-roam</a> node, and I need to check if I got that part right. And I have no reasonable way to get there because audio files, in themselves, don’t allow for <a href="https://en.wikipedia.org/wiki/Random_access">random access</a>, i.e. there are no “landmarks” that point to a particular portion of the file. At least if nothing like a transcript is available.</p>
|
||||
<p>For obvious reasons, podcasts rarely ship with transcripts. So in this <del>post</del> section I’ll be using a speech recognition engine to make up for that. The general idea is to obtain the podcast information from <a href="https://github.com/skeeto/elfeed">elfeed</a>, process it with <a href="https://github.com/openai/whisper">OpenAI Whisper</a> and feed it to <a href="https://github.com/sachac/subed">subed</a> to control the playback in <a href="https://mpv.io/">MPV</a>.</p>
|
||||
<p>Edit <span class="timestamp-wrapper"><span class="timestamp"><2022-10-08 Sat></span></span>: Changed <a href="https://github.com/alphacep/vosk-api">vosk-api</a> to OpenAI Whisper.</p>
|
||||
<h5 id="whisper">Whisper</h5>
|
||||
<p><a href="https://github.com/openai/whisper">OpenAI Whisper</a> is an amazing speech recognition toolkit.</p>
|
||||
<p>The implementation by OpenAI is rather slow on my PC (speed around 0.75 on tiny.en), but <a href="https://github.com/ggerganov/whisper.cpp">whisper.cpp</a> by Georgi Gerganov works much faster (5.9x). I’ve packaged the latter for Guix.</p>
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Guix dependency</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td>whisper-cpp</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<h5 id="running-it-from-emacs">Running it from Emacs</h5>
|
||||
<p>Running the program from Emacs is rather straightforward with <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Asynchronous-Processes.html">asynchronous processes</a>.</p>
|
||||
<p>I’m using an English-language-only model because that’s the only language I need at the moment.</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/invoke-whisper--direct</span> (<span style="color:#19177c">input</span> <span style="color:#19177c">output-dir</span> <span style="color:#008000">&optional</span> <span style="color:#19177c">remove-wav</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"Extract subtitles from a WAV audio file.
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#ba2121">
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#ba2121">INPUT is the absolute path to audio file, OUTPUT-DIR is the path to
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#ba2121">the directory with resulting files."</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let*</span> ((<span style="color:#19177c">default-directory</span> <span style="color:#19177c">output-dir</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">buffer</span> (<span style="color:#19177c">generate-new-buffer</span> <span style="color:#ba2121">"whisper"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">proc</span> (<span style="color:#00f">start-process</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"whisper"</span> <span style="color:#19177c">buffer</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"whisper-cpp"</span> <span style="color:#ba2121">"--model"</span> <span style="color:#ba2121">"/home/pavel/.whisper/ggml-medium.bin"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"-otxt"</span> <span style="color:#ba2121">"-ovtt"</span> <span style="color:#ba2121">"-osrt"</span> <span style="color:#ba2121">"-l"</span> <span style="color:#ba2121">"auto"</span> <span style="color:#19177c">input</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">set-process-sentinel</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">proc</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#19177c">process</span> <span style="color:#19177c">_msg</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">status</span> (<span style="color:#00f">process-status</span> <span style="color:#19177c">process</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">code</span> (<span style="color:#00f">process-exit-status</span> <span style="color:#19177c">process</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">cond</span> ((<span style="color:#008000">and</span> (<span style="color:#00f">eq</span> <span style="color:#19177c">status</span> <span style="color:#19177c">'exit</span>) (<span style="color:#00f">=</span> <span style="color:#19177c">code</span> <span style="color:#666">0</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">notifications-notify</span> <span style="color:#008000">:body</span> <span style="color:#ba2121">"Audio conversion completed"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:title</span> <span style="color:#ba2121">"Whisper"</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">when</span> <span style="color:#19177c">remove-wav</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">delete-file</span> <span style="color:#19177c">input</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">dolist</span> (<span style="color:#19177c">extension</span> <span style="color:#666">'</span>(<span style="color:#ba2121">".txt"</span> <span style="color:#ba2121">".vtt"</span> <span style="color:#ba2121">".srt"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">rename-file</span> (<span style="color:#00f">concat</span> <span style="color:#19177c">input</span> <span style="color:#19177c">extension</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span> (<span style="color:#19177c">file-name-sans-extension</span> <span style="color:#19177c">input</span>) <span style="color:#19177c">extension</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">kill-buffer</span> (<span style="color:#00f">process-buffer</span> <span style="color:#19177c">process</span>)))
|
||||
</span></span><span style="display:flex;"><span> ((<span style="color:#008000">or</span> (<span style="color:#008000">and</span> (<span style="color:#00f">eq</span> <span style="color:#19177c">status</span> <span style="color:#19177c">'exit</span>) (<span style="color:#00f">></span> <span style="color:#19177c">code</span> <span style="color:#666">0</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">eq</span> <span style="color:#19177c">status</span> <span style="color:#19177c">'signal</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">err</span> (<span style="color:#008000">with-current-buffer</span> (<span style="color:#00f">process-buffer</span> <span style="color:#19177c">process</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">buffer-string</span>))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#d2413a;font-weight:bold">user-error</span> <span style="color:#ba2121">"Error in Whisper: %s"</span> <span style="color:#19177c">err</span>)))))))))
|
||||
</span></span><span style="display:flex;"><span>
|
||||
</span></span><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/invoke-whisper</span> (<span style="color:#19177c">input</span> <span style="color:#19177c">output-dir</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"Extract subtitles from the audio file.
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#ba2121">
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#ba2121">INPUT is the absolute path to the audio file, OUTPUT-DIR is the path
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#ba2121">to the directory with resulting files.
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#ba2121">
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#ba2121">Run ffmpeg if the file is not WAV."</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">interactive</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">list</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">read-file-name</span> <span style="color:#ba2121">"Input file: "</span> <span style="color:#800">nil</span> <span style="color:#800">nil</span> <span style="color:#800">t</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">read-directory-name</span> <span style="color:#ba2121">"Output directory: "</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">if</span> (<span style="color:#19177c">string-match-p</span> (<span style="color:#008000">rx</span> <span style="color:#ba2121">".wav"</span> <span style="color:#19177c">eos</span>) <span style="color:#19177c">input</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/invoke-whisper--direct</span> <span style="color:#19177c">input</span> <span style="color:#19177c">output-dir</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let*</span> ((<span style="color:#19177c">ffmpeg-proc</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">start-process</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"ffmpeg"</span> <span style="color:#800">nil</span> <span style="color:#ba2121">"ffmpeg"</span> <span style="color:#ba2121">"-i"</span> <span style="color:#19177c">input</span> <span style="color:#ba2121">"-ar"</span> <span style="color:#ba2121">"16000"</span> <span style="color:#ba2121">"-ac"</span> <span style="color:#ba2121">"1"</span> <span style="color:#ba2121">"-c:a"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"pcm_s16le"</span> (<span style="color:#00f">concat</span> (<span style="color:#19177c">file-name-sans-extension</span> <span style="color:#19177c">input</span>) <span style="color:#ba2121">".wav"</span>))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">set-process-sentinel</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">ffmpeg-proc</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#19177c">process</span> <span style="color:#19177c">_msg</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">status</span> (<span style="color:#00f">process-status</span> <span style="color:#19177c">process</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">code</span> (<span style="color:#00f">process-exit-status</span> <span style="color:#19177c">process</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">cond</span> ((<span style="color:#008000">and</span> (<span style="color:#00f">eq</span> <span style="color:#19177c">status</span> <span style="color:#19177c">'exit</span>) (<span style="color:#00f">=</span> <span style="color:#19177c">code</span> <span style="color:#666">0</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/invoke-whisper--direct</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span> (<span style="color:#19177c">file-name-sans-extension</span> <span style="color:#19177c">input</span>) <span style="color:#ba2121">".wav"</span>) <span style="color:#19177c">output-dir</span> <span style="color:#800">t</span>))
|
||||
</span></span><span style="display:flex;"><span> ((<span style="color:#008000">or</span> (<span style="color:#008000">and</span> (<span style="color:#00f">eq</span> <span style="color:#19177c">status</span> <span style="color:#19177c">'exit</span>) (<span style="color:#00f">></span> <span style="color:#19177c">code</span> <span style="color:#666">0</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">eq</span> <span style="color:#19177c">status</span> <span style="color:#19177c">'signal</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">err</span> (<span style="color:#008000">with-current-buffer</span> (<span style="color:#00f">process-buffer</span> <span style="color:#19177c">process</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">buffer-string</span>))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#d2413a;font-weight:bold">user-error</span> <span style="color:#ba2121">"Error in running ffmpeg: %s"</span> <span style="color:#19177c">err</span>))))))))))
|
||||
</span></span></code></pre></div><p>If run interactively, the function prompts for the input file and the output directory.</p>
|
||||
<p>The process sentinel sends a <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Desktop-Notifications.html">desktop notification</a> because it’s a bit more noticeable than <code>message</code>, and the process is expected to take some time.</p>
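<p>For reference, the sentinel pattern used above can be reduced to a minimal sketch. <code>my/run-and-notify</code> is a hypothetical helper, not part of this config, and it reports through <code>message</code> rather than a desktop notification to stay self-contained.</p>
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp">;; Hypothetical helper: run a command asynchronously and react when it ends.
(defun my/run-and-notify (name command)
  "Run COMMAND (a list of program and arguments) as process NAME.
Report the outcome via `message' once the process finishes."
  (make-process
   :name name
   :buffer (generate-new-buffer (format "*%s*" name))
   :command command
   :sentinel
   (lambda (process _event)
     ;; The sentinel also fires on non-final status changes,
     ;; so act only once the process has exited or been killed.
     (when (memq (process-status process) '(exit signal))
       (if (zerop (process-exit-status process))
           (message "%s finished" (process-name process))
         (message "%s failed with code %d"
                  (process-name process)
                  (process-exit-status process)))))))

;; Usage: (my/run-and-notify "sleep-demo" '("sleep" "1"))
</code></pre></div>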
|
||||
<h5 id="integrating-with-elfeed">Integrating with elfeed</h5>
|
||||
<p>To actually run the function from the section above, we need to download the file in question.</p>
|
||||
<p>The <code>whisper</code> executable, given the file <code>&lt;file&gt;.&lt;extension&gt;</code>, creates files named <code>&lt;file&gt;.vtt</code>, <code>&lt;file&gt;.srt</code>, <code>&lt;file&gt;.txt</code>. So first we need to save the file under the correct name.</p>
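<p>To illustrate that naming scheme, here is a small sketch that computes the expected result paths for a given input file. <code>my/whisper-expected-outputs</code> is a hypothetical helper, not something the config defines.</p>
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp">;; Hypothetical helper: list the files expected next to INPUT,
;; following the naming scheme described above.
(defun my/whisper-expected-outputs (input)
  "Return the .vtt, .srt and .txt paths derived from INPUT."
  (let ((base (file-name-sans-extension input)))
    (mapcar (lambda (extension) (concat base extension))
            '(".vtt" ".srt" ".txt"))))

;; (my/whisper-expected-outputs "/tmp/episode.mp3")
;; → ("/tmp/episode.vtt" "/tmp/episode.srt" "/tmp/episode.txt")
</code></pre></div>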
|
||||
<p>I use a library called <a href="https://github.com/tkf/emacs-request">request.el</a> to download files elsewhere, so I’ll re-use it here. You can just as well invoke <code>curl</code> or <code>wget</code> via an asynchronous process.</p>
|
||||
<p>This function downloads the file to a non-temporary folder, which is <code>~/.elfeed/podcast-files/</code> if you didn’t move the elfeed database. That is so because a permanently downloaded file works better for the next section.</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">with-eval-after-load</span> <span style="color:#19177c">'elfeed</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">defvar</span> <span style="color:#19177c">my/elfeed-whisper-podcast-files-directory</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span> <span style="color:#19177c">elfeed-db-directory</span> <span style="color:#ba2121">"/podcast-files/"</span>)))
|
||||
</span></span><span style="display:flex;"><span>
|
||||
</span></span><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/elfeed-whisper-get-transcript-new</span> (<span style="color:#19177c">entry</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">interactive</span> (<span style="color:#00f">list</span> <span style="color:#19177c">elfeed-show-entry</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let*</span> ((<span style="color:#19177c">url</span> (<span style="color:#19177c">caar</span> (<span style="color:#19177c">elfeed-entry-enclosures</span> <span style="color:#19177c">entry</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">file-name</span> (<span style="color:#00f">concat</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">elfeed-ref-id</span> (<span style="color:#19177c">elfeed-entry-content</span> <span style="color:#19177c">entry</span>))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"."</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">file-name-extension</span> <span style="color:#19177c">url</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">file-path</span> (<span style="color:#00f">expand-file-name</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">my/elfeed-whisper-podcast-files-directory</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">file-name</span>))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">message</span> <span style="color:#ba2121">"Download started"</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">unless</span> (<span style="color:#00f">file-exists-p</span> <span style="color:#19177c">my/elfeed-whisper-podcast-files-directory</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">mkdir</span> <span style="color:#19177c">my/elfeed-whisper-podcast-files-directory</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">request</span> <span style="color:#19177c">url</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:type</span> <span style="color:#ba2121">"GET"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:encoding</span> <span style="color:#19177c">'binary</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:complete</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">cl-function</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#008000">&key</span> <span style="color:#19177c">data</span> <span style="color:#008000">&allow-other-keys</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">coding-system-for-write</span> <span style="color:#19177c">'binary</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">write-region-annotate-functions</span> <span style="color:#800">nil</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">write-region-post-annotation-function</span> <span style="color:#800">nil</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">write-region</span> <span style="color:#19177c">data</span> <span style="color:#800">nil</span> <span style="color:#19177c">file-path</span> <span style="color:#800">nil</span> <span style="color:#008000">:silent</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">message</span> <span style="color:#ba2121">"Conversion started"</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/invoke-whisper</span> <span style="color:#19177c">file-path</span> <span style="color:#19177c">my/elfeed-srt-dir</span>)))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:error</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">cl-function</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#008000">&key</span> <span style="color:#19177c">error-thrown</span> <span style="color:#008000">&allow-other-keys</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">message</span> <span style="color:#ba2121">"Error!: %S"</span> <span style="color:#19177c">error-thrown</span>))))))
|
||||
</span></span></code></pre></div><p>I also experimented with a bunch of options to write binary data in Emacs, of which the way with <code>write-region</code> (as implemented in <a href="https://github.com/rejeep/f.el">f.el</a>) seems to be the fastest. <a href="https://emacs.stackexchange.com/questions/59449/how-do-i-save-raw-bytes-into-a-file">This thread on StackExchange</a> suggests that it may screw some bytes towards the end, but whether or not this is the case, mp3 files survive the procedure. The proposed solution with <code>seq-doseq</code> takes at least a few seconds.</p>
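<p>Pulled out of the download callback, the <code>write-region</code> approach looks roughly like this. <code>my/write-bytes</code> is a hypothetical name used only for this sketch, and it assumes the data is a unibyte string.</p>
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp">;; Hypothetical helper mirroring the `write-region' approach described above.
(defun my/write-bytes (data path)
  "Write the unibyte string DATA to PATH without any encoding conversion."
  (let ((coding-system-for-write 'binary)
        ;; Disable annotation hooks so nothing rewrites the data on the way out.
        (write-region-annotate-functions nil)
        (write-region-post-annotation-function nil))
    ;; A string as the first argument is written as-is; the non-nil,
    ;; non-t VISIT argument suppresses the "Wrote file" message.
    (write-region data nil path nil :silent)))

;; Usage: write three raw bytes to a temporary file.
;; (my/write-bytes (unibyte-string 255 0 127) (make-temp-file "bytes"))
</code></pre></div>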
|
||||
<p>As <code>my/invoke-whisper</code> creates multiple files, here’s a function to select related files:</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/elfeed-show-related-files</span> (<span style="color:#19177c">entry</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">interactive</span> (<span style="color:#00f">list</span> <span style="color:#19177c">elfeed-show-entry</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let*</span> ((<span style="color:#19177c">files</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">mapcar</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#19177c">file</span>) (<span style="color:#00f">cons</span> (<span style="color:#19177c">file-name-extension</span> <span style="color:#19177c">file</span>) <span style="color:#19177c">file</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">seq-filter</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#19177c">file</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">string-match-p</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">rx</span> <span style="color:#19177c">bos</span> (<span style="color:#19177c">literal</span> (<span style="color:#19177c">elfeed-ref-id</span> (<span style="color:#19177c">elfeed-entry-content</span> <span style="color:#19177c">entry</span>))) <span style="color:#ba2121">"."</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">file</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">directory-files</span> <span style="color:#19177c">my/elfeed-srt-dir</span>))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">buffer</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">find-file-other-window</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">my/elfeed-srt-dir</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">alist-get</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">completing-read</span> <span style="color:#ba2121">"File: "</span> <span style="color:#19177c">files</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">files</span> <span style="color:#800">nil</span> <span style="color:#800">nil</span> <span style="color:#00f">#'equal</span>)))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">with-current-buffer</span> <span style="color:#19177c">buffer</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq-local</span> <span style="color:#19177c">elfeed-show-entry</span> <span style="color:#19177c">entry</span>))))
|
||||
</span></span></code></pre></div><p>Finally, we need a function to show the transcript if it exists or invoke <code>my/elfeed-whisper-get-transcript-new</code> if it doesn’t. And this is the function that we’ll call from an <code>elfeed-entry</code> buffer.</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/elfeed-whisper-get-transcript</span> (<span style="color:#19177c">entry</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"Retrieve transcript for the enclosure of the current elfeed ENTRY."</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">interactive</span> (<span style="color:#00f">list</span> <span style="color:#19177c">elfeed-show-entry</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">enclosure</span> (<span style="color:#19177c">caar</span> (<span style="color:#19177c">elfeed-entry-enclosures</span> <span style="color:#19177c">entry</span>))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">unless</span> <span style="color:#19177c">enclosure</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#d2413a;font-weight:bold">user-error</span> <span style="color:#ba2121">"No enclosure found!"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">srt-path</span> (<span style="color:#00f">concat</span> <span style="color:#19177c">my/elfeed-srt-dir</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">elfeed-ref-id</span> (<span style="color:#19177c">elfeed-entry-content</span> <span style="color:#19177c">entry</span>))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">".srt"</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">if</span> (<span style="color:#00f">file-exists-p</span> <span style="color:#19177c">srt-path</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">buffer</span> (<span style="color:#19177c">find-file-other-window</span> <span style="color:#19177c">srt-path</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">with-current-buffer</span> <span style="color:#19177c">buffer</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq-local</span> <span style="color:#19177c">elfeed-show-entry</span> <span style="color:#19177c">entry</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/elfeed-whisper-get-transcript-new</span> <span style="color:#19177c">entry</span>)))))
|
||||
</span></span></code></pre></div><h5 id="integrating-with-subed">Integrating with subed</h5>
|
||||
<p>Now that we’ve produced a <code>.srt</code> file, we can use a package called <a href="https://github.com/sachac/subed">subed</a> to control the playback, as I had done in the previous post.</p>
|
||||
<p>By the way, this wasn’t the most straightforward thing to figure out, because the MPV window doesn’t show up for an audio file, and the player itself starts in the paused state. So I thought nothing was happening until I enabled the debug log.</p>
|
||||
<p>With that in mind, here’s a function to launch MPV from the buffer generated by <code>my/elfeed-whisper-get-transcript</code>:</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/elfeed-whisper-subed</span> (<span style="color:#19177c">entry</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"Run MPV for the current Whisper-generated subtitles file.
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#ba2121">
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#ba2121">ENTRY is an instance of </span><span style="color:#19177c">`elfeed-entry'</span><span style="color:#ba2121">."</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">interactive</span> (<span style="color:#00f">list</span> <span style="color:#19177c">elfeed-show-entry</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">unless</span> <span style="color:#19177c">entry</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#d2413a;font-weight:bold">user-error</span> <span style="color:#ba2121">"No entry!"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">unless</span> (<span style="color:#19177c">derived-mode-p</span> <span style="color:#19177c">'subed-mode</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#d2413a;font-weight:bold">user-error</span> <span style="color:#ba2121">"Not subed mode!"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq-local</span> <span style="color:#19177c">subed-mpv-video-file</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">expand-file-name</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span> <span style="color:#19177c">my/elfeed-whisper-podcast-files-directory</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/get-file-name-from-url</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">caar</span> (<span style="color:#19177c">elfeed-entry-enclosures</span> <span style="color:#19177c">entry</span>))))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">subed-mpv--play</span> <span style="color:#19177c">subed-mpv-video-file</span>))
|
||||
</span></span></code></pre></div><p>After running <code>M-x my/elfeed-whisper-subed</code>, run <code>M-x subed-toggle-loop-over-current-subtitle</code> (<code>C-c C-l</code>), because somehow it’s turned on by default, and <code>M-x subed-toggle-pause-while-typing</code> (<code>C-c C-p</code>), because sometimes this made my instance of MPV lag.</p>
|
||||
<p>After that, <code>M-x subed-mpv-toggle-pause</code> should start the playback, which you can control by moving the cursor in the buffer.</p>
|
||||
<p>You can also run <code>M-x subed-toggle-sync-point-to-player</code> (<code>C-c .</code>) to toggle syncing the point in the buffer to the currently played subtitle (this automatically gets disabled when you switch buffers).</p>
|
||||
<p>Running <code>M-x subed-toggle-sync-player-to-point</code> (<code>C-c ,</code>) does the opposite, i.e. sets the player position to the subtitle under point. These two functions are useful since the MPV window controls aren’t available.</p>
|
||||
<h5 id="running-it-for-random-files">Running it for random files</h5>
|
||||
<p>Apparently, I also need to run Whisper on random files from the Internet.</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/whisper-url</span> (<span style="color:#19177c">url</span> <span style="color:#19177c">file-name</span> <span style="color:#19177c">output-dir</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">interactive</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">list</span> (<span style="color:#00f">read-from-minibuffer</span> <span style="color:#ba2121">"URL: "</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">read-from-minibuffer</span> <span style="color:#ba2121">"File name: "</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">read-directory-name</span> <span style="color:#ba2121">"Output directory: "</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">file-path</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span> <span style="color:#19177c">output-dir</span> <span style="color:#19177c">file-name</span> <span style="color:#ba2121">"."</span> (<span style="color:#19177c">file-name-extension</span> <span style="color:#19177c">url</span>))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">message</span> <span style="color:#ba2121">"Download started"</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">request</span> <span style="color:#19177c">url</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:type</span> <span style="color:#ba2121">"GET"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:encoding</span> <span style="color:#19177c">'binary</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:complete</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">cl-function</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#008000">&key</span> <span style="color:#19177c">data</span> <span style="color:#008000">&allow-other-keys</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">coding-system-for-write</span> <span style="color:#19177c">'binary</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">write-region-annotate-functions</span> <span style="color:#800">nil</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">write-region-post-annotation-function</span> <span style="color:#800">nil</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">write-region</span> <span style="color:#19177c">data</span> <span style="color:#800">nil</span> <span style="color:#19177c">file-path</span> <span style="color:#800">nil</span> <span style="color:#008000">:silent</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">message</span> <span style="color:#ba2121">"Conversion started"</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/invoke-whisper</span> <span style="color:#19177c">file-path</span> <span style="color:#19177c">output-dir</span>)))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:error</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">cl-function</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#008000">&key</span> <span style="color:#19177c">error-thrown</span> <span style="color:#008000">&allow-other-keys</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">message</span> <span style="color:#ba2121">"Error!: %S"</span> <span style="color:#19177c">error-thrown</span>))))))
|
||||
</span></span></code></pre></div><h5 id="some-observations">Some observations</h5>
|
||||
<p>So, the functions above work for my purposes.</p>
|
||||
<p>Vosk API works much faster than Whisper. The smallest Vosk model takes roughly a tenth of the playback time, whereas even the <code>tiny.en</code> Whisper model on my PC takes maybe 1.2x the playback time.</p>
|
||||
<p>However, Whisper’s output quality is so much better that I consider it worth the wait. Even with the <code>tiny</code> model, the transcript is almost perfect, provided that the audio is of reasonable quality.</p>
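<p>To put these ratios in perspective, here’s a rough back-of-the-envelope sketch. The function name is made up for illustration, and the factors are just the approximate figures mentioned above, not benchmarks.</p>
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp">;; Hypothetical helper, not part of the config.
(defun my/transcription-time-estimate (playback-minutes factor)
  "Estimate the transcription time in minutes for PLAYBACK-MINUTES of audio.
FACTOR is the processing time divided by the playback time."
  (* playback-minutes factor))

;; For a one-hour episode:
(my/transcription-time-estimate 60 0.1) ; smallest Vosk model: about 6 minutes
(my/transcription-time-estimate 60 1.2) ; Whisper tiny.en: about 72 minutes
</code></pre></div>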
|
||||
<h3 id="internet-and-multimedia">Internet & Multimedia</h3>
|
||||
<h4 id="notmuch">Notmuch</h4>
|
||||
<p>My notmuch config now resides in <a href="/configs/mail/">Mail.org</a>.</p>
|
||||
|
|
@ -9572,10 +9372,12 @@ Didn’t work out as I expected, so I’ve made <code>org-journal-tags</
|
|||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">sx-question-mode-content</span> <span style="color:#008000">:background</span> <span style="color:#800">nil</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">add-hook</span> <span style="color:#19177c">'sx-question-mode-hook</span> <span style="color:#00f">#'</span><span style="color:#19177c">doom-modeline-mode</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">add-hook</span> <span style="color:#19177c">'sx-question-list-mode-hook</span> <span style="color:#00f">#'</span><span style="color:#19177c">doom-modeline-mode</span>))
|
||||
</span></span></code></pre></div><h3 id="llm">LLM</h3>
|
||||
<p>Trying out LLM integrations.</p>
|
||||
<p>I don’t have access to any proprietary APIs, but LLaMA 3 8b with <a href="https://ollama.com/">ollama</a> works for some purposes.</p>
|
||||
<h4 id="gptel">gptel</h4>
|
||||
</span></span></code></pre></div><h3 id="not-an-ai">Not-an-AI</h3>
|
||||
<p>Workflows that are sometimes referred to as “AI” go in here.</p>
|
||||
<p>I’m technically writing a PhD on a related topic, so I’m a bit more receptive towards the whole thing than most of the community. But I’m still not calling it AI.</p>
|
||||
<h4 id="llms">LLMs</h4>
|
||||
<p>I don’t have access to any proprietary APIs, but LLaMA 3.1 8b with <a href="https://ollama.com/">ollama</a> works for some purposes.</p>
|
||||
<h5 id="gptel">gptel</h5>
|
||||
<p><a href="https://github.com/karthink/gptel">gptel</a> is a package that provides an interface to chat with LLMs.</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">use-package</span> <span style="color:#19177c">gptel</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:straight</span> <span style="color:#800">t</span>
|
||||
|
|
@ -9591,10 +9393,9 @@ Didn’t work out as I expected, so I’ve made <code>org-journal-tags</
|
|||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq</span> <span style="color:#19177c">gptel-backend</span> (<span style="color:#19177c">gptel-make-ollama</span> <span style="color:#ba2121">"Ollama"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:host</span> <span style="color:#ba2121">"localhost:11434"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:stream</span> <span style="color:#800">t</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:models</span> <span style="color:#666">'</span>(<span style="color:#ba2121">"llama3:latest"</span> <span style="color:#ba2121">"llama3-gradient"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"llama3:instruct"</span>)))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:models</span> <span style="color:#666">'</span>(<span style="color:#ba2121">"llama3.1:latest"</span> <span style="color:#ba2121">"llama3.1:instruct"</span>)))
|
||||
</span></span><span style="display:flex;"><span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/gptel-switch-backend</span> <span style="color:#ba2121">"llama3:latest"</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#408080;font-style:italic">;; (my/gptel-switch-backend "llama3.1:latest")</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">general-define-key</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:keymaps</span> <span style="color:#666">'</span>(<span style="color:#19177c">gptel-mode-map</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:states</span> <span style="color:#666">'</span>(<span style="color:#00f">insert</span> <span style="color:#19177c">normal</span>)
|
||||
|
|
@ -9606,7 +9407,7 @@ Didn’t work out as I expected, so I’ve made <code>org-journal-tags</
|
|||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">gptel-make-gemini</span> <span style="color:#ba2121">"Gemini"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:key</span> (<span style="color:#19177c">my/password-store-get-field</span> <span style="color:#ba2121">"My_Online/Accounts/google-gemini"</span> <span style="color:#ba2121">"api"</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:stream</span> <span style="color:#800">t</span>))
|
||||
</span></span></code></pre></div><h4 id="ellama">ellama</h4>
|
||||
</span></span></code></pre></div><h5 id="ellama">ellama</h5>
|
||||
<p><a href="https://github.com/s-kostyaev/ellama">ellama</a> provides commands that feed things from Emacs buffers into LLMs with various prompts.</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">use-package</span> <span style="color:#19177c">ellama</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:straight</span> <span style="color:#800">t</span>
|
||||
|
|
@ -9621,15 +9422,15 @@ Didn’t work out as I expected, so I’ve made <code>org-journal-tags</
|
|||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"aie"</span> <span style="color:#666">'</span>(<span style="color:#008000">:wk</span> <span style="color:#ba2121">"ellama"</span> <span style="color:#008000">:keymap</span> <span style="color:#19177c">ellama-command-map</span>))
|
||||
</span></span><span style="display:flex;"><span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq</span> <span style="color:#19177c">ellama-provider</span> (<span style="color:#19177c">make-llm-ollama</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:chat-model</span> <span style="color:#ba2121">"llama3:instruct"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:embedding-model</span> <span style="color:#ba2121">"llama3:instruct"</span>))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:chat-model</span> <span style="color:#ba2121">"llama3.1:instruct"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:embedding-model</span> <span style="color:#ba2121">"llama3.1:instruct"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq</span> <span style="color:#19177c">ellama-providers</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#666">`</span>((<span style="color:#ba2121">"llama3:8b"</span> <span style="color:#666">.</span> <span style="color:#666">,</span>(<span style="color:#19177c">make-llm-ollama</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:chat-model</span> <span style="color:#ba2121">"llama3:latest"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:embedding-model</span> <span style="color:#ba2121">"llama3:latest"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#ba2121">"llama3:instruct"</span> <span style="color:#666">.</span> <span style="color:#666">,</span>(<span style="color:#19177c">make-llm-ollama</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:chat-model</span> <span style="color:#ba2121">"llama3:instruct"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:embedding-model</span> <span style="color:#ba2121">"llama3:instruct"</span>)))))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#666">`</span>((<span style="color:#ba2121">"llama3.1:8b"</span> <span style="color:#666">.</span> <span style="color:#666">,</span>(<span style="color:#19177c">make-llm-ollama</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:chat-model</span> <span style="color:#ba2121">"llama3.1:latest"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:embedding-model</span> <span style="color:#ba2121">"llama3.1:latest"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#ba2121">"llama3.1:instruct"</span> <span style="color:#666">.</span> <span style="color:#666">,</span>(<span style="color:#19177c">make-llm-ollama</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:chat-model</span> <span style="color:#ba2121">"llama3.1:instruct"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:embedding-model</span> <span style="color:#ba2121">"llama3.1:instruct"</span>)))))
|
||||
</span></span></code></pre></div><p>The keybindings are a bit crazy to use even with <code>which-key</code>, so here goes transient.el.</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">with-eval-after-load</span> <span style="color:#19177c">'ellama</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">transient-define-prefix</span> <span style="color:#19177c">my/ellama-transient</span> ()
|
||||
|
|
@ -9745,10 +9546,289 @@ Didn’t work out as I expected, so I’ve made <code>org-journal-tags</
|
|||
</span></span><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/ellama-improve-concise</span> (<span style="color:#19177c">text</span> <span style="color:#19177c">is-org-mode</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">interactive</span> (<span style="color:#00f">list</span> (<span style="color:#19177c">my/ellama--text</span>) (<span style="color:#19177c">derived-mode-p</span> <span style="color:#19177c">'org-mode</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/ellama-text-with-diff</span> <span style="color:#19177c">text</span> <span style="color:#19177c">is-org-mode</span> <span style="color:#19177c">my/ellama-improve-concise-prompt</span>))
|
||||
</span></span></code></pre></div><h5 id="other-thoughts">Other thoughts</h5>
|
||||
</span></span></code></pre></div><h4 id="podcast-transcripts">Podcast transcripts</h4>
|
||||
<p>In my experience, finding something in a podcast can be particularly troublesome. For instance, at times, I want to refer to a specific line in the podcast to make an <a href="https://github.com/org-roam/org-roam">org-roam</a> node, and I need to check if I got that part right. And I have no reasonable way to get there because audio files, in themselves, don’t allow for <a href="https://en.wikipedia.org/wiki/Random_access">random access</a>, i.e. there are no “landmarks” that point to a particular portion of the file. At least if nothing like a transcript is available.</p>
|
||||
<p>For obvious reasons, podcasts rarely ship with transcripts. So in this <del>post</del> section I’ll be using a speech recognition engine to make up for that. The general idea is to obtain the podcast information from <a href="https://github.com/skeeto/elfeed">elfeed</a>, process it with <a href="https://github.com/openai/whisper">OpenAI Whisper</a> and feed it to <a href="https://github.com/sachac/subed">subed</a> to control the playback in <a href="https://mpv.io/">MPV</a>.</p>
|
||||
<p>Edit <span class="timestamp-wrapper"><span class="timestamp"><2022-10-08 Sat></span></span>: Changed <a href="https://github.com/alphacep/vosk-api">vosk-api</a> to OpenAI Whisper.</p>
|
||||
<p>Edit <span class="timestamp-wrapper"><span class="timestamp"><2024-11-10 Sun></span></span>: Moved from elfeed to Not-an-AI, reworked to use <a href="https://github.com/Vaibhavs10/insanely-fast-whisper">insanely-fast-whisper</a>.</p>
|
||||
<h5 id="whisper">Whisper</h5>
|
||||
<p><a href="https://github.com/openai/whisper">OpenAI Whisper</a> is an amazing speech recognition toolkit.</p>
|
||||
<p>I previously used <a href="https://github.com/ggerganov/whisper.cpp">whisper.cpp</a> by Georgi Gerganov, but have switched to <a href="https://github.com/Vaibhavs10/insanely-fast-whisper">insanely-fast-whisper</a> since it’s easier to run on GPU, it doesn’t require converting everything to WAV, and it includes speaker diarization capabilities.</p>
|
||||
<p>One disadvantage is that it doesn’t produce human-readable output by default, so I make my own.</p>
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Guix dependency</th>
|
||||
<th>Disabled</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td>whisper-cpp</td>
|
||||
<td>t</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<h5 id="running-it-from-emacs">Running it from Emacs</h5>
|
||||
<p>First, some functions to process the output. These take the JSON produced by <code>insanely-fast-whisper</code> and create a set of files:</p>
|
||||
<ul>
|
||||
<li><code>ellama-code-complete</code> is pretty good to write migrations</li>
|
||||
<li>a TXT file with the full text;</li>
|
||||
<li>a VTT file;</li>
|
||||
<li>if speaker info is available:
|
||||
<ul>
|
||||
<li>a TXT file with speaker tags;</li>
|
||||
<li>a VTT file with speaker tags.</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<!--listend-->
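<p>The functions below walk over this JSON after it has been parsed into alists and vectors. As a minimal sketch, assuming the shape implied by that code (a <code>chunks</code> array whose items carry a two-element <code>timestamp</code> and a <code>text</code>), this is roughly what the parsed data looks like. The sample string and the <code>my/whisper--sample-*</code> names are made up for illustration; the real file comes from <code>insanely-fast-whisper</code> and may contain more fields.</p>
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp">(require 'cl-lib)
(require 'json)
(require 'subr-x)

;; Hypothetical sample input, only for illustration.
(defvar my/whisper--sample-json
  "{\"text\": \" Hello there. Hi.\",
    \"chunks\": [{\"timestamp\": [0.0, 1.5], \"text\": \" Hello there.\"},
                 {\"timestamp\": [1.5, 2.25], \"text\": \" Hi.\"}]}")

(defun my/whisper--sample-chunks (json-string)
  "Return a list of (START END TEXT) for each chunk in JSON-STRING."
  ;; With the default settings, `json-read-from-string' turns objects
  ;; into alists with symbol keys and arrays into vectors.
  (let ((data (json-read-from-string json-string)))
    (cl-loop for chunk across (alist-get 'chunks data)
             for timestamp = (alist-get 'timestamp chunk)
             collect (list (aref timestamp 0)
                           (aref timestamp 1)
                           (string-trim (alist-get 'text chunk))))))

(my/whisper--sample-chunks my/whisper--sample-json)
;; => ((0.0 1.5 "Hello there.") (1.5 2.25 "Hi."))
</code></pre></div>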
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/whisper--format-vtt-seconds</span> (<span style="color:#19177c">seconds</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let*</span> ((<span style="color:#19177c">hours</span> (<span style="color:#00f">/</span> (<span style="color:#00f">floor</span> <span style="color:#19177c">seconds</span>) (<span style="color:#00f">*</span> <span style="color:#666">60</span> <span style="color:#666">60</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">minutes</span> (<span style="color:#00f">/</span> (<span style="color:#00f">-</span> (<span style="color:#00f">floor</span> <span style="color:#19177c">seconds</span>) (<span style="color:#00f">*</span> <span style="color:#19177c">hours</span> <span style="color:#666">60</span> <span style="color:#666">60</span>)) <span style="color:#666">60</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">sec</span> (<span style="color:#00f">%</span> (<span style="color:#00f">floor</span> <span style="color:#19177c">seconds</span>) <span style="color:#666">60</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">ms</span> (<span style="color:#00f">floor</span> (<span style="color:#00f">*</span> <span style="color:#666">1000</span> (<span style="color:#00f">-</span> <span style="color:#19177c">seconds</span> (<span style="color:#00f">floor</span> <span style="color:#19177c">seconds</span>))))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">format</span> <span style="color:#ba2121">"%.2d:%.2d:%.2d.%.3d"</span> <span style="color:#19177c">hours</span> <span style="color:#19177c">minutes</span> <span style="color:#19177c">sec</span> <span style="color:#19177c">ms</span>)))
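</span></span><span style="display:flex;"><span><span style="color:#408080;font-style:italic">;; Worked example: 3725.5 seconds is 1 h, 2 min, 5 s and 500 ms,</span>
</span></span><span style="display:flex;"><span><span style="color:#408080;font-style:italic">;; so (my/whisper--format-vtt-seconds 3725.5) returns "01:02:05.500".</span>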
|
||||
</span></span><span style="display:flex;"><span>
|
||||
</span></span><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/whisper--save-chucks-vtt</span> (<span style="color:#19177c">path</span> <span style="color:#19177c">data</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">with-temp-file</span> <span style="color:#19177c">path</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">insert</span> <span style="color:#ba2121">"WEBVTT\n\n"</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">cl-loop</span> <span style="color:#19177c">for</span> <span style="color:#19177c">chunk</span> <span style="color:#19177c">across</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'chunks</span> <span style="color:#19177c">data</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">for</span> <span style="color:#19177c">start</span> <span style="color:#00f">=</span> (<span style="color:#19177c">my/whisper--format-vtt-seconds</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">aref</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'timestamp</span> <span style="color:#19177c">chunk</span>) <span style="color:#666">0</span>))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">for</span> <span style="color:#19177c">end</span> <span style="color:#00f">=</span> (<span style="color:#19177c">my/whisper--format-vtt-seconds</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">aref</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'timestamp</span> <span style="color:#19177c">chunk</span>) <span style="color:#666">1</span>))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">do</span> (<span style="color:#00f">insert</span> (<span style="color:#00f">format</span> <span style="color:#ba2121">"%s --> %s"</span> <span style="color:#19177c">start</span> <span style="color:#19177c">end</span>) <span style="color:#ba2121">"\n"</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">do</span> (<span style="color:#00f">insert</span> (<span style="color:#19177c">string-trim</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'text</span> <span style="color:#19177c">chunk</span>)) <span style="color:#ba2121">"\n\n"</span>))))
|
||||
</span></span><span style="display:flex;"><span>
|
||||
</span></span><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/whisper--save-speakers-vtt</span> (<span style="color:#19177c">path</span> <span style="color:#19177c">data</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">with-temp-file</span> <span style="color:#19177c">path</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">insert</span> <span style="color:#ba2121">"WEBVTT\n\n"</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">cl-loop</span> <span style="color:#19177c">for</span> <span style="color:#19177c">chunk</span> <span style="color:#19177c">across</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'speakers</span> <span style="color:#19177c">data</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">for</span> <span style="color:#19177c">start</span> <span style="color:#00f">=</span> (<span style="color:#19177c">my/whisper--format-vtt-seconds</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">aref</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'timestamp</span> <span style="color:#19177c">chunk</span>) <span style="color:#666">0</span>))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">for</span> <span style="color:#19177c">end</span> <span style="color:#00f">=</span> (<span style="color:#19177c">my/whisper--format-vtt-seconds</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">aref</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'timestamp</span> <span style="color:#19177c">chunk</span>) <span style="color:#666">1</span>))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">do</span> (<span style="color:#00f">insert</span> (<span style="color:#00f">format</span> <span style="color:#ba2121">"%s --> %s"</span> <span style="color:#19177c">start</span> <span style="color:#19177c">end</span>) <span style="color:#ba2121">"\n"</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">do</span> (<span style="color:#00f">insert</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">format</span> <span style="color:#ba2121">"<v %s>"</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'speaker</span> <span style="color:#19177c">chunk</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">string-trim</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'text</span> <span style="color:#19177c">chunk</span>)) <span style="color:#ba2121">"\n\n"</span>))))
|
||||
</span></span><span style="display:flex;"><span>
|
||||
</span></span><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/whisper--save-speakers-txt</span> (<span style="color:#19177c">path</span> <span style="color:#19177c">data</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">with-temp-file</span> <span style="color:#19177c">path</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">cl-loop</span> <span style="color:#19177c">with</span> <span style="color:#19177c">prev-speaker</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">for</span> <span style="color:#19177c">chunk</span> <span style="color:#19177c">across</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'speakers</span> <span style="color:#19177c">data</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">for</span> <span style="color:#19177c">speaker</span> <span style="color:#00f">=</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'speaker</span> <span style="color:#19177c">chunk</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">if</span> (<span style="color:#19177c">not</span> (<span style="color:#00f">equal</span> <span style="color:#19177c">speaker</span> <span style="color:#19177c">prev-speaker</span>))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">do</span> (<span style="color:#008000">progn</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">when</span> <span style="color:#19177c">prev-speaker</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">fill-region</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">line-beginning-position</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">line-end-position</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">insert</span> <span style="color:#ba2121">"\n\n"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">insert</span> (<span style="color:#00f">format</span> <span style="color:#ba2121">"[%s]"</span> <span style="color:#19177c">speaker</span>) <span style="color:#ba2121">"\n"</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq</span> <span style="color:#19177c">prev-speaker</span> <span style="color:#19177c">speaker</span>))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">do</span> (<span style="color:#00f">insert</span> (<span style="color:#19177c">string-trim</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'text</span> <span style="color:#19177c">chunk</span>)) <span style="color:#ba2121">" "</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">fill-region</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">line-beginning-position</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">line-end-position</span>))))
|
||||
</span></span><span style="display:flex;"><span>
|
||||
</span></span><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/whisper--process-output</span> (<span style="color:#19177c">transcript-path</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">data</span> (<span style="color:#19177c">json-read-file</span> <span style="color:#19177c">transcript-path</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">when</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'text</span> <span style="color:#19177c">data</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">with-temp-file</span> (<span style="color:#00f">concat</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">file-name-sans-extension</span> <span style="color:#19177c">transcript-path</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">".txt"</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">insert</span> (<span style="color:#19177c">string-trim</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'text</span> <span style="color:#19177c">data</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">do-auto-fill</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">unless</span> (<span style="color:#19177c">seq-empty-p</span> (<span style="color:#19177c">alist-get</span> <span style="color:#19177c">'speakers</span> <span style="color:#19177c">data</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/whisper--save-speakers-vtt</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span> (<span style="color:#19177c">file-name-sans-extension</span> <span style="color:#19177c">transcript-path</span>) <span style="color:#ba2121">"-spk.vtt"</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">data</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/whisper--save-speakers-txt</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span> (<span style="color:#19177c">file-name-sans-extension</span> <span style="color:#19177c">transcript-path</span>) <span style="color:#ba2121">"-spk.txt"</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">data</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/whisper--save-chucks-vtt</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span> (<span style="color:#19177c">file-name-sans-extension</span> <span style="color:#19177c">transcript-path</span>) <span style="color:#ba2121">".vtt"</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">data</span>)))
|
||||
</span></span></code></pre></div><p>Then run the program itself as an <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Asynchronous-Processes.html">asynchronous process</a>.</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">defvar</span> <span style="color:#19177c">my/whisper-path</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"/home/pavel/micromamba/envs/insanely-fast-whisper/bin/insanely-fast-whisper"</span>)
|
||||
</span></span><span style="display:flex;"><span>
|
||||
</span></span><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/invoke-whisper</span> (<span style="color:#19177c">input</span> <span style="color:#19177c">output-dir</span> <span style="color:#008000">&optional</span> <span style="color:#19177c">language</span> <span style="color:#19177c">num-speakers</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">interactive</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">list</span>
|
||||
</span></span><span style="display:flex;"><span>    (<span style="color:#19177c">read-file-name</span> <span style="color:#ba2121">"Input file: "</span> <span style="color:#800">nil</span> <span style="color:#800">nil</span> <span style="color:#800">t</span>)
|
||||
</span></span><span style="display:flex;"><span>    (<span style="color:#19177c">read-directory-name</span> <span style="color:#ba2121">"Output directory: "</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">lang</span> (<span style="color:#00f">read-string</span> <span style="color:#ba2121">"Language (optional): "</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">if</span> (<span style="color:#19177c">string-empty-p</span> <span style="color:#19177c">lang</span>) <span style="color:#800">nil</span> <span style="color:#19177c">lang</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">num</span> (<span style="color:#19177c">read-number</span> <span style="color:#ba2121">"Number of speakers (optional): "</span> <span style="color:#666">0</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">when</span> (<span style="color:#00f">></span> <span style="color:#19177c">num</span> <span style="color:#666">0</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">number-to-string</span> <span style="color:#19177c">num</span>)))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let*</span> ((<span style="color:#19177c">transcript-path</span> (<span style="color:#00f">concat</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">expand-file-name</span> (<span style="color:#00f">file-name-as-directory</span> <span style="color:#19177c">output-dir</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">file-name-base</span> <span style="color:#19177c">input</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">".json"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">args</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#666">`</span>(<span style="color:#ba2121">"--file-name"</span> <span style="color:#666">,</span>(<span style="color:#00f">expand-file-name</span> <span style="color:#19177c">input</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"--transcript-path"</span> <span style="color:#666">,</span><span style="color:#19177c">transcript-path</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"--hf-token"</span> <span style="color:#666">,</span>(<span style="color:#19177c">my/password-store-get-field</span> <span style="color:#ba2121">"My_Online/Accounts/huggingface.co"</span> <span style="color:#ba2121">"token"</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#666">,@</span>(<span style="color:#008000">when</span> <span style="color:#19177c">language</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#666">`</span>(<span style="color:#ba2121">"--language"</span> <span style="color:#666">,</span><span style="color:#19177c">language</span>))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#666">,@</span>(<span style="color:#008000">when</span> <span style="color:#19177c">num-speakers</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#666">`</span>(<span style="color:#ba2121">"--num-speakers"</span> <span style="color:#666">,</span><span style="color:#19177c">num-speakers</span>))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">buffer</span> (<span style="color:#19177c">generate-new-buffer</span> <span style="color:#ba2121">"*whisper*"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">proc</span> (<span style="color:#00f">apply</span> <span style="color:#00f">#'start-process</span> <span style="color:#ba2121">"whisper"</span> <span style="color:#19177c">buffer</span> <span style="color:#19177c">my/whisper-path</span> <span style="color:#19177c">args</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">set-process-sentinel</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">proc</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#19177c">process</span> <span style="color:#19177c">_msg</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">status</span> (<span style="color:#00f">process-status</span> <span style="color:#19177c">process</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">code</span> (<span style="color:#00f">process-exit-status</span> <span style="color:#19177c">process</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">cond</span> ((<span style="color:#008000">and</span> (<span style="color:#00f">eq</span> <span style="color:#19177c">status</span> <span style="color:#19177c">'exit</span>) (<span style="color:#00f">=</span> <span style="color:#19177c">code</span> <span style="color:#666">0</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/whisper--process-output</span> <span style="color:#19177c">transcript-path</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">notifications-notify</span> <span style="color:#008000">:body</span> <span style="color:#ba2121">"Audio conversion completed"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:title</span> <span style="color:#ba2121">"Whisper"</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">kill-buffer</span> (<span style="color:#00f">process-buffer</span> <span style="color:#19177c">process</span>)))
|
||||
</span></span><span style="display:flex;"><span> ((<span style="color:#008000">or</span> (<span style="color:#008000">and</span> (<span style="color:#00f">eq</span> <span style="color:#19177c">status</span> <span style="color:#19177c">'exit</span>) (<span style="color:#00f">></span> <span style="color:#19177c">code</span> <span style="color:#666">0</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">eq</span> <span style="color:#19177c">status</span> <span style="color:#19177c">'signal</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">err</span> (<span style="color:#008000">with-current-buffer</span> (<span style="color:#00f">process-buffer</span> <span style="color:#19177c">process</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">buffer-string</span>))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#d2413a;font-weight:bold">user-error</span> <span style="color:#ba2121">"Error in Whisper: %s"</span> <span style="color:#19177c">err</span>)))))))))
|
||||
</span></span></code></pre></div><p>If run interactively, the function prompts for the input file and the output directory, and then for two optional values: the language and the number of speakers.</p>
|
||||
<p>The process sentinel sends a <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Desktop-Notifications.html">desktop notification</a> because it’s a bit more noticeable than <code>message</code>, and the process is expected to take some time.</p>
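<p>To make the sentinel logic easier to follow, here is a minimal sketch of the same pattern with the Whisper-specific parts removed. The names <code>my/demo-process-outcome</code> and <code>my/demo-run-and-report</code> are hypothetical and not part of my config. Unlike the function above, the sketch treats any non-zero exit code as a failure and leaves the notification to a callback.</p>
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp">;;; -*- lexical-binding: t -*-
;; Illustrative sketch only; lexical binding is assumed so that the
;; sentinel can close over CALLBACK.

(defun my/demo-process-outcome (status code)
  "Classify a process STATUS symbol and exit CODE.
Return `success', `failure' or `running'."
  (cond ((and (eq status 'exit) (zerop code)) 'success)
        ;; A non-zero exit or a fatal signal both count as failure.
        ((memq status '(exit signal)) 'failure)
        ;; Any other status (e.g. `run' or `stop') is not final.
        (t 'running)))

(defun my/demo-run-and-report (command callback)
  "Run COMMAND (a list of strings) asynchronously.
Call CALLBACK with the outcome symbol and the process buffer once
the process has finished."
  (make-process
   :name "demo"
   :buffer (generate-new-buffer "*demo*")
   :command command
   :sentinel
   (lambda (process _msg)
     (let ((outcome (my/demo-process-outcome
                     (process-status process)
                     (process-exit-status process))))
       ;; The sentinel also fires on non-final state changes; skip those.
       (unless (eq outcome 'running)
         (funcall callback outcome (process-buffer process)))))))
</code></pre></div><p>For instance, <code>(my/demo-run-and-report '("sleep" "1") (lambda (outcome _buffer) (message "Done: %s" outcome)))</code> should report the outcome in the echo area after about a second, assuming a <code>sleep</code> executable is available.</p>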
|
||||
<h5 id="integrating-with-elfeed">Integrating with elfeed</h5>
|
||||
<p>To actually run the function from the section above, we need to download the file in question.</p>
|
||||
<p>The <code>whisper</code> executable, given the file <code><file>.<extension></code>, creates files named <code><file>.vtt</code>, <code><file>.srt</code>, <code><file>.txt</code>. So first we need to save the file under the correct name.</p>
|
||||
<p>I already use a library called <a href="https://github.com/tkf/emacs-request">request.el</a> to download files elsewhere, so I’ll reuse it here. You can just as well invoke <code>curl</code> or <code>wget</code> via an asynchronous process.</p>
|
||||
<p>This function downloads the file to a non-temporary folder, which is <code>~/.elfeed/podcast-files/</code> if you didn’t move the elfeed database. I keep the file because a permanently downloaded copy works better for the next section.</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">with-eval-after-load</span> <span style="color:#19177c">'elfeed</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">defvar</span> <span style="color:#19177c">my/elfeed-whisper-podcast-files-directory</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span> <span style="color:#19177c">elfeed-db-directory</span> <span style="color:#ba2121">"/podcast-files/"</span>)))
|
||||
</span></span><span style="display:flex;"><span>
|
||||
</span></span><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/elfeed-whisper-get-transcript-new</span> (<span style="color:#19177c">entry</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">interactive</span> (<span style="color:#00f">list</span> <span style="color:#19177c">elfeed-show-entry</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let*</span> ((<span style="color:#19177c">url</span> (<span style="color:#19177c">caar</span> (<span style="color:#19177c">elfeed-entry-enclosures</span> <span style="color:#19177c">entry</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">file-name</span> (<span style="color:#00f">concat</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">elfeed-ref-id</span> (<span style="color:#19177c">elfeed-entry-content</span> <span style="color:#19177c">entry</span>))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"."</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">file-name-extension</span> <span style="color:#19177c">url</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">file-path</span> (<span style="color:#00f">expand-file-name</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">my/elfeed-whisper-podcast-files-directory</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">file-name</span>))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">message</span> <span style="color:#ba2121">"Download started"</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">unless</span> (<span style="color:#00f">file-exists-p</span> <span style="color:#19177c">my/elfeed-whisper-podcast-files-directory</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">mkdir</span> <span style="color:#19177c">my/elfeed-whisper-podcast-files-directory</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">request</span> <span style="color:#19177c">url</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:type</span> <span style="color:#ba2121">"GET"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:encoding</span> <span style="color:#19177c">'binary</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:complete</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">cl-function</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#008000">&key</span> <span style="color:#19177c">data</span> <span style="color:#008000">&allow-other-keys</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">coding-system-for-write</span> <span style="color:#19177c">'binary</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">write-region-annotate-functions</span> <span style="color:#800">nil</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">write-region-post-annotation-function</span> <span style="color:#800">nil</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">write-region</span> <span style="color:#19177c">data</span> <span style="color:#800">nil</span> <span style="color:#19177c">file-path</span> <span style="color:#800">nil</span> <span style="color:#008000">:silent</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">message</span> <span style="color:#ba2121">"Conversion started"</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/invoke-whisper</span> <span style="color:#19177c">file-path</span> <span style="color:#19177c">my/elfeed-srt-dir</span>)))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:error</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">cl-function</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#008000">&key</span> <span style="color:#19177c">error-thrown</span> <span style="color:#008000">&allow-other-keys</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">message</span> <span style="color:#ba2121">"Error!: %S"</span> <span style="color:#19177c">error-thrown</span>))))))
|
||||
</span></span></code></pre></div><p>I also experimented with several ways to write binary data in Emacs, of which <code>write-region</code> (as implemented in <a href="https://github.com/rejeep/f.el">f.el</a>) seems to be the fastest. <a href="https://emacs.stackexchange.com/questions/59449/how-do-i-save-raw-bytes-into-a-file">This thread on StackExchange</a> suggests that it may corrupt some bytes towards the end of the file, but whether or not that is the case, mp3 files survive the procedure. The solution with <code>seq-doseq</code> proposed there takes at least a few seconds.</p>
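<p>If you want to check the <code>write-region</code> approach on your own setup, a round-trip test along these lines may help. This is only a sketch: the <code>my/demo-*</code> functions are hypothetical helpers, and the test covers a single 256-byte string rather than a real audio file.</p>
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp">;; Illustrative sketch only; these helpers are not part of the config.

(defun my/demo-write-bytes (data path)
  "Write the unibyte string DATA to PATH without any conversion."
  (let ((coding-system-for-write 'binary)
        (write-region-annotate-functions nil)
        (write-region-post-annotation-function nil))
    ;; When the first argument is a string, it is written instead of
    ;; the buffer contents and the second argument is ignored.
    (write-region data nil path nil :silent)))

(defun my/demo-read-bytes (path)
  "Return the contents of PATH as a unibyte string."
  (with-temp-buffer
    (set-buffer-multibyte nil)
    (insert-file-contents-literally path)
    (buffer-string)))

(defun my/demo-bytes-round-trip-p ()
  "Return non-nil if every byte value survives a write and a read."
  (let ((data (apply #'unibyte-string (number-sequence 0 255)))
        (path (make-temp-file "my-demo-bytes")))
    (unwind-protect
        (progn
          (my/demo-write-bytes data path)
          (equal data (my/demo-read-bytes path)))
      ;; Remove the temporary file even if the comparison fails.
      (delete-file path))))
</code></pre></div><p>Evaluating <code>(my/demo-bytes-round-trip-p)</code> compares what was written with what is read back, so a <code>nil</code> result would point to the kind of corruption mentioned in the thread.</p>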
|
||||
<p>As <code>my/invoke-whisper</code> creates multiple files, here’s a function to select related files:</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/elfeed-show-related-files</span> (<span style="color:#19177c">entry</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">interactive</span> (<span style="color:#00f">list</span> <span style="color:#19177c">elfeed-show-entry</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let*</span> ((<span style="color:#19177c">files</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">mapcar</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#19177c">file</span>) (<span style="color:#00f">cons</span> (<span style="color:#19177c">file-name-extension</span> <span style="color:#19177c">file</span>) <span style="color:#19177c">file</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">seq-filter</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#19177c">file</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">string-match-p</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">rx</span> <span style="color:#19177c">bos</span> (<span style="color:#19177c">literal</span> (<span style="color:#19177c">elfeed-ref-id</span> (<span style="color:#19177c">elfeed-entry-content</span> <span style="color:#19177c">entry</span>))) <span style="color:#ba2121">"."</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">file</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">directory-files</span> <span style="color:#19177c">my/elfeed-srt-dir</span>))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">buffer</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">find-file-other-window</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">my/elfeed-srt-dir</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">alist-get</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">completing-read</span> <span style="color:#ba2121">"File: "</span> <span style="color:#19177c">files</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#19177c">files</span> <span style="color:#800">nil</span> <span style="color:#800">nil</span> <span style="color:#00f">#'equal</span>)))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">with-current-buffer</span> <span style="color:#19177c">buffer</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq-local</span> <span style="color:#19177c">elfeed-show-entry</span> <span style="color:#19177c">entry</span>))))
|
||||
</span></span></code></pre></div><p>Finally, we need a function that shows the transcript if it exists and invokes <code>my/elfeed-whisper-get-transcript-new</code> if it doesn’t. This is the function we’ll call from an <code>elfeed-entry</code> buffer.</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/elfeed-whisper-get-transcript</span> (<span style="color:#19177c">entry</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"Retrieve transcript for the enclosure of the current elfeed ENTRY."</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">interactive</span> (<span style="color:#00f">list</span> <span style="color:#19177c">elfeed-show-entry</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">enclosure</span> (<span style="color:#19177c">caar</span> (<span style="color:#19177c">elfeed-entry-enclosures</span> <span style="color:#19177c">entry</span>))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">unless</span> <span style="color:#19177c">enclosure</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#d2413a;font-weight:bold">user-error</span> <span style="color:#ba2121">"No enclosure found!"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">srt-path</span> (<span style="color:#00f">concat</span> <span style="color:#19177c">my/elfeed-srt-dir</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">elfeed-ref-id</span> (<span style="color:#19177c">elfeed-entry-content</span> <span style="color:#19177c">entry</span>))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">".srt"</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">if</span> (<span style="color:#00f">file-exists-p</span> <span style="color:#19177c">srt-path</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">buffer</span> (<span style="color:#19177c">find-file-other-window</span> <span style="color:#19177c">srt-path</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">with-current-buffer</span> <span style="color:#19177c">buffer</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq-local</span> <span style="color:#19177c">elfeed-show-entry</span> <span style="color:#19177c">entry</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/elfeed-whisper-get-transcript-new</span> <span style="color:#19177c">entry</span>)))))
|
||||
</span></span></code></pre></div><h5 id="integrating-with-subed">Integrating with subed</h5>
|
||||
<p>Now that we’ve produced a <code>.srt</code> file, we can use a package called <a href="https://github.com/sachac/subed">subed</a> to control the playback, as I have done in the YouTube section.</p>
|
||||
<p>By the way, this wasn’t the most straightforward thing to figure out, because the MPV window doesn’t show up for an audio file, and the player itself starts in the paused state. So I thought nothing was happening until I enabled the debug log.</p>
|
||||
<p>With that in mind, here’s a function to launch MPV from the buffer generated by <code>my/elfeed-whisper-get-transcript</code>:</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/elfeed-whisper-subed</span> (<span style="color:#19177c">entry</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#ba2121">"Run MPV for the current Whisper-generated subtitles file.
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#ba2121">
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#ba2121">ENTRY is an instance of </span><span style="color:#19177c">`elfeed-entry'</span><span style="color:#ba2121">."</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">interactive</span> (<span style="color:#00f">list</span> <span style="color:#19177c">elfeed-show-entry</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">unless</span> <span style="color:#19177c">entry</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#d2413a;font-weight:bold">user-error</span> <span style="color:#ba2121">"No entry!"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">unless</span> (<span style="color:#19177c">derived-mode-p</span> <span style="color:#19177c">'subed-mode</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#d2413a;font-weight:bold">user-error</span> <span style="color:#ba2121">"Not subed mode!"</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">setq-local</span> <span style="color:#19177c">subed-mpv-video-file</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">expand-file-name</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span> <span style="color:#19177c">my/elfeed-whisper-podcast-files-directory</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/get-file-name-from-url</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">caar</span> (<span style="color:#19177c">elfeed-entry-enclosures</span> <span style="color:#19177c">entry</span>))))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">subed-mpv--play</span> <span style="color:#19177c">subed-mpv-video-file</span>))
|
||||
</span></span></code></pre></div><p>After running <code>M-x my/elfeed-whisper-subed</code>, run <code>M-x subed-toggle-loop-over-current-subtitle</code> (<code>C-c C-l</code>) to turn off looping, which is somehow on by default, and <code>M-x subed-toggle-pause-while-typing</code> (<code>C-c C-p</code>), because pausing while typing sometimes made my instance of MPV lag.</p>
|
||||
<p>After that, <code>M-x subed-mpv-toggle-pause</code> should start the playback, which you can control by moving the cursor in the buffer.</p>
|
||||
<p>You can also run <code>M-x subed-toggle-sync-point-to-player</code> (<code>C-c .</code>) to toggle syncing the point in the buffer to the currently played subtitle (this automatically gets disabled when you switch buffers).</p>
|
||||
<p>Running <code>M-x subed-toggle-sync-player-to-point</code> (<code>C-c ,</code>) does the opposite, i.e. sets the player position to the subtitle under point. These two functions are useful since the MPV window controls aren’t available.</p>
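<p>The two toggles that follow <code>my/elfeed-whisper-subed</code> could probably be folded into a small wrapper. This is only a sketch: the wrapper name is made up, and it assumes that the installed version of subed provides <code>subed-disable-loop-over-current-subtitle</code> and <code>subed-disable-pause-while-typing</code> as counterparts of the toggle commands.</p>
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp">;; Hypothetical wrapper, not part of the original config.
(defun my/elfeed-whisper-subed-quiet (entry)
  "Run `my/elfeed-whisper-subed' for ENTRY, then disable two subed features.

Assumes subed defines the two `subed-disable-...' functions below."
  (interactive (list elfeed-show-entry))
  ;; Start MPV for the current transcript buffer.
  (my/elfeed-whisper-subed entry)
  ;; Stop looping over the current subtitle.
  (subed-disable-loop-over-current-subtitle)
  ;; Don't pause the player while typing in the buffer.
  (subed-disable-pause-while-typing))
</code></pre></div>
<p>Unlike the toggles, the explicit disable calls don’t depend on the state the buffer happens to start in.</p>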
|
||||
<h5 id="running-it-for-internet-files">Running it for Internet Files</h5>
|
||||
<p>Since I don’t listen to podcasts via elfeed that much lately, I also want a function that runs Whisper on arbitrary files from the Internet.</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/whisper-url</span> (<span style="color:#19177c">url</span> <span style="color:#19177c">file-name</span> <span style="color:#19177c">output-dir</span> <span style="color:#008000">&optional</span> <span style="color:#19177c">language</span> <span style="color:#19177c">num-speakers</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">interactive</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">list</span> (<span style="color:#00f">read-from-minibuffer</span> <span style="color:#ba2121">"URL: "</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">read-from-minibuffer</span> <span style="color:#ba2121">"File name: "</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">read-directory-name</span> <span style="color:#ba2121">"Output directory: "</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">lang</span> (<span style="color:#00f">read-string</span> <span style="color:#ba2121">"Language (optional): "</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">if</span> (<span style="color:#19177c">string-empty-p</span> <span style="color:#19177c">lang</span>) <span style="color:#800">nil</span> <span style="color:#19177c">lang</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">num</span> (<span style="color:#19177c">read-number</span> <span style="color:#ba2121">"Number of speakers (optional): "</span> <span style="color:#666">0</span>)))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">when</span> (<span style="color:#00f">></span> <span style="color:#19177c">num</span> <span style="color:#666">0</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">number-to-string</span> <span style="color:#19177c">num</span>)))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">file-path</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">concat</span> <span style="color:#19177c">output-dir</span> <span style="color:#19177c">file-name</span> <span style="color:#ba2121">"."</span> (<span style="color:#19177c">file-name-extension</span> <span style="color:#19177c">url</span>))))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">message</span> <span style="color:#ba2121">"Download started"</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">request</span> <span style="color:#19177c">url</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:type</span> <span style="color:#ba2121">"GET"</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:encoding</span> <span style="color:#19177c">'binary</span>
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:complete</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">cl-function</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#008000">&key</span> <span style="color:#19177c">data</span> <span style="color:#008000">&allow-other-keys</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">let</span> ((<span style="color:#19177c">coding-system-for-write</span> <span style="color:#19177c">'binary</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">write-region-annotate-functions</span> <span style="color:#800">nil</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">write-region-post-annotation-function</span> <span style="color:#800">nil</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">write-region</span> <span style="color:#19177c">data</span> <span style="color:#800">nil</span> <span style="color:#19177c">file-path</span> <span style="color:#800">nil</span> <span style="color:#008000">:silent</span>))
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">message</span> <span style="color:#ba2121">"Conversion started"</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#19177c">my/invoke-whisper</span> <span style="color:#19177c">file-path</span> <span style="color:#19177c">output-dir</span> <span style="color:#19177c">language</span> <span style="color:#19177c">num-speakers</span>)))
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#008000">:error</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">cl-function</span>
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">lambda</span> (<span style="color:#008000">&key</span> <span style="color:#19177c">error-thrown</span> <span style="color:#008000">&allow-other-keys</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#00f">message</span> <span style="color:#ba2121">"Error!: %S"</span> <span style="color:#19177c">error-thrown</span>))))))
|
||||
</span></span></code></pre></div><h5 id="some-observations">Some observations</h5>
|
||||
<p>So, the functions above work for my purposes.</p>
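<p>As a rough illustration, a non-interactive call to <code>my/whisper-url</code> might look like the following. The URL, file name and directory are placeholders. The language and the number of speakers are passed as strings (or <code>nil</code>), which is what the <code>interactive</code> form produces.</p>
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp">;; Placeholder values; adjust to taste.
(my/whisper-url
 "https://example.com/episode.mp3"   ; URL; its extension is reused for the file
 "episode"                           ; file name without extension
 ;; The function builds the path with `concat', so the directory
 ;; must end with a slash; `file-name-as-directory' ensures that.
 (file-name-as-directory (expand-file-name "~/transcripts"))
 "en"                                ; language, or nil
 "2")                                ; number of speakers as a string, or nil
</code></pre></div>
<p>With these arguments the download is written to <code>episode.mp3</code> inside the output directory before <code>my/invoke-whisper</code> is called on it. A URL with a query string after the extension would need extra handling, since <code>file-name-extension</code> is applied to the URL as is.</p>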
|
||||
<p>Vosk API works much faster than Whisper. The smallest Vosk model takes roughly a tenth of the playback time, whereas even the <code>tiny.en</code> Whisper model takes maybe 1.2x the playback time on my PC.</p>
|
||||
<p>However, the quality of the Whisper output is so much better that I consider it worth the wait. Even with the <code>tiny</code> model, the transcript is almost perfect, provided that the audio is of reasonable quality.</p>
|
||||
<h3 id="declarative-filesystem-management">Declarative filesystem management</h3>
|
||||
<p>My filesystem is, shall we say, not the most orderly place.</p>
|
||||
<center>
|
||||
|
|
@ -11569,7 +11649,6 @@ I’ve seen a couple of cases where people would swap their username and ema
|
|||
<li><a href="#rdrview">rdrview</a></li>
|
||||
<li><a href="#latex-and-pandoc">LaTeX and pandoc</a></li>
|
||||
<li><a href="#youtube-transcripts">YouTube transcripts</a></li>
|
||||
<li><a href="#podcast-transcripts">Podcast transcripts</a></li>
|
||||
</ul>
|
||||
</li>
|
||||
<li><a href="#internet-and-multimedia">Internet & Multimedia</a>
|
||||
|
|
@ -11596,10 +11675,10 @@ I’ve seen a couple of cases where people would swap their username and ema
|
|||
<li><a href="#stackexchange">StackExchange</a></li>
|
||||
</ul>
|
||||
</li>
|
||||
<li><a href="#llm">LLM</a>
|
||||
<li><a href="#not-an-ai">Not-an-AI</a>
|
||||
<ul>
|
||||
<li><a href="#gptel">gptel</a></li>
|
||||
<li><a href="#ellama">ellama</a></li>
|
||||
<li><a href="#llms">LLMs</a></li>
|
||||
<li><a href="#podcast-transcripts">Podcast transcripts</a></li>
|
||||
</ul>
|
||||
</li>
|
||||
<li><a href="#declarative-filesystem-management">Declarative filesystem management</a>
|
||||
|
|
|
|||
|
|
@ -636,7 +636,7 @@ Remove <code>TAG</code> from emails which are outside the matching <code>PATH</c
|
|||
<p>Edit <span class="timestamp-wrapper"><span class="timestamp"><2022-10-27 Thu></span></span>: for consistency’s sake, I’ll make the signature on the top for all cases.</p>
|
||||
<p>Edit <span class="timestamp-wrapper"><span class="timestamp"><2024-08-19 Mon></span></span>: see above</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/message-insert-signature-need-on-top</span> ()
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#800">nil</span>)
|
||||
</span></span><span style="display:flex;"><span> <span style="color:#800">t</span>)
|
||||
</span></span></code></pre></div><p>Then advise the <code>notmuch-mua-reply</code> function:</p>
|
||||
<div class="highlight"><pre tabindex="0" style=";-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-emacs-lisp" data-lang="emacs-lisp"><span style="display:flex;"><span>(<span style="color:#008000">defun</span> <span style="color:#19177c">my/message-maybe-fix-signature</span> (<span style="color:#008000">&rest</span> <span style="color:#19177c">_</span>)
|
||||
</span></span><span style="display:flex;"><span> (<span style="color:#008000">when</span> (<span style="color:#19177c">my/message-insert-signature-need-on-top</span>)
|
||||
|
|
|
|||
|
|
@ -1,6 +1,6 @@
|
|||
<!DOCTYPE html>
|
||||
<html lang=""><head>
|
||||
<meta name="generator" content="Hugo 0.136.4">
|
||||
<meta name="generator" content="Hugo 0.138.0">
|
||||
<meta charset="utf-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
|
||||
|
||||
|
|
|
|||
BIN
stats/all.png
Binary file not shown.
|
Before Width: | Height: | Size: 121 KiB After Width: | Height: | Size: 121 KiB |
Binary file not shown.
|
Before Width: | Height: | Size: 62 KiB After Width: | Height: | Size: 61 KiB |
Binary file not shown.
|
Before Width: | Height: | Size: 68 KiB After Width: | Height: | Size: 67 KiB |