SqrtMinusOne 2022-10-13 20:10:05 +00:00
parent 61a78bea72
commit 43e418193f
6 changed files with 10 additions and 7 deletions


@@ -14,7 +14,8 @@
<guid>https://sqrtminusone.xyz/posts/2022-09-16-vosk/</guid>
<content type="html">
&lt;p&gt;In my experience, finding something in a podcast is particularly troublesome. For example, occasionally I want to refer to some line in the podcast to make an &lt;a href=&#34;https://github.com/org-roam/org-roam&#34;&gt;org-roam&lt;/a&gt; node, e.g. I want to check that I got that part right.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Edit &lt;span class=&#34;timestamp-wrapper&#34;&gt;&lt;span class=&#34;timestamp&#34;&gt;&amp;lt;2022-10-13 Thu&amp;gt;&lt;/span&gt;&lt;/span&gt;:&lt;/strong&gt; Just a couple of days after this post, OpenAI released a speech recognition model called &lt;a href=&#34;https://openai.com/blog/whisper/&#34;&gt;Whisper&lt;/a&gt;, which is so much better than anything I&amp;rsquo;ve ever seen before. I&amp;rsquo;ve decided to leave this post as it is, but check the &lt;a href=&#34;https://sqrtminusone.xyz/configs/emacs/#podcast-transcripts&#34;&gt;Emacs config&lt;/a&gt; for the updated version.&lt;/p&gt;
&lt;p&gt;In my experience, finding something in a podcast is particularly troublesome. For example, occasionally I want to refer to some line in the podcast to make an &lt;a href=&#34;https://github.com/org-roam/org-roam&#34;&gt;org-roam&lt;/a&gt; node, e.g. I want to check that I got that part right.&lt;/p&gt;
&lt;p&gt;And I have no reasonable way to get there because audio files in themselves don&amp;rsquo;t allow for &lt;a href=&#34;https://en.wikipedia.org/wiki/Random_access&#34;&gt;random access&lt;/a&gt;, i.e. there are no &amp;ldquo;landmarks&amp;rdquo; that point to this or that portion of the file. At least if nothing like a transcript is available.&lt;/p&gt;
&lt;p&gt;For obvious reasons, podcasts rarely ship with transcripts. So in this post, I&amp;rsquo;ll be using a speech recognition engine to make up for that. A generated transcript is not quite as good as a manually written one, but for the purpose of finding a fragment of a known podcast, it works well enough.&lt;/p&gt;
&lt;figure&gt;&lt;img src=&#34;https://sqrtminusone.xyz/images/vosk/img.png&#34;/&gt;


@@ -63,7 +63,8 @@
<h1 id="title-small-screen">Podcast transcripts with elfeed &amp; speech recognition engine</h1>
<div class="container" id="actual-content">
<h1 id="title-large-screen">Podcast transcripts with elfeed &amp; speech recognition engine</h1>
<p>In my experience, finding something in a podcast is particularly troublesome. For example, occasionally I want to refer to some line in the podcast to make an <a href="https://github.com/org-roam/org-roam">org-roam</a> node, e.g. I want to check that I got that part right.</p>
<p><strong>Edit <span class="timestamp-wrapper"><span class="timestamp">&lt;2022-10-13 Thu&gt;</span></span>:</strong> Just a couple of days after this post, OpenAI released a speech recognition model called <a href="https://openai.com/blog/whisper/">Whisper</a>, which is so much better than anything I&rsquo;ve ever seen before. I&rsquo;ve decided to leave this post as it is, but check the <a href="https://sqrtminusone.xyz/configs/emacs/#podcast-transcripts">Emacs config</a> for the updated version.</p>
<p>In my experience, finding something in a podcast is particularly troublesome. For example, occasionally I want to refer to some line in the podcast to make an <a href="https://github.com/org-roam/org-roam">org-roam</a> node, e.g. I want to check that I got that part right.</p>
<p>And I have no reasonable way to get there because audio files in themselves don&rsquo;t allow for <a href="https://en.wikipedia.org/wiki/Random_access">random access</a>, i.e. there are no &ldquo;landmarks&rdquo; that point to this or that portion of the file. At least if nothing like a transcript is available.</p>
<p>For obvious reasons, podcasts rarely ship with transcripts. So in this post, I&rsquo;ll be using a speech recognition engine to make up for that. A generated transcript is not quite as good as a manually written one, but for the purpose of finding a fragment of a known podcast, it works well enough.</p>
<figure><img src="/images/vosk/img.png"/>


@@ -14,7 +14,8 @@
<guid>https://sqrtminusone.xyz/posts/2022-09-16-vosk/</guid>
<content type="html">
&lt;p&gt;In my experience, finding something in a podcast is particularly troublesome. For example, occasionally I want to refer to some line in the podcast to make an &lt;a href=&#34;https://github.com/org-roam/org-roam&#34;&gt;org-roam&lt;/a&gt; node, e.g. I want to check that I got that part right.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Edit &lt;span class=&#34;timestamp-wrapper&#34;&gt;&lt;span class=&#34;timestamp&#34;&gt;&amp;lt;2022-10-13 Thu&amp;gt;&lt;/span&gt;&lt;/span&gt;:&lt;/strong&gt; Just a couple of days after this post, OpenAI released a speech recognition model called &lt;a href=&#34;https://openai.com/blog/whisper/&#34;&gt;Whisper&lt;/a&gt;, which is so much better than anything I&amp;rsquo;ve ever seen before. I&amp;rsquo;ve decided to leave this post as it is, but check the &lt;a href=&#34;https://sqrtminusone.xyz/configs/emacs/#podcast-transcripts&#34;&gt;Emacs config&lt;/a&gt; for the updated version.&lt;/p&gt;
&lt;p&gt;In my experience, finding something in a podcast is particularly troublesome. For example, occasionally I want to refer to some line in the podcast to make an &lt;a href=&#34;https://github.com/org-roam/org-roam&#34;&gt;org-roam&lt;/a&gt; node, e.g. I want to check that I got that part right.&lt;/p&gt;
&lt;p&gt;And I have no reasonable way to get there because audio files in themselves don&amp;rsquo;t allow for &lt;a href=&#34;https://en.wikipedia.org/wiki/Random_access&#34;&gt;random access&lt;/a&gt;, i.e. there are no &amp;ldquo;landmarks&amp;rdquo; that point to this or that portion of the file. At least if nothing like a transcript is available.&lt;/p&gt;
&lt;p&gt;For obvious reasons, podcasts rarely ship with transcripts. So in this post, I&amp;rsquo;ll be using a speech recognition engine to make up for that. A generated transcript is not quite as good as a manually written one, but for the purpose of finding a fragment of a known podcast, it works well enough.&lt;/p&gt;
&lt;figure&gt;&lt;img src=&#34;https://sqrtminusone.xyz/images/vosk/img.png&#34;/&gt;

Binary file not shown. Size: 116 KiB before, 116 KiB after.


@@ -13,8 +13,8 @@
<pubDate>Fri, 16 Sep 2022 00:00:00 +0000</pubDate>
<guid>https://sqrtminusone.xyz/posts/2022-09-16-vosk/</guid>
<description>In my experience, finding something in a podcast is particularly troublesome. For example, occasionally I want to refer to some line in the podcast to make an org-roam node, e.g. I want to check that I got that part right.
And I have no reasonable way to get there because audio files in themselves don&amp;rsquo;t allow for random access, i.e. there are no &amp;ldquo;landmarks&amp;rdquo; that point to this or that portion of the file.</description>
<description>Edit &amp;lt;2022-10-13 Thu&amp;gt;: Just a couple of days after this post, OpenAI released a speech recognition model called Whisper, which is so much better than anything I&amp;rsquo;ve ever seen before. I&amp;rsquo;ve decided to leave this post as it is, but check the Emacs config for the updated version.
In my experience, finding something in a podcast is particularly troublesome. For example, occasionally I want to refer to some line in the podcast to make an org-roam node, e.</description>
</item>
<item>


@@ -13,8 +13,8 @@
<pubDate>Fri, 16 Sep 2022 00:00:00 +0000</pubDate>
<guid>https://sqrtminusone.xyz/posts/2022-09-16-vosk/</guid>
<description>In my experience, finding something in a podcast is particularly troublesome. For example, occasionally I want to refer to some line in the podcast to make an org-roam node, e.g. I want to check that I got that part right.
And I have no reasonable way to get there because audio files in themselves don&amp;rsquo;t allow for random access, i.e. there are no &amp;ldquo;landmarks&amp;rdquo; that point to this or that portion of the file.</description>
<description>Edit &amp;lt;2022-10-13 Thu&amp;gt;: Just a couple of days after this post, OpenAI released a speech recognition model called Whisper, which is so much better than anything I&amp;rsquo;ve ever seen before. I&amp;rsquo;ve decided to leave this post as it is, but check the Emacs config for the updated version.
In my experience, finding something in a podcast is particularly troublesome. For example, occasionally I want to refer to some line in the podcast to make an org-roam node, e.</description>
</item>
<item>