2023-12-30-02
Speech recognition
I'm interested in this because:
• I speak at ~200 WPM; quick way to get words down
• It short-circuits my editing brain
• I can capture commentary for literate programming
• I can use non-computer time
• I can make videos more searchable & skimmable
• I can refer back to what I was talking about
• Live captions might be nice
• Animations might be cool too
[...] rough transcripts, and the cloud for better ones, because my laptop can't handle Whisper. I can explore lots of things even with those constraints.

Workflows
• Narration
• Braindumps
• Voice coding is low-priority right now because I have limited computer time, but thinking out loud for literate programming is good. Voice control of Emacs m…

Sometimes I think about paying for a GPU node for other ideas. These are expensive to leave running, so batching might make sense. But maybe I can use Colab/notebooks/endpoints...
• Streaming Whisper
• Speaker diarization
• Voice activity detection
• Word timestamps
• Subtitle segmentation - lachesis?

So this is mostly:
• Rough transcripts
• Batch processing, higher quality
  • text
  • captions
• Batch voice interfaces?
• Streaming
• Speaker identification
• Word timestamps
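
Word timestamps and subtitle segmentation fit together: once each word has a start and end time, captions can be cut wherever a line gets too long or there's a pause. Here is a minimal Python sketch, assuming the words already come with timestamps from whatever recognizer is in use. The `Word` class, the `segment` function, and the 42-character and 0.6-second thresholds are all made up for illustration; this is not necessarily how lachesis or any other tool does it.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float


def segment(words, max_chars=42, max_gap=0.6):
    """Group timestamped words into (start, end, text) caption cues.

    A new cue starts when adding the next word would exceed max_chars,
    or when the silence before it is longer than max_gap seconds.
    Both thresholds are illustrative guesses, not recommendations.
    """
    cues = []
    current = []
    for w in words:
        if current:
            line = " ".join(x.text for x in current)
            too_long = len(line) + 1 + len(w.text) > max_chars
            long_pause = w.start - current[-1].end > max_gap
            if too_long or long_pause:
                cues.append((current[0].start, current[-1].end, line))
                current = []
        current.append(w)
    if current:
        line = " ".join(x.text for x in current)
        cues.append((current[0].start, current[-1].end, line))
    return cues


if __name__ == "__main__":
    sample = [
        Word("thinking", 0.0, 0.4),
        Word("out", 0.45, 0.6),
        Word("loud", 0.65, 0.9),
        Word("is", 2.0, 2.1),
        Word("good", 2.15, 2.5),
    ]
    for start, end, text in segment(sample):
        print(f"{start:.2f} --> {end:.2f}  {text}")
```

With the sample words, the pause before "is" splits the text into two cues. Tuning `max_chars` and `max_gap` is where the real work would be, and a rough transcript is enough to experiment with.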