2023-12-30-02

Speech recognition

I'm interested in this because:
• I speak at ~200 WPM; quick way to get words down
• It short-circuits my editing brain
• I can capture commentary for literate programming
• I can use non-computer time

• I can make videos more searchable & skimmable
• I can refer back to what I was talking about
• Live captions might be nice
• Animations might be cool too

I use my phone for rough transcripts and the cloud for better ones because my laptop can't handle Whisper. I can explore lots of things even with those constraints.

Workflows
• Narration
• Braindumps

Voice coding is low-priority right now because I have limited computer time, but thinking out loud for literate programming is good. Voice control of Emacs m.

Sometimes I think about paying for a GPU node for other ideas. These are expensive to leave running, so batching might make sense. But maybe I can use Colab/notebooks/endpoints...
• Streaming Whisper
• Speaker diarization
• Voice activity detection
• Word timestamps
• Subtitle segmentation - lachesis?

So this is mostly:
• Rough transcripts
• Batch processing, higher quality
  • text
  • captions
• Batch voice interfaces?
• Streaming
  • Speaker identification
  • Word timestamps
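To make the word timestamps and subtitle segmentation ideas a bit more concrete, here is a minimal sketch in Python. It assumes the recognizer has already produced word-level timestamps as (text, start, end) values in seconds. The names `Word`, `segment_captions`, `format_timestamp`, and `to_webvtt` are hypothetical, and so are the `max_chars` and `max_gap` thresholds. This is a simple greedy split on line length and pauses, not how lachesis or any particular tool does it.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float


def segment_captions(words, max_chars=40, max_gap=0.7):
    """Greedily group timestamped words into (start, end, text) cues.

    A new cue starts when adding the next word would exceed max_chars,
    or when the pause before it is longer than max_gap seconds.
    Both thresholds are illustrative guesses, not tuned values.
    """
    groups = []
    current = []
    for word in words:
        if current:
            candidate = " ".join(w.text for w in current) + " " + word.text
            gap = word.start - current[-1].end
            if len(candidate) > max_chars or gap > max_gap:
                groups.append(current)
                current = []
        current.append(word)
    if current:
        groups.append(current)
    return [(g[0].start, g[-1].end, " ".join(w.text for w in g)) for g in groups]


def format_timestamp(seconds):
    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


def to_webvtt(cues):
    """Render (start, end, text) cues as WebVTT caption text."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")
    return "\n".join(lines)
```

Since this only needs a list of timestamped words, it could run on the laptop as a batch step after the cloud transcription, with no GPU involved.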

Sketches are (c) 2007-2025 Sacha Chua - Creative Commons Attribution License 4.0 unless otherwise specified. This means you can freely share and adapt the sketches (even commercially) if you include attribution and indicate the license and any changes, like this: (c) 2025 Sacha Chua - Creative Commons attribution license