OBS LocalVocal and OCR


AI

OBS LocalVocal

Today I discovered an OBS plugin called LocalVocal that can record, transcribe, and translate audio in real time. I tried it locally on my Mac; it worked, but the performance was not perfect. In a YouTube video I watched, however, it appeared to work flawlessly. The tool is particularly useful when video or audio has no subtitles, since it can transcribe speech as it plays. Because it is built on whisper.cpp, it supports various Whisper models and multiple languages, and it also offers translation.

For educational audio and video content, this approach can be very effective. For reading, there are already several LLM-based immersive translation tools, so learning from foreign-language material is no longer a pain point. AI has already transformed the way we access learning materials in different languages.

OCR

I plan to set up a local OCR service to scan all my previous diaries into electronic format. I have written diary entries every year, but it is not easy to keep all of these records in paper notebooks, so I want to organize them for future reference.

I have collected some OCR projects, but I haven’t tried them yet, partly because I am still waiting for Ollama to support vision models. Based on recent updates, the feature is under active development, but it does not seem likely to arrive in the near future.

This week, I will first scan all my diaries into PDFs or images, then use OCR to convert them into Markdown files stored in this repository. These will be the best source material for a biography if I ever decide to write my own story.
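The scan-then-convert step could be sketched roughly as below. This is only a minimal outline, not a finished tool: the function name `scans_to_markdown`, the folder layout, and the one-file-per-page naming are my own assumptions, and the OCR engine is passed in as a plain function because I have not chosen one yet.

```python
from pathlib import Path
from typing import Callable

# Assumption: scanned pages are saved as ordinary image files.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def scans_to_markdown(
    scan_dir: Path,
    out_dir: Path,
    ocr: Callable[[Path], str],
) -> list[Path]:
    """Run `ocr` on every image in `scan_dir` and write one Markdown file per page.

    `ocr` is a placeholder for whichever OCR engine ends up being used:
    any function that takes an image path and returns the recognized text.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # Sorted so that date-named scans are processed in chronological order.
    for image in sorted(scan_dir.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip anything that is not a scanned page
        text = ocr(image).strip()
        target = out_dir / f"{image.stem}.md"
        # Use the scan's file name (for example a date) as the page heading.
        target.write_text(f"# {image.stem}\n\n{text}\n", encoding="utf-8")
        written.append(target)
    return written
```

Keeping the OCR step as a parameter means the same loop works whether the text comes from a classic OCR engine or, later, from a local vision model. Handwritten pages will likely need manual proofreading either way.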

Next, I will keep working on the article that records my grandfather’s story. I have written some pages, but it is not yet complete. He was an ordinary farmer, like thousands of others in the village, and no one will remember him except me. Even so, I have the privilege of making sure he is recorded in writing. I hope someone else will read these words someday.

© 2025 Jennings Liu. All rights reserved.