ADR and Post workflows. Transcription by Assistant / ChatGPT
Hi there,
Some of us were talking about the new "virtual assistant" using voice input with AI tools such as ChatGPT or others.
In particular, we thought of some possibilities for assisting ADR and post-production workflows, such as creating and storing on-the-fly transcriptions for "speech search" or "script validation" tasks.
Do others have ideas of where AI voice modelling and speech-to-text or large language models might be useful for SoundFlow users?
I hope you are well!
- Christian Scheuer @chrscheuer 2023-06-06 09:10:43.570Z
Thanks for sharing, Brenden. Could you elaborate a bit on your ideas in terms of ADR etc.? Would love to hear them and how you'd like this to work.
- In reply to @nednednerb: Brenden @nednednerb
Hi Christian,
Well, for myself, I do not do much dialogue replacement for video. However, I do work with voiceover, audiobook, and podcast material.
Quite often, I will receive a spreadsheet or a PDF of a book that has all the text that should have been recorded.
The course of my work involves editing each voiceover to match each row or cell in the spreadsheet. At a certain point in my workflow, I actually rely on an API call to Microsoft Machine Learning servers that a friend wrote for me. It uses his server and his account because there was no easy way to do speech-to-text in Pro Tools or Premiere to get each file or clip transcribed into its own row or cell of a CSV.
I do this workflow because it's just a convenient way to scan text visually rather than literally listen to each second of audio (more than a couple of times by the end of a job). If there are problematic noises or under-spoken pronunciation, that often shows up as a weird transcription, or I can see other issues.
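The "one transcript per clip, one row per clip" step described above could be sketched in plain JavaScript. This is a hypothetical helper, not the actual script from the workflow; the STT backend that fills in each transcript is out of scope here:

```javascript
// Sketch: collect per-clip transcripts into CSV rows (filename, transcript).
// `toCsv` and the sample data are hypothetical illustrations.
function toCsv(rows) {
  // RFC 4180-style quoting: wrap fields in quotes, double any embedded quotes
  const escape = (s) => `"${String(s).replace(/"/g, '""')}"`;
  return rows.map((r) => [escape(r.file), escape(r.text)].join(',')).join('\n');
}

const rows = [
  { file: 'clip_001.wav', text: 'Chapter one begins here.' },
  { file: 'clip_002.wav', text: 'He said, "hello" and left.' },
];

console.log(toCsv(rows));
```

The resulting text can be saved with a .csv extension and opened next to the source spreadsheet for visual scanning.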
I noted that the new "virtual assistant" was going to receive and process voice input. My thoughts quickly hopped over to the idea of the assistant receiving voice input from an audio clip in Pro Tools and then being able to give back that text in useful formats (such as CSV, as I describe above).
Also, sometimes I just want to find, "Where in this file is this specific keyword?" I find it annoying to scroll back and forth, playing, stopping, and navigating. I'd like to run STT across a timeline, then search for a text keyword and jump straight to that point on the audio timeline.
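A keyword-to-timestamp search like that could work against timestamped STT output such as an .srt subtitle file. A minimal sketch, assuming a hypothetical `findKeyword` helper and a made-up sample transcript:

```javascript
// Sketch: find the start timestamp of the first .srt cue containing a keyword.
// `findKeyword` and the sample transcript below are hypothetical illustrations.
function findKeyword(srtText, keyword) {
  // SRT cues are separated by blank lines
  const blocks = srtText.trim().split(/\n\s*\n/);
  for (const block of blocks) {
    const lines = block.split('\n');
    // lines[0] = cue index, lines[1] = "HH:MM:SS,mmm --> HH:MM:SS,mmm", rest = text
    const text = lines.slice(2).join(' ');
    if (text.toLowerCase().includes(keyword.toLowerCase())) {
      return lines[1].split(' --> ')[0]; // start time of the matching cue
    }
  }
  return null; // keyword not found
}

const srt = `1
00:00:01,000 --> 00:00:04,000
Welcome to the audiobook.

2
00:00:04,500 --> 00:00:08,000
Chapter one begins here.`;

console.log(findKeyword(srt, 'chapter one')); // "00:00:04,500"
```

The returned timestamp could then be converted to a session timecode to navigate the playhead, though that last step is editor-specific.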
I believe that various users could use this voice, speech, and text assistance in a variety of applications, but what I mentioned above would be useful for my regular work.
- Brandon Jiaconia @Brandon_Jiaconia
Hey Brenden - Pretty wild that you asked this question exactly a year ago today! I was looking into this today and was able to get STT working using OpenAI's Whisper. https://openai.com/index/whisper/
It's pretty great: it transcribes the audio and can output several text formats (.txt etc.).

I'm able to get STT working in Terminal, but for some reason SoundFlow is not running the shell command when I use something like:

```javascript
sf.ui.finder.selectedPaths.map(path => {
    log(`Now converting ${path}...`);
    sf.system.exec({
        commandLine: `whisper "${path}" --model small --language English --output_format txt`
    });
});
```
Even the simple command line that I can run in Terminal just doesn't run from SF. It may be a permissions or SIP issue (I'm on a work computer), but I have other command-line SF scripts that all work perfectly.
Christian Scheuer @chrscheuer 2024-06-06 22:01:53.531Z
It's most likely because you'll need to specify the full path to the whisper binary.
Type `which whisper` in Terminal to get the full path.
- In reply to @nednednerb: Brandon Jiaconia @Brandon_Jiaconia
Thanks, Christian. I was doing just that in an earlier version. For example:

```javascript
sf.system.exec({
    commandLine: `/opt/homebrew/bin/whisper "/Users/brandon.jiaconia/Desktop/jia_02.aif" --model small --language English --output_format txt`
});
```
The SoundFlow icon is blue for about 5 seconds, but nothing happens; the process usually takes a while in Terminal. I also tried an AppleScript triggered from SF, but that didn't work either. I'm going to keep testing. If I can get something working I'll post it!
Christian Scheuer @chrscheuer 2024-06-07 09:42:08.717Z
You could tell it to tee its output to a log file, so you could better see what's going on.
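The tee approach amounts to appending something like `2>&1 | tee /tmp/whisper.log` to the commandLine string, so both stdout and stderr land in a file while still streaming live. A stand-in demonstration of the pattern (`echo` substituted for the real whisper invocation so it runs anywhere; the file path and messages are made up):

```shell
# Capture both stdout and stderr to a log file while still printing them.
# The braces group two commands standing in for whisper's normal and error output.
{ echo "Detected language: English"; echo "example warning" >&2; } 2>&1 | tee /tmp/whisper_demo.log

# The log now holds both streams for later inspection
cat /tmp/whisper_demo.log
```

In SoundFlow this whole pipeline would go inside the `commandLine` string passed to `sf.system.exec`.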
- Brandon Jiaconia @Brandon_Jiaconia
The log is telling me that it's failing to find ffmpeg, which is a requirement for Whisper. But I have ffmpeg installed at the standard /opt/homebrew/bin/ffmpeg; I use it all the time for converting videos etc. Here is the info from the log:

```
"/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/subprocess.py", line 1955, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'ffmpeg'
```

It still works directly in Terminal, so I'm confused. I'm going to keep working on it, but if you have any ideas I'm all ears!
Christian Scheuer @chrscheuer 2024-06-07 23:48:56.286Z
It's probably because your PATH environment variable isn't correctly specified in SoundFlow's environment, so you'd have to set it yourself to include the directory of your ffmpeg binary.
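The mechanism can be demonstrated in isolation: prefixing PATH for a single invocation changes which binaries a child process can resolve. This sketch uses a throwaway stand-in `ffmpeg` in a temp directory (all paths and the script contents are made up for the demo) so it runs anywhere:

```shell
# Create a stand-in ffmpeg in a temporary directory
mkdir -p /tmp/fakebin
printf '#!/bin/sh\necho found-ffmpeg\n' > /tmp/fakebin/ffmpeg
chmod +x /tmp/fakebin/ffmpeg

# Prefix PATH for just this one invocation; the bare name `ffmpeg`
# now resolves through the prefixed directory first.
PATH=/tmp/fakebin:$PATH ffmpeg   # prints "found-ffmpeg"
```

This is the same `PATH=/opt/homebrew/bin:$PATH ...` prefix trick that makes whisper's own ffmpeg subprocess call succeed from inside SoundFlow.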
- Brandon Jiaconia @Brandon_Jiaconia
Just getting back to this - your suggestion was correct, thank you! I'm now able to run OpenAI Whisper with SoundFlow to get a transcription from an audio or video file. It runs on the file selected in Finder. Here is the script:
```javascript
// Get the path of the selected file in Finder
var fullPath = decodeURIComponent(sf.appleScript.finder.selection.getItem(0).asItem.path);

// Get the directory of the selected file
var directoryPath = fullPath.substring(0, fullPath.lastIndexOf('/'));

// Run whisper with PATH prefixed so its ffmpeg subprocess can be found
sf.system.exec({
    commandLine: `PATH=/opt/homebrew/bin:$PATH /opt/homebrew/bin/whisper "${fullPath}" --model small --language English --output_format txt --output_dir "${directoryPath}"`
});
```