Earlier this week I got an e-mail from a friend. Here’s the gist of the e-mail (names have been excluded to protect the innocent, LOL):
I have a command line tool installed through homebrew on my laptop running high sierra. The command is just ccextractor <filepath> and it runs fine in a standard bash terminal. I was hoping to use automator to be able to run it on batches of files, but i'm struggling with the syntax for the Run Shell Script command. It just keeps saying ccextractor command not found. Also the command line tool can only do one file at a time so I guess I need some way to loop the request so it does the first file, then runs the command again on the second file etc?
My friend is a fellow movie geek who wants to run CCExtractor on a batch of movie files.
CCExtractor https://www.ccextractor.org/ is an application used to extract closed captions from video files.
The problem was my friend could not figure out how to use Automator (a Mac tool) to run this command on a directory of files. An attempt was made to use bash as well with no luck. Hence the e-mail.
I replied that I could probably whip something up in Python if that would work. “Are you sure that’s not too much work?” my friend replied. “Nah, it should be pretty simple to whip up,” I replied.
Here’s the gist of what I did.
- Traveled to the https://www.ccextractor.org/ site and downloaded the binaries and some 3.x GB sample files to my drive.
- Then I opened my trusty text editor (Sublime Text, https://www.sublimetext.com/, is my editor of choice) and started a new .py (Python) program.
- After a bit of google-fu I came up with this set of code:
import os
import subprocess
directory_to_import = 'D:/Data/clients/RodPaddock/CCExtractor/'
extractor_exe_path = 'D:/Data/clients/RodPaddock/CCExtractor/ccextractorwin'
for file in os.listdir(directory_to_import):
    if file.endswith(".mpg"):
        print(os.path.join(directory_to_import, file))
        subprocess.run([extractor_exe_path, os.path.join(directory_to_import, file)])
This code was built, debugged and run on my Windows development box. The goal was to get it working as fast as possible on my main development box before moving it onto a Mac.
Here’s a link to a Gist of the code: https://gist.github.com/rjpaddock/d53956767dd4a1fe267dee08c995c956.js
Getting the code to run on the Mac was simple. Here’s that version:
import os
import subprocess
directory_to_import = '/Users/rodpaddock/ccextractor'
extractor_exe_path = 'ccextractor'
for file in os.listdir(directory_to_import):
    if file.endswith(".mpg"):
        print(os.path.join(directory_to_import, file))
        subprocess.run([extractor_exe_path, os.path.join(directory_to_import, file)])
As you can see, the changes were minimal. I changed the path to my user directory on the Mac and got rid of the specific path to the executable. I used brew to install CCExtractor on my Mac, so it was already in the PATH. After installing Python 3.x on my old Mac I was able to run the application as-is. No operating-system-specific issues.
After getting it to work I sent it off to my friend, who simply changed the path to the files to decode and BOOM, it just worked.
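Because the Mac version relies on the executable being on the PATH, a small check before the loop can turn a "command not found" failure into a clearer message. This is only a sketch: the helper name find_extractor is hypothetical and not part of the original script, and it uses the standard library's shutil.which to do the lookup.

```python
import shutil


def find_extractor(command="ccextractor"):
    """Return the full path of `command`, or raise if it cannot be found.

    `find_extractor` is a hypothetical helper, not part of the original script.
    shutil.which searches the directories listed in the PATH environment variable.
    """
    path = shutil.which(command)
    if path is None:
        raise FileNotFoundError(f"{command} was not found on the PATH")
    return path
```

The returned path could then be used in place of extractor_exe_path in the script above.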
After marveling at how much could be accomplished with so few lines of code, I became curious to see how complex it would be to build the same application in C#. I’m using .NET Core to do this, as I want to run it cross platform as well.
Here’s the same functionality in C#:
using System;
using System.Diagnostics;
using System.IO;
namespace ExtractorRunner
{
    class Program
    {
        static void Main(string[] args)
        {
            var directory_to_import = "D:/Data/clients/RodPaddock/CCExtractor/";
            var extractor_exe_path = "D:/Data/clients/RodPaddock/CCExtractor/ccextractorwin";
            foreach (var fileName in Directory.GetFiles(directory_to_import, "*.mpg"))
            {
                Console.WriteLine(fileName);
                var process = new Process()
                {
                    StartInfo = new ProcessStartInfo
                    {
                        FileName = extractor_exe_path,
                        // Quote the path so file names containing spaces arrive as one argument.
                        Arguments = $"\"{fileName}\"",
                        UseShellExecute = true,
                    }
                };
                process.Start();
                // Wait for each file to finish, as subprocess.run does in the Python version.
                process.WaitForExit();
            }
        }
    }
}
Not too bad, .NET Core. It was pretty simple to build this application and get it running as a console application.
Here’s a Gist of the C# code: https://gist.github.com/rjpaddock/be601db3995082949071121d8aa992d7
Now that I have this code, I think it would be fun to explore making it a bit more useful. I’m doing this as an exercise to learn a few more things about building more robust Python and C# console applications. Here’s a set of features I plan on adding:
- Accept an extension parameter. I started with .mpg files, but my friend had to change the extension to .mp4.
- Accept the path to decode as a parameter.
- Accept the path to the executable as a parameter.
- Parameters should be named rather than positional, if possible.
- Run this code on Windows, Mac and Linux.
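As a rough idea of where the Python version might go, here is a minimal sketch of the first four features using the standard library's argparse module. The option names (--directory, --extension, --extractor) and the function names are my own placeholders, not a final design, and matching the extension case-insensitively is an assumption that differs from the original script.

```python
import argparse
import os
import subprocess


def parse_args(argv=None):
    """Parse named (not positional) command-line options. Option names are placeholders."""
    parser = argparse.ArgumentParser(
        description="Run CCExtractor on every matching file in a directory.")
    parser.add_argument("--directory", required=True,
                        help="directory containing the files to decode")
    parser.add_argument("--extension", default=".mpg",
                        help="file extension to match (default: .mpg)")
    parser.add_argument("--extractor", default="ccextractor",
                        help="name of, or path to, the CCExtractor executable")
    return parser.parse_args(argv)


def find_files(directory, extension):
    """Return sorted full paths of files in `directory` ending with `extension`.

    The comparison ignores case (an assumption; the original script is case-sensitive).
    """
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.lower().endswith(extension.lower())
    )


def main(argv=None):
    args = parse_args(argv)
    for path in find_files(args.directory, args.extension):
        print(path)
        # One file per invocation, exactly as in the original loop.
        subprocess.run([args.extractor, path])
```

Calling main() at the bottom of a script (say, a hypothetical extract_captions.py) would let my friend run something like python extract_captions.py --directory /Users/rodpaddock/ccextractor --extension .mp4 without editing the code.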