Rushing Labs

Automating Metagoofil with Python

I recently needed to automate metagoofil searches using Python, and thought I'd share the "proof of concept" script that got it working. I'll apologize ahead of time for any errors--I'm still only just beginning to learn Python.

Why automate metagoofil? Well, once you have your hands on the type of files and metadata metagoofil brings back, there's a myriad of things you could dive into.

I already have some pretty specific ideas for how I can use this short automation script, but I'll save those for later. There's quite a bit to share.

Ok, here's the code.

The Script

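from subprocess import Popen, PIPE
import pprint
import time

printer = pprint.PrettyPrinter(indent = 4)

# Fill in the metagoofil script path and the -d value before running.
# Note: with a list of arguments (no shell), "~" is not expanded by Popen.
res = Popen([ "python", "~/", "-d", "", "-t", "doc,pdf", "-l", "200", "-n", "100", "-o", "/your/files/here", "-f", "results.html"], stdout = PIPE)

# Block until the process exits, checking every 0.5 seconds.
while res.poll() is None:
    time.sleep(0.5)

printer.pprint("completed metagoofil run")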

Some Notes

I'm specifically using Popen so that I can use .communicate() to grab the data being sent to stdout. Also, because the parameters are a list of strings, they're easy to manipulate when calling this script from other languages/tools (e.g. JavaScript).

res = Popen(["params", "array", "of", "strings"], stdout = PIPE)
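Since the arguments are just a list of strings, one way to make the call reusable is to build that list in a small helper. This is only a sketch: the name build_metagoofil_args and its parameter names are my own (hypothetical), the path and domain in the usage line are placeholders, and the flags are simply the ones used in the script above.

```python
def build_metagoofil_args(script_path, domain, file_types, out_dir,
                          l_value=200, n_value=100,
                          report_name="results.html"):
    """Return an argument list suitable for Popen (hypothetical helper).

    l_value and n_value are passed straight through to the -l and -n
    flags from the script above.
    """
    return [
        "python", script_path,
        "-d", domain,
        "-t", ",".join(file_types),   # e.g. ["doc", "pdf"] -> "doc,pdf"
        "-l", str(l_value),           # Popen wants strings, not ints
        "-n", str(n_value),
        "-o", out_dir,
        "-f", report_name,
    ]


# Placeholder path and domain, for illustration only.
args = build_metagoofil_args("/path/to/metagoofil.py", "example.com",
                             ["doc", "pdf"], "/your/files/here")
print(args)
```

The resulting list can be handed to Popen as-is, e.g. Popen(args, stdout = PIPE).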

This is a naive (yet effective) mechanism for blocking until the process has completed. It checks every 0.5 seconds to see if the process is finished. BEWARE! If you are using this...and your process never finishes...this won't finish either! Hence, naive. :)

while res.poll() is None:
    time.sleep(0.5)
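If the never-ending loop worries you, one alternative is to let .communicate() do the waiting. On Python 3.3+ it accepts a timeout, and it also reads stdout while it waits, which matters because a process that fills an unread pipe can stall. Here's a rough sketch, not a drop-in replacement: run_with_timeout is a hypothetical helper of my own, and the command is a stand-in that just prints a line (not metagoofil) so the example runs anywhere.

```python
import sys
from subprocess import Popen, PIPE, TimeoutExpired


def run_with_timeout(args, timeout_seconds):
    """Run a command and return (exit_code, stdout_text). Hypothetical helper."""
    proc = Popen(args, stdout=PIPE)
    try:
        # Blocks until the process exits or the timeout elapses.
        out, _ = proc.communicate(timeout=timeout_seconds)
    except TimeoutExpired:
        proc.kill()                   # stop the runaway process
        out, _ = proc.communicate()   # collect whatever it printed so far
    return proc.returncode, out.decode()


# Stand-in command: the current Python interpreter printing one line.
code, output = run_with_timeout([sys.executable, "-c", "print('done')"], 30)
print(code, output.strip())
```

If the timeout fires, the process is killed and the exit code will be non-zero, so the caller can tell a clean run from an abandoned one.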