Video Image Detection and #surveillancecapitalism

Open source tools and computational power have progressed rapidly, and there's a worldwide race to weave the latest theories and algorithms together with cheap, small-batch hardware. May the most accurate win. Advances in video processing and purpose-built computing centers mean that you don't need a supercomputer to do this. You simply provide your stream to an ingestion point in the cloud.

Okutama-Action with action-detection annotation. Ground-truth bounding boxes are displayed, along with the actions performed; colours represent IDs. Standing hand-shaking with a red-class object? That's a no-no according to the ministry of health. You are docked 13 civic credits!

The Competition: Public vs. Private Software

Each year, the Multiple Object Tracking Benchmark publishes rankings of multi-object tracking software evaluated on a standardized set of videos. These samples are run against various platforms, algorithms, and software packages to measure how closely each one reproduces the "ground truth".

A common collection of "training" datasets that software is evaluated against, competing on speed and accuracy

Typically, the unified evaluation framework provides meaningful results on the effectiveness of each tracker, which boil down to the following metrics:

  • Accuracy and Rank
  • Number of Mostly Tracked/Lost objects
  • False positives/negatives
  • Tracker speed in frames per second
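To make these metrics concrete, here is a minimal sketch (not the official MOTChallenge evaluation code, and with identity switches omitted) of how false positives, false negatives, and a MOTA-style accuracy score can be tallied: predictions are matched to ground-truth boxes by intersection-over-union, and anything left over counts as an error.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def frame_errors(gt_boxes, pred_boxes, thresh=0.5):
    """Greedily match predictions to ground truth; return (TP, FP, FN)."""
    unmatched_gt = list(gt_boxes)
    tp = 0
    for p in pred_boxes:
        best = max(unmatched_gt, key=lambda g: iou(g, p), default=None)
        if best is not None and iou(best, p) >= thresh:
            unmatched_gt.remove(best)
            tp += 1
    fp = len(pred_boxes) - tp   # predictions with no ground-truth match
    fn = len(unmatched_gt)      # ground-truth objects the tracker missed
    return tp, fp, fn

def mota(frames):
    """MOTA = 1 - (FN + FP + ID switches) / total GT (switches omitted)."""
    total_gt = sum(len(gt) for gt, _ in frames)
    fp = fn = 0
    for gt, pred in frames:
        _, f_fp, f_fn = frame_errors(gt, pred)
        fp += f_fp
        fn += f_fn
    return 1.0 - (fp + fn) / total_gt
```

A perfect tracker scores 1.0; every missed or hallucinated box drags the score down, which is why the benchmark reports false positives and negatives alongside the headline accuracy.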

Open Source Projects Abound

Certainly there's a ton of open source projects for computer vision. See GitHub for snippets and full-on executables ready to take a stream.

Here's an open source Python-based single object tracker. In this one, the camera moves.
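To show the basic idea behind single-object tracking (this is a toy sketch, not the tracker linked above), here is a template-matching tracker in NumPy: it remembers a patch of the first frame and, on each new frame, searches a small window around the last known position for the best sum-of-absolute-differences match. Camera motion just shifts where the best match lands.

```python
import numpy as np

class TemplateTracker:
    """Toy single-object tracker using sum-of-absolute-differences search."""

    def __init__(self, frame, box, search=8):
        x, y, w, h = box                     # box as (x, y, width, height)
        self.template = frame[y:y + h, x:x + w].astype(float)
        self.pos = (x, y)
        self.search = search                 # search radius in pixels

    def update(self, frame):
        """Return the (x, y) of the best template match near the last position."""
        x0, y0 = self.pos
        h, w = self.template.shape
        best, best_pos = float("inf"), self.pos
        for dy in range(-self.search, self.search + 1):
            for dx in range(-self.search, self.search + 1):
                x, y = x0 + dx, y0 + dy
                if x < 0 or y < 0 or y + h > frame.shape[0] or x + w > frame.shape[1]:
                    continue                 # candidate window falls off the frame
                sad = np.abs(frame[y:y + h, x:x + w] - self.template).sum()
                if sad < best:
                    best, best_pos = sad, (x, y)
        self.pos = best_pos
        return best_pos
```

Real trackers replace the exhaustive window search with correlation filters or learned features, but the update loop has the same shape: predict a neighbourhood, score candidates, commit to the best one.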

SORT (Simple Online and Realtime Tracking) is a barebones implementation of a visual multiple object tracking framework.
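SORT's core loop associates the boxes predicted for existing tracks with the detections in each new frame. The real SORT maintains a Kalman filter per track and solves the assignment with the Hungarian algorithm; the sketch below keeps only the association idea, using a greedy IoU match instead.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def associate(tracks, detections, iou_threshold=0.3):
    """Return (matches, unmatched_tracks, unmatched_detections).

    matches is a list of (track_index, detection_index) pairs."""
    # Consider every track/detection pair, best overlap first.
    pairs = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True)
    matches, used_t, used_d = [], set(), set()
    for score, ti, di in pairs:
        if score < iou_threshold:
            break                      # remaining pairs overlap too little
        if ti in used_t or di in used_d:
            continue                   # each track/detection matched once
        matches.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    unmatched_t = [i for i in range(len(tracks)) if i not in used_t]
    unmatched_d = [i for i in range(len(detections)) if i not in used_d]
    return matches, unmatched_t, unmatched_d
```

Unmatched detections become candidate new tracks, and tracks that go unmatched for too long are dropped; that bookkeeping plus the motion model is essentially all SORT adds.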

Crowd analysis?

How about computational cell tracking using BayesianTracker?