Most of us end up with hard drives and backup drives full of images from phones, digital cameras and so on, and inevitably some of them are duplicates. This script is meant to help organize them.

The script works in the following way:

  1. Calculate the MD5 hash of each image file and store it in a SQLite database.
  2. If the hash has been seen before, ignore the file.
  3. If the hash has not been seen before, copy the file into the destination directory and change its file name.
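
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in dedupe.py: the function names, the `seen` table layout, the list of image extensions and the renaming scheme (MD5 digest plus the original extension) are all assumptions made for the example.

```python
import hashlib
import shutil
import sqlite3
from pathlib import Path

# Assumption: which extensions count as images (the real script may differ).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dedupe(src_dir, dest_dir, db_path):
    """Copy every image with an unseen MD5 hash from src_dir to dest_dir.

    Returns the number of files copied.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(db_path)
    # Hypothetical schema: one row per hash already copied.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen (md5 TEXT PRIMARY KEY, path TEXT)"
    )
    copied = 0
    for path in sorted(Path(src_dir).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        digest = md5_of_file(path)  # step 1
        row = conn.execute(
            "SELECT 1 FROM seen WHERE md5 = ?", (digest,)
        ).fetchone()
        if row is not None:
            continue  # step 2: hash already seen, ignore the file
        # Step 3: copy under a new name (assumed scheme: <md5><extension>).
        target = dest / f"{digest}{path.suffix.lower()}"
        shutil.copy2(path, target)
        conn.execute(
            "INSERT INTO seen (md5, path) VALUES (?, ?)", (digest, str(target))
        )
        copied += 1
    conn.commit()
    conn.close()
    return copied
```

Because the hashes are kept in the database between runs, running the sketch a second time over the same source directory copies nothing.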

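The available options are self-explanatory. The usage output below was produced by passing --help, which the script does not recognize (hence the ERROR lines); use -h for help.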

	dedupe.py Version 0.74 (http://www.siricon.co.uk/deduper)
	Usage: dedupe.py [options]
	ERROR
		 option --help not recognized
	OPTIONS
		 -h Help
		 -v Verbose debugging
		 -s <Source Directory>
		 -d <Destination Directory>
		 -l <number> Limit the Run to <number> of images
		 -r Rebuild the Database
	LONG OPTIONS
		 --srcdir=<Source Directory>
		 --destdir=<Destination Directory>
		 --limit=<number> Limit the Run to <number> of images
		 --rebuild Rebuild the Database
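
The option set above could be parsed along the following lines. The wording of the error ("option --help not recognized") matches what Python's standard `getopt` module reports for an unknown long option, which suggests, but does not confirm, that the script uses it. The `parse_args` function and the keys of the returned dictionary are hypothetical names chosen for this sketch.

```python
import getopt


def parse_args(argv):
    """Parse the documented options into a plain dictionary.

    Raises getopt.GetoptError for unknown options such as --help.
    """
    opts, _ = getopt.getopt(
        argv,
        "hvs:d:l:r",  # short options; a trailing colon means "takes a value"
        ["srcdir=", "destdir=", "limit=", "rebuild"],  # long options
    )
    settings = {
        "help": False,
        "verbose": False,
        "srcdir": None,
        "destdir": None,
        "limit": None,
        "rebuild": False,
    }
    for flag, value in opts:
        if flag == "-h":
            settings["help"] = True
        elif flag == "-v":
            settings["verbose"] = True
        elif flag in ("-s", "--srcdir"):
            settings["srcdir"] = value
        elif flag in ("-d", "--destdir"):
            settings["destdir"] = value
        elif flag in ("-l", "--limit"):
            settings["limit"] = int(value)
        elif flag in ("-r", "--rebuild"):
            settings["rebuild"] = True
    return settings
```

With this sketch, `parse_args(["-s", "in", "--destdir=out", "-l", "10"])` yields a source directory of `in`, a destination of `out` and a limit of 10, while `parse_args(["--help"])` raises `getopt.GetoptError` because no long `help` option is declared.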