Merge lp:~manishsinha/dmedia/fix-bug-680886 into lp:dmedia

Status: Rejected
Rejected by: Jason Gerard DeRose
Proposed branch: lp:~manishsinha/dmedia/fix-bug-680886
Merge into: lp:dmedia
Diff against target: 62 lines (+25/-20)
1 file modified
dmedia/extractor.py (+25/-20)
To merge this branch: bzr merge lp:~manishsinha/dmedia/fix-bug-680886
Reviewer Review Type Date Requested Status
Jason Gerard DeRose Needs Fixing
Review via email: mp+58173@code.launchpad.net

Commit message

Used pyexiv2 to extract the exif metadata.

It earlier used exiftool. This solution eliminates the useless complex rain dance like solution.

Description of the change

* Removed the rain-dance solution for extracting EXIF metadata.
* Uses pyexiv2 instead of opening a process and pumping output round the globe :)
* Please make sure that the packaging is updated accordingly.

* Right now there are only a few keys for exif present now extractor.py
Run it on many files and try to get more data and update the constant field EXIF_ALLOW_REMAP

To post a comment you must log in.
189. By Manish Sinha (मनीष सिन्हा)

Fixed the mistakes in the comment

Revision history for this message
Jason Gerard DeRose (jderose) wrote :

Thanks, Manish! Looks like this is well on it's way to getting rid of my check_call() nastiness.

For me the unit tests aren't working, so this needs fixing. You can run just the unit tests for the extractor like this:

  ./setup.py test --names=extractor

I'm getting 4 failures:

  http://paste.ubuntu.com/595774/

These tests use the `MVI_5751.MOV` and `MVI_5751.THM` files in `dmedia/tests/data` (test video from a 5D Mark II).

What version of of python-exiv2 are you using? The version in Natty (python-exiv2 0.3.0-3ubuntu1) seems to have some API differences from what you're using. For example, there is no pyexiv2.Image:

jderose@jgd-ws:~/bzr/dmedia/fix-bug-680886$ python
Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyexiv2
>>> pyexiv2.Image
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'Image

review: Needs Fixing
190. By Manish Sinha (मनीष सिन्हा)

Updated dmedia EXIF extraction to use latest version available in natty.

Pre-requisite: pytho-pyexiv2 0.3.0
HINT: This pyexiv2 is a lot lot better than the one present in maverick.
This one has a better documentation and cleaner API.

Revision history for this message
Manish Sinha (मनीष सिन्हा) (manishsinha) wrote :

Fixed as per the latest pyexiv2 (0.3.0-3ubuntu1)

Just some info:
This exif extraction is for images. There are 2 more bugs logged for video and audio.
So this test needs to be run with png/jpg/gif images

Revision history for this message
Jason Gerard DeRose (jderose) wrote :

As mention in #novacut, the `MVI_5751.THM` test file is actually a standard JPEG file, a little trick we use to get the camera settings (ISO, Aperture, Shutter, etc) from a Canon HDSLR.

Manish, you mentioned that you think pyexiv2 uses the filename for type determination, that this is why the unit tests are still failing?

Regardless, I'm not willing to accept this if it means loosing functionality, despite your solution obviously being the longterm correct approach.

Revision history for this message
Manish Sinha (मनीष सिन्हा) (manishsinha) wrote :

> the `MVI_5751.THM` test file is actually a standard JPEG file
I don't understand why those files can't use the simple .jpg/.jpeg extension in this case?

> you mentioned that you think pyexiv2 uses the filename for type determination
I need to look at the code of pyexiv/exiv for that. AFAIK pyexiv2 is a python wrapper over exiv2 which is written in C++ and Boost. I would prefer killing myself with glade than looking at Boost code

> I'm not willing to accept this if it means loosing functionality
Makes sense. No emotions should be entertained. W

Revision history for this message
Jason Gerard DeRose (jderose) wrote :

> I don't understand why those files can't use the simple .jpg/.jpeg extension in this case?

Ah good question, I should have made this clear. Those files are named exactly as they'll be when importing from a memory card, you know, this sort of thing:

  http://vimeo.com/18287329

The filename is how we know to associate a specific video file (MVI_5751.MOV) with it's video thumbnail file (MVI_5751.THM). The EXIF metadata for stuff like ISO, Aperture, Shutter, etc, is only stored in the THM file, is not in the MOV file... and as such, during the import is the only time we can make the association with decent reliability.

So regardless how we do it, we need to make this work with these file names. Also note that renaming the file on the memory card isn't okay because in the near future we want to mount them ready-only as we do the import.

Revision history for this message
Manish Sinha (मनीष सिन्हा) (manishsinha) wrote :

I think this merge request should be rejected and this should be done via gstreamer itself. There is no use adding all these exiv2 as dependency when we would be surely having gstreamer as dependency for all the heavy lifting.

Unmerged revisions

190. By Manish Sinha (मनीष सिन्हा)

Updated dmedia EXIF extraction to use latest version available in natty.

Pre-requisite: pytho-pyexiv2 0.3.0
HINT: This pyexiv2 is a lot lot better than the one present in maverick.
This one has a better documentation and cleaner API.

189. By Manish Sinha (मनीष सिन्हा)

Fixed the mistakes in the comment

188. By Manish Sinha (मनीष सिन्हा)

Used pyexiv2 to extract the exif metadata.

It earlier used exiftool. This solution eliminates the useless complex rain dance like solution.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'dmedia/extractor.py'
2--- dmedia/extractor.py 2011-04-06 01:27:49 +0000
3+++ dmedia/extractor.py 2011-04-19 11:45:50 +0000
4@@ -32,20 +32,20 @@
5 from base64 import b64encode
6 import time
7 import calendar
8+import pyexiv2
9
10-# exiftool adds some metadata that doesn't make sense to include:
11-EXIFTOOL_IGNORE = (
12- 'SourceFile', # 'dmedia/tests/data/MVI_5751.THM'
13- 'ExifToolVersion', # 8.15
14- 'FileName', # 'MVI_5751.THM'
15- 'Directory', # 'dmedia/tests/data',
16- 'FileSize', # '27 kB'
17- 'FileModifyDate', # '2010:10:19 20:43:18-06:00'
18- 'FilePermissions', # 'rw-r--r--'
19- 'FileType', # 'JPEG'
20- 'MIMEType', # 'image/jpeg'
21- 'ExifByteOrder', # 'Little-endian (Intel, II)'
22-)
23+# exiftool adds some metadata that makes sense to include:
24+# Examples folow
25+EXIF_ALLOW_REMAP = {
26+ 'Exif.Image.Make': 'Manufacturer', # HTC
27+ 'Exif.Image.Model': 'Model', # Desire HD
28+ 'Exif.Photo.ISOSpeedRatings' : 'ISOSpeedRatings', # 965
29+ 'Exif.Photo.DateTimeOriginal': 'OriginalDateTime', # 2011-03-05 20:30:09
30+ 'Exif.Photo.DateTimeDigitized': 'DigitizedDateTime', # 2011-03-05 20:30:09
31+ 'Exif.Photo.PixelXDimension': 'ImageWidth', # 3264
32+ 'Exif.Photo.PixelYDimension':'ImageHeight', # 1952
33+ 'Exif.Photo.FlashpixVersion':'FlashpixVersion' # 0100
34+}
35
36
37 # We try to pull an authoritative mtime from these EXIF keys:
38@@ -95,13 +95,18 @@
39 Attempt to extract EXIF metadata from file at *filename*.
40 """
41 try:
42- args = ['exiftool', '-j', filename]
43- (stdout, stderr) = Popen(args, stdout=PIPE).communicate()
44- exif = json.loads(stdout)[0]
45- assert isinstance(exif, dict)
46- for key in EXIFTOOL_IGNORE:
47- exif.pop(key, None)
48- return exif
49+ exif = {}
50+ # Create an image instance
51+ image = pyexiv2.ImageMetadata(filename)
52+ # Read all the exif metadata.
53+ # Without this setp, next all command will fail
54+ image.read()
55+ exifkeys = image.exif_keys()
56+ for key in exifkeys:
57+ # If the key was found in the dictionary
58+ if key in EXIF_ALLOW_REMAP:
59+ # If found then used the remaped key to store the value
60+ exif[EXIF_ALLOW_REMAP[key]] = image[key].raw_value
61 except Exception as e:
62 return {u'Error': u'%s: %s' % (e.__class__.__name__, e)}
63

Subscribers

People subscribed via source and target branches