
Is JPEG Good Enough for Archival Masters?

On the ImageLib mailing list, Rob Lancefield (Manager of Museum Information Services for Wesleyan University) posted a link to the Universal Photographic Digital Imaging Guidelines (UPDIG) for image creators. The introduction says: “These 12 guidelines — provided as a Quick Guide plus an in-depth Complete Guide — aim to clarify the issues affecting accurate reproduction and management of digital image files. Although they largely reflect a photographer’s perspective, anyone working with digital images should find them useful…. This document, prepared by the UPDIG working group, represents the industry consensus as of September 2007.” The listed members of UPDIG lead one to believe that this is a professional photography group. One thing in the introduction to the guidelines caught my eye, though:

The chapter on archiving now has a discussion of JPEG as an archival format.

Note that the authors do indeed mean JPEG (circa 1994), not JPEG2000. The chapter on archiving lists the pros and cons of a number of formats, including JPEG. The following bullet points are excerpted from the text.

  • Conversion to TIFF files: By converting images to TIFF format [from camera RAW], the photographer is storing the images in the most accessible file format… There is a downside, however. TIFF files are much larger than RAW files… Another downside to conversion to TIFF is that it precludes the use of better RAW converters that are surely coming in the future.
  • Archiving JPEG files: Conventional wisdom holds that the TIFF format holds a quality advantage over the JPEG format. This holds true only if the JPEG file is saved at less than 10 quality using the Photoshop standard. When using JPEG quality 10 or 12, the artifacts are either non-existent or insignificant. Higher bit-depth is really the only advantage of using TIFF over JPEG 10 or 12 (in terms of image quality)… Update 2008-02-11: Please see below.
  • Archiving RAW files: If a photographer chooses to archive the RAW file, then he will be preserving the largest number of options for future conversion of the files… This, too, has its downside. RAW files will likely have to be converted to a more universal file format at some time in the future.
  • Archiving DNG files: RAW files can be converted to DNG, a documented TIFF-based format created by Adobe that can store the RAW image data, metadata, and a color-corrected JPEG preview of the image. The DNG file format provides a common platform for information about the file and adjustments to the image… DNG is likely to be readable long after the original RAW format becomes obsolete, simply because there will be so many more of them than any particular RAW file format… There’s a downside to DNG, of course. Conversion to DNG requires an extra step at the time of RAW file processing; it does not take terribly long, but it is an extra process.

Update 2008-02-11: Ken Fleisher noted in the comments that the excerpt above was truncated before his reasoning was described. In the interest of clarity, the full text of this bullet point on the UPDIG site is:

Archiving JPEG files: Conventional wisdom holds that the TIFF format holds a quality advantage over the JPEG format. This holds true only if the JPEG file is saved at less than 10 quality using the Photoshop standard. When using JPEG quality 10 or 12, the artifacts are either non-existent or insignificant. Higher bit-depth is really the only advantage of using TIFF over JPEG 10 or 12 (in terms of image quality). Some have argued that that JPEG, because of the way it encodes data, compromises color. This is a misconception. When using the highest quality settings, there is no loss of color fidelity. Therefore, if JPEG files are saved at 10-12 quality, and if they do not require much pixel editing before use, archiving JPEG files is not a bad concept, and it can save a lot of space. For many picture archives, the economics of storing large numbers of files dominates all other considerations, and JPEG offers a feasible solution to the problem.

The notes at the end of the chapter say: “The archiving JPEG section is based on research and analysis by Ken Fleisher.”

So I wonder what is going on here. Does the cultural heritage community have a different definition of the word archive from the professional photography community? Are the differences in our goals large enough to warrant the differences in practice?

This topic is of interest because the JPEG2000 in Archives and Libraries Interest Group of the Library and Information Technology Association (LITA) will be holding a panel at the ALA Annual Conference in Anaheim this summer on using the JPEG2000 file format for archival purposes. Part of the discussion will center on the notion of visually lossless versus data lossless compression. This mention of lossy-yet-high-quality JPEG compression seems to fit into the same topic.

(This post was updated on 11-Feb-2008.)

8 Comments

  1. Rob Lancefield | February 6, 2008 at 10:00 am | Permalink

    Hi, Peter. Excellent points. Other folks more centrally involved than I in crafting UPDIG documents could offer more deeply grounded comments, but I’ll give it a shot. As you observe, UPDIG consists mostly of people from professional photography/image organizations, along with a smaller number who do digital imaging and image management for institutional collections; so it aims for a diverse set of potential readers, users, and local applications. We institutional folk may find the guidelines useful to look through and assess against our own communities’ best practices, and then perhaps to harmonize certain of our local practices with UPDIG’s higher-end recommendations, as it were, if appropriate.

    Your first, third, and fourth bulleted points respond to varying degrees, I believe, to the underlying fact that one central purpose of the guidelines is to foster movement towards more sustainable image management by photographers who don’t work in institutional settings (or at least in libraries, archives, or museums), and the consequent need to offer some flexible paths of entry to best practices–a case of trying not to make the closest-we-can-get-to-perfect into the enemy of the reasonably good….

    As a fellow cultural heritage repository person, though, I agree regarding the second point you cite, about JPEG (as you note, not any of the JPEG 2000 family of formats, in which I continue to vest serious hope for many uses). That still leaves me feeling queasy at best, even though I believe it’s meant to offer a lowest-threshold path towards incremental improvement of practices in severely resource-limited, individual-practitioner contexts (in those settings, better less than more compression even if decompression doesn’t yield a bit-for-bit result?) or for “archives” in the looser sense of aggregators and/or providers of images for specific, delimited kinds of end use (not “archives” in our sense of an institutional collection of digital surrogates preserved in maximally use-neutral, lossless ways). Or so I hope. Clarifying the intended context for this does seem important for the next iteration, and I’ll pass that suggestion back to the authors–who do an amazing job of coming out with successively improved versions.

    Perhaps Ken Fleisher, more of an UPDIG insider than I am, could shed more light on this; I’ll point him towards this useful thread. Enough from me, anyway! –Rob

  2. the Jester | February 6, 2008 at 10:38 am | Permalink

    A great clarification on the intent of the UPDIG document. I didn’t read through the whole thing, so it is possible I took the “archive” chapter out of context of the rest of the document and its intended audience. I would agree that a JPEG2000 practice would take some effort given that it is not the native format of digital cameras. Still, it would be useful (I think) if the document’s archive chapter attempted to stratify the options from least-effort to gold-standard as a way to organize the options available to practitioners.

    Thanks for the comment, Rob, and for pointing the authors of the UPDIG to this discussion.

  3. Rob Lancefield | February 6, 2008 at 1:24 pm | Permalink

    And three quick followups:

    - another key framing factor is the likelihood that in the coming decades, a growing number of born-digital images initially (and long-) managed by photographers will subsequently enter into public collections, and that encouragement towards better practices (when truly best practices may be unattainable) during those earlier phases of personal image stewardship potentially has wider, long-term benefits.

    - one promising avenue for building connections between the library community and UPDIG may be the Preserving Digital Images initiative of the American Society of Media Photographers, undertaken with Library of Congress support via the National Digital Information Infrastructure and Preservation Program (NDIIPP); http://www.asmp.org/pdi/index.php has more on the “ASMP-PDI,” as it’s known.

    - repository-focused work linked to UPDIG in some other key ways is happening under the auspices of ImageMuse (http://imagemuse.org).

    And with that, I’ll stop for real. Promise.

  4. Ken Fleisher | February 7, 2008 at 2:27 pm | Permalink

    Thank you for pointing me to this discussion. It appears as if my comments about JPEG were attached to the wrong section and I had not caught this. (I will see about having this fixed on the UPDIG site, or at least clarified.) My original comments were in response to the JPEG vs. TIFF issue with respect to image quality in the context of a “Delivery Spec”, not as an archival master. I do feel that a JPEG saved at Photoshop’s level 10 or 12 is a perfectly reasonable delivery format and not inferior to TIFF for a number of reasons. Particularly, I feel that JPEG should not be disregarded as a delivery format because it’s perceived as “lossy”.

    That said, I do also feel that in some situations, though certainly not all, JPEG “could” be appropriate as an archival format as well. Take this example:

    I took an image capture of a painting (color corrected, sharpened, etc., and then downsampled to 8-bit/channel). I saved one version as TIFF and one as JPEG quality 12, baseline optimized. I get the following results when I compare the difference between the two files:

    17,020,752 total pixels in the image

    (RED channel histogram values listed, but similar numbers for GREEN and BLUE channels)
    8-BIT DIFFERENCE LEVEL – # Pixels
    0 – 7,438,696
    1 – 8,128,655
    2 – 1,375,457
    3 – 75,734
    4 – 2,173
    5 – 37
    6 – 0
    7 – 0
    etc.
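
A table like the one above can be produced by counting pixels by the absolute difference between the two versions of a channel. The following is a minimal sketch of that counting step only, not the commenter's actual procedure: the function name and the eight sample values are made up for illustration, and decoding real TIFF and JPEG files into lists of channel values would need an imaging library that is not shown here.

```python
from collections import Counter


def difference_histogram(reference, candidate):
    """Count pixels by absolute difference between two equally sized 8-bit channels."""
    if len(reference) != len(candidate):
        raise ValueError("channels must contain the same number of pixels")
    return Counter(abs(r - c) for r, c in zip(reference, candidate))


# Hypothetical red-channel values standing in for the TIFF and JPEG versions.
tiff_red = [12, 40, 41, 200, 201, 90, 91, 255]
jpeg_red = [12, 41, 40, 198, 201, 93, 91, 254]

hist = difference_histogram(tiff_red, jpeg_red)

# Print one row per difference level, including levels with no pixels.
for level in range(max(hist) + 1):
    print(f"{level} - {hist[level]}")
```

The counts always sum to the number of pixels compared, which is a quick sanity check on any such table.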

    Typical camera noise for digital cameras is on the order of 2.5% and follows a gaussian distribution. (This is a gross over-simplification of noise, but is sufficient for this discussion.) That means we can expect 8-bit pixel values to vary by a maximum of plus or minus 6 levels, with a standard deviation of about 1.5 – 2 levels. In other words, based on noise characteristics, we can expect our image captures to have errors of plus or minus “2” for 2/3 of the pixels, with errors no larger than 6. Although the example does not exactly fit this model, it should be clear that the errors introduced by saving as JPEG are very close to being on the order of typical image capture noise. Errors such as this will be visually lossless in most cases. An exception might be a color space such as ProPhoto RGB, where there may be “very” slight visual differences. Therefore, I believe that in some cases where it is not necessary to save the raw camera capture, JPEG can be used instead of TIFF for archival purposes. Stated another way, a JPEG file saved as above does not add any more distortion than typical camera noise. As long as the effect (which is admittedly compounded with the camera noise) is not visible, then why is this any worse of an archive file? I’ll stress again, this depends highly on the application and purpose of the archive you are creating, but I do see it as a plausible format for some cases.
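
The comparison with camera noise can be checked against the red-channel counts reported in the comment. The sketch below is one reading of the argument rather than the commenter's own calculation. It assumes that "2.5%" refers to the full 8-bit range of 255 levels and that the noise is Gaussian, as the comment states. All variable and function names are illustrative.

```python
import math

# Red-channel counts reported in the comment: difference level -> number of pixels.
reported = {0: 7_438_696, 1: 8_128_655, 2: 1_375_457, 3: 75_734, 4: 2_173, 5: 37}
total = sum(reported.values())


def share_within(limit):
    """Fraction of pixels whose TIFF/JPEG difference is at most `limit` levels."""
    return sum(n for level, n in reported.items() if level <= limit) / total


mean_abs_error = sum(level * n for level, n in reported.items()) / total
rms_error = math.sqrt(sum(level ** 2 * n for level, n in reported.items()) / total)

# Assumption: 2.5% noise is measured against the 255-level 8-bit range.
noise_peak = 0.025 * 255
# Share of a Gaussian distribution that lies within one standard deviation.
gaussian_one_sigma = math.erf(1 / math.sqrt(2))

print(f"pixels compared:            {total:,}")
print(f"share within +/-1 level:    {share_within(1):.4f}")
print(f"share within +/-2 levels:   {share_within(2):.4f}")
print(f"mean absolute error:        {mean_abs_error:.3f} levels")
print(f"RMS error:                  {rms_error:.3f} levels")
print(f"assumed noise peak:         {noise_peak:.2f} levels")
print(f"Gaussian share within 1 sd: {gaussian_one_sigma:.4f}")
```

Under these assumptions the RMS error of the JPEG comes out below one level, which is smaller than the 1.5 to 2 level standard deviation quoted for capture noise. The reported counts also add up exactly to the stated 17,020,752 pixels.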

  5. Ken Fleisher | February 8, 2008 at 11:40 am | Permalink

    Upon reviewing my documents, I realize that my comments on JPEG as an archiving format were not taken out of context. I made the recommendation for both a delivery format and (conditionally) as an archiving format. However, the quoted text above was truncated before the qualifying statements, which essentially restate my comments from my last post. The full text on the UPDIG site reads:

    “Archiving JPEG files: Conventional wisdom holds that the TIFF format holds a quality advantage over the JPEG format. This holds true only if the JPEG file is saved at less than 10 quality using the Photoshop standard. When using JPEG quality 10 or 12, the artifacts are either non-existent or insignificant. Higher bit-depth is really the only advantage of using TIFF over JPEG 10 or 12 (in terms of image quality). Some have argued that that JPEG, because of the way it encodes data, compromises color. This is a misconception. When using the highest quality settings, there is no loss of color fidelity. Therefore, if JPEG files are saved at 10-12 quality, and if they do not require much pixel editing before use, archiving JPEG files is not a bad concept, and it can save a lot of space. For many picture archives, the economics of storing large numbers of files dominates all other considerations, and JPEG offers a feasible solution to the problem.”

    This relates the ideas that 1) the image is already edited or does not require significant editing, 2) that only the highest JPEG quality levels are used, and 3) that the solution is not for everybody but when appropriate, it can have certain advantages–namely significant space savings (translates to “cost savings” for most of us). Given these qualifying statements, I don’t feel the text requires any changes.

    Also after re-reading the original message, I believe that for most professional uses, JPEG (with stated constraints) should be a viable solution. For scientific uses where introduction of any compounding error can distort experimental results, JPEG is clearly not a solution. But since these errors are so small (most are less than plus or minus 2), they simply will not be visible in most cases (large gamut color spaces noted as an exception, but even then the errors are only “slightly” visible and only if you look for them). I’m not saying that errors aren’t introduced, but they truly are insignificant.

  6. the Jester | February 11, 2008 at 1:13 pm | Permalink

    Ken –

    For me, the key part of your reply was:

    I feel that JPEG should not be disregarded as a delivery format because it’s perceived as “lossy”.

    I, for one, am just coming to terms with the concept of “visually lossless” and its associated consequences as a practice for terminal (e.g. archival) copies. It is, of course, much easier to say “JPEG is a lossy compression method, therefore it should not be used for archival copies.” It is much harder to give the nuanced argument, as you did with your example comparing a TIFF with its corresponding JPEG quality 12 instance. (I almost used the word “derivative” instead of “instance” — demonstrating that I’m still struggling internally with the nuanced argument.)

    I do not want to give a misleading impression to anyone who comes upon this post, so I’ve updated the text of the main posting with the full paragraph from the UPDIG site. Thanks for offering your perspective and setting the record straight.

  7. Howard Brainen | February 14, 2008 at 1:46 pm | Permalink

    I have no problem with JPEG as a proper format as a “delivery spec.” But I can’t recommend archiving any image file that started life with more than 8 bits per channel as a JPEG. Those extra bits often contain significant amounts of useful information. One needs to save as much information as possible because some of it will be lost when files are later delivered for various outputs.
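
The bit-depth point can be illustrated with a few numbers. In this minimal sketch, the mapping from 16 to 8 bits simply drops the low byte. That is one simple choice assumed for illustration, not necessarily what any particular image editor does, and the sample values are made up.

```python
def to_8_bit(value_16):
    """Reduce a 16-bit sample to 8 bits by discarding the low byte (assumed mapping)."""
    return value_16 >> 8


# Four distinct 16-bit samples from a smooth gradient (hypothetical values).
samples_16 = [25600, 25650, 25700, 25855]
samples_8 = [to_8_bit(v) for v in samples_16]

# Number of 16-bit levels that share each 8-bit level under this mapping.
levels_per_8_bit_step = 2 ** 16 // 2 ** 8

print(samples_8)
print(levels_per_8_bit_step)
```

All four source values land on the same 8-bit level, so the tonal distinctions between them cannot be recovered from the 8-bit file.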

  8. the Jester | February 14, 2008 at 7:19 pm | Permalink

    Howard,

    The full version of Ken Fleisher’s recommendation, recently added to the text of this posting above, does acknowledge the benefit of TIFF over JPEG for greater-than-8-bit-depth images:

    Higher bit-depth is really the only advantage of using TIFF over JPEG 10 or 12 (in terms of image quality).

    The recommendation also says that JPEG is appropriate for terminal copies (my phrase; the text says “and if they do not require much pixel editing before use”). In other words, I think, no further transformations are expected.

3 Trackbacks

  1. Kramer auto Pingback [...] Blog post: Is JPEG Good Enough for Archival Masters? [...]

  2. Kramer auto Pingback [...] As to the issue of using JPEG for “definitive” storage of image data, here’s a link to a discussion on that point which is currently going on among archivists: http://dltj.org/article/jpeg-as-master/ [...]

  3. Mining Metadata » JPEG and Archives | February 29, 2008 at 7:28 am | Permalink

    [...] http://dltj.org/article/jpeg-as-master/ [...]

From the Disruptive Library Technology Jester (http://dltj.org/), printed on Tuesday the 9th of February 2010 at 8:03:31 AM EST (-0500). The URL to this page is http://dltj.org/article/jpeg-as-master/

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/us/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.