Table of supported image formats in EMAN2

| Type | Extension | Read | Write | 3D | Image Stacks | Region I/O | Comments |
|---|---|---|---|---|---|---|---|
| Primary EMAN2 Formats | | | | | | | |
| BDB | N/A | Y | Y | Y | Y | Y | This entry is for EMAN2's embedded database system, used by default for most operations. For portability see HDF5. BDB and HDF5 are the only two formats that support the full EMAN2 metadata. These files are in EMAN2DB directories and should not be manipulated manually. Please see this important note |
| HDF5 | hdf | Y | Y | Y | Y | Y | HDF5 is an international standard for scientific data (http://www.hdfgroup.org/HDF5/). It supports arbitrary metadata (header info) and is very portable. This is the standard interchange format for EMAN2. Chimera can read EMAN2 style HDF files. |
| Cryo-EM Formats | | | | | | | |
| DM2 (Gatan) | dm2 | Y | N | N | N | N | Proprietary Gatan format (older version) |
| DM3 (Gatan) | dm3 | Y | N | N | N | N | Proprietary Gatan format from Digital Micrograph |
| DM4 (Gatan) | dm4 | Y | N | Y | Y | N | Proprietary Gatan format from Digital Micrograph, used with K2 cameras |
| SER (FEI) | ser | Y | N | N | Y | N | Proprietary FEI format (Falcon camera ?) |
| EM | em | Y | Y | Y | N | Y | As produced by the EM software package |
| ICOS | icos | Y | Y | Y | N | Y | Old icosahedral format |
| Imagic | img/hed | Y | Y | Y | Y | Y | This format stores header and image data in two separate files. Region I/O is only available for 2D. The Imagic format in EMAN2 has been fully compatible with the Imagic4D standard since the 2.0 release. |
| MRC | mrc | Y | Y | Y | N | Y | Largely compatible with CCP4. Note that some programs (like IMOD) will treat 3D MRC files as stacks of 2D images. This behavior is partially supported in EMAN, but be aware that it is impossible to store metadata about each image in the stack when doing this, so it is not suitable as an export format for single particle work. EMAN2 supports reading FEI MRC, which is an extended MRC format for tomography. The extra header information will be read into the header. All FEI MRC images will be 2-byte integer. |
| Spider | spi | Y | Y | Y | Y | Y | To read the overall image header in a stacked spider file, use image_index = -1. |
| SER | ser | Y | N | N | Y | N | Also known as TIA (Emospec) file format, used by FEI Tecnai and Titan microscopes for acquiring and displaying scanned images and spectra |
| Other Supported Formats | | | | | | | |
| Amira | am | Y | Y | Y | N | N | A native format for the Amira visualization package |
| DF3 | df3 | Y | Y | Y | N | N | File format for POV-Ray; supports 8-, 16-, and 32-bit integers per pixel |
| FITS | fts | Y | N | Y | N | N | Widely used file format in astronomy |
| JPEG | jpg/jpeg | N | Y | N | N | N | Note that JPEG images use lossy compression and are NOT suitable for quantitative analysis. PNG (lossless compression) is a better alternative unless file size is of critical importance. |
| LST | lst | Y | Y | Y | Y | N | ASCII file containing a list of image file names and numbers. Used in EMAN1 to avoid large files. Not commonly used in EMAN2 |
| LSTFAST | lsx/lst | Y | Y | Y | Y | N | Optimized version of LST |
| OMAP | omap | Y | N | Y | N | N | Also called DSN6 map, 1-byte integer per pixel |
| PGM | pgm | Y | Y | N | N | N | Standard graphics format with 8-bit greyscale images. No compression. |
| PIF | pif | Y | Y | Y | Y | N | Purdue Image Format. This will read most, but not all PIF images. Recent support added for mode 40 and 46 (boxed particles). Some of the FFT formats cannot be read by EMAN2. PIF writing is normally done in FLOAT mode, which is not used very often in PIF. PIF technically permits only images with odd dimensions; EMAN does not enforce this. |
| PNG | png | Y | Y | N | N | N | Excellent format for presentations. Lossless data compression, 8 or 16 bits per pixel |
| SAL | hdr/img | Y | N | N | N | N | Scans-A-Lot. Old proprietary scanner format. Separate header and data files |
| SITUS | situs | Y | Y | Y | N | N | Situs-specific ASCII format on a cubic lattice. Used by Situs programs |
| TIFF | tiff/tif | Y | Y | N | N | N | Good format for use with programs like Photoshop. Some variants are good for quantitative analysis, but JPEG compression should be avoided. |
| V4L | v4l | Y | N | N | N | N | Used by some video-capture boards in Linux. Acquires images from the V4L2 (video4linux) interface in real time. |
| VTK | vtk | Y | Y | Y | N | N | Native format from Visualization Toolkit |
| XPLOR | xplor | Y | Y | Y | N | N | 8 bytes integer, 12.5E float ASCII format |
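
When a script has to choose an output format, the table above can be consulted programmatically. The following is only a sketch in plain Python: the names FormatSupport, FORMATS and can_write_stack are hypothetical and not part of EMAN2, and the dictionary holds just a few rows transcribed by hand from the table.

```python
from typing import NamedTuple


class FormatSupport(NamedTuple):
    """One row of the capability table (hypothetical helper type, not an EMAN2 class)."""
    read: bool
    write: bool
    three_d: bool
    stacks: bool
    region_io: bool


# A few rows copied by hand from the table; extend as needed.
FORMATS = {
    "hdf": FormatSupport(True, True, True, True, True),
    "mrc": FormatSupport(True, True, True, False, True),
    "spi": FormatSupport(True, True, True, True, True),
    "dm3": FormatSupport(True, False, False, False, False),
    "jpg": FormatSupport(False, True, False, False, False),
    "png": FormatSupport(True, True, False, False, False),
}


def can_write_stack(filename: str) -> bool:
    """Return True if the extension maps to a format that is writable and supports image stacks."""
    ext = filename.rsplit(".", 1)[-1].lower()
    support = FORMATS.get(ext)
    # Unknown extensions are treated as unsupported rather than guessed at.
    return support is not None and support.write and support.stacks
```

With these rows, can_write_stack("particles.hdf") is true, while an MRC or DM3 file name is rejected because the table lists no stack support (MRC) or no write support (DM3).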

Saving EMData from Python

If you want to be prompted for an image name through a standard dialog, use this approach:

from EMAN2 import *   # provides test_image()
from emsave import save_data

# save a single image
a = test_image()
save_data(a)

# save a list of images as a stack
b = test_image(2)
save_data([a,b])

Region I/O

Region I/O means you can read or write only part of an image from or to a file. This is useful when you process a huge image file on a computer with limited resources. The region specification is the same as in the EMData::get_clip() function. For region reading, you can specify a region inside the image, partially outside the image bounds, or even completely outside the image bounds. For region writing, the region must be completely inside the image bounds. Be aware that reading a partially out-of-bounds region from an HDF5 file involves copying the in-bounds part of the image into a buffer of the region's size, so such reads carry a performance overhead.

# Read an 8x8x8 subregion with origin (1,1,1) from the 64x64x64 image file 3dimage.hdf
img = EMData()
region = Region(1,1,1,8,8,8)
img.read_image("3dimage.hdf",0,False,region)
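
To make the bounds rules concrete, here is a small pure-Python sketch of what a region read and a region-write check amount to, in 2D for brevity. It is not EMAN2 code: the function names are hypothetical, and padding out-of-bounds pixels with a fill value of 0 is an assumption made for illustration.

```python
def read_region_2d(image, x0, y0, nx, ny, fill=0.0):
    """Return an ny-by-nx clip of `image` (a list of rows) with origin (x0, y0).

    Pixels of the region that fall outside the image keep the `fill` value
    (assumed to be 0 here), so the region may lie partly or wholly out of bounds.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    # Allocate a buffer of the full region size, then copy in the in-bounds part.
    out = [[fill] * nx for _ in range(ny)]
    for oy in range(ny):
        sy = y0 + oy
        if not 0 <= sy < height:
            continue
        for ox in range(nx):
            sx = x0 + ox
            if 0 <= sx < width:
                out[oy][ox] = image[sy][sx]
    return out


def region_is_writable(shape, x0, y0, nx, ny):
    """A region can be written only if it lies completely inside the image bounds."""
    width, height = shape
    return x0 >= 0 and y0 >= 0 and x0 + nx <= width and y0 + ny <= height
```

The allocate-then-copy step in read_region_2d is the kind of extra work that the overhead note about partially out-of-bounds HDF5 reads refers to.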

Storage type

Inside EMAN2 all pixel data is 32-bit floating point. Many file formats also support other storage modes. The various modes are defined in a dictionary imported from EMAN2.py: file_mode_map. There is also a file_mode_range dictionary which contains the numeric limits for each type. If you set the header values render_min and render_max in each image before writing, they control how the float data is scaled to the specified mode. For example, if render_min is 0 and render_max is 1.0, then the 0-1 range in the internal image will be mapped to the full available range of (integer mode) output formats. Note also that not all file formats support all modes.
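
The render_min/render_max mapping described above can be pictured as a linear rescaling. The sketch below is plain Python rather than EMAN2's actual implementation: the function name is hypothetical, and clamping out-of-range values and rounding to the nearest integer are assumptions.

```python
def scale_to_integer_mode(values, render_min, render_max, mode_min, mode_max):
    """Linearly map floats in [render_min, render_max] onto the integers [mode_min, mode_max]."""
    if render_max <= render_min:
        raise ValueError("render_max must be greater than render_min")
    span = mode_max - mode_min
    scaled = []
    for v in values:
        frac = (v - render_min) / (render_max - render_min)
        # Assumption: values outside the render range are clamped to its ends.
        frac = min(1.0, max(0.0, frac))
        scaled.append(mode_min + int(round(frac * span)))
    return scaled


# Example: map the 0-1 range onto an unsigned 8-bit range (0..255).
pixels = scale_to_integer_mode([0.0, 0.2, 1.0, 2.0, -1.0], 0.0, 1.0, 0, 255)
```

In this sketch 0.0 maps to 0 and 1.0 maps to 255, and the values 2.0 and -1.0 end up at the limits of the output range because of the assumed clamping.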

Here are some examples of how to write in alternative formats:

img = EMData(128,128)
img.write_image('float-image.mrc')  # by default, the image is written as float
img.write_image('short-image.mrc', 0, EMUtil.ImageType.IMAGE_MRC, False, None, EMUtil.EMDataType.EM_SHORT)  # write MRC file as short (16 bit)
img.write_image('byte-image.mrc', 0, EMUtil.ImageType.IMAGE_MRC, False, None, EMUtil.EMDataType.EM_UCHAR)  # write MRC file as byte (8 bit)

In the last write_image() function call, the first argument, 'byte-image.mrc', is the name of the file you write to. The second argument, 0, is the image index in a stack; it is always 0 here since MRC does not support stacks. The third argument, EMUtil.ImageType.IMAGE_MRC, is the file type you are writing; typing 'help(EMUtil.ImageType)' in Python prints all types supported by EMAN2. The fourth argument, False, means NOT header only. The fifth argument, None, means we are not doing Region I/O. The sixth argument, EMUtil.EMDataType.EM_UCHAR, specifies the data storage type for this image file; 'help(EMUtil.EMDataType)' lists all data types. For MRC, three types are supported.

Writing to the header of an image

For some reason, you have to specify the image type explicitly, via a rather unwieldy flag, if you want to write to the header of an image without loading the image data. Say you load ONLY the header of an image (in Python) by doing:

a=EMData('myimage.hdf',0,True)   # 0: load the first image in the file/stack; True: load the header only

And then you define a new header parameter:

a['my_new_parameter'] = 'whatever_value'

To write out the new header into the image, you cannot simply say a.write_image('myimage.hdf',0,True), but actually have to do it this way:

a.write_image('myimage.hdf',0,EMUtil.ImageType.IMAGE_HDF,True)

The monstrous flag EMUtil.ImageType.IMAGE_HDF can either name the image format you are writing to or be left unspecified as EMUtil.ImageType.IMAGE_UNKNOWN, but it has to be passed either way: the header-only flag is the fourth positional argument, so the third one (the image type) must be supplied.

Note that this is NOT the case if you load the ENTIRE image (as opposed to just the header). If you load the ENTIRE image, you can set or reset a header value as follows:

img=EMData('my_file.hdf',0)
img['my_parameter']='whatever_value'
img.write_image('test.hdf',0)