EMEN2 is an object-oriented database and electronic lab notebook. It is designed to store scientific data in a freeform way without limiting the ability to search or mine the results. In a traditional database, the contents of each record type (table) must be defined by a database administrator and strictly adhered to. In EMEN2, by contrast, each individual record can carry arbitrary additional parameters outside the record definition, and all such parameters remain fully searchable.
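The free-form record model described above can be illustrated with a small, self-contained sketch. This is not EMEN2's actual code or API: the `Record` class, the `RECORD_DEFS` table, the `grid_imaging` record type, and all parameter names are hypothetical and chosen only for illustration. It shows a record carrying a parameter that its definition does not declare, and a search that treats declared and undeclared parameters alike.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class Record:
    """A record with a declared type plus free-form parameters (hypothetical model)."""
    rectype: str
    params: Dict[str, Any] = field(default_factory=dict)


# Hypothetical record definition: the parameters a "grid_imaging" record normally has.
RECORD_DEFS: Dict[str, Set[str]] = {"grid_imaging": {"magnification", "voltage_kv"}}


def extra_params(rec: Record) -> Set[str]:
    """Return the parameter names that fall outside the record definition."""
    return set(rec.params) - RECORD_DEFS.get(rec.rectype, set())


def search(records: List[Record], name: str, value: Any) -> List[Record]:
    """Match on any parameter, whether or not the record definition declares it."""
    return [r for r in records if name in r.params and r.params[name] == value]


records = [
    Record("grid_imaging", {"magnification": 50000, "voltage_kv": 300}),
    # "ice_quality" is not part of the definition, but it is stored and searchable.
    Record("grid_imaging", {"magnification": 50000, "ice_quality": "thin"}),
]

undeclared = extra_params(records[1])
matches = search(records, "ice_quality", "thin")
```

Here `undeclared` holds only `ice_quality`, and `matches` contains the second record even though no definition mentions that parameter.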
Records in the database may be arbitrarily linked to each other, much like pages on the web. Any record may link to any number of other records of any type (the record's children), and any number of other records may link to it (the record's parents). This permits, for example, a publication to be linked into a publications folder and also to a specific project, or a microscopy session to be a child of both the biological research project and the microscope the data was collected on.
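The parent/child linking described above amounts to a directed graph in which a record may have several parents. The following is a minimal sketch under that assumption, not EMEN2's implementation: the `RecordGraph` class and the record names (`project-x`, `microscope-a`, `session-1`, `micrograph-1`) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


class RecordGraph:
    """Directed links between records; each record may have many parents and children."""

    def __init__(self) -> None:
        self.children: Dict[str, Set[str]] = defaultdict(set)
        self.parents: Dict[str, Set[str]] = defaultdict(set)

    def link(self, parent: str, child: str) -> None:
        """Record the link in both directions so either side can be looked up."""
        self.children[parent].add(child)
        self.parents[child].add(parent)

    def descendants(self, name: str) -> Set[str]:
        """Collect every record reachable by following child links from `name`."""
        seen: Set[str] = set()
        stack = [name]
        while stack:
            for child in self.children.get(stack.pop(), ()):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen


graph = RecordGraph()
# A microscopy session is a child of both the project and the microscope.
graph.link("project-x", "session-1")
graph.link("microscope-a", "session-1")
graph.link("session-1", "micrograph-1")

session_parents = graph.parents["session-1"]
project_data = graph.descendants("project-x")
```

After these calls, `session_parents` contains both the project and the microscope, and `project_data` contains the session together with the micrograph beneath it.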


EMEN2


An extensible, object-oriented electronic lab notebook

Structural and computational biologists frequently work with complex data sets assembled from diverse experimental sources, public resources, and analysis methods. Archiving and mining these data sets with their complicated interrelationships remains a persistent challenge, particularly with “open science” initiatives to make entire workflows, including all raw and intermediate data, available with publications.

To address these needs, we have developed EMEN2, an object-oriented scientific database and electronic notebook. EMEN2 uses a flexible schema based on plain text descriptions of experimental protocols. These protocols may be local and describe techniques and data within a single lab group, reference published ontologies (e.g. GO, NCBO BioPortal), or contain links to external resources (PDB, GenBank, etc.). Similarly, an EMEN2 installation can itself act as a resource, providing public access to selected protocols and data. While originally developed to serve the needs of the cryo-EM community, we believe EMEN2’s architecture provides an excellent foundation for many other scientific endeavors.

EMEN2 is built entirely on open-source technologies. The core database is written in the Python programming language, with BerkeleyDB providing a robust embedded database back-end. The infrastructure is highly modular, permitting new ontologies to be fully implemented using only its “Web 2.0” interface. In addition, a remote API is available for client applications. The included EMDash program is a standalone GUI tool for equipment integration; it is currently used to upload data transparently from our electron microscopes as it is being collected, and to integrate with other lab equipment. The EMEN2 server itself can be extended in a similar way by writing custom Python modules, which can expose additional views to the Web interface or new methods to the API.

A full ontology for cryo-EM has been established for internal use and has been in active use at the NCMI for ~3 years. It is used to archive all data at the center, and currently serves over 750 users, with over 16 terabytes of data in 460,000 records. As an example of its extensibility and ontology-mapping capabilities, we have developed a module that harvests the database and produces PDB-compliant XML files, which can be used to seed a structure deposition to EMDatabank.org.
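The idea of mapping local parameters onto an external vocabulary and emitting XML can be sketched as follows. This is only an illustration of the mapping step, not the harvesting module itself: the `FIELD_MAP` table, the local parameter names, and the XML element names (`entry`, `sampleName`, `acceleratingVoltage`) are all hypothetical and do not follow the real PDB or EMDataBank schema.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict

# Hypothetical mapping from local parameter names to external element names.
FIELD_MAP: Dict[str, str] = {
    "specimen_name": "sampleName",
    "voltage_kv": "acceleratingVoltage",
}


def record_to_xml(params: Dict[str, Any]) -> str:
    """Serialize the mapped parameters of one record; unmapped ones are skipped."""
    root = ET.Element("entry")
    for local_name, external_name in FIELD_MAP.items():
        if local_name in params:
            ET.SubElement(root, external_name).text = str(params[local_name])
    return ET.tostring(root, encoding="unicode")


xml_text = record_to_xml(
    {"specimen_name": "GroEL", "voltage_kv": 300, "ice_quality": "thin"}
)
```

Only parameters listed in the mapping appear in `xml_text`; the undeclared `ice_quality` value stays in the local record and is not exported.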

EMEN2 Demo

There is a publicly accessible, read-only EMEN2 installation for accessing the NCMI's public datasets:

http://ncmi.bcm.edu/publicdata/db/home/

An overview document has been created to introduce new users to the EMEN2 web interface. It includes a number of screenshots.

Installation and Configuration

* Download

* Dependencies

* Install

* Configuration

* Maintenance and Backups

EMEN2 Client Documentation

* Command line upload/download client: emen2client.py

* EMDash microscope client

* EMAN2 Integration

EMEN2 API

Coming soon

EMEN2 (last edited 2013-04-22 20:02:57 by IanRees)