Sunday, September 16, 2012

The Great Electronic Lab Notebook Challenge, pt. II

DEVONthink Pro Office and MacJournal: A review and comparison

This is the second in a series of posts on searching for an ELN suitable for use in my research group. This question is related to a series of hardware, workflow and "data management" questions, which I will address elsewhere. In The Great ELN Challenge, pt. I, I laid out what I'm looking for in an ELN, and how it fits my ideal group workflow. In this post I discuss my experiences with MacJournal and DEVONthink Pro Office.

Both of the software tools discussed here are for Mac OS X.

MacJournal by Mariner Software
The MacJournal interface

The strength of MacJournal is also its weakness: it is journaling software. This is well highlighted in Macworld's review. Using an interface similar to pre-Mountain Lion Mac Mail, MacJournal allows you to write text notes into which you can include images and PDFs. Notes, or entries, are stored in a folder-based hierarchy. Sets of notes ("notebooks") are searchable, and there is a range of display options. These display options all focus on a chronological presentation of entries. Therefore, as a journal or chronological personal notebook, MacJournal is great. You can even attach scanned input to text entries, e.g. as PDFs or images. Even so, it is not generally possible to attach, include or work with other kinds of files (e.g., MS Excel files, binary data files, large ASCII data files).

There does not appear to be any way to lock/encrypt entries to prevent subsequent modification.  This is despite the fact that entries are "lock-able".  The issue is that this functionality is designed only to prevent unintentional modification, and can be turned on and off by the user at will.

There is now Dropbox support, but there is no server capability, and no multi-user database capability. For personal use as a journal or notebook, MacJournal is great. As an ELN it lacks critical functionality, chief among which is the ability to collect non-note files.

DEVONthink Pro Office by DEVON Technologies
The DEVONthink Pro Office interface

DTPO is basically an open database, wrapped in a relatively sophisticated, user-friendly and extensible interface. (Macworld review here.) Files are imported into DTPO (the external file is physically copied into the DTPO database), indexed by it (DTPO links to the original file, and the text extracted from that file is added to the DTPO database), or created within it (a new file is created using either the built-in DTPO previewer or the native application associated with the file type).

In the interface, you organize your files into a folder structure.  This appears to be similar, if not identical, to simply saving the files to disk. In DTPO, though, the files themselves are stored in a flat database. The database is open and unencrypted, so you can always access your files directly (there is even a button on the DTPO toolbar that opens a Finder window at the location of the file selected in DTPO).

In contrast to your disk-based file structure, the DTPO interface gives you access to a range of capabilities and metadata associated with your files via the underlying database.  For example, like in Gmail, the folders you create in the interface are really just tags, and you can manually tag individual files however you like.  Smart folders can slice and dice tagged files just like in Gmail.  Files can appear in multiple locations in your folder tree (i.e. be tagged by multiple folders), but even more powerfully, can either appear as "duplicates" (a separate copy of a file) or "replicates" (basically a soft link to a file).
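The difference between a "duplicate" and a "replicate" can be sketched in a few lines of Python. This is only a conceptual model, not DTPO's API or internals: the `Doc` class, the `groups` dictionary and the file name are all hypothetical stand-ins for documents and folders.

```python
import copy


class Doc:
    """Hypothetical stand-in for a document stored in the database."""

    def __init__(self, name, text):
        self.name = name
        self.text = text


# Folders in the interface, modeled as plain lists of documents.
groups = {"Methods": [], "Results": []}

original = Doc("protocol.rtf", "v1")
groups["Methods"].append(original)

# "Replicate": the very same document is listed in a second folder.
groups["Results"].append(original)

# "Duplicate": an independent copy is listed instead.
groups["Results"].append(copy.copy(original))

# Editing the original shows the difference.
original.text = "v2"
replicate_text = groups["Results"][0].text
duplicate_text = groups["Results"][1].text
print("replicate:", replicate_text, "| duplicate:", duplicate_text)
```

In this model the replicate follows every edit to the original, while the duplicate stays frozen at the moment it was copied.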

Every file included in the database automatically has any contained text extracted and processed via DTPO's "AI", which seeks to improve searching by suggesting potentially related content, and allowing searches to be executed over all text in the database.  This concept is extended by the inclusion of a powerful OCR engine integrated with DTPO itself.

"Indexing" (versus "importing") files allows large files (e.g., files containing research data) to be linked to within the database without having to duplicate the files themselves. Indexed files can live on remote or archive disks that are not always mounted on the system running DTPO. In addition, indexing folders allows "file groups" to be included in DTPO. This is particularly useful when working with LaTeX documents (where the .tex, .aux, .bib, .dvi, etc., files must appear "together" to the LaTeX engine, not spread across DTPO's internal database), or coding projects (where multiple source files and makefiles must remain associated on the physical file system). Note also that "indexing" keeps the metadata (and text) of "archived" files available for live searching.
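To make the import/index distinction concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not how DTPO is implemented: the functions `import_file`, `index_file` and `search`, the dictionary used as a text index, and the file names are all hypothetical, and deleting the file merely simulates an archive disk being unmounted.

```python
import shutil
import tempfile
from pathlib import Path


def import_file(db_dir, index, src):
    """Copy the file into the database folder and record its text."""
    dest = Path(db_dir) / Path(src).name
    shutil.copy2(src, dest)
    index[str(dest)] = Path(src).read_text()
    return dest


def index_file(index, src):
    """Record only the original path and its text; the file stays put."""
    index[str(src)] = Path(src).read_text()
    return Path(src)


def search(index, word):
    """Return the paths of all recorded files whose text contains `word`."""
    return sorted(path for path, text in index.items() if word in text)


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    db_dir = tmp / "database"
    external = tmp / "external"
    db_dir.mkdir()
    external.mkdir()

    note = external / "note.txt"
    note.write_text("calibration notes")
    data = external / "run42.dat"
    data.write_text("calibration sweep, large data set")

    index = {}
    import_file(db_dir, index, note)   # copied into the database
    index_file(index, data)            # linked only

    files_in_db = sorted(p.name for p in db_dir.iterdir())

    # Simulate the archive disk going away: the text is still searchable.
    data.unlink()
    hits = search(index, "sweep")

print("files in database folder:", files_in_db)
print("search hits after 'unmount':", [Path(h).name for h in hits])
```

Only the imported note occupies space in the database folder, yet the indexed data file still turns up in a search after its disk is gone.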

DTPO does not lend itself directly to journaling, although RTF files can, of course, be created at will within DTPO itself.  Basically, while MacJournal provides more chronological organization than may be strictly necessary in an ELN, by default, DTPO provides NO chronological organization beyond date/time stamps on physical files.  While this is a weakness, it serves to highlight an additional (and major) strength of DTPO:  scriptability.

DTPO can be scripted at a number of levels. You can create Automator scripts that push actions in DTPO, and you can create "smart document" creation scripts that push database actions at the time of file creation. This last, combined with tagging and searching capabilities, allows the creation of journal capabilities in DTPO. For example, the date and time can be auto-generated and inserted into a newly created RTF file. This file can be auto-tagged at creation with, e.g., a "Journal" tag. A smart folder in the interface that filters for all "Journal"-tagged files will then show all journal entries throughout the database. This would allow, for example, journal entries to live topically arranged in the file structure in the interface, but also appear chronologically by creation date/time in the Journal smart folder.
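The journal idea can be modeled in a short Python sketch. This is not a DTPO script (those would be written with Automator or DTPO's own scripting support); it only illustrates the logic, and the `Entry` class, the `new_journal_entry` and `smart_folder` functions, and the sample folder names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Entry:
    """Hypothetical stand-in for a document in the database."""
    title: str
    group: str              # topical folder the entry lives in
    created: datetime
    tags: set = field(default_factory=set)


def new_journal_entry(group, created=None):
    """Create an entry titled with its timestamp and auto-tag it 'Journal'."""
    created = created or datetime.now()
    title = created.strftime("%Y-%m-%d %H:%M")
    return Entry(title=title, group=group, created=created, tags={"Journal"})


def smart_folder(entries, tag):
    """Return all entries carrying `tag`, oldest first, whatever their folder."""
    return sorted((e for e in entries if tag in e.tags), key=lambda e: e.created)


entries = [
    new_journal_entry("Project B", datetime(2012, 9, 16, 14, 30)),
    new_journal_entry("Project A", datetime(2012, 9, 15, 9, 0)),
    Entry("raw-data.xlsx", "Project A", datetime(2012, 9, 14, 8, 0)),
]

journal = smart_folder(entries, "Journal")
for e in journal:
    print(e.title, "in", e.group)
```

The entries stay filed under their topical folders, while the tag-based filter gives the chronological view across all of them.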

The "Pro Office" version--that is, the DT"PO" being discussed here--allows the creation of multiple databases whose search and AI are separate.  It also comes with a built-in web server allowing the database to be accessed via the web remotely.  While this gives definite multi-portal access to the database, it should also, in theory, give rudimentary multi-user capabilities, though I have yet to explore this.  A smattering of scripts are included in the DTPO install version, and an additional smattering are available for download at the DTPO support site.  In a later post, I will share scripts I have created.

In addition to scripting capabilities, there are a number of plug-ins and extensions for DTPO. Chief among these are tools to integrate DTPO with all major email clients and web browsers, which allows emails and webpages to be added directly to DTPO.

On the negative side, there does not appear to be any encryption or locking capability: all files remain modifiable.

Summary

Neither MacJournal nor DTPO is ideal for an ELN, primarily because they lack the ability to lock files and input. DTPO, though, comes quite close. As a database-centered tool for collecting and collating data in all forms, DTPO significantly outshines MacJournal as a research ELN.

Stay tuned for later posts on my workflow with DTPO, as well as some scripts and smart document templates I use in my research.
