Inventory of Resources for Datasharing in Neuroimaging and Electrophysiology Research
Connectome Mapping Toolkit http://www.connectome.ch
This is a Python-based open source toolkit for magnetic resonance connectome mapping, data management, sharing, visualization and analysis. The ConnectomeViewer connects multi-modal, multi-scale neuroimaging and network datasets for analysis and visualization in Python. The application defines its own data format, the Connectome File Format (cfflib; http://cmtk.org/cfflib), for multi-modal data integration (metadata, networks, surfaces, volumes, tracks, time series, other data).
As the size and complexity of data generated from functional genomics experiments grows, so does the requirement for standard data formats. FuGE (Functional Genomics Experiment) [Object Model / Markup-Language] (FuGE-OM, FuGE-ML) has been created to facilitate the development of data standards and address the many challenges in data archiving, sharing and querying of such experiments.
FUSE (Filesystem in Userspace) is a loadable kernel module for Unix-like operating systems that allows non-privileged users to create their own file systems without editing the kernel code. Features of the program are: simple library API; simple installation (no need to patch or recompile the kernel); secure implementation; very efficient userspace-to-kernel interface usable by non-privileged users; and the ability to run on Linux kernels 2.4.x and 2.6.x.
The Concierge Project, developed by the Laboratory for Neuroinformatics in RIKEN Brain Science Institute, aims to support research resource sharing in neuroscience communities. This database software manages all types of files based on their metadata by simple operations. A key feature of the software is its pluggable configuration. Several application plug-ins specific to neuroscience research have been developed, e.g. literature manager, electronic laboratory notebook, etc.
NIF Vocabularies / Neurolex http://neuinfo.org/#vocab
NIF (Neuroscience Information Framework) has developed a comprehensive vocabulary for annotating and searching neuroscience resources. The vocabularies are available for download as an OWL file and also through the NCBO BioPortal. A critical component of the NIF project is a consistent, flexible terminology that can be used to describe and retrieve neuroscience-relevant resources. With the advent of shared information systems the need for a common semantic framework for neuroscience has become critically important, all the more so if individual researchers and automated search agents are to access and utilize the most up-to-date information. To address this need, NIF has created NeuroLex, a comprehensive lexicon of common neuroscience terminology woven into an ontologically consistent, unified representation of the biomedical domains typically used to describe neuroscience data.
The NPRC (Neuroscience Peer Review Consortium) is an alliance of neuroscience journals that have agreed to accept manuscript reviews from other members of the Consortium. Its goals are to support efficient and thorough peer review of original research in neuroscience, speed the publication of research reports, and reduce the burden on peer reviewers.
CARMEN (Code Analysis, Modeling and Repository for e-Neuroscience) is a web-based virtual laboratory for analysis and sharing of time-series (neurophysiological) data. The platform contains both a repository of data and analyses as well as the infrastructure to support the deployment of analysis services that can be run remotely. The metadata structure is based on the MINI (Minimum Information about a Neuroscience Investigation) schema.
CRCNS (Collaborative Research in Computational Neuroscience) is a web resource that provides a marketplace and discussion forum for sharing experimental data and other resources, such as stimuli and analysis tools. Information about the aims and scope of this site is given in an article published in Neuroinformatics (pdf available here). CRCNS hosts experimental data sets of high quality that will be valuable for testing computational models of the brain and new analysis methods. The data include physiological recordings from sensory and memory systems, as well as eye movement data.
The German INCF Node has been established to facilitate interaction and collaboration between neuroscientists, with a focus on cellular and systems neurophysiology. Tools and infrastructure for data access, data storage and exchange, and data analysis are being developed. One component is a central data management platform (http://www.g-node.org/data) where neuroscientists can store and organize their data for analysis, sharing, and collaborative work.
A public repository for neurophysiological data from which users can download datasets as well as contribute their own.
Data Formats and Data Models
Proposed standard for the encoding and storing of biomedical time-varying signals such as the electrocardiogram (ECG), electromyogram (EMG), electroencephalogram (EEG), electrocorticogram (ECoG), electrooculogram (EOG), blood pressure and signals associated with respiration, speech, etc. The BiosignalML standard would include metadata annotation that provides information on the context for the recorded signal, using community accepted bio-ontologies where appropriate. The standard will also need to include dense data formats for efficient storage of large data sets.
Data Format Description Language
EDF and EDF+ http://www.edfplus.info
EDF (European Data Format) is a simple and flexible format for exchange and storage of multichannel biological and physical signals which has become the de-facto standard for EEG and PSG recordings in commercial equipment and multicenter research projects. EDF+ files additionally support interrupted recordings, annotations, stimuli and events. Thus, EDF+ can store any medical recording such as EMG, evoked potentials, ECG, as well as automatic and manual analysis results, such as sleep stages.
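The fixed portion of an EDF header is 256 bytes of plain ASCII fields with widths defined in the published specification, so it can be read without any special library. The sketch below parses those fields from a synthetic header built for demonstration (the patient and recording strings are invented):

```python
import struct

def parse_edf_header(data: bytes) -> dict:
    """Parse the fixed 256-byte EDF header; all fields are ASCII text."""
    fields = [
        ("version", 8), ("patient_id", 80), ("recording_id", 80),
        ("start_date", 8), ("start_time", 8), ("header_bytes", 8),
        ("reserved", 44), ("n_records", 8), ("record_duration_s", 8),
        ("n_signals", 4),
    ]
    header, offset = {}, 0
    for name, width in fields:
        header[name] = data[offset:offset + width].decode("ascii").strip()
        offset += width
    return header

# Build a synthetic header for demonstration (field widths per the EDF spec).
raw = b"".join(v.ljust(w).encode("ascii") for v, w in [
    ("0", 8), ("patient X", 80), ("recording 1", 80),
    ("02.05.11", 8), ("12.00.00", 8), ("768", 8),
    ("", 44), ("100", 8), ("1", 8), ("2", 4),
])
hdr = parse_edf_header(raw)
```

After the fixed header, EDF stores a further 256 bytes of signal-specific fields per channel (labels, physical units, scaling), which a full reader would parse the same way.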
MIEN is a package of interface and library code intended to make a number of scientific modeling, data markup, and data storage tasks easier. MIEN was developed for use in computational Neuroscience, but a good deal of effort has gone into making the core MIEN package a general purpose scientific modeling and data visualization tool with a flexible extension system. MIEN is implemented primarily in Python.
Python object model for neuroscience data. Documentation at http://packages.python.org/neo. This is work in progress with discussion at http://groups.google.com/group/neuralensemble.
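Neo organizes electrophysiology data into nested containers (a Block holds Segments, which hold signal objects). The stdlib sketch below mirrors that container hierarchy only; the real package adds physical units, annotations, and many more object types, so class fields here are simplified stand-ins:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AnalogSignal:
    """A regularly sampled signal (simplified; real Neo carries units)."""
    name: str
    sampling_rate_hz: float
    samples: List[float]

@dataclass
class Segment:
    """A container for data sharing a common time basis, e.g. one trial."""
    name: str
    analogsignals: List[AnalogSignal] = field(default_factory=list)

@dataclass
class Block:
    """Top-level container, e.g. one recording session."""
    name: str
    segments: List[Segment] = field(default_factory=list)

block = Block("session 1")
trial = Segment("trial 1")
trial.analogsignals.append(AnalogSignal("LFP ch0", 1000.0, [0.1, 0.2, 0.15]))
block.segments.append(trial)
```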
Neuroshare is an initiative to develop a standard for accessing neurophysiological data from any vendor's acquisition device or software. An API is defined, and vendors and communities are encouraged to provide implementations of a library of functions that can read data files collected with that vendor's instrument or software.
This framework provides a vendor-independent mechanism for translating between raw and processed neurophysiology data in the form of time and image series. Implemented in CARMEN but may also be useful for third-party applications.
SignalML is an XML-based language, designed for meta-description of formats, used for digital storage of biomedical time series. Using SignalML, information on the structure of binary data files can be simply and efficiently coded.
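The core idea of describing a binary layout in XML so that generic software can read it can be sketched with the standard library. The element and attribute names below are illustrative only, not actual SignalML vocabulary:

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical, SignalML-inspired description of one binary sample record.
FORMAT_XML = """
<format byteorder="little">
  <field name="timestamp" type="uint32"/>
  <field name="channel_1" type="int16"/>
  <field name="channel_2" type="int16"/>
</format>
"""

TYPE_CODES = {"uint32": "I", "int16": "h", "float32": "f"}

def build_struct(xml_text):
    """Turn the XML layout description into a struct parser + field names."""
    root = ET.fromstring(xml_text)
    order = "<" if root.get("byteorder") == "little" else ">"
    names = [f.get("name") for f in root.findall("field")]
    codes = "".join(TYPE_CODES[f.get("type")] for f in root.findall("field"))
    return struct.Struct(order + codes), names

fmt, names = build_struct(FORMAT_XML)
# Decode one binary record using only the XML description.
record = dict(zip(names, fmt.unpack(struct.pack("<Ihh", 42, -5, 17))))
```

The benefit of this approach is that supporting a new vendor format requires only a new XML description, not new parsing code.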
E-Prime is a commercial suite of applications providing an easy to use environment for computerized psychology experiment design, implementation, and analysis. The applications together provide a flexible means of covering all the steps from initial design through to data processing and management within a single environment.
A modular, web-framework-based data and metadata management system for developing customized neuroscience databases.
MINI (Minimum Information about a Neuroscience Investigation) http://www.carmen.org.uk/standards
This metadata schema identifies the minimum reporting information required to describe an electrophysiological study. Originally developed for submission to the CARMEN system it has now been adopted and extended for other databases.
odML (open metadata Markup Language) is an initiative to define and establish an open, flexible and easy-to-use format to transport metadata. The project was initiated to connect two software projects, RELACS for acquiring electrophysiological data and the LabLog for project documentation and (meta)data management, and is now being further developed at the German INCF Node (www.g-node.org).
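odML organizes metadata as nested sections of property–value pairs. The sketch below mirrors that nesting idea with the standard library only; the real odML format defines its own section/property XML schema and terminologies, so the serialization here is illustrative, not odML-conformant:

```python
import xml.etree.ElementTree as ET

def section(name, properties=None, subsections=()):
    """Build a nested metadata section as an XML element (illustrative)."""
    elem = ET.Element("section", name=name)
    for key, value in (properties or {}).items():
        prop = ET.SubElement(elem, "property", name=key)
        prop.text = str(value)
    for sub in subsections:
        elem.append(sub)
    return elem

root = section(
    "Recording",
    {"experimenter": "A. Scientist", "date": "2011-05-02"},
    subsections=[section("Amplifier", {"gain": 1000, "unit": "mV"})],
)
xml_text = ET.tostring(root, encoding="unicode")
```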
The Yogo Data Management Framework is a set of software tools created to rapidly build scientific data-management applications. These applications will enhance the process of data annotation, analysis, and web publication.
MIAME describes the Minimum Information About a Microarray Experiment that is needed to enable the interpretation of the results of the experiment unambiguously and potentially to reproduce the experiment. MIAME does not specify a particular format or any particular terminology, although it does make recommendations to make the data more easily accessed and understood.
ModelDB provides an accessible location for storing and efficiently retrieving computational neuroscience models. Models can be coded in any language for any environment. Model code can be viewed before downloading and browsers can be set to auto- launch the models.
Datasharing Initiatives and Meta-analysis Resources
BrainMap is an online database of published functional neuroimaging experiments with coordinate-based (x, y, z) activation locations in Talairach and/or MNI space; it also includes meta-analysis tools for coordinate results.
Brainscape is a database for resting state functional connectivity studies. Functional connectivity has shown tremendous promise in mapping the intrinsic functional topography of the brain, evaluating neuroanatomical models, and investigating neurological and psychiatric disease. Brainscape includes a repository of public and private data and an analysis engine for exploring the correlation structure of spontaneous fluctuations in the fMRI BOLD signal.
The fMRI Data Center is a public repository of peer-reviewed fMRI studies and their underlying data. The ultimate goal of the fMRIDC is to help speed the progress and the understanding of cognitive processes and the neural substrates that underlie them by providing: a publicly accessible repository of peer-reviewed fMRI studies; all metadata necessary to interpret, analyze, and replicate these fMRI studies; and training for both the academic and professional communities. fMRIDC hosts a web-accessible database that has data mining capabilities and the means to deliver requested data to the user.
fMRI Methods Wiki http://www.fmrimethods.org
This site is meant for ongoing discussion and editing of a set of minimally acceptable standards for the reporting of research studies using fMRI.
NeuroGenerator is a database project originally funded by the European Commission. The overall purpose is to make databases with data in comparable and compatible formats suited for meta-research and for making models of the cerebral cortex of the human brain. Researchers can submit their own raw PET-data and fMRI data to NeuroGenerator. See Appendix B for metadata structure of NeuroGenerator.
NeuroSynth is a platform for large-scale, automated synthesis of functional magnetic resonance imaging (fMRI) data extracted from published articles. Currently in preview mode.
The Neuroimaging Data Access Group is an international working group dedicated to enhancing access to neuroimaging data in order to advance progress in neuroscience.... NIDAG seeks to establish a universal coordinate database, including both past papers and future studies.
OpenfMRI http://www.openfmri.org
OpenfMRI.org (contact Russ Poldrack) is a project dedicated to the free and open sharing of functional magnetic resonance imaging (fMRI) datasets, including raw data.
Pediatric MRI Data Repository http://www.bic.mni.mcgill.ca/nihpd/info/index.html
This site provides information about the NIH MRI Study of Normal Brain Development (Pediatric MRI Study) and resulting Pediatric MRI Data Repository. This website serves as the portal through which data can be obtained by qualified researchers. This multi-site longitudinal study used technologies (anatomical MRI, diffusion tensor imaging (DTI), and MR spectroscopy (MRS)) to map pediatric brain development.
SumsDB is a repository of brain-mapping data (surfaces & volumes; structural & functional data) from many laboratories.
XNAT Central http://central.xnat.org
1000 Functional Connectomes Project http://www.nitrc.org/projects/fcon_1000
The ‘1000 Functional Connectomes’ Project provides unrestricted access to 1200+ ‘resting state’ functional MRI (R-fMRI) datasets independently collected at 33 sites. Age, sex and imaging center information are provided for each of the datasets that are otherwise anonymized, with no protected health information included. This data-sharing effort provides researchers with a means of exploring and refining R-fMRI approaches.
Data Formats, Databases and Workflows
Human Imaging Database (HID) http://www.nitrc.org/projects/hi
HID is an extensible database management system developed to handle the increasingly large and diverse datasets collected as part of the mBIRN and fBIRN collaboratories.
LONI Pipeline http://pipeline.loni.ucla.edu
The LONI Pipeline is a free workflow application primarily aimed at neuroimaging researchers. With the LONI Pipeline, users can quickly create workflows that combine a wide range of existing neuroimaging tools.
NIfTI (Neuroimaging Informatics Technology Initiative) aims to provide coordinated and targeted service, training, and research to speed the development and enhance the utility of informatics tools related to neuroimaging. The NIfTI-1 data format has been developed to facilitate inter-operation of functional MRI data analysis software packages.
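A NIfTI-1 file can be recognized from its header alone: the specification fixes the header at 348 bytes, with the `sizeof_hdr` field (value 348) at offset 0 and a 4-byte magic string at offset 344 distinguishing single-file from header/image-pair storage. A minimal sniffer, demonstrated on a synthetic header:

```python
import struct

NIFTI1_HEADER_SIZE = 348  # fixed size of the NIfTI-1 header, per the spec

def sniff_nifti1(header):
    """Identify a NIfTI-1 header from its sizeof_hdr field and magic string."""
    if len(header) < NIFTI1_HEADER_SIZE:
        return None
    # sizeof_hdr also reveals byte order: it reads as 348 in exactly one of
    # little- or big-endian interpretation.
    for order in ("<", ">"):
        if struct.unpack(order + "i", header[:4])[0] == NIFTI1_HEADER_SIZE:
            break
    else:
        return None
    magic = header[344:348]
    return {b"n+1\x00": "single-file (.nii)",
            b"ni1\x00": "header/image pair (.hdr/.img)"}.get(magic)

# Synthetic minimal header: correct size field, zero padding, single-file magic.
fake = struct.pack("<i", 348) + b"\x00" * 340 + b"n+1\x00"
```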
Nipype (Neuroimaging in Python Pipelines and Interfaces) is an open-source, community-developed initiative under the umbrella of NiPy. It provides a uniform interface to existing neuroimaging software and facilitates interaction among packages within a single workflow. What distinguishes Nipype from other workflow packages is that it offers interfaces to many existing mainstream packages (e.g., SPM, FSL, FreeSurfer, AFNI, Slicer, etc.) and an environment that encourages interactive exploration of algorithms from those packages, eases the design of workflows within and between packages, and reduces the learning curve necessary to use different packages.
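The workflow idea underlying engines like Nipype is that nodes wrap processing steps and connections route one node's named output into another node's named input. The toy sketch below illustrates that connect pattern only; it is not the actual Nipype API, and the node functions are invented placeholders:

```python
class Node:
    """Wraps a function whose keyword inputs are filled in by the workflow."""
    def __init__(self, name, func):
        self.name, self.func, self.inputs = name, func, {}
        self.result = None

    def run(self):
        self.result = self.func(**self.inputs)
        return self.result

class Workflow:
    def __init__(self):
        self.nodes, self.edges = [], []

    def connect(self, src, out_name, dst, in_name):
        """Route src's named output into dst's named input."""
        for node in (src, dst):
            if node not in self.nodes:
                self.nodes.append(node)
        self.edges.append((src, out_name, dst, in_name))

    def run(self):
        # Nodes run in connection order here; a real engine would
        # topologically sort the dependency graph instead.
        for node in self.nodes:
            node.run()
            for src, out_name, dst, in_name in self.edges:
                if src is node:
                    dst.inputs[in_name] = node.result[out_name]

smooth = Node("smooth", lambda image: {"smoothed": image + "|smoothed"})
thresh = Node("threshold", lambda image: {"mask": image + "|mask"})
wf = Workflow()
wf.connect(smooth, "smoothed", thresh, "image")
smooth.inputs["image"] = "bold.nii"
wf.run()
```

Because steps are wired by named ports rather than called directly, a node wrapping SPM can be swapped for one wrapping FSL without touching the rest of the pipeline, which is the interoperability Nipype provides for real packages.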
The XCEDE (XML-Based Clinical Experiment Data Exchange) schema provides an extensive metadata hierarchy for describing and documenting research and clinical studies. The schema organizes information into five general hierarchical levels: the complete project; studies within a project; subjects involved in the studies; visits for each of the subjects; and a full description of the subject's participation during each visit. Each sub-schema is composed of information relevant to that aspect of an experiment and can be stored in separate XML files or spliced into one large file allowing for the XML data to be stored in a hierarchical directory structure along with the primary data. Each sub-schema also allows for the storage of data provenance information allowing for a traceable record of processing and/or changes to the underlying data. Additionally, the sub-schemas contain support for derived statistical data in the form of human imaging activation maps and simple statistical value lists. XCEDE was originally designed in the context of neuroimaging studies and complements the Biomedical Informatics Research Network (BIRN) Human Imaging Database (HID), an extensible database and intuitive web-based user interface for the management, discovery, retrieval, and analysis of clinical and brain imaging data.
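The five-level hierarchy can be sketched as nested XML built with the standard library. The element names below are simplified stand-ins; the real XCEDE schema defines its own elements, attributes and namespaces:

```python
import xml.etree.ElementTree as ET

# Illustrative nesting of XCEDE's hierarchical levels: project -> study
# -> subject -> visit -> per-visit participation record.
levels = ["project", "study", "subject", "visit"]
parent = root = ET.Element(levels[0], ID="proj01")
for level, uid in zip(levels[1:], ["study01", "subj01", "visit01"]):
    parent = ET.SubElement(parent, level, ID=uid)
# Describe the subject's participation during the visit.
ET.SubElement(parent, "acquisition", protocol="T1-weighted MRI")
xml_text = ET.tostring(root, encoding="unicode")
```

As the description notes, each level could equally live in its own XML file spliced alongside the primary data in a matching directory hierarchy.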
XNAT (Extensible Neuroimaging Archive Toolkit) is an open source software platform designed to facilitate management and exploration of neuroimaging and related data. XNAT includes a secure database back-end and a rich web-based user interface.
Attribution rules - data article etc
Articles about Data Sharing
- Summary of a (still relevant!) 2007 workshop