PyXNAT Workshop

12-14 July 2011

Washington University in St. Louis

Attendees: Alexandra Anghelescu, Deech, David Gutman, Adam Harding, David Just, Dan Marcus, Tim Olsen, Yannick Schwartz, Mark Scully

Introduction

XNAT (the Extensible Neuroimaging Archive Toolkit) is an extensible database for neuroimaging and related data. It is growing in popularity: many databases are already based on this technology or plan to adopt it. It features a set of functionalities, including a web interface, that make it easier to share data within a research institution or across multiple research centers. As those databases grow larger, it becomes crucial to be able to script them in order to automate tasks that would otherwise be very difficult to achieve using a graphical user interface. These tasks include, but are not limited to, loading large datasets, single- and multi-database querying, and large-scale data analysis. The ability to plug programs directly into an XNAT database makes it possible to populate the database automatically with results that can in turn be shared between users. Scripting interfaces allow easier multi-database querying and may marshal a community effort to standardize data schemas.

PyXNAT is a Python library that communicates with XNAT. It connects XNAT to neuroimaging tools in Python, such as NiPyPE, a workflow engine that provides interfaces to major neuroimaging software packages such as SPM, FSL and FreeSurfer.

Description

This meeting will bring different PyXNAT users together to discuss the inner workings of the package as well as its future. The main idea is to help people contribute to the package so that it can be developed and maintained by the community. This is a crucial step to ensure the continued existence and improvement of PyXNAT.

The meeting will comprise a tutorial part, describing the different components of PyXNAT and the decisions that led to its present design. Current issues will be identified and discussed; these can be either missing features or design weaknesses.

The meeting will then have a coding phase, aimed at solving the problems outlined in the first part. Missing features can be added by one group while design flaws are discussed by another, and the basis for a new PyXNAT architecture can be started if needed.

Ideas

Presentations

  • Yannick
    • Introduction
    • PyXNAT design
    • Github workflow and coding guidelines
  • Anyone else?

Documentation

There are several sources of existing documentation that should be merged and improved:

Feature requests

pyxnat

  • Currently in pyxnat file.get_copy(newPath) downloads to the cache (if not already there) and then copies the file to the new location. It would be nice if something like file.get_symlink(newPath) could be created that makes use of os.symlink so that the file doesn't have to be physically copied. (pyxnat)
  • It would be nice if pyxnat used a logging framework (such as the 'logging' module) so that we could set the verbosity level and review the URIs that are being used as well as general debug information. (pyxnat)
  • Query call reduction. pyxnat seems to query for things and then query again for the same thing in a lot of places. Most of the processing time is spent in postgres because of this. (pyxnat)
  • Support for constraints at each type level. E.g. xnat.projects('bob').experiments([constraints here]) instead of having to do xnat.search([constraints]). Perhaps this implies additional scan-level logic (?). (pyxnat) (cf. schema with REST)
  • Password failure notification
  • Turn off warnings option
  • Make PyXNAT available from stock Debian/Ubuntu.
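
As a rough illustration of the symlink request above, here is a minimal sketch built on os.symlink. The name get_symlink and its arguments are hypothetical and not part of pyxnat; cached_path stands for a file that has already been downloaded to the cache.

```python
import os
import shutil


def get_symlink(cached_path, new_path):
    """Expose a cached file at new_path without copying it.

    Hypothetical helper, not a pyxnat function: cached_path is assumed
    to point at a file that is already in the local cache.
    """
    cached_path = os.path.abspath(cached_path)
    # Refuse to overwrite anything that already exists at the target.
    if os.path.lexists(new_path):
        raise FileExistsError(new_path)
    try:
        os.symlink(cached_path, new_path)
    except (OSError, NotImplementedError):
        # Symlinks may be unavailable (e.g. some Windows setups):
        # fall back to a physical copy.
        shutil.copyfile(cached_path, new_path)
    return new_path
```

One caveat of this design is that the link breaks if the cache is cleared, which a real implementation would have to handle.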

XNAT

I would like to see fault tolerances added into XNAT for this.

  • Setting multiple attributes at once stops when it encounters an attribute that doesn’t exist or has incorrect formatting. (XNAT)
  • XNAT falsely says an attribute was set when it isn’t. Requests always seem to come back as success even when the setting of the attribute failed. (XNAT)
  • Setting up fields with the REST API with this kind of syntax: 'xnat:subjectData/fields/field[name=pt_age]/field' (XNAT)
  • Scan-level search support. As of XNAT 1.5, scan-based searching must be simulated externally, and the performance is poor. pyxnat doesn't directly support this as of version 0.8, so the user must perform the simulation logic explicitly. (XNAT, pyxnat)
  • talk on the next version of the XNAT search engine; cf. scans
  • support for additional REST resources (cf REST API section)
  • REST access to the schema? This might imply the XNAT schema becoming a published part of the XNAT interface: If pyxnat is considered as another interface to XNAT's data like XNAT's REST interface, this may be reasonable but would require tighter coordination between XNAT and pyxnat development to maintain compatibility. If pyxnat is considered a user of XNAT's data via XNAT's published interfaces (REST), XNAT need not consider the schema part of its interface (and thereby reserve the flexibility to change it as needed).
  • REST method to get XNAT version

REST API

Copied from an old XNAT group post of mine. Some of these may be supported now.

Last Modification Date
  • This works for subjects:
    • /REST/subjects?columns=last_modified
    • /REST/projects/{ID}/subjects?columns=last_modified
  • Not for experiments:
    • /REST/experiments?columns=last_modified
    • /REST/subjects/{ID}/experiments?columns=last_modified
    • /REST/projects/{ID}/subjects/{ID}/experiments?columns=last_modified
Advanced filtering
  • This works:
    • /REST/subjects?ID={ID}
    • /REST/experiments?ID={ID}
  • This does not work:
    • /REST/projects/{ID}/subjects?ID={ID}
    • /REST/projects/{ID}/experiments?ID={ID}
    • /REST/projects/{ID}/subjects/{ID}/experiments?ID={ID}
URIs
  • /REST/subjects/{ID}/experiments -> does not work
  • /REST/subjects/{ID}/experiments/{ID}/scans -> does not work
  • /REST/subjects/{ID}/experiments/{ID}/reconstructions -> does not work
  • /REST/subjects/{ID}/experiments/{ID}/assessors -> does not work
  • /REST/experiments/{ID}/scans -> works
  • /REST/experiments/{ID}/reconstructions -> works
  • /REST/experiments/{ID}/assessors -> does not work
  • /REST/projects/{ID}/experiments/{ID}/scans -> does not work
  • /REST/projects/{ID}/experiments/{ID}/assessors -> does not work
  • /REST/projects/{ID}/experiments/{ID}/reconstructions -> does not work
PUT experiment
  • PUT /REST/experiments/{ID}?xsiType=... -> does not work
  • PUT /REST/projects/{ID}/experiments/{ID}?xsiType=... -> does not work
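
The URIs listed above all follow the same pattern: a chain of resource levels and IDs, plus an optional query string. A small sketch of a builder for such URIs is given below; rest_uri is a hypothetical helper (not part of pyxnat or XNAT), and 'demo' and 'E1' are made-up IDs. It only builds strings and says nothing about whether the server supports a given URI.

```python
from urllib.parse import quote, urlencode


def rest_uri(*segments, **query):
    """Build a /REST/... URI from path segments and query parameters.

    Hypothetical helper for illustration only.
    """
    # Percent-encode each segment so IDs with special characters stay valid.
    path = "/REST/" + "/".join(quote(str(s), safe="") for s in segments)
    if query:
        # Sort parameters so the output is deterministic.
        path += "?" + urlencode(sorted(query.items()))
    return path


# 'demo' and 'E1' are made-up IDs.
print(rest_uri("projects", "demo", "subjects", columns="last_modified"))
print(rest_uri("experiments", "E1", "scans"))
```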

Other ideas

Verify with either of these:

openssl dgst -md5 NeuroDebian_6.0.2+XNAT1.5.0-1_amd64.ova
openssl dgst -sha1 NeuroDebian_6.0.2+XNAT1.5.0-1_amd64.ova

The correct hashes are:

MD5: 5e66b95bb0d062c9377fefe8b29dd4e4
SHA1: de449b16795688b4cc0b04c565a9348e84efd6f1
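
The same check can be done from Python with the standard hashlib module. This is a minimal sketch; file_digest is a name chosen here for illustration, and the file is read in chunks because the image is large.

```python
import hashlib


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Return the hex digest of a file, reading it chunk by chunk."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


# Expected values as listed above.
EXPECTED = {
    "md5": "5e66b95bb0d062c9377fefe8b29dd4e4",
    "sha1": "de449b16795688b4cc0b04c565a9348e84efd6f1",
}


def verify(path):
    """Return True if every expected hash matches the file at path."""
    return all(file_digest(path, name) == value
               for name, value in EXPECTED.items())
```

Calling verify('NeuroDebian_6.0.2+XNAT1.5.0-1_amd64.ova') in the download directory should return True for an intact download.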

Many thanks to Satra Ghosh, Christian Haselgrove and Yaroslav Halchenko for setting up and tuning this virtual machine.

  • pyxnat tools
    • spreadsheet uploader
    • file structure uploader


Report

The meeting covered two aspects: the discussion and development of new features for pyxnat, and discussions on future developments for XNAT based on pyxnat users' experience.

Main changes in pyxnat

  • XNAT custom variables support in pyxnat:
    • custom variables are defined at the project level through a Project Object
    • the variables are usable at different studyProtocol levels
    • deletion of custom variables is not supported yet
The idea of supporting custom variables is to allow users to upload CSV data without having to define a custom schema.
  • Implementation of the new prearchive functions:
    • browse the prearchive
    • get prearchive status and additional info
    • move prearchive data to the archive
    • delete data from the prearchive
    • TODO: import files into the prearchive
  • New global scan listing:

To overcome a current limitation of the search engine, which cannot query at the scan level, a global scan listing function that uses the advanced filtering options of the REST API has been developed. It supports filtering at the project and subject levels, on the experiment and scan types, and on any other attribute the relevant scans may have.

  • Logging system:
    • improved error messages
    • different levels of verbosity, both on log files and stdout
    • circular log files
Until now PyXNAT lacked a logging system. The one that has been started is still at an early stage of development and is not available yet.
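
For reference, the three points above (verbosity levels per output, log files and stdout, circular log files) map fairly directly onto the standard 'logging' module. The sketch below is not the pyxnat implementation; the logger name, default levels and file sizes are assumptions chosen for illustration.

```python
import logging
import logging.handlers
import sys


def setup_logging(log_path, file_level=logging.DEBUG,
                  console_level=logging.WARNING):
    """Configure a logger with a rotating file and a stdout handler.

    Illustrative sketch only: the logger name and defaults are assumptions.
    """
    logger = logging.getLogger("pyxnat_sketch")
    logger.setLevel(logging.DEBUG)  # handlers do the actual filtering
    logger.propagate = False
    # Remove handlers from a previous call to avoid duplicate messages.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)

    fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")

    # Circular log files: rotate at about 1 MB and keep 3 old files.
    file_handler = logging.handlers.RotatingFileHandler(
        log_path, maxBytes=1000000, backupCount=3)
    file_handler.setLevel(file_level)
    file_handler.setFormatter(fmt)
    logger.addHandler(file_handler)

    # Separate verbosity level for stdout.
    console = logging.StreamHandler(sys.stdout)
    console.setLevel(console_level)
    console.setFormatter(fmt)
    logger.addHandler(console)
    return logger
```

With these defaults, a call such as log.debug('GET /REST/projects') is written to the file only, while warnings and errors also reach stdout.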

XNAT future developments

  • One major new feature of XNAT concerns the ability to access the schema through the REST services. Several ways to achieve that were discussed. The most obvious way is to have specific REST functions that let the user query the data model rather than only the data, as is currently the case. Another option would be a dynamic REST API that adapts to the schema itself and therefore provides the necessary information on the schema. The schema information is useful for users to discover data in XNAT, and several functions in pyxnat could be enhanced with it.
  • The custom variables feature of XNAT needs improved REST support to be fully functional in pyxnat, and may need some refactoring on the XNAT side as well.
  • The search engine should be refactored and updated at some point.
  • The catalog system should find its way into the database at some point to enable users to query directly at the file level.