reliable, robust, and accessible


big data neuroscience

Ph.D. Preliminary Meeting

2017-10-19

Gregory Kiar

Outline

  • background
  • clowdr
  • provenance graph evaluation
  • replication/extension of analysis
  • timeline

background

  • data collection and federation are booming
  • but a reproducibility crisis plagues science
  • we need stable methods and evaluation

the "plan"

  • extend platforms for HPC in the cloud
  • create framework for evaluating prov. graphs
  • replicate and improve an impactful study

clowdr

  • Streamlining the tool-to-impact cycle
  • Prototype, parallelize, and publish
  • Microservice: no server, no database

clowdr + cbrain = cc

  • a common API for data and compute
  • CBRAIN as a portal for Compute Canada

provenance graphs

  • we can record a pipeline's internal calls
  • construct a provenance graph of execution
  • compare graphs across executions?
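As a minimal sketch of the construction step, assuming each recorded call lists the files it read and wrote: the record format (`cmd`, `inputs`, `outputs`), the function name, and the step and file names are all hypothetical, chosen for illustration only. Commands become nodes, and a file written by one command and read by another becomes an edge.

```python
def build_provenance_graph(calls):
    """Build a provenance graph from recorded calls.

    `calls` is a list of dicts with keys 'cmd', 'inputs', 'outputs'
    (a hypothetical record format, not a real tool's output).
    Returns (nodes, edges), where each edge is (producer, consumer, file).
    """
    # Map each written file to the command that produced it.
    producers = {}
    for call in calls:
        for path in call["outputs"]:
            producers[path] = call["cmd"]

    # A file read by one command and written by another links the two.
    edges = set()
    for call in calls:
        for path in call["inputs"]:
            producer = producers.get(path)
            if producer is not None and producer != call["cmd"]:
                edges.add((producer, call["cmd"], path))

    nodes = {call["cmd"] for call in calls}
    return nodes, edges


# Hypothetical three-step pipeline, for illustration only.
example_calls = [
    {"cmd": "step_a", "inputs": ["raw.nii"], "outputs": ["a.nii"]},
    {"cmd": "step_b", "inputs": ["a.nii"], "outputs": ["b.nii"]},
    {"cmd": "step_c", "inputs": ["a.nii", "b.nii"], "outputs": ["c.txt"]},
]
example_nodes, example_edges = build_provenance_graph(example_calls)
```

Files with no recorded producer (here `raw.nii`) are treated as external inputs and add no edge.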

example

Types of differences could include:

  • convergence rate
  • resulting file size
  • execution time
  • resources used
  • warnings
  • errors

node = command
edge = data
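One way such a comparison might look, as a sketch only: each execution is assumed to be summarized as a mapping from command (node) to recorded metrics. The metric names (`seconds`, `output_bytes`, `warnings`, `errors`), the function name, and the 10% tolerance are illustrative assumptions, not part of any existing tool.

```python
def compare_runs(run_a, run_b, rel_tol=0.1):
    """Report per-command metric differences between two executions.

    Each run maps a command name to a dict of metrics (hypothetical format).
    Numeric metrics are flagged when their relative difference exceeds
    `rel_tol`; any other values are flagged when they are not equal.
    """
    report = {}
    for cmd in sorted(set(run_a) | set(run_b)):
        # A command present in only one execution is itself a difference.
        if cmd not in run_a or cmd not in run_b:
            report[cmd] = {"missing_in": "a" if cmd not in run_a else "b"}
            continue
        a, b = run_a[cmd], run_b[cmd]
        diffs = {}
        for key in sorted(set(a) | set(b)):
            va, vb = a.get(key), b.get(key)
            if isinstance(va, (int, float)) and isinstance(vb, (int, float)):
                scale = max(abs(va), abs(vb))
                if scale > 0 and abs(va - vb) / scale > rel_tol:
                    diffs[key] = (va, vb)
            elif va != vb:
                diffs[key] = (va, vb)
        if diffs:
            report[cmd] = diffs
    return report


# Hypothetical metrics from two executions of the same pipeline.
run_a = {
    "step_a": {"seconds": 10.0, "output_bytes": 1000, "warnings": 0},
    "step_b": {"seconds": 5.0, "errors": 0},
}
run_b = {
    "step_a": {"seconds": 10.5, "output_bytes": 2000, "warnings": 0},
    "step_b": {"seconds": 5.0, "errors": 1},
}
example_report = compare_runs(run_a, run_b)
```

The relative tolerance keeps small run-to-run timing noise from being reported, while changes in output size or error counts still surface.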

neuroinformatics replication

  • replicate existing impactful study
  • perform stability analysis
  • improve the stability of tool(s)
  • replicate study with new tool
  • attempt to generalize old and new findings

for instance?

timeline

  • clowdr: 1 year
  • provenance analysis: 1.5 years
  • replication and extension study: 1 year
  • total: 3.5 years (grad. Spring 2021)

acknowledgements

Thanks to:

  • Tristan Glatard
  • Alan C. Evans
  • J.B. Poline
  • Pierre Bellec
  • Christine Tardif
  • Joshua T. Vogelstein
  • Heather Keightly
  • Palle Kiar
  • Family, Friends & Labmates

... for all of the proofreading, brainstorming, editing, listening, and support.

references

All references can be found in my proposal.

Thank you!!!