canSAS-XI/DataFormats


Presentation

Session Notes

Jan : real-world examples of NXcanSAS files from facilities are needed. Tim suggests that everyone put them on Zenodo to get DOIs. Pete - post issues on the canSAS examples GitHub repository.

Pete - request for more examples of NXcanSAS files: https://github.com/canSAS-org/NXcanSAS_examples/issues/3#issuecomment-509619529


Reduced data from multiple detectors ....

- Is this a general problem?
- Jeff : looking at how to handle this for VSANS
- Issue for both raw and reduced : raw easier
- How to convert?
- AJJ : question is of multiple 2D q-spaces rather than detector spaces.
- Tim : multiple physical detectors vs moving detector
- AJJ : multiple SASdata entries in a single SASentry
- Jan : how to get software to read complex datasets (e.g. lots of mag fields with many detectors)
- AJJ : discussion of complexity ... some of the complex analyses will require the users to understand their data. Perhaps plugins in e.g. DPDAK?
- Is there a standard location for motors etc.? Not really ... each instrument/facility does it differently.
 - can we have a standard set of metadata tags for NXcanSAS to support 
 - how about more detailed application definitions?

- Pete - search methods, specify search path.
- Adam : SESANS data - should create an equivalent of SASdata for SESANS data.
- Brian : how to include multi-modal data - probably at the same level as a SASentry.
- Detectors : how do we link detector metadata to the q-spaces derived from them? E.g. multiple detector configurations might be used to give a single Q space. This is true for 1D as well.
- Stitching : discussion of stitching. How is it automated?
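
As an illustration of the "multiple SASdata entries in a single SASentry" idea above, here is a minimal sketch using h5py. The group names (sasentry01, sasdata_front, sasdata_rear), the in-memory file name and the numbers are made up for illustration. The attributes follow the NXcanSAS conventions as we understand them and should be checked against the application definition before use.

```python
import h5py
import numpy as np


def write_multi_detector_entry(h5file, datasets):
    """Write several 1D I(Q) curves as separate SASdata groups in one SASentry.

    `datasets` maps a group name (hypothetical, e.g. "sasdata_front")
    to a (Q, I) pair of equal-length sequences.
    """
    entry = h5file.create_group("sasentry01")  # assumed entry name
    entry.attrs["NX_class"] = "NXentry"
    entry.attrs["canSAS_class"] = "SASentry"
    entry.create_dataset("definition", data="NXcanSAS")

    for name, (q, intensity) in datasets.items():
        data = entry.create_group(name)
        data.attrs["NX_class"] = "NXdata"
        data.attrs["canSAS_class"] = "SASdata"
        data.attrs["signal"] = "I"
        data.attrs["I_axes"] = "Q"
        data.attrs["Q_indices"] = [0]
        ds_q = data.create_dataset("Q", data=np.asarray(q, dtype=float))
        ds_q.attrs["units"] = "1/angstrom"
        ds_i = data.create_dataset("I", data=np.asarray(intensity, dtype=float))
        ds_i.attrs["units"] = "1/cm"
    return entry


def list_sasdata(entry):
    """Return the names of all SASdata groups found directly under an entry."""
    return [
        name
        for name, item in entry.items()
        if item.attrs.get("canSAS_class") == "SASdata"
    ]


# Two detector banks covering different Q ranges (synthetic values).
q_rear = np.linspace(0.005, 0.06, 20)
q_front = np.linspace(0.05, 0.5, 20)
with h5py.File("multi_demo.h5", "w", driver="core", backing_store=False) as f:
    entry = write_multi_detector_entry(
        f,
        {
            "sasdata_rear": (q_rear, 1.0 / q_rear),
            "sasdata_front": (q_front, 1.0 / q_front),
        },
    )
    found = list_sasdata(entry)
```

A reader that walks the entry and collects every group tagged canSAS_class = SASdata, as list_sasdata does, would not need to know in advance how many detector banks were written.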

Discussion of dataset representation in Python

 - xarray - not fully featured.
 - scipp (C++ library similar to xarray, for Mantid) - reinventing NumPy! Not what we are looking for here.
 - DAWN has something
 - NeXpy supports network streaming / reading online rather than having to load the whole dataset
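
The point about not loading the whole dataset can be illustrated locally with plain h5py, which reads only the slice that is requested. This is a sketch of partial reading from an HDF5 file, not of NeXpy's network streaming. The dataset name, shape and chunking are assumptions chosen for the example.

```python
import h5py
import numpy as np


def read_frame(dataset, index):
    """Read one 2D frame from a (frames, y, x) dataset.

    Slicing an h5py dataset loads only the requested region into memory,
    not the full stack.
    """
    return dataset[index, :, :]


with h5py.File("stack_demo.h5", "w", driver="core", backing_store=False) as f:
    # One chunk per frame, so a single-frame read touches a single chunk.
    stack = f.create_dataset(
        "frames", shape=(100, 64, 64), dtype="f4", chunks=(1, 64, 64)
    )
    stack[10] = np.ones((64, 64), dtype="f4")
    frame = read_frame(stack, 10)
```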

Multiple detectors - multiple kinds of detector (Tim)

- e.g. optical camera + x-ray detector working at different frequencies. More frames from one than the other
- SAXS + WAXS at DESY - just write two NeXus files to avoid the problem. Server issues and I/O.
- I22 - timeframe generation (one file) + two separate detector files. Acquisition gives you a header NeXus file linked to the others. Currently looking at stacking in the right order in data structures - slowest at top.
- ESS - all timestamped ... 
- Pete - timestamping at synchrotrons will be coming. XPCS can use this.
- Brian : Visualisation
      - Time slider bars - showing last collected image.
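
When two detectors run at different frame rates and every frame is timestamped, one simple way to pair them up is a nearest-timestamp lookup. The sketch below uses only the standard library. The function name and the example timestamps are hypothetical, and it assumes the faster detector's timestamps are already sorted.

```python
from bisect import bisect_left


def nearest_frame_indices(slow_times, fast_times):
    """For each timestamp in slow_times, find the index of the closest
    timestamp in fast_times (which must be sorted in ascending order)."""
    if not fast_times:
        raise ValueError("fast_times must not be empty")
    indices = []
    for t in slow_times:
        pos = bisect_left(fast_times, t)
        if pos == 0:
            indices.append(0)
        elif pos == len(fast_times):
            indices.append(len(fast_times) - 1)
        else:
            before, after = fast_times[pos - 1], fast_times[pos]
            # Ties go to the earlier frame.
            indices.append(pos - 1 if t - before <= after - t else pos)
    return indices


# X-ray detector at 1 Hz, optical camera at 4 Hz (synthetic timestamps in seconds).
xray_times = [0.0, 1.0, 2.0]
camera_times = [i * 0.25 for i in range(9)]
matches = nearest_frame_indices(xray_times, camera_times)
```

The same lookup could drive a time slider that shows the camera image closest to the currently selected scattering frame.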

Metadata for Machine Learning

- Datasets generated by Diamond should be put online (3 TB issue!)
- Metadata from sample is required.
- XPDF users need the information about the sample. (same for liquids neutron diffraction).
- Need to move on from basics (Pete) - don't know what we need to know to feed the machine learning algorithms.
- What are the users going to give us?
  - Very little ... 
  - Use header NeXus file with link to raw detector data. Basically digital logbook.
  - Pete - shouldn't limit metadata for edge cases. Need to make the metadata possible and turn it off as needed.
  - Pete - automated addition of data from proposal/safety form
  - Brian - adding analyte composition and density etc.
  - Pete - issue with NXsample : requires chemical formula, not composition.
  - Some are easy for low throughput. Need an efficient way of entering the information. 
  - Should look at what the MX (macromolecular crystallography) community does.
  - Needs to be end-to-end. Has to be a benefit to user.
  - Machine learning : Pete - need some samples with bad metadata to teach algorithm!
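
The "header file with link to raw detector data" idea, with optional sample metadata, can be sketched with h5py external links. Everything specific here is an assumption for illustration: the file names, the /entry/data/data path in the detector file, the detector_data link name, and the choice of sample fields. Passing None for a field is one possible way to "turn off" metadata that is not available.

```python
import os
import tempfile

import h5py
import numpy as np


def write_header_with_link(header_path, detector_path, sample_metadata,
                           data_path="/entry/data/data"):
    """Write a small header file holding sample metadata and an external
    link to the raw data in a separate detector file.

    The link stores only the detector file's base name, so both files are
    assumed to stay together in the same directory.
    """
    with h5py.File(header_path, "w") as header:
        entry = header.create_group("entry")
        entry.attrs["NX_class"] = "NXentry"
        sample = entry.create_group("sample")
        sample.attrs["NX_class"] = "NXsample"
        for key, value in sample_metadata.items():
            if value is not None:  # None means "not provided": skip the field
                sample.create_dataset(key, data=value)
        entry["detector_data"] = h5py.ExternalLink(
            os.path.basename(detector_path), data_path
        )


with tempfile.TemporaryDirectory() as workdir:
    detector_file = os.path.join(workdir, "detector.h5")
    header_file = os.path.join(workdir, "header.h5")
    # Stand-in for a raw detector file written by the acquisition system.
    with h5py.File(detector_file, "w") as det:
        det.create_dataset("entry/data/data", data=np.zeros((5, 8, 8)))
    write_header_with_link(
        header_file,
        detector_file,
        {"name": "sample A", "chemical_formula": None},
    )
    with h5py.File(header_file, "r") as header:
        linked_shape = header["entry/detector_data"].shape
        sample_fields = sorted(header["entry/sample"].keys())
```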

Actions

- Identify the right location for data sets and upload to Zenodo and/or GitHub.

- Linking of detector metadata to SASdata entries.

- Writing of notes with examples for NXcanSAS usage.

- Build a list of suggested metadata with standard names, including how to build sample descriptions. Brian P to write a proposal for sample-relevant metadata and to circulate it to the Data Formats group.