canSAS-XIV
As usual, canSAS will hold some events in conjunction with the triennial SAS2024 meeting, held in Taipei, Taiwan, from 3-8 November 2024. canSAS-XIV will include a half-day session on the Sunday prior to the meeting and an open informational lunch session on Monday.
Purpose
The acronym canSAS stands for collective action for nomadic Small Angle Scatterers. It is a bottom-up effort originally initiated in Grenoble by Ron Ghosh (ILL), Roland May (ILL), Peter Timmins (ILL), Claudio Ferrero (ESRF) and Wim Bras (DUBBLE CRG) in 1998. Ever since, it has been an ongoing activity to provide the small-angle scattering user community with shared tools for data reduction and analysis, and to disseminate other pertinent information (https://www.cansas.org). Lately the focus has been on data analysis and modelling software, instrument resolution corrections and other issues related to data interpretation.
An independent multi-day meeting is held every few years, while shorter meetings are planned at the triennial SAS conferences to take advantage of the larger community being gathered in one place.
Activities
Session notes
Databases session:
FAIR data: checkers are available, for example www.f-uji.net/
Public databases: fairsharing.org, a catalog of FAIR databases, brings up two databases, SASBDB and [xx?] simplescattering.com. Zenodo also contains 14k datasets, as well as software.
Local databases: SciCat offers the possibility to catalog datasets and instruments. Tiled is a data access framework linked to Bluesky; a future connection with SciCat is planned.
- A publicly accessible (SciCat) catalog would be useful for labs to catalog public scattering datasets in.
- This could be taken up as part of the International Scattering Alliance, as it certainly falls within its remit. It needs a document specifying what is needed, what should be uploaded to it, and how much it would cost.
- Data to include in this would be (corrected, well-documented) data on easily accessible samples. This could include single phases (backgrounds) like water, hexane, toluene, acetone, PEG-DA, scotch tape, etc., and also reference samples such as the silver nanoparticle solutions. The datasets and samples should be well-described and ideally complete with a data processing graph.
- There should be a provision to add annotations (text, keywords, flags) to datasets; these could say a lot about the human interpretation of given datasets.
- This (and other well-documented databases) could have benefits for the third “user” group: the people who need your data to train ML systems.
- All databases need to have a good user interface that actually addresses the needs of the user. Once you build that, and users see the benefit they get from it, they will come. For example, MX people cannot work without their dashboards. FAIR-compliant user interfaces need to become easy enough to use that it becomes prohibitively hard for the user not to follow the recommended pathway.
- We might need guidelines on how to describe a sample sufficiently well that the description can be used in the analysis. This information, and the (corrected, fully described) data, should be made available in a public repository at the latest at the time of publication.
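To make the notes above more concrete, here is a minimal Python sketch of what a catalog entry with annotations (text, keywords, flags) and a completeness check might look like. It is only an illustration: the class names, field names and example values are all hypothetical, and none of them are taken from SciCat or any other existing schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    """Human-supplied note attached to a dataset (hypothetical structure)."""
    text: str = ""
    keywords: List[str] = field(default_factory=list)
    flags: List[str] = field(default_factory=list)


@dataclass
class DatasetEntry:
    """One catalogued scattering dataset (hypothetical fields, not a SciCat schema)."""
    sample_name: str
    sample_description: str
    corrected: bool
    processing_graph: str = ""  # e.g. a reference to a file describing the processing steps
    annotations: List[Annotation] = field(default_factory=list)

    def missing_items(self) -> List[str]:
        """List what still has to be supplied before the entry counts as well-described."""
        missing = []
        if not self.sample_description.strip():
            missing.append("sample description")
        if not self.corrected:
            missing.append("corrected data")
        if not self.processing_graph.strip():
            missing.append("data processing graph")
        return missing


# Illustrative values only, not real measurements or metadata.
entry = DatasetEntry(
    sample_name="water",
    sample_description="Deionised water in a 1 mm capillary",
    corrected=True,
)
entry.annotations.append(
    Annotation(
        text="Flat background, used as reference",
        keywords=["background"],
        flags=["reference"],
    )
)
print(entry.missing_items())
```

A check of this kind, run at upload time, is one way a user interface could steer contributors towards the recommended pathway. Which fields are actually required would be for the specification document mentioned above to decide.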
Organising Committee
- Paul Butler (NIST)
- U-Ser Jeng (NSRRC)
- Orion Shih (NSRRC)
- Adrian Rennie (Uppsala U)
- Steve King (ISIS)
- Wojciech Potrzebowski (SciLifeLab)