Data Formats Working Group: Difference between revisions

From canSAS
(PRJ: added more organizational details and summarized email discussion to date)
[http://smallangles.net/pipermail/cansas-1dwg_smallangles.net/ Mailing List Archive]
===Timeline===
* 2007-12-31 agree on v1.0 format
* 2008-01-01 start implementing v1 at facilities
* 2008-06 representative sampling of data available for inter-facility comparison
* 2008-10 presentation of results at NOBUGS2008 meeting (date TBA)
===Considerations===
* a key point of what we discussed at NIST: our goal is to agree on a format that, whilst using as much best XML practice as is reasonable, leaves the file instantly human-readable, editable in the simplest of editors, and importable by simple text import filters in programs that don't recognise the XML.
* document what we decide
**1DWG will take care of documenting the format it defines.
*** make that definition with a schema (for absolute validation of any proposed XML file against the standard)
*** instructions on how to use that schema
*** XSL style sheets to present the XML contents in various forms (also serves as examples)
*** a couple of examples
*** maybe also some words.
** move some of this discussion to
*** discussion page
*** other wiki pages
*** /dev/null after its usefulness has been exhausted
* coordinate with other communities
** [http://www.nexusformat.org NeXus (http://www.nexusformat.org)]
** reflectivity
** powder diffraction
*** [http://www.xrdml.com XRDML (http://www.xrdml.com)]
* should we consider a file naming convention?
* should we consider a SAS scan naming convention?
** sequential run number from facility
** convention set by the detector software provider
* XML representation of the I vs. Q data
** tabular format
** vector format
* general XML coding style
** readability by humans
*** with lots of computer skills
*** with rudimentary computer skills
** readability by computers
*** standard XML libraries
*** generic visualization tools
*** common software such as MS Excel, Open Office, or [ftp://ftp.ill.fr/pub/cs/rxml XMLPLO source code for windows/linux/OSX86]
** availability of style sheets
* scalability of XML format to 2D data?
* What is required?
* What is optional?
* Use the same tags again in similar contexts
** X,Y pairs for example, whether detector position, beam center, sample position
{|  width="100%" border="1" style="border:1px solid navy; border-collapse: collapse"
|-
! width="50%" | inconsistent
! width="50%" | consistent
|-
| <pre>
<beam_size axis="x" units="mm">12.00</beam_size>
<beam_size axis="y" units="mm">12.00</beam_size>
<x0 units="mm">322.64</x0>
<y0 units="mm">327.68</y0>
<pixel_x units="mm">5.00</pixel_x>
<pixel_y units="mm">5.00</pixel_y>
</pre>
| <pre>
<beam_size axis="x" units="mm">12.00</beam_size>
<beam_size axis="y" units="mm">12.00</beam_size>
<beam_center axis="x" units="mm">322.64</beam_center>
<beam_center axis="y" units="mm">327.68</beam_center>
<pixel_size  axis="x" units="mm">5.00</pixel_size>
<pixel_size  axis="y" units="mm">5.00</pixel_size>
</pre>
|}
===Points for Discussion===
* Do we want to advocate/recommend particular names for particular tags, e.g., SASdata, SASsample, Idata, etc.?
** which ones?
* provide for (optional) inclusion of sample prep details
* provide for (optional) inclusion of other (non-SAS) data in the XML
* Need to allow for more than a single SAS data set in one .xml file
===Other Points===
* It's not clear how to specify that multiple runs were reduced together
* How does one include the instrument information of the many runs that were used to make up the composite file?
* If we have reduction information, then everything needs to be in there, i.e. the run numbers for the can, the standard, the uniform field, etc.
* Information on the averaging: is it radial, sector, rectangular, etc.?
===Members===
* Andrew Jackson (NIST)

Revision as of 16:43, 12 December 2007

Mailing List Archive

Timeline

  • 2007-12-31 agree on v1.0 format
  • 2008-01-01 start implementing v1 at facilities
  • 2008-06 representative sampling of data available for inter-facility comparison
  • 2008-10 presentation of results at NOBUGS2008 meeting (date TBA)

Considerations

  • a key point of what we discussed at NIST: our goal is to agree on a format that, whilst using as much best XML practice as is reasonable, leaves the file instantly human-readable, editable in the simplest of editors, and importable by simple text import filters in programs that don't recognise the XML.

  • document what we decide
    • 1DWG will take care of documenting the format it defines.
      • make that definition with a schema (for absolute validation of any proposed XML file against the standard)
      • instructions on how to use that schema
      • XSL style sheets to present the XML contents in various forms (also serves as examples)
      • a couple of examples
      • maybe also some words.
    • move some of this discussion to
      • discussion page
      • other wiki pages
      • /dev/null after its usefulness has been exhausted
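  • coordinate with other communities
    • NeXus (http://www.nexusformat.org)
    • reflectivity
    • powder diffraction
      • XRDML (http://www.xrdml.com)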
  • should we consider a file naming convention?
  • should we consider a SAS scan naming convention?
    • sequential run number from facility
    • convention set by the detector software provider
  • XML representation of the I vs. Q data
    • tabular format
    • vector format
  • general XML coding style
    • readability by humans
      • with lots of computer skills
      • with rudimentary computer skills
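    • readability by computers
      • standard XML libraries
      • generic visualization tools
      • common software such as MS Excel, Open Office, or XMLPLO source code for windows/linux/OSX86 (ftp://ftp.ill.fr/pub/cs/rxml)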
    • availability of style sheets
  • scalability of XML format to 2D data?
  • What is required?
  • What is optional?
  • Use the same tags again in similar contexts
    • X,Y pairs for example, whether detector position, beam center, sample position
inconsistent:
<beam_size axis="x" units="mm">12.00</beam_size>
<beam_size axis="y" units="mm">12.00</beam_size>
<x0 units="mm">322.64</x0>
<y0 units="mm">327.68</y0>
<pixel_x units="mm">5.00</pixel_x>
<pixel_y units="mm">5.00</pixel_y>

consistent:
<beam_size axis="x" units="mm">12.00</beam_size>
<beam_size axis="y" units="mm">12.00</beam_size>
<beam_center axis="x" units="mm">322.64</beam_center>
<beam_center axis="y" units="mm">327.68</beam_center>
<pixel_size  axis="x" units="mm">5.00</pixel_size>
<pixel_size  axis="y" units="mm">5.00</pixel_size>
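
One practical benefit of the consistent naming is that a reader needs only one routine for every X,Y pair. The following is a minimal sketch, not part of any agreed format: it wraps the consistent tags in a hypothetical <example> root element and uses a hypothetical helper, read_pair, built on Python's standard xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# The "consistent" tags from this page, wrapped in a hypothetical
# <example> root so that the fragment is a well-formed XML document.
SNIPPET = """
<example>
  <beam_size axis="x" units="mm">12.00</beam_size>
  <beam_size axis="y" units="mm">12.00</beam_size>
  <beam_center axis="x" units="mm">322.64</beam_center>
  <beam_center axis="y" units="mm">327.68</beam_center>
  <pixel_size axis="x" units="mm">5.00</pixel_size>
  <pixel_size axis="y" units="mm">5.00</pixel_size>
</example>
"""


def read_pair(root, tag):
    """Return {axis: (value, units)} for every child element named `tag`.

    Hypothetical helper: it relies only on the shared axis/units
    attributes, so the same code serves beam_size, beam_center and
    pixel_size alike.
    """
    return {
        el.get("axis"): (float(el.text), el.get("units"))
        for el in root.findall(tag)
    }


root = ET.fromstring(SNIPPET)
for tag in ("beam_size", "beam_center", "pixel_size"):
    print(tag, read_pair(root, tag))
```

With the inconsistent names (x0, y0, pixel_x, pixel_y), the same reader would need a separate lookup rule for each quantity.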

Points for Discussion

  • Do we want to advocate/recommend particular names for particular tags, e.g., SASdata, SASsample, Idata, etc.?
    • which ones?
  • provide for (optional) inclusion of sample prep details
  • provide for (optional) inclusion of other (non-SAS) data in the XML
  • Need to allow for more than a single SAS data set in one .xml file
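
To make the last point concrete, here is a rough sketch of how a reader might handle several SAS data sets in one .xml file. Everything about the layout is an assumption for illustration only: the SASroot wrapper, the name attribute, the Q and I child elements, the units strings, and the numeric values are all made up. Only the tag names SASdata and Idata come from the discussion above, and the group has not yet decided on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical file layout (NOT an agreed format): one <SASdata> element
# per data set, each holding one <Idata> row per (Q, I) point.
# All numbers below are invented for illustration.
DOC = """
<SASroot>
  <SASdata name="run1">
    <Idata><Q units="1/A">0.01</Q><I units="1/cm">120.5</I></Idata>
    <Idata><Q units="1/A">0.02</Q><I units="1/cm">98.3</I></Idata>
  </SASdata>
  <SASdata name="run2">
    <Idata><Q units="1/A">0.01</Q><I units="1/cm">15.2</I></Idata>
  </SASdata>
</SASroot>
"""


def load_datasets(xml_text):
    """Return {data set name: [(Q, I), ...]} for every <SASdata> element."""
    root = ET.fromstring(xml_text)
    datasets = {}
    for sasdata in root.findall("SASdata"):
        rows = [
            (float(row.findtext("Q")), float(row.findtext("I")))
            for row in sasdata.findall("Idata")
        ]
        datasets[sasdata.get("name")] = rows
    return datasets


datasets = load_datasets(DOC)
for name, rows in datasets.items():
    print(name, rows)
```

This uses a tabular, one-row-per-point arrangement; a vector arrangement (all Q values in one element, all I values in another) would need a different reader.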

Other Points

  • It's not clear how to specify that multiple runs were reduced together
  • How does one include the instrument information of the many runs that were used to make up the composite file?
  • If we have reduction information, then everything needs to be in there, i.e. the run numbers for the can, the standard, the uniform field, etc.
  • Information on the averaging: is it radial, sector, rectangular, etc.?


Members

  • Andrew Jackson (NIST)
  • Pete Jemian (APS)
  • Steve King (ISIS)
  • Ken Littrell (ORNL)
  • Andy Nelson (ANSTO)
  • Ron Ghosh (ILL)
  • Jan Ilavsky (APS)

News/Status