Talk:Data Formats Working Group

From canSAS

Which way to declare the units?

--Jemian 18:06, 17 December 2007 (EST) Cast your vote by attaching initials to one of these three methods.

In the Idata tag

votes: PRJ, ARJN, SMK, AJJ

    <SASdata>
      <Idata>
        <Q units="1/A"></Q>
        <I units="1/cm"></I>
        <Qdev units="1/A"></Qdev>
        <Idev units="1/cm"></Idev>
        <Qfwhm units="1/A"><!--  Qfwhm is optional  --></Qfwhm>
        <Qmean units="1/A"><!--  Qmean is optional  --></Qmean>
        <ShadowFactor units="none"><!--  ShadowFactor is optional  --></ShadowFactor>
      </Idata>
    </SASdata>
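
To make the comparison concrete, here is a minimal Python sketch of how a reader might pick up the units under this first layout. It uses only the standard-library xml.etree.ElementTree module. The sample numbers and the function name read_idata_units are invented for illustration and are not part of any agreed format.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the "units in the Idata tag" layout; values are invented.
SAMPLE = """
<SASdata>
  <Idata>
    <Q units="1/A">0.01</Q>
    <I units="1/cm">1000.0</I>
    <Qdev units="1/A">0.0001</Qdev>
    <Idev units="1/cm">30.0</Idev>
  </Idata>
</SASdata>
"""

def read_idata_units(xml_text):
    """Return one {tag: (value, units)} dict per Idata element."""
    root = ET.fromstring(xml_text)
    rows = []
    for idata in root.iter("Idata"):
        row = {}
        for child in idata:
            # Each value carries its own units attribute, so no lookup table is needed.
            row[child.tag] = (float(child.text), child.get("units"))
        rows.append(row)
    return rows

rows = read_idata_units(SAMPLE)
print(rows[0]["Q"])
```

The units travel with every value, which is verbose but means a single Idata row can be interpreted without looking anywhere else in the file.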

As attributes of the SASdata tag

votes:

    <SASdata 
         entries="Q:I:Qdev:Idev:Qfwhm:Qmean:ShadowFactor" 
         units="1/A:1/cm:1/A:1/cm:1/A:1/A:none">
      <Idata>
        <Q></Q>
        <I></I>
        <Qdev></Qdev>
        <Idev></Idev>
        <Qfwhm><!--  Qfwhm is optional  --></Qfwhm>
        <Qmean><!--  Qmean is optional  --></Qmean>
        <ShadowFactor><!--  ShadowFactor is optional  --></ShadowFactor>
      </Idata>
    </SASdata>
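
Under this second layout a reader has to split the two colon-separated attributes and pair them up by position. A minimal Python sketch follows, again with invented sample values and a hypothetical function name (units_from_sasdata).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the "attributes of the SASdata tag" layout; values are invented.
SAMPLE = """
<SASdata entries="Q:I:Qdev:Idev" units="1/A:1/cm:1/A:1/cm">
  <Idata>
    <Q>0.01</Q>
    <I>1000.0</I>
    <Qdev>0.0001</Qdev>
    <Idev>30.0</Idev>
  </Idata>
</SASdata>
"""

def units_from_sasdata(xml_text):
    """Map each entry name to its units by position in the two attribute lists."""
    root = ET.fromstring(xml_text)  # root is the SASdata element
    names = root.get("entries", "").split(":")
    units = root.get("units", "").split(":")
    # Nothing in XML itself keeps the two lists in step, so check explicitly.
    if len(names) != len(units):
        raise ValueError("entries and units must list the same number of items")
    return dict(zip(names, units))

unit_map = units_from_sasdata(SAMPLE)
print(unit_map["I"])
```

The length check is there because the pairing is purely positional: adding or dropping one entry without touching the units list would silently shift every later unit.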

As axis declarations within the SASdata tag

votes:

    <SASdata>
      <axis name="Q"     units="1/A"  type="Idata" />
      <axis name="I"     units="1/cm" type="Idata" />
      <axis name="Qdev"  units="1/A"  type="constant" >0.000085</axis>
      <axis name="Idev"  units="1/cm" type="Idata" />
      <axis name="Qfwhm" units="1/A"  type="constant" >0.000191</axis>
      <axis name="Qmean" units="1/A"  type="undefined" />
      <axis name="ShadowFactor" units="none"  type="undefined" />
      <Idata>
        <Q></Q>
        <I></I>
        <Qdev></Qdev>
        <Idev></Idev>
        <Qfwhm><!--  Qfwhm is optional  --></Qfwhm>
        <Qmean><!--  Qmean is optional  --></Qmean>
        <ShadowFactor><!--  ShadowFactor is optional  --></ShadowFactor>
      </Idata>
    </SASdata>
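
With axis declarations, a reader collects the units once and also learns which quantities are constant for the whole data set. The Python sketch below is one possible reading of this layout. It assumes that only an axis with type="constant" carries a value in its own text; that interpretation, the sample values in Idata, and the function name read_axes are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the "axis declarations" layout; Idata values are invented.
SAMPLE = """
<SASdata>
  <axis name="Q" units="1/A" type="Idata" />
  <axis name="I" units="1/cm" type="Idata" />
  <axis name="Qdev" units="1/A" type="constant">0.000085</axis>
  <axis name="Qmean" units="1/A" type="undefined" />
  <Idata>
    <Q>0.01</Q>
    <I>1000.0</I>
  </Idata>
</SASdata>
"""

def read_axes(xml_text):
    """Return {name: {"units", "type", "constant"}} from the axis declarations."""
    root = ET.fromstring(xml_text)
    axes = {}
    for axis in root.findall("axis"):
        kind = axis.get("type")
        # Assumption: only type="constant" stores a value in the axis element itself.
        constant = float(axis.text) if kind == "constant" else None
        axes[axis.get("name")] = {
            "units": axis.get("units"),
            "type": kind,
            "constant": constant,
        }
    return axes

axes = read_axes(SAMPLE)
print(axes["Qdev"])
```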

--Jemian 12:01, 14 December 2007 (EST) items moved from main page

Considerations

  • a key point of what we discussed at NIST:

namely, that our goal is to agree on a format that, whilst using as much XML best practice as is reasonable, leaves the file instantly human-readable, editable in the simplest of editors, and importable by simple text import filters in programs that don't recognise XML.

  • document what we decide
    • 1DWG will take care of documenting the format it defines.
      • make that definition with a schema (for absolute validation of any proposed XML file against the standard)
      • instructions on how to use that schema
      • XSL style sheets to present the XML contents in various forms (also serves as examples)
      • a couple of examples
      • maybe also some words.
    • move some of this discussion to
      • discussion page
      • other wiki pages
      • /dev/null after its usefulness has been exhausted
  • coordinate with other communities
  • should we consider a file naming convention?
  • should we consider a SAS scan naming convention?
    • sequential run number from facility
    • convention set by the detector software provider
  • XML representation of the I vs. Q data
    • tabular format
    • vector format
  • general XML coding style
    • readability by humans
      • with lots of computer skills
      • with rudimentary computer skills
    • readability by computers
    • availability of style sheets
  • scalability of XML format to 2D data?
  • What is required?
  • What is optional?
  • Use the same tags again in similar contexts
    • X,Y pairs for example, whether detector position, beam center, sample position
inconsistent:

    <beam_size axis="x" units="mm">12.00</beam_size>
    <beam_size axis="y" units="mm">12.00</beam_size>
    <x0 units="mm">322.64</x0>
    <y0 units="mm">327.68</y0>
    <pixel_x units="mm">5.00</pixel_x>
    <pixel_y units="mm">5.00</pixel_y>

consistent:

    <beam_size axis="x" units="mm">12.00</beam_size>
    <beam_size axis="y" units="mm">12.00</beam_size>
    <beam_center axis="x" units="mm">322.64</beam_center>
    <beam_center axis="y" units="mm">327.68</beam_center>
    <pixel_size  axis="x" units="mm">5.00</pixel_size>
    <pixel_size  axis="y" units="mm">5.00</pixel_size>
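
One practical payoff of the consistent form is that a single helper can read every X,Y pair. The Python sketch below illustrates this; the wrapping geometry element and the helper name read_xy are hypothetical and only there to make the example self-contained.

```python
import xml.etree.ElementTree as ET

# The <geometry> wrapper is hypothetical; the child tags follow the consistent form.
SAMPLE = """
<geometry>
  <beam_size axis="x" units="mm">12.00</beam_size>
  <beam_size axis="y" units="mm">12.00</beam_size>
  <beam_center axis="x" units="mm">322.64</beam_center>
  <beam_center axis="y" units="mm">327.68</beam_center>
  <pixel_size axis="x" units="mm">5.00</pixel_size>
  <pixel_size axis="y" units="mm">5.00</pixel_size>
</geometry>
"""

def read_xy(parent, tag):
    """Return {axis: (value, units)} for every <tag axis="..."> directly under parent."""
    return {
        el.get("axis"): (float(el.text), el.get("units"))
        for el in parent.findall(tag)
    }

root = ET.fromstring(SAMPLE)
beam_center = read_xy(root, "beam_center")
pixel_size = read_xy(root, "pixel_size")
print(beam_center, pixel_size)
```

With the inconsistent form, the same reader would need a separate rule for each naming pattern (x0/y0, pixel_x/pixel_y, and so on).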

Points for Discussion

  • Do we want to recommend particular names for particular tags, e.g. SASdata, SASsample, Idata, etc.?
    • which ones?
  • provide for (optional) inclusion of sample prep details
  • provide for (optional) inclusion of other (non-SAS) data in the XML
  • Need to allow for more than a single SAS data set in one .xml file

Other Points

  • It's not clear how to specify that multiple runs were reduced together
    • (AJJ) Assuming that those multiple runs were first stored as XML, referencing the individual files would give back all that information (à la Ghosh's suggestion). At NIST we take absolute I vs. Q files and combine them to produce an absolute I vs. Q file, so that approach is reasonable here. What about elsewhere?
  • How does one include the instrument information of the many runs that were used to make up the composite file?
  • If we have reduction information, then everything needs to be in there, i.e. the run numbers for the can, the standard, the uniform field, etc.
  • Information on the averaging: is it radial, sector, rectangular, etc.?