Cross-match Run2 x cosmoDC2
S. Plaszczynski, 24 Oct. 19
Updates
1. selections
1.1 source = Run2.1i (dr1-b): ~50M
1.2 target = CosmoDC2
2. healpixel grid
(nside, resol (arcsec))
[(1, 211076.28514206142),
 (2, 105538.14257103071),
 (4, 52769.071285515354),
 (8, 26384.535642757677),
 (16, 13192.267821378839),
 (32, 6596.133910689419),
 (64, 3298.0669553447096),
 (128, 1649.0334776723548),
 (256, 824.5167388361774),
 (512, 412.2583694180887),
 (1024, 206.12918470904435),
 (2048, 103.06459235452218),
 (4096, 51.53229617726109),
 (8192, 25.766148088630544),
 (16384, 12.883074044315272),
 (32768, 6.441537022157636),
 (65536, 3.220768511078818),
 (131072, 1.610384255539409),
 (262144, 0.8051921277697045),
 (524288, 0.40259606388485225)]
stats within 1 pixel:
+-----+--------+
|count|   count|
+-----+--------+
|    1|46150111|
|    2|  141961|
|    3|     103|
+-----+--------+
+-----+--------+
|count|    freq|
+-----+--------+
|    1|76819578|
|    2| 1739137|
|    3|   33127|
|    4|     586|
|    5|      10|
|   58|       2|
+-----+--------+
3. Full cosmoDC2xRun2 cross-match
Basically a join based on ipix, but there are problems at pixel boundaries (a point may lie close to a pixel border and should actually be associated with the neighbouring pixel).
Spark @cori (10 nodes) interactively:
+----+--------+--------------------+
|nass|   count|                frac|
+----+--------+--------------------+
|   1|34891784|  0.7304920953658602|
|   2|10656958|  0.2231133718942536|
|   3| 1914284| 0.04007732394208735|
|   4|  265944|  0.0055677860957175|
|   5|   31903|6.679191100821053E-4|
|   6|    3462|7.248020434141769E-5|
|   7|     386|8.081270616922943E-6|
|   8|      38|7.955655011478544E-7|
|   9|       5|1.046796712036650...|
|  58|       2|4.187186848146602...|
+----+--------+--------------------+
change def: now matched means r<1 arcmin
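The pixel-boundary issue from section 3 can be illustrated with a minimal sketch. It is not the Spark job used here: it assumes a flat square grid standing in for the HEALPix ipix index, and the names `cell_of`, `cross_match` and `nass_histogram` are hypothetical. The idea is only that each source is compared with targets in its own cell and in the 8 neighbouring cells, then filtered on distance. On the sphere the neighbouring pixels would come from the HEALPix library instead (for example healpy's `get_all_neighbours`, not shown).

```python
from collections import Counter, defaultdict
from math import floor, hypot


def cell_of(x, y, size):
    """Index of the square grid cell containing (x, y). Stand-in for ipix."""
    return (floor(x / size), floor(y / size))


def cross_match(sources, targets, radius):
    """Return {source index: sorted target indices closer than radius}.

    Assumption: flat coordinates. The cell size equals the match radius,
    so any target within the radius lies in the same or an adjacent cell.
    """
    size = radius
    grid = defaultdict(list)
    for j, (tx, ty) in enumerate(targets):
        grid[cell_of(tx, ty, size)].append(j)

    matches = {}
    for i, (sx, sy) in enumerate(sources):
        cx, cy = cell_of(sx, sy, size)
        found = []
        # Look in the source's own cell and its 8 neighbours.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    tx, ty = targets[j]
                    if hypot(tx - sx, ty - sy) < radius:
                        found.append(j)
        matches[i] = sorted(found)
    return matches


def nass_histogram(matches):
    """Count how many sources have 1, 2, 3, ... associations."""
    return Counter(len(v) for v in matches.values() if v)
```

For example, a source at (0.99, 0.5) and a target at (1.01, 0.5) fall in different cells for a radius of 1, so a plain join on the cell index would miss the pair, while the neighbour search above keeps it.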
4. DC2 validation
4.1 sample purity
angles: One can study the astrometric errors by comparing the local x/y distributions (dx = cos(DEC) Delta RA and dy = Delta DEC) between the matched points. Here is the 2D histogram (log scaled):
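As a minimal sketch of the local offsets defined above, the function below computes dx and dy in arcsec for one matched pair. The function name `local_offsets`, the use of degrees as input, the RA wrap-around handling and the choice of the mean declination inside the cosine are assumptions of this sketch, not something stated in these notes.

```python
import math

ARCSEC_PER_DEG = 3600.0


def local_offsets(ra1, dec1, ra2, dec2):
    """Small-angle offsets (dx, dy) in arcsec from point 1 to point 2.

    All inputs are in degrees. dx = cos(DEC) * Delta RA, dy = Delta DEC.
    """
    # Wrap the RA difference into [-180, 180) so pairs straddling RA=0 work.
    dra = (ra2 - ra1 + 180.0) % 360.0 - 180.0
    ddec = dec2 - dec1
    # Assumption: evaluate cos(DEC) at the mean declination of the pair.
    dec_mean = math.radians(0.5 * (dec1 + dec2))
    dx = math.cos(dec_mean) * dra * ARCSEC_PER_DEG
    dy = ddec * ARCSEC_PER_DEG
    return dx, dy
```

The radial distance used later in section 4.3 is then simply `math.hypot(dx, dy)`.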
what's the associated photometry? With a cut on delta(flux):
4.2 completeness:
4.3 Testing the PSF
According to Lupton, the astrometric errors are related to the PSF ones (in the Gaussian case) by Var(x) = 2\sigma^2/SNR^2, so that we can test the PSF by looking at psf_x = dx*SNR/\sqrt{2}; psf_y = dy*SNR/\sqrt{2}
in radial coordinates:
<fwhm> is ~30% larger than what appears in run2.
Another interesting way to look at this is to plot the distribution of r*SNR (r = distance between the 2 associated points). This is NOT the same as above because of the Jacobian of the transform, and it should follow 2\pi r \cdot Moffat(r). The mode (position of the maximum) is at ~0.40. For a Moffat distribution it lies at r_{max}=\frac{a}{\sqrt{2\beta-1}}=0.44 fwhm (for \beta=2), so we again find <fwhm> ~0.9 arcsec. Then if we histogram psf_r/psf_fwhm_i it should peak at 0.44. It is measured slightly higher (0.52), but given that the \sqrt{2} comes from a Gaussian model and that Moffat(\beta=2) is not a very good approximation, it looks quite fair.
4.4 Photometry
SNR
Magnitudes
Fluxes ( m=-2.5 \log_{10}(f_\nu)+31.4 )
- Testing the errors
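The magnitude definition quoted in section 4.4 can be sketched as below. The zero point 31.4 is taken from the text; that it corresponds to fluxes in nJy (AB system) is an assumption here, as are the function names. `mag_err` is the usual first-order error propagation, added only as a convenience for comparing flux errors with magnitude errors.

```python
import math

# Zero point from the text; matches AB magnitudes if fluxes are in nJy
# (assumption, the unit is not stated in these notes).
ZERO_POINT = 31.4


def flux_to_mag(flux):
    """m = -2.5 log10(f) + 31.4; returns NaN for non-positive fluxes."""
    if flux <= 0.0:
        return math.nan
    return -2.5 * math.log10(flux) + ZERO_POINT


def mag_to_flux(mag):
    """Inverse of flux_to_mag."""
    return 10.0 ** ((ZERO_POINT - mag) / 2.5)


def mag_err(flux, flux_err):
    """First-order magnitude error: (2.5 / ln 10) * flux_err / flux."""
    return 2.5 / math.log(10.0) * flux_err / flux
```

With this convention a source with SNR = flux/flux_err = 10 has a magnitude error of about 0.11 mag.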
4.5 stars
with PSF flux:
4.6 colors
5. Getting a probability for the match
- Theorem: if x \sim f then CDF(x)\sim u[0,1] , CDF=Cumulative Distribution Function of f
- from the golden sample we have distributions for distance/flux: construct binned CDFs ("cumsum")
- for each sample p_1=CDF(distance), p_2=CDF(flux): each is u[0,1] for the signal and peaked towards 0 for noise
- combine both: P=p_1\cdot p_2 is reasonable. This is not (yet) a probability (i.e. u[0,1] distributed for signal).
- compute (homework) CDF(P)= p_1 p_2(1- \ln[p_1 p_2])
This way we compute a probability (u[0,1] for the signal) for each candidate and take the max. You can later cut on that value to increase purity.
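The recipe above can be sketched in a few lines. This is only an illustration under stated assumptions: the empirical CDF uses the sorted golden sample directly rather than a binned cumsum, p_1 and p_2 are taken as already computed for each candidate, and the names `make_cdf`, `product_cdf`, `best_candidate` and `uniformity_check` are hypothetical. The last function checks numerically that p_1 p_2 (1 - ln[p_1 p_2]) is indeed u[0,1] when p_1 and p_2 are independent and uniform.

```python
import bisect
import math
import random


def make_cdf(golden_values):
    """Empirical CDF built from a reference ('golden') sample."""
    ordered = sorted(golden_values)
    n = len(ordered)

    def cdf(x):
        # Fraction of golden-sample values that are <= x.
        return bisect.bisect_right(ordered, x) / n

    return cdf


def product_cdf(p1, p2):
    """CDF of the product of two independent u[0,1] variables, at p1*p2."""
    z = p1 * p2
    if z <= 0.0:
        return 0.0  # limit of z * (1 - ln z) when z -> 0
    return z * (1.0 - math.log(z))


def best_candidate(candidates):
    """candidates: list of (id, p1, p2). Return (id, probability) of the max."""
    scored = [(cid, product_cdf(p1, p2)) for cid, p1, p2 in candidates]
    return max(scored, key=lambda item: item[1])


def uniformity_check(n=20000, seed=0):
    """Mean and fraction below 0.1 of product_cdf for uniform p1, p2."""
    rng = random.Random(seed)
    values = [product_cdf(rng.random(), rng.random()) for _ in range(n)]
    return sum(values) / n, sum(v < 0.1 for v in values) / n
```

If the combined value is uniform for the signal, `uniformity_check()` should return a mean close to 0.5 and a fraction close to 0.1, and a cut at probability > c keeps a fraction 1 - c of the true matches.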
6. Conclusions
- cosmoDC2xRun2 match in 3 min with Spark (with a solution for the pixel boundaries): can provide an ObjectId<->galaticId parquet file (is anyone interested?)
- unbiased astrometry/fluxes distributions
- astrometric error FWHM compatible with psf_fwhm/SNR at the 40% level. If it is instead considered as an equivalent Gaussian sigma (= FWHM/2.355), it is wrong by a factor 3.3
- not clear what cModelFluxErr_i really means (or how to cut on that), but nice unskewed distribution
- match probability