NYU Value-Added Galaxy Catalog

Corresponding authors:
Michael R. Blanton and David W. Hogg
Center for Cosmology and Particle Physics
Department of Physics
New York University

Additional authors: David Schlegel (LBL), Douglas Finkbeiner (Princeton University), Nikhil Padmanabhan (Princeton University), Max Tegmark (MIT), Idit Zehavi (University of Arizona), Andreas Berlind (NYU), Ryan Scranton (UPitt), Christy Tremonti (University of Arizona), Jeff Munn (USNO), Gillian Knapp (Princeton University), James Gunn (Princeton University)

  1. General description
  2. DR6 has arrived!
  3. Conditions of use
  4. Downloading the data
  5. Organization of the data
  6. Description of the sample geometry
  7. Software tools
  8. Datasweeps of full SDSS catalog
  9. FAQ

Note any IMPORTANT CHANGES

General Information

The NYU Value-Added Galaxy Catalog (NYU-VAGC) is a cross-matched collection of galaxy catalogs maintained for the study of galaxy formation and evolution. It includes carefully constructed large-scale structure samples useful for calculating power spectra, correlation functions, etc.

This catalog is described in a paper published in the Astronomical Journal. Usage of the NYU-VAGC requires citing this paper (as well as the appropriate citations for component catalogs and calibrations).

The latest sample (DR6) consists of 9938 sq deg of photometric imaging and 6750 sq deg of spectroscopic coverage.

The most useful components of the catalog are:

  1. catalog of low-redshift galaxies (10 < d < 150 Mpc/h) [NOT FOR DR6 YET]
  2. large-scale structure samples
  3. K-corrections

Currently, no funding exists for the NYU-VAGC and we create it for the warm, fuzzy feeling inside that it gives us when we think that people might be using it. In order to feed that warm, fuzzy feeling we greatly appreciate any feedback!

DR6 has had an important fix

The SDSS DR6 was released in November 2007, and we released a new version of the NYU-VAGC with it.

DR6 consists of 9938 sq deg of photometric imaging and 6750 sq deg of spectroscopic coverage.

There was an important problem in DR6: several runs had fields missing (run numbers 6123, 6122, 6075, 5960, 5934, 5817, 5421, 5403, 5396, 5384, 4832, and 4828). This means that any area for which these runs are primary appears empty in the "vagc-dr6/vagc0" catalogs.

While we won't change the "vagc-dr6/vagc0" directories, we have released a new version of the LSS samples, "dr6fix", which excludes the affected areas (about 20 sq deg). The old "dr6" results are available in the old locations.

Conditions of use

If you download NYU-VAGC data, we would appreciate you:

Downloading the data

Here we describe how to get to the data, and below we describe how the data is organized.

The public release data is available through a set of directories conforming to the data model described below. The base URLs are:

You can retrieve this set of data automatically using Unix WWW downloaders such as GNU wget or curl. Special procedures exist for SDSS participants.

Organization of the data

All entries in the catalog are sources in the SDSS imaging survey selected in one of the following ways:

The selection is designed to include all galaxies relevant to analyzing the spectroscopic survey.

We match all of the objects to the SDSS spectroscopic survey, to the FIRST radio survey, to the 2MASS Point Source Catalog, to the 2MASS extended source catalog, to the Two-degree Field Galaxy Redshift Survey, to the IRAS Point Source Catalog Redshift Survey, to Reference Catalog 3 (RC3.9b), to the GALEX GR1 release, and to the Spitzer SWIRE catalog.

We present all the data in the form of FITS binary tables, FTCL parameter files, and mangle-style polygon files. idlutils contains tools for reading all of these files into IDL structures.

In the $VAGC_REDUX directory are the files:

  1. object_catalog.fits: the list of coordinates of every object in the catalog, and their positions (zero-indexed) in the constituent catalogs.
  2. object_sdss_imaging.fits: some photometric catalog data for all of the objects in the catalog
    (with full photometric data also available, as well as 2MASS Point Source Catalog and FIRST Catalog data).
  3. object_sdss_spectro.fits: spectroscopic catalog data for all of the objects
    (SDSS_SPECTRO_TAG==-1 for objects with no spectroscopic data).
  4. object_sdss_tiling.fits: tiling catalog data for all of the objects
    (SDSS_TILING_TAG==-1 for objects with no spectroscopic data).
  5. object_twomass.fits: 2MASS Extended Source Catalog data for all the objects
    (TWOMASS_TAG==-1 for objects with no 2MASS data)
  6. object_pscz.fits: PSCz catalog data for all the objects
    (PSCZ_TAG==-1 for objects with no PSCz data)
  7. object_rc3.fits: RC3 catalog data for all the objects
    (RC3_TAG==-1 for objects with no RC3 data)
  8. object_twodf.fits: 2dFGRS catalog data for all the objects
    (TWODF_TAG==-1 for objects with no 2dFGRS data)
  9. object_galex.fits: GALEX GR1 catalog data for all the objects
    (GALEX_TAG==-1 for objects with no GALEX data)
  10. object_swire.fits: Spitzer SWIRE catalog data for all the objects
    (SWIRE_TAG==-1 for objects with no SWIRE data)

Each row refers to the same object in all of the files; that is, the data in row 10 of object_twomass.fits is the 2MASS data available for the object in row 10 of object_sdss_imaging.fits.
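
Because the files are row-aligned, a mask built from the tag column of one file can be applied to the rows of any other. The following is a minimal sketch in Python, using plain lists as stand-ins for columns that have already been read from the FITS tables. The reading step is omitted, and the helper name, the RA column, and the toy values are all hypothetical.

```python
def rows_with_match(tags):
    """Return the row numbers whose tag is not the -1 'no match' sentinel."""
    return [row for row, tag in enumerate(tags) if tag != -1]


# Toy stand-ins for columns of the row-aligned "object_" files.
twomass_tag = [-1, 0, -1, 1]             # e.g. TWOMASS_TAG values, one per object
sdss_ra = [150.1, 150.2, 150.3, 150.4]   # hypothetical column from object_sdss_imaging.fits

matched = rows_with_match(twomass_tag)

# Row i is the same object in every file, so the same row numbers index both lists.
ra_of_twomass_objects = [sdss_ra[row] for row in matched]
```

The same pattern applies to any of the other tag columns listed above.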

Note that the above files only contain the objects in each survey which match entries in object_sdss_imaging. In order to access the other objects from the survey in question, you can look in files of the form:

$VAGC_REDUX/[name]/[name]_catalog.fits

See the documentation for the "object_" files for more details.
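
If the tag values are the zero-indexed positions in the constituent catalogs, as described for object_catalog.fits, then the rows of a constituent catalog that did not match any NYU-VAGC object are those whose position never appears as a tag. A short Python sketch under that assumption (the function name is hypothetical, and reading the files is omitted):

```python
def unmatched_rows(n_catalog_rows, tags):
    """Rows of a constituent catalog that no NYU-VAGC object points to.

    n_catalog_rows: number of rows in [name]_catalog.fits
    tags: per-object tag values, with -1 meaning "no match" (assumed to be
          zero-indexed row positions in that catalog otherwise)
    """
    matched = {tag for tag in tags if tag != -1}
    return [row for row in range(n_catalog_rows) if row not in matched]
```

For example, with a five-row catalog and tags [-1, 0, 3, -1], rows 1, 2, and 4 have no counterpart among the "object_" files.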

Some derived quantities are also available for each object in subdirectories. Currently these consist of:

  1. kcorrect: K-corrections of the ugrizJHK magnitudes (using various collision-corrected redshifts, definitions of magnitude, and bandpasses)
  2. cutouts: some cutout images of selected objects
  3. quality: eyeball quality checks
  4. collisions: a best redshift for each object including correction for collisions, using several methods (these values are useful for some large-scale structure work)
  5. velmod_distance: distances corrected for peculiar velocities
  6. doublestar: attempt to detect poorly deblended double stars
  7. sersic: Sersic radial profile fits
  8. lowz: a specially prepared catalog of low-redshift galaxies [NOT FOR DR6 YET]
  9. matchspec: attempt to match objects to spectra in a more generous way (connecting objects further than 2 arcsec)
  10. parents: properties of parent objects

Description of the geometry of the catalog

For the SDSS imaging survey we have an expression for its geometry in terms of spherical polygons.
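
For readers unfamiliar with spherical polygons, the idea can be sketched in a few lines of Python. The sketch assumes the mangle-style convention in which a polygon is the intersection of caps, each cap being a unit vector plus a value cm = 1 - cos(theta), with a negative cm denoting the complement of the cap. The function names are illustrative only and are not part of idlutils or mangle, which should be used for real work.

```python
import math


def radec_to_unit(ra_deg, dec_deg):
    """Convert RA, Dec in degrees to a Cartesian unit vector."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))


def in_cap(point, cap):
    """cap = (x, y, z, cm) with cm = 1 - cos(theta); cm < 0 means the complement."""
    x, y, z, cm = cap
    one_minus_cos = 1.0 - (x * point[0] + y * point[1] + z * point[2])
    if cm >= 0:
        return one_minus_cos < cm
    return one_minus_cos > -cm


def in_polygon(point, caps):
    """A point is inside a polygon if it is inside every one of its caps."""
    return all(in_cap(point, cap) for cap in caps)


# Toy polygon: within 10 deg of the north pole, excluding the innermost 2 deg.
outer = (0.0, 0.0, 1.0, 1.0 - math.cos(math.radians(10.0)))
inner_excluded = (0.0, 0.0, 1.0, -(1.0 - math.cos(math.radians(2.0))))
annulus = [outer, inner_excluded]
```

With this toy polygon, a point at Dec = 85 deg is inside, while points at Dec = 89 deg (inside the excluded inner cap) and Dec = 70 deg (outside the outer cap) are not.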

However, for most people who are worried about the angular selection function, it is best to use a large-scale structure subsample, where the imaging, target, and tiling masks are combined and the flux limit and completeness are tracked as a function of position. In the $LSS_REDUX/dr6fix ("drtwo14" for the DR2 version) directory are the files:

  1. lss_combmask.dr6fix.fits: geometry of the sample, with areas around bright stars excised
  2. lss_index.dr6fix.fits: positions of all objects in VAGC and which lss_combmask polygon they are in (-1 if outside OR if inside the bright star mask)

The geometry is built out of two separate files, one expressing the window and the other expressing the bright star mask:
  1. lss_geometry.dr6fix.fits: file describing geometry of the full large-scale structure sample
  2. lss_bsmask.dr6fix.fits: file describing the bright star mask

In addition there are a number of subdirectories. Each subdirectory corresponds to a "pre-redshift" selection criterion; this means that the objects and the area of sky have been selected according to flux limit, completeness, and other properties which do not require knowing the object redshift. Further subdirectories provide complete subsamples based on cuts made after the redshift determination: on redshift, luminosity, intrinsic color, etc. Further details on the large-scale structure samples are available.

Software tools

We use a large suite of tools in order to create this catalog. The basics are all in idlutils. Public tools for dealing with the spectra are in idlspec2d. Public tools for dealing with the imaging data are in photoop. Our tools for building the NYU-VAGC and the large-scale structure samples are in vagc. (This last link is a tarball of our vagc v1_10 code; it is very large because it contains a model of the local velocity field.)

These are all IDL products, and you need IDL to use them. But you are an astronomer, and probably need IDL anyway (sigh).

A special piece of code that has been useful to us is mangle, a set of tools for dealing with angular masks developed by Andrew Hamilton and Max Tegmark. This code is compiled into and distributed with idlutils.

Datasweeps of full SDSS catalog

One of the very useful tools that we use to build the NYU-VAGC is the set of "datasweeps" of the full SDSS catalog. These are compressed versions of the full catalog that contain only the decent detections. They occupy relatively little disk space (about 90G), which makes them convenient to handle. They do not have all of the photometric information in them, though, which is a drawback.

All of the data for DR6 is at:

http://sdss.physics.nyu.edu/datasweep/dr6/

There are two catalogs, split into stars and galaxies. For stars, the datasweeps include any star whose PSF extinction-corrected magnitude is brighter than the following limit in at least one band: u < 22.5, g < 22.5, r < 22.5, i < 22, z < 21.5. The catalog is broken down into runs and camcols, and is in files with the names:

calibObj-[run]-[camcol]-star.fits.gz

Similarly, for galaxies, the datasweeps include any galaxy whose model extinction-corrected magnitude is brighter than the following limit in at least one band: u < 21.0, g < 22.0, r < 22.0, i < 20.5, z < 20.1. The names of the files for each run and camcol are:

calibObj-[run]-[camcol]-gal.fits.gz
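
As an illustration of the naming scheme, the Python sketch below builds file names and URLs for one run. Several details are assumptions rather than facts taken from this page: that the files sit directly under the DR6 datasweep URL given above, and that run numbers are zero-padded to six digits (controlled by the hypothetical run_width parameter; set it to 0 for no padding). Check the directory listing before relying on either. The six camcols are the six camera columns of the SDSS imager.

```python
BASE_URL = "http://sdss.physics.nyu.edu/datasweep/dr6/"


def sweep_filename(run, camcol, kind, run_width=6):
    """Build a datasweep file name; kind is 'star' or 'gal'.

    run_width is an assumed zero-padding width for the run number.
    """
    if kind not in ("star", "gal"):
        raise ValueError("kind must be 'star' or 'gal'")
    run_text = str(run).zfill(run_width)
    return "calibObj-{}-{}-{}.fits.gz".format(run_text, camcol, kind)


def sweep_urls(run, kind, run_width=6):
    """URLs for all six camcols of a run, assuming a flat directory layout."""
    return [BASE_URL + sweep_filename(run, camcol, kind, run_width)
            for camcol in range(1, 7)]
```

The resulting URLs can be handed to a downloader such as wget or curl.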

The meanings of the columns of the FITS files conform to the naming conventions at the Princeton reductions site. Important things to remember are:

  1. If you only want one instance of each object, require that (RESOLVE_STATUS&256)>0 (as described on the Princeton site).
  2. Fluxes are not Galactic extinction corrected, and are in nanomaggies.
  3. Radii are generally given in PIXELS (each SDSS pixel is 0.396 arcsec)
  4. For stars, the 2MASS PSC magnitudes are included.
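
The points above can be sketched in Python as follows. The conversion uses the standard Pogson relation between nanomaggies and magnitudes, m = 22.5 - 2.5 log10(flux), which is not the asinh magnitude system used in the SDSS catalogs. The per-band extinction values are assumed to be supplied in magnitudes by the caller, and all function names are hypothetical.

```python
import math

PIXEL_SCALE_ARCSEC = 0.396  # arcsec per SDSS pixel, as stated above
PRIMARY_BIT = 256           # RESOLVE_STATUS bit quoted above

# Galaxy sweep limits quoted above (model magnitudes, extinction corrected).
GALAXY_LIMITS = {"u": 21.0, "g": 22.0, "r": 22.0, "i": 20.5, "z": 20.1}


def nanomaggies_to_mag(flux, extinction_mag=0.0):
    """Pogson magnitude from a flux in nanomaggies, minus an extinction in mag."""
    if flux <= 0:
        return float("inf")  # no positive flux: treat as infinitely faint
    return 22.5 - 2.5 * math.log10(flux) - extinction_mag


def is_primary(resolve_status):
    """True if the bit selecting one instance of each object is set."""
    return (resolve_status & PRIMARY_BIT) > 0


def pixels_to_arcsec(radius_pixels):
    """Convert a radius in SDSS pixels to arcsec."""
    return radius_pixels * PIXEL_SCALE_ARCSEC


def passes_limits(fluxes, extinctions, limits):
    """True if the corrected magnitude beats the limit in at least one band."""
    return any(nanomaggies_to_mag(fluxes[band], extinctions[band]) < limits[band]
               for band in limits)
```

For example, a flux of 100 nanomaggies corresponds to magnitude 17.5 before any extinction correction, and a radius of 10 pixels is 3.96 arcsec.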


Please contact us with comments or questions.