Issue 
A&A
Volume 616, August 2018
Gaia Data Release 2



Article Number  A2  
Number of page(s)  25  
Section  Astronomical instrumentation  
DOI  https://doi.org/10.1051/0004-6361/201832727
Published online  10 August 2018 
Gaia Data Release 2
The astrometric solution
^{1}
Lund Observatory, Department of Astronomy and Theoretical Physics, Lund University, Box 43,
22100
Lund,
Sweden
^{2}
ESA, European Space Astronomy Centre, Camino Bajo del Castillo s/n,
28691
Villanueva de la Cañada,
Spain
^{3}
HE Space Operations BV for ESA/ESAC, Camino Bajo del Castillo s/n,
28691
Villanueva de la Cañada,
Spain
^{4}
Lohrmann-Observatorium, Technische Universität Dresden,
Mommsenstrasse 13,
01062
Dresden,
Germany
^{5}
Astronomisches Rechen-Institut, Zentrum für Astronomie der Universität Heidelberg,
Mönchhofstraße 14,
69120
Heidelberg,
Germany
^{6}
Vitrociset Belgium for ESA/ESAC, Camino Bajo del Castillo s/n,
28691
Villanueva de la Cañada,
Spain
^{7}
Telespazio Vega UK Ltd for ESA/ESAC, Camino Bajo del Castillo s/n,
28691
Villanueva de la Cañada,
Spain
^{8}
Institut de Ciències del Cosmos, Universitat de Barcelona (ICCUB),
Martí Franquès 1,
08028
Barcelona,
Spain
^{9}
Institute for Astronomy, School of Physics and Astronomy, University of Edinburgh, Royal Observatory, Blackford Hill,
Edinburgh
EH9 3HJ,
UK
^{10}
Gaia Project Office for DPAC/ESA, Camino Bajo del Castillo s/n,
28691
Villanueva de la Cañada,
Spain
^{11}
Aurora Technology for ESA/ESAC, Camino Bajo del Castillo s/n,
28691
Villanueva de la Cañada,
Spain
^{12}
Institute of Astronomy, University of Cambridge,
Madingley Road,
Cambridge
CB3 0HA,
UK
^{13}
Istituto Nazionale di Astrofisica, Osservatorio Astrofisico di Torino,
Via Osservatorio 20,
Pino Torinese,
Torino
10025,
Italy
^{14}
Elecnor Deimos Space for ESA/ESAC, Camino Bajo del Castillo s/n,
28691
Villanueva de la Cañada,
Spain
^{15}
SYRTE, Observatoire de Paris, Université PSL, CNRS, Sorbonne Université, LNE,
61 avenue de l’Observatoire,
75014
Paris,
France
^{16}
GPA-Observatorio National/MCT,
Rua Gal. Jose Cristino 77,
CEP 20921-400
Rio de Janeiro,
Brazil
^{17}
Serco for ESA/ESAC, Camino Bajo del Castillo s/n,
28691
Villanueva de la Cañada,
Spain
^{18}
INAF, Osservatorio Astrofisico di Catania,
Catania,
Italy
^{19}
Astronomical Institute, Bern University,
Sidlerstrasse 5,
3012
Bern,
Switzerland
^{20}
EURIX SRL, Corso Vittorio Emanuele II, 61,
10128
Torino,
Italy
^{21}
Laboratoire d’Astrophysique de Bordeaux, Université de Bordeaux, CNRS, B18N, allée Geoffroy Saint-Hilaire,
33615
Pessac,
France
^{22}
University of Torino, Department of Computer Science,
Torino,
Italy
^{23}
ESA, European Space Research and Technology Centre,
Keplerlaan 1,
2200
AG,
Noordwijk,
The Netherlands
^{24}
University of Padova,
Via Marzolo 8,
Padova
35131,
Italy
^{25}
RHEA for ESA/ESAC, Camino Bajo del Castillo s/n,
28691
Villanueva de la Cañada,
Spain
^{26}
Université Côte d’Azur, Observatoire de la Côte d’Azur, CNRS, Géo Azur,
250 rue Albert Einstein, CS 10269,
06905
Sophia Antipolis Cedex,
France
^{27}
Max Planck Institute for Astronomy,
Königstuhl 17,
69117
Heidelberg,
Germany
^{28}
Observatoire Astronomique de l’Université de Genève, Sauverny,
Chemin des Maillettes 51,
1290
Versoix,
Switzerland
^{29}
Shanghai Astronomical Observatory, Chinese Academy of Sciences,
80 Nandan Rd,
200030
Shanghai,
PR China
^{30}
School of Astronomy and Space Science, University of Chinese Academy of Sciences,
Beijing
100049,
PR China
^{31}
Las Cumbres Observatory,
6740 Cortona Dr. 102,
Goleta,
CA 93117,
USA
^{32}
Astrophysics Research Institute, Liverpool John Moores University,
146 Brownlow Hill,
Liverpool L3 5RF,
UK
^{33}
ALTEC,
Corso Marche 79,
Torino
10146,
Italy
^{34}
Université Côte d’Azur, Observatoire de la Côte d’Azur, CNRS, Laboratoire Lagrange, Bd de l’Observatoire,
CS 34229,
06304 Nice Cedex 4,
France
^{35}
Università di Torino, Dipartimento di Fisica,
via P. Giuria 1,
10125,
Torino,
Italy
^{36}
ATG for ESA/ESAC, Camino Bajo del Castillo s/n,
28691
Villanueva de la Cañada,
Spain
^{37}
Sterrewacht Leiden, Leiden University,
PO Box 9513,
2300 RA,
Leiden,
The Netherlands
Received: 29 January 2018
Accepted: 6 March 2018
Context. Gaia Data Release 2 (Gaia DR2) contains results for 1693 million sources in the magnitude range 3 to 21 based on observations collected by the European Space Agency Gaia satellite during the first 22 months of its operational phase.
Aims. We describe the input data, models, and processing used for the astrometric content of Gaia DR2, and the validation of these results performed within the astrometry task.
Methods. Some 320 billion centroid positions from the preprocessed astrometric CCD observations were used to estimate the five astrometric parameters (positions, parallaxes, and proper motions) for 1332 million sources, and approximate positions at the reference epoch J2015.5 for an additional 361 million mostly faint sources. These data were calculated in two steps. First, the satellite attitude and the astrometric calibration parameters of the CCDs were obtained in an astrometric global iterative solution for 16 million selected sources, using about 1% of the input data. This primary solution was tied to the extragalactic International Celestial Reference System (ICRS) by means of quasars. The resulting attitude and calibration were then used to calculate the astrometric parameters of all the sources. Special validation solutions were used to characterise the random and systematic errors in parallax and proper motion.
Results. For the sources with five-parameter astrometric solutions, the median uncertainty in parallax and position at the reference epoch J2015.5 is about 0.04 mas for bright (G < 14 mag) sources, 0.1 mas at G = 17 mag, and 0.7 mas at G = 20 mag. In the proper motion components the corresponding uncertainties are 0.05, 0.2, and 1.2 mas yr^{−1}, respectively. The optical reference frame defined by Gaia DR2 is aligned with ICRS and is non-rotating with respect to the quasars to within 0.15 mas yr^{−1}. From the quasars and validation solutions we estimate that systematics in the parallaxes depending on position, magnitude, and colour are generally below 0.1 mas, but the parallaxes are on the whole too small by about 0.03 mas. Significant spatial correlations of up to 0.04 mas in parallax and 0.07 mas yr^{−1} in proper motion are seen on small (< 1 deg) and intermediate (20 deg) angular scales. Important statistics and information for the users of the Gaia DR2 astrometry are given in the appendices.
Key words: astrometry / parallaxes / proper motions / methods: data analysis / space vehicles: instruments / reference systems
© ESO 2018
1 Introduction
Gaia DR2 (Gaia Collaboration 2018a), the second release of data from the European Space Agency mission Gaia (Gaia Collaboration 2016b), contains provisional results based on observations collected during the first 22 months since the start of nominal operations in July 2014. The astrometric data in Gaia DR2 include the five astrometric parameters (position, parallax, and proper motion) for 1332 million sources, and the approximate positions at epoch J2015.5 for an additional 361 million mostly faint sources with too few observations for a reliable five-parameter solution. The limiting magnitude is G ≃ 21.0. The bright limit is G ≃ 3, although stars with G ≲ 6 generally have inferior astrometry due to calibration issues. The data are publicly available in the online Gaia Archive^{1}.
This paper gives an overview of the astrometric processing for Gaia DR2 and describes the main characteristics of the results. Further details are provided in the online documentation of the Gaia Archive and in specialised papers. In contrast to the Tycho-Gaia astrometric solution (TGAS; Lindegren et al. 2016) in Gaia DR1 (Gaia Collaboration 2016a), the present solution does not incorporate any astrometric information from HIPPARCOS and Tycho-2, and the results are therefore independent of these catalogues. Similarly to Gaia DR1, all sources are treated as single stars and thus representable by the five astrometric parameters. For unresolved binaries (separation ≲ 100 mas), the results thus refer to the photocentre, while for resolved binaries the results may refer to either component and are sometimes spurious due to confusion of the components. For a very small number of nearby sources, perspective effects due to their radial motions were taken into account.
The input data for the astrometric solutions are summarised in Sect. 2. A central part of the processing carried out by the Gaia Data Processing and Analysis Consortium (DPAC; Gaia Collaboration 2016b) is the astrometric global iterative solution (AGIS) described in Lindegren et al. (2012, hereafter the AGIS paper), and the present results were largely computed using the models and algorithms described in that paper. However, a few major additions have been made since 2012, and they are outlined in Sect. 3. Section 4 describes the main steps of the solutions. The validation of the results carried out by the astrometry team of DPAC primarily aimed at estimating the level of systematic errors; this is described in Sect. 5, with the main conclusions in Sect. 6. Three appendices give statistics and other information of potential interest to users of the Gaia DR2 astrometry.
2 Data used
The main input to the astrometric solutions are one- or two-dimensional measurements of the locations of point-source images on Gaia’s CCD detectors, derived by the image parameter determination (Sect. 2.2) in the preprocessing of the raw Gaia data (Fabricius et al. 2016). The CCD measurements must be assigned to specific sources, so that all the measurements of a given source can be considered together in the astrometric solution. This is achieved by a dedicated cross-matching procedure following the same overall three-step scheme as for Gaia DR1. First all sources close to a detection – the candidate matches – are found. This is done for the full set of observations, using updated calibrations and an extended attitude covering also time intervals that may later be excluded. Next, the detections are divided into isolated groups consisting of the smallest possible sets of detections with candidate matches to the same sources, such that a given candidate source only appears in one group. Finally, each group is resolved into clusters of detections and each cluster assigned to one source. What is done differently from Gaia DR1 is the way the clusters are formed. For Gaia DR1, this involved a simple nearest-neighbour algorithm, applied to one detection at a time, without a global view of the group. For Gaia DR2, a more elaborate clustering algorithm was used, giving better results in dense areas and performing much better for sources with high proper motions as it includes the detection of linear motion. The overall cross-match scheme is described in Castañeda et al. (in prep.). For Gaia DR2, about 52 billion detections were processed, but 11 billion were considered spurious and therefore did not take part in the cross-matching. The remaining 41 billion transits were matched to 2583 million sources, of which a significant number could still be spurious. Even among the clearly non-spurious sources, many had too few or too poor observations to make it to the release, which therefore has a total of 1693 million sources.
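The second step of the cross-match, the division into isolated groups, amounts to finding the connected components of a graph that links each detection to its candidate sources. The following is a minimal sketch of that idea only, not the DPAC implementation; the input layout (a dictionary from detection identifier to a set of candidate source identifiers) and all names are assumptions made for illustration.

```python
from collections import defaultdict


def isolated_groups(candidates):
    """Split detections into isolated groups (connected components).

    `candidates` maps a detection id to the set of candidate source ids
    found near it (hypothetical input layout). Two detections end up in
    the same group when they are linked, directly or through a chain,
    by shared candidate sources.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Link every detection to each of its candidate sources.
    for det, sources in candidates.items():
        find(("det", det))
        for src in sources:
            union(("det", det), ("src", src))

    # Collect detections and sources under their common root.
    groups = defaultdict(lambda: {"detections": set(), "sources": set()})
    for kind, ident in list(parent):
        root = find((kind, ident))
        key = "detections" if kind == "det" else "sources"
        groups[root][key].add(ident)
    return list(groups.values())


# Toy example: d1-d3 are chained through s1 and s2, d4 stands alone.
example = {
    "d1": {"s1"},
    "d2": {"s1", "s2"},
    "d3": {"s2"},
    "d4": {"s3"},
}
groups = isolated_groups(example)
```

By construction each candidate source appears in exactly one group, so the groups can be resolved into clusters independently of one another.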
A second important input to the astrometric solution for Gaia DR2 is the colour information, available for most of the sources thanks to the early photometric processing of data from the blue and red photometers (BP and RP; van Leeuwen et al. 2017; Riello et al. 2018; Evans et al. 2018). This processing used astrometric data (source and attitude parameters) taken from a provisional astrometric solution (Sect. 4.1).
Additional input data are obtained from the basic angle monitor (BAM; Sect. 2.4) and the orbit reconstruction and time synchronisation data provided by the Mission Operations Centre (Sect. 5.3 in Gaia Collaboration 2016b).
2.1 Time coverage
Gaia DR2 is based on data collected from the start of the nominal observations on 2014 July 25 (10:30 UTC) until 2016 May 23 (11:35 UTC), or 668 days. However, the astrometric solution for this release did not use the observations during the first month after commissioning, when a special scanning mode (the ecliptic pole scanning law, EPSL) was employed. The data for the astrometry therefore start on 2014 Aug. 22 (21:00 UTC) and cover 640 days or 1.75 yr, with some interruptions mentioned below.
Hereafter we use the onboard mission timeline (OBMT) to label onboard events; it is expressed as the number of nominal revolutions of exactly 21 600 s (6 h) onboard time from an arbitrary origin. The approximate relation between OBMT (in revolutions) and barycentric coordinate time (TCB, in Julian years) at Gaia is
TCB ≃ J2015.0 + (OBMT − 1717.6256 rev) ∕ (1461 rev yr^{−1}).  (1)
The nominal observations start at OBMT 1078.38 rev. The astrometric solution used data in the interval OBMT 1192.13–3750.56 rev, with major gaps at OBMT 1316.49–1389.11 rev and 2324.90–2401.56 rev due to mirror decontamination events and the subsequent recovery of thermal equilibrium. Planned maintenance operations (station-keeping manoeuvres, telescope refocusing, etc.), micrometeoroid hits, and other events caused additional gaps that rarely exceeded a few hours.
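As a quick check on the OBMT intervals quoted above, the conversion between OBMT and TCB can be sketched as follows. The sketch assumes the linear relation TCB ≃ J2015.0 + (OBMT − 1717.6256 rev)∕(1461 rev yr^{−1}); the factor 1461 follows from four 6 h revolutions per day over a Julian year of 365.25 days, while the zero point of 1717.6256 rev is taken as an assumption here. Function names are hypothetical.

```python
REV_PER_JULIAN_YEAR = 1461.0  # 4 rev/day * 365.25 days
OBMT_AT_J2015 = 1717.6256     # rev; assumed zero point of the relation


def obmt_to_tcb_year(obmt_rev):
    """Approximate TCB (Julian year) for a given OBMT in revolutions."""
    return 2015.0 + (obmt_rev - OBMT_AT_J2015) / REV_PER_JULIAN_YEAR


def tcb_year_to_obmt(year):
    """Inverse of obmt_to_tcb_year."""
    return OBMT_AT_J2015 + (year - 2015.0) * REV_PER_JULIAN_YEAR


# Interval used in the astrometric solution.
start = obmt_to_tcb_year(1192.13)
end = obmt_to_tcb_year(3750.56)
span_years = end - start
```

Under these assumptions the interval OBMT 1192.13–3750.56 rev spans about 1.75 yr, in agreement with the time coverage given in the text.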
The reference epoch used for the astrometry in Gaia DR2 is J2015.5 (see Sect. 3.1), approximately halfway through the observation period used in the solution. This reference epoch, chosen to minimise correlations between the positions and proper motions, is 0.5 Julian year later than the reference epoch for Gaia DR1; this difference must be taken into account when comparing positional data from the two releases.
2.2 Image parameters
Image parameters are obtained by fitting a model profile to the photon counts in the observation window centred on the source in the CCD pixel stream. The model profile is a point spread function (PSF) for a two-dimensional window and a line spread function (LSF) in the more common case of a one-dimensional window (for details on the CCD operations, see Sect. 3.3.2 in Gaia Collaboration 2016b). The main image parameters are the estimated one- or two-dimensional location of the image centroid (defined by the origin of the fitted PSF or LSF) and the integrated flux of the image. The image parameter determination for Gaia DR2 is essentially the same as for Gaia DR1 (see Sect. 5 in Fabricius et al. 2016). In particular, the fitted PSF and LSF were assumed to be independent of time and of the colour and magnitude of the source, which means that centroid shifts depending on time, colour, and magnitude need to be modelled in the astrometric solution (Sect. 3.3). For Gaia DR2, all image parameters have been re-determined in a uniform way, recovering observations that for various reasons did not enter Gaia DR1. The sky background has been recalibrated, and we now have a far more detailed calibration of the electronic bias of the CCDs (Hambly et al. 2018). Important for sources brighter than G ≃ 12 is a more reliable identification of saturated samples, which are not used in the PSF fitting.
All observations provide an along-scan (AL) measurement, consisting of the precise time at which the image centroid passes a fiducial line on the CCD. The two-dimensional windows, mainly used for bright sources (G ≲ 13), provide in addition a less precise across-scan (AC) measurement from the pixel column of the image centroid. A single transit over the field of view thus generates ten AL measurements and one or ten AC measurements, although some of them may be discarded in the subsequent processing. The first observation in a transit is always made with the sky mapper (SM); it is two-dimensional, but less precise in both AL and AC than the subsequent observations in the astrometric field (AF) because of the special readout mode of the SM detectors. Only AF observations are used in the astrometric solutions. All measurements come with a formal uncertainty estimated by the image parameter determination. Based on the photon-noise statistics, the median formal AL uncertainty is about 0.06 mas per CCD observation in the AF for G < 12 mag, 0.20 mas at G = 15 mag, and 3.8 mas at G = 20 mag (cf. Fig. 9).
2.3 Colour information
The chromaticity calibration (Sect. 3.3) requires that the effective wavenumber ν_{eff} = ⟨λ^{−1}⟩ is known for all primary sources. For Gaia DR2, this quantity was computed from the mean integrated G_{BP} and G_{RP} magnitudes provided by the photometry pipeline (Riello et al. 2018), using the formula
ν_{eff} = 2.0 − (1.8∕π) arctan(0.331 + 0.572C − 0.014C^{2} + 0.045C^{3}) μm^{−1},  (2)
where C = G_{BP} − G_{RP} (Fig. 1). The arctan transformation constrains ν_{eff} to the interval [1.1, 2.9] μm^{−1} (roughly corresponding to the passband of G, or ≃340–910 nm) as a safeguard against spurious extreme values of C. The polynomial coefficients are based on pre-launch calibrations of the photometric bands and standard stellar flux libraries. In future releases, more accurate values of ν_{eff} may be computed directly from the calibrated BP and RP spectra.
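The bounding effect of the arctan transformation can be illustrated with a short sketch. It assumes the form ν_{eff} = 2.0 − (1.8∕π) arctan(P(C)) μm^{−1}, which yields the interval [1.1, 2.9] μm^{−1} for any polynomial P; the cubic coefficients used below are assumed values and should be checked against Eq. (2) before any quantitative use.

```python
import math

# Coefficients of the cubic in C = G_BP - G_RP, lowest order first.
# Assumed values for illustration; verify against Eq. (2).
COEFFS = (0.331, 0.572, -0.014, 0.045)


def effective_wavenumber(colour):
    """Effective wavenumber in um^-1 for a colour index C (mag)."""
    poly = sum(c * colour**i for i, c in enumerate(COEFFS))
    # arctan maps the polynomial onto (-pi/2, pi/2), so the result
    # stays within 2.0 +/- 0.9 um^-1 whatever the input colour.
    return 2.0 - (1.8 / math.pi) * math.atan(poly)


# Typical and deliberately extreme colours.
samples = {c: effective_wavenumber(c) for c in (-50.0, 0.0, 1.0, 3.0, 50.0)}
```

Even for the absurd colours of ±50 mag the returned value stays inside the interval, which is the safeguard described in the text.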
Fig. 1 Effective wavenumber as a function of colour index. The curve is the analytical relation in Eq. (2). We also show the distribution of G_{BP} − G_{RP} for a random selection of bright (G < 12 mag, bluish histogram with two peaks) and faint (G > 18 mag, reddish histogram) sources.
2.4 BAM data
The basic angle monitor (BAM) is an interferometric device measuring short-term (≲1 day) variations of the basic angle at μas precision (Mora et al. 2016). Similarly to what was done for Gaia DR1 (Appendix A.2 in Lindegren et al. 2016), the BAM data are here used to correct the astrometric measurements for the rapid variations (in particular the ~1 mas amplitude 6 h oscillations) not covered by the astrometric calibration model. However, the corrections are considerably more detailed for Gaia DR2, taking advantage of several improvements in the processing and analysis of the BAM data: cosmic-ray filtering at pixel level of the raw BAM data; use of cross-correlation to determine very precise relative fringe phases; improved modelling of discontinuities and other variations that cannot be represented by the simple harmonic model used for Gaia DR1 (cf. Figs. A.2 and A.3 in Lindegren et al. 2016). Some 370 basic-angle jumps with a median amplitude of 45 μas are corrected in this way. The jumps appear seemingly at random times, but at a much increased rate in the weeks following a decontamination event. The jumps, plus the smoothed BAM data between jumps, provided the basic-angle corrector for Gaia DR2 in the form of a spline function of time.
The spin-related distortion model (Sect. 3.4) provides certain global corrections to the BAM data, derived from the astrometric observations, but cannot replace the BAM data, which contain a host of more detailed information such as the jumps.
3 Models
3.1 Source model
The Gaia data processing is based on a consistent theory of relativistic astronomical reference systems (Soffel et al. 2003). Relevant components of the model are gathered in the Gaia relativity model (GREM; Klioner 2003, 2004). The primary coordinate system is the Barycentric Celestial Reference System (BCRS) with origin at the solar system barycentre and axes aligned with the International Celestial Reference System (ICRS). The time-like coordinate of the BCRS is the barycentric coordinate time (TCB).
The astrometric solutions described in this paper always assume that the observed centre of the source moves with uniform space motion relative to the solar system barycentre. (Nonlinear motions caused by binarity and other perturbations require special solutions that will be included in future Gaia releases.) The relevant source model is described in Sect. 3.2 of the AGIS paper and is not repeated here. It depends on six kinematic parameters per source, that is, the standard five astrometric parameters (α, δ, ϖ, μ_{α*}, and μ_{δ}), and the radial velocity v_{r}. The astrometric parameters in Gaia DR2 refer to the reference epoch J2015.5 = JD 2457 206.375 (TCB) = 2015 July 2, 21:00:00 (TCB). The positions and proper motions refer to the ICRS thanks to the special frame alignment procedure (Sect. 5.1).
The source model allows taking into account perspective acceleration through terms depending on the radial velocity v_{r}. The accumulated effect over a time interval T is Δ = v_{r}μϖT^{2}∕A_{u}, where μ = (μ_{α*}^{2} + μ_{δ}^{2})^{1∕2} is the total proper motion and A_{u} is the astronomical unit. This is negligible except for some very nearby high-velocity stars, and for nearly all sources we ignore the effect by setting v_{r} = 0 in the astrometric processing. Only for 53 nearby HIPPARCOS sources was it taken into account by assuming non-zero values of v_{r} taken from the literature (SIMBAD; Wenger et al. 2000). These sources were selected as having a predicted Δ > 0.023 mas for T = 1.75 yr, calculated from HIPPARCOS astrometry (van Leeuwen 2007). The somewhat arbitrary limit 0.023 mas corresponds to an RMS modelling error below 0.002 mas, which is truly insignificant for this release. The top ten cases are listed in Table 1. In future releases, perspective acceleration will be taken into account whenever possible, using radial-velocity data from Gaia’s onboard spectrometer (RVS; Sartoretti et al. 2018). We note that 34 of the 53 sources have radial velocities from the RVS in this release, with a median absolute deviation of 0.6 km s^{−1} from the values used here. The absolute difference exceeds 5 km s^{−1} in only four cases, the most extreme being HIP 47425 = Gaia DR2 5425628298649940608 with v_{r} = +142 ± 21 km s^{−1} from SIMBAD, based on Rodgers & Eggen (1974), and v_{r} = +17.8 ± 0.2 km s^{−1} in Gaia DR2. In none of the cases will the error in v_{r} cause an astrometric effect exceeding 0.02 mas in the present reduction.
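The size of the perspective effect is easy to evaluate once the units are made explicit. The sketch below assumes Δ = v_{r}μϖT^{2}∕A_{u} with μ the total proper motion, v_{r} in km s^{−1}, μ in mas yr^{−1}, ϖ in mas, T in yr, and A_{u} = 4.740470446 km yr s^{−1}; the function name and both example stars are hypothetical and are not entries of Table 1.

```python
import math

AU_KM_YR_PER_S = 4.740470446  # astronomical unit expressed in km yr s^-1
MAS_TO_RAD = math.pi / (180.0 * 3600.0 * 1000.0)


def perspective_shift_mas(v_r_kms, pm_ra_mas_yr, pm_dec_mas_yr,
                          parallax_mas, t_yr=1.75):
    """Accumulated perspective effect (mas) over a time span t_yr."""
    mu = math.hypot(pm_ra_mas_yr, pm_dec_mas_yr)  # total proper motion
    # Radial proper motion v_r * parallax / A_u, in yr^-1 (parallax in rad).
    mu_r = v_r_kms * parallax_mas * MAS_TO_RAD / AU_KM_YR_PER_S
    return mu_r * mu * t_yr**2


# Hypothetical nearby, fast-moving star versus a typical distant star.
nearby = perspective_shift_mas(-110.0, -800.0, 10330.0, 548.0)
distant = perspective_shift_mas(50.0, 3.0, -4.0, 1.0)
selected = [abs(d) > 0.023 for d in (nearby, distant)]
```

For the hypothetical nearby star the effect is of the order of 2 mas over 1.75 yr, far above the 0.023 mas selection limit, whereas for the distant star it is smaller than the limit by several orders of magnitude.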
The final secondary solution (Sect. 4.2) requires knowledge of ν_{eff} for all sources in order to take the chromaticity into account. For most but not all sources, this is known from the photometric processing as described in Sect. 2.3. Given the calibrated chromaticity, it is also possible, however, to obtain an astrometric estimate of ν_{eff} for every source by formally introducing it as an additional (sixth) astrometric source parameter. The resulting estimate of ν_{eff}, called pseudo-colour, is much less precise than the ν_{eff} calculated from G_{BP} − G_{RP} using Eq. (2), but has the advantage that it can be obtained for every source allowing a five-parameter solution. Moreover, it is not affected by the BP/RP flux excess issue (Evans et al. 2018), which tends to make faint sources in crowded areas too blue as measured by G_{BP} − G_{RP}.
To ensure the most uniform astrometric treatment of sources, the pseudo-colour was consistently used as a proxy for ν_{eff} in all cases where Gaia DR2 provides a five-parameter solution, that is, even when photometric colours are available. Because it is so important for the astrometry, the pseudo-colour is given in the Gaia Archive as astrometric_pseudo_colour. Normally, it does not provide an astrophysically useful estimate of the colour because its precision is much lower than that of the photometric data.
Our treatment of the pseudo-colour as a sixth source parameter should not be confused with the use of the radial proper motion μ_{r} = v_{r}ϖ∕A_{u} in the kinematic source model (e.g. Eq. (2) of Lindegren et al. 2016). This quantity, sometimes referred to as the “sixth astrometric parameter”, is used internally in AGIS to take into account the perspective acceleration, but is never explicitly estimated as an astrometric parameter.
Table 1 Ten HIPPARCOS sources in Gaia DR2 with the largest predicted perspective acceleration.
3.2 Attitude model
The attitude specifies the orientation of the optical instrument in ICRS as a function of time. Mathematically, it is given by the unit quaternion q(t). The attitude model described in Sect. 3.3 of the AGIS paper represents the time-dependent components of q(t) as cubic splines. For Gaia DR1, a knot interval of about 30 s was used in the splines, but it was noted that a much shorter knot interval (i.e. more flexible splines) would actually be needed to cope with the considerable attitude irregularities on shorter timescales, including a large number of “micro-events” such as the very frequent micro-clanks (see Appendices C.4 and E.4 in Lindegren et al. 2016) and less frequent micrometeoroid hits. Decreasing the knot interval of the splines is not a good way forward, however, as it would weaken the solution by the increased number of attitude parameters. Moreover, this cannot adequately represent the CCD-integrated effects of the micro-events, which depend also on the gate (g) used for an observation. For Gaia DR2 the attitude model includes a new layer, known as the corrective attitude q_{c}(t, g), such that the (gate-dependent) effective attitude becomes
q_{e}(t, g) = q_{p}(t) q_{c}(t, g).  (3)
Here q_{p}(t) is the primary attitude: this uses the same spline representation as the old attitude model, and its parameters are estimated in the primary solution in a similar way as before, the main difference being that the field angle residuals (Eqs. (25)–(26) in the AGIS paper) are now computed using the effective attitude q_{e}(t, g) for the relevant gate. The effective attitude represents the mean pointing of the instrument during the CCD integration interval, which is different depending on g.
In Eq. (3) the corrective attitude q_{c} represents a small time- and gate-dependent rotation that takes care of attitude irregularities that are too fast for the spline model. It is calculated in the AGIS preprocessor and remains fixed during subsequent astrometric solutions. For details about its calculation, we refer to the Gaia DR2 online documentation. Briefly, the procedure includes the following steps:
1. Given two successive CCD observations in the astrometric field (AF) of the same source, with observation times t_{k} and t_{k+1}, an estimate of the inertial angular rate along the nominal spin axis z (in the scanning reference system, SRS) is obtained as
\bar{ω}_{z} = −(η_{k+1} − η_{k})∕(t_{k+1} − t_{k}) + (ω_{x} cos φ + ω_{y} sin φ) tan ζ,  (4)
where η_{k} and η_{k+1} are the AL field angles calculated from a preliminary geometrical model of the instrument. The minus sign on the first term is due to the apparent motions of images in the direction of negative η (see Fig. 3 in the AGIS paper). The second term takes into account the (slow) rotation of the field that is due to the across-scan (AC) angular rates ω_{x} and ω_{y}. φ and ζ are the AL and AC instrument angles of the source (Fig. 2 in the AGIS paper) at a time midway between the two observations. (Only approximate AC rates are needed here, as tan ζ < 0.01.) The bar in \bar{ω}_{z} signifies that it is a mean value of the instantaneous rate, averaged over both the CCD integration time (≃4.42 s for ungated observations) and the time between successive CCD observations (≃4.86 s).
2. Applying Eq. (4) to ungated AF observations for all sources in the magnitude range 12 to 16 yields on average several hundred measurements per second of the AL angular rate. The rate measurements are binned by time, using a bin size of 0.2 s, and the median value calculated in each bin. This provides an accurate time-series representation of \bar{ω}_{z}(t) with sufficient time resolution for the next step.
3. Micro-clanks are small quasi-instantaneous changes in the physical orientation of the instrument axes, which create trapezoidal profiles in \bar{ω}_{z}(t) with a constant and known shape; for an example, see the bottom panels of Fig. D.4 in Lindegren et al. (2016). In this step, micro-clanks are detected, and their times and amplitudes estimated, by locally fitting a smooth background signal plus a scaled profile to the time-series representation of \bar{ω}_{z}(t). The fitted profile is subtracted and the procedure repeated until no more significant clank is detected. The end result is a list of detected clanks, with their times and amplitudes, together with an estimate of the rate without clanks.
4. Integrating the clank-free rate as a function of time and fitting a cubic spline with uniform 5 s knot separation provides an estimate of the attitude irregularities at frequencies below ≃0.1 Hz, including the effects of minor micrometeoroid hits. Finally, the corrective attitude is obtained by adding, depending on g, the analytically integrated effect of the detected clanks.
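Steps 1 and 2 can be sketched in a few lines. The sketch assumes that the rate estimate has the form −Δη∕Δt plus a field-rotation term (ω_{x} cos φ + ω_{y} sin φ) tan ζ, as the description of its two terms suggests; the function names, the synthetic data, and the 60 arcsec s^{−1} rate used in the demonstration are illustrative assumptions, and the clank fitting and spline steps are not covered.

```python
import math
import random
import statistics
from collections import defaultdict


def al_rate(eta_k, eta_k1, t_k, t_k1,
            omega_x=0.0, omega_y=0.0, phi=0.0, zeta=0.0):
    """Mean AL rate from two successive AF observations of one source.

    eta and omega share one angular unit; phi and zeta are in radians.
    """
    field_rotation = (omega_x * math.cos(phi)
                      + omega_y * math.sin(phi)) * math.tan(zeta)
    return -(eta_k1 - eta_k) / (t_k1 - t_k) + field_rotation


def binned_median(times, rates, bin_size=0.2):
    """Median rate per time bin, as (bin centre, median) pairs."""
    t0 = min(times)
    bins = defaultdict(list)
    for t, r in zip(times, rates):
        bins[int((t - t0) // bin_size)].append(r)
    return [(t0 + (b + 0.5) * bin_size, statistics.median(v))
            for b, v in sorted(bins.items())]


# Synthetic demo: steady rate, Gaussian noise, and a few gross outliers.
random.seed(1)
times = [i * 0.001 for i in range(2000)]
rates = [60.0 + random.gauss(0.0, 0.01) for _ in times]
for i in range(0, len(rates), 50):
    rates[i] += 5.0  # outliers that a plain mean would not tolerate
series = binned_median(times, rates)
```

The median is used rather than the mean because it is insensitive to the small fraction of discrepant rate measurements, so each 0.2 s bin still recovers the underlying rate.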
Thanks to the use of a pre-computed corrective attitude, it is possible to use a rather long (30 s) knot interval in the primary astrometric solution without causing a degradation in the accuracy. For Gaia DR2, this procedure was only applied to the AL attitude component (z axis). In the future, the AC components will be similarly corrected for micro-clanks and other medium-frequency irregularities.
Micrometeoroid hits cause rate irregularities that are distinctly different from the clank profiles: they are less abrupt, of much longer duration, and have somewhat variable profiles depending on the response of the onboard attitude control system. Nevertheless, they could in principle be detected and handled in a similar way as the clanks. Currently, however, only major hits are automatically detected and treated simply by inserting data gaps around them. Such hits, detected from attitude rate disturbances exceeding a few mas s^{−1}, occurred at a fairly constant rate of about five hits per month. Minor hits remain undetected, but are effectively corrected by the integrated rate that is part of the corrective attitude.
3.3 Calibration model
The astrometric calibration model specifies the location of the fiducial “observation line” for a particular combination of field of view (f), CCD (n), and gate (g) indices, as a function of the AC pixel coordinate μ, time t, and other relevant quantities (Sect. 3.4 in the AGIS paper). Formally, it defines the functions η_{fng}(μ, t, …), ζ_{fng}(μ, t, …) in terms of a discrete set of calibration parameters, where (η, ζ) are the field angles along the observation line. In the generic calibration model, these functions are written as sums of a number of “effects”, which in turn are linear combinations of basis functions with the calibration parameters as coefficients. Table 2 gives an overview of the effects and number of calibration parameters used in the final primary solution for Gaia DR2. All calibration effects are independently modelled for the 2 × 62 = 124 combinations of the field and CCD indices. The calibration model for the sky mappers (SM) is similar, but not described here as the SM observations are not used in the astrometric solutions.
Although Gaia is designed to be extremely stable on short timescales, inevitable changes in the optics and mechanical support structure require a time-dependent calibration. Occasional spontaneous, minute changes in the instrument geometry, and major operational events such as mirror decontaminations, telescope refocusing, unplanned data gaps and resets, make it necessary to have breakpoints (discontinuities) at specific times. To accommodate both gradual and sudden changes, the generic calibration model allows the use of several time axes, with different granularities, such that an independent subset of calibration parameters is estimated for each granule. The current model uses three time axes with 243, 14, and 10 granules spanning the length of the data. The first one, having the shortest granules of typically 3 days, is used for the most rapidly changing effects. The other two are used for effects that are either intrinsically less variable (e.g. representing the internal structure of the CCDs) or less critical for the solution (e.g. the AC calibration). The third axis has granules of exactly 63 days duration, tuned to the scanning law in order to minimise cross-talk between spin-related calibration effects and the celestial reference frame.
The current calibration model differs in many details from the one used for Gaia DR1 (Appendix A.1 in Lindegren et al. 2016); in particular, it includes colour- and magnitude-dependent terms needed to account for centroid shifts that are not yet calibrated in the preprocessing of the raw data.
The AL calibration model is the sum of the five different effects listed in the upper part of Table 2, giving a total of 335 544 AL parameters. As explained in Appendix A.1 of Lindegren et al. (2016), the variation with across-scan coordinate μ within a CCD, and with time t within a granule, is modelled as a linear combination of basis functions
$$K_{lm}(\tilde{\mu}, \tilde{t}) = \tilde{P}_{l}(\tilde{\mu})\,\tilde{P}_{m}(\tilde{t}), \tag{5}$$
where $\tilde{P}_{l}(x)$, $\tilde{P}_{m}(x)$ are the shifted Legendre polynomials^{2} of degree l and m, orthogonal on 0 ≤ x ≤ 1 for l ≠ m, $\tilde{\mu} = (\mu - \mu_{\min})/(\mu_{\max} - \mu_{\min})$ is the normalised AC pixel coordinate (with μ_{min} = 13.5 and μ_{max} = 1979.5), and $\tilde{t} = (t - t_{j})/(t_{j+1} - t_{j})$ the normalised time within granule j, t ∈ [t_{j}, t_{j+1}). The third and fourth columns in Table 2 list the combination of indices l and m used for a particular effect, and the number of basis functions K_{lm} used for each combination of jfn, and their orders lm. For example, effect 1 is a linear combination of K_{00}, K_{10}, K_{20}, and K_{01} for each combination jfn. Similarly, effect 2 is a linear combination of K_{00} and K_{10} for each combination jfngb.
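As an illustration of the basis functions, the sketch below evaluates shifted Legendre polynomials on 0 ≤ x ≤ 1 and combines them for a given (l, m). It assumes that K_{lm} is the product of one polynomial in the normalised AC coordinate and one in the normalised time, and that the normalisation maps [μ_{min}, μ_{max}] and [t_{j}, t_{j+1}) linearly onto [0, 1]. The function names are hypothetical and not taken from the AGIS software.

```python
import numpy as np
from numpy.polynomial import legendre

MU_MIN, MU_MAX = 13.5, 1979.5  # AC pixel limits quoted in the text

def shifted_legendre(degree, x):
    """Shifted Legendre polynomial of the given degree, for 0 <= x <= 1."""
    coeffs = np.zeros(degree + 1)
    coeffs[degree] = 1.0
    # The ordinary Legendre polynomial is defined on [-1, 1]; map x -> 2x - 1.
    return legendre.legval(2.0 * np.asarray(x, dtype=float) - 1.0, coeffs)

def basis_K(l, m, mu, t, t_start, t_end):
    """Assumed product form of K_lm for AC pixel mu and time t in [t_start, t_end)."""
    mu_norm = (mu - MU_MIN) / (MU_MAX - MU_MIN)
    t_norm = (t - t_start) / (t_end - t_start)
    return shifted_legendre(l, mu_norm) * shifted_legendre(m, t_norm)
```

For example, `basis_K(1, 0, mu, t, t_start, t_end)` varies linearly across the CCD and is constant in time, which is the kind of term used in effect 2.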
This calibration model does not include any effects that vary on a very short spatial scale, for instance, from one pixel column to the next. Such small-scale effects do exist (see Fig. 10), and will be included in future calibrations. In the present astrometric solutions, they are treated as random noise on the individual CCD observations.
In principle, the image parameter determination (Sect. 2.2) should result in centroid positions that are independent of window class^{3}, colour, and magnitude. For the current solution, this was not the case, and these effects were instead included in the astrometric calibration model described here. Effect 3 describes the displacement of each window class (w) for a source of reference colour (ν_{eff} = 1.6) and reference magnitude (G = 13), while effects 4 and 5 describe the dependence on colour and magnitude by means of additional terms proportional to ν_{eff} –1.6 and G–13, respectively.
Combining all five effects, the complete AL calibration model is (6)
where is the nominal observation line for CCD n and gate g, and Δ η are the calibration parameters. For brevity, the arguments of K_{lm} (different in each term) are suppressed and Einstein’s summation convention is used for the repeated indices lm. Indices j and b are implicit functions of t and μ, respectively, with j depending on the granularity of the time axis and b depending on the “stitch block” structure imprinted on the pixel geometry by the CCD manufacturing process (cf. Fig. 10).
The AC calibration model is similarly a sum of the five effects given in the lower part of Table 2, giving a total of 57 288 AC parameters. The expression for ζ_{fngw}(μ, t, ν_{eff}, G) is analogous to (6), with ζ replacing η everywhere, except that there is no dependence on the stitch block index b. The coarse time granularity is used for all AC effects.
Certain constraints among the calibration parameters are needed to avoid degeneracies in the astrometric solution. For Gaia DR2, only the basic constraints defining the origin of η and ζ (Eqs. (16)–(18) in the AGIS paper) were used. It is known that the calibration model has additional degeneracies, corresponding to missing constraints; these are handled internally by the solution algorithm (cf. Appendix C.3 in the AGIS paper) and should not affect the astrometric parameters.
Fig. 2 Relation between the number of visibility periods and field-of-view transits (matched observations) per source used in the secondary astrometric solutions. A small random number was added to the integer number of visibility periods to widen the vertical bars. The white horizontal line through each bar shows the location of the median. The diagram was constructed for a random subset of about 2.5 million sources. 
Fig. 3 Dependence of the faint reference frame on colour. The diagram shows the components of spin ω_{X}, ω_{Y}, and ω_{Z} around the ICRS axes, as estimated for faint (G ≃ 15–21) quasars subdivided by effective wavenumber. The components in X and Z were shifted by ± 0.2 mas yr^{−1} for better visibility. Error bars are at 68% confidence intervals for the estimated spin. 
Fig. 4 Dependence of the reference frame on magnitude. The diagram shows the spin components as in Fig. 3, but subdivided by magnitude. The points at the faint end (G ≳ 15) are estimated from the proper motions of quasars. At the bright end (G ≲ 13), the spin is estimated from the differences in stellar proper motions between Gaia DR2 and the HIPPARCOS subset of TGAS in Gaia DR1. 
Fig. 5 Density map of the full quasar sample (union of AllWISE AGNs and VLBI sources) at a resolution of 1.8 × 1.8 deg^{2}. The scattered points in the Galactic band are VLBI sources. This and following full-sky maps use a Hammer–Aitoff projection in equatorial (ICRS) coordinates with α = δ = 0 at the centre, north up, and α increasing from right to left. 
Fig. 6 Parallax distribution for 556 869 sources identified as quasars. Outer (blue) curve: the whole sample; inner (grey) curve: the subsample of 492 928 sources with σ_{ϖ} < 1 mas. 
Fig. 7 Parallaxes for the full quasar sample plotted against magnitude (left), colour (middle), and ecliptic latitude (right). Because of the chosen scale, only about one-third of the data points are shown as yellow dots; the blue curves are the running medians. 
Fig. 8 Distributions of the normalised centred parallaxes for the same samples as in Fig. 6. The red curve is a Gaussian distribution with the same standard deviation (1.081) as the normalised centred parallaxes for the full sample. 
Fig. 9 Precision of along-scan astrometric measurements as a function of magnitude. The red (lower) curve is a running median of the formal precision from the image parameter determination; the blue (upper) curve is a robust estimate of the actual standard deviation of the post-fit residuals. The difference between the two curves represents the combination of all unmodelled errors. 
Fig. 10 Small-scale distortion for ungated observations on one of the astrometric CCDs (strip 7, row 4). The curves show the median AL residual for sources in the magnitude range G = 13–16 plotted against the AC pixel coordinate μ, and subdivided according to field of view (preceding PFoV, or following FFoV) and time (before or after the decontamination at OBMT ≃ 2400). For better visibility, the successive curves were vertically displaced by 0.1 mas. The vertical dashed lines show the stitch block boundaries, which divide the 1966 pixels into blocks of 250 pixels, except for the two outermost blocks that are 108 pixels. 
3.4 Spin-related distortion model
As shown by the BAM data (Sect. 2.4) and confirmed in early astrometric solutions, the basic angle between Gaia’s two fields of view undergoes very significant (~1 mas amplitude) periodic variations. The variations depend mainly on the phase of the 6 h spin with respect to the Sun, as given by the heliotropic spin phase Ω(t) (e.g. Fig. 1 in Butkevich et al. 2017). To first order, they can be represented by
$$\Delta\Gamma(t) = d(t)^{-2} \sum_{k=1}^{8} \Big[ \big(C_{k,0} + (t - t_{\rm ref})\,C_{k,1}\big) \cos k\Omega(t) + \big(S_{k,0} + (t - t_{\rm ref})\,S_{k,1}\big) \sin k\Omega(t) \Big] \tag{7}$$
(cf. Eqs. (A.10)–(A.11) in Lindegren et al. 2016), where d(t) is the Sun–Gaia distance in au. Values of the Fourier coefficients obtained by fitting Eq. (7) to the periodic part of the basic-angle corrector (Sect. 2.4), using t_{ref} = J2015.5, are given in Table 3.
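A minimal sketch of how such a scaled Fourier model could be evaluated is given below. It assumes a sum over harmonics k = 1, 2, … of the spin phase, cosine and sine amplitudes that are linear in t − t_{ref}, and an overall factor d(t)^{−2}, consistent with the description in the text. The function name and argument layout are hypothetical, and the coefficient values used in any call are placeholders, not the values of Table 3.

```python
import numpy as np

def basic_angle_variation(omega, d_au, dt_yr, C0, C1, S0, S1):
    """Evaluate a scaled Fourier series in the spin phase.

    omega : heliotropic spin phase in radians
    d_au  : Sun-Gaia distance in au
    dt_yr : t - t_ref in years
    C0, C1, S0, S1 : sequences of equal length; element k-1 holds the constant
        and linear-in-time parts of the cosine and sine amplitude of harmonic k
    """
    C0, C1, S0, S1 = (np.asarray(a, dtype=float) for a in (C0, C1, S0, S1))
    k = np.arange(1, len(C0) + 1)
    cos_amp = C0 + dt_yr * C1
    sin_amp = S0 + dt_yr * S1
    series = np.sum(cos_amp * np.cos(k * omega) + sin_amp * np.sin(k * omega))
    # Inverse-square scaling with the distance to the Sun
    return float(series / d_au**2)
```

The result is in the same unit as the coefficients (for instance μas if the amplitudes are given in μas and μas yr^{−1}).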
Although the exact mechanism is not known, the large 6 h variations are believed to be caused by thermo-elastic perturbations in the Sun-illuminated service module of Gaia propagating to the opto-mechanical structure of the payload (Mora et al. 2016). It is then almost unavoidable that the optical distortions in the astrometric fields also undergo periodic variations, although most likely of much smaller amplitude. The spin-related distortion model aims at estimating, and hence correcting, such variations in the astrometric solution, based on the assumption that they are stable on long timescales. Specifically, for Gaia DR2, it is assumed that the variations scale with the inverse square of the distance to the Sun, but otherwise are strictly periodic in Ω(t). Since such a model in fact describes the basic-angle variations measured by the BAM rather well, it is not unreasonable to assume that it could also work for the optical distortion.
The spin-related distortion may be regarded as just another effect in the astrometric calibration model (Sect. 3.3). However, the character of the variations, requiring a single block of parameters for all observations, made it more convenient to implement it as a set of global parameters (Sect. 5.4 in the AGIS paper).
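Depending on the field index f (= +1 for the preceding and −1 for the following field of view), the spin-related distortion model adds a time-dependent AL displacement to the calibration model in Sect. 3.3:
$$\delta\eta_{f}(\eta, \zeta, t) = \sum_{l+m\le 3} F_{flm}(t)\,\tilde{P}_{l}(\tilde{\eta})\,\tilde{P}_{m}(\tilde{\zeta}_{f}). \tag{8}$$
Here $\tilde{P}_{l}$ and $\tilde{P}_{m}$ are the shifted Legendre polynomials of degree l and m (see footnote 2), and $\tilde{\eta} = (\eta - \eta_{\min})/(\eta_{\max} - \eta_{\min})$, $\tilde{\zeta}_{f} = (\zeta - \zeta_{\min,f})/(\zeta_{\max,f} - \zeta_{\min,f})$ are normalised field angles. (The limits in ζ depend on f because of the different AC locations of the optical centre in the preceding and following fields; see Fig. 3 and Eq. (14) in the AGIS paper.) For the present third-order model (l + m ≤ 3), there are ten two-dimensional basis functions per field of view. The functions F_{flm}(t) of degree l + m > 0 are modelled as a truncated Fourier series in Ω(t), scaled by the inverse square of the distance to the Sun:
$$F_{flm}(t) = d(t)^{-2} \sum_{k=1}^{8} \big[ c_{fklm} \cos k\Omega(t) + s_{fklm} \sin k\Omega(t) \big]. \tag{9}$$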
This gives 288 parameters c_{fklm} and s_{fklm}. The functions F_{−1,0,0}(t) and F_{+1,0,0}(t), that is, for f = ±1 and l = m = 0, require a separate treatment to avoid degeneracy. They represent time-dependent offsets in the two fields that are independent of the field angles η and ζ. The mean function [F_{−1,0,0}(t) + F_{+1,0,0}(t)]∕2 is equivalent to a time-dependent AL shift of the attitude and can therefore be constrained to zero for all t. The difference δΓ(t) = F_{−1,0,0}(t) − F_{+1,0,0}(t) represents a time-dependent correction to the basic angle in addition to the basic-angle corrector derived from BAM data (Sect. 2.4) and the slower variations of the calibration model (Sect. 3.3). This correction is modelled as a scaled Fourier series, in which the Fourier coefficients have a linear dependence on time similar to Eq. (7):
$$\delta\Gamma(t) = d(t)^{-2} \sum_{k=1}^{8} \Big[ \big(\delta C_{k,0} + (t - t_{\rm ref})\,\delta C_{k,1}\big) \cos k\Omega(t) + \big(\delta S_{k,0} + (t - t_{\rm ref})\,\delta S_{k,1}\big) \sin k\Omega(t) \Big] \tag{10}$$
with t_{ref} = J2015.5. However, as discussed in Sect. 5.2, the parameter δC_{1,0} is nearly degenerate with a global shift of the parallaxes and in the present solution it was not estimated, meaning that it was assumed to be zero. This gave 31 parameters for δΓ(t), and a total of 319 parameters for the complete spin-related distortion model.
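The parameter counts quoted above can be cross-checked with a few lines of arithmetic. In the sketch below, the number of harmonics is not taken from the text but inferred from the quoted total of 288 parameters, assuming one cosine and one sine coefficient per harmonic, field of view, and basis function with l + m > 0. The variable names are illustrative only.

```python
from itertools import product

MAX_ORDER = 3              # third-order model, l + m <= 3
N_FIELDS = 2               # preceding and following field of view
N_PARAMS_DISTORTION = 288  # number of c and s coefficients quoted in the text

# All (l, m) pairs of the two-dimensional basis
basis = [(l, m) for l, m in product(range(MAX_ORDER + 1), repeat=2)
         if l + m <= MAX_ORDER]
n_basis = len(basis)

# The l = m = 0 term is handled separately, so it is excluded here
n_field_dependent = n_basis - 1

# Inferred number of harmonics (one cos and one sin coefficient each)
n_harmonics = N_PARAMS_DISTORTION // (N_FIELDS * n_field_dependent * 2)

# Basic-angle correction: constant and linear part for cos and sin of each
# harmonic, minus the one coefficient that was not estimated
n_basic_angle = 4 * n_harmonics - 1

n_total = N_PARAMS_DISTORTION + n_basic_angle
print(n_basis, n_harmonics, n_basic_angle, n_total)
```

Under these assumptions the counts reproduce the ten basis functions, the 31 basic-angle parameters, and the total of 319 given in the text.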
Results from the final solution for the parameters in Eq. (10) are shown in Table 3 in the columns marked Corr. These values can be interpreted as corrections to the mean harmonic coefficients from Eq. (7) shown in the columns marked BAM. The statistical uncertainty of all values is below 1 μas or 1 μas yr^{−1}. The main conclusion from this table is that the BAM data, while substantially correct, nevertheless require significant corrections at least for k ≤ 4. One possible interpretation is that the BAM accurately measures the basic-angle variations at the location of the BAM CCD, outside the astrometric field, but that these variations are not completely representative for the whole astrometric field. The special case of δC_{1,0} and further aspects of c_{fklm} and s_{fklm} are discussed in Sect. 5.2.
Table 2 Summary of the astrometric calibration model and number of calibration parameters in the astrometric solution for Gaia DR2.
4 Astrometric solutions
The astrometric results in Gaia DR2 were not produced in a single large least-squares process, but were the end result of a long series of solutions using different versions of the input data and testing different calibration models and solution strategies. The description below ignores much of this and only mentions the main path and milestones. As described in the AGIS paper, a complete astrometric solution consists of two parts, known as the primary solution and the secondary solutions. In the primary solution, which involves only a small fraction of the sources known as primary sources, the attitude and calibration parameters (and optionally the global parameters) are adjusted simultaneously with the astrometric parameters of the primary sources using an iterative algorithm. The reference frame is also adjusted using a subset of the primary sources identified as quasars. In the secondary solutions, the five astrometric parameters of every source are adjusted using fixed attitude, calibration, and global parameters from the preceding primary solution. The restriction on the number of primary sources comes mainly from practical considerations, as the primary solution is computationally and numerically demanding because of the large systems of equations that need to be solved. By contrast, the secondary solutions can be made one source at a time essentially by solving a system with only five unknowns (or six, if pseudocolour is also estimated). For consistency, the astrometric parameters of the primary sources are recomputed in the secondary solutions.
For Gaia DR2, two complete astrometric solutions were calculated, internally referred to as AGIS02.1 and AGIS02.2. The published data exclusively come from AGIS02.2.
4.1 Provisional solution (AGIS02.1)
The first complete astrometric solution based on the Gaia DR2 input data was made in December 2016. This solution, known as AGIS02.1, provided a provisional attitude and astrometric calibration, and provisional astrometric parameters for about 1620 million sources. These data were used as a starting point for the final solution (AGIS02.2) and allowed us to identify and resolve a number of issues at an early stage. Typical differences between the provisional and final solutions are below 0.2 mas or 0.2 mas yr^{−1}.
The provisional solution was also used in some of the downstream processing, notably for the wavelength calibrations of the photometric instruments (Riello et al. 2018) and radial-velocity spectrometer (Sartoretti et al. 2018). The availability of a provisional solution more than a year before the release was crucial for the inclusion of high-quality photometric and spectroscopic results in Gaia DR2.
4.2 Final Gaia DR2 solution (AGIS02.2)
Compared with the provisional solution, the main improvements in the final solution were
use of pseudocolours in the source model (Sect. 3.1) to take chromaticity into account;
a more accurate corrective attitude (Sect. 3.2), based on the AGIS02.1 calibration;
an improved basic angle corrector, including many detected jumps (Sect. 2.4);
a calibration model (Sect. 3.3) better tuned to the data, derived after detailed analysis of several test runs;
inclusion of global parameters for the spin-related distortion model (Sect. 3.4).
The main steps for producing the final solution were as follows:
1. AGIS preprocessing. This collected and converted input data for each source: astrometric parameters from a previous solution, photometric information, radial velocity when relevant (Sect. 3.1), and the image parameters from all the astrometric observations of the source. The corrective attitude was also computed at this point.
2. Preliminary secondary solutions. A preliminary adjustment of the parameters for all the sources was performed, using the attitude and calibration from AGIS02.1. The main purpose of this was to collect source statistics in order to tune the selection of primary sources for the next step. Two secondary solutions were made for each source: the first computed the pseudocolour of the source, and the second recomputed the astrometric solution using the derived pseudocolour. This gave preliminary astrometric parameters and solution statistics for nearly 2500 million sources.
3. Selection of primary sources. About 16 million primary sources were selected based on the results of the previous step. The criteria for the selection were that (i) sources must have G, G_{BP}, and G_{RP} magnitudes from the photometric processing; (ii) there should be a roughly equal number of sources with observations in each of the three window classes; (iii) for each window class, there should be a roughly homogeneous coverage of the whole sky and a good distribution in magnitude and colour; and (iv) within the constraints set by the previous criteria, sources with high astrometric weight (bright, with small excess noise and a good number of observations) were preferentially selected. To this were added some 490 000 probable quasars for the reference frame alignment (Sect. 5.1).
4. Primary solution. The astrometric parameters of the primary sources were adjusted, along with the attitude, calibration, and global parameters, using a hybrid scheme of simple and conjugate gradient iterations (see Sect. 4.7 in the AGIS paper). The frame rotator was used to keep the astrometric parameters and attitude on ICRS using the subset of primary sources identified as quasars (Sect. 5.1).
5. Final secondary solutions. This essentially repeated step 2 with the final attitude, calibration, and global parameters from step 4, including a recomputation of the pseudocolours for all sources using the final chromaticity calibration. Sources failing to meet the acceptance criteria for a five-parameter solution (Sect. 4.3) obtained a fallback solution at this stage.
6. Regeneration of attitude and calibration. The primary solution did not use data from the first month of nominal operations (in EPSL mode; Sect. 2.1), and several shorter intervals of problematic observations were also skipped. In this step the attitude and calibration were recomputed for these intervals by updating the corresponding parameters while keeping the source parameters fixed. This allowed other processes, such as the photometric processing, to make use of observations in these time intervals as well.
7. AGIS postprocessing. This converted the results into the required formats and stored them in the main database for their subsequent use by all other processes, including the generation of the Gaia Archive.
Although not part of the astrometric processing proper, a further important step was carried out at the point when the astrometric data were converted from the main database into the Gaia Archive: the formal uncertainties of the five-parameter solutions were corrected for the “DOF bug”. The background and details of this are described in Appendix A. Here it is sufficient to note that the formal astrometric uncertainties given in the Gaia Archive, denoted σ_{α*}, σ_{δ}, σ_{ϖ}, σ_{μα*}, and σ_{μδ}, generally differ from the (uncorrected) uncertainties obtained in step 5. When occasionally we need to refer to the latter values, we use the notation ς_{α*}, ς_{δ}, ς_{ϖ}, ς_{μα*}, and ς_{μδ} for the uncorrected uncertainties.
4.3 Acceptance criteria and fallback (two-parameter) solution
In the final secondary solution (step 5 of Sect. 4.2), a five-parameter solution without priors was first attempted for every source. If this solution was not of acceptable quality, a fallback solution for the two position parameters was tried instead. The fallback solution is actually still a five-parameter solution, but with prior information added on the parallax and proper motion components. Details of the procedure are given in Michalik et al. (2015). In the notation of that paper, the precise priors used in the fallback solutions of Gaia DR2 were σ_{α*,p} = σ_{δ,p} = 1000 mas for the position, σ_{ϖ,p} = 10 σ_{ϖ,F90} for the parallax, and for the proper motion components, with yr^{−1}. Compared with a genuine two-parameter solution, where the parallax and proper motion are constrained to be exactly zero, the use of priors in most cases gives a more realistic estimate of the positional uncertainties. The resulting parallax and proper motion values are biased by the priors, and therefore not published.
The criterion for accepting a five-parameter solution uses two quality indicators specifically constructed for this purpose:
visibility_periods_used counts the number of distinct observation epochs, or “visibility periods”, used in the secondary solution for a particular source. A visibility period is a group of observations separated from other groups by a gap of at least four days. This statistic is a better indicator of an astrometrically well-observed source than for example astrometric_matched_observations (the number of field-of-view transits used in the solution): while a five-parameter solution is in principle possible with fewer than ten field-of-view transits, such a solution will be very unreliable unless the transits are well spread out in time. As illustrated in Fig. 2, there are many sources with >10 transits concentrated in just a few visibility periods.
astrometric_sigma5d_max is a five-dimensional equivalent to the semi-major axis of the position error ellipse and is useful for filtering out cases where one of the five parameters, or some linear combination of several parameters, is particularly bad. It is measured in mas and computed as the square root of the largest singular value of the scaled 5 × 5 covariance matrix of the astrometric parameters. The matrix is scaled so as to put the five parameters on a comparable scale, taking into account the maximum along-scan parallax factor for the parallax and the time coverage of the observations for the proper motion components. If C is the unscaled covariance matrix, the scaled matrix is SCS, where S = diag(1, 1, sin ξ, T∕2, T∕2), ξ = 45° is the solar aspect angle in the nominal scanning law, and T = 1.75115 yr the time coverage of the data used in the solution. astrometric_sigma5d_max was not corrected for the DOF bug, as that would obscure the source selection made at an earlier stage based on the uncorrected quantity.
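The two quality indicators described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the pipeline code: the function names are hypothetical, observation times are assumed to be given in days, and the covariance matrix is assumed to be ordered as (α*, δ, ϖ, μ_{α*}, μ_{δ}) in units of mas and mas yr^{−1}, which is what the form of S suggests.

```python
import numpy as np

XI = np.radians(45.0)  # solar aspect angle of the nominal scanning law
T_YR = 1.75115         # time coverage of the data, in years

def count_visibility_periods(obs_times_days, min_gap_days=4.0):
    """Number of groups of observations separated by gaps of at least min_gap_days."""
    t = np.sort(np.asarray(obs_times_days, dtype=float))
    if t.size == 0:
        return 0
    # Every sufficiently long gap between consecutive observations starts a new period
    return 1 + int(np.sum(np.diff(t) >= min_gap_days))

def sigma5d_max(cov):
    """Square root of the largest singular value of the scaled covariance S C S."""
    S = np.diag([1.0, 1.0, np.sin(XI), T_YR / 2.0, T_YR / 2.0])
    scaled = S @ np.asarray(cov, dtype=float) @ S
    # Singular values are returned in descending order
    largest = np.linalg.svd(scaled, compute_uv=False)[0]
    return float(np.sqrt(largest))
```

For a symmetric positive semi-definite covariance matrix the singular values coincide with the eigenvalues, so an eigenvalue routine would give the same result.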
The five-parameter solution was accepted if the following conditions were all met for the source: (11)
where γ(G) = max[1, 10^{0.2(G−18)}]. The upper limit in (iii) gradually increases from 1.2 mas for G ≤ 18 to 4.78 mas at G = 21. This test was applied using preliminary G magnitudes, with the result that some sources in Gaia DR2 have five-parameter solutions even though they do not satisfy (iii).
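The magnitude dependence of the upper limit in (iii) can be checked numerically. The sketch below assumes that the limit is 1.2 mas multiplied by γ(G), which is consistent with the two values quoted in the text; the function names are illustrative.

```python
def gamma(g_mag):
    """Magnitude-dependent factor: max[1, 10^(0.2 (G - 18))]."""
    return max(1.0, 10.0 ** (0.2 * (g_mag - 18.0)))

def sigma5d_limit_mas(g_mag):
    """Assumed upper limit on astrometric_sigma5d_max, in mas."""
    return 1.2 * gamma(g_mag)

for g in (15.0, 18.0, 19.5, 21.0):
    print(f"G = {g:4.1f}: limit = {sigma5d_limit_mas(g):.2f} mas")
```

The limit is flat at 1.2 mas up to G = 18 and then grows by a factor 10^{0.2} per magnitude.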
If the five-parameter solution was rejected by Eq. (11), a fallback solution was attempted as previously described. The resulting position, referring to the epoch J2015.5, was accepted provided that the following conditions were all met: (12)
astrometric_excess_noise is the excess source noise ϵ_{i} introduced in Sect. 3.6 of the AGIS paper, and σ_{pos, max} is the semi-major axis of the error ellipse in position given by Eq. (B.1). Sources rejected also by Eq. (12) are mostly spurious and no results are published for them.
These criteria resulted in 1335 million sources with a five-parameter solution and 400 million with a fallback solution, that is, without parallax and proper motion. About 18 million sources were subsequently removed as duplicates, that is, where the observations of the same physical source had been split between two or more different source identifiers. Duplicates were identified by positional coincidence, using a maximum separation of 0.4 arcsec. To decide which source to keep, the following order of preference was used: unconditionally keep any source (quasar) used for the reference frame alignment; otherwise prefer a five-parameter solution before a fallback solution, and keep the source with the smallest astrometric_sigma5d_max to break a tie.
Gaia DR2 finally gives five-parameter solutions for 1332 million sources, with formal uncertainties ranging from about 0.02 mas to 2 mas in parallax and twice that in annual proper motion. For the 361 million sources with fallback solutions, the positional uncertainty at J2015.5 is about 1 to 4 mas. Further statistics are given in Appendix B.
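The order of preference for resolving duplicates maps naturally onto a sort key. The sketch below is one possible way to express it; the `Candidate` record and its field names are hypothetical and do not correspond to the actual data model of the processing pipeline.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    source_id: int
    used_for_frame: bool   # quasar used for the reference frame alignment
    five_parameter: bool   # True for a five-parameter, False for a fallback solution
    sigma5d_max: float     # astrometric_sigma5d_max in mas

def pick_survivor(candidates):
    """Select which of several duplicated sources to keep."""
    def rank(c):
        # False sorts before True, so each preferred property is negated
        return (not c.used_for_frame, not c.five_parameter, c.sigma5d_max)
    return min(candidates, key=rank)

group = [
    Candidate(1, False, True, 0.5),
    Candidate(2, False, True, 0.3),
    Candidate(3, False, False, 0.1),
]
print(pick_survivor(group).source_id)
```

In this example the two five-parameter solutions outrank the fallback solution, and the smaller astrometric_sigma5d_max breaks the tie between them.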
5 Internal validation
This section summarises the results of a number of investigations carried out by the DPAC astrometry team in order to validate the astrometric solutions. This aimed in particular at characterising the systematic errors in parallax and proper motion, and the realism of the formal uncertainties. Some additional quality indicators are discussed in Appendix C.
5.1 Reference frame
The celestial reference frame of Gaia DR2, known as Gaia-CRF2 (Gaia Collaboration 2018b), is nominally aligned with ICRS and non-rotating with respect to the distant universe. This was achieved by means of a subset of 492 006 primary sources assumed to be quasars. These included 2843 sources provisionally identified as the optical counterparts of VLBI sources in a prototype version of ICRF3, and 489 163 sources found by cross-matching AGIS02.1 with the AllWISE AGN catalogue (Secrest et al. 2015, 2016). The unpublished prototype ICRF3 catalogue (30/06/2017, solution from GSFC) contains accurate VLBI positions for 4262 radio sources and was kindly made available to us by the IAU Working Group Third Realisation of International Celestial Reference Frame.
The radius for the positional matching was 0.1 arcsec for the VLBI sources and 1 arcsec for the AllWISE sample. Apart from the positional coincidence, the joint application of the following conditions reduced the risk of contamination by Galactic stars: (13)
where b is Galactic latitude. We used the formula sinb = (−0.867666cosα − 0.198076sinα)cosδ + 0.455984sinδ, which is accurate to about 0.1 arcsec. These conditions were applied to both samples, except that (v) was not used for the VLBI sample where the risk of contamination is much lower thanks to the smaller positional match radius.
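The formula for sin b quoted above can be evaluated directly. In the sketch below the function name is hypothetical, the input angles are assumed to be ICRS right ascension and declination in degrees, and the coordinates of the north Galactic pole used in the example are the conventional J2000 values, which are not given in the text.

```python
import math

def sin_galactic_latitude(ra_deg, dec_deg):
    """sin b from equatorial (ICRS) coordinates, using the coefficients in the text."""
    a = math.radians(ra_deg)
    d = math.radians(dec_deg)
    return ((-0.867666 * math.cos(a) - 0.198076 * math.sin(a)) * math.cos(d)
            + 0.455984 * math.sin(d))

# Assumed conventional position of the north Galactic pole (degrees)
NGP_RA, NGP_DEC = 192.85948, 27.12825
print(sin_galactic_latitude(NGP_RA, NGP_DEC))
```

A cut on Galactic latitude can then be applied to the absolute value of the returned quantity without ever computing b itself.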
The selection of sources for the frame rotator described above was made before the final solution had been computed and therefore used preliminary values for the various quantities in Eq. (13), including standard uncertainties (ς) not yet corrected for the DOF bug. The resulting subsets of sources are indicated in the Gaia Archive by the field frame_rotator_object_type, which is 2 for the 2843 sources matched to the ICRF3 prototype, 3 for the 489 163 sources matched to the AllWISE AGN catalogue, and 0 for sources not used by the frame rotator. The magnitude distributions of these subsets are shown in Fig. B.1. It can be noted that the AllWISE sample (labelled “QSO” in the diagram) contains three bright sources (G < 12) that are probably distant Galactic stars of unusual colours (the brightest being the Herbig Ae/Be star HD 37357). These objects are not included in the larger but cleaner quasar sample analysed in Sect. 5.2, obtained by applying the stricter Eq. (14) to the final data.
The adjustment of the reference frame was done in the primary solution (step 4 of Sect. 4.2) using the frame rotator described in Sect. 6.1 of the AGIS paper. At the end of an iteration, the frame rotator estimated the frame orientation parameters [ϵ_{X}, ϵ_{Y}, ϵ_{Z}] at J2015.5, using the VLBI sources, and the spin parameters [ω_{X}, ω_{Y}, ω_{Z}] using the AllWISE and VLBI sources. The attitude and the positions and proper motions of the primary sources were then corrected accordingly. The acceleration parameters [a_{X}, a_{Y}, a_{Z}] were not estimated as part of this process, as they are expected to be insignificant compared with the current level of systematics (see below).
At the end of the primary solution, the attitude was thus aligned with the VLBI frame, and the subsequent secondary solutions (step 5 of Sect. 4.2) should then result in source parameters in the desired reference system. This was checked by a separate offline analysis, using independent software and more sophisticated algorithms. This confirmed the global alignment of the positions with the VLBI to within ± 0.02 mas per axis. This applies to the faint reference frame represented by the VLBI sample with a median magnitude of G ≃ 18.8. The bright reference frame was checked by means of some 20 bright radio stars with accurate VLBI positions and proper motions collected from the literature. Unfortunately, their small number and the sometimes large epoch difference between the VLBI observations and Gaia, combined with the manifestly nonlinear motions of many of the radio stars, did not allow a good determination of the orientation error of the bright reference frame of Gaia DR2 at epoch J2015.5. No significant offset was found at an upper (2σ) limit of about ± 0.3 mas per axis.
Concerning the spin of the reference frame relative to the quasars, estimates of [ω_{X}, ω_{Y}, ω_{Z}] using various weighting schemes and including also the acceleration parameters confirmed that the faint reference frame of Gaia DR2 is globally non-rotating to within ± 0.02 mas yr^{−1} in all three axes. Particular attention was given to a possible dependence of the spin parameters on colour (using the effective wavenumber ν_{eff}) and magnitude (G). Figure 3 suggests a small systematic dependence on colour, for example, by ± 0.02 mas yr^{−1} over the range 1.4 ≲ ν_{eff} ≲ 1.8 μm^{−1} corresponding to roughly G_{BP}–G_{RP} = 0 to 2 mag. As this result was derived for quasars that are typically fainter than 15th magnitude, it does not necessarily represent the quality of the Gaia DR2 reference frame for much brighter objects.
Figure 4 indeed suggests that the bright (G ≲ 12) reference frame of Gaia DR2 has a significant (~0.15 mas yr^{−1}) spin relative to the fainter quasars. The points in the left part of the diagram were calculated from stellar proper motion differences between the current solution and Gaia DR1 (TGAS). Only 88 091 sources in the HIPPARCOS subset of TGAS were used for this comparison owing to their superior precision in TGAS. Although based on a much shorter stretch of Gaia observations than the present solution, TGAS provides a valuable comparison for the proper motions thanks to its ~24 yr time difference from the HIPPARCOS epoch. If the spin difference of 0.15 mas yr^{−1} between the two catalogues were to be explained as systematics in TGAS, it would require an alignment error of ~3.6 mas in the positions either in TGAS at epoch J2015.0 or in HIPPARCOS at epoch J1991.25. Given the way these catalogues were constructed, both hypotheses are very unlikely. The most reasonable explanation for the offsets in Fig. 4 is therefore systematics in the Gaia DR2 proper motions of the bright sources. The gradual change between magnitudes 12 and 10 suggests an origin in the gated observations, which dominate for G ≲ 12, or possibly in observations of window class 0, which dominate for G ≲ 13.
Formally, Gaia-CRF2 is materialised by the positions in Gaia DR2 of the 556 869 sources identified as quasars in Sect. 5.2. A separate list of these sources is provided in the Gaia Archive. A more comprehensive analysis of Gaia-CRF2 is given by Gaia Collaboration (2018b).
5.2 Parallax zero point
Global astrometric satellites like HIPPARCOS and Gaia are able to measure absolute parallaxes, that is, without zero-point error, but this capability is susceptible to various instrumental effects, in particular, to a certain kind of basic-angle variations. As discussed by Butkevich et al. (2017), periodic variations of the basic angle (Γ) of the form δΓ(t) = A_{1}d(t)cosΩ(t), where d(t) is the distance of Gaia from the solar system barycentre in au and Ω(t) is the spin phase relative to the barycentre, are observationally almost indistinguishable from a global parallax shift of δϖ = A_{1}∕[2sinξsin(Γ∕2)] ≃ 0.883A_{1}. This is clearly reminiscent of the first term in Eq. (10). Although d, ξ, and Ω in that equation are heliotropic quantities, while the present formula uses barytropic quantities, and d appears with different powers in the two expressions, the differences are small enough to cause a near-degeneracy between A_{1} and δC_{1,0}. This is the reason why the latter parameter was not estimated in the spin-related distortion model.
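The numerical factor in the parallax shift can be reproduced with a short calculation. The sketch below uses ξ = 45° from Sect. 4.3 and assumes the nominal Gaia basic angle Γ = 106.5°, a value that is not stated in this section; the variable and function names are illustrative.

```python
import math

XI_DEG = 45.0            # solar aspect angle of the nominal scanning law
BASIC_ANGLE_DEG = 106.5  # assumption: nominal basic angle, not given in this section

# delta_parallax / A_1 = 1 / [2 sin(xi) sin(Gamma / 2)]
shift_factor = 1.0 / (2.0 * math.sin(math.radians(XI_DEG))
                      * math.sin(math.radians(BASIC_ANGLE_DEG / 2.0)))

def parallax_shift(a1):
    """Global parallax shift equivalent to a basic-angle amplitude a1 (same unit)."""
    return shift_factor * a1

print(f"{shift_factor:.4f}")
```

Under this assumption the factor agrees with the quoted 0.883 to within rounding, so an undetected amplitude A_{1} maps almost one-to-one onto a parallax zero-point offset.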
It is believed that the basicangle corrector derived from BAM data (Sect. 2.4) eliminates basicangle variations very efficiently, but a remaining small variation corresponding to the undetermined δC_{1,0} cannot be excluded. This would then show up as a small offset in the parallaxes. For this reason, it is extremely important to investigate the parallax zero point by external means, that is, using astrophysical sources with known parallaxes. It is also important to check possible dependences of the zero point on other factors such as position, magnitude, and colour, which could be created by errors in the calibration model.
The quasars are almost ideal for checking the parallax zero point thanks to their extremely small parallaxes (< 0.0025 μas for redshift z > 0.1), large number, availability over most of the celestial sphere, and, in most cases, nearly pointlike appearance. The main drawbacks are their faintness and peculiar colours.
In order to create the largest possible quasar sample for validation purposes, a new cross-match of the final Gaia DR2 data with the AllWISE AGN catalogue (Secrest et al. 2015) was made, choosing in each case the nearest positional match. The further selection used the criteria (14)
which is somewhat similar to Eq. (13), but stricter and applied to the final data. Step (ii) selects fiveparameter solutions (31 = 11111_{2}), and step (iii) takes into account the median offset of the parallaxes (see below). The combination of steps (v) and (vi) makes the probability of a chance match with a Galactic star generally lower than ~ 10^{−4} at all Galactic latitudes. A reality check of the resulting selection against SIMBAD revealed that the two brightest sources (at G = 8.85 and 11.72 mag) are stars; removing them leaves 555 934 sources in the sample. The fraction of stars among the AllWISE AGN sources is estimated at ≤ 0.041% (Secrest et al. 2015), or ≲230 in this sample, but only a fraction of them may pass the criteria in Eq. (14).
Applying conditions (i)–(iv) to the sources matched to the ICRF3 prototype (Sect. 5.1) gave 2820 sources, 1885 of which were already in the AllWISE sample. The union set thus contains a total of 556 869 sources, which also define the celestial reference frame of Gaia DR2 (Gaia Collaboration 2018b). A density map of this quasar sample (Fig. 5) shows imprints of the Gaia and AllWISE scanning laws as well as the effects of Galactic extinction and confusion. In the following, the highprecision subset of 492 928 sources with σ_{ϖ} < 1 mas is sometimes used instead of the full quasar sample.
Figure 6 shows the distribution of parallaxes for the full quasar sample and the highprecision subset. For the full sample, the mean and median parallax is − 0.0308 mas and − 0.0287 mas, respectively; for the highprecision subset, the corresponding values are − 0.0288 mas and − 0.0283 mas. For the subsequent analysis we adopt − 0.03 mas as the global zero point of the parallaxes. Scatter plots of the parallaxes versus magnitude and colour (left and middle panels of Fig. 7) show systematic trends with a change of ~0.02 mas over the ranges covered by the data. A plot against ecliptic latitude (right panel) shows a roughly quadratic variation with ~0.010 mas smaller parallaxes towards the ecliptic poles. Thus, while the global mean offset of − 0.029 mas is statistically welldetermined, the actual offset applicable for a given combination of magnitude, colour, and position may be different by several tens of μas. Spatial variations of the parallax zero point are further analysed in Sect. 5.4.
Figure 8 shows the distribution of (ϖ + 0.029 mas)∕σ_{ϖ}, that is, the parallaxes corrected for the global offset and normalised by the formal uncertainties. Ideally, this should follow a normal distribution with zero mean and unit variance. The actual sample standard deviation of this quantity is 1.081. Similarly, the sample standard deviations of the normalised proper motions, μ_{α*}∕σ_{μα*} and μ_{δ}∕σ_{μδ}, are 1.093 and 1.115, respectively. The distributions are very close to normal, as suggested by the red curve in Fig. 8, although it should be noted that the selection in Eq. (14) removed any point beyond ± 5 units in the normalised quantities. The conclusion is that the accidental errors are close to normal, but with a standard deviation some 8–12% larger than the formal uncertainties. This applies to the faint sources (G ≳ 15) beyond the Galactic plane (|sin b| > 0.1) represented by the quasar subset.
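The normalisation described above can be sketched as follows. The data are synthetic (they are not Gaia measurements): the true parallaxes are set to zero, the offset is −0.029 mas, and the real errors are assumed to be 8% larger than the formal uncertainties. All names are illustrative.

```python
import random
import statistics

ZERO_POINT_MAS = -0.029  # global parallax offset quoted in the text


def normalised_parallaxes(parallaxes, sigmas, zero_point=ZERO_POINT_MAS):
    """Return (parallax - zero_point) / sigma for each source (inputs in mas)."""
    return [(p - zero_point) / s for p, s in zip(parallaxes, sigmas)]


# Synthetic stand-in for a quasar sample: zero true parallax, and actual
# errors assumed to be 1.08 times the formal uncertainties.
rng = random.Random(2018)
sigmas = [rng.uniform(0.1, 1.0) for _ in range(20000)]
parallaxes = [ZERO_POINT_MAS + rng.gauss(0.0, 1.08 * s) for s in sigmas]

z = normalised_parallaxes(parallaxes, sigmas)
inflation = statistics.stdev(z)  # close to 1.08 if the assumption holds
print(f"mean = {statistics.mean(z):+.3f}, standard deviation = {inflation:.3f}")
```

A standard deviation above unity in such a test is what indicates underestimated formal uncertainties.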
The observations contributing to the parallax determinations are distributed roughly uniformly over the 62 CCDs in the central 0.7° × 0.7° astrometric field of the Gaia instrument. The basicangle variation relevant for the parallax zero point is therefore effectively given by the average variation in this field. On the other hand, the CCD generating the BAM data is situated about 0.7° from the centre of the astrometric field, that is, well outside the field near one of its corners. The corrections given in Table 3 show that the variations measured by the BAM are not fully representative of the variations present in the astrometric field. It is noted that a parallax zero point of − 29 μas corresponds to a value ≃−33 μas for the undetermined correction δC_{1,0} in Table 3.
Differential variations within the astrometric field depending on Ω are described by the global parameters c_{fklm}, s_{fklm} in Eq. (9), which are estimated in the primary solution. In principle, this allows the differential variations to be extrapolated to the location of the BAM. Although such a procedure is clearly problematic, it could provide an independent estimate of the crucial parameter δC_{1,0} and important consistency checks for other parameters. A detailed investigation along these lines will only be meaningful at a later time when other calibration errors have been substantially eliminated. With the current solution, we note that the largest amplitudes c_{fklm}, s_{fklm} are associated with the lowest temporal (k) and spatial (l + m) orders, as would be expected for a physical instrument. Moreover, their sizes (0.01 to 0.05 mas) are in the approximate range needed to account for the corrections to the BAM data reported in Table 3 as well as the global parallax offset of − 0.029 mas. However, there could be many other explanations for this offset; in particular, it appears that unmodelled AL centroid shifts related to the transverse smearing of the images during a CCD integration (depending on the AC rate d ζ∕dt) could be an important contributor (Sect. 5.3).
5.3 Residual analysis
Analysis of the astrometric residuals can reveal inadequacies in the calibration model, for example where a new effect needs to be added or where the time granularity of some effect already included in the model has insufficient resolution. It is particularly interesting to look for model deficiencies that might explain the systematics seen in the astrometric results, for instance, the parallax zero point error. In this section we first estimate the total size of the unmodelled errors, and then give two examples of effects that contribute to the errors in the present solutions, but could be eliminated in future releases.
Figure 9 compares the photonstatistical uncertainties of the AL angular measurements with the scatter of postfit residuals in the astrometric solution. The red curve is the formal precision from the image parameter determination, derived from the assumed Poissonian character of the individual CCD sample values. This curve has three domains, depending on the number of photons (N) in the stellar image: for moderately bright sources (G ≃ 12–17), the centroiding precision is limited by the photon noise in the stellar image, or σ ∝N^{−1∕2}, leading to a slope of about 0.2 dex mag^{−1}; for fainter sources (G ≳ 17), the background gradually becomes more important, leading to a higher slope in the red curve; finally, for the bright stars (G ≲ 12), the use of the gates limits N and hence the centroiding precision to a value roughly independent of G.
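The slope of about 0.2 dex mag^{−1} in the photon-limited domain follows from the definition of the magnitude scale. A short derivation, assuming only that the number of photons scales with the flux:

```latex
N \propto 10^{-0.4\,G}
\quad\Longrightarrow\quad
\sigma \propto N^{-1/2} \propto 10^{+0.2\,G},
\qquad
\frac{\mathrm{d}\log_{10}\sigma}{\mathrm{d}G} = 0.2\ \mathrm{dex\,mag^{-1}} .
```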
The blue curve in Fig. 9 is the robust scatter estimate (RSE)^{4} of the post-fit residuals, computed in bins of 0.1 mag. For faint sources, it agrees reasonably well with the formal uncertainties (for G > 17 the RSE is on average 15% higher than the formal uncertainties), but for brighter sources, there is a strong discrepancy. The difference between the blue and red curves represents the combination of all unmodelled source, attitude, and calibration errors. The quadratic difference amounts to about 0.3 mas for G ≃ 6–12, 0.25 mas for G ≃ 12–13, and 0.15 mas for G ≳ 13. Part of this may be attributable to the sources (e.g. binarity), part to residual attitude irregularities, but a major part is clearly due to inadequacies of the calibration models, including the LSF and PSF models used for the image parameter determination. A main task in preparation for future Gaia data releases will be to improve these models and hence reduce the gap between the two curves.
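The comparison of the two curves can be illustrated with a small sketch. It assumes that the RSE is the range between the 10th and 90th percentiles, scaled so that it equals the standard deviation for a normal distribution (the footnote defining it is not reproduced here), and that the unmodelled error is the quadratic difference between RSE and formal precision. The sample is synthetic and the names are illustrative.

```python
import math
import random
from statistics import NormalDist, quantiles

# Scale factor making the 10th-90th percentile range equal to sigma
# for a normal distribution (approximately 0.390152).
RSE_FACTOR = 1.0 / (2.0 * NormalDist().inv_cdf(0.9))


def robust_scatter_estimate(values):
    """Scaled 10th-90th percentile range of the values."""
    deciles = quantiles(values, n=10)  # nine cut points: 10%, 20%, ..., 90%
    return RSE_FACTOR * (deciles[-1] - deciles[0])


def unmodelled_error(rse, formal):
    """Quadratic difference sqrt(rse^2 - formal^2), or 0 if rse <= formal."""
    return math.sqrt(max(rse ** 2 - formal ** 2, 0.0))


# Synthetic residuals (mas) with a true scatter of 0.3 mas.
rng = random.Random(0)
residuals = [rng.gauss(0.0, 0.3) for _ in range(20000)]
rse = robust_scatter_estimate(residuals)
print(f"RSE = {rse:.3f} mas, excess over 0.1 mas formal = {unmodelled_error(rse, 0.1):.3f} mas")
```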
The astrometric calibration model (Sect. 3.3) currently does not include small-scale irregularities of the CCDs. To assess the importance of such errors, we plot in Fig. 10 the median AL residual, subdivided by field of view and time, as a function of the AC pixel coordinate μ. Comparing the four curves, it is seen that the pattern is extremely stable in time, but slightly different in the two fields of view. The rms amplitude is only 0.013 mas in the preceding and 0.015 mas in the following field of view, far too small to explain the discrepancy seen in Fig. 9. While the small-scale irregularities are therefore unimportant in the current solution, they will be included in future calibration models.
One of the most interesting trends revealed by the residual analysis concerns a hitherto unmodelled dependence on the across-scan rate dζ∕dt, where ζ is the AC field angle. In the nominal scanning law, the AC rate varies sinusoidally over the 6 hr spin period with an amplitude of about ± 0.18 arcsec s^{−1}, or ± 0.3% of the constant AL rate (60 arcsec s^{−1}). It is in general different in the two fields of view. The AC motion of stellar images by up to 0.8 arcsec during their motion across a CCD smears the PSF in the AC direction. While this obviously has a strong effect on the AC location of the image, it should, to a first approximation, not affect the AL location of the centroid. However, secondary effects involving a non-symmetric PSF or non-linear response to the photon flux could easily generate a small dependence of the precise AL location on the AC rate. Figure 11 shows that this is indeed the case. Test solutions including astrometric calibration terms depending on the AC rate show reduced levels of systematics, for example in terms of the ~1 deg scale correlations discussed in Sect. 5.4. AL centroiding errors depending on the AC rate are particularly insidious, as the AC rate exhibits a strong correlation with the AL parallax factor in the current nominal scanning law.
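The numbers quoted above can be checked with a few lines of arithmetic. The CCD transit time of 4.42 s is an assumption taken from the nominal instrument design, not from this section, and the phase of the sinusoid is arbitrary.

```python
import math

AL_RATE = 60.0            # arcsec/s, constant along-scan rate
AC_AMPLITUDE = 0.18       # arcsec/s, amplitude of the across-scan rate
SPIN_PERIOD_HOURS = 6.0
CCD_TRANSIT_TIME = 4.42   # s, assumed time for an image to cross one CCD


def ac_rate(t_hours, phase=0.0):
    """Across-scan rate (arcsec/s) at time t for a purely sinusoidal variation."""
    return AC_AMPLITUDE * math.sin(2.0 * math.pi * t_hours / SPIN_PERIOD_HOURS + phase)


ratio_percent = 100.0 * AC_AMPLITUDE / AL_RATE   # relative to the AL rate
max_smear = AC_AMPLITUDE * CCD_TRANSIT_TIME      # AC image motion over one CCD

print(f"AC/AL rate ratio = {ratio_percent:.2f}%")
print(f"maximum AC smearing per CCD transit = {max_smear:.2f} arcsec")
```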
Fig. 11 Residual systematics depending on the AC scan rate. The curves show the median residual as a function of OBMT for observations of window class 1 (G ≃ 13–16) in the preceding field of view. The red curve is for observations with positive AC rate, and the blue curve for negative AC rate. The vertical dashed lines show the approximate times of the two decontamination events. 
5.4 Spatial correlations
Figure 12 is a map of the median quasar parallax, adjusted for the median offset − 0.029 mas, at a resolution of a few degrees. Away from the Galactic plane, where there is a sufficient density of quasars (cf. Fig. 5) for estimating a local zero point, there are several areas of a few tens of degrees where the parallaxes are systematically offset by about ± 0.05 mas from the global mean. This demonstrates the presence of correlated errors on spatial scales of 10–20 deg and RMS values of a few tens of μas. Irregularities on smaller scales cannot be probed in this way using quasars, owing to their low average density.
However, distant stars in dense regions reveal significant variations on much smaller scales. As an example, Fig. 13 shows the median parallaxes for about 2.5 million sources in the area of the LMC. To remove most foreground stars, we selected sources with magnitudes between G = 17 and 19, within 5 deg of the LMC centre (α, δ) = (78.77°, −69.01°), and with proper motions mas^{2} yr^{−2} (cf. Gaia Collaboration 2018c). The mean and median values of their parallaxes are − 0.014 mas, roughly consistent with the parallax zero point from quasars at the LMC location near the South Ecliptic Pole (Fig. 7, right), assuming a true parallax of 0.020 mas for the LMC (Freedman et al. 2001). The quasi-regular triangular pattern in Fig. 13 has a period of about 1 deg and a typical amplitude of about ± 0.03 mas. The left part of the circular area seems to be offset by 0.02 mas from the rest with a straight and rather sharp boundary. These patterns are clearly related to Gaia’s scanning law with its precessional motion of about 1 deg per revolution. Similar (unphysical) patterns are seen in parallax maps of high-density areas around the Galactic centre, and also in the proper motions. Thus strong correlated errors (or systematics) also exist on spatial scales much below 1 deg.
A global, quantitative characterisation of these correlations can be obtained by calculating the covariance of the quasar parallax errors as a function of angular separation,
V_{ϖ}(θ) = ⟨(ϖ_{i} − \overline{ϖ})(ϖ_{j} − \overline{ϖ})⟩. (15)
Here \overline{ϖ} is the mean parallax of all the quasars in the sample, and the average is taken over all non-redundant pairs of quasars (i > j) with angular separation θ ± Δθ∕2. Figure 14 shows the result of this calculation for the high-precision quasar sample, using a bin width of Δθ = 0.125 deg. The positive covariance for angles ≲40 deg is a signature of large-scale systematics and is reasonably well approximated by the fitted exponential
V_{ϖ}(θ) ≃ (285 μas^{2}) × exp(−θ∕14°), (16)
shown by the dashed curve. This function corresponds to errors with an RMS amplitude of 285^{1∕2} ≃ 17 μas and a characteristic spatial scale of 14 deg, both of which are consistent with the large-scale patterns seen in Fig. 12. The dip in V_{ϖ}(θ) around θ = 120 deg may be related to the basic angle, although it is centred on a slightly higher value than Γ = 106.5 deg.
The lower panel of Fig. 14 shows V_{ϖ}(θ) for θ < 7 deg. The blue curve connects the slightly smoothed values. Although Eq. (16), shown by the dashed curve, well describes the mean covariance averaged over a few degrees, the detailed curve shows multiple oscillations around the exponential with a period of about 1 deg, and for the smallest angles (< 0.125 deg), the covariance becomes much larger, about 1850 μas^{2} (with a large statistical uncertainty), corresponding to an RMS amplitude of 43 μas. These features are clearly produced by smallscale patterns similar to what is seen in the LMC area (Fig. 13).
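The covariance estimate described above, the average product of mean-subtracted parallaxes over all pairs in a separation bin, can be sketched as follows. This is a brute-force illustration for small samples, not the procedure actually used for the quasar sample. The function names are illustrative, and the exponential model takes its amplitude (285 μas^{2}) and scale (14 deg) from the values quoted in the text.

```python
import math


def unit_vector(ra_deg, dec_deg):
    """Unit vector on the sphere for a position given in degrees."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))


def separation_deg(u, v):
    """Angular separation in degrees, using atan2 for numerical stability."""
    dot = sum(a * b for a, b in zip(u, v))
    cross = (u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0])
    return math.degrees(math.atan2(math.sqrt(sum(c * c for c in cross)), dot))


def parallax_covariance(ra, dec, parallax, bin_width, max_sep):
    """Return (bin centre, covariance, number of pairs) for each non-empty bin."""
    mean = sum(parallax) / len(parallax)
    resid = [p - mean for p in parallax]
    vecs = [unit_vector(a, d) for a, d in zip(ra, dec)]
    nbins = int(math.ceil(max_sep / bin_width))
    sums, counts = [0.0] * nbins, [0] * nbins
    for i in range(len(parallax)):
        for j in range(i):  # non-redundant pairs, i > j
            k = int(separation_deg(vecs[i], vecs[j]) / bin_width)
            if k < nbins:
                sums[k] += resid[i] * resid[j]
                counts[k] += 1
    return [((k + 0.5) * bin_width, sums[k] / counts[k], counts[k])
            for k in range(nbins) if counts[k]]


def exponential_model(theta_deg, amplitude=285.0, scale_deg=14.0):
    """Fitted large-scale covariance in microarcsec^2, using the quoted values."""
    return amplitude * math.exp(-theta_deg / scale_deg)


# Toy example with three sources on the equator (arbitrary parallax units).
demo = parallax_covariance(ra=[0.0, 0.1, 90.5], dec=[0.0, 0.0, 0.0],
                           parallax=[1.0, 1.0, -2.0], bin_width=1.0, max_sep=180.0)
print(demo)
print(f"RMS at zero separation = {math.sqrt(exponential_model(0.0)):.1f} microarcsec")
```

For samples of several hundred thousand sources, the double loop would have to be replaced by a spatially indexed pair search; the sketch only illustrates the definition.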
Qualitatively similar correlations on both large and small angular scales are found by analysing the proper motions of the quasars. We define
V_{μ}(θ) = ⟨μ_{i}′μ_{j}⟩∕2, (17)
where μ_{i} = p_{i}μ_{α*i} + q_{i}μ_{δi} is the proper motion vector of source i, with unit vectors p_{i} and q_{i} towards increasing α and δ, respectively (e.g. Eq. (3) in Lindegren et al. 2016). The prime denotes the scalar product. The vector formulation was chosen in order to combine the two components of proper motion in a frame-independent way. For small separations p_{i} ≃ p_{j} and q_{i} ≃ q_{j}, which gives V_{μ}(θ) ≃ ⟨μ_{α*i}μ_{α*j} + μ_{δi}μ_{δj}⟩∕2; thus V_{μ} is the covariance averaged between the two components of the proper motion. Figure 15 shows V_{μ}(θ) for the high-precision quasar sample. The dashed curve is the fitted exponential (18)
The value at θ = 0 corresponds to an RMS amplitude of about 28 μas yr^{−1} for the largescale systematics. At small separations similar features are seen as for V_{ϖ} (θ), including the 1 deg oscillations; for θ < 0.125 deg the covariance is 4400 μas^{2}yr^{−2}, corresponding to an RMS value of 66 μas yr^{−1} per component of the proper motions. Again, this is consistent with smallscale proper motion patterns seen, for example, in the LMC (Gaia Collaboration 2018c).
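The vector formulation of the proper motion covariance can be illustrated with a short sketch. It builds the unit vectors towards increasing α and δ at a given position, forms the proper motion vector, and evaluates half the scalar product for one pair; averaging such terms over pairs in a separation bin would then proceed as for the parallaxes. The function names are illustrative.

```python
import math


def tangent_basis(ra_deg, dec_deg):
    """Unit vectors p (towards increasing alpha) and q (towards increasing delta)."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    p = (-math.sin(ra), math.cos(ra), 0.0)
    q = (-math.sin(dec) * math.cos(ra), -math.sin(dec) * math.sin(ra), math.cos(dec))
    return p, q


def proper_motion_vector(ra_deg, dec_deg, pmra, pmdec):
    """Three-dimensional proper motion vector p * pmra + q * pmdec."""
    p, q = tangent_basis(ra_deg, dec_deg)
    return tuple(pk * pmra + qk * pmdec for pk, qk in zip(p, q))


def pair_term(mu_i, mu_j):
    """Contribution of one pair: half the scalar product of the two vectors."""
    return 0.5 * sum(a * b for a, b in zip(mu_i, mu_j))


# Two sources at the same position: the term reduces to the component average.
mu_1 = proper_motion_vector(30.0, 40.0, 3.0, 4.0)
mu_2 = proper_motion_vector(30.0, 40.0, 1.0, 2.0)
same_place = pair_term(mu_1, mu_2)  # (3*1 + 4*2) / 2
print(same_place)
```

Because only scalar products of three-dimensional vectors enter, the result does not depend on the orientation of the coordinate system.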
The RMS values derived above and summarised in Table 4 for the different angular scales can be interpreted as the noise floor when averaging the parallaxes or proper motions for a large number of sources in areas of the corresponding sizes. The numbers should be seen as indicative and not necessarily as representative for sources that are much brighter than the quasars.
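One way to read the noise-floor interpretation is sketched below. It rests on an assumption that is not stated in the text: that the correlated error adds in quadrature to the random error of the mean and does not average down within the area considered. The input values are examples only (a random error of 500 μas per source and the 43 μas small-scale RMS quoted above), and the function name is illustrative.

```python
import math


def uncertainty_of_mean(sigma_random, n_sources, floor):
    """Uncertainty of a mean of n sources with a systematic floor added in quadrature."""
    return math.sqrt(sigma_random ** 2 / n_sources + floor ** 2)


# Example: per-source random error 500 microarcsec, systematic floor 43 microarcsec.
for n in (1, 100, 10000):
    print(n, round(uncertainty_of_mean(500.0, n, 43.0), 1))
```

In this picture, increasing the number of sources beyond a few hundred brings little further gain, because the floor term dominates.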
Fig. 12 Map of the median parallaxes for the full quasar sample, showing large-scale variations of the parallax zero point. See Fig. 5 for the coordinate system and density of sources. Median values are calculated in cells of about 3.7 × 3.7 deg^{2}. Only cells with |sin b| > 0.2 are plotted. 
Fig. 13 Map of the median parallaxes for a sample of sources in the LMC area, showing smallscale variations of the parallax zero point. Median values are calculated in cells of about 0.057 ×0.057 deg^{2}. 
Fig. 14 Top: spatial covariance V_{ϖ}(θ) of parallax errors in the high-precision quasar sample. Red circles are the individual estimates, and the dashed black curve shows a fitted exponential. Bottom: same data for separations < 7° with error bars (68% confidence intervals) and a running triangular mean (blue curve). The two highest points, for separations < 0.25°, are outside the plot in the top panel. 
Fig. 15 Same as Fig. 14, but for the proper motions of the highprecision quasar sample (V_{μ} (θ)). The highest point, for the smallest separation, is outside the plot in the top panel. 
Table 3 Fourier coefficients for the basic-angle variations.
Table 4 Summary of estimated systematics for faint sources (G ≳ 16 mag).
5.5 Split-field solutions
The internal consistency of the astrometric solution can be examined by comparing solutions based on complementary subsets of the observations. The observations can for example be divided depending on the CCD strip in the astrometric field (AF). Normally a source is observed in nine consecutive CCD strips, denoted AF1–AF9, as its image moves over the focal plane (see e.g. Fig. 3 in the AGIS paper), thus generating up to nine AL observations per fieldofview transit. The photon noise component is strictly independent between the nine observations, while systematic errors depending on the calibration model may be partly similar.
We have made two separate primary solutions using only the AF2–AF5 and AF6–AF9 strips, respectively; these are called the “early” and “late” solutions. The same set of primary sources was used in both solutions as for the final primary solution (step 4 of Sect. 4.2), and the calibration and attitude models were also the same. (Naturally, the calibration model only included the relevant CCDs, and the normalised AL field angle was similarly redefined for the spinrelated distortion model.) Although the early and late solutions are partly affected by similar systematics from deficiencies in the calibration or attitude models, the differences in the resulting astrometric parameters may give a realistic impression of the magnitude and general character of the systematics, and a very good check of the random errors. It should be noted that the differences can never be interpreted as corrections to the published data: indeed we do not know which of the two solutions, if any, is better in terms of systematics.
Figure 16 shows the difference in parallax as a function of magnitude. The scatter is broadly consistent with the combined formal uncertainties of the two solutions. The median difference exhibits a strong dependence on magnitude, which is clearly related to the window class (steps at G ≃ 13 and 16) and the use of gates for G ≲ 12. To explore the spatial variations, maps of the median parallax differences are shown in Fig. 17 for three magnitude ranges that roughly correspond to window classes 0, 1, and 2 (see footnote 3). In each map, the median parallax difference for sources in the magnitude range was subtracted in order to eliminate the magnitude effect. The spatial variations are similar in the middle and right maps, but distinctly different in the left map (window class 0). The RMS amplitude of the variations shown in these maps, that is, of the median differences at a pixel size of 3.36 deg^{2}, is 0.010, 0.008, and 0.013 mas, respectively.
The split-field solutions generally support the findings in Sects. 5.1 and 5.4, viz. the presence of a magnitude-dependent systematic error, probably mainly affecting the bright (G ≲ 13) sources, and spatial variations of a few tens of μas on a scale of several degrees.
About 477 000 of the quasars from Sect. 5.2 have accepted solutions in both the “early” and “late” solutions. The median parallax is − 0.034 mas for the early solution and − 0.022 mas for the late solution. A scatter plot of the parallaxes (Fig. 18) shows that the parallax errors are practically independent in the two solutions; the correlation coefficient is + 0.0245.
Fig. 16 Difference in parallax between the “late” and “early” solutions as a function of magnitude. The cyan curve is the median. Only results for primary sources are plotted; discontinuities in the density of points at G = 13, 16, etc. are caused by the way the primary sources are selected. 
Fig. 17 Maps of the median difference in parallax between the “late” and “early” solutions, subdivided by magnitude. In each map, the global median was subtracted to remove the major part of the magnitude dependence seen in Fig. 16. Left: Magnitude range G < 13 mag. Middle: 13 < G < 16 mag. Right: 16 < G < 19 mag. 
Fig. 18 Splitfield parallax solutions for the quasar sample. 
6 Conclusions
Compared with Gaia DR1 (Lindegren et al. 2016), the second release contains a vastly increased number of sources with full astrometric data, including parallaxes and proper motions. For the bright (G ≲ 12 mag) sources where such data were included already in the first release, the present results are generally more accurate and fully independent of the HIPPARCOS and Tycho catalogues. The reference frame, GaiaCRF2, is entirely defined by Gaia observations of quasars, including the optical counterparts of VLBI sources in a prototype version of the ICRF3.
In spite of these improvements, we recall that the astrometric results in Gaia DR2 are based on less than two years of observations and very preliminary calibrations that have not yet benefited from the iterative improvement of the preprocessing of the CCD measurements. As a consequence, random and systematic errors are both considerably higher than can be expected for the final mission products.
In this release, all sources beyond the solar system are still treated as single stars, that is, as point objects whose motions can be described by the basic five-parameter model. For unresolved binaries (separation ≲100 mas), the photocentre is consistently observed and the astrometric parameters thus refer to the position and motion of the photocentre in the wavelength band of the G magnitude. Orbital motion and photometric variability may bias the astrometric parameters for such sources. Resolved or partially resolved binaries cause a different kind of error, for example when the different observations of a source variously refer to one or the other of the components, or to the photocentre, depending on the direction of the scan. In some cases, this is known to produce spurious results, for instance, very large positive or negative parallaxes (Arenou et al. 2018; see also Appendix C). These limitations will be eliminated in future releases.
The random errors, as described by the formal uncertainties in the Gaia Archive, are summarised in Tables B.1 and B.2. An attempt to quantify the systematic errors, mainly based on the analysis of quasar data, is given in Table 4. The main weaknesses identified through the internal validation process (Sect. 5) are listed below. A more extensive discussion is found in the paper by Arenou et al. (2018) on the catalogue validation.
Parallax zero point. Although the measurement principle of Gaia should give absolute parallaxes, the results for quasars very clearly indicate a global zero point of about − 0.03 mas (i.e. 0.03 mas should be added to the published values). There are, however, variations of a similar size depending on magnitude, colour, and position (Figs. 7 and 12). On small scales, the zero-point variations may present quasi-periodic patterns as in Fig. 13. A different zero point may apply to bright sources (see below). Therefore, in any scientific usage of samples of Gaia DR2 parallaxes for which the zero point is important (e.g. period-luminosity relations, or other luminosity calibrations), the zero point itself might be treated as an adjustable parameter. This will not always be possible, for example for very small samples, or when the distance is nearly constant in the sample, as in a star cluster.
Formal errors. The DOF bug resulted in significantly underestimated uncertainties for the bright (G ≲ 13 mag) sources, which has been approximately corrected in the Gaia Archive as described in Appendix A. Nevertheless, the quasar sample shows that the uncertainties of the parallaxes and proper motions of the faint quasars away from the Galactic plane are underestimated by 8–12%. For brighter sources, and closer to the Galactic plane, the uncertainties may be more severely underestimated (Arenou et al. 2018).
Bright sources. The instrument calibration is still very provisional and particularly problematic for the bright (G ≲ 13 mag) sources. This manifests itself in larger uncertainties compared with the slightly fainter stars, and possibly in a systematic rotation of the proper motion system of the bright sources relative to the quasars (Fig. 4). The bright sources also behave distinctly differently in the split-field solutions (Sect. 5.5), suggesting that the parallax zero point could also be different for G ≲ 13.
Spurious large positive and negative parallaxes. The release contains a small number of sources with very large positive or negative parallaxes, for example, exceeding ± 1 arcsec. These are most likely produced by cross-matching issues, where the different observations of the same nominal source were matched to different physical sources. In such cases, the proper motion will in general also be corrupted. This, and the related question of how the data might be “cleaned”, is discussed in Appendix C. No filtering was made in the Gaia Archive based on the sizes of the parallaxes and proper motions.
A summary of the astrometric properties of Gaia DR2 is given in Appendix B.
Acknowledgements
This work presents results from the European Space Agency (ESA) space mission Gaia. Gaia data are being processed by the Gaia Data Processing and Analysis Consortium (DPAC). Funding for the DPAC is provided by national institutions, in particular the institutions participating in the Gaia MultiLateral Agreement (MLA). The Gaia mission website is https://www.cosmos.esa.int/gaia. The Gaia Archive website is https://archives.esac.esa.int/gaia. This work was financially supported by the European Space Agency (ESA) in the framework of the Gaia project; the Centre National d’Etudes Spatiales (CNES); the National Science Foundation of China (NSFC) through grants 11703065 and 11573054; the German Aerospace Agency (Deutsches Zentrum für Luft- und Raumfahrt e.V., DLR) through grants 50QG0501, 50QG0601, 50QG0602, 50QG0701, 50QG0901, 50QG1001, 50QG1101, 50QG1401, 50QG1402, 50QG1403, and 50QG1404 and the Centre for Information Services and High Performance Computing (ZIH) at the Technische Universität (TU) Dresden through a generous allocation of computer time; the Agenzia Spaziale Italiana (ASI) through contracts I/037/08/0, I/058/10/0, 2014025R.0, and 2014025R.1.2015 to the Italian Istituto Nazionale di Astrofisica (INAF), contract 2014049R.0/1/2 to INAF dedicated to the Space Science Data Centre (SSDC, formerly known as the ASI Science Data Centre, ASDC), and contracts I/008/10/0, 2013/030/I.0, 2013030I.0.12015, and 201617I.0 to the Aerospace Logistics Technology Engineering Company (ALTEC S.p.A.), and INAF; the Netherlands Organisation for Scientific Research (NWO) through grant NWOM614.061.414 and the Netherlands Research School for Astronomy (NOVA); the Spanish Ministry of Economy (MINECO/FEDER, UE) through grants ESP201455996C21R, ESP201455996C22R, ESP201680079C21R, and ESP201680079C22R, the Spanish Ministerio de Economía, Industria y Competitividad through grant AyA201455216, the Spanish Ministerio de Educación, Cultura y Deporte (MECD) through grant FPU16/03827, the Institute of Cosmos Sciences University of Barcelona (ICCUB, Unidad de Excelencia “María de Maeztu”) through grant MDM20140369, the Xunta de Galicia and the Centros Singulares de Investigación de Galicia for the period 2016–2019 through the Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), the Red Española de Supercomputación (RES) computer resources at MareNostrum, and the Barcelona Supercomputing Centre - Centro Nacional de Supercomputación (BSC-CNS) through activities AECT201610006, AECT201620013, AECT201630011, and AECT201710020; the Swedish National Space Board (SNSB/Rymdstyrelsen); the Swiss State Secretariat for Education, Research, and Innovation through the ESA PRODEX programme, the Mesures d’Accompagnement, the Swiss Activités Nationales Complémentaires, and the Swiss National Science Foundation; the United Kingdom Science and Technology Facilities Council (STFC) through grant ST/L006553/1, the United Kingdom Space Agency (UKSA) through grant ST/N000641/1 and ST/N001117/1, as well as a Particle Physics and Astronomy Research Council Grant PP/C503703/1. The unpublished prototype version of ICRF3 was kindly provided by the IAU Working Group on ICRF3. Diagrams were produced using the astronomy-oriented data handling and visualisation software TOPCAT (Taylor 2005). This research has made use of the SIMBAD database, operated at CDS, Strasbourg, France. We thank A.G.A. Brown and C. Jordi for valuable feedback during the preparation of this paper, and the referee, V.V. Makarov, for constructive comments on the original version of the manuscript.
Appendix A DOF bug and how it was corrected
A.1 Background
A necessary but not sufficient condition for correctly estimated formal uncertainties in a leastsquares solution is that the residuals have expected sizes in relation to the assumed uncertainties of the observations. Given the considerable gap, illustrated in Fig. 9, between the formal uncertainties of the observations derived from the image parameter determination and the actual scatter of residuals, it is clear that some reweighting of the observations is necessary in order to achieve the required consistency. As explained in Sect. 3.6 of the AGIS paper, the reweighting is done by quadratically adding the excess noise ϵ to the formal uncertainty of the observation σ_{η}. The excess noise has two components: the excess source noise ϵ_{i}, which is the same for all observations of a given source i, and the excess attitude noise ϵ_{a}(t), which is a function of time but the same for all sources at a given time. Briefly, ϵ_{i} and ϵ_{a} (t) are globally adjusted to make the weighted sum of squared residuals (A.1)
for all the sources. Here R_{l} is the AL residual of observation l, the sum is taken over all the accepted observations of the source, and ν is the number of degrees of freedom, that is, the number of accepted AL observations minus 5. (In this and the following equation, we disregard the outlier treatment for simplicity.) The nonnegative quantity ϵ_{i}, given in the Gaia Archive as astrometric_excess_noise, is a useful characteristic of the source, since it should only be zero if all the observations fit the singlestar model well enough, given the level of excess attitude noise set by the majority of other sources.
An alternative measure of how well the singlestar model fits a given source is the quantity astrometric_chi2_al, also given in the Gaia Archive. It is calculated as (A.2)
We note that the excess source noise is not included in the denominator, otherwise we would always have χ^{2} ≤ ν. The a posteriori mean error of unit weight u = (χ^{2}∕ν)^{1∕2}, also known as the unit weight error, is a more useful goodness-of-fit statistic, since it is expected to be around unity in well-behaved cases.
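The two weighted sums described in this subsection can be written out as a sketch that follows the verbal description: the excess noises are added quadratically to the formal uncertainty, and the excess source noise is left out of χ^{2}. The index notation (σ_{η,l}, t_{l}) and the use of an equality in the first relation are assumptions. For sources whose excess source noise is set to zero, the first relation would read Q ≤ ν.

```latex
Q \equiv \sum_{l} \frac{R_l^{2}}{\sigma_{\eta,l}^{2} + \epsilon_i^{2} + \epsilon_a(t_l)^{2}} = \nu ,
\qquad
\chi^{2} = \sum_{l} \frac{R_l^{2}}{\sigma_{\eta,l}^{2} + \epsilon_a(t_l)^{2}} .
```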
So far, we have described how the weighting scheme was intended to work. Now we consider some actual statistics in Gaia DR2. In Fig. A.1 we have plotted the unit weight error as a function of magnitude for a random subset of sources with zero excess source noise. According to what was said above, we expect this u to be on average around 1.0. As shown by the cyan curve, which is a running median, this is actually the case only for sources fainter than G ≃ 17. For the brighter sources, there are strong deviations: at G ≲ 13, corresponding to window class 0, the median u is in the range 1.2–1.4, while at intermediate magnitudes it is <1, with a minimum of 0.8 at G ≃ 13.4.
This unexpected behaviour of u for sources brighter than G ≃ 17 was traced to a bug in the source update algorithm. This bug, which we refer to as the “DOF bug”, directly affected sources with observations in window class 0, and indirectly other sources as well, as explained below. Observations in window class 0 are special in that they provide measurements in both the along-scan (AL) and across-scan (AC) directions, while window classes 1 and 2 only give AL measurements. However, the estimation of the excess source noise ϵ_{i} by means of Eq. (A.1) should only use the more precise AL observations. The source update algorithm correctly neglected the AC observations when computing the sum Q, but erroneously included them in the degrees of freedom (DOF) ν. As a result, the excess source noise was seriously underestimated for sources with G ≲ 13, and in fact set to zero for about 80% of them. Since these sources usually have an equal number of AC and AL observations, ν was roughly doubled, which explains why u is roughly a factor 1.4 (≃ √2) too large for these sources. It also means that the formal uncertainties are underestimated in this magnitude range.
Fig. A.1 Residual statistics for sources with five-parameter solutions and excess source noise equal to zero. The yellow dots show individual values of the unit weight error for the sources brighter than G = 11, and for a gradually decreasing random fraction of the fainter sources. The cyan curve is the running median.
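The effect of the inflated degrees of freedom can be illustrated with a toy calculation. The sketch below is not the AGIS source update algorithm: it assumes that the excess source noise is the non-negative root of Q(ϵ) = ν, with Q the sum of R^{2}∕(σ^{2} + ϵ^{2}) over the AL observations, it folds any excess attitude noise into σ, and it solves the equation by simple bisection. All function names are hypothetical.

```python
import math

def q_sum(residuals, sigmas, eps):
    """Weighted sum of squared residuals with excess noise eps added in quadrature."""
    return sum(r * r / (s * s + eps * eps) for r, s in zip(residuals, sigmas))

def excess_noise(residuals, sigmas, nu, eps_max=100.0, tol=1e-10):
    """Solve q_sum(eps) = nu for eps >= 0 by bisection.

    Returns 0 when the residuals are already consistent with nu, i.e. when
    q_sum(0) <= nu. Assumes the root, if any, lies below eps_max.
    """
    if q_sum(residuals, sigmas, 0.0) <= nu:
        return 0.0
    lo, hi = 0.0, eps_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if q_sum(residuals, sigmas, mid) > nu:
            lo = mid  # still too much scatter: the excess noise must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Toy source: 100 AL observations (and as many AC observations), every AL
# residual 1.3 times its formal uncertainty.
n_al = n_ac = 100
residuals = [1.3] * n_al
sigmas = [1.0] * n_al

eps_correct = excess_noise(residuals, sigmas, nu=n_al - 5)        # AL only
eps_bugged = excess_noise(residuals, sigmas, nu=n_al + n_ac - 5)  # AC wrongly counted
uwe = math.sqrt(q_sum(residuals, sigmas, 0.0) / (n_al - 5))
```

In this toy case the correct ν = 95 yields a clearly non-zero excess noise, whereas the inflated ν = 195 returns zero, leaving a source with zero excess noise but a unit weight error well above 1.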
The too low values of u for the somewhat fainter sources (G ≃ 13–17) are an indirect effect of the same bug. In the primary solution, the excess attitude noise was estimated as a function of time by analysis of the residuals after taking into account the excess source noise. As is evident from Eq. (A.1), the underestimated excess source noise for many sources then had to be compensated for by a higher excess attitude noise. As a result, the excess attitude noise was generally overestimated, leading to underestimated χ^{2} in Eq. (A.2) at intermediate magnitudes (G ≃ 13–17). Even fainter sources are not significantly affected by the overestimated excess attitude noise, as their error budget is in any case dominated by the photon noise.
A.2 How the bug was corrected
In order to evaluate the impact of the DOF bug on the astrometric results, a new primary solution was computed after having corrected the software for the DOF bug. The astrometric parameters, their formal uncertainties, and other statistics were compared with the corresponding data from the original (uncorrected) solution. It was found that the astrometric parameters themselves are only very marginally affected by the bug. This was expected, as the observations had not changed, only their relative weights. The most serious impact is on the formal uncertainties, which are underestimated for bright sources and slightly overestimated at intermediate magnitudes. Very nearly the same correction factor applies to the uncertainties of all five astrometric parameters of a given source.
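The DOF bug was discovered very late in the data processing cycle and at a stage when it was judged too risky to recompute, revalidate, and replace the complete astrometric solution in Gaia DR2. Instead it was decided not to touch the astrometric parameters themselves, but apply a statistical correction to their formal uncertainties. For each source with a five-parameter solution, a correction factor F was computed as described below and applied to the formal uncertainties ς_{α*}, etc. of the original solution. The Gaia Archive then contains the corrected uncertainties
$$\sigma_{\alpha*} = F\varsigma_{\alpha*}, \quad \sigma_{\delta} = F\varsigma_{\delta}, \quad \sigma_{\varpi} = F\varsigma_{\varpi}, \quad \sigma_{\mu\alpha*} = F\varsigma_{\mu\alpha*}, \quad \sigma_{\mu\delta} = F\varsigma_{\mu\delta} \quad \text{(A.3)}$$
The correction factor was computed as
$$F = (1 + 0.8R)\left[1 + \left(\frac{0.025~\text{mas}}{\varsigma_{\varpi}}\right)^2\right]^{-1/2} \quad \text{(A.4)}$$
where
$$R = N_{\rm AC}/N_{\rm AL} \quad \text{(A.5)}$$
is the ratio of the number of AC to AL observations. Since 0 ≤ R ≤ 1 and ς_{ϖ} ≥ 0.015 mas in Gaia DR2, the factor F is constrained to the range 0.5 to 1.8.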
Fig. A.2 Ratio of parallax uncertainties before (top) and after (bottom) applying the statistical correction factor F from Eq. (A.4). The yellow dots are for the individual primary sources, the cyan curve is the median, and the blue curves are the 10th and 90th percentiles. 
Equation (A.4) was derived by comparing the parallax uncertainties in the primary solution with the DOF bug fixed with the corresponding values in the original solution. The top diagram in Fig. A.2 shows the ratio of the two uncertainties as a function of magnitude before the correction; the bottom diagram shows the same ratio after the correction by F, that is, with the original uncertainty ς_{ϖ} replaced by σ_{ϖ} = Fς_{ϖ}. The constants 0.8 and 0.025 mas in Eq. (A.4) were adjusted to make the median curve in the bottom diagram as close to unity as possible. The diagrams illustrate the statistical nature of the correction: although the median ratio is roughly correct after correction, the uncertainties could still be significantly wrong for some sources.
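Should there ever be a need to undo the correction, it is possible to compute F in terms of the published parallax uncertainty σ_{ϖ} as
$$F = (1 + 0.8R)\left[\frac{1}{2} + \frac{1}{2}\sqrt{1 + \left(\frac{0.050~\text{mas}\,(1 + 0.8R)}{\sigma_{\varpi}}\right)^2}\,\right]^{-1/2} \quad \text{(A.6)}$$
from which ς_{α*} = σ_{α*}∕F, etc.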
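The correction and its inverse can be sketched numerically. The code below assumes that Eq. (A.4) has the form F = (1 + 0.8R)[1 + (0.025 mas∕ς_{ϖ})^{2}]^{−1∕2}, which reproduces the stated range 0.5 to 1.8; the inverse follows by solving σ_{ϖ} = Fς_{ϖ} for F. The function names are illustrative and are not part of any Gaia software.

```python
import math

C_MAS = 0.025  # constant of Eq. (A.4), in mas

def correction_factor(r_ac: float, varsigma_plx: float) -> float:
    """F from R (ratio of AC to AL observations) and the original,
    uncorrected parallax uncertainty in mas (assumed form of Eq. A.4)."""
    return (1.0 + 0.8 * r_ac) / math.sqrt(1.0 + (C_MAS / varsigma_plx) ** 2)

def correction_factor_from_published(r_ac: float, sigma_plx: float) -> float:
    """F recovered from the published (corrected) parallax uncertainty in mas,
    obtained by inverting sigma = F * varsigma."""
    a = 1.0 + 0.8 * r_ac
    inner = math.sqrt(1.0 + (2.0 * C_MAS * a / sigma_plx) ** 2)
    return a / math.sqrt(0.5 + 0.5 * inner)

# Round trip for a bright source with equal numbers of AC and AL observations.
varsigma = 0.030                                   # original uncertainty, mas
f = correction_factor(1.0, varsigma)
sigma = f * varsigma                               # value as published
f_back = correction_factor_from_published(1.0, sigma)
```

Dividing any published uncertainty of the source by f_back would then recover the corresponding uncorrected value.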
A.3 Secondary effects on other statistics
The DOF bug also affected the excess source noise and its significance, but there was no simple way to correct this, and the uncorrected values are therefore left in the Gaia Archive. Typically the excess source noise may be underestimated by 0.15–0.3 mas for G ≲ 13, and not at all or by less than 0.15 mas for fainter sources. The astrometric χ^{2} is affected by the overestimated excess attitude noise, and is therefore generally underestimated at all magnitudes; again no correction was made for this quantity. To single out “bad” solutions using any of these statistics can in any case only be done in an ad hoc fashion by considering the overall distributions of the quantities at the relevant magnitudes. An example is given in Appendix C.
Appendix B Astrometric properties of Gaia DR2
This appendix gives statistics for the most important astrometric characteristics of Gaia DR2. Figure B.1 shows the distribution of sources according to G magnitude (photometric_g_mean_mag). In all statistics, it is necessary to separate the two kinds of solutions: full (five-parameter) solutions with positions, parallaxes, and proper motions; and fallback (two-parameter) solutions with only positions. The subsets of the sources used to define the reference frame (Sect. 5.1) are shown by the green and magenta histograms in Fig. B.1.
Subsequent tables and figures illustrate the variation of various quality indicators with magnitude and position. The quantities considered are listed below with a brief explanation.
ra_error = standard uncertainty in right ascension at epoch J2015.5 σ_{α*} = σ_{α} cosδ,
dec_error = standard uncertainty in declination at epoch J2015.5 σ_{δ},
parallax_error = standard uncertainty in parallax σ_{ϖ},
pmra_error = standard uncertainty of proper motion in right ascension σ_{μα*} = σ_{μα} cosδ,
pmdec_error = standard uncertainty of proper motion in declination σ_{μδ},
semi-major axis of error ellipse in position at epoch J2015.5, σ_{pos,max}, see Eq. (B.1),
semi-major axis of error ellipse in proper motion, σ_{pm,max}, see Eq. (B.2),
astrometric_excess_noise = excess source noise, ϵ_{i}: this is the extra noise per observation that must be postulated to explain the scatter of residuals in the astrometric solution for the source,
visibility_periods_used = number of visibility periods of the source, i.e. groups of observations separated by at least four days (Sect. 4.3),
astrometric_matched_observations = number of field-of-view transits of the source used in the astrometric solution,
astrometric_n_good_obs_al = number of good CCD observations AL of the source used in the astrometric solution,
fraction of bad CCD observations AL of the source = astrometric_n_bad_obs_al∕astrometric_n_obs_al,
parallax_pmra_corr = correlation coefficient between ϖ and μ_{α*}, ρ(ϖ, μ_{α*}),
parallax_pmdec_corr = correlation coefficient between ϖ and μ_{δ}, ρ(ϖ, μ_{δ}),
pmra_pmdec_corr = correlation coefficient between μ_{α*} and μ_{δ}, ρ(μ_{α*}, μ_{δ}).
The meaning of “good” and “bad” CCD observations requires an explanation. In AGIS an ill-fitting observation is never rejected outright, but its statistical weight is reduced by a factor 0 < w ≤ 1 depending on the size of the post-fit residual in relation to the expected uncertainty – see Eq. (66) in the AGIS paper. Somewhat arbitrarily we count an observation as “good” if w ≥ 0.2 and “bad” if w < 0.2. This corresponds to a limit of 4.83 standard deviations for a “good” residual.
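The semi-major axes of the error ellipses in position and proper motion are not given in the Gaia Archive but can be calculated as
$$\sigma_{\rm pos,max} = \sqrt{\tfrac{1}{2}(C_{00} + C_{11}) + \tfrac{1}{2}\sqrt{(C_{11} - C_{00})^2 + 4C_{01}^2}} \quad \text{(B.1)}$$
$$\sigma_{\rm pm,max} = \sqrt{\tfrac{1}{2}(C_{33} + C_{44}) + \tfrac{1}{2}\sqrt{(C_{44} - C_{33})^2 + 4C_{34}^2}} \quad \text{(B.2)}$$
(cf. Eq. 9 in Lindegren et al. 2016), where C_{ij} are elements of the 5 × 5 covariance matrix; specifically
$$C_{00} = \sigma_{\alpha*}^2, \quad C_{11} = \sigma_{\delta}^2, \quad C_{01} = \sigma_{\alpha*}\sigma_{\delta}\,\rho(\alpha, \delta), \quad C_{33} = \sigma_{\mu\alpha*}^2, \quad C_{44} = \sigma_{\mu\delta}^2, \quad C_{34} = \sigma_{\mu\alpha*}\sigma_{\mu\delta}\,\rho(\mu_{\alpha*}, \mu_{\delta}) \quad \text{(B.3)}$$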
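A minimal sketch of this calculation is given below. It uses the standard expression for the largest eigenvalue of a 2 × 2 covariance matrix, which is assumed here to be what Eqs. (B.1) and (B.2) express; the function name is illustrative. For the position ellipse the correlation coefficient would come from the archive column ra_dec_corr, which is not among the quantities listed above.

```python
import math

def semi_major_axis(sigma_1: float, sigma_2: float, rho: float) -> float:
    """Semi-major axis of the error ellipse of two parameters with standard
    uncertainties sigma_1, sigma_2 and correlation coefficient rho.

    This is the square root of the largest eigenvalue of the 2 x 2
    covariance matrix [[c11, c12], [c12, c22]].
    """
    c11 = sigma_1 ** 2
    c22 = sigma_2 ** 2
    c12 = rho * sigma_1 * sigma_2
    return math.sqrt(0.5 * (c11 + c22)
                     + 0.5 * math.sqrt((c22 - c11) ** 2 + 4.0 * c12 ** 2))

# Hypothetical usage with archive columns (values in mas or mas/yr):
# sigma_pos_max = semi_major_axis(ra_error, dec_error, ra_dec_corr)
# sigma_pm_max  = semi_major_axis(pmra_error, pmdec_error, pmra_pmdec_corr)
```

For uncorrelated parameters the result reduces to the larger of the two standard uncertainties, and a non-zero correlation always increases it.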
Table B.1 gives the median uncertainties of the astrometric parameters, and some other statistics, at selected magnitudes. At any magnitude there is a considerable scatter among the individual sources, as illustrated in Fig. B.2, and a systematic variation with position, as illustrated in Fig. B.3 for G ≃ 15. The latter figure is fairly representative for all magnitudes after appropriate scaling. Additional statistics at G ≃ 15 are shown in Figs. B.4 and B.5. Table B.2 gives statistics for the fallback (two-parameter) solutions.
Summary statistics for the 1332 million sources in Gaia DR2 with a full astrometric solution (five astrometric parameters).
Fig. B.1 Magnitude distribution of sources in Gaia DR2. Grey: All sources. Blue: Sources with a full astrometric solution (five parameters). Red: Sources with a fallback solution (position only). Green: Quasar candidates from the AllWISE AGN catalogue. Magenta: VLBI sources from the ICRF3 prototype. 
Fig. B.2 Formal uncertainties versus the G magnitude for sources with a five-parameter astrometric solution. Left: semi-major axis of the error ellipse in position at epoch J2015.5. Middle: standard deviation in parallax. Right: semi-major axis of the error ellipse in proper motion. The yellow dots show individual values for a representative selection of the sources; the cyan curve is the median uncertainty and the blue curves are the 10th and 90th percentiles. The plotted sample contains all sources for G < 11, and a geometrically decreasing random fraction of the fainter sources with roughly uniform distribution in G.
Fig. B.3 Formal uncertainties at G ≃ 15 for sources with a five-parameter astrometric solution. Left: semi-major axis of the error ellipse in position at epoch J2015.5. Middle: standard deviation in parallax. Right: semi-major axis of the error ellipse in proper motion. This and all other full-sky maps in this paper use a Hammer–Aitoff projection in equatorial (ICRS) coordinates with α = δ = 0 at the centre, north up, and α increasing from right to left.
Fig. B.4 Observation statistics at G ≃ 15 for sources with a five-parameter astrometric solution. These statistics are the main factors governing the formal uncertainties of the astrometric data. Left: number of visibility periods used. Middle: number of good CCD observations AL. A map of the number of used field-of-view transits is very similar, with numbers smaller by a factor of nine. Right: mean excess source noise.
Fig. B.5 Correlation coefficients at G ≃ 15 for sources with a five-parameter astrometric solution. Maps of the correlations at other magnitudes are very similar to these. Left: correlation between ϖ and μ_{α*}. Middle: correlation between ϖ and μ_{δ}. Right: correlation between μ_{α*} and μ_{δ}.
Summary statistics for the 361 million sources in Gaia DR2 with a fallback solution (position only).
Appendix C Selecting astrometrically “clean” subsets
The criterion for an accepted five-parameter solution, Eq. (11), was designed to include as many sources as possible with reasonably reliable astrometry. Using a stricter criterion would have resulted in a smaller, but possibly more reliable catalogue. The choice of a relatively lenient criterion presumes that users can and should implement additional filters as required by their particular applications, with due consideration of possible selection biases introduced by the filters. The Gaia Archive contains several statistics that may be useful in this process, but their interpretation is far from simple. In this appendix we illustrate both the benefits and limitations of certain filters in a specific case, namely the construction of a “clean” HR diagram of nearby (< 100 pc) stars. This should not be seen as a fixed recipe for selecting sources with the most reliable astrometry, but it may provide some useful hints for further exploration. Complementing the internal validation in Sect. 5 it also contains a brief discussion of the extremely large positive and negative parallaxes in Gaia DR2.
The left panel of Fig. C.1 is an HR diagram obtained by plotting G + 5 log_{10}(ϖ∕100 mas) versus colour index G_{BP}–G_{RP}. (This ignores extinction and takes the distance to be inverse parallax, both reasonable approximations in the solar neighbourhood.) The criteria used (Selection A) were:
(i) parallax > 10 mas,
(ii) parallax_over_error > 10,
(iii) phot_bp_mean_flux_over_error > 10,
(iv) phot_rp_mean_flux_over_error > 10,
yielding 338 833 sources. Nominally, (i) selects sources within 100 pc, (ii) those with at most 10% uncertainty in distance (corresponding to ≃0.2 mag in absolute magnitude), and (iii)–(iv) those with at most 10% uncertainty in the BP and RP fluxes (corresponding to ≃0.1 mag in G_{BP} and G_{RP}). Taken at face value, this selection should produce a very clean HR diagram. Indeed, the astrophysically expected features are very prominent in the left panel of Fig. C.1 but many points fall in unexpected places, e.g. between the main and white-dwarf sequences. Selection A includes three sources with ϖ > 800 mas, i.e. nominally closer to the Sun than Proxima Centauri (which has the fourth largest parallax in the sample). All three sources are faint (G > 19.7) and lie in a very crowded region within 10 deg of the Galactic centre. This suggests that their large parallaxes are spurious, resulting from inconsistent matching of the observations to different physical sources. If that is the case, then most likely the proper motions of these sources are also corrupted.
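The nominal selection and the ordinate of the HR diagram can be sketched as below. The thresholds are read off the nominal description of criteria (i)–(iv): parallax above 10 mas, and a signal-to-noise ratio above 10 in parallax and in the BP and RP fluxes. The dictionary keys are assumed to match the corresponding archive columns, and the function names are illustrative.

```python
import math

def absolute_magnitude(g_mag: float, parallax_mas: float) -> float:
    """M_G = G + 5 log10(parallax / 100 mas), ignoring extinction and taking
    the distance to be the inverse parallax."""
    return g_mag + 5.0 * math.log10(parallax_mas / 100.0)

def in_selection_a(row: dict) -> bool:
    """Nominal Selection A: within 100 pc, at most 10% uncertainty in
    parallax, and at most 10% uncertainty in the BP and RP fluxes."""
    return (
        row["parallax"] > 10.0                                  # (i)
        and row["parallax"] / row["parallax_error"] > 10.0      # (ii)
        and row["phot_bp_mean_flux_over_error"] > 10.0          # (iii)
        and row["phot_rp_mean_flux_over_error"] > 10.0          # (iv)
    )

# Hypothetical source used only to exercise the functions.
example = {
    "parallax": 50.0,
    "parallax_error": 1.0,
    "phot_bp_mean_flux_over_error": 20.0,
    "phot_rp_mean_flux_over_error": 30.0,
}
```

As the text stresses, passing these nominal criteria does not by itself guarantee a reliable solution.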
Fig. C.1 HR diagram of sources nominally within 100 pc and with relative distance error less than 10%. Left: raw diagram (Selection A). Middle: sources filtered by unit weight error (Selection B). Right: sources filtered by unit weight error and flux excess ratio (Selection C). 
With a maximum star density of the order of one million per square degree, there is a non-negligible probability to have a chance configuration of two stars, separated by an arcsec or less, which could be mistaken for a single object with a large parallax. This is more likely to happen in areas that combine a high star density with a relatively small number of visibility periods, as is the case in the region of the Galactic centre (Fig. B.4). However, it is reasonable to expect that in most such cases of spurious parallaxes, observations do not fit the single-star parallax model very well. This should lead to an increased chi-square, or unit weight error u = (χ^{2}∕ν)^{1∕2}. In the left panel of Fig. C.2 this quantity is plotted versus G for Selection A. Compared with a similar plot for well-behaved sources (Fig. A.1), there are several noteworthy differences: the strong rise for G < 6 caused by uncalibrated CCD saturation; a blob of moderately large values of u for G > 18, possibly extending to much larger values for brighter sources; and a general scatter of large u at all magnitudes, which could be caused by partially resolved or astrometric binaries. If we want to keep the sources with G < 6 (which include most of the giants) but remove the blob at G > 18, a possible cut is given by the black lines, i.e. the function
$$u < 1.2 \times \max\left(1,\ \exp\left(-0.2\,(G - 19.5)\right)\right) \quad \text{(C.1)}$$
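A cut of this kind can be sketched as a magnitude-dependent threshold on u. The code assumes that Eq. (C.1) has the form u < 1.2 × max(1, exp(−0.2(G − 19.5))), which is flat at the faint end and rises steeply towards bright magnitudes; the function names are illustrative.

```python
import math

def uwe_threshold(g_mag: float) -> float:
    """Upper limit on the unit weight error at magnitude G
    (assumed form of Eq. C.1)."""
    return 1.2 * max(1.0, math.exp(-0.2 * (g_mag - 19.5)))

def passes_uwe_cut(u: float, g_mag: float) -> bool:
    """True if the source lies below the threshold curve in the (G, u) plane."""
    return u < uwe_threshold(g_mag)
```

With this form the limit is 1.2 for G ≥ 19.5 and becomes very permissive for the brightest sources, so that the rise of u at G < 6 is not cut away.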
Fig. C.2 Unit weight error versus magnitude for two samples. Left: selection A (positive parallaxes). Right: selection N (negative parallaxes). The black line is the threshold defined in Eq. (C.1). 
Adding this criterion to Selection A gives Selection B with 249 793 sources and the much cleaner HR diagram in the middle panel of Fig. C.1. (A similar filtering could be obtained by using the excess source noise instead of u, for example by selecting astrometric_excess_noise < 1 mas, but the behaviour of the excess noise for G ≲ 15 is less discriminating due to the DOF bug.) Selection B still contains two sources with ϖ > 800 mas.
Additional scatter in the HR diagram is produced by photometric errors mainly in the BP and RP bands, affecting in particular faint sources in crowded areas. An indicator of possible issues with the BP and RP photometry is the flux excess factor E = (I_{BP} + I_{RP})∕I_{G} (phot_bp_rp_excess_factor), where I_{X} is the photometric flux in band X (Evans et al. 2018). Adding the criterion (Gaia Collaboration 2018d)
$$1.0 + 0.015\,(G_{\rm BP} - G_{\rm RP})^2 < E < 1.3 + 0.06\,(G_{\rm BP} - G_{\rm RP})^2 \quad \text{(C.2)}$$
to Selection B gives Selection C with 242 582 sources and the HR diagram in the right panel of Fig. C.1. The remaining scatter of points between the main and white-dwarf sequences may be partly real, consisting of binaries with white-dwarf and main-sequence companions of roughly equal magnitude. In Selection C the source with the largest parallax is Proxima Centauri.
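The photometric filter can be sketched in the same way. The code assumes that Eq. (C.2) is the colour-dependent interval 1.0 + 0.015(G_{BP} − G_{RP})^{2} < E < 1.3 + 0.06(G_{BP} − G_{RP})^{2}; the function name is illustrative.

```python
def passes_flux_excess_cut(excess_factor: float, bp_rp: float) -> bool:
    """True if the flux excess factor E (phot_bp_rp_excess_factor) lies in the
    colour-dependent interval assumed for Eq. (C.2).

    bp_rp is the colour index G_BP - G_RP in mag.
    """
    colour_sq = bp_rp ** 2
    lower = 1.0 + 0.015 * colour_sq
    upper = 1.3 + 0.06 * colour_sq
    return lower < excess_factor < upper
```

Both bounds are strict, so a source lying exactly on either curve is rejected.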
The chance matching mechanism discussed above, where different observations of the same Gaia source are matched to two (or more) physically distinct objects, should produce a roughly equal number of positive and negative spurious parallaxes. Further insight into the mechanism can therefore be gained by inspecting a sample of sources with significantly negative parallaxes. The selection (Selection N)
(i) ϖ < −10 mas,
(ii) ϖ∕σ_{ϖ} < −10,
gives 113 393 sources with manifestly unphysical parallaxes. A plot of u versus G for this sample is shown in the right panel of Fig. C.2. The similarity to the “blob” in the left plot is striking, and supports the idea that most of the spurious large (positive or negative) parallaxes can be removed by a judicious cut in the (G, u) plane. In fact 90% of the sources in Selection N are removed by the cut in Eq. (C.1).
Selection N includes 61 sources with ϖ < −800 mas, the smallest being −1857 mas. For comparison, if the photometric criteria (iii) and (iv) are removed from Selection A, the number of sources with ϖ > 800 mas is 46. Conversely, if (iii) and (iv) are imposed on Selection N, the number of sources with ϖ < −800 mas is reduced to 6. The similar number of very large negative and positive parallaxes, when similar criteria are applied, broadly supports the hypothesis that most of the spurious large parallaxes result from the previously described chance matching of the observations to distinct objects. (The same thing can of course happen with the resolved components of a physical double star, if the separation is ≲1 arcsec.) The probability that it happens should decrease steeply with an increased number of available observations, or rather with the number of visibility periods (Sect. 4.3). That this is indeed the case is illustrated in Fig. C.3, where the tail of normalised negative parallaxes is plotted for Selection N and for some subsets of it. Nominally, if the parallax errors were truly unbiased and Gaussian, we would expect to have no source at all with −ϖ∕σ_{ϖ} > 6. The blue curve shows the distribution for the sources in Selection N, which by Eq. (11) all have at least six visibility periods. Requiring at least 7 or 10 visibility periods (green line/rings, and grey line/squares, respectively) drastically reduces the negative tail while retaining 85% and 41% of the sources. Requiring even more visibility periods only shrinks the sample without changing the shape of the tail. If these criteria are applied to Selection A, the HR diagram gets cleaner at the faint end, but most of the points between the main sequence and white dwarfs around colour index 1 are still present. Increasing the minimum number of visibility periods is therefore efficient for eliminating the most extreme spurious parallaxes, but not for cleaning the middle and upper part of the HR diagram.
The red curve in Fig. C.3 shows the distribution of negative parallaxes after the cut in Eq. (C.1), which is clearly more effective in removing the many parallaxes that are only moderately wrong.
Fig. C.3 Distribution of the negative tail of normalised parallaxes. 
The effectiveness of the filters described above is also illustrated in Fig. C.4. The left map shows the celestial distribution of the 73 246 sources in Gaia DR2 that are nominally within 50 pc from the Sun, i.e. with ϖ > 20 mas. Stars in this volume should have a rather uniform distribution on the sky; yet the map shows strong features correlated with the density of faint stars (e.g. along the Galactic equator) or related to the scanning law (e.g. the triangular patch in the left part of the map). Many of these features disappear after applying the cut in Eq. (C.1), as shown in the middle map. Applying in addition the cut in Eq. (C.2) leaves 34 001 sources with a nearly uniform distribution (right map). The remaining concentration of points at (α, δ) ≃ (67°, +16°) is the Hyades cluster. It can be inferred that most of the remaining sources are real. Inevitably, however, the filtering also eliminates some real sources with valid solutions. In this example the 39 245 sources removed by Eqs. (C.1) and (C.2) include at least some 700 actual nearby stars, among them Sirius B, Kruger 60, Ross 614, η Cas, π^{3} Ori, and δ Eri.
Fig. C.4 Distribution in equatorial (ICRS) coordinates of sources formally within 50 pc. Left: all 73 246 sources with ϖ > 20 mas. Middle: the subset of 39 478 sources satisfying Eq. (C.1). Right: the subset of 34 001 sources satisfying both Eqs. (C.1) and (C.2). 
References
Arenou, F., Luri, X., Babusiaux, C. et al. 2018, A&A, 616, A17 (Gaia 2 SI)
Butkevich, A. G., Klioner, S. A., Lindegren, L., Hobbs, D., & van Leeuwen, F. 2017, A&A, 603, A45
Evans, D. W., Riello, M., De Angeli, F., et al. 2018, A&A, 616, A4 (Gaia 2 SI)
Fabricius, C., Bastian, U., Portell, J., et al. 2016, A&A, 595, A3
Freedman, W. L., Madore, B. F., Gibson, B. K., et al. 2001, ApJ, 553, 47
Gaia Collaboration (Brown, A. G. A., et al.) 2016a, A&A, 595, A2
Gaia Collaboration (Prusti, T., et al.) 2016b, A&A, 595, A1
Gaia Collaboration (Brown, A. G. A., et al.) 2018a, A&A, 616, A1 (Gaia 2 SI)
Gaia Collaboration (Mignard, F., et al.) 2018b, A&A, 616, A14 (Gaia 2 SI)
Gaia Collaboration (Helmi, A., et al.) 2018c, A&A, 616, A12 (Gaia 2 SI)
Gaia Collaboration (Babusiaux, S., et al.) 2018d, A&A, 616, A10 (Gaia 2 SI)
Hambly, N. C., Copper, M., Boudreault, S., et al. 2018, A&A, 616, A15 (Gaia 2 SI)
Klioner, S. A. 2003, AJ, 125, 1580
Klioner, S. A. 2004, Phys. Rev. D, 69, 124001
Lindegren, L., Lammers, U., Hobbs, D., et al. 2012, A&A, 538, A78
Lindegren, L., Lammers, U., Bastian, U., et al. 2016, A&A, 595, A4
Michalik, D., Lindegren, L., Hobbs, D., & Butkevich, A. G. 2015, A&A, 583, A68
Mora, A., Biermann, M., Bombrun, A., et al. 2016, in Proc. SPIE, Space Telescopes and Instrumentation 2016: Optical, Infrared, and Millimeter Wave, 9904, 99042D
Riello, M., De Angeli, F., Evans, D. W., et al. 2018, A&A, 616, A3 (Gaia 2 SI)
Rodgers, A. W. & Eggen, O. J. 1974, PASP, 86, 742
Sartoretti, P., Katz, D., Cropper, M., et al. 2018, A&A, 616, A6 (Gaia 2 SI)
Secrest, N. J., Dudik, R. P., Dorland, B. N., et al. 2015, ApJS, 221, 12
Secrest, N. J., Dudik, R. P., Dorland, B. N., et al. 2016, VizieR Online Data Catalog: II/221
Soffel, M., Klioner, S. A., Petit, G., et al. 2003, AJ, 126, 2687
Taylor, M. B. 2005, in Astronomical Data Analysis Software and Systems XIV, eds. P. Shopbell, M. Britton, & R. Ebert, ASP Conf. Ser., 347, 29
van Leeuwen, F., ed. 2007, Hipparcos, the New Reduction of the Raw Data, Astrophysics and Space Science Library, 350
van Leeuwen, F., Evans, D. W., De Angeli, F., et al. 2017, A&A, 599, A32
Wenger, M., Ochsenbein, F., Egret, D., et al. 2000, A&AS, 143, 9
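The shifted Legendre polynomials \tilde{P}_{n}(x) are related to the (ordinary) Legendre polynomials P_{n}(x) by \tilde{P}_{n}(x) = P_{n}(2x − 1). Specifically, \tilde{P}_{0}(x) = 1, \tilde{P}_{1}(x) = 2x − 1, and \tilde{P}_{2}(x) = 6x^{2} − 6x + 1. In the AGIS paper and in Lindegren et al. (2016), the shifted Legendre polynomials were denoted L^{*}_{n}(x).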
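As a small illustration, the shifted polynomials can be evaluated from the ordinary ones using the standard relation P_{n}(2x − 1), which maps the interval [0, 1] onto [−1, 1]. The sketch below uses Bonnet's recurrence for the ordinary Legendre polynomials; the function names are illustrative.

```python
def legendre(n: int, x: float) -> float:
    """Ordinary Legendre polynomial P_n(x), evaluated with Bonnet's recurrence
    (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def shifted_legendre(n: int, x: float) -> float:
    """Shifted Legendre polynomial of degree n on the interval [0, 1]."""
    return legendre(n, 2.0 * x - 1.0)
```

For n = 2 this reproduces 6x^{2} − 6x + 1, which equals 1 at both ends of the interval and −0.5 at its midpoint.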
Window classes (WC) 0, 1, and 2 are different sampling schemes of pixels around a detected source, decided by an onboard algorithm mainly based on the brightness of the source: WC0 (for G ≲ 13) is a two-dimensional sampling, from which both the AL and AC centroid locations can be determined on ground, while WC1 (13 ≲ G ≲ 16) and WC2 (G ≳ 16) give one-dimensional arrays of 18 and 12 samples, respectively, allowing only the AL location to be determined.
Fig. A.1 Residual statistics for sources with five-parameter solutions and excess source noise equal to zero. The yellow dots show individual values of the unit weight error for the sources brighter than G = 11, and for a gradually decreasing random fraction of the fainter sources. The cyan curve is the running median.

Fig. A.2 Ratio of parallax uncertainties before (top) and after (bottom) applying the statistical correction factor F from Eq. (A.4). The yellow dots are for the individual primary sources, the cyan curve is the median, and the blue curves are the 10th and 90th percentiles. 

Fig. B.1 Magnitude distribution of sources in Gaia DR2. Grey: All sources. Blue: Sources with a full astrometric solution (five parameters). Red: Sources with a fallback solution (position only). Green: Quasar candidates from the AllWISE AGN catalogue. Magenta: VLBI sources from the ICRF3 prototype. 

Fig. B.2 Formal uncertainties versus the G magnitude for sources with a five-parameter astrometric solution. Left: semi-major axis of the error ellipse in position at epoch J2015.5. Middle: standard deviation in parallax. Right: semi-major axis of the error ellipse in proper motion. The yellow dots show individual values for a representative selection of the sources; the cyan curve is the median uncertainty and the blue curves are the 10th and 90th percentiles. The plotted sample contains all sources for G < 11, and a geometrically decreasing random fraction of the fainter sources with roughly uniform distribution in G.

Fig. B.3 Formal uncertainties at G ≃ 15 for sources with a five-parameter astrometric solution. Left: semi-major axis of the error ellipse in position at epoch J2015.5. Middle: standard deviation in parallax. Right: semi-major axis of the error ellipse in proper motion. This and all other full-sky maps in this paper use a Hammer–Aitoff projection in equatorial (ICRS) coordinates with α = δ = 0 at the centre, north up, and α increasing from right to left.
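
The map convention described in the caption of Fig. B.3 can be illustrated with the standard Hammer–Aitoff formulae. The sketch below is an illustration only, not the plotting code used for the figures; the function name `hammer_aitoff` and the choice of degrees as input units are assumptions. The x coordinate is negated so that α increases from right to left.

```python
import math


def hammer_aitoff(alpha_deg, delta_deg):
    """Project (alpha, delta) in degrees onto the Hammer-Aitoff plane.

    Returns (x, y) with alpha = delta = 0 at the origin, north up, and
    alpha increasing towards negative x (right to left on the page).
    """
    # Wrap alpha to [-180, 180) so that alpha = 0 lands at the centre.
    lon = math.radians((alpha_deg + 180.0) % 360.0 - 180.0)
    lat = math.radians(delta_deg)
    denom = math.sqrt(1.0 + math.cos(lat) * math.cos(lon / 2.0))
    x = 2.0 * math.sqrt(2.0) * math.cos(lat) * math.sin(lon / 2.0) / denom
    y = math.sqrt(2.0) * math.sin(lat) / denom
    return -x, y  # flip x so that alpha increases from right to left


centre = hammer_aitoff(0.0, 0.0)      # map centre
north_pole = hammer_aitoff(0.0, 90.0)  # top of the map
east = hammer_aitoff(90.0, 0.0)        # alpha = 90 deg lies left of centre
```

With this scaling the full sky maps onto an ellipse with semi-axes 2√2 and √2.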

Fig. B.4 Observation statistics at G ≃ 15 for sources with a five-parameter astrometric solution. These statistics are the main factors governing the formal uncertainties of the astrometric data. Left: number of visibility periods used. Middle: number of good CCD observations AL. A map of the number of used field-of-view transits is very similar, with numbers smaller by a factor of nine. Right: mean excess source noise.

Fig. B.5 Correlation coefficients at G ≃ 15 for sources with a five-parameter astrometric solution. Maps of the correlations at other magnitudes are very similar to these. Left: correlation between ϖ and μ_{α*}. Middle: correlation between ϖ and μ_{δ}. Right: correlation between μ_{α*} and μ_{δ}.

Fig. C.1 HR diagram of sources nominally within 100 pc and with relative distance error less than 10%. Left: raw diagram (Selection A). Middle: sources filtered by unit weight error (Selection B). Right: sources filtered by unit weight error and flux excess ratio (Selection C). 

Fig. C.2 Unit weight error versus magnitude for two samples. Left: selection A (positive parallaxes). Right: selection N (negative parallaxes). The black line is the threshold defined in Eq. (C.1). 

Fig. C.3 Distribution of the negative tail of normalised parallaxes. 

Fig. C.4 Distribution in equatorial (ICRS) coordinates of sources formally within 50 pc. Left: all 73 246 sources with ϖ > 20 mas. Middle: the subset of 39 478 sources satisfying Eq. (C.1). Right: the subset of 34 001 sources satisfying both Eqs. (C.1) and (C.2). 
