-
The Kepler Mission was designed to identify and characterize transiting
planets in the Kepler Field of View and to determine their occurrence rates.
Emphasis was placed on identification of Earth-size planets orbiting in the
Habitable Zone of their host stars. Science data were acquired for a period of
four years. Long-cadence data with 29.4 min sampling were obtained for ~200,000
individual stellar targets in at least one observing quarter in the primary
Kepler Mission. Light curves for target stars are extracted in the Kepler
Science Data Processing Pipeline, and are searched for transiting planet
signatures. A Threshold Crossing Event is generated in the transit search for
targets where the transit detection threshold is exceeded and transit
consistency checks are satisfied. These targets are subjected to further
scrutiny in the Data Validation (DV) component of the Pipeline. Transiting
planet candidates are characterized in DV, and light curves are searched for
additional planets after transit signatures are modeled and removed. A suite of
diagnostic tests is performed on all candidates to aid in discrimination
between genuine transiting planets and instrumental or astrophysical false
positives. Data products are generated per target and planet candidate to
document and display transiting planet model fit and diagnostic test results.
These products are exported to the Exoplanet Archive at the NASA Exoplanet
Science Institute, and are available to the community. We describe the DV
architecture and diagnostic tests, and provide a brief overview of the data
products. Transiting planet modeling and the search for multiple planets on
individual targets are described in a companion paper. The final revision of
the Kepler Pipeline code base is available to the general public through
GitHub. The Kepler Pipeline has also been modified to support the TESS Mission,
which will commence in 2018.
-
We present results of the final Kepler Data Processing Pipeline search for
transiting planet signals in the full 17-quarter primary mission data set. The
search includes a total of 198,709 stellar targets, of which 112,046 were
observed in all 17 quarters and 86,663 in fewer than 17 quarters. We report on
17,230 targets for which at least one transit signature is identified that
meets the specified detection criteria: periodicity, minimum of three observed
transit events, detection statistic (i.e., signal-to-noise ratio) in excess of
the search threshold, and passing grade on three statistical transit
consistency tests. Light curves for which a transit signal is identified are
iteratively searched for additional signatures after a limb-darkened transiting
planet model is fitted to the data and transit events are removed. The search
for additional planets adds 16,802 transit signals for a total of 34,032; this
far exceeds the number of transit signatures identified in prior pipeline runs.
There was a strategic emphasis on completeness over reliability for the final
Kepler transit search. A comparison of the transit signals against a set of
3402 well-established, high-quality Kepler Objects of Interest yields a
recovery rate of 99.8%. The high recovery rate must be weighed against a large
number of false-alarm detections. We examine characteristics of the planet
population implied by the transiting planet model fits with an emphasis on
detections that would represent small planets orbiting in the habitable zone of
their host stars.
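
The detection criteria listed above can be sketched as a simple filter. This is a minimal illustration, not the Pipeline's implementation: the `TransitSignal` class, its field names, and the sample values are hypothetical, and the 7.1 sigma default is the cut-off quoted for the three-year search, assumed here to apply. Periodicity is taken as given, since only periodic signals reach this stage.

```python
from dataclasses import dataclass


@dataclass
class TransitSignal:
    """Hypothetical summary of one candidate signal (illustrative fields)."""
    snr: float                 # detection statistic, in sigma
    n_transits: int            # number of observed transit events
    consistency_passed: tuple  # outcomes of the three consistency tests


def is_threshold_crossing(sig, snr_threshold=7.1, min_transits=3):
    """Return True when the signal satisfies every detection criterion."""
    return (
        sig.snr > snr_threshold              # statistic in excess of threshold
        and sig.n_transits >= min_transits   # at least three transit events
        and all(sig.consistency_passed)      # every consistency test passed
    )


# Made-up signals, one failing each criterion in turn.
signals = [
    TransitSignal(12.4, 5, (True, True, True)),   # passes everything
    TransitSignal(6.8, 9, (True, True, True)),    # below the threshold
    TransitSignal(15.0, 2, (True, True, True)),   # too few transits
    TransitSignal(9.3, 4, (True, False, True)),   # fails one consistency test
]
accepted = [s for s in signals if is_threshold_crossing(s)]
```

Only the first signal survives the filter; each of the others fails exactly one criterion.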
-
In the first three years of operation the Kepler mission found 3,697 planet
candidates from a set of 18,406 transit-like features detected on over 200,000
distinct stars. Vetting candidate signals manually by inspecting light curves
and other diagnostic information is a labor-intensive effort. Additionally,
this classification methodology does not yield any information about the
quality of planet candidates; every candidate is treated as equally credible.
The torrent of exoplanet discoveries will continue after Kepler as
there will be a number of exoplanet surveys that have an even broader search
area. This paper presents the application of machine-learning techniques to the
classification of exoplanet transit-like signals present in the Kepler light
curve data. Transit-like detections are transformed into a uniform set of
real-numbered attributes, the most important of which are described in this
paper. Each of the known transit-like detections is assigned a class of planet
candidate; astrophysical false positive; or systematic, instrumental noise. We
use a random forest algorithm to learn the mapping from attributes to classes
on this training set. The random forest algorithm has been used previously to
classify variable stars; this is the first time it has been used for exoplanet
classification. We are able to achieve an overall error rate of 5.85% and an
error rate for classifying exoplanet candidates of 2.81%.
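
The distinction between an overall error rate and a per-class error rate can be made concrete with a confusion matrix. The sketch below uses one common definition (fraction of misclassified examples, overall and per true class). The abstract does not say how the quoted figures were estimated, so this is an assumption; the class labels and counts are invented for illustration.

```python
def error_rates(confusion):
    """Compute error rates from confusion[true_class][predicted_class] counts.

    Returns (overall_error, per_class_error), where per_class_error maps each
    true class to the fraction of its examples that were misclassified.
    """
    total = sum(sum(row.values()) for row in confusion.values())
    correct = sum(confusion[c].get(c, 0) for c in confusion)
    overall = 1.0 - correct / total
    per_class = {
        c: 1.0 - row.get(c, 0) / sum(row.values())
        for c, row in confusion.items()
    }
    return overall, per_class


# Invented counts for the three classes named in the abstract.
confusion = {
    "planet_candidate": {"planet_candidate": 95, "astrophysical_fp": 3, "noise": 2},
    "astrophysical_fp": {"planet_candidate": 4, "astrophysical_fp": 90, "noise": 6},
    "noise": {"planet_candidate": 1, "astrophysical_fp": 9, "noise": 190},
}
overall, per_class = error_rates(confusion)
```

Because the classes differ in size, the overall rate is a weighted average of the per-class rates, which is why a single class can have a lower error rate than the sample as a whole.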
-
The Kepler mission discovered 2842 exoplanet candidates with 2 years of data.
We provide updates to the Kepler planet candidate sample based upon 3 years
(Q1-Q12) of data. Through a series of tests to exclude false positives,
primarily caused by eclipsing binary stars and instrumental systematics, 855
additional planetary candidates have been discovered, bringing the total number
known to 3697. We provide revised transit parameters and accompanying posterior
distributions based on a Markov Chain Monte Carlo algorithm for the cumulative
catalogue of Kepler Objects of Interest. There are now 130 candidates in the
cumulative catalogue that receive less than twice the flux the Earth receives
and more than 1100 have a radius less than 1.5 Earth radii. There are now a dozen
candidates meeting both criteria, roughly doubling the number of candidate
Earth analogs. A majority of planetary candidates have a high probability of
being bona fide planets; however, there are populations of likely
false positives. We discuss and suggest additional cuts that can be easily
applied to the catalogue to produce a set of planetary candidates with good
fidelity. The full catalogue is publicly available at the NASA Exoplanet
Archive.
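
The two criteria quoted above (less than twice the Earth's incident flux, and a radius below 1.5 Earth radii) amount to a simple cut on a catalogue table. The sketch below assumes a list of dictionaries with hypothetical keys `flux_earth` and `radius_earth`; these are not the actual column names of the archive, and the rows are invented.

```python
def earth_analog_candidates(candidates, max_flux=2.0, max_radius=1.5):
    """Keep candidates meeting both cuts.

    flux_earth   : incident flux in units of the flux the Earth receives
    radius_earth : planet radius in Earth radii
    (Both keys are hypothetical names chosen for this sketch.)
    """
    return [
        c for c in candidates
        if c["flux_earth"] < max_flux and c["radius_earth"] < max_radius
    ]


# Invented rows; names and values are placeholders, not catalogue entries.
catalogue = [
    {"name": "cand-A", "flux_earth": 1.1, "radius_earth": 1.2},   # meets both
    {"name": "cand-B", "flux_earth": 0.8, "radius_earth": 2.4},   # too large
    {"name": "cand-C", "flux_earth": 35.0, "radius_earth": 1.0},  # too hot
]
analogs = earth_analog_candidates(catalogue)
```

Further cuts of the kind the abstract mentions could be added as extra conditions in the same comprehension.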
-
We present the results of a search for potential transit signals in four
years of photometry data acquired by the Kepler Mission. The targets of the
search include 111,800 stars which were observed for the entire interval and
85,522 stars which were observed for a subset of the interval. We found that
9,743 targets contained at least one signal consistent with the signature of a
transiting or eclipsing object, where the criteria for detection are
periodicity of the detected transits, adequate signal-to-noise ratio, and
acceptance by a number of tests which reject false positive detections. When
targets that had produced a signal were searched repeatedly, an additional
6,542 signals were detected on 3,223 target stars, for a total of 16,285
potential detections. Comparison of the set of detected signals with a set of
known and vetted transit events in the Kepler field of view shows that the
recovery rate for these signals is 96.9%. The ensemble properties of the
detected signals are reviewed.
-
We present the results of a search for potential transit signals in the first
three years of photometry data acquired by the Kepler Mission. The targets of
the search include 112,321 targets which were observed over the full interval
and an additional 79,992 targets which were observed for a subset of the full
interval. From this set of targets we find a total of 11,087 targets which
contain at least one signal which meets the Kepler detection criteria: those
criteria are periodicity of the signal, an acceptable signal-to-noise ratio,
and three tests which reject false positives. Each target containing at least
one detected signal is then searched repeatedly for additional signals, which
represent multi-planet systems of transiting planets. When targets with
multiple detections are considered, a total of 18,406 potential transiting
planet signals are found in the Kepler Mission dataset. The detected signals
are dominated by events with relatively low signal-to-noise ratios and by
events with relatively short periods. The distribution of estimated transit
depths appears to peak in the range between 20 and 30 parts per million, with a
few detections down to fewer than 10 parts per million. The detections exhibit
signal-to-noise ratios from 7.1 sigma, which is the lower cut-off for
detections, to over 10,000 sigma, and periods ranging from 0.5 days, which is
the shortest period searched, to 525 days, which is the upper limit of
achievable periods given the length of the data set and the requirement that
all detections include at least 3 transits. The detected signals are compared
to a set of known transit events in the Kepler field of view, many of which
were identified by alternative methods; the comparison shows that the current
search recovery rate for targets with known transit events is 98.3%.
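
The 525-day upper limit follows from the three-transit requirement: three transits at period P span at least 2P, so P cannot exceed half the length of the data set. The sketch below works this through. The 1050-day span is an assumption inferred from the quoted 525-day limit, not a figure stated in the abstract, and the function name is illustrative.

```python
def max_searchable_period(span_days, min_transits=3):
    """Longest period for which min_transits events can fit in the data span.

    N transits at period P cover a baseline of (N - 1) * P, so the period
    must satisfy P <= span / (N - 1).
    """
    if min_transits < 2:
        raise ValueError("a period is only defined for two or more transits")
    return span_days / (min_transits - 1)


# Assumed span of 1050 days, inferred from the quoted 525-day limit.
limit = max_searchable_period(1050.0)
```

With the assumed span, `limit` evaluates to 525.0 days, matching the upper bound quoted above.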
-
Kepler provides light curves of 156,000 stars with unprecedented precision.
However, the raw data as they come from the spacecraft contain significant
systematic and stochastic errors. These errors, which include discontinuities,
systematic trends, and outliers, obscure the astrophysical signals in the light
curves. Correcting these errors is the task of the Presearch Data Conditioning
(PDC) module of the Kepler data analysis pipeline. The original version of PDC
in Kepler did not meet the extremely high performance requirements for the
detection of minuscule planet transits or highly accurate analysis of stellar
activity and rotation. One particular deficiency was that astrophysical
features were often removed as a side effect of removing errors. In this
paper we introduce the completely new and significantly improved version of PDC
which was implemented in Kepler SOC 8.0. This new PDC version, which utilizes a
Bayesian approach for removal of systematics, reliably corrects errors in the
light curves while at the same time preserving planet transits and other
astrophysically interesting signals. We describe the architecture and the
algorithms of this new PDC module, show typical errors encountered in Kepler
data, and illustrate the corrections using real light curve examples.
-
With the unprecedented photometric precision of the Kepler Spacecraft,
significant systematic and stochastic errors on transit signal levels are
observable in the Kepler photometric data. These errors, which include
discontinuities, outliers, systematic trends and other instrumental signatures,
obscure astrophysical signals. The Presearch Data Conditioning (PDC) module of
the Kepler data analysis pipeline tries to remove these errors while preserving
planet transits and other astrophysically interesting signals. The completely
new noise and stellar variability regime observed in Kepler data poses a
significant problem to standard cotrending methods such as SYSREM and TFA.
Variable stars are often of particular astrophysical interest so the
preservation of their signals is of significant importance to the astrophysical
community. We present a Bayesian Maximum A Posteriori (MAP) approach where a
subset of highly correlated and quiet stars is used to generate a cotrending
basis vector set which is in turn used to establish a range of "reasonable"
robust fit parameters. These robust fit parameters are then used to generate a
Bayesian Prior and a Bayesian Posterior Probability Distribution Function (PDF)
which, when maximized, finds the best fit that simultaneously removes systematic
effects while reducing the signal distortion and noise injection that commonly
afflict simple least-squares (LS) fitting. A numerical and empirical approach
is taken where the Bayesian Prior PDFs are generated from fits to the light
curve distributions themselves.
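
The difference between a least-squares fit and a MAP fit can be illustrated with a deliberately simplified case: one cotrending basis vector, Gaussian noise, and a Gaussian prior on the fit coefficient. The Gaussian prior is an assumption made for this sketch; the approach described above builds its prior PDFs empirically from the distribution of robust fit parameters. All names and numbers are illustrative.

```python
def ls_coefficient(basis, flux):
    """Least-squares coefficient for the model flux ~ theta * basis."""
    return sum(b * y for b, y in zip(basis, flux)) / sum(b * b for b in basis)


def map_coefficient(basis, flux, noise_sigma, prior_mean, prior_sigma):
    """MAP coefficient under Gaussian noise and a Gaussian prior (assumed).

    Maximizing the posterior gives a precision-weighted blend of the
    data term and the prior mean.
    """
    data_precision = sum(b * b for b in basis) / noise_sigma ** 2
    data_term = sum(b * y for b, y in zip(basis, flux)) / noise_sigma ** 2
    prior_precision = 1.0 / prior_sigma ** 2
    return (data_term + prior_mean * prior_precision) / (
        data_precision + prior_precision
    )


# Toy values: a basis vector and a light curve that follows it closely.
basis = [1.0, 2.0, 3.0, 4.0]
flux = [1.2, 1.9, 3.2, 3.9]

theta_ls = ls_coefficient(basis, flux)
# A tight prior centred on 0.5, standing in for what quiet neighbours suggest.
theta_map = map_coefficient(basis, flux, noise_sigma=1.0,
                            prior_mean=0.5, prior_sigma=0.1)
```

The MAP coefficient lies between the prior mean and the least-squares value. A tight prior keeps the fit from absorbing intrinsic stellar variability, while a very broad prior recovers the least-squares result.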