
The 21cm power spectrum (PS) has been shown to be a powerful discriminant of
reionization and cosmic dawn astrophysical parameters. However, the 21cm
tomographic signal is highly non-Gaussian. Therefore, there is additional
information which is wasted if only the PS is used for parameter recovery. Here
we showcase astrophysical parameter recovery directly from 21cm images, using
deep learning with convolutional neural networks (CNN). Using a database of 2D
images taken from 10,000 21cm lightcones (each generated from different
cosmological initial conditions), we show that a CNN is able to recover
parameters describing the first galaxies: (i) $T_{\rm vir}$, their minimum host
halo virial temperatures (or masses) capable of hosting efficient star formation;
(ii) $\zeta$, their typical ionizing efficiencies; (iii) $L_{\rm X}/{\rm SFR}$,
their typical soft-band X-ray luminosity to star formation rate; and (iv) $E_0$,
the minimum X-ray energy capable of escaping the galaxy into the IGM. For most of
their allowed ranges, $\log T_{\rm vir}$ and $\log L_{\rm X}/{\rm SFR}$ are
recovered with < 1% uncertainty, while $\zeta$ and $E_0$ are recovered within 10%
uncertainty. Our
results are roughly comparable to the accuracy obtained from Monte Carlo Markov
Chain sampling of the PS with 21CMMC for the two mock observations analyzed
previously, although we caution that we do not yet include noise and foreground
contaminants in this proof-of-concept study.

The cosmic 21 cm signal is set to revolutionise our understanding of the
early Universe, allowing us to probe the 3D temperature and ionisation
structure of the intergalactic medium (IGM). It will open a window onto the
unseen first galaxies, showing us how their UV and X-ray photons drove the
cosmic milestones of the epoch of reionisation (EoR) and epoch of heating
(EoH). To facilitate parameter inference from the 21 cm signal, we previously
developed 21CMMC: a Monte Carlo Markov Chain sampler of 3D EoR simulations.
Here we extend 21CMMC to include simultaneous modelling of the EoH, resulting
in a complete Bayesian inference framework for the astrophysics dominating the
observable epochs of the cosmic 21 cm signal. We demonstrate that
second-generation interferometers, the Hydrogen Epoch of Reionisation Array (HERA)
and Square Kilometre Array (SKA), will be able to constrain ionising and X-ray
source properties of the first galaxies with a fractional precision of order
$\sim1-10$ per cent (1$\sigma$). The ionisation history of the Universe can be
constrained to within a few percent. Using our extended framework, we quantify
the bias in EoR parameter recovery incurred by the common simplification of a
saturated spin temperature in the IGM. Depending on the extent of overlap
between the EoR and EoH, the recovered astrophysical parameters can be biased
by $\sim3-10\sigma$.

We quantify the presence of Ly\alpha\ damping wing absorption from a
partially-neutral intergalactic medium (IGM) in the spectrum of the $z=7.08$
QSO, ULAS J1120+0641. Using a Bayesian framework, we simultaneously account for
uncertainties in: (i) the intrinsic QSO emission spectrum; and (ii) the
distribution of cosmic HI patches during the epoch of reionisation (EoR). For
(i) we use a new intrinsic Ly\alpha\ emission line reconstruction method (Greig
et al.), sampling a covariance matrix of emission line properties built from a
large database of moderate-$z$ QSOs. For (ii), we use the Evolution of 21cm
Structure (EOS; Mesinger et al.) simulations, which span a range of
physically-motivated EoR models. We find strong evidence for the presence of
damping wing absorption redward of Ly\alpha\ (where there is no contamination
from the Ly\alpha\ forest). Our analysis implies that the EoR is not yet
complete by $z=7.1$, with the volume-weighted IGM neutral fraction constrained
to $\bar{x}_{\rm H\,{\scriptsize I}} = 0.40\substack{+0.21 \\ -0.19}$ at $1\sigma$
($\bar{x}_{\rm H\,{\scriptsize I}} = 0.40\substack{+0.41 \\ -0.32}$ at $2\sigma$).
This result is insensitive to the EoR morphology. Our detection of significant
neutral HI in the IGM at $z=7.1$ is consistent with the latest Planck 2016
measurements of the CMB Thomson scattering optical depth (Planck Collaboration
XLVII).

Using a Bayesian framework, we quantify what current observations imply about
the history of the epoch of reionisation (EoR). We use a popular,
three-parameter EoR model, flexible enough to accommodate a wide range of
physically-plausible reionisation histories. We study the impact of various EoR
observations: (i) the optical depth to the CMB measured by Planck 2016; (ii)
the dark fraction in the Lyman $\alpha$ and $\beta$ forests; (iii) the redshift
evolution of galactic Ly$\alpha$ emission (so-called "Ly$\alpha$ fraction");
(iv) the clustering of Ly$\alpha$ emitters; (v) the IGM damping wing imprint in
the spectrum of QSO ULAS J1120+0641; and (vi) the patchy kinetic
Sunyaev-Zel'dovich signal. Combined, (i) and (ii) already place interesting
constraints on the reionisation history, with the epochs corresponding to an
average neutral fraction of (75, 50, 25) per cent, constrained at 1$\sigma$ to
$z= (9.21\substack{+1.22 \\ -1.15}, 8.14\substack{+1.08 \\ -1.00},
7.26\substack{+1.13 \\ -0.96})$. Folding-in more model-dependent EoR observations
[(iii-vi)] strengthens these constraints by tens of per cent, at the cost of
a decrease in the likelihood of the best-fit model, driven mostly by (iii). The
tightest constraints come from (v). Unfortunately, no current observational set
is sufficient to break degeneracies and constrain the astrophysical EoR
parameters. However, model-dependent priors on the EoR parameters themselves
can be used to set tight limits by excluding regions of parameter space with
strong degeneracies. Motivated by recent observations of $z\sim7$ faint, lensed
galaxies, we show how a conservative upper limit on the virial temperature of
haloes which host reionising galaxies can constrain the escape fraction of
ionising photons to $f_{\rm esc} = 0.14\substack{+0.26 \\ -0.09}$.

Foreground power dominates the measurements of interferometers that seek a
statistical detection of highly-redshifted HI emission from the Epoch of
Reionization (EoR). The inherent spectral smoothness of synchrotron radiation,
the dominant foreground emission mechanism, and the chromaticity of the
instrument allow these experiments to delineate a boundary between spectrally
smooth and structured emission in Fourier space (the "wedge" or "pitchfork",
and the "EoR Window", respectively). Faraday rotation can inject spectral
structure into otherwise smooth polarized foreground emission, which through
instrument effects or miscalibration could possibly pollute the EoR Window.
Using data from the Hydrogen Epoch of Reionization Array (HERA) 19-element
commissioning array, we investigate the polarization response of this new
instrument in the power spectrum domain. We confirm the expected structure of
foreground emission in Fourier space predicted by Thyagarajan et al. (2015a,
2016) for a HERA-type dish, and detect polarized power within the pitchfork.
Using simulations of the polarized response of HERA feeds, we find that almost
all of the power in Stokes Q, U and V can be attributed to instrumental leakage
effects. Power consistent with noise in the EoR window suggests a negligible
amount of spectrally-structured polarized power, to the noise levels attained.
This lends confidence to deep integrations with HERA in the future, but with a
lower noise floor these future studies will also have to investigate their
polarized response.

We extend 21CMMC, a Monte Carlo Markov Chain sampler of 3D reionisation
simulations, to perform parameter estimation directly on 3D lightcones of the
cosmic 21cm signal. This brings theoretical analysis closer to the tomographic
21cm observations achievable with next-generation interferometers like HERA
and the SKA. Parameter recovery can therefore account for modes which evolve
with redshift/frequency. Additionally, simulated data can be more easily
corrupted to resemble real data. Using the lightcone version of 21CMMC, we
quantify the biases in the recovered astrophysical parameters if we use the
21cm power spectrum from the co-evolution approximation to fit a 3D lightcone
mock observation. While ignoring the lightcone effect under most assumptions
will not significantly bias the recovered astrophysical parameters, it can lead
to an underestimation of the associated uncertainty. However, significant biases
($\sim$few - 10 $\sigma$) can occur if the 21cm signal evolves rapidly (i.e.
the epochs of reionisation and heating overlap significantly) and: (i)
foreground removal is very efficient, allowing large physical scales
($k\lesssim0.1$~Mpc$^{-1}$) to be used in the analysis or (ii) theoretical
modelling is accurate to within $\sim10$ per cent in the power spectrum
amplitude.

Current and upcoming radio interferometric experiments are aiming to make a
statistical characterization of the high-redshift 21cm fluctuation signal
spanning the hydrogen reionization and X-ray heating epochs of the universe.
However, connecting 21cm statistics to underlying physical parameters is
complicated by the theoretical challenge of modeling the relevant physics at
computational speeds quick enough to enable exploration of the high-dimensional
and weakly constrained parameter space. In this work, we use machine learning
algorithms to build a fast emulator that mimics expensive simulations of the
21cm signal across a wide parameter space to high precision. We embed our
emulator within a Markov Chain Monte Carlo framework, enabling it to explore
the posterior distribution over a large number of model parameters, including
those that govern the Epoch of Reionization, the Epoch of X-ray Heating, and
cosmology. As a worked example, we use our emulator to present an updated
parameter constraint forecast for the Hydrogen Epoch of Reionization Array
experiment, showing that its characterization of a fiducial 21cm power spectrum
will considerably narrow the allowed parameter space of reionization and
heating parameters, and could help strengthen Planck's constraints on
$\sigma_8$. We provide both our generalized emulator code and its
implementation specifically for 21cm parameter constraints as publicly
available software.
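
As a rough illustration of the emulator idea, the sketch below trains on a grid of calls to a slow function and then answers new queries by interpolation. It is a toy under stated assumptions: `expensive_simulation`, `build_emulator` and the one-dimensional parameter `theta` are hypothetical names, and piecewise-linear interpolation stands in for the machine learning regression described above.

```python
import bisect
import math


def expensive_simulation(theta):
    """Stand-in for a slow 21cm simulation (hypothetical toy function)."""
    return math.exp(-0.5 * theta ** 2) * (1.0 + 0.3 * theta)


def build_emulator(simulator, grid):
    """Run the simulator once per training point; return a fast interpolator."""
    xs = sorted(grid)
    ys = [simulator(x) for x in xs]

    def emulate(theta):
        if not xs[0] <= theta <= xs[-1]:
            raise ValueError("theta outside the training range")
        # Index of the right-hand training point bracketing theta.
        i = min(max(bisect.bisect_right(xs, theta), 1), len(xs) - 1)
        x0, x1 = xs[i - 1], xs[i]
        w = (theta - x0) / (x1 - x0)
        return (1.0 - w) * ys[i - 1] + w * ys[i]

    return emulate


# 61 training points between -3 and 3 (an arbitrary choice for this toy).
training_grid = [-3.0 + 0.1 * i for i in range(61)]
emulator = build_emulator(expensive_simulation, training_grid)
```

Once built, `emulator(theta)` costs a table lookup rather than a simulation run, which is what makes it affordable to call inside a sampler.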

The experimental efforts to detect the redshifted 21 cm signal from the Epoch
of Reionization (EoR) are limited predominantly by chromatic instrumental
systematic effects. The delay spectrum methodology for 21 cm power spectrum
measurements brought new attention to the critical impact of an antenna's
chromaticity on the viability of making this measurement. This methodology
established a straightforward relationship between the time-domain response of an
instrument and the power spectrum modes accessible to a 21 cm EoR experiment.
We examine the performance of a prototype of the Hydrogen Epoch of Reionization
Array (HERA) array element that is currently observing in the Karoo desert, South
Africa. We present a mathematical framework to derive the beam-integrated
frequency response of a HERA prototype element in reception from the return
loss measurements between 100 and 200 MHz and determine the extent of additional
foreground contamination in the delay space. The measurement reveals excess
spectral structures in comparison to the simulation studies of the HERA
element. Combined with the HERA data analysis pipeline that incorporates
inverse covariance weighting in optimal quadratic estimation of power spectrum,
we find that in spite of its departure from the simulated response, the HERA
prototype element satisfies the necessary criteria posed by the foreground
attenuation limits and potentially can measure the power spectrum at spatial
modes as low as $k_{\parallel} > 0.1h$~Mpc$^{-1}$. The work highlights a
straightforward method for directly measuring an instrument response and
assessing its impact on 21 cm EoR power spectrum measurements for future
experiments that will use reflector-type antennas.
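
The link between an instrument's time-domain response and the accessible power spectrum modes can be sketched with the standard delay-to-wavenumber conversion, $k_\parallel = 2\pi\,\tau\,\nu_{21} H(z) / [c\,(1+z)^2]$. The snippet below is an illustration only: the flat LCDM parameters and the function names are assumptions, not values taken from the work above.

```python
import math

C_KM_S = 299792.458       # speed of light [km/s]
NU_21_HZ = 1420.40575e6   # rest frequency of the 21 cm line [Hz]


def hubble(z, h0=67.7, omega_m=0.31):
    """H(z) in km/s/Mpc for a flat LCDM cosmology (assumed parameters)."""
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def delay_to_kpar(tau_s, freq_hz, h0=67.7, omega_m=0.31):
    """Line-of-sight wavenumber [Mpc^-1] probed by a delay tau_s [s]
    at observing frequency freq_hz [Hz]."""
    z = NU_21_HZ / freq_hz - 1.0
    return (2.0 * math.pi * tau_s * NU_21_HZ * hubble(z, h0, omega_m)
            / (C_KM_S * (1.0 + z) ** 2))


# Example: a 100 ns delay at 150 MHz (z of roughly 8.5).
k_100ns = delay_to_kpar(100e-9, 150e6)
```

Because the mapping is linear in delay, any instrumental reflection that persists to longer delays contaminates proportionally higher $k_\parallel$ modes.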

We introduce an intrinsic Ly\alpha\ emission line profile reconstruction
method for high-$z$ quasars (QSOs). This approach utilises a covariance matrix
of emission line properties obtained from a large, moderate-$z$ ($2 \leq z \leq
2.5$), high signal-to-noise (S/N > 15) sample of BOSS QSOs. For each QSO, we
complete a Monte Carlo Markov Chain fitting of the continuum and emission line
properties and perform a visual quality assessment to construct a large
database of robustly fit spectra. With this dataset, we construct a covariance
matrix to describe the correlations between the high ionisation emission lines
Ly\alpha, C IV, Si IV + O IV] and C III], and find it to be well approximated
by an $N$-dimensional Gaussian distribution. This covariance matrix
characterises the correlations between the line width, peak height and velocity
offset from systemic while also allowing for the existence of broad and narrow
line components for Ly\alpha\ and C IV. We illustrate how this covariance
matrix allows us to statistically characterise the intrinsic Ly\alpha\ line
solely from the observed spectrum redward of 1275\AA. This procedure can be
used to reconstruct the intrinsic Ly\alpha\ line emission profile in cases
where Ly\alpha\ may otherwise be obscured. Applying this reconstruction method
to our sample of QSOs, we recover the Ly\alpha\ line flux to within 15 per
cent of the measured flux at 1205\AA\ (1220\AA) ~85 (90) per cent of the time.

We study the redshift evolution of the quasar UV Luminosity Function (LF) for
0.5 < z < 6.5, by collecting the most up-to-date observational data and, in
particular, the recently discovered population of faint AGNs. We fit the QSO LF
using either a double power-law or a Schechter function, finding that both
forms provide good fits to the data. We derive empirical relations for the LF
parameters as a function of redshift and, based on these results, predict the
quasar UV LF at z=8. From the inferred LF evolution, we compute the redshift
evolution of the QSO/AGN comoving ionizing emissivity and hydrogen
photoionization rate. If faint AGNs are included, the contribution of quasars
to reionization increases substantially. However, their level of contribution
critically depends on the detailed shape of the QSO LF, which can be
constrained by efficient searches of high-z quasars. To this aim, we predict
the expected (i) number of z>6 quasars detectable by ongoing and future NIR
surveys (such as EUCLID and WFIRST), and (ii) number counts for a single
radio-recombination line observation with SKA-MID (FoV = 0.49 deg^2) as a
function of the Hnalpha flux density, at 0<z<8. These surveys (even at z<6)
will be fundamental to better constrain the role of quasars as reionization
sources.
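
For reference, the two functional forms named above can be written per unit absolute magnitude as in the sketch below. These are the standard textbook expressions; the parameter values in the example are purely illustrative placeholders, not the fitted values from this work.

```python
import math


def double_power_law(mag, phi_star, m_star, alpha, beta):
    """Double power-law LF per unit magnitude; alpha and beta are the
    faint-end and bright-end slopes."""
    dm = mag - m_star
    return phi_star / (10 ** (0.4 * (alpha + 1) * dm)
                       + 10 ** (0.4 * (beta + 1) * dm))


def schechter(mag, phi_star, m_star, alpha):
    """Schechter LF per unit magnitude with faint-end slope alpha."""
    x = 10 ** (-0.4 * (mag - m_star))  # luminosity in units of L*
    return 0.4 * math.log(10) * phi_star * x ** (alpha + 1) * math.exp(-x)


# Illustrative (hypothetical) parameters, evaluated at the break magnitude.
phi_dpl = double_power_law(-25.0, phi_star=1e-6, m_star=-25.0, alpha=-1.5, beta=-3.0)
phi_sch = schechter(-25.0, phi_star=1e-6, m_star=-25.0, alpha=-1.5)
```

The two forms differ mainly at the bright end, where the Schechter function cuts off exponentially while the double power law declines with slope `beta`.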

The Hydrogen Epoch of Reionization Array (HERA) is a staged experiment to
measure 21 cm emission from the primordial intergalactic medium (IGM)
throughout cosmic reionization ($z=6-12$), and to explore earlier epochs of our
Cosmic Dawn ($z\sim30$). During these epochs, early stars and black holes
heated and ionized the IGM, introducing fluctuations in 21 cm emission. HERA is
designed to characterize the evolution of the 21 cm power spectrum to constrain
the timing and morphology of reionization, the properties of the first
galaxies, the evolution of large-scale structure, and the early sources of
heating. The full HERA instrument will be a 350-element interferometer in South
Africa consisting of 14-m parabolic dishes observing from 50 to 250 MHz.
Currently, 19 dishes have been deployed on site and the next 18 are under
construction. HERA has been designated as an SKA Precursor instrument.
In this paper, we summarize HERA's scientific context and provide forecasts
for its key science results. After reviewing the current state of the art in
foreground mitigation, we use the delay-spectrum technique to motivate
high-level performance requirements for the HERA instrument. Next, we present
the HERA instrument design, along with the subsystem specifications that ensure
that HERA meets its performance requirements. Finally, we summarize the
schedule and status of the project. We conclude by suggesting that, given the
realities of foreground contamination, current-generation 21 cm instruments are
approaching their sensitivity limits. HERA is designed to bring both the
sensitivity and the precision to deliver its primary science on the basis of
proven foreground filtering techniques, while developing new subtraction
techniques to unlock new capabilities. The result will be a major step toward
realizing the widely recognized scientific potential of 21 cm cosmology.
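
The correspondence between observing frequency and redshift quoted above follows from the redshifting of the 21 cm rest frequency, $\nu_{\rm obs} = \nu_{21}/(1+z)$. A minimal sketch (the function names are our own, chosen for illustration):

```python
NU_21_MHZ = 1420.40575  # rest frequency of the 21 cm line [MHz]


def redshift_from_frequency(freq_mhz):
    """Redshift at which the 21 cm line is observed at freq_mhz."""
    return NU_21_MHZ / freq_mhz - 1.0


def frequency_from_redshift(z):
    """Observed frequency [MHz] of the 21 cm line emitted at redshift z."""
    return NU_21_MHZ / (1.0 + z)


# Redshifts spanned by a 50 to 250 MHz band.
z_low_freq = redshift_from_frequency(50.0)
z_high_freq = redshift_from_frequency(250.0)
```

For a 50 to 250 MHz band this gives redshifts from roughly 27 down to roughly 4.7, which brackets the reionization range $z=6-12$ (about 203 to 109 MHz).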

We present upper limits on the 21 cm power spectrum at $z = 5.9$ calculated
from the model-independent limit on the neutral fraction of the intergalactic
medium of $x_{\rm H{\small I }} < 0.06 + 0.05\ (1\sigma)$ derived from dark
pixel statistics of quasar absorption spectra. Using 21CMMC, a Markov chain
Monte Carlo Epoch of Reionization analysis code, we explore the probability
distribution of 21 cm power spectra consistent with this constraint on the
neutral fraction. We present 99 per cent confidence upper limits of
$\Delta^2(k) < 10$ to $20\ {\rm mK}^2$ over a range of $k$ from 0.5 to $2.0\
h\,{\rm Mpc}^{-1}$, with the exact limit dependent on the sampled $k$ mode. This
limit can be used as a null test for 21 cm experiments: a detection of power at
$z=5.9$ in excess of this value is highly suggestive of residual foreground
contamination or other systematic errors affecting the analysis.

We introduce the Evolution of 21cm Structure (EOS) project: providing
periodic, public releases of the latest cosmological 21cm simulations. 21cm
interferometry is set to revolutionize studies of the Cosmic Dawn (CD) and
epoch of reionization (EoR), eventually resulting in 3D maps of the first
billion years of our Universe. Progress will depend on sophisticated data
analysis pipelines, which are in turn tested on large-scale mock observations.
Here we present the 2016 EOS data release, consisting of the largest (1.6 Gpc
on side with a 1024^3 grid), public 21cm simulations of the CD and EoR. We
include calibrated, sub-grid prescriptions for inhomogeneous recombinations and
photoheating suppression of star formation in small mass galaxies. We present
two simulation runs that approximately bracket the contribution from faint
unseen galaxies. From these two extremes, we predict that the duration of
reionization (defined as a change in the mean neutral fraction from 0.9 to 0.1)
should be in the range 2.7 < Delta z < 5.7. The large-scale 21cm power during the
advanced EoR stages can be different by up to a factor of ~10, depending on the
model. This difference has a comparable contribution from: (i) the typical bias
of sources; and (ii) a more efficient negative feedback in models with an
extended EoR driven by faint galaxies. We also make detectability forecasts.
With a 1000h integration, HERA and SKA1-low should achieve a signal-to-noise of
~few-hundreds throughout the EoR/CD, while in the maximally optimistic scenario
of perfect foreground cleaning, all instruments should make a statistical
detection of the cosmic signal. We also caution that our ability to clean
foregrounds determines the relative performance of narrow/deep vs. wide/shallow
surveys expected with SKA1. Our 21cm power spectra, simulation outputs and
visualizations are publicly available.
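
The duration of reionization as defined above (the redshift interval over which the mean neutral fraction drops from 0.9 to 0.1) can be read off a tabulated history by interpolation. The sketch below assumes a hypothetical, coarsely sampled toy history; it is not an EOS output.

```python
def redshift_at_neutral_fraction(zs, xhi, target):
    """Linearly interpolate the redshift at which x_HI crosses `target`."""
    points = list(zip(zs, xhi))
    for (z0, x0), (z1, x1) in zip(points, points[1:]):
        if x0 != x1 and (x0 - target) * (x1 - target) <= 0:
            return z0 + (target - x0) * (z1 - z0) / (x1 - x0)
    raise ValueError("history never crosses the target neutral fraction")


def reionization_duration(zs, xhi):
    """Delta z between mean neutral fractions of 0.9 and 0.1."""
    return (redshift_at_neutral_fraction(zs, xhi, 0.9)
            - redshift_at_neutral_fraction(zs, xhi, 0.1))


# Toy reionization history (hypothetical values for illustration only).
toy_z = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
toy_xhi = [0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0]
delta_z = reionization_duration(toy_z, toy_xhi)
```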

Interferometry of the cosmic 21cm signal is set to revolutionize our
understanding of the Epoch of Reionization (EoR), eventually providing 3D maps
of the early Universe. Initial detections, however, will be low signal-to-noise,
limited by systematics. To confirm a putative 21cm detection, and check the
accuracy of 21cm data analysis pipelines, it would be very useful to
cross-correlate against a genuine cosmological signal. The most promising
cosmological signals are wide-field maps of Lyman alpha emitting galaxies
(LAEs), expected from the Subaru Hyper-Suprime Cam (HSC) Ultra-Deep field. Here
we present estimates of the correlation between LAE maps at z~7 and the 21cm
signal observed by both the Low Frequency Array (LOFAR) and the planned Square
Kilometer Array Phase 1 (SKA1). We adopt a systematic approach, varying both:
(i) the prescription of assigning LAEs to host halos; and (ii) the large-scale
structure of neutral and ionized regions (i.e. EoR morphology). We find that
the LAE-21cm cross-correlation is insensitive to (i), thus making it a robust
probe of the EoR. A 1000h observation with LOFAR would be sufficient to
discriminate at >1 standard deviation a fully ionized Universe from one with a
mean neutral fraction of xHI~0.50, using the LAE-21cm cross-correlation
function on scales of R~3-10 Mpc. Unlike LOFAR, whose detection of the LAE-21cm
cross-correlation is limited by noise, SKA1 is mostly limited by ignorance of
the EoR morphology. However, the planned 100h wide-field SKA1-Low survey will
be sufficient to discriminate an ionized Universe from one with xHI~0.25, even
with maximally pessimistic assumptions.

We compute robust lower limits on the spin temperature, $T_{\rm S}$, of the
$z=8.4$ intergalactic medium (IGM), implied by the upper limits on the 21cm
power spectrum recently measured by PAPER-64. Unlike previous studies which
used a single epoch of reionization (EoR) model, our approach samples a large
parameter space of EoR models: the dominant uncertainty when estimating
constraints on $T_{\rm S}$. Allowing $T_{\rm S}$ to be a free parameter and
marginalizing over EoR parameters in our Markov Chain Monte Carlo code 21CMMC,
we infer $T_{\rm S}\ge3 {\rm K}$ (corresponding approximately to $1\sigma$) for
a mean IGM neutral fraction of $\bar{x}_{\rm H{\scriptsize I}}\gtrsim0.1$. We
further improve on these limits by folding-in additional EoR constraints based
on: (i) the dark fraction in QSO spectra, which implies a strict upper limit of
$\bar{x}_{\rm H{\scriptsize I}}[z=5.9]\leq 0.06+0.05 \,(1\sigma)$; and (ii) the
electron scattering optical depth, $\tau_{\rm e}=0.066\pm0.016\,(1\sigma)$
measured by the Planck satellite. By restricting the allowed EoR models, these
additional observations tighten the approximate $1\sigma$ lower limits on the
spin temperature to $T_{\rm S} \ge 6$ K. Thus, even such preliminary 21cm
observations begin to rule out extreme scenarios such as `cold reionization',
implying at least some prior heating of the IGM. The analysis framework
developed here can be applied to upcoming 21cm observations, thereby providing
unique insights into the sources which heated and subsequently reionized the
very early Universe.
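
To see why upper limits on the 21cm power translate into lower limits on $T_{\rm S}$, it helps to evaluate the commonly used approximation for the 21cm brightness temperature offset, $\delta T_{\rm b} \approx 27\,x_{\rm HI}(1+\delta)(1 - T_{\rm CMB}/T_{\rm S})\,[(1+z)/10 \times 0.15/(\Omega_{\rm m}h^2)]^{1/2}\,(\Omega_{\rm b}h^2/0.023)$ mK. The sketch below is illustrative: the cosmological parameters are assumed, and peculiar velocity terms are neglected.

```python
import math


def brightness_temperature_mk(z, t_spin_k, x_hi=1.0, delta=0.0,
                              omega_m_h2=0.14, omega_b_h2=0.022):
    """Approximate 21cm brightness temperature offset in mK.

    Cosmological parameters are assumed values; velocity gradients are ignored.
    """
    t_cmb = 2.725 * (1.0 + z)  # CMB temperature at redshift z [K]
    return (27.0 * x_hi * (1.0 + delta) * (1.0 - t_cmb / t_spin_k)
            * math.sqrt((1.0 + z) / 10.0 * 0.15 / omega_m_h2)
            * (omega_b_h2 / 0.023))


# A cold IGM gives a strongly negative signal; a saturated spin temperature
# (T_S much larger than T_CMB) gives a modest positive one.
dtb_cold = brightness_temperature_mk(8.4, t_spin_k=3.0)
dtb_warm = brightness_temperature_mk(8.4, t_spin_k=6.0)
dtb_saturated = brightness_temperature_mk(8.4, t_spin_k=1e6)
```

Because the contrast scales as $(1 - T_{\rm CMB}/T_{\rm S})$, a very cold IGM boosts the signal amplitude well above the saturated case, so a sufficiently deep non-detection disfavours low spin temperatures.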

With the first phase of the Square Kilometre Array (SKA1) entering into its
final pre-construction phase, we investigate how best to maximise its
scientific return. Specifically, we focus on the statistical measurement of the
21 cm power spectrum (PS) from the epoch of reionization (EoR) using the low
frequency array, SKA1-low. To facilitate this investigation we use the recently
developed MCMC-based EoR analysis tool 21CMMC (Greig & Mesinger). In light of
the recent 50 per cent cost reduction, we consider several different SKA core
baseline designs, changing: (i) the number of antenna stations; (ii) the number
of dipoles per station; and also (iii) the distribution of baseline lengths. We
find that a design with a reduced number of dipoles per core station (increased
field of view and total number of core stations), together with shortened
baselines, maximises the recovered EoR signal. With this optimal baseline
design, we investigate three observing strategies, analysing the trade-off
between lowering the instrumental thermal noise and increasing the field of
view. SKA1-low intends to perform a three-tiered observing approach, including
a deep 100 deg$^{2}$ at 1000 h, a medium-deep 1000 deg$^{2}$ at 100 h and a
shallow 10,000 deg$^{2}$ at 10 h survey. We find that the three observing
strategies result in comparable ($\lesssim$ per cent) constraints on our EoR
astrophysical parameters. This is contrary to naive predictions based purely on
the total signal-to-noise, thus highlighting the need to use EoR parameter
constraints as a figure of merit, in order to maximise scientific returns with
next-generation interferometers.

We introduce 21CMMC: a parallelized, Monte Carlo Markov Chain analysis tool,
incorporating the epoch of reionization (EoR) semi-numerical simulation
21CMFAST. 21CMMC estimates astrophysical parameter constraints from 21 cm EoR
experiments, accommodating a variety of EoR models, as well as priors on model
parameters and the reionization history. To illustrate its utility, we consider
two different EoR scenarios, one with a single population of galaxies (with a
mass-independent ionizing efficiency) and a second, more general model with two
different, feedback-regulated populations (each with mass-dependent ionizing
efficiencies). As an example, combining three observations (z=8, 9 and 10) of
the 21 cm power spectrum with a conservative noise estimate and uniform model
priors, we find that interferometers with specifications like the Low Frequency
Array/Hydrogen Epoch of Reionization Array (HERA)/Square Kilometre Array 1
(SKA1) can constrain common reionization parameters: the ionizing efficiency
(or similarly the escape fraction), the mean free path of ionizing photons and
the log of the minimum virial temperature of star-forming haloes to within
45.3/22.0/16.7, 33.5/18.4/17.8 and 6.3/3.3/2.4 per cent, ~$1\sigma$ fractional
uncertainty, respectively. Instead, if we optimistically assume that we can
perfectly characterize the EoR modelling uncertainties, we can improve on these
constraints by up to a factor of ~few. Similarly, the fractional uncertainty on
the average neutral fraction can be constrained to within $\lesssim10$ per cent
for HERA and SKA1. By studying the resulting impact on astrophysical
constraints, 21CMMC can be used to optimize (i) interferometer designs; (ii)
foreground cleaning algorithms; (iii) observing strategies; (iv) alternative
statistics characterizing the 21 cm signal; and (v) synergies with other
observational programs.
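
The sampling step at the heart of such a tool can be sketched with a plain Metropolis algorithm. The example below is a toy under loudly stated assumptions: `toy_model` is a hypothetical one-parameter stand-in for a simulated power spectrum, the mock data are noise-free, and a single serial chain replaces the parallelized sampler used by 21CMMC itself.

```python
import math
import random

K_MODES = [0.1, 0.2, 0.5, 1.0]  # hypothetical wavenumbers
SIGMA = 2.0                     # assumed per-mode uncertainty
ZETA_TRUE = 30.0                # parameter value used for the mock data


def toy_model(zeta):
    """Hypothetical stand-in for a simulated 21 cm power spectrum."""
    return [zeta / (1.0 + k) for k in K_MODES]


MOCK_DATA = toy_model(ZETA_TRUE)


def log_posterior(zeta):
    """Gaussian likelihood with a flat prior on 5 <= zeta <= 100."""
    if not 5.0 <= zeta <= 100.0:
        return -math.inf
    chi2 = sum(((d - m) / SIGMA) ** 2 for d, m in zip(MOCK_DATA, toy_model(zeta)))
    return -0.5 * chi2


def metropolis(log_post, theta0, step, n_steps, rng):
    """Random-walk Metropolis sampler for a single parameter."""
    chain = [theta0]
    lp = log_post(theta0)
    for _ in range(n_steps):
        proposal = chain[-1] + rng.gauss(0.0, step)
        lp_prop = log_post(proposal)
        # Accept uphill moves always, downhill moves with probability exp(dlogP).
        if lp_prop >= lp or rng.random() < math.exp(lp_prop - lp):
            chain.append(proposal)
            lp = lp_prop
        else:
            chain.append(chain[-1])
    return chain


rng = random.Random(1)
chain = metropolis(log_posterior, theta0=50.0, step=1.5, n_steps=20000, rng=rng)
samples = chain[2000:]  # discard burn-in
posterior_mean = sum(samples) / len(samples)
posterior_std = math.sqrt(sum((s - posterior_mean) ** 2 for s in samples) / len(samples))
```

The ratio `posterior_std / posterior_mean` is the kind of fractional uncertainty quoted above; in a realistic setting each call to the model would be a full simulation, which is why parallelization matters.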

We develop a semi-analytic method for assessing the impact of the large-scale
IGM temperature fluctuations expected following He${\rm\,{\scriptstyle II}}$
reionization on three-dimensional clustering measurements of the Ly$\alpha$
forest. Our methodology builds upon the existing large volume, mock Ly$\alpha$
forest survey simulations presented by Greig et al. by including a prescription
for a spatially inhomogeneous ionizing background, temperature fluctuations
induced by patchy He${\rm\,{\scriptstyle II}}$ photoheating and the clustering
of quasars. This approach enables us to achieve a dynamic range within our
semi-analytic model substantially larger than currently feasible with
computationally expensive, fully numerical simulations. The results agree well
with existing numerical simulations, with large-scale temperature fluctuations
introducing a scale-dependent increase in the spherically averaged 3D
Ly$\alpha$ forest power spectrum of up to 20-30 per cent at wavenumbers
$k\sim0.02$ Mpc$^{-1}$. Although these large-scale thermal fluctuations will
not substantially impact upon the recovery of the baryon acoustic oscillation
scale from existing and forthcoming dark energy spectroscopic surveys, any
complete forward modelling of the broadband term in the Ly$\alpha$ correlation
function will none the less require their inclusion.

The Square Kilometre Array (SKA) will offer an unprecedented view onto the
early Universe, using interferometric observations of the redshifted 21cm line.
The 21cm line probes the thermal and ionization state of the cosmic gas, which
is governed by the birth and evolution of the first structures in our Universe.
Here we show how the evolution of the 21cm signal will allow us to study when
the first generations of galaxies appeared, what were their properties, and
what was the structure of the intergalactic medium. We highlight qualitative
trends which will offer robust insights into the early Universe.

Large surveys for Lyman-alpha emitting (LAE) galaxies have been proposed as a
new method for measuring clustering of the galaxy population at high redshift
with the goal of determining cosmological parameters. However, Lyman-alpha
radiative transfer effects may modify the observed clustering of LAE galaxies
in a way that mimics gravitational effects, potentially reducing the precision
of cosmological constraints. For example, the effect of the linear
redshift-space distortion on the power spectrum of LAE galaxies is potentially
degenerate with Lyman-alpha radiative transfer effects owing to the dependence
of observed flux on intergalactic medium velocity gradients. In this paper, we
show that the three-point function (bispectrum) can distinguish between
gravitational and non-gravitational effects, and thus breaks these
degeneracies, making it possible to recover cosmological parameters from LAE
galaxy surveys. Constraints on the angular diameter distance and the Hubble
expansion rate can also be improved by combining power spectrum and bispectrum
measurements.

High redshift measurements of the baryonic acoustic oscillation scale (BAO)
from large Ly-alpha forest surveys represent the next frontier of dark energy
studies. As part of this effort, efficient simulations of the BAO signature
from the Ly-alpha forest will be required. We construct a model for producing
fast, large-volume simulations of the Ly-alpha forest for this purpose.
Utilising a calibrated semi-analytic approach, we are able to run very large
simulations in 1 Gpc^3 volumes which fully resolve the Jeans scale in less than
a day on a desktop PC using a GPU-enabled version of our code. The Ly-alpha
forest spectra extracted from our semi-analytical simulations are in excellent
agreement with those obtained from a fully hydrodynamical reference simulation.
Furthermore, we find our simulated data are in broad agreement with
observational measurements of the flux probability distribution and 1D flux
power spectrum. We are able to correctly recover the input BAO scale from the
3D Ly-alpha flux power spectrum measured from our simulated data, and estimate
that a BOSS-like 10^4 deg^2 survey with ~15 background sources per square
degree and a signal-to-noise of ~5 per pixel should achieve a measurement of
the BAO scale to within ~1.4 per cent. We also use our simulations to provide
simple power-law expressions for estimating the fractional error on the BAO
scale on varying the signal-to-noise and the number density of background
sources. The speed and flexibility of our approach are well suited for exploring
parameter space and the impact of observational and astrophysical systematics
on the recovery of the BAO signature from forthcoming large scale spectroscopic
surveys.