
With the advancement of treatment modalities in radiation therapy for cancer
patients, outcomes have improved, but at the cost of increased treatment plan
complexity and planning time. The accurate prediction of dose distributions
would alleviate this issue by guiding clinical plan optimization to save time
and maintain high-quality plans. We have modified a convolutional deep network
model, U-net (originally designed for segmentation purposes), for predicting
dose from patient image contours of the planning target volume (PTV) and organs
at risk (OAR). We show that, as an example, we are able to accurately predict
the dose of intensity-modulated radiation therapy (IMRT) for prostate cancer
patients, where the average Dice similarity coefficient is 0.91 when comparing
the predicted vs. true isodose volumes between 0% and 100% of the prescription
dose. The average absolute differences in [max, mean] dose are found to be
within about 5% of the prescription dose; for each structure they are [1.80%,
1.03%] (PTV), [1.94%, 4.22%] (Bladder), [1.80%, 0.48%] (Body), [3.87%, 1.79%]
(L Femoral Head), [5.07%, 2.55%] (R Femoral Head), and [1.26%, 1.62%] (Rectum)
of the prescription dose. We thus managed to map a desired
radiation dose distribution from a patient's PTV and OAR contours. As an
additional advantage, relatively little data was used in the techniques and
models described in this paper.
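
The isodose-volume comparison reported above can be illustrated with a minimal
sketch. It assumes that an isodose volume is the set of voxels receiving at
least a given fraction of the prescription dose and that the reported figure is
a plain average of Dice coefficients over several such levels; the function
names, the flattened voxel lists, and the toy numbers are hypothetical and not
taken from the paper.

```python
def isodose_mask(dose, prescription, level):
    """Voxels receiving at least `level` (a fraction) of the prescription dose."""
    return [d >= level * prescription for d in dose]


def dice(mask_a, mask_b):
    """Dice similarity coefficient of two boolean masks of equal length."""
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(mask_a) + sum(mask_b)
    return 1.0 if total == 0 else 2.0 * overlap / total


def mean_isodose_dice(true_dose, pred_dose, prescription, levels):
    """Average Dice coefficient over a list of isodose levels (assumed averaging)."""
    scores = [
        dice(isodose_mask(true_dose, prescription, lv),
             isodose_mask(pred_dose, prescription, lv))
        for lv in levels
    ]
    return sum(scores) / len(scores)


# Toy example: four voxels, prescription of 80 (arbitrary dose units).
true_dose = [10.0, 40.0, 70.0, 78.0]
pred_dose = [12.0, 35.0, 72.0, 74.0]
score_50 = dice(isodose_mask(true_dose, 80.0, 0.5),
                isodose_mask(pred_dose, 80.0, 0.5))
mean_score = mean_isodose_dice(true_dose, pred_dose, 80.0, [0.25, 0.5, 0.75])
```

In practice the masks would be built from 3D dose grids, and the set of levels
would span the 0% to 100% range mentioned in the abstract.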

Charge transfer and electron-phonon coupling (EPC) are proposed to be two
important constituents associated with enhanced superconductivity in the
single-unit-cell FeSe films on oxide surfaces. Using high-resolution electron
energy loss spectroscopy combined with first-principles calculations, we have
explored the lattice dynamics of ultrathin FeSe films grown on SrTiO3. We show that,
despite the significant effect from the substrate on the electronic structure
and superconductivity of the system, the FeSe phonons in the films are
unaffected. The energy dispersion and linewidth associated with the Fe- and
Se-derived vibrational modes are thickness- and temperature-independent.
Theoretical calculations indicate the crucial role of antiferromagnetic
correlation in FeSe to reproduce the experimental phonon dispersion.
Importantly, the only detectable change due to the growth of FeSe films is the
broadening of the Fuchs-Kliewer (FK) phonons associated with the lattice
vibrations of the SrTiO$_3$(001) substrate. If EPC plays any role in the
enhancement of film superconductivity, it must be the interfacial coupling
between the electrons in the FeSe film and the FK phonons from the substrate
rather than the phonons of FeSe.

Plasmons, the collective excitations of electrons in the bulk or at the
surface, play an important role in the properties of materials, and have
generated the field of plasmonics. We report the observation of a highly
unusual acoustic plasmon mode on the surface of a three-dimensional topological
insulator (TI), Bi2Se3, using momentum-resolved inelastic electron scattering.
In sharp contrast to ordinary plasmon modes, this mode exhibits almost linear
dispersion into the second Brillouin zone and remains prominent with remarkably
weak damping not seen in any other systems. This behavior must be associated
with the inherent robustness of the electrons in the TI surface state, so that
not only the surface Dirac states but also their collective excitations are
topologically protected. On the other hand, this mode has much smaller energy
dispersion than expected from a continuous media excitation picture, which can
be attributed to the strong coupling with surface phonons.

We systematically investigated the superstructure evolution of Te atoms on
the Au(111) substrate at different coverages. As revealed by low-temperature
scanning tunneling microscopy and spectroscopy, Te atoms form one-dimensional
$\sqrt{3}$R30$^\circ$ chains near 0.10 monolayer (ML). Two two-dimensional
chiral structures, $(\sqrt{111}\times\sqrt{111})$R4.7$^\circ$ and
$(3\sqrt{21}\times3\sqrt{21})$R10.9$^\circ$, can be formed, and their stability
can be tuned by slightly adjusting the Te coverage near 1/3 ML. A
honeycomb-like superstructure is observed by further increasing the coverage to
4/9 ML. An interfacial state emerges at ~0.65 eV due to Te adsorption on
Au(111). The formation of these Te-induced high-order superstructures is
accompanied by relaxation of gold atoms in the surface layer, indicating the
strong Te-Au interaction.

The significant role of interfacial coupling in the superconductivity
enhancement in FeSe films on SrTiO3 has been widely recognized, but the
explicit origin of this coupling is yet to be identified. Here, by surface
phonon measurements using high-resolution electron energy loss spectroscopy, we
found that the electric field generated by Fuchs-Kliewer (FK) phonon modes of
SrTiO3 can penetrate into FeSe films and strongly interact with electrons
therein. The mode-specific electron-phonon coupling (EPC) constant for the ~92
meV FK phonon is ~0.25 in the single-layer FeSe on SrTiO3. With increasing FeSe
thickness, the penetrating field intensity decays exponentially, which matches
well the observed exponential decay of the superconducting gap. It is
unambiguously shown that the SrTiO3 FK phonon penetrating into FeSe is
essential in the interfacial superconductivity enhancement.

VMAT optimization is a computationally challenging problem due to its large
data size, high degrees of freedom, and many hardware constraints.
High-performance graphics processing units (GPUs) have been used to speed up
the computations. However, their small memory size cannot handle cases with a
large dose-deposition coefficient (DDC) matrix. This paper reports an
implementation of our column-generation-based VMAT algorithm on a multi-GPU
platform to solve the memory limitation problem. The column-generation approach
generates apertures sequentially by solving a pricing problem (PP) and a master
problem (MP) iteratively. The DDC matrix is split into four submatrices
according to beam angles, stored on four GPUs in compressed sparse row format.
Computation of the beamlet price is accomplished using multiple GPUs, while the
remaining steps of the PP and MP are implemented on a single GPU due to
their modest computational loads. An H&N patient case was used to validate our
method. We compare our multi-GPU implementation with three single-GPU
implementation strategies: truncating the DDC matrix (S1), repeatedly
transferring the DDC matrix between CPU and GPU (S2), and porting computations
involving the DDC matrix to CPU (S3). Two more H&N patient cases and three
prostate cases were also used to demonstrate the advantages of our method. Our
multi-GPU implementation can finish the optimization within ~1 minute for the
H&N patient case. S1 leads to an inferior plan quality, although its total time
was 10 seconds shorter than the multi-GPU implementation. S2 and S3 yield the
same plan quality as the multi-GPU implementation but take ~4 minutes and ~6
minutes, respectively. High computational efficiency was consistently achieved
for the other 5 cases. The results demonstrate that the multi-GPU
implementation can handle the large-scale VMAT optimization problem efficiently
without sacrificing plan quality.
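
The way the beamlet-price computation decomposes over the split DDC matrix can
be sketched as follows. This is a CPU-only illustration in plain Python, not
the GPU implementation: it assumes that each submatrix holds one row per
beamlet of its beam-angle group and one column per voxel, and that the price of
a beamlet is the dot product of its row with the gradient of the objective with
respect to voxel dose. The function names and toy numbers are hypothetical.

```python
def csr_matvec(indptr, indices, data, vec):
    """Multiply a matrix in compressed sparse row (CSR) format by a dense vector."""
    out = []
    for row in range(len(indptr) - 1):
        start, end = indptr[row], indptr[row + 1]
        out.append(sum(data[k] * vec[indices[k]] for k in range(start, end)))
    return out


def beamlet_prices(submatrices, dose_gradient):
    """Compute prices group by group; each group could live on its own device."""
    prices = []
    for indptr, indices, data in submatrices:
        # Only the dense gradient vector has to be shared with every device;
        # the sparse submatrix itself never moves.
        prices.extend(csr_matvec(indptr, indices, data, dose_gradient))
    return prices


# Two toy beam-angle groups over three voxels, as (indptr, indices, data).
group_a = ([0, 2, 3], [0, 2, 1], [1.0, 0.5, 2.0])  # two beamlets
group_b = ([0, 1], [2], [4.0])                     # one beamlet
gradient = [1.0, -1.0, 2.0]
prices = beamlet_prices([group_a, group_b], gradient)
```

Because each group's product is independent, the per-group loop is the part
that maps naturally onto separate GPUs, with the short price vectors gathered
afterwards.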

The Monte Carlo (MC) method has been recognized as the most accurate dose
calculation method for radiotherapy. However, its extremely long computation
time impedes clinical applications. Recently, considerable effort has been made
to realize fast MC dose calculation on GPUs. Nonetheless, most of the GPU-based
MC dose engines were developed in the NVIDIA CUDA environment. This limits the
code portability to other platforms, hindering the introduction of GPU-based MC
simulations to clinical practice. The objective of this paper is to develop a
fast cross-platform MC dose engine, oclMC, using the OpenCL environment for
external beam photon and electron radiotherapy in the MeV energy range. Coupled
photon-electron MC simulation was implemented with analogue simulations for
photon transport and a Class II condensed history scheme for electron
transport. To test the accuracy and efficiency of our dose engine oclMC, we
compared dose calculation results of oclMC and gDPM, our previously developed
GPU-based MC code, for a 15 MeV electron beam and a 6 MV photon beam on a
homogeneous water phantom, one slab phantom and one half-slab phantom.
Satisfactory agreement was observed in all the cases. The average dose
differences within the 10% isodose line of the maximum dose were 0.48-0.53% for
the electron beam cases and 0.15-0.17% for the photon beam cases. In terms of
efficiency, our dose engine oclMC was 6-17% slower than gDPM when running both
codes on the same NVIDIA TITAN card, due to both the different physics particle
transport models and the different computational environments of CUDA and
OpenCL. The cross-platform portability was also validated by successfully
running our new dose engine on a set of different compute devices including an
NVIDIA GPU card, two AMD GPU cards and an Intel CPU card using one or four
cores. Computational efficiency among these platforms was compared.

Monte Carlo (MC) simulation is considered the most accurate method for
radiation dose calculations. Accuracy of a source model for a linear
accelerator is critical for the overall dose calculation accuracy. In this
paper, we present an analytical source model that we recently developed for
GPU-based MC dose calculations. A key concept called phase-space-ring (PSR) is
proposed. It contains a group of particles that are of the same type and close
in energy and radial distance to the center of the phase-space plane. The model
parameterizes probability densities of particle location, direction and energy
for each primary photon PSR, scattered photon PSR and electron PSR. For a
primary photon PSR, the particle direction is assumed to be from the beam
spot. A finite spot size is modeled with a 2D Gaussian distribution. For a
scattered photon PSR, multiple Gaussian components are used to model the
particle direction. The direction distribution of an electron PSR is also
modeled as a 2D Gaussian distribution with a large standard deviation. We also
developed a method to analyze a phase-space file and derive corresponding model
parameters. To test the accuracy of our linac source model, dose distributions
of different open fields in a water phantom were calculated using our source
model and compared to those directly calculated using the reference phase-space
file. The average distance-to-agreement (DTA) was within 1 mm for the depth
dose in the build-up region and beam penumbra regions. The root-mean-square
(RMS) dose difference was within 1.1% for dose profiles at inner and outer beam
regions. The maximal relative difference of output factors was within 0.5%.
Good agreement was also found in an IMRT prostate patient case and an IMRT
head-and-neck case. These results demonstrate the efficacy of our source model
in terms of accurately representing a reference phase-space file.
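
As a rough illustration of the primary photon PSR idea, the sketch below
samples one particle. The geometry is an assumption made for this example only:
the beam spot sits at z = 0, the phase-space plane is at a fixed distance along
z, the ring is reduced to a single radius, and the spot is an isotropic 2D
Gaussian. The function name and all parameter values are hypothetical.

```python
import math
import random


def sample_primary_photon(radius_cm, plane_z_cm, spot_sigma_cm, rng):
    """Sample position and unit direction for one primary photon of a ring."""
    # Position: uniform azimuth on the ring, reflecting rotational symmetry.
    phi = rng.uniform(0.0, 2.0 * math.pi)
    position = (radius_cm * math.cos(phi), radius_cm * math.sin(phi), plane_z_cm)
    # Origin: a point drawn from the 2D Gaussian beam spot in the z = 0 plane.
    origin = (rng.gauss(0.0, spot_sigma_cm), rng.gauss(0.0, spot_sigma_cm), 0.0)
    # Direction: unit vector pointing from the spot point to the ring position.
    delta = [p - o for p, o in zip(position, origin)]
    norm = math.sqrt(sum(c * c for c in delta))
    return position, tuple(c / norm for c in delta)


rng = random.Random(0)
position, direction = sample_primary_photon(2.0, 50.0, 0.1, rng)
```

A full model would also sample the energy and the radius within the ring, and
would use the separate direction models described above for scattered photons
and electrons.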

We recently built an analytical source model for a GPU-based MC dose engine. In
this paper, we present a sampling strategy to efficiently utilize this source
model in GPU-based dose calculation. Our source model was based on the concept
of phase-space-ring (PSR). This ring structure makes it effective to account
for beam rotational symmetry, but not suitable for dose calculations due to
rectangular jaw settings. Hence, we first convert the PSR source model to its
phase-space-let (PSL) representation. Then, in dose calculation, different
types of sub-sources are sampled separately. Source sampling and particle
transport are iterated, so that the particles being sampled and transported
simultaneously are of the same type and close in energy, which alleviates GPU
thread divergence. We also present an automatic commissioning approach to
adjust the model for a good representation of a clinical linear accelerator.
Weighting factors are introduced to adjust the relative weights of PSRs; they
are determined by solving a quadratic minimization problem with a
non-negativity constraint. We tested the efficiency gain of our model over a
previous source model using PSL files. The efficiency was improved by a factor
of 1.70~4.41, due to the avoidance of long data reading and transferring. The
commissioning problem can be solved in ~20 sec. Its efficacy was tested by
comparing the doses computed using the commissioned model and the
uncommissioned one with measurements in different open fields in a water
phantom under a clinical Varian TrueBeam 6 MV beam. For the depth dose curves,
the average distance-to-agreement was improved from 0.04~0.28 cm to 0.04~0.12
cm for the build-up region, and the root-mean-square (RMS) dose difference
after the build-up region was reduced from 0.32%~0.67% to 0.21%~0.48%. For
lateral dose profiles, the RMS difference was reduced from 0.31%~2.0% to
0.06%~0.78% at inner beam and from 0.20%~1.25% to 0.10%~0.51% at outer beam.
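
The commissioning step can be pictured as a small non-negative least-squares
problem. The sketch below is an assumption-laden illustration, not the paper's
solver: it takes each column of a matrix to be the dose contributed by one PSR
at the measurement points, takes the right-hand side to be the measured dose,
and solves for the weights with projected gradient descent, which is a generic
technique swapped in here. All names and numbers are hypothetical.

```python
def solve_nonnegative_least_squares(a, b, iterations=2000):
    """Minimize ||a w - b||^2 subject to w >= 0 by projected gradient descent."""
    n_rows, n_cols = len(a), len(a[0])
    # The squared Frobenius norm bounds the largest eigenvalue of a^T a,
    # so this step size is safe for the gradient 2 a^T (a w - b).
    frobenius_sq = sum(v * v for row in a for v in row)
    step = 1.0 / (2.0 * frobenius_sq)
    w = [0.0] * n_cols
    for _ in range(iterations):
        residual = [sum(a[i][j] * w[j] for j in range(n_cols)) - b[i]
                    for i in range(n_rows)]
        grad = [2.0 * sum(a[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_cols)]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, w[j] - step * grad[j]) for j in range(n_cols)]
    return w


# Two hypothetical PSRs with nearly parallel dose contributions.
psr_dose = [[1.0, 0.9],
            [0.9, 1.0]]
measured = [1.0, 0.5]
weights = solve_nonnegative_least_squares(psr_dose, measured)
```

In this toy case the unconstrained least-squares solution would assign a
negative weight to the second PSR; the constraint instead pins that weight at
zero and refits the first.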

In this paper, we present a new method to generate an instantaneous
volumetric image using a single x-ray projection. To fully extract motion
information hidden in projection images, we partitioned a projection image into
small patches. We utilized a sparse learning method to automatically select
patches that have a high correlation with principal component analysis (PCA)
coefficients of a lung motion model. A model that maps the patch intensity to
the PCA coefficients is built along with the patch selection process. Based on
this model, a measured projection can be used to predict the PCA coefficients,
which are further used to generate a motion vector field and hence a volumetric
image. We have also proposed an intensity baseline correction method based on
the partitioned projection, where the first and the second moments of pixel
intensities at a patch in a simulated image are matched with those in a
measured image via a linear transformation. The proposed method has been
validated on simulated data and real phantom data. The algorithm is able to
identify patches that contain relevant motion information, e.g., the diaphragm
region. It is found that the intensity correction step is important to remove
the systematic error in the motion prediction. For the simulation case, the
sparse learning model reduced the prediction error for the first PCA
coefficient to 5%, compared to the 10% error when sparse learning is not used.
The 95th percentile error for the predicted motion vector is reduced from 2.40
mm to 0.92 mm. In the phantom case,
the predicted tumor motion trajectory is successfully reconstructed with 0.82
mm mean vector field error compared to 1.66 mm error without using the sparse
learning method. The algorithm robustness with respect to sparse level, patch
size, and existence of diaphragm, as well as computation time, has also been
studied.
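
The last step of the pipeline, turning predicted PCA coefficients into a motion
vector field, can be sketched as below. The sketch assumes the usual PCA motion
model in which the field is a mean field plus a coefficient-weighted sum of
eigenvectors; the flattened list layout, function name, and numbers are
hypothetical.

```python
def motion_vector_field(mean_field, eigenvectors, coefficients):
    """Reconstruct a flattened vector field from PCA coefficients."""
    field = list(mean_field)
    for coeff, vec in zip(coefficients, eigenvectors):
        for i, component in enumerate(vec):
            field[i] += coeff * component
    return field


# Toy model with three field components and two principal components.
mean_field = [0.0, 1.0, 2.0]
eigenvectors = [[1.0, 0.0, -1.0],
                [0.5, 0.5, 0.5]]
coefficients = [2.0, 4.0]  # e.g. predicted from selected patch intensities
dvf = motion_vector_field(mean_field, eigenvectors, coefficients)
```

The reconstructed field would then be applied to a reference image to obtain
the instantaneous volumetric image.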

Cone beam CT (CBCT) has been widely used for patient setup in image-guided
radiation therapy (IGRT). Radiation dose from CBCT scans has become a clinical
concern. The purposes of this study are 1) to commission a GPU-based Monte
Carlo (MC) dose calculation package, gCTD, for the Varian On-Board Imaging
(OBI) system and test the calculation accuracy, and 2) to quantitatively
evaluate CBCT dose from the OBI system in typical IGRT scan protocols. We first
conducted dose measurements in a water phantom. X-ray source model parameters
used in gCTD are obtained through a commissioning process. gCTD accuracy is
demonstrated by comparing calculations with measurements in water and in CTDI
phantoms. Twenty-five brain cancer patients are used to study dose in a
standard-dose head protocol, and 25 prostate cancer patients are used to study
dose in the pelvis protocol and the pelvis spotlight protocol. The mean dose to
each organ is calculated. The mean dose to the 2% of voxels that have the
highest dose is also computed to quantify the maximum dose. It is found that
the mean dose to an organ varies largely among patients. Moreover, the dose
distribution is highly non-homogeneous inside an organ. The maximum dose is
found to be 1~3 times higher than the mean dose depending on the organ, and is
up to 8 times higher for the entire body due to the very high dose region in
bony structures. High computational
efficiency has also been observed in our studies, such that MC dose calculation
time is less than 5 min for a typical case.
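
The two organ dose metrics used above can be computed as in the following
minimal sketch. The function name, the rounding rule for the number of voxels
in the top 2%, and the toy dose values are assumptions for illustration.

```python
def mean_and_near_max_dose(voxel_doses, top_fraction=0.02):
    """Return (mean dose, mean dose of the hottest `top_fraction` of voxels)."""
    ordered = sorted(voxel_doses, reverse=True)
    # Keep at least one voxel so that small organs still yield a value.
    n_top = max(1, round(top_fraction * len(ordered)))
    mean_dose = sum(ordered) / len(ordered)
    near_max = sum(ordered[:n_top]) / n_top
    return mean_dose, near_max


# Toy organ: 98 voxels at 1.0 and two hot voxels (arbitrary dose units).
doses = [1.0] * 98 + [5.0, 7.0]
mean_dose, near_max = mean_and_near_max_dose(doses)
```

Averaging over the hottest 2% rather than taking the single largest voxel makes
the "maximum" less sensitive to MC statistical noise in individual voxels.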

A novel phase-space source implementation has been designed for GPU-based
Monte Carlo dose calculation engines. Due to the parallelized nature of GPU
hardware, it is essential to simultaneously transport particles of the same
type and similar energies but separated spatially to yield a high efficiency.
We present three methods for phase-space implementation that have been
integrated into the most recent version of the GPU-based Monte Carlo
radiotherapy dose calculation package gDPM v3.0. The first method is to
sequentially read particles from a patient-dependent phase-space and sort them
on-the-fly based on particle type and energy. The second method supplements
this with a simple secondary collimator model and fluence map implementation so
that patient-independent phase-space sources can be used. Finally, as the third
method (called the phase-space-let, or PSL, method) we introduce a novel
strategy to pre-process patient-independent phase-spaces and bin particles by
type, energy and position. Position bins located outside a rectangular region
of interest enclosing the treatment field are ignored, substantially decreasing
simulation time. The three methods were validated in absolute dose against
BEAMnrc/DOSXYZnrc and compared using gamma-index tests (2%/2 mm above the 10%
isodose). It was found that the PSL method has the optimal balance between
accuracy and efficiency and thus is used as the default method in gDPM v3.0.
Using the PSL method, open fields of 4x4, 10x10 and 30x30 cm2 in water resulted
in gamma passing rates of 99.96%, 99.92% and 98.66%, respectively. Relative
output factors agreed within 1%. An IMRT patient plan using the PSL method
resulted in a passing rate of 97%, and was calculated in 50 seconds using a
single GPU compared to 8.4 hours (per CPU) for BEAMnrc/DOSXYZnrc.
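
The PSL idea of binning once and then skipping bins outside the field can be
sketched as follows. This is an illustrative CPU version under assumed
conventions: particles are tuples of (type, energy in MeV, x in cm, y in cm),
bins are uniform in energy and position, and a bin is kept when its square
footprint overlaps the rectangular region of interest. The function names and
bin widths are hypothetical.

```python
from collections import defaultdict


def build_psl_bins(particles, energy_bin_mev, position_bin_cm):
    """Pre-process a phase-space: group particles by type, energy and position."""
    bins = defaultdict(list)
    for kind, energy, x, y in particles:
        key = (kind,
               int(energy // energy_bin_mev),
               int(x // position_bin_cm),
               int(y // position_bin_cm))
        bins[key].append((energy, x, y))
    return dict(bins)


def bins_inside_roi(bins, position_bin_cm, roi):
    """Keep only bins whose position footprint overlaps roi = (x0, x1, y0, y1)."""
    x_min, x_max, y_min, y_max = roi
    kept = {}
    for key, members in bins.items():
        _, _, ix, iy = key
        x_lo, y_lo = ix * position_bin_cm, iy * position_bin_cm
        overlaps_x = x_lo < x_max and x_lo + position_bin_cm > x_min
        overlaps_y = y_lo < y_max and y_lo + position_bin_cm > y_min
        if overlaps_x and overlaps_y:
            kept[key] = members
    return kept


particles = [
    ("photon", 1.2, 0.3, 0.4),
    ("photon", 1.4, 0.6, 0.1),
    ("electron", 1.2, 0.3, 0.4),
    ("photon", 5.5, 12.0, 0.2),  # far outside a 10 x 10 cm field
]
bins = build_psl_bins(particles, energy_bin_mev=1.0, position_bin_cm=1.0)
active = bins_inside_roi(bins, 1.0, roi=(-5.0, 5.0, -5.0, 5.0))
```

The binning is patient-independent and is done once; only the cheap selection
step depends on the treatment field.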

In the treatment plan optimization for intensity-modulated radiation therapy
(IMRT), a dose-deposition coefficient (DDC) matrix is often pre-computed to
parameterize the dose contribution to each voxel in the volume of interest from
each beamlet of unit intensity. However, due to the limitation of computer
memory and the requirement on computational efficiency, in practice matrix
elements of small values are usually truncated, which inevitably compromises
the quality of the resulting plan. A fixed-point iteration scheme has been
applied in IMRT optimization to solve this problem, and has been reported to
be effective and efficient based on the observations of numerical
experiments. In this paper, we aim to point out the mathematics behind this
scheme and to answer the following three questions: 1) Does the fixed-point
iteration algorithm converge? 2) When it converges, is the fixed-point
solution the same as the original solution obtained with the complete DDC
matrix? 3) If not, is the fixed-point solution more accurate than the naive
solution of the truncated problem obtained without the fixed-point iteration?
To answer these questions, we first performed mathematical analysis and
deductions using a simplified fluence map optimization (FMO) model. Then we
conducted numerical experiments on a head-and-neck patient case using both the
simplified and the original FMO model. Both our mathematical analysis and
numerical experiments demonstrate that with proper DDC matrix truncation, the
fixed-point iteration can converge. Even though the converged solution is not
the one that we obtain with the complete DDC matrix, the fixed-point iteration
scheme could significantly improve the plan accuracy compared with the solution
to the truncated problem obtained without the fixed-point iteration.
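
A toy version of such a fixed-point scheme is sketched below. The form used is
an assumption made for illustration: the FMO model is reduced to an
unconstrained least-squares fit of a prescribed dose, the DDC matrix is split
into a kept part (`major`) and a truncated part (`minor`), and each iteration
re-solves the truncated problem after subtracting the dose that the truncated
elements would deliver for the previous solution. The matrices, threshold, and
helper names are hypothetical, and the solver handles exactly two beamlets.

```python
def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def solve_truncated(d1, target):
    """Least-squares solution of d1 x = target for two beamlets (normal equations)."""
    a = sum(r[0] * r[0] for r in d1)
    b = sum(r[0] * r[1] for r in d1)
    c = sum(r[1] * r[1] for r in d1)
    g0 = sum(r[0] * t for r, t in zip(d1, target))
    g1 = sum(r[1] * t for r, t in zip(d1, target))
    det = a * c - b * b
    return [(c * g0 - b * g1) / det, (a * g1 - b * g0) / det]


def squared_error(m, x, d):
    return sum((p - q) ** 2 for p, q in zip(matvec(m, x), d))


full = [[1.0, 0.1],
        [0.1, 1.0],
        [0.5, 0.5]]
threshold = 0.2
major = [[v if v >= threshold else 0.0 for v in row] for row in full]
minor = [[v if v < threshold else 0.0 for v in row] for row in full]
prescribed = [1.0, 1.0, 1.0]

# Naive solution: optimize with the truncated matrix only.
naive = solve_truncated(major, prescribed)

# Fixed-point iteration: correct the target with the truncated dose, re-solve.
x = list(naive)
for _ in range(50):
    correction = matvec(minor, x)
    x = solve_truncated(major, [p - c for p, c in zip(prescribed, correction)])

error_naive = squared_error(full, naive, prescribed)
error_fixed_point = squared_error(full, x, prescribed)
```

In this toy problem the iteration settles at beamlet intensities of 0.9375,
and the dose error evaluated with the complete matrix drops from 0.02 for the
naive solution to roughly 0.006, while the fixed point still differs slightly
from the true least-squares solution of the complete problem.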

The gamma-index test has been commonly adopted to quantify the degree of
agreement between a reference dose distribution and an evaluation dose
distribution. Monte Carlo (MC) simulation has been widely used for the
radiotherapy dose calculation for both clinical and research purposes. The goal
of this work is to investigate both theoretically and experimentally the impact
of the MC statistical fluctuation on the gamma-index test when the fluctuation
exists in the reference, the evaluation, or both dose distributions. To the
first order approximation, we theoretically demonstrated in a simplified model
that the statistical fluctuation tends to overestimate gamma-index values when
existing in the reference dose distribution and underestimate gamma-index
values when existing in the evaluation dose distribution, given that the
original gamma-index is large relative to the statistical fluctuation. Our
numerical experiments using clinical photon radiation therapy cases have shown
that 1) when performing a gamma-index test between an MC reference dose and a
non-MC evaluation dose, the average gamma-index is overestimated and the
passing rate decreases with the increase of the noise level in the reference
dose; 2) when performing a gamma-index test between a non-MC reference dose and
an MC evaluation dose, the average gamma-index is underestimated when they are
within the clinically relevant range and the passing rate increases with the
increase of the noise level in the evaluation dose; 3) when performing a
gamma-index test between an MC reference dose and an MC evaluation dose, the
passing rate is overestimated due to the noise in the evaluation dose and
underestimated due to the noise in the reference dose. We conclude that the
gamma-index test should be used with caution when comparing dose distributions
computed with Monte Carlo simulation.
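
For readers unfamiliar with the test, a minimal 1D gamma-index computation is
sketched below. It assumes the common definition in which, for each reference
point, gamma is the minimum over evaluation points of the combined normalized
distance and dose difference, with the dose tolerance taken as a fraction of
the maximum reference dose (global normalization) and no interpolation between
grid points. Function names and the toy profiles are hypothetical.

```python
import math


def gamma_index_1d(ref_dose, eval_dose, spacing_mm,
                   dose_tol_fraction=0.03, dta_mm=3.0):
    """Gamma value at every reference point of a 1D profile on a uniform grid."""
    dose_tol = dose_tol_fraction * max(ref_dose)
    gammas = []
    for i, d_ref in enumerate(ref_dose):
        best = math.inf
        for j, d_eval in enumerate(eval_dose):
            dist = (j - i) * spacing_mm
            value = math.sqrt((dist / dta_mm) ** 2
                              + ((d_eval - d_ref) / dose_tol) ** 2)
            best = min(best, value)
        gammas.append(best)
    return gammas


def passing_rate(gammas):
    """Fraction of points with gamma not exceeding 1."""
    return sum(1 for g in gammas if g <= 1.0) / len(gammas)


# A linear profile and a copy shifted by one 1 mm grid step.
reference = [0.0, 10.0, 20.0, 30.0, 40.0]
evaluation = [10.0, 20.0, 30.0, 40.0, 50.0]
gammas = gamma_index_1d(reference, evaluation, spacing_mm=1.0,
                        dose_tol_fraction=0.03, dta_mm=2.0)
rate = passing_rate(gammas)
```

Because gamma is a minimum over the evaluation distribution, noise in the
evaluation dose offers more chances of finding a close match, whereas noise in
the reference dose moves the point being matched, which is consistent with the
opposite biases described above.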

In adaptive radiotherapy, deformable image registration is often conducted
between the planning CT and treatment CT (or cone beam CT) to generate a
deformation vector field (DVF) for dose accumulation and contour propagation.
The auto-propagated contours on the treatment CT may contain relatively large
errors, especially in low-contrast regions. Inspection and editing of the
propagated contours by a clinician are frequently needed. The edited contours
are able to meet the clinical requirement for adaptive therapy; however, the
DVF is still inaccurate and inconsistent with the edited contours. The purpose
of this work is to develop a contour-guided deformable image registration
(CGDIR) algorithm to improve the accuracy and consistency of the DVF for
adaptive radiotherapy. Incorporation of the edited contours into the
registration algorithm is realized by regularizing the objective function of
the original demons algorithm with a term of intensity matching between the
delineated structure set pairs. The CGDIR algorithm is implemented on computer
graphics processing units (GPUs) by following the original GPU-based demons
algorithm computation framework [Gu et al, Phys Med Biol. 55(1): 207-219,
2010]. The performance of CGDIR is evaluated on data from five clinical
head-and-neck cancer patients and one pelvic cancer patient. It is found that,
compared with the original demons,
CGDIR improves the accuracy and consistency of the DVF, while retaining
similar high computational efficiency.

The patient respiratory signal associated with the cone beam CT (CBCT)
projections is important for lung cancer radiotherapy. In contrast to
monitoring an external surrogate of respiration, such a signal can be extracted
directly from the CBCT projections. In this paper, we propose a novel local
principal component analysis (LPCA) method to extract the respiratory signal by
distinguishing the respiration motion-induced content change from the gantry
rotation-induced content change in the CBCT projections. The LPCA method is
evaluated by comparing with three state-of-the-art projection-based methods,
namely, the Amsterdam Shroud (AS) method, the intensity analysis (IA) method,
and the Fourier-transform-based phase analysis (FTp) method. The clinical CBCT
projection data of eight patients, acquired under various clinical scenarios,
were used to investigate the performance of each method. We found that the
proposed LPCA method demonstrated the best overall performance for the cases
tested and thus is a promising technique for extracting the respiratory signal.
We also identified the applicability of each existing method.

In a treatment plan optimization problem for radiotherapy, a clinically
acceptable plan is usually generated by an optimization process with weighting
factors or reference doses adjusted for organs. Recent discoveries indicate
that adjusting parameters associated with each voxel may lead to better plan
quality. However, the mathematical reasons behind this remain unclear. To
answer questions related to this problem, we establish in this work a
new mathematical framework equipped with two theorems. The new framework
clarifies the different consequences of adjusting organ-dependent and
voxel-dependent parameters for the treatment plan optimization of radiation
therapy, as well as the different effects of adjusting weighting factors versus
reference doses in the optimization process. The main discoveries are
threefold: 1) While in the organ-based model the selection of the objective
function has an impact on the quality of the optimized plans, this is no longer
an issue for the voxel-based model since the entire Pareto surface could be
generated regardless of the specific form of the objective function as long as
it satisfies certain mathematical conditions; 2) A larger Pareto surface is
explored by adjusting voxel-dependent parameters than by adjusting
organ-dependent parameters, possibly allowing for the generation of plans with
better trade-offs among different clinical objectives; 3) Adjusting voxel
weighting factors is preferred to adjusting the voxel reference doses since the
Pareto optimality can be maintained.

Simulation of x-ray projection images plays an important role in cone beam CT
(CBCT) related research projects. A projection image contains primary signal,
scatter signal, and noise. It is computationally demanding to perform accurate
and realistic computations for all of these components. In this work, we
develop a package on GPU, called gDRR, for the accurate and efficient
computations of x-ray projection images in CBCT under clinically realistic
conditions. The primary signal is computed by a trilinear ray-tracing
algorithm. A Monte Carlo (MC) simulation is then performed, yielding the
primary signal and the scatter signal, both with noise. A denoising process is
applied to obtain a smooth scatter signal. The noise component is then obtained
by combining the difference between the MC primary and the ray-tracing primary
signals, and the difference between the MC simulated scatter and the denoised
scatter signals. Finally, a calibration step converts the calculated noise
signal into a realistic one by scaling its amplitude. For a typical CBCT
projection with a polyenergetic spectrum, the calculation time for the primary
signal is 1.2~2.3 sec, while the MC simulations take 28.1~95.3 sec. Computation
time for all other steps is negligible. The ray-tracing primary signal matches
well with the primary part of the MC simulation result. The MC simulated
scatter signal using gDRR is in agreement with EGSnrc results with a relative
difference of 3.8%. A noise calibration process is conducted to calibrate gDRR
against a real CBCT scanner. The calculated projections are accurate and
realistic, such that beam-hardening artifacts and scatter artifacts can be
reproduced using the simulated projections. The noise amplitudes in the CBCT
images reconstructed from the simulated projections also agree with those in
the measured images at corresponding mAs levels.
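
The way the noise component is assembled from the four intermediate signals can
be sketched per pixel as below. This is an illustration under assumptions: the
calibration is reduced to a single scalar amplitude factor, the final
projection is taken to be the sum of the ray-tracing primary, the denoised
scatter and the calibrated noise, and all names and pixel values are
hypothetical.

```python
def calibrated_noise(mc_primary, raytraced_primary,
                     mc_scatter, denoised_scatter, scale):
    """Noise = (MC primary - ray-tracing primary) + (MC scatter - denoised scatter),
    scaled by a calibration factor (assumed to be a single scalar here)."""
    return [
        scale * ((mp - rp) + (ms - ds))
        for mp, rp, ms, ds in zip(mc_primary, raytraced_primary,
                                  mc_scatter, denoised_scatter)
    ]


def simulated_projection(raytraced_primary, denoised_scatter, noise):
    """Combine the smooth primary and scatter signals with the calibrated noise."""
    return [p + s + n
            for p, s, n in zip(raytraced_primary, denoised_scatter, noise)]


# Two toy detector pixels.
mc_primary = [10.5, 9.0]
raytraced_primary = [10.0, 9.5]
mc_scatter = [2.25, 1.75]
denoised_scatter = [2.0, 2.0]
noise = calibrated_noise(mc_primary, raytraced_primary,
                         mc_scatter, denoised_scatter, scale=2.0)
projection = simulated_projection(raytraced_primary, denoised_scatter, noise)
```

Keeping the smooth signals and the noise separate is what allows the noise
amplitude to be rescaled to a target mAs level without rerunning the MC
simulation.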

Computed tomography (CT) to cone-beam computed tomography (CBCT) deformable
image registration (DIR) is a crucial step in adaptive radiation therapy.
Current intensity-based registration algorithms, such as demons, may fail in
the context of CT-CBCT DIR because of inconsistent intensities between the two
modalities. In this paper, we propose a variant of demons, called Deformation
with Intensity Simultaneously Corrected (DISC), to deal with CT-CBCT DIR. DISC
distinguishes itself from the original demons algorithm by performing an
adaptive intensity correction step on the CBCT image at every iteration step of
the demons registration. Specifically, the intensity correction of a voxel in
CBCT is achieved by matching the first and the second moments of the voxel
intensities inside a patch around the voxel with those on the CT image. It is
expected that such a strategy can remove artifacts in the CBCT image, as well
as ensure the intensity consistency between the two modalities. DISC is
implemented on computer graphics processing units (GPUs) in the compute unified
device architecture (CUDA) programming environment. The performance of DISC is
evaluated on a simulated patient case and data from six clinical head-and-neck
cancer patients. It is found that DISC is robust against the CBCT artifacts and
intensity inconsistency and significantly improves the registration accuracy
when compared with the original demons.
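
The patch-wise moment matching at the heart of the intensity correction can be
sketched as a linear transformation whose gain and offset align the mean and
standard deviation of the CBCT patch with those of the corresponding CT patch.
The sketch below works on flattened patches; the function name, the handling of
a flat patch, and the toy intensities are assumptions for illustration.

```python
import math


def match_moments(cbct_patch, ct_patch, value):
    """Map a CBCT intensity so the patch mean and std match the CT patch."""
    n_cbct, n_ct = len(cbct_patch), len(ct_patch)
    mean_cbct = sum(cbct_patch) / n_cbct
    mean_ct = sum(ct_patch) / n_ct
    std_cbct = math.sqrt(sum((v - mean_cbct) ** 2 for v in cbct_patch) / n_cbct)
    std_ct = math.sqrt(sum((v - mean_ct) ** 2 for v in ct_patch) / n_ct)
    if std_cbct == 0.0:
        # Flat patch: only the mean can be matched (assumed fallback).
        return mean_ct
    gain = std_ct / std_cbct
    offset = mean_ct - gain * mean_cbct
    return gain * value + offset


# Toy patches around one voxel; the centre voxel has CBCT intensity 110.
cbct_patch = [100.0, 110.0, 120.0]
ct_patch = [0.0, 20.0, 40.0]
corrected_centre = match_moments(cbct_patch, ct_patch, 110.0)
corrected_patch = [match_moments(cbct_patch, ct_patch, v) for v in cbct_patch]
```

Within a demons loop this correction would be recomputed at every iteration,
using the CT patch at the currently deformed location.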

Using fiducial markers on the patient's body surface to predict the tumor
location is a widely used approach in lung cancer radiotherapy. The purpose of
this work is to propose an algorithm that automatically identifies a sparse set
of locations on the patient's surface with the optimal prediction power for the
tumor motion. The sparse selection of markers on the external surface and the
assumed linear relationship between the marker motion and the internal tumor
motion are represented by a prediction matrix. Such a matrix is determined by
solving an optimization problem, where the objective function contains a
sparsity term that penalizes the number of markers chosen on the patient's
surface. The performance of our algorithm has been tested on realistic clinical
data of four lung cancer patients. Thoracic 4DCT scans with 10 phases are used
for the study. On a reference phase, a grid of points is cast on the
patient's surface (except for the patient's back) and propagated to other phases
via deformable image registration of the corresponding CT images. Tumor
locations at each phase are also manually delineated. We use 9 out of 10 phases
of the 4DCT images to identify a small group of surface markers that are most
correlated with the motion of the tumor, and find the prediction matrix at the
same time. The 10th phase is then used to test the accuracy of the prediction.
It is found that on average 6 to 7 surface markers are necessary to predict
tumor locations with a 3D error of about 1 mm. In addition, the selected marker
locations lie close to those areas where surface point motion has a high
correlation with the tumor motion. Our method can automatically select sparse
locations on the patient's external surface and estimate a correlation matrix based
on 4DCT, so that the selected surface locations can be used to place fiducial
markers to optimally predict internal tumor motions.

Respiration-correlated CBCT, commonly called 4D-CBCT, provides respiratory
phase-resolved CBCT images. In many clinical applications, it is
preferable to reconstruct true 4D-CBCT with the 4th dimension being time, i.e.,
each CBCT image is reconstructed based on the corresponding instantaneous
projection. We propose in this work a novel algorithm for the reconstruction of
this truly time-resolved CBCT, called cine-CBCT, by effectively utilizing the
underlying temporal coherence, such as periodicity or repetition, in those
cine-CBCT images. Assuming each column of the matrix $\bm{U}$ represents a CBCT
image to be reconstructed and the total number of columns is the same as the
number of projections, the central idea of our algorithm is that the rank of
$\bm{U}$ is much smaller than the number of projections and we can use a matrix
factorization form $\bm{U}=\bm{L}\bm{R}$ for $\bm{U}$. The number of columns
of the matrix $\bm{L}$ constrains the rank of $\bm{U}$ and hence implicitly
imposes a temporal coherence condition among all the images in cine-CBCT. The
desired image properties in $\bm{L}$ and the periodicity of the breathing
pattern are achieved by penalizing the sparsity of the tight wavelet frame
transform of $\bm{L}$ and that of the Fourier transform of $\bm{R}$,
respectively. A split Bregman method is used to solve the problem. In this
paper we focus on presenting this new algorithm and showing the proof of
principle using simulation studies on an NCAT phantom.
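
A small numerical sketch may help show why the factorization
$\bm{U}=\bm{L}\bm{R}$ encodes temporal coherence. The sizes, the breathing
period, and the variable names below are assumptions chosen for illustration;
this is not the reconstruction algorithm itself, only a check of the two
structural properties it relies on.

```python
import numpy as np

n_voxels, n_proj, K = 64, 120, 3      # hypothetical sizes; K = columns of L
rng = np.random.default_rng(0)

# L holds K basis images (one per column); R holds their time coefficients.
L = rng.normal(size=(n_voxels, K))
t = np.arange(n_proj)
period = 30                            # projections per breathing cycle (assumed)
R = np.vstack([
    np.ones(n_proj),
    np.cos(2 * np.pi * t / period),
    np.sin(2 * np.pi * t / period),
])

U = L @ R                              # one image per projection, as columns
rank_U = np.linalg.matrix_rank(U)      # cannot exceed K

# Periodic rows of R have only a few non-zero Fourier coefficients.
R_hat = np.fft.fft(R, axis=1)
nonzero_per_row = (np.abs(R_hat) > 1e-8 * n_proj).sum(axis=1)

print("rank of U:", rank_U, "of at most", K)
print("non-zero Fourier coefficients per row of R:", nonzero_per_row)
print("unknowns: full U =", U.size, ", factorized =", L.size + R.size)
```

The rank of $\bm{U}$ is bounded by the number of columns of $\bm{L}$, and a
periodic breathing pattern makes the Fourier transform of each row of
$\bm{R}$ sparse, which is the property the sparsity penalty on $\bm{R}$ exploits.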

While compressed sensing (CS) based reconstructions have been developed for
low-dose CBCT, a clear understanding of the relationship between the image
quality and imaging dose at low dose levels is needed. In this paper, we
qualitatively investigate this subject in a comprehensive manner with extensive
experimental and simulation studies. The basic idea is to plot image quality
and imaging dose together as functions of number of projections and mAs per
projection over the whole clinically relevant range. A clear understanding of
the tradeoff between image quality and dose can be achieved and optimal
low-dose CBCT scan protocols can be developed for various imaging tasks in
IGRT. Main findings of this work include: 1) Under the CS framework, image
quality has little degradation over a large dose range, and the degradation
becomes evident when the dose falls below 100 total mAs. A dose below 40 total
mAs leads to a dramatic image degradation. Optimal low-dose CBCT scan protocols
likely fall in the dose range of 40-100 total mAs, depending on the specific IGRT
applications. 2) Among different scan protocols at a constant low-dose level,
the super sparse-view reconstruction with fewer than 50 projections is the
most challenging case, even with strong regularization. Better image quality
can be acquired with other low mAs protocols. 3) The optimal scan protocol is
the combination of a medium number of projections and a medium level of
mAs/view. This is more evident when the dose is around 72.8 total mAs or below
and when the ROI is a low-contrast or high-resolution object. Based on our
results, the optimal number of projections is around 90 to 120. 4) The
clinically acceptable lowest dose level is task dependent. In our study,
72.8 total mAs is a safe dose level for visualizing low-contrast objects, while
12.2 total mAs is sufficient for detecting high-contrast objects of diameter greater
than 3 mm.
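
The dose bookkeeping behind these findings is simple arithmetic: total mAs is
the number of projections times the mAs per projection. The sketch below
enumerates a few ways to split a fixed total dose and labels a dose level with
the thresholds quoted above. The projection counts and the label strings are
illustrative assumptions, not protocols taken from the study.

```python
def total_mAs(n_projections, mAs_per_projection):
    """Total exposure of a scan."""
    return n_projections * mAs_per_projection

def protocols_at_dose(target_total_mAs, projection_counts):
    """Split a fixed total dose across different numbers of projections."""
    return [(n, target_total_mAs / n) for n in projection_counts]

def dose_regime(total):
    """Label a total dose using the 40 and 100 total mAs thresholds."""
    if total < 40:
        return "dramatic degradation"
    if total < 100:
        return "degradation evident, candidate low-dose range"
    return "little degradation"

# Hypothetical splits of the same 72.8 total mAs.
for n, mAs in protocols_at_dose(72.8, [45, 91, 182, 364]):
    print(f"{n:4d} projections x {mAs:.3f} mAs/projection")

print(dose_regime(72.8))
print(dose_regime(12.2))
```

Note that the regime label describes general image quality only; as stated
above, the lowest acceptable dose depends on the imaging task.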

Four-dimensional Cone Beam Computed Tomography (4DCBCT) has been developed
to provide respiratory phase-resolved volumetric imaging in image-guided
radiation therapy (IGRT). An inadequate number of projections in each phase bin
results in low quality 4DCBCT images with obvious streaking artifacts. In this
work, we propose two novel 4DCBCT algorithms: an iterative reconstruction
algorithm and an enhancement algorithm, utilizing a temporal nonlocal means
(TNLM) method. We define a TNLM energy term for a given set of 4DCBCT images.
Minimization of this term favors those 4DCBCT images such that any anatomical
features at one spatial point at one phase can be found in a nearby spatial
point at neighboring phases. 4DCBCT reconstruction is achieved by minimizing a
total energy containing a data fidelity term and the TNLM energy term. As for
the image enhancement, 4DCBCT images generated by the FDK algorithm are
enhanced by minimizing the TNLM function while keeping the enhanced images
close to the FDK results. A forward-backward splitting algorithm and a
Gauss-Jacobi iteration method are employed to solve the problems. The
algorithms are implemented on GPU to achieve a high computational efficiency.
The reconstruction algorithm and the enhancement algorithm generate visually
similar 4DCBCT images, both better than the FDK results. Quantitative
evaluations indicate that, compared with the FDK results, our reconstruction
method improves contrast-to-noise ratio (CNR) by a factor of 2.56~3.13 and our
enhancement method increases the CNR by 2.75~3.33 times. The enhancement method
also removes over 80% of the streak artifacts from the FDK results. The total
computation time is ~460 sec for the reconstruction algorithm and ~610 sec for
the enhancement algorithm on an NVIDIA Tesla C1060 GPU card.
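
To make the TNLM idea concrete, here is a minimal sketch of one common form of
such an energy for 1D "images" over cyclic phases. The exact formula is not
given here, so the patch-based Gaussian weights, the parameters `search`,
`patch` and `h`, and the cyclic boundary handling are all assumptions.

```python
import numpy as np

def tnlm_energy(frames, search=2, patch=1, h=0.1):
    """Sum over phases i, pixels x and shifts s of
    w(x, s) * (f_i[x] - f_{i+1}[x + s])**2, with weights from patch similarity.
    frames: (n_phases, n_pixels); phases and pixels wrap around cyclically."""
    n_phases, _ = frames.shape
    shifts = range(-search, search + 1)
    energy = 0.0
    for i in range(n_phases):
        f, g = frames[i], frames[(i + 1) % n_phases]
        # Squared distance between the patch at x in f and at x + s in g.
        d2 = np.stack([
            sum((np.roll(f, -p) - np.roll(g, -(s + p))) ** 2
                for p in range(-patch, patch + 1))
            for s in shifts
        ])
        # Subtracting the minimum avoids underflow; normalized weights are unchanged.
        w = np.exp(-(d2 - d2.min(axis=0, keepdims=True)) / h ** 2)
        w /= w.sum(axis=0, keepdims=True)
        diff2 = np.stack([(f - np.roll(g, -s)) ** 2 for s in shifts])
        energy += float((w * diff2).sum())
    return energy

# A feature that moves one pixel per phase versus the same frames with noise.
base = np.zeros(16)
base[5] = 1.0
moving = np.stack([np.roll(base, i) for i in range(3)])
rng = np.random.default_rng(0)
noisy = moving + rng.normal(scale=0.3, size=moving.shape)

print("clean moving feature:", tnlm_energy(moving))
print("with noise:          ", tnlm_energy(noisy))
```

A feature that reappears at a nearby position in the neighboring phase
contributes almost nothing to the energy, while noise that has no counterpart
in the neighboring phase is penalized.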

Recently, X-ray imaging dose from computed tomography (CT) or cone beam CT
(CBCT) scans has become a serious concern. Patient-specific imaging dose
calculation has been proposed for the purpose of dose management. While Monte
Carlo (MC) dose calculation can be quite accurate for this purpose, it suffers
from low computational efficiency. In response to this problem, we have
successfully developed an MC dose calculation package, gCTD, on GPU architecture
under the NVIDIA CUDA platform for fast and accurate estimation of the x-ray
imaging dose received by a patient during a CT or CBCT scan. Techniques have
been developed particularly for the GPU architecture to achieve high
computational efficiency. Dose calculations using CBCT scanning geometry in a
homogeneous water phantom and a heterogeneous Zubal head phantom have shown
good agreement between gCTD and EGSnrc, indicating the accuracy of our code. In
terms of improved efficiency, it is found that gCTD attains a speedup of ~400
times in the homogeneous water phantom and ~76.6 times in the Zubal phantom
compared to EGSnrc. As for absolute computation time, imaging dose calculation
for the Zubal phantom can be accomplished in ~17 sec with the average relative
standard deviation of 0.4%. Though our gCTD code has been developed and tested
in the context of CBCT scans, with simple modification of geometry it can be
used for assessing imaging dose in CT scans as well.
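
The "average relative standard deviation" quoted for an MC dose calculation
can be estimated in several ways. The sketch below uses the batch method, a
common choice that is assumed here rather than taken from gCTD; the function
names and the 50% dose threshold for averaging are hypothetical.

```python
import math

def batch_statistics(batch_doses):
    """batch_doses: list of batches, each a list of per-voxel doses.
    Returns per-voxel mean dose and relative standard deviation of the mean."""
    n_batches = len(batch_doses)
    n_voxels = len(batch_doses[0])
    means, rel_sigmas = [], []
    for v in range(n_voxels):
        vals = [batch[v] for batch in batch_doses]
        mean = sum(vals) / n_batches
        var = sum((x - mean) ** 2 for x in vals) / (n_batches - 1)
        sigma_mean = math.sqrt(var / n_batches)
        means.append(mean)
        rel_sigmas.append(sigma_mean / mean if mean > 0 else float("inf"))
    return means, rel_sigmas

def average_relative_sigma(means, rel_sigmas, threshold=0.5):
    """Average the relative uncertainty over voxels above threshold * max dose."""
    cutoff = threshold * max(means)
    picked = [r for m, r in zip(means, rel_sigmas) if m >= cutoff]
    return sum(picked) / len(picked)

# Tiny made-up example: 4 batches, 3 voxels.
batches = [
    [1.00, 0.52, 0.05],
    [1.02, 0.50, 0.04],
    [0.98, 0.49, 0.06],
    [1.00, 0.51, 0.05],
]
means, rel = batch_statistics(batches)
print("mean dose per voxel:", means)
print("average relative sigma:", average_relative_sigma(means, rel))
```

Because the uncertainty of the mean falls roughly as one over the square root
of the number of simulated histories, a speed-up factor translates directly
into either shorter run time or lower uncertainty at fixed run time.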

Monte Carlo (MC) simulation is commonly considered to be the most accurate
dose calculation method in radiotherapy. However, its efficiency still requires
improvement for many routine clinical applications. In this paper, we present
our recent progress towards the development of a GPU-based MC dose calculation
package, gDPM v2.0. It utilizes the parallel computation ability of a GPU to
achieve high efficiency, while maintaining the same particle transport physics
as in the original DPM code and hence the same level of simulation accuracy. In
GPU computing, divergence of execution paths between threads can considerably
reduce the efficiency. Since photons and electrons undergo different physics
and hence attain different execution paths, we use a simulation scheme where
photon transport and electron transport are separated to partially relieve the
thread divergence issue. A high-performance random number generator and hardware
linear interpolation are also utilized. We have also developed various
components to handle fluence map and linac geometry, so that gDPM can be used
to compute dose distributions for realistic IMRT or VMAT treatment plans. Our
gDPM package is tested for its accuracy and efficiency in both phantoms and
realistic patient cases. In all cases, the average relative uncertainties are
less than 1%. A statistical t-test is performed and the dose difference between
the CPU and the GPU results is found to be not statistically significant in over
96% of the high dose region and over 97% of the entire region. Speed-up factors of
69.1 ~ 87.2 have been observed using an NVIDIA Tesla C2050 GPU card against a
2.27 GHz Intel Xeon CPU processor. For realistic IMRT and VMAT plans, MC dose
calculation can be completed with less than 1% standard deviation in 36.1~39.6
sec using gDPM.
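
A voxel-wise comparison of two MC results can be sketched as follows. The exact
form of the test is not specified here, so this assumes a per-voxel statistic
$t = |D_1 - D_2| / \sqrt{\sigma_1^2 + \sigma_2^2}$ with a two-sided 95%
critical value of 1.96 under a normal approximation. The function name, the
`dose_cut` parameter, and the numbers are hypothetical.

```python
import math

def passing_fraction(d_cpu, s_cpu, d_gpu, s_gpu, t_crit=1.96, dose_cut=0.0):
    """Fraction of voxels whose CPU/GPU dose difference is not significant.
    d_*: per-voxel doses, s_*: per-voxel absolute standard deviations.
    dose_cut: only voxels with CPU dose >= dose_cut * max dose are counted."""
    d_max = max(d_cpu)
    passed = total = 0
    for dc, sc, dg, sg in zip(d_cpu, s_cpu, d_gpu, s_gpu):
        if dc < dose_cut * d_max:
            continue
        total += 1
        denom = math.sqrt(sc ** 2 + sg ** 2)
        if denom > 0:
            t = abs(dc - dg) / denom
        else:
            t = 0.0 if dc == dg else math.inf
        if t < t_crit:
            passed += 1
    return passed / total

# Made-up three-voxel example.
d_cpu = [1.0, 2.0, 0.1]
d_gpu = [1.01, 2.2, 0.1]
sigma = [0.01, 0.02, 0.01]

print("entire region:   ", passing_fraction(d_cpu, sigma, d_gpu, sigma))
print("high dose region:", passing_fraction(d_cpu, sigma, d_gpu, sigma, dose_cut=0.5))
```

Evaluating the fraction once over all voxels and once with a dose cut mirrors
the distinction between the entire region and the high dose region.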