-
Advances in multimaterial 3D printing have the potential to reproduce various
visual appearance attributes of an object in addition to its shape. Since many
existing 3D file formats encode color and translucency by RGBA textures mapped
to 3D shapes, RGBA information is particularly important for practical
applications. In contrast to color (encoded by RGB), which is specified by the
object's reflectance, selected viewing conditions and a standard observer,
translucency (encoded by A) is neither linked to any measurable physical nor
perceptual quantity. Thus, reproducing translucency encoded by A is open for
interpretation.
In this paper, we propose a rigorous definition for A suitable for use in
graphical 3D printing, which is independent of the 3D printing hardware and
software, and which links both optical material properties and perceptual
uniformity for human observers. By deriving our definition from the absorption
and scattering coefficients of virtual homogeneous reference materials with an
isotropic phase function, we achieve two important properties. First, a simple
adjustment of A is possible, which preserves the translucency appearance if an
object is re-scaled for printing. Second, determining the value of A for a real
(potentially non-homogeneous) material, can be achieved by minimizing a
distance function between light transport measurements of this material and
simulated measurements of the reference materials. Such measurements can be
conducted by commercial spectrophotometers used in graphic arts.
Finally, we conduct visual experiments employing the method of constant
stimuli, and derive from them an embedding of A into a nearly perceptually
uniform scale of translucency for the reference materials.
-
In order to avoid the curse of dimensionality, frequently encountered in Big
Data analysis, there was a vast development in the field of linear and
nonlinear dimension reduction techniques in recent years. These techniques
(sometimes referred to as manifold learning) assume that the scattered input
data is lying on a lower dimensional manifold, thus the high dimensionality
problem can be overcome by learning the lower dimensionality behavior. However,
in real life applications, data is often very noisy. In this work, we propose a
method to approximate $\mathcal{M}$ a $d$-dimensional $C^{m+1}$ smooth
submanifold of $\mathbb{R}^n$ ($d \ll n$) based upon noisy scattered data
points (i.e., a data cloud). We assume that the data points are located "near"
the lower dimensional manifold and suggest a non-linear moving least-squares
projection on an approximating $d$-dimensional manifold. Under some mild
assumptions, the resulting approximant is shown to be infinitely smooth and of
high approximation order (i.e., $O(h^{m+1})$, where $h$ is the fill distance
and $m$ is the degree of the local polynomial approximation). The method
presented here assumes no analytic knowledge of the approximated manifold and
the approximation algorithm is linear in the large dimension $n$. Furthermore,
the approximating manifold can serve as a framework to perform operations
directly on the high dimensional data in a computationally efficient manner.
This way, the preparatory step of dimension reduction, which induces
distortions to the data, can be avoided altogether.
-
This paper presents the nearest neighbor value (NNV) algorithm for high
resolution (H.R.) image interpolation. The difference between the proposed
algorithm and conventional nearest neighbor algorithm is that the concept
applied, to estimate the missing pixel value, is guided by the nearest value
rather than the distance. In other words, the proposed concept selects one
pixel, among four directly surrounding the empty location, whose value is
almost equal to the value generated by the conventional bilinear interpolation
algorithm. The proposed method demonstrated higher performances in terms of
H.R. when compared to the conventional interpolation algorithms mentioned.
-
We propose a novel approach for deformation-aware neural networks that learn
the weighting and synthesis of dense volumetric deformation fields. Our method
specifically targets the space-time representation of physical surfaces from
liquid simulations. Liquids exhibit highly complex, non-linear behavior under
changing simulation conditions such as different initial conditions. Our
algorithm captures these complex phenomena in two stages: a first neural
network computes a weighting function for a set of pre-computed deformations,
while a second network directly generates a deformation field for refining the
surface. Key for successful training runs in this setting is a suitable loss
function that encodes the effect of the deformations, and a robust calculation
of the corresponding gradients. To demonstrate the effectiveness of our
approach, we showcase our method with several complex examples of flowing
liquids with topology changes. Our representation makes it possible to rapidly
generate the desired implicit surfaces. We have implemented a mobile
application to demonstrate that real-time interactions with complex liquid
effects are possible with our approach.
-
A descriptive approach for automatic generation of visual blends is
presented. The implemented system, the Blender, is composed of two components:
the Mapper and the Visual Blender. The approach uses structured visual
representations along with sets of visual relations which describe how the
elements (in which the visual representation can be decomposed) relate among
each other. Our system is a hybrid blender, as the blending process starts at
the Mapper (conceptual level) and ends at the Visual Blender (visual
representation level). The experimental results show that the Blender is able
to create analogies from input mental spaces and produce well-composed blends,
which follow the rules imposed by its base-analogy and its relations. The
resulting blends are visually interesting and some can be considered as
unexpected.
-
We present in this paper a generic and parameter-free algorithm to
efficiently build a wide variety of optical components, such as mirrors or
lenses, that satisfy some light energy constraints. In all of our problems, one
is given a collimated or point light source and a desired illumination after
reflection or refraction and the goal is to design the geometry of a mirror or
lens which transports exactly the light emitted by the source onto the target.
We first propose a general framework and show that eight different optical
component design problems amount to solving a light energy conservation
equation that involves the computation of visibility diagrams. We then show
that these diagrams all have the same structure and can be obtained by
intersecting a 3D Power diagram with a planar or spherical domain. This allows
us to propose an efficient and fully generic algorithm capable to solve these
eight optical component design problems. The support of the prescribed target
illumination can be a set of directions or a set of points located at a finite
distance. Our solutions satisfy design constraints such as convexity or
concavity. We show the effectiveness of our algorithm on simulated and
fabricated examples.
-
The problem of $\textit{visual metamerism}$ is defined as finding a family of
perceptually indistinguishable, yet physically different images. In this paper,
we propose our NeuroFovea metamer model, a foveated generative model that is
based on a mixture of peripheral representations and style transfer
forward-pass algorithms. Our gradient-descent free model is parametrized by a
foveated VGG19 encoder-decoder which allows us to encode images in high
dimensional space and interpolate between the content and texture information
with adaptive instance normalization anywhere in the visual field. Our
contributions include: 1) A framework for computing metamers that resembles a
noisy communication system via a foveated feed-forward encoder-decoder network
-- We observe that metamerism arises as a byproduct of noisy perturbations that
partially lie in the perceptual null space; 2) A perceptual optimization scheme
as a solution to the hyperparametric nature of our metamer model that requires
tuning of the image-texture tradeoff coefficients everywhere in the visual
field which are a consequence of internal noise; 3) An ABX psychophysical
evaluation of our metamers where we also find that the rate of growth of the
receptive fields in our model match V1 for reference metamers and V2 between
synthesized samples. Our model also renders metamers at roughly a second,
presenting a $\times1000$ speed-up compared to the previous work, which allows
for tractable data-driven metamer experiments.
-
A new method is presented, allowing for the generation of 3D terrain and
texture from coherent noise. The method is significantly faster than prevailing
fractal brownian motion approaches, while producing results of equivalent
quality. The algorithm is derived through a systematic approach that
generalizes to an arbitrary number of spatial dimensions and gradient
smoothness. The results are compared, in terms of performance and quality, to
fundamental and efficient gradient noise methods widely used in the domain of
fast terrain generation: Perlin noise and OpenSimplex noise. Finally, to
objectively quantify the degree of realism of the results, a fractal analysis
of the generated landscapes is performed and compared to real terrain data.
-
As many different 3D volumes could produce the same 2D x-ray image, inverting
this process is challenging. We show that recent deep learning-based
convolutional neural networks can solve this task. As the main challenge in
learning is the sheer amount of data created when extending the 2D image into a
3D volume, we suggest firstly to learn a coarse, fixed-resolution volume which
is then fused in a second step with the input x-ray into a high-resolution
volume. To train and validate our approach we introduce a new dataset that
comprises of close to half a million computer-simulated 2D x-ray images of 3D
volumes scanned from 175 mammalian species. Applications of our approach
include stereoscopic rendering of legacy x-ray images, re-rendering of x-rays
including changes of illumination, view pose or geometry. Our evaluation
includes comparison to previous tomography work, previous learning methods
using our data, a user study and application to a set of real x-rays.
-
In recent years, consumer-level depth cameras have been adopted for various
applications. However, they often produce depth maps at only a moderately high
frame rate (approximately 30 frames per second), preventing them from being
used for applications such as digitizing human performance involving fast
motion. On the other hand, low-cost, high-frame-rate video cameras are
available. This motivates us to develop a hybrid camera that consists of a
high-frame-rate video camera and a low-frame-rate depth camera and to allow
temporal interpolation of depth maps with the help of auxiliary color images.
To achieve this, we develop a novel algorithm that reconstructs intermediate
depth maps and estimates scene flow simultaneously. We test our algorithm on
various examples involving fast, non-rigid motions of single or multiple
objects. Our experiments show that our scene flow estimation method is more
precise than a tracking-based method and the state-of-the-art techniques.
-
We propose a platform-independent multi-threaded function library that
provides data structures to generate, differentiate and render both the
ordinary basis and the normalized B-basis of a user-specified extended
Chebyshev (EC) space that comprises the constants and can be identified with
the solution space of a constant-coefficient homogeneous linear differential
equation defined on a sufficiently small interval. Using the obtained
normalized B-bases, our library can also generate, (partially) differentiate,
modify and visualize a large family of so-called B-curves and tensor product
B-surfaces. Moreover, the library also implements methods that can be used to
perform dimension elevation, to subdivide B-curves and B-surfaces by means of
de Casteljau-like B-algorithms, and to generate basis transformations for the
B-representation of arbitrary integral curves and surfaces that are described
in traditional parametric form by means of the ordinary bases of the underlying
EC spaces. Independently of the algebraic, exponential, trigonometric or mixed
type of the applied EC space, the proposed library is numerically stable and
efficient up to a reasonable dimension number and may be useful for academics
and engineers in the fields of Approximation Theory, Computer Aided Geometric
Design, Computer Graphics, Isogeometric and Numerical Analysis.
-
For a monochrome layer $x$ of opacity $0\le o_x\le1 $ placed on another
monochrome layer of opacity 1, the result given by the standard formula is
$$\small\Pi\left({\bf
C}_\varphi\right)=1+\sum_{n=1}^2\left(2-n-(-1)^no_{\chi(\varphi+1)}\right)\left(\chi(n+\varphi-1)-o_{\chi(n+\varphi-1)}\right),$$
the formula being of course explained in detail in this paper. We will
eventually deduce a very simple theorem, generalize it and then see its
validity with alternative formulas to this standard containing the same main
properties here exposed.
-
The colorful appearance of a physical painting is determined by the
distribution of paint pigments across the canvas, which we model as a per-pixel
mixture of a small number of pigments with multispectral absorption and
scattering coefficients. We present an algorithm to efficiently recover this
structure from an RGB image, yielding a plausible set of pigments and a low RGB
reconstruction error. We show that under certain circumstances we are able to
recover pigments that are close to ground truth, while in all cases our results
are always plausible. Using our decomposition, we repose standard digital image
editing operations as operations in pigment space rather than RGB, with
interestingly novel results. We demonstrate tonal adjustments, selection
masking, cut-copy-paste, recoloring, palette summarization, and edge
enhancement.
-
We propose MAD-GAN, an intuitive generalization to the Generative Adversarial
Networks (GANs) and its conditional variants to address the well known problem
of mode collapse. First, MAD-GAN is a multi-agent GAN architecture
incorporating multiple generators and one discriminator. Second, to enforce
that different generators capture diverse high probability modes, the
discriminator of MAD-GAN is designed such that along with finding the real and
fake samples, it is also required to identify the generator that generated the
given fake sample. Intuitively, to succeed in this task, the discriminator must
learn to push different generators towards different identifiable modes. We
perform extensive experiments on synthetic and real datasets and compare
MAD-GAN with different variants of GAN. We show high quality diverse sample
generations for challenging tasks such as image-to-image translation and face
generation. In addition, we also show that MAD-GAN is able to disentangle
different modalities when trained using highly challenging diverse-class
dataset (e.g. dataset with images of forests, icebergs, and bedrooms). In the
end, we show its efficacy on the unsupervised feature representation task. In
Appendix, we introduce a similarity based competing objective (MAD-GAN-Sim)
which encourages different generators to generate diverse samples based on a
user defined similarity metric. We show its performance on the image-to-image
translation, and also show its effectiveness on the unsupervised feature
representation task.
-
In this article, we devise a concise algorithm for computing BOCP. Our method
is simple, easy-to-implement but without loss of efficiency. Given two
circular-arc polygons with $m$ and $n$ edges respectively, our method runs in
$O(m+n+(l+k)\log l)$ time, using $O(m+n+k)$ space, where $k$ is the number of
intersections, and $l$ is the number of {edge}s. Our algorithm has the power to
approximate to linear complexity when $k$ and $l$ are small. The superiority of
the proposed algorithm is also validated through empirical study.
-
This paper proposes a new data-driven approach to model detailed splashes for
liquid simulations with neural networks. Our model learns to generate
small-scale splash detail for the fluid-implicit-particle method using training
data acquired from physically parametrized, high resolution simulations. We use
neural networks to model the regression of splash formation using a classifier
together with a velocity modifier. For the velocity modification, we employ a
heteroscedastic model. We evaluate our method for different spatial scales,
simulation setups, and solvers. Our simulation results demonstrate that our
model significantly improves visual fidelity with a large amount of realistic
droplet formation and yields splash detail much more efficiently than finer
discretizations.
-
A robust and informative local shape descriptor plays an important role in
mesh registration. In this regard, spectral descriptors that are based on the
spectrum of the Laplace-Beltrami operator have been a popular subject of
research for the last decade due to their advantageous properties, such as
isometry invariance. Despite such, however, spectral descriptors often fail to
give a correct similarity measure for non-isometric cases where the metric
distortion between the models is large. Hence, they are not reliable for
correspondence matching problems when the models are not isometric. In this
paper, it is proposed a method to improve the similarity metric of spectral
descriptors for correspondence matching problems. We embed a spectral shape
descriptor into a different metric space where the Euclidean distance between
the elements directly indicates the geometric dissimilarity. We design and
train a Siamese neural network to find such an embedding, where the embedded
descriptors are promoted to rearrange based on the geometric similarity. We
demonstrate our approach can significantly enhance the performance of the
conventional spectral descriptors by the simple augmentation achieved via the
Siamese neural network in comparison to other state-of-the-art methods.
-
Purpose: The class of models that can be represented by STL files is larger
than the class of models that can be printed using additive manufacturing
technologies. Stated differently, there exist well-formed STL files that cannot
be printed. In this paper such a gap is formalized and a fully automatic
procedure is described to turn any such file into a printable model.
Approach: Based on well-established concepts from combinatorial topology, we
provide an unambiguous description of all the mathematical entities involved in
the modeling-printing pipeline. Specifically, we formally define the conditions
that an STL file must satisfy to be printable and, based on these, we design an
as-exact-as-possible repairing algorithm.
Findings: We have found that, in order to cope with all the possible triangle
configurations, the algorithm must distinguish between triangles that bound
solid parts and triangles that constitute zero-thickness sheets. Only the
former set can be fixed without distortion.
Originality: Previous methods that are guaranteed to fix all the possible
configurations provide only approximate solutions with an unnecessary
distortion. Conversely, our procedure is as exact as possible, meaning that no
visible distortion is introduced unless it is strictly imposed by limitations
of the printing device. Thanks to such an unprecedented flexibility and
accuracy, this algorithm is expected to significantly simplify the
modeling-printing process, in particular within the continuously emerging
non-professional "maker" communities.
-
Sketch-based modeling strives to bring the ease and immediacy of drawing to
the 3D world. However, while drawings are easy for humans to create, they are
very challenging for computers to interpret due to their sparsity and
ambiguity. We propose a data-driven approach that tackles this challenge by
learning to reconstruct 3D shapes from one or more drawings. At the core of our
approach is a deep convolutional neural network (CNN) that predicts occupancy
of a voxel grid from a line drawing. This CNN provides us with an initial 3D
reconstruction as soon as the user completes a single drawing of the desired
shape. We complement this single-view network with an updater CNN that refines
an existing prediction given a new drawing of the shape created from a novel
viewpoint. A key advantage of our approach is that we can apply the updater
iteratively to fuse information from an arbitrary number of viewpoints, without
requiring explicit stroke correspondences between the drawings. We train both
CNNs by rendering synthetic contour drawings from hand-modeled shape
collections as well as from procedurally-generated abstract shapes. Finally, we
integrate our CNNs in a minimal modeling interface that allows users to
seamlessly draw an object, rotate it to see its 3D reconstruction, and refine
it by re-drawing from another vantage point using the 3D reconstruction as
guidance. The main strengths of our approach are its robustness to freehand
bitmap drawings, its ability to adapt to different object categories, and the
continuum it offers between single-view and multi-view sketch-based modeling.
-
We study data-driven representations for three-dimensional triangle meshes,
which are one of the prevalent objects used to represent 3D geometry. Recent
works have developed models that exploit the intrinsic geometry of manifolds
and graphs, namely the Graph Neural Networks (GNNs) and its spectral variants,
which learn from the local metric tensor via the Laplacian operator. Despite
offering excellent sample complexity and built-in invariances, intrinsic
geometry alone is invariant to isometric deformations, making it unsuitable for
many applications. To overcome this limitation, we propose several upgrades to
GNNs to leverage extrinsic differential geometry properties of
three-dimensional surfaces, increasing its modeling power.
In particular, we propose to exploit the Dirac operator, whose spectrum
detects principal curvature directions --- this is in stark contrast with the
classical Laplace operator, which directly measures mean curvature. We coin the
resulting models \emph{Surface Networks (SN)}. We prove that these models
define shape representations that are stable to deformation and to
discretization, and we demonstrate the efficiency and versatility of SNs on two
challenging tasks: temporal prediction of mesh deformations under non-linear
dynamics and generative models using a variational autoencoder framework with
encoders/decoders given by SNs.
-
Blue noise sampling has proved useful for many graphics applications, but
remains underexplored in high-dimensional spaces due to the difficulty of
generating distributions and proving properties about them. We present a blue
noise sampling method with good quality and performance across different
dimensions. The method, spoke-dart sampling, shoots rays from prior samples and
selects samples from these rays. It combines the advantages of two major
high-dimensional sampling methods: the locality of advancing front with the
dimensionality-reduction of hyperplanes, specifically line sampling. We prove
that the output sampling is saturated with high probability, with bounds on
distances between pairs of samples and between any domain point and its nearest
sample. We demonstrate spoke-dart applications for approximate Delaunay graph
construction, global optimization, and robotic motion planning. Both the
blue-noise quality of the output distribution and the adaptability of the
intermediate processes of our method are useful in these applications.
-
This paper introduces a novel weighted unsupervised learning for object
detection using an RGB-D camera. This technique is feasible for detecting the
moving objects in the noisy environments that are captured by an RGB-D camera.
The main contribution of this paper is a real-time algorithm for detecting each
object using weighted clustering as a separate cluster. In a preprocessing
step, the algorithm calculates the pose 3D position X, Y, Z and RGB color of
each data point and then it calculates each data point's normal vector using
the point's neighbor. After preprocessing, our algorithm calculates k-weights
for each data point; each weight indicates membership. Resulting in clustered
objects of the scene.
-
The real world exhibits an abundance of non-stationary textures. Examples
include textures with large-scale structures, as well as spatially variant and
inhomogeneous textures. While existing example-based texture synthesis methods
can cope well with stationary textures, non-stationary textures still pose a
considerable challenge, which remains unresolved. In this paper, we propose a
new approach for example-based non-stationary texture synthesis. Our approach
uses a generative adversarial network (GAN), trained to double the spatial
extent of texture blocks extracted from a specific texture exemplar. Once
trained, the fully convolutional generator is able to expand the size of the
entire exemplar, as well as of any of its sub-blocks. We demonstrate that this
conceptually simple approach is highly effective for capturing large-scale
structures, as well as other non-stationary attributes of the input exemplar.
As a result, it can cope with challenging textures, which, to our knowledge, no
other existing method can handle.
-
We present a novel end-to-end framework for facial performance capture given
a monocular video of an actor's face. Our framework are comprised of 2 parts.
First, to extract the information in the frames, we optimize a triplet loss to
learn the embedding space which ensures the semantically closer facial
expressions are closer in the embedding space and the model can be transferred
to distinguish the expressions that are not presented in the training dataset.
Second, the embeddings are fed into an LSTM network to learn the deformation
between frames. In the experiments, we demonstrated that compared to other
methods, our method can distinguish the delicate motion around lips and
significantly reduce jitters between the tracked meshes.
-
Numerous techniques have been proposed for reconstructing 3D models for
opaque objects in past decades. However, none of them can be directly applied
to transparent objects. This paper presents a fully automatic approach for
reconstructing complete 3D shapes of transparent objects. Through positioning
an object on a turntable, its silhouettes and light refraction paths under
different viewing directions are captured. Then, starting from an initial rough
model generated from space carving, our algorithm progressively optimizes the
model under three constraints: surface and refraction normal consistency,
surface projection and silhouette consistency, and surface smoothness.
Experimental results on both synthetic and real objects demonstrate that our
method can successfully recover the complex shapes of transparent objects and
faithfully reproduce their light refraction properties.