• We introduce Morse branes in the Fukaya category of a holomorphic symplectic manifold, with the goal of constructing tilting objects in the category. We give a construction of a class of Morse branes in cotangent bundles, and apply it to obtain the holomorphic branes that represent the big tilting sheaves on flag varieties.
  • For a semisimple Lie group $G_\mathbb{C}$ over $\mathbb{C}$, we study the homotopy type of the symplectomorphism group of the cotangent bundle of the flag variety and its relation to the braid group. We prove a homotopy equivalence between the two groups in the case $G_\mathbb{C}=SL_3(\mathbb{C})$, under an $SU(3)$-equivariance condition on symplectomorphisms.
  • In this paper, we address the challenging problem of aesthetic image classification, which is to label an input image as being of high or low aesthetic quality. We take both the local and global features of images into consideration. A novel deep convolutional neural network named ILGNet is proposed, which combines Inception modules with a connected layer of both Local and Global features. ILGNet is based on GoogLeNet. Thus, it is easy to take a GoogLeNet pre-trained for large-scale image classification and fine-tune our connected layers on AVA, a large-scale database of aesthetics-related images, i.e. \emph{domain adaptation}. The experiments reveal that our model achieves state-of-the-art results on the AVA database. Both the training and testing speeds of our model are higher than those of the original GoogLeNet.
  • One key challenge in learning-based video compression is that motion predictive coding, a very effective tool for video compression, can hardly be trained into a neural network. In this paper we propose the concept of VoxelCNN, which includes motion extension and hybrid prediction networks. VoxelCNN can model spatiotemporal coherence to effectively perform predictive coding inside the learning network. On the basis of VoxelCNN, we further explore a learning-based framework for video compression with additional components such as iterative analysis/synthesis and binarization. Experimental results demonstrate the effectiveness of the proposed scheme. Although entropy coding and complex configurations are not employed in this paper, we still demonstrate superior performance compared with MPEG-2 and achieve results comparable to those of the H.264 codec. The proposed learning-based scheme provides a possible new direction for further improving the compression efficiency and functionality of future video coding.
  • Coordination services are a fundamental building block of modern cloud systems, providing critical functionalities like configuration management and distributed locking. The major challenge is to achieve low latency and high throughput while providing strong consistency and fault-tolerance. Traditional server-based solutions require multiple round-trip times (RTTs) to process a query. This paper presents NetChain, a new approach that provides scale-free sub-RTT coordination in datacenters. NetChain exploits recent advances in programmable switches to store data and process queries entirely in the network data plane. This eliminates the query processing at coordination servers and cuts the end-to-end latency to as little as half of an RTT---clients only experience processing delay from their own software stack plus network delay, which in a datacenter setting is typically much smaller. We design new protocols and algorithms based on chain replication to guarantee strong consistency and to efficiently handle switch failures. We implement a prototype with four Barefoot Tofino switches and four commodity servers. Evaluation results show that compared to traditional server-based solutions like ZooKeeper, our prototype provides orders of magnitude higher throughput and lower latency, and handles failures gracefully.
  • In the coming Virtual/Augmented Reality (VR/AR) era, 3D content will become as widespread as images and videos are today. The security and privacy of this 3D content should be taken into consideration. 3D content comprises surface models and solid models. Surface models include point clouds, meshes and textured models. Previous work mainly focuses on the encryption of solid models, point clouds and meshes. This work focuses on the most complicated case, the 3D textured model. We propose an encryption method for 3D textured models based on the 3D Lu chaotic mapping. We encrypt the vertices, the polygons and the textures of 3D models separately using the 3D Lu chaotic mapping. Then the encrypted vertices, edges and texture maps are composited together to form the final encrypted 3D textured model. The experimental results reveal that our method can encrypt and decrypt 3D textured models correctly. In addition, our method can resist several attacks, such as brute-force and statistical attacks.
  • Image relighting changes the illumination of an image to a target illumination effect without knowing the original scene geometry, material information or illumination condition. We propose a novel outdoor scene relighting method, which needs only a single reference image and is based on material-constrained layer decomposition. First, the material map is extracted from the input image. Then, the reference image is warped to the input image through patch-match-based image warping. Lastly, the input image is relit using material-constrained layer decomposition. The experimental results reveal that our method can produce, on the input image, an illumination effect similar to that of the reference image, using only a single reference image.
  • Aesthetic quality prediction is a challenging task in the computer vision community because of the complex interplay between semantic content and photographic technique. Recent studies on deep-learning-based aesthetic quality assessment usually use a binary high-low label or a numerical score to represent the aesthetic quality. However, a scalar representation cannot adequately describe the underlying variety of human perception of aesthetics. In this work, we propose to predict the aesthetic score distribution (i.e., a score distribution vector of the ordinal basic human ratings) using a Deep Convolutional Neural Network (DCNN). Conventional DCNNs, which aim to minimize the difference between the predicted scalar numbers or vectors and the ground truth, cannot be directly used for the ordinal basic rating distribution. Thus, a novel CNN based on the Cumulative distribution with Jensen-Shannon divergence (CJS-CNN) is presented to predict the aesthetic score distribution of human ratings, with a new reliability-sensitive learning method based on the kurtosis of the score distribution. Experimental results on a large-scale aesthetic dataset demonstrate the effectiveness of the proposed CJS-CNN in this task. In addition, by recasting the predicted score histogram to a binary score using the mean value and a relatively small-scale CNN, the proposed method outperforms the state-of-the-art methods on aesthetic image classification.
  • Recently, cloud storage and processing have been widely adopted. Mobile users in one family or one team may automatically back up their photos to the same shared cloud storage space. A powerful face detector trained and provided by a third party may be used to retrieve, from the cloud storage server, the photo collection that contains a specific group of persons. However, the privacy of the mobile users may be leaked to the cloud server providers. Meanwhile, the copyright of the face detector should be protected. Thus, in this paper, we propose a protocol for privacy-preserving face retrieval in the cloud for mobile users, which protects the user photos and the face detector simultaneously. The cloud server only provides storage and computing resources and cannot learn anything about the user photos or the face detector. We test our protocol within several families and classes. The experimental results reveal that our protocol can successfully retrieve the proper photos from the cloud server while protecting the user photos and the face detector.
  • Using a scanning microwave microscope, we imaged in water aluminum interconnect lines buried in aluminum and silicon oxides, fabricated through a state-of-the-art 0.13 µm SiGe BiCMOS process. The results were compared with those obtained by atomic force microscopy in both air and water. The images in water were found to be degraded by only approximately 60% relative to those in air.
  • Let $L$ be an exact Lagrangian submanifold of a cotangent bundle $T^* M$, asymptotic to a Legendrian submanifold $\Lambda \subset T^{\infty} M$. We study a locally constant sheaf of $\infty$-categories on $L$, called the sheaf of brane structures or $\mathrm{Brane}_L$. Its fiber is the $\infty$-category of spectra, and we construct a Hamiltonian invariant, fully faithful functor from $\Gamma(L,\mathrm{Brane}_L)$ to the $\infty$-category of sheaves of spectra on $M$ with singular support in $\Lambda$.
  • A cloud server may spend a lot of time, energy and money to train a Viola-Jones type object detector with high accuracy. Clients can upload their photos to the cloud server to find objects. However, a client does not want the content of his/her photos to be leaked. Meanwhile, the cloud server is also reluctant to leak any parameters of the trained object detector. Ten years ago, Avidan & Butman introduced Blind Vision, a method for securely evaluating a Viola-Jones type object detector. Blind Vision uses standard cryptographic tools and is painfully slow to compute, taking a couple of hours to scan a single image. The purpose of this work is to explore an efficient method that can speed up the process. We propose the Random Base Image (RBI) Representation. The original image is divided into random base images. Only the base images are submitted randomly to the cloud server. Thus, the content of the image cannot be leaked. Meanwhile, a random vector and the secure Millionaire protocol are leveraged to protect the parameters of the trained object detector. The RBI representation makes the integral image usable again, which yields a substantial acceleration. The experimental results reveal that our method retains the detection accuracy of the plain vision algorithm and is significantly faster than traditional Blind Vision, with only a very low theoretical probability of information leakage.
  • A new methodology to measure coded image/video quality using the just-noticeable-difference (JND) idea has been proposed. Several small JND-based image/video quality datasets were released by the Media Communications Lab at the University of Southern California. In this work, we present an effort to build a large-scale JND-based coded video quality dataset. The dataset consists of 220 5-second sequences in four resolutions (i.e., $1920 \times 1080$, $1280 \times 720$, $960 \times 540$ and $640 \times 360$). We encode each of the 880 video clips using the H.264 codec with $QP=1, \cdots, 51$ and measure the first three JND points with 30+ subjects. The dataset is called the "VideoSet", which is an acronym for "Video Subject Evaluation Test (SET)". This work describes the subjective test procedure, the detection and removal of outlying measured data, and the properties of the collected JND data. Finally, the significance and implications of the VideoSet for future video coding research and standardization efforts are pointed out. All source/coded video clips as well as the measured JND data included in the VideoSet are available to the public on IEEE DataPort.
  • Over the last two decades, face alignment, or localizing fiducial facial points, has received increasing attention owing to its comprehensive applications in automatic face analysis. However, such a task has proven extremely challenging in unconstrained environments due to many confounding factors, such as pose, occlusions, expression and illumination. While numerous techniques have been developed to address these challenges, this problem is still far from being solved. In this survey, we present an up-to-date critical review of the existing literature on face alignment, focusing on those methods addressing the overall difficulties and challenges of this topic under uncontrolled conditions. Specifically, we categorize existing face alignment techniques, present detailed descriptions of the prominent algorithms within each category, and discuss their advantages and disadvantages. Furthermore, we organize special discussions on the practical aspects of face alignment in-the-wild, towards the development of a robust face alignment system. In addition, we show performance statistics of the state of the art, and conclude this paper with several promising directions for future research.
  • Let $X$ be a compact complex manifold, $D_c^b(X)$ be the bounded derived category of constructible sheaves on $X$, and $Fuk(T^*X)$ be the Fukaya category of $T^*X$. A Lagrangian brane in $Fuk(T^*X)$ is holomorphic if the underlying Lagrangian submanifold is complex analytic in $T^*X_{\mathbb{C}}$, the holomorphic cotangent bundle of $X$. We prove that under the quasi-equivalence between $D^b_c(X)$ and $DFuk(T^*X)$ established in [NaZa09] and [Nad09], holomorphic Lagrangian branes with appropriate grading correspond to perverse sheaves.
  • This paper addresses the problem of competition vs. cooperation between base stations (BSs) in the downlink of a multiple-input multiple-output (MIMO) interference heterogeneous wireless network (HetNet). This research presents a scenario where a macrocell base station (MBS) and a co-channel femtocell base station (FBS), each simultaneously serving its own user equipment (UE), have to choose whether to act as individual systems or to cooperate in coordinated multipoint transmission (CoMP). The paper employs the theories of both non-cooperative and cooperative games in a unified procedure to analyze the decision-making process. The BSs of the competing system are assumed to operate at the \emph{maximum expected sum rate} (MESR) \emph{correlated equilibrium} (CE), which is compared against the value of CoMP to establish the stability of the coalition. It is proven that there exists a threshold geographical separation, $d_{\text{th}}$, between the macrocell user equipment (MUE) and FBS, under which the region of coordination is non-empty. Theoretical results are verified through simulations.
  • In this paper, we propose a solution named Cognitive Sub-Small Cell for Sojourners (CSCS) for a broadly representative small cell scenario, where users can be categorized into two groups: sojourners and inhabitants. CSCS helps to save energy, increase the number of concurrently supportable users and shield inhabitants. We consider two design issues in CSCS: i) determining the number of transmit antennas on sub-small cell APs; ii) controlling downlink inter-sub-small cell interference. For issue i), we devise an algorithm aided by the probability distribution of the number of concurrent sojourners. For issue ii), we propose an interference control scheme named BDBF: Block Diagonalization (BD) Precoding based on uncertain channel state information in conjunction with an auxiliary optimal Beamformer (BF). In the simulation, we investigate how various factors affect the number of transmit antennas on sub-small cell APs. Moreover, we verify a significant conclusion: using BDBF yields more capacity than using the optimal BF alone within a reasonably large radius of the uncertainty region.
  • Existing cellular networks suffer from inflexible and expensive equipment, and complex control-plane protocols. To address these challenges, we present SoftCell, a scalable architecture for supporting fine-grained policies for mobile devices in cellular core networks. The SoftCell controller realizes high-level service policies by directing traffic over paths that traverse a sequence of middleboxes, optimized to the network conditions and user locations. To ensure scalability, the core switches forward traffic on hierarchical addresses (grouped by base station) and policy tags (identifying paths through middleboxes). This minimizes data-plane state in the core switches, and pushes all fine-grained state to software switches at the base stations. These access switches apply fine-grained rules, specified by the controller, to map all traffic to the appropriate addresses and tags. SoftCell guarantees that packets in the same connection traverse the same sequence of middleboxes in both directions, even in the presence of mobility. Our characterization of real LTE workloads, micro-benchmarks on our prototype controller, and large-scale simulations demonstrate that SoftCell improves the flexibility of cellular core networks, while enabling the use of inexpensive commodity switches and middleboxes.
  • Spectrum sharing has recently become a mainstream Cognitive Radio (CR) strategy. We investigate the core issue in this strategy: interference mitigation at the Primary Receiver (PR). We propose a linear precoder design which aims at alleviating, at the source, the interference caused by the Secondary User (SU) for Orthogonal Space-Time Block Coding (OSTBC) based CR. We use a Minimum Variance (MV) approach to design the precoding matrix at the Secondary Transmitter (ST) so as to maximize the Signal to Noise Ratio (SNR) at the Secondary Receiver (SR), on the premise that the orthogonality of OSTBC is kept, the interference introduced to the Primary Link (PL) by the Secondary Link (SL) is maintained under a tolerable level and the total transmitted power constraint at the ST is satisfied. Moreover, the selection of the polarization mode for the SL is incorporated in the precoder design. In order to provide an analytic solution with low computational cost, we put forward an original precoder design algorithm which exploits an auxiliary variable to treat the optimization problem with a mixture of linear and quadratic constraints. Numerical results demonstrate that our proposed precoder design enables the SR to have an acceptable SNR provided that the interference at the PR is maintained below the threshold.
  • A hidden database refers to a dataset that an organization makes accessible on the web by allowing users to issue queries through a search interface. In other words, data acquisition from such a source is not done by following static hyperlinks. Instead, data are obtained by querying the interface and reading the dynamically generated result page. This, together with other facts such as that the interface may answer a query only partially, has prevented hidden databases from being crawled effectively by existing search engines. This paper remedies the problem by giving algorithms to extract all the tuples from a hidden database. Our algorithms are provably efficient, namely, they accomplish the task by performing only a small number of queries, even in the worst case. We also establish theoretical results indicating that these algorithms are asymptotically optimal -- i.e., it is impossible to improve their efficiency by more than a constant factor. The derivation of our upper and lower bound results reveals significant insight into the characteristics of the underlying problem. Extensive experiments confirm that the proposed techniques work very well on all the real datasets examined.
  • We consider the homogeneous components $U_r$ of the map on $R = k[x,y,z]/(x^A, y^B, z^C)$ that multiplies by $x + y + z$. We prove a relationship between the Smith normal forms of submatrices of an arbitrary Toeplitz matrix using Schur polynomials, and use this to give a relationship between Smith normal form entries of $U_r$. We also give a bijective proof of an identity proven by J. Li and F. Zanello equating the determinant of the middle homogeneous component $U_r$ when $(A, B, C) = (a + b, a + c, b + c)$ to the number of plane partitions in an $a$ by $b$ by $c$ box. Finally, we prove that, for certain vector subspaces of $R$, similar identities hold relating determinants to symmetry classes of plane partitions, in particular classes 3, 6, and 8.
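
The determinant identity in the last abstract can be checked numerically for small boxes. The sketch below is illustrative only and is not taken from the paper: it assumes the field has characteristic zero (exact rational arithmetic), takes the middle component to be multiplication by x + y + z from degree r = a + b + c - 2 to degree r + 1, and counts plane partitions with MacMahon's box formula. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def plane_partitions_in_box(a, b, c):
    """MacMahon's box formula: number of plane partitions in an a x b x c box."""
    num, den = 1, 1
    for i, j, k in product(range(1, a + 1), range(1, b + 1), range(1, c + 1)):
        num *= i + j + k - 1
        den *= i + j + k - 2
    return num // den  # the quotient is always an integer


def monomials(A, B, C, r):
    """Exponent triples (i, j, k) of total degree r with i < A, j < B, k < C."""
    return [(i, j, r - i - j)
            for i in range(A) for j in range(B)
            if 0 <= r - i - j < C]


def multiplication_matrix(A, B, C, r):
    """Matrix of multiplication by x + y + z from degree r to degree r + 1."""
    src = monomials(A, B, C, r)
    dst = {m: row for row, m in enumerate(monomials(A, B, C, r + 1))}
    M = [[0] * len(src) for _ in range(len(dst))]
    for col, (i, j, k) in enumerate(src):
        for m in ((i + 1, j, k), (i, j + 1, k), (i, j, k + 1)):
            if m in dst:  # monomials outside the basis are zero in the quotient
                M[dst[m]][col] = 1
    return M


def determinant(M):
    """Exact determinant of a square matrix by Gaussian elimination."""
    n = len(M)
    rows = [[Fraction(x) for x in row] for row in M]
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if rows[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            rows[c], rows[pivot] = rows[pivot], rows[c]
            det = -det  # a row swap flips the sign
        det *= rows[c][c]
        for r in range(c + 1, n):
            factor = rows[r][c] / rows[c][c]
            for k in range(c, n):
                rows[r][k] -= factor * rows[c][k]
    return det


def middle_determinant(a, b, c):
    """|det| of the assumed middle component for (A, B, C) = (a+b, a+c, b+c)."""
    M = multiplication_matrix(a + b, a + c, b + c, a + b + c - 2)
    assert all(len(row) == len(M) for row in M), "middle component must be square"
    return abs(determinant(M))


if __name__ == "__main__":
    for box in [(1, 1, 1), (1, 1, 2)]:
        print(box, middle_determinant(*box), plane_partitions_in_box(*box))
```

For the boxes (1, 1, 1) and (1, 1, 2) the two numbers agree (2 and 3 respectively). The absolute value is taken because the sign of the determinant depends on the ordering chosen for the monomial bases.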