• ### Common-Message Broadcast Channels with Feedback in the Nonasymptotic Regime: Full Feedback(1706.07731)

Aug. 30, 2018 cs.IT, math.IT
We investigate the maximum coding rate achievable on a two-user broadcast channel for the case where a common message is transmitted with feedback using either fixed-blocklength codes or variable-length codes. For the fixed-blocklength-code setup, we establish nonasymptotic converse and achievability bounds. An asymptotic analysis of these bounds reveals that feedback improves the second-order term compared to the no-feedback case. In particular, for a certain class of antisymmetric broadcast channels, we show that the dispersion is halved. For the variable-length-code setup, we demonstrate that the channel dispersion is zero.
• ### Common-Message Broadcast Channels with Feedback in the Nonasymptotic Regime: Stop Feedback(1607.03519)

Aug. 30, 2018 cs.IT, math.IT
We investigate the maximum coding rate for a given average blocklength and error probability over a K-user discrete memoryless broadcast channel for the scenario where a common message is transmitted using variable-length stop-feedback codes. For the point-to-point case, Polyanskiy et al. (2011) demonstrated that variable-length coding combined with stop-feedback significantly increases the speed of convergence of the maximum coding rate to capacity. This speed-up manifests itself in the absence of a square-root penalty in the asymptotic expansion of the maximum coding rate for large blocklengths, i.e., zero dispersion. In this paper, we present nonasymptotic achievability and converse bounds on the maximum coding rate of the common-message K-user discrete memoryless broadcast channel, which strengthen and generalize the ones reported in Trillingsgaard et al. (2015) for the two-user case. An asymptotic analysis of these bounds reveals that zero dispersion cannot be achieved for certain common-message broadcast channels (e.g., the binary symmetric broadcast channel). Furthermore, we identify conditions under which our converse and achievability bounds are tight up to the second order. Through numerical evaluations, we illustrate that our second-order expansions approximate accurately the maximum coding rate and that the speed of convergence to capacity is indeed slower than for the point-to-point case.
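For orientation, the "square-root penalty" mentioned above can be written out as follows. This is a sketch in assumed notation, none of which appears in the abstract: $M^*$ is the maximum number of messages, $C$ the capacity, $V$ the channel dispersion, $Q^{-1}$ the inverse Gaussian tail function, $n$ the fixed blocklength, and $\ell$ the average blocklength. For a point-to-point discrete memoryless channel without feedback, the fixed-blocklength expansion has the well-known form

$$\log M^*(n,\epsilon) = nC - \sqrt{nV}\,Q^{-1}(\epsilon) + O(\log n),$$

whereas the point-to-point result of Polyanskiy et al. (2011) for variable-length stop-feedback codes has no $\sqrt{\ell}$ term:

$$\log M^*_{\mathrm{sf}}(\ell,\epsilon) = \frac{\ell C}{1-\epsilon} + O(\log \ell).$$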
• ### Beta-Beta Bounds: Finite-Blocklength Analog of the Golden Formula(1706.05972)

June 17, 2018 cs.IT, math.IT
It is well known that the mutual information between two random variables can be expressed as the difference of two relative entropies that depend on an auxiliary distribution, a relation sometimes referred to as the golden formula. This paper is concerned with a finite-blocklength extension of this relation. This extension consists of two elements: 1) a finite-blocklength channel-coding converse bound by Polyanskiy and Verdú (2014), which involves the ratio of two Neyman-Pearson $\beta$ functions (beta-beta converse bound); and 2) a novel beta-beta channel-coding achievability bound, expressed again as the ratio of two Neyman-Pearson $\beta$ functions. To demonstrate the usefulness of this finite-blocklength extension of the golden formula, the beta-beta achievability and converse bounds are used to obtain a finite-blocklength extension of Verdú's (2002) wideband-slope approximation. The proof parallels the derivation of the latter, with the beta-beta bounds used in place of the golden formula. The beta-beta (achievability) bound is also shown to be useful in cases where the capacity-achieving output distribution is not a product distribution due to, e.g., a cost constraint or structural constraints on the codebook, such as orthogonality or constant composition. As an example, the bound is used to characterize the channel dispersion of the additive exponential-noise channel and to obtain a finite-blocklength achievability bound (the tightest to date) for multiple-input multiple-output Rayleigh-fading channels with perfect channel state information at the receiver.
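For reference, the golden formula is commonly stated as below. The notation is assumed here rather than taken from the abstract: $P_X$ is the input distribution, $P_{Y|X}$ the channel, $P_Y$ the induced output distribution, and $Q_Y$ an arbitrary auxiliary output distribution with $D(P_Y\,\|\,Q_Y) < \infty$.

$$I(X;Y) = D(P_{Y|X}\,\|\,Q_Y \mid P_X) - D(P_Y\,\|\,Q_Y)$$

Here $D(P_{Y|X}\,\|\,Q_Y \mid P_X)$ denotes the conditional relative entropy, i.e., $D(P_{Y|X=x}\,\|\,Q_Y)$ averaged over $x \sim P_X$.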
• ### 3D Human Pose Estimation in the Wild by Adversarial Learning(1803.09722)

April 16, 2018 cs.CV
Recently, remarkable advances have been achieved in 3D human pose estimation from monocular images because of the powerful Deep Convolutional Neural Networks (DCNNs). Despite their success on large-scale datasets collected in the constrained lab environment, it is difficult to obtain the 3D pose annotations for in-the-wild images. Therefore, 3D human pose estimation in the wild is still a challenge. In this paper, we propose an adversarial learning framework, which distills the 3D human pose structures learned from the fully annotated dataset to in-the-wild images with only 2D pose annotations. Instead of defining hard-coded rules to constrain the pose estimation results, we design a novel multi-source discriminator to distinguish the predicted 3D poses from the ground-truth, which helps to enforce the pose estimator to generate anthropometrically valid poses even with images in the wild. We also observe that a carefully designed information source for the discriminator is essential to boost the performance. Thus, we design a geometric descriptor, which computes the pairwise relative locations and distances between body joints, as a new information source for the discriminator. The efficacy of our adversarial learning framework with the new geometric descriptor has been demonstrated through extensive experiments on widely used public benchmarks. Our approach significantly improves the performance compared with previous state-of-the-art approaches.
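The abstract describes the geometric descriptor only at a high level (pairwise relative locations and distances between body joints). A minimal sketch of that idea might look as follows; the function name `geometric_descriptor`, the tuple-based pose representation, and the output layout are all assumptions made for illustration and are not the authors' implementation.

```python
import math
from itertools import combinations


def geometric_descriptor(joints):
    """Pairwise relative offsets and Euclidean distances between 3D joints.

    joints: list of (x, y, z) tuples, one per body joint (hypothetical layout).
    Returns a list of (i, j, offset, distance) for every joint pair i < j.
    """
    descriptor = []
    for (i, a), (j, b) in combinations(enumerate(joints), 2):
        # Relative location of joint j with respect to joint i.
        offset = tuple(bc - ac for ac, bc in zip(a, b))
        # Euclidean distance between the two joints.
        distance = math.sqrt(sum(c * c for c in offset))
        descriptor.append((i, j, offset, distance))
    return descriptor


# Toy three-joint "pose", used only to exercise the function.
pose = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
desc = geometric_descriptor(pose)
```

For a pose with J joints this yields J(J-1)/2 entries, which could be flattened into a feature vector before being passed to a discriminator.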
• Particle physics has an ambitious and broad experimental programme for the coming decades. This programme requires large investments in detector hardware, either to build new facilities and experiments, or to upgrade existing ones. Similarly, it requires commensurate investment in the R&D of software to acquire, manage, process, and analyse the sheer amounts of data to be recorded. In planning for the HL-LHC in particular, it is critical that all of the collaborating stakeholders agree on the software goals and priorities, and that the efforts complement each other. In this spirit, this white paper describes the R&D activities required to prepare for this software upgrade.
• ### Robust 3D Human Motion Reconstruction Via Dynamic Template Construction(1801.10434)

Jan. 31, 2018 cs.CV, cs.GR
In multi-view human body capture systems, the recovered 3D geometry or even the acquired imagery data can be heavily corrupted due to occlusions, noise, limited field of view, etc. Direct estimation of 3D pose, body shape, or motion from such low-quality data has traditionally been challenging. In this paper, we present a graph-based non-rigid shape registration framework that can simultaneously recover 3D human body geometry and estimate pose/motion at high fidelity. Our approach first generates a global full-body template by registering all poses in the acquired motion sequence. We then construct a deformable graph by utilizing the rigid components in the global template. We directly warp the global template graph back to each motion frame in order to fill in missing geometry. Specifically, we combine local rigidity and temporal coherence constraints to maintain geometry and motion consistencies. Comprehensive experiments on various scenes show that our method is accurate and robust even in the presence of drastic motions.
• ### Identity-Aware Textual-Visual Matching with Latent Co-attention(1708.01988)

Aug. 7, 2017 cs.CV
Textual-visual matching aims at measuring similarities between sentence descriptions and images. Most existing methods tackle this problem without effectively utilizing identity-level annotations. In this paper, we propose an identity-aware two-stage framework for the textual-visual matching problem. Our stage-1 CNN-LSTM network learns to embed cross-modal features with a novel Cross-Modal Cross-Entropy (CMCE) loss. The stage-1 network is able to efficiently screen out easy incorrect matchings and also provides an initial training point for the stage-2 training. The stage-2 CNN-LSTM network refines the matching results with a latent co-attention mechanism. The spatial attention relates each word to the corresponding image regions, while the latent semantic attention aligns different sentence structures to make the matching results more robust to sentence structure variations. Extensive experiments on three datasets with identity-level annotations show that our framework outperforms state-of-the-art approaches by large margins.
• ### Learning Feature Pyramids for Human Pose Estimation(1708.01101)

Aug. 3, 2017 cs.CV
Articulated human pose estimation is a fundamental yet challenging task in computer vision. The difficulty is particularly pronounced in scale variations of human body parts when the camera view changes or severe foreshortening happens. Although pyramid methods are widely used to handle scale changes at inference time, learning feature pyramids in deep convolutional neural networks (DCNNs) is still not well explored. In this work, we design Pyramid Residual Modules (PRMs) to enhance the scale invariance of DCNNs. Given input features, the PRMs learn convolutional filters on various scales of input features, which are obtained with different subsampling ratios in a multi-branch network. Moreover, we observe that it is inappropriate to adopt existing methods to initialize the weights of multi-branch networks, which have recently achieved better performance than plain networks in many tasks. Therefore, we provide a theoretical derivation to extend the current weight initialization scheme to multi-branch network structures. We investigate our method on two standard benchmarks for human pose estimation. Our approach obtains state-of-the-art results on both benchmarks. Code is available at https://github.com/bearpaw/PyraNet.
• ### Designing high-performance composite joints close to parent materials of aluminum matrix composites(1707.01003)

July 4, 2017 cond-mat.mtrl-sci
Composite joints with performance close to that of the parent aluminum matrix composites were fabricated by a new joining technology assisted by ultrasonic vibration. We found that both the microstructure and the mechanical performance depended systematically on the volume fraction and the distribution of reinforcement particles in the bond region. This study can be generalized to the bonding of other ceramic-reinforced metal matrix composites.
• ### Wiretap Channels: Nonasymptotic Fundamental Limits(1706.03866)

June 12, 2017 cs.IT, math.IT
This paper investigates the maximal secret communication rate over a wiretap channel subject to reliability and secrecy constraints at a given blocklength. New achievability and converse bounds are derived, which are uniformly tighter than existing bounds, and lead to the tightest bounds on the second-order coding rate for discrete memoryless and Gaussian wiretap channels. The exact second-order coding rate is established for semi-deterministic wiretap channels, which characterizes the optimal tradeoff between reliability and secrecy in the finite-blocklength regime. Underlying our achievability bounds are two new privacy amplification results, which not only refine the existing results, but also achieve stronger notions of secrecy.
• ### State-Dependent Gaussian Multiple Access Channels: New Outer Bounds and Capacity Results(1705.01640)

May 3, 2017 cs.IT, math.IT
This paper studies a two-user state-dependent Gaussian multiple-access channel (MAC) with state noncausally known at one encoder. Two scenarios are considered: i) each user wishes to communicate an independent message to the common receiver, and ii) the two encoders send a common message to the receiver and the non-cognitive encoder (i.e., the encoder that does not know the state) sends an independent individual message (this model is also known as the MAC with degraded message sets). For both scenarios, new outer bounds on the capacity region are derived, which improve uniformly over the best known outer bounds. In the first scenario, the two corner points of the capacity region as well as the sum rate capacity are established, and it is shown that a single-letter solution is adequate to achieve both the corner points and the sum rate capacity. Furthermore, the full capacity region is characterized in situations in which the sum rate capacity is equal to the capacity of the helper problem. The proof exploits the optimal-transportation idea of Polyanskiy and Wu (which was used previously to establish an outer bound on the capacity region of the interference channel) and the worst-case Gaussian noise result for the case in which the input and the noise are dependent.
• ### Multi-Context Attention for Human Pose Estimation(1702.07432)

Feb. 24, 2017 cs.CV
In this paper, we propose to incorporate convolutional neural networks with a multi-context attention mechanism into an end-to-end framework for human pose estimation. We adopt stacked hourglass networks to generate attention maps from features at multiple resolutions with various semantics. The Conditional Random Field (CRF) is utilized to model the correlations among neighboring regions in the attention map. We further combine the holistic attention model, which focuses on the global consistency of the full human body, and the body part attention model, which focuses on the detailed description for different body parts. Hence our model has the ability to focus on different granularity from local salient regions to global semantic-consistent spaces. Additionally, we design novel Hourglass Residual Units (HRUs) to increase the receptive field of the network. These units are extensions of residual units with a side branch incorporating filters with larger receptive fields, hence features with various scales are learned and combined within the HRUs. The effectiveness of the proposed multi-context attention mechanism and the hourglass residual units is evaluated on two widely used human pose estimation benchmarks. Our approach outperforms all existing methods on both benchmarks over all the body parts.
• ### Progressively Diffused Networks for Semantic Image Segmentation(1702.05839)

Feb. 20, 2017 cs.CV
This paper introduces Progressively Diffused Networks (PDNs) for unifying multi-scale context modeling with deep feature learning, taking semantic image segmentation as an exemplar application. Prior neural networks, such as ResNet, tend to enhance representational power by increasing the depth of architectures and driving the training objective across layers. However, we argue that spatial dependencies in different layers, which generally represent the rich contexts among data elements, are also critical to building deep and discriminative representations. To this end, our PDNs progressively broadcast information over the learned feature maps by inserting a stack of information diffusion layers, each of which exploits multi-dimensional convolutional LSTMs (Long-Short-Term Memory Structures). In each LSTM unit, a special type of atrous filter is designed to capture the short-range and long-range dependencies from various neighbors to a certain site of the feature map and pass the accumulated information to the next layer. In extensive experiments on semantic image segmentation benchmarks (e.g., ImageNet Parsing, PASCAL VOC2012 and PASCAL-Part), our framework substantially improves performance over popular existing neural network models and achieves state-of-the-art results on ImageNet Parsing for large-scale semantic segmentation.
• ### Personalized Classifier Ensemble Pruning Framework for Mobile Crowdsourcing(1701.07166)

Jan. 25, 2017 cs.DC, cs.HC, cs.LG
Ensemble learning has been widely employed by mobile applications, ranging from environmental sensing to activity recognition. One of the fundamental issues in ensemble learning is the trade-off between classification accuracy and computational cost, which is the goal of ensemble pruning. During crowdsourcing, the centralized aggregator releases ensemble learning models to a large number of mobile participants for task evaluation or as the crowdsourcing learning results, while different participants may seek different levels of the accuracy-cost trade-off. However, most existing ensemble pruning approaches consider only one identical level of such trade-off. In this study, we present an efficient ensemble pruning framework for personalized accuracy-cost trade-offs via multi-objective optimization. Specifically, for the commonly used linear-combination style of the trade-off, we provide an objective-mixture optimization to further reduce the number of ensemble candidates. Experimental results show that our framework is highly efficient for personalized ensemble pruning, and achieves much better pruning performance with objective-mixture optimization when compared to state-of-the-art approaches.
• ### Dislocation Activities at the Martensite Phase Transformation Interface in Metastable Austenitic Stainless Steel: An In-situ TEM Study(1612.08282)

Dec. 25, 2016 cond-mat.mtrl-sci
Understanding the mechanism of martensitic transformation is of great importance in developing advanced high strength steels, especially TRansformation-Induced Plasticity (TRIP) steels. The TRIP effect leads to enhanced work-hardening rate, postponed onset of necking and excellent formability. In-situ transmission electron microscopy has been performed to systematically investigate the dynamic interactions between dislocations and alpha martensite at microscale. Local stress concentrations, e.g. from notches or dislocation pile-ups, render free edges and grain boundaries favorable nucleation sites for alpha martensite. Its growth leads to partial dislocation emission on two independent slip planes from the hetero-interface when the austenite matrix is initially free of dislocations. The kinematic analysis reveals that activating slip systems on two independent {111} planes of austenite is necessary to accommodate the interfacial mismatch strain. Full dislocation emission is generally observed inside austenite regions that contain a high density of dislocations. In both situations, phase boundary propagation generates large amounts of dislocations entering into the matrix, which renders the total deformation compatible and provides substantial strain hardening of the host phase. These moving dislocation sources enable plastic relaxation and prevent local damage accumulation by intense slipping on the softer side of the interfacial region. Thus, finely dispersed martensite distribution renders plastic deformation more uniform throughout the austenitic matrix, which explains the exceptional combination of strength and ductility of TRIP steels.
• ### $D \rightarrow a_1, f_1$ transition form factors and semileptonic decays via 3-point QCD sum rules(1608.03651)

Aug. 12, 2016 hep-ph
By using the 3-point QCD sum rules, we calculate the transition form factors of $D$ decays into the spin triplet axial vector mesons $a_1(1260)$, $f_1(1285)$, $f_1(1420)$. In the calculations, we consider the quark contents of each meson in detail. In view of the fact that the isospin of $a_1(1260)$ is one, we calculate the $D^+ \rightarrow a_1^0 (1260)$ and $D^0 \rightarrow a_1^- (1260)$ transition form factors separately. In the case of $f_1(1285), f_1(1420)$, the mixing between light flavor $SU(3)$ singlet and octet is taken into account. Based on the form factors obtained here, we give predictions for the branching ratios of relevant semileptonic decays, which can be tested in future experiments.
• ### Mutual Information Optimally Local Private Discrete Distribution Estimation(1607.08025)

July 27, 2016 cs.IT, math.IT
We consider statistical learning (e.g. discrete distribution estimation) with local $\epsilon$-differential privacy, which preserves each data provider's privacy locally, and aim to optimize statistical data utility under the privacy constraints. Specifically, we study maximizing mutual information between a provider's data and its private view, and give the exact mutual information bound along with a mechanism that attains it, the $k$-subset mechanism. The mutual information optimal mechanism randomly outputs a size-$k$ subset of the original data domain with a carefully chosen probability assignment, where $k$ varies with the privacy level $\epsilon$ and the data domain size $d$. After analysing the limitations of existing local private mechanisms from a mutual information perspective, we propose an efficient implementation of the $k$-subset mechanism for discrete distribution estimation, and show its optimality guarantees over existing approaches.
• ### Nonasymptotic coding-rate bounds for binary erasure channels with feedback(1607.06837)

July 22, 2016 cs.IT, math.IT
We present nonasymptotic achievability and converse bounds on the maximum coding rate (for a fixed average error probability and a fixed average blocklength) of variable-length full-feedback (VLF) and variable-length stop-feedback (VLSF) codes operating over a binary erasure channel (BEC). For the VLF setup, the achievability bound relies on a scheme that maps each message onto a variable-length Huffman codeword and then repeats each bit of the codeword until it is received correctly. The converse bound is inspired by the meta-converse framework by Polyanskiy, Poor, and Verdú (2010) and relies on binary sequential hypothesis testing. For the case of zero error probability, our achievability and converse bounds match. For the VLSF case, we provide achievability bounds that exploit the following feature of the BEC: the decoder can assess the correctness of its estimate by verifying whether the chosen codeword is the only one that is compatible with the erasure pattern. One of these bounds is obtained by analyzing the performance of a variable-length extension of random linear fountain codes. The gap between the VLSF achievability and the VLF converse bound, when the number of messages is small, is significant: $23\%$ for 8 messages on a BEC with erasure probability $0.5$. The absence of a tight VLSF converse bound does not allow us to assess whether this gap is fundamental.
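The VLF scheme sketched in the abstract (Huffman codeword, then repeat each bit until it gets through) can be illustrated with a short calculation. The sketch below assumes that each bit needs a geometrically distributed number of channel uses with mean 1/(1 - δ) on a BEC with erasure probability δ; the function names are hypothetical, and the code does not reproduce the paper's bounds.

```python
import heapq
from itertools import count


def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for the given probabilities."""
    tie = count()  # unique tie-breaker so the heap never compares the lists
    heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tie), s1 + s2))
    return lengths


def expected_blocklength(probs, erasure_prob):
    """Average channel uses when each codeword bit is repeated until unerased."""
    lengths = huffman_lengths(probs)
    avg_bits = sum(p * n for p, n in zip(probs, lengths))
    # Each bit takes a Geometric(1 - erasure_prob) number of uses on average.
    return avg_bits / (1.0 - erasure_prob)


# 8 equiprobable messages on a BEC with erasure probability 0.5.
avg_uses = expected_blocklength([1 / 8] * 8, 0.5)  # 6.0 channel uses
```

In this toy case every Huffman codeword has 3 bits and each bit takes 2 channel uses on average, so the scheme needs 6 channel uses on average to deliver the message without error.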
• ### Minimum Energy to Send $k$ Bits Over Multiple-Antenna Fading Channels(1507.03843)

May 20, 2016 cs.IT, math.IT
This paper investigates the minimum energy required to transmit $k$ information bits with a given reliability over a multiple-antenna Rayleigh block-fading channel, with and without channel state information (CSI) at the receiver. No feedback is assumed. It is well known that the ratio between the minimum energy per bit and the noise level converges to $-1.59$ dB as $k$ goes to infinity, regardless of whether CSI is available at the receiver or not. This paper shows that lack of CSI at the receiver causes a slowdown in the speed of convergence to $-1.59$ dB as $k\to\infty$ compared to the case of perfect receiver CSI. Specifically, we show that, in the no-CSI case, the gap to $-1.59$ dB is proportional to $((\log k) /k)^{1/3}$, whereas when perfect CSI is available at the receiver, this gap is proportional to $1/\sqrt{k}$. In both cases, the gap to $-1.59$ dB is independent of the number of transmit antennas and of the channel's coherence time. Numerically, we observe that, when the receiver is equipped with a single antenna, to achieve an energy per bit of $- 1.5$ dB in the no-CSI case, one needs to transmit at least $7\times 10^7$ information bits, whereas $6\times 10^4$ bits suffice for the case of perfect CSI at the receiver.
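As a quick numerical aside, the $-1.59$ dB limit is $10\log_{10}(\ln 2)$, and the two convergence rates quoted above can be compared directly. In the sketch below the proportionality constants are set to 1 and the logarithm is taken as natural; both are assumptions made purely for illustration, so the output shows the scaling behaviour only and not the paper's actual gaps.

```python
import math

# Asymptotic minimum energy per bit over noise level, in dB: 10*log10(ln 2).
shannon_limit_db = 10 * math.log10(math.log(2))


def no_csi_scaling(k):
    """Scaling term ((log k) / k)^(1/3) for the no-CSI case (unit constant assumed)."""
    return (math.log(k) / k) ** (1.0 / 3.0)


def csi_scaling(k):
    """Scaling term 1 / sqrt(k) for perfect receiver CSI (unit constant assumed)."""
    return 1.0 / math.sqrt(k)


for k in (10**4, 10**6, 10**8):
    print(f"k={k:>9}: no-CSI term {no_csi_scaling(k):.4f}, CSI term {csi_scaling(k):.6f}")
```

The no-CSI term decays much more slowly, which is consistent with the far larger number of information bits the abstract reports for reaching $-1.5$ dB without CSI.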
• ### New Insights on Stacking Fault Behavior in Twin Induced Plasticity from Meta-Atom Molecular Dynamics Simulations(1604.00579)

April 3, 2016 cond-mat.mtrl-sci
There is growing interest in promoting deformation twinning for plasticity in advanced materials, as highly organized twin boundaries are beneficial to a better strength-ductility combination in contrast to disordered grain boundaries. Twinning deformation typically involves the kinetics of stacking faults, their interaction with dislocations, and dislocation-twin boundary interactions. While the latter has been intensively investigated, the dynamics of stacking faults is less well understood. In this work, we report several new insights on the stacking fault behavior in twin induced plasticity from our meta-atom molecular dynamics simulation: the stacking fault interactions are dominated by dislocation reactions taking place spontaneously, different from the mechanism proposed in the literature; the competition among generating a single stacking fault, a twinning partial and a trailing partial dislocation is dependent on a unique parameter, i.e. stacking fault energy, which in turn determines deformation twinning behaviors. The complex twin-slip and twin-dislocation interactions demonstrate the dual role of deformation twins as both dislocation barrier and storage, potentially contributing to the high strength and ductility of advanced materials like TWIP steels, where deformation-twinning-dominated plasticity accounts for the superb strength-ductility combination.
• ### Finite-Blocklength Bounds for Wiretap Channels(1601.06055)

Jan. 22, 2016 cs.IT, math.IT
This paper investigates the maximal secrecy rate over a wiretap channel subject to reliability and secrecy constraints at a given blocklength. New achievability and converse bounds are derived, which are shown to be tighter than existing bounds. The bounds also lead to the tightest second-order coding rate for discrete memoryless and Gaussian wiretap channels.
• ### A Beta-Beta Achievability Bound with Applications(1601.05880)

Jan. 22, 2016 cs.IT, math.IT
A channel coding achievability bound expressed in terms of the ratio between two Neyman-Pearson $\beta$ functions is proposed. This bound is the dual of a converse bound established earlier by Polyanskiy and Verdú (2014). The new bound turns out to simplify considerably the analysis in situations where the channel output distribution is not a product distribution, for example due to a cost constraint or a structural constraint (such as orthogonality or constant composition) on the channel inputs. Connections to existing bounds in the literature are discussed. The bound is then used to derive 1) an achievability bound on the channel dispersion of additive non-Gaussian noise channels with random Gaussian codebooks, 2) the channel dispersion of the exponential-noise channel, 3) a second-order expansion for the minimum energy per bit of an AWGN channel, and 4) a lower bound on the maximum coding rate of a multiple-input multiple-output Rayleigh-fading channel with perfect channel state information at the receiver, which is the tightest known achievability result.
• ### Short-Packet Communications over Multiple-Antenna Rayleigh-Fading Channels(1412.7512)

Dec. 16, 2015 cs.IT, math.IT
Motivated by the current interest in ultra-reliable, low-latency, machine-type communication systems, we investigate the tradeoff between reliability, throughput, and latency in the transmission of information over multiple-antenna Rayleigh block-fading channels. Specifically, we obtain finite-blocklength, finite-SNR upper and lower bounds on the maximum coding rate achievable over such channels for a given constraint on the packet error probability. Numerical evidence suggests that our bounds delimit tightly the maximum coding rate already for short blocklengths (packets of about 100 symbols). Furthermore, our bounds reveal the existence of a tradeoff between the rate gain obtainable by spreading each codeword over all available time-frequency-spatial degrees of freedom, and the rate loss caused by the need of estimating the fading coefficients over these degrees of freedom. In particular, our bounds allow us to determine the optimal number of transmit antennas and the optimal number of time-frequency diversity branches that maximize the rate. Finally, we show that infinite-blocklength performance metrics such as the ergodic capacity and the outage capacity yield inaccurate throughput estimates.
• ### Finite-SNR Bounds on the Sum-Rate Capacity of Rayleigh Block-Fading Multiple-Access Channels with no a Priori CSI(1501.01957)

Aug. 9, 2015 cs.IT, math.IT
We provide nonasymptotic upper and lower bounds on the sum-rate capacity of Rayleigh block-fading multiple-access channels for the setup where a priori channel state information is not available. The upper bound relies on a dual formula for channel capacity and on the assumption that the users can cooperate perfectly. The lower bound is derived assuming a noncooperative scenario, where each user employs unitary space-time modulation (independently from the other users). Numerical results show that the gap between the upper and the lower bound is small already at moderate SNR values. This suggests that the sum-rate capacity gains obtainable through user cooperation are minimal.
• ### Inspection games in a mean field setting(1507.08339)

July 29, 2015 math.PR, math.OC
In this paper, we present a new development of inspection games in a mean field setting. In our dynamic version of an inspection game, there is one inspector and a large number N of interacting inspectees with a finite state space. By applying the mean field game methodology, we present a solution as an epsilon-equilibrium to this type of inspection game, where epsilon goes to 0 as N tends to infinity. In order to facilitate numerical analysis of this new type of inspection game, we conduct an approximation analysis, that is, we approximate the optimal Lipschitz continuous switching strategies by smooth switching strategies. We show that any approximating smooth switching strategy is also an epsilon-equilibrium solution to the inspection game with a large but finite number N of inspectees, with epsilon being of order 1/N.