• For massive multiple-input multiple-output (MIMO) systems, linear minimum mean-square error (MMSE) detection has been shown to achieve near-optimal performance but suffers from excessively high complexity due to the large-scale matrix inversion. Because they require no matrix inversion, detection algorithms based on the Gauss-Seidel (GS) method have been shown to be more efficient than conventional ones based on the Neumann series expansion (NSE). In this paper, an efficient GS-based soft-output data detector for massive MIMO and a corresponding VLSI architecture are proposed. To accelerate the convergence of the GS method, a new initial solution is proposed. Several optimizations at the VLSI architecture level are proposed to further reduce the processing latency and area. Reference implementation results on a Xilinx Virtex-7 XC7VX690T FPGA for a massive MIMO system with 128 base-station antennas and 8 users show that our GS-based data detector achieves a throughput of 732 Mb/s with close-to-MMSE error-rate performance. These results demonstrate that the proposed solution has advantages over existing designs in terms of complexity and efficiency, especially under challenging propagation conditions.
  • As the first error-correction codes that provably achieve the symmetric capacity of binary-input discrete memoryless channels (B-DMCs), polar codes were recently chosen by 3GPP for the eMBB control channel. Among existing decoding algorithms, CRC-aided successive cancellation list (CA-SCL) decoding is favorable due to its good performance: the CRC is checked at the end of decoding and helps to eliminate invalid candidates before the final selection. However, this performance comes at the cost of a complexity that grows linearly with the list size $L$. In this paper, tailored CRC-aided SCL (TCA-SCL) decoding is proposed to balance performance and complexity. An analysis of how to choose the proper CRC for a given segment is presented with the help of the \emph{virtual transform} and the \emph{virtual length}. For further performance improvement, a hybrid automatic repeat request (HARQ) scheme is incorporated. Numerical results show that, with complexity similar to that of the state of the art, the proposed TCA-SCL and HARQ-TCA-SCL schemes achieve performance gains of $0.1$ dB and $0.25$ dB, respectively, at a frame error rate of $\textrm{FER}=10^{-2}$. Finally, an efficient TCA-SCL decoder is implemented on an FPGA, demonstrating its advantages over the CA-SCL decoder.
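The CRC-aided selection step mentioned in the polar-code abstract can be illustrated with a minimal sketch. It shows only the conventional CA-SCL final selection (pick the most likely list candidate whose CRC checks), not the tailored, segment-wise TCA-SCL scheme, whose details the abstract does not give. The function names, the toy generator polynomial $x^3 + x + 1$, and the convention that a smaller path metric means a more likely path are all assumptions made for illustration.

```python
def crc_remainder(bits, poly):
    """Remainder of bits(x) * x^deg divided by poly(x) over GF(2).

    `poly` lists the generator coefficients, highest degree first,
    e.g. [1, 0, 1, 1] for x^3 + x + 1 (a toy polynomial, assumed here).
    """
    deg = len(poly) - 1
    reg = list(bits) + [0] * deg
    for i in range(len(bits)):
        if reg[i]:
            # Subtract (XOR) the generator aligned at position i.
            for j, p in enumerate(poly):
                reg[i + j] ^= p
    return reg[-deg:]


def crc_attach(info, poly):
    """Append the CRC bits to the information bits."""
    return list(info) + crc_remainder(info, poly)


def crc_passes(word, poly):
    """True if the trailing CRC bits match the leading information bits."""
    deg = len(poly) - 1
    return crc_remainder(word[:-deg], poly) == list(word[-deg:])


def select_candidate(candidates, poly):
    """Conventional CA-SCL final selection (illustrative only).

    `candidates` is a list of (path_metric, bits) pairs produced by a list
    decoder; a smaller metric is assumed to mean a more likely path.
    Returns the most likely candidate that passes the CRC, or the most
    likely candidate overall if none passes.
    """
    ranked = sorted(candidates, key=lambda c: c[0])
    for _, bits in ranked:
        if crc_passes(bits, poly):
            return bits
    return ranked[0][1]


if __name__ == "__main__":
    poly = [1, 0, 1, 1]
    sent = crc_attach([1, 1, 0, 1], poly)
    wrong = [1 - sent[0]] + sent[1:]  # candidate with one bit error
    # The erroneous candidate has the better metric but fails the CRC.
    print(select_candidate([(0.5, wrong), (0.9, sent)], poly) == sent)
```

In this sketch the CRC is consulted only once, after the list of $L$ candidates is complete, which is why it cannot reduce the list-decoding effort itself.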
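For the Gauss-Seidel detector described in the first abstract, the idea of avoiding an explicit matrix inverse can be sketched as follows. MMSE equalization solves $A\hat{x} = b$ with $A = H^H H + \sigma^2 I$ and $b = H^H y$ (unit symbol energy assumed), and GS sweeps approximate $\hat{x}$ iteratively. This is a plain textbook GS iteration, not the paper's detector: the starting point $D^{-1}b$ used here is a common generic choice and is not the new initial solution the abstract refers to, and all function names are hypothetical.

```python
def mmse_system(H, y, noise_var):
    """Build A = H^H H + noise_var * I and b = H^H y (unit symbol energy assumed)."""
    rows, cols = len(H), len(H[0])
    A = [[sum(complex(H[r][i]).conjugate() * H[r][j] for r in range(rows))
          + (noise_var if i == j else 0.0)
          for j in range(cols)]
         for i in range(cols)]
    b = [sum(complex(H[r][i]).conjugate() * y[r] for r in range(rows))
         for i in range(cols)]
    return A, b


def diagonal_initial_guess(A, b):
    """Generic start x0 = D^{-1} b (an assumption, not the paper's initial solution)."""
    return [b[i] / A[i][i] for i in range(len(b))]


def gauss_seidel(A, b, x0, iterations):
    """Approximate the solution of A x = b with Gauss-Seidel sweeps."""
    n = len(b)
    x = list(x0)
    for _ in range(iterations):
        for i in range(n):
            # Entries x[j] with j < i are already updated in this sweep;
            # entries with j > i still hold the previous sweep's values.
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x


if __name__ == "__main__":
    # Toy 3-antenna, 2-user real-valued channel (hypothetical numbers).
    H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    y = [1.0, -1.0, 0.0]  # noiseless receive vector for symbols (1, -1)
    A, b = mmse_system(H, y, noise_var=0.0)
    x_hat = gauss_seidel(A, b, diagonal_initial_guess(A, b), iterations=40)
    print([round(v.real, 6) for v in x_hat])
```

Each sweep costs on the order of $K^2$ multiplications for $K$ users, and GS converges for Hermitian positive definite $A$, which is the case here whenever $H$ has full column rank or $\sigma^2 > 0$.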