• We present an automatic moment capture system that runs in real time on mobile cameras. The system is designed to run in viewfinder mode and capture a burst sequence of frames before and after the shutter is pressed. For each frame, the system predicts a "goodness" score in real time, from which the best moment in the burst can be selected immediately after the shutter is released, without any user intervention. To solve the problem, we develop a highly efficient deep neural network ranking model, which implicitly learns a "latent relative attribute" space to capture subtle visual differences within a sequence of burst images. The overall goodness is then computed as a linear aggregation of the goodness scores of all the latent attributes. The latent relative attributes and the aggregation function can be seamlessly integrated into one fully convolutional network and trained end to end. To obtain a compact model that can run on mobile devices in real time, we explore and evaluate a wide range of network design choices, taking into account the constraints of model size, computational cost, and accuracy. Extensive studies show that the best frame predicted by our model matches users' top-1 choice (out of 11 on average) in $64.1\%$ of cases and falls within their top-3 choices in $86.2\%$ of cases. Moreover, the model (only 0.47 MB) can run in real time on mobile devices, e.g., only 13 ms on an iPhone 7 for one frame prediction.
  • Optical flow estimation is a widely known problem in computer vision, introduced by J. J. Gibson (1950) to describe how humans visually perceive stimulus objects. Optical flow can be estimated by solving for the motion vectors of a region of interest across different points in time. In this paper, we assume a nearly uniform change of velocity between two nearby frames and solve the optical flow problem with the traditional Lucas-Kanade method (1981). This method minimizes the error between the template and the target frame warped back onto the template. Solving this minimization requires an optimization method, and such methods differ in convergence rate and error. We explore first- and second-order optimization methods and compare their results with the Gauss-Newton method used in Lucas-Kanade. We generated 105 videos with 10,500 frames from synthetic objects, and 10 videos with 1,000 frames from real-world footage. Our experimental results could be used to tune parameters for the Lucas-Kanade method.
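The Lucas-Kanade minimization described in the second abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pure-translation warp, grayscale images stored as lists of rows, bilinear sampling, and central-difference gradients. The names `bilinear`, `gaussian_blob`, and `lucas_kanade_translation` are hypothetical helpers introduced only for this illustration. Each iteration takes one Gauss-Newton step on the sum of squared differences between the template and the warped target.

```python
import math


def bilinear(img, x, y):
    """Sample img at real-valued (x, y), clamping coordinates to the image."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bot = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bot


def gaussian_blob(h, w, cx, cy, sigma):
    """Synthetic test image (an assumption for the demo): one Gaussian blob."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
             for x in range(w)] for y in range(h)]


def lucas_kanade_translation(template, target, iters=50, tol=1e-6):
    """Estimate (px, py) such that target(x + px, y + py) ~ template(x, y)."""
    h, w = len(template), len(template[0])
    px = py = 0.0
    for _ in range(iters):
        # Accumulate the 2x2 Gauss-Newton normal equations H * dp = b.
        hxx = hxy = hyy = bx = by = 0.0
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                wx, wy = x + px, y + py
                # Gradient of the target at the warped location.
                gx = 0.5 * (bilinear(target, wx + 1, wy) - bilinear(target, wx - 1, wy))
                gy = 0.5 * (bilinear(target, wx, wy + 1) - bilinear(target, wx, wy - 1))
                # Residual between the template and the warped target.
                err = template[y][x] - bilinear(target, wx, wy)
                hxx += gx * gx
                hxy += gx * gy
                hyy += gy * gy
                bx += gx * err
                by += gy * err
        det = hxx * hyy - hxy * hxy
        if abs(det) < 1e-12:
            break  # Degenerate Hessian (e.g. a textureless region).
        dpx = (hyy * bx - hxy * by) / det
        dpy = (hxx * by - hxy * bx) / det
        px += dpx
        py += dpy
        if math.hypot(dpx, dpy) < tol:
            break
    return px, py


if __name__ == "__main__":
    template = gaussian_blob(32, 32, 16.0, 16.0, 4.0)
    target = gaussian_blob(32, 32, 17.5, 15.0, 4.0)
    print(lucas_kanade_translation(template, target))
```

In this sketch the only piece specific to Gauss-Newton is the solve of `H * dp = b`. Replacing that update with a scaled gradient step (`dp` proportional to `b`) would give a first-order variant, which is the kind of comparison the abstract describes.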