-
Reliable obstacle detection and classification in rough and unstructured
terrain such as agricultural fields or orchards remains a challenging problem.
These environments involve large variations in both geometry and appearance,
challenging perception systems that rely on only a single sensor modality.
Geometrically, tall grass, fallen leaves, or terrain roughness can mistakenly
be perceived as nontraversable or might even obscure actual obstacles.
Likewise, traversable grass or dirt roads and obstacles such as trees and
bushes might be visually ambiguous. In this paper, we combine appearance- and
geometry-based detection methods by probabilistically fusing lidar and camera
sensing with semantic segmentation using a conditional random field. We apply a
state-of-the-art multimodal fusion algorithm from the scene analysis domain and
adjust it for obstacle detection in agriculture with moving ground vehicles.
This involves explicitly handling sparse point cloud data and exploiting both
spatial, temporal, and multimodal links between corresponding 2D and 3D
regions. The proposed method was evaluated on a diverse data set, comprising a
dairy paddock and different orchards gathered with a perception research robot
in Australia. Results showed that for a two-class classification problem
(ground and nonground), only the camera leveraged from information provided by
the other modality with an increase in the mean classification score of 0.5%.
However, as more classes were introduced (ground, sky, vegetation, and object),
both modalities complemented each other with improvements of 1.4% in 2D and
7.9% in 3D. Finally, introducing temporal links between successive frames
resulted in improvements of 0.2% in 2D and 1.5% in 3D.
-
In order to have effective human-AI collaboration, it is necessary to address
how the AI agent's behavior is being perceived by the humans-in-the-loop. When
the agent's task plans are generated without such considerations, they may
often demonstrate inexplicable behavior from the human's point of view. This
problem may arise due to the human's partial or inaccurate understanding of the
agent's planning model. This may have serious implications from increased
cognitive load to more serious concerns of safety around a physical agent. In
this paper, we address this issue by modeling plan explicability as a function
of the distance between a plan that agent makes and the plan that human expects
it to make. We learn a regression model for mapping the plan distances to
explicability scores of plans and develop an anytime search algorithm that can
use this model as a heuristic to come up with progressively explicable plans.
We evaluate the effectiveness of our approach in a simulated autonomous car
domain and a physical robot domain.
-
A defining feature of sampling-based motion planning is the reliance on an
implicit representation of the state space, which is enabled by a set of
probing samples. Traditionally, these samples are drawn either
probabilistically or deterministically to uniformly cover the state space. Yet,
the motion of many robotic systems is often restricted to "small" regions of
the state space, due to, for example, differential constraints or
collision-avoidance constraints. To accelerate the planning process, it is thus
desirable to devise non-uniform sampling strategies that favor sampling in
those regions where an optimal solution might lie. This paper proposes a
methodology for non-uniform sampling, whereby a sampling distribution is
learned from demonstrations, and then used to bias sampling. The sampling
distribution is computed through a conditional variational autoencoder,
allowing sample generation from the latent space conditioned on the specific
planning problem. This methodology is general, can be used in combination with
any sampling-based planner, and can effectively exploit the underlying
structure of a planning problem while maintaining the theoretical guarantees of
sampling-based approaches. Specifically, on several planning problems, the
proposed methodology is shown to effectively learn representations for the
relevant regions of the state space, resulting in an order of magnitude
improvement in terms of success rate and convergence to the optimal cost.
-
A great deal of work aims to discover large general purpose models of image
interest or memorability for visual search and information retrieval. This
paper argues that image interest is often domain and user specific, and that
efficient mechanisms for learning about this domain-specific image interest as
quickly as possible, while limiting the amount of data-labelling required, are
often more useful to end-users. This work uses pairwise image comparisons to
reduce the labelling burden on these users, and introduces an image interest
estimation approach that performs similarly to recent data hungry deep learning
approaches trained using pairwise ranking losses. Here, we use a Gaussian
process model to interpolate image interest inferred using a Bayesian ranking
approach over image features extracted using a pre-trained convolutional neural
network. Results show that fitting a Gaussian process in high-dimensional image
feature space is not only computationally feasible, but also effective across a
broad range of domains. The proposed probabilistic interest estimation approach
produces image interests paired with uncertainties that can be used to identify
images for which additional labelling is required and measure inference
convergence, allowing for sample efficient active model training. Importantly,
the probabilistic formulation allows for effective visual search and
information retrieval when limited labelling data is available.
-
We address the problem of controlling the workspace of a 3-DoF mobile robot.
In a human-robot shared space, robots should navigate in a human-acceptable way
according to the users' demands. For this purpose, we employ virtual borders,
that are non-physical borders, to allow a user the restriction of the robot's
workspace. To this end, we propose an interaction method based on a laser
pointer to intuitively define virtual borders. This interaction method uses a
previously developed framework based on robot guidance to change the robot's
navigational behavior. Furthermore, we extend this framework to increase the
flexibility by considering different types of virtual borders, i.e. polygons
and curves separating an area. We evaluated our method with 15 non-expert users
concerning correctness, accuracy and teaching time. The experimental results
revealed a high accuracy and linear teaching time with respect to the border
length while correctly incorporating the borders into the robot's navigational
map. Finally, our user study showed that non-expert users can employ our
interaction method.
-
Mobile ground robots operating on unstructured terrain must predict which
areas of the environment they are able to pass in order to plan feasible paths.
We address traversability estimation as a heightmap classification problem: we
build a convolutional neural network that, given an image representing the
heightmap of a terrain patch, predicts whether the robot will be able to
traverse such patch from left to right. The classifier is trained for a
specific robot model (wheeled, tracked, legged, snake-like) using simulation
data on procedurally generated training terrains; the trained classifier can be
applied to unseen large heightmaps to yield oriented traversability maps, and
then plan traversable paths. We extensively evaluate the approach in simulation
on six real-world elevation datasets, and run a real-robot validation in one
indoor and one outdoor environment.
-
Tactile-based blind grasping addresses realistic robotic grasping in which
the hand only has access to proprioceptive and tactile sensors. The robotic
hand has no prior knowledge of the object/grasp properties, such as object
weight, inertia, and shape. There exists no manipulation controller that
rigorously guarantees object manipulation in such a setting. Here, a robust
control law is proposed for object manipulation in tactile-based blind
grasping. The analysis ensures semi-global asymptotic and exponential stability
in the presence of model uncertainties and external disturbances that are
neglected in related work. Simulation and experimental results validate the
effectiveness of the proposed approach.
-
We introduce a neural architecture for navigation in novel environments. Our
proposed architecture learns to map from first-person views and plans a
sequence of actions towards goals in the environment. The Cognitive Mapper and
Planner (CMP) is based on two key ideas: a) a unified joint architecture for
mapping and planning, such that the mapping is driven by the needs of the task,
and b) a spatial memory with the ability to plan given an incomplete set of
observations about the world. CMP constructs a top-down belief map of the world
and applies a differentiable neural net planner to produce the next action at
each time step. The accumulated belief of the world enables the agent to track
visited regions of the environment. We train and test CMP on navigation
problems in simulation environments derived from scans of real world buildings.
Our experiments demonstrate that CMP outperforms alternate learning-based
architectures, as well as, classical mapping and path planning approaches in
many cases. Furthermore, it naturally extends to semantically specified goals,
such as 'going to a chair'. We also deploy CMP on physical robots in indoor
environments, where it achieves reasonable performance, even though it is
trained entirely in simulation.
-
The fresh water reservoirs are one of the main power resources of
Pakistan.These water reservoirs are in the form of Tarbela Dam, Mangla Dam,
Bhasha Dam,and Warsak Dam. To estimate the current power capability of the
Dams, the statistical information about the water in the dam has to be clear
and precise. For the purpose of water management monthly or yearly survey of
the dams required. One of the important parameter is to find the water level of
water, which can help us in finding the pressure and flow of water in dams. The
existing surveying systems have some problems, i.e., risky, errors in
measurement and sometimes expensive. Our project has tried a lot to overcome
these flaws and to develop more economical, safe and accurate system for
finding depth values of dams and ponds. The key purpose of Our Project
Autonomous Surveying Boat is to have it log water depths along a predefined set
of points. The Autonomous Surveying Boat floats in water according to
predefined path, getting the coordinates from GPS Sensor and direction is
controlled by using Magnetometer Sensor. It stores its data on SD card as a
text file for later readings. The boat can also be used to find the average
capacity of the dam. The average depth is calculated from the measured depth
values at different set points of the dam. The actual length of the dam is
determined by the magnetometer. The numbers of surveys over the time can help
us in finding the silting ratio in dams.For square dams the length and width of
the dam are measured and the average depth, then using these three parameters
we can estimate the average capacity of the dam.The boat is scalable for
furthered modification if needed.
-
We present a novel framework for the automatic discovery and recognition of
motion primitives in videos of human activities. Given the 3D pose of a human
in a video, human motion primitives are discovered by optimizing the `motion
flux', a quantity which captures the motion variation of a group of skeletal
joints. A normalization of the primitives is proposed in order to make them
invariant with respect to a subject anatomical variations and data sampling
rate. The discovered primitives are unknown and unlabeled and are
unsupervisedly collected into classes via a hierarchical non-parametric Bayes
mixture model. Once classes are determined and labeled they are further
analyzed for establishing models for recognizing discovered primitives. Each
primitive model is defined by a set of learned parameters.
Given new video data and given the estimated pose of the subject appearing on
the video, the motion is segmented into primitives, which are recognized with a
probability given according to the parameters of the learned models.
Using our framework we build a publicly available dataset of human motion
primitives, using sequences taken from well-known motion capture datasets. We
expect that our framework, by providing an objective way for discovering and
categorizing human motion, will be a useful tool in numerous research fields
including video analysis, human inspired motion generation, learning by
demonstration, intuitive human-robot interaction, and human behavior analysis.
-
This work studies the problem of stochastic dynamic filtering and state
propagation with complex beliefs. The main contribution is GP-SUM, a filtering
algorithm tailored to dynamic systems and observation models expressed as
Gaussian Processes (GP), and to states represented as a weighted sum of
Gaussians. The key attribute of GP-SUM is that it does not rely on
linearizations of the dynamic or observation models, or on unimodal Gaussian
approximations of the belief, hence enables tracking complex state
distributions. The algorithm can be seen as a combination of a sampling-based
filter with a probabilistic Bayes filter. On the one hand, GP-SUM operates by
sampling the state distribution and propagating each sample through the dynamic
system and observation models. On the other hand, it achieves effective
sampling and accurate probabilistic propagation by relying on the GP form of
the system, and the sum-of-Gaussian form of the belief. We show that GP-SUM
outperforms several GP-Bayes and Particle Filters on a standard benchmark. We
also demonstrate its use in a pushing task, predicting with experimental
accuracy the naturally occurring non-Gaussian distributions.
-
In this note, we extend the algorithms Extra and subgradient-push to a new
algorithm ExtraPush for consensus optimization with convex differentiable
objective functions over a directed network. When the stationary distribution
of the network can be computed in advance}, we propose a simplified algorithm
called Normalized ExtraPush. Just like Extra, both ExtraPush and Normalized
ExtraPush can iterate with a fixed step size. But unlike Extra, they can take a
column-stochastic mixing matrix, which is not necessarily doubly stochastic.
Therefore, they remove the undirected-network restriction of Extra.
Subgradient-push, while also works for directed networks, is slower on the same
type of problem because it must use a sequence of diminishing step sizes.
We present preliminary analysis for ExtraPush under a bounded sequence
assumption. For Normalized ExtraPush, we show that it naturally produces a
bounded, linearly convergent sequence provided that the objective function is
strongly convex.
In our numerical experiments, ExtraPush and Normalized ExtraPush performed
similarly well. They are significantly faster than subgradient-push, even when
we hand-optimize the step sizes for the latter.
-
This study considers the control of parent-child systems where a parent
system is acted on by a set of controllable child systems (i.e. a swarm).
Examples of such systems include a swarm of robots pushing an object over a
surface, a swarm of aerial vehicles carrying a large load, or a set of end
effectors manipulating an object. In this paper, a general approach for
decoupling the swarm from the parent system through a low-dimensional abstract
state space is presented. The requirements of this approach are given along
with how constraints on both systems propagate through the abstract state and
impact the requirements of the controllers for both systems. To demonstrate,
several controllers with hard state constraints are designed to track a given
desired angle trajectory of a tilting plane with a swarm of robots driving on
top. Both homogeneous and heterogeneous swarms of varying sizes and properties
are considered to test the robustness of this architecture. The controllers are
shown to be locally asymptotically stable and are demonstrated in simulation.
-
Characterizing human values is a topic deeply interwoven with the sciences,
humanities, art, and many other human endeavors. In recent years, a number of
thinkers have argued that accelerating trends in computer science, cognitive
science, and related disciplines foreshadow the creation of intelligent
machines which meet and ultimately surpass the cognitive abilities of human
beings, thereby entangling an understanding of human values with future
technological development. Contemporary research accomplishments suggest
sophisticated AI systems becoming widespread and responsible for managing many
aspects of the modern world, from preemptively planning users' travel schedules
and logistics, to fully autonomous vehicles, to domestic robots assisting in
daily living. The extrapolation of these trends has been most forcefully
described in the context of a hypothetical "intelligence explosion," in which
the capabilities of an intelligent software agent would rapidly increase due to
the presence of feedback loops unavailable to biological organisms. The
possibility of superintelligent agents, or simply the widespread deployment of
sophisticated, autonomous AI systems, highlights an important theoretical
problem: the need to separate the cognitive and rational capacities of an agent
from the fundamental goal structure, or value system, which constrains and
guides the agent's actions. The "value alignment problem" is to specify a goal
structure for autonomous agents compatible with human values. In this brief
article, we suggest that recent ideas from affective neuroscience and related
disciplines aimed at characterizing neurological and behavioral universals in
the mammalian class provide important conceptual foundations relevant to
describing human values. We argue that the notion of "mammalian value systems"
points to a potential avenue for fundamental research in AI safety and AI
ethics.
-
The complex physical properties of highly deformable materials such as
clothes pose significant challenges fanipulation systems. We present a novel
visual feedback dictionary-based method for manipulating defoor autonomous
robotic mrmable objects towards a desired configuration. Our approach is based
on visual servoing and we use an efficient technique to extract key features
from the RGB sensor stream in the form of a histogram of deformable model
features. These histogram features serve as high-level representations of the
state of the deformable material. Next, we collect manipulation data and use a
visual feedback dictionary that maps the velocity in the high-dimensional
feature space to the velocity of the robotic end-effectors for manipulation. We
have evaluated our approach on a set of complex manipulation tasks and
human-robot manipulation tasks on different cloth pieces with varying material
characteristics.
-
Loop-closure detection on 3D data is a challenging task that has been
commonly approached by adapting image-based solutions. Methods based on local
features suffer from ambiguity and from robustness to environment changes while
methods based on global features are viewpoint dependent. We propose SegMatch,
a reliable loop-closure detection algorithm based on the matching of 3D
segments. Segments provide a good compromise between local and global
descriptions, incorporating their strengths while reducing their individual
drawbacks.
SegMatch does not rely on assumptions of "perfect segmentation", or on the
existence of "objects" in the environment, which allows for reliable execution
on large scale, unstructured environments. We quantitatively demonstrate that
SegMatch can achieve accurate localization at a frequency of 1Hz on the largest
sequence of the KITTI odometry dataset. We furthermore show how this algorithm
can reliably detect and close loops in real-time, during online operation. In
addition, the source code for the SegMatch algorithm will be made available
after publication.
-
This paper considers the chattering problem of sliding mode control while
delay in robot manipulator caused chaos in such electromechanical systems.
Fractional calculus as a powerful theorem to produce a novel sliding mode;
which has a dynamic essence is used for chattering elimination. To realize the
control of a class of chaotic systems in master-slave configuration this novel
fractional dynamic sliding mode control scheme is presented and examined on
delay based chaotic robot in joint and work space. Also the stability of the
closed-loop system is guaranteed by Lyapunov stability theory. Beside these,
delayed robot motions are sorted out for qualitative and quantification study.
Finally, numerical simulation example illustrates the feasibility of proposed
control method.
-
Autonomous driving requires operation in different behavioral modes ranging
from lane following and intersection crossing to turning and stopping. However,
most existing deep learning approaches to autonomous driving do not consider
the behavioral mode in the training strategy. This paper describes a technique
for learning multiple distinct behavioral modes in a single deep neural network
through the use of multi-modal multi-task learning. We study the effectiveness
of this approach, denoted MultiNet, using self-driving model cars for driving
in unstructured environments such as sidewalks and unpaved roads. Using labeled
data from over one hundred hours of driving our fleet of 1/10th scale model
cars, we trained different neural networks to predict the steering angle and
driving speed of the vehicle in different behavioral modes. We show that in
each case, MultiNet networks outperform networks trained on individual modes
while using a fraction of the total number of parameters.
-
In this paper, we introduce a dynamic fiducial marker which can change its
appearance according to the spatiotemporal requirements of the visual
perception task of a mobile robot using a camera as the sensor. We present a
control scheme to dynamically change the appearance of the marker in order to
increase the range of detection and to assure a better accuracy on the close
range. The marker control takes into account the camera to marker distance
(which influences the scale of the marker in image coordinates) to select which
fiducial markers to display. Hence, we realize a tight coupling between the
visual pose control of the mobile robot and the appearance of the dynamic
fiducial marker. Additionally, we discuss the practical implications of time
delays due to processing time and communication delays between the robot and
the marker. Finally, we propose a real-time dynamic marker visual servoing
control scheme for quadcopter landing and evaluate the performance on a
real-world example.
-
We report on an extensive study of the benefits and limitations of current
deep learning approaches to object recognition in robot vision scenarios,
introducing a novel dataset used for our investigation. To avoid the biases in
currently available datasets, we consider a natural human-robot interaction
setting to design a data-acquisition protocol for visual object recognition on
the iCub humanoid robot. Analyzing the performance of off-the-shelf models
trained off-line on large-scale image retrieval datasets, we show the necessity
for knowledge transfer. We evaluate different ways in which this last step can
be done, and identify the major bottlenecks affecting robotic scenarios. By
studying both object categorization and identification problems, we highlight
key differences between object recognition in robotics applications and in
image retrieval tasks, for which the considered deep learning approaches have
been originally designed. In a nutshell, our results confirm the remarkable
improvements yield by deep learning in this setting, while pointing to specific
open challenges that need be addressed for seamless deployment in robotics.
-
We develop a new analysis of sampling-based motion planning in Euclidean
space with uniform random sampling, which significantly improves upon the
celebrated result of Karaman and Frazzoli (2011) and subsequent work.
Particularly, we prove the existence of a critical connection radius
proportional to ${\Theta(n^{-1/d})}$ for $n$ samples and ${d}$ dimensions:
Below this value the planner is guaranteed to fail (similarly shown by the
aforementioned work, ibid.). More importantly, for larger radius values the
planner is asymptotically (near-)optimal. Furthermore, our analysis yields an
explicit lower bound of ${1-O( n^{-1})}$ on the probability of success. A
practical implication of our work is that asymptotic (near-)optimality is
achieved when each sample is connected to only ${\Theta(1)}$ neighbors. This is
in stark contrast to previous work which requires ${\Theta(\log n)}$
connections, that are induced by a radius of order ${\left(\frac{\log
n}{n}\right)^{1/d}}$. Our analysis is not restricted to PRM and applies to a
variety of PRM-based planners, including RRG, FMT* and BTT. Continuum
percolation plays an important role in our proofs. Lastly, we develop similar
theory for all the aforementioned planners when constructed with deterministic
samples, which are then sparsified in a randomized fashion. We believe that
this new model, and its analysis, is interesting in its own right.
-
Contact modeling plays a central role in motion planning, simulation, and
control of legged robots, as legged locomotion is realized through contact. The
two prevailing approaches to model the contact consider rigid and compliant
premise at interaction ports. Contrary to the dynamics model of legged systems
with rigid contact (without impact) which is straightforward to develop, there
is no consensus among researchers to employ a standard compliant contact model.
Our main goal in this paper is to study the dynamics model structure of bipedal
walking systems with a rigid contact and a \textit{novel} compliant contact
model and to present experimental validation of both models. For the model with
rigid contact, after developing the model of the articulated bodies in flight
phase without any contact with the environment, we apply the holonomic
constraints at contact points and develop a constrained dynamics model of the
robot in both single and double support phases. For the model with compliant
contact, we propose a novel nonlinear contact model and simulate the motion of
the robot using this model. In order to show the performance of the developed
models, we compare obtained results from these models to the empirical
measurements from bipedal walking of the human-sized humanoid robot SURENA III,
which has been designed and fabricated at CAST, University of Tehran. This
analysis shows the merit of both models in estimating dynamic behavior of the
robot walking on a semi-rigid surface. The model with rigid contact, which is
less complex and independent of the physical properties of the contacting
bodies, can be employed for model-based motion optimization, analysis as well
as control, while the model with compliant contact and more complexity is
suitable for more realistic simulation scenarios.
-
The theoretical ability of modular robots to reconfigure in response to
complex tasks in a priori unknown environments has frequently been cited as an
advantage and remains a major motivator for work in the field. We present a
modular robot system capable of autonomously completing high-level tasks by
reactively reconfiguring to meet the needs of a perceived, a priori unknown
environment. The system integrates perception, high-level planning, and modular
hardware, and is validated in three hardware demonstrations. Given a high-level
task specification, a modular robot autonomously explores an unknown
environment, decides when and how to reconfigure, and manipulates objects to
complete its task. The system architecture balances distributed mechanical
elements with centralized perception, planning, and control. By providing an
example of how a modular robot system can be designed to leverage reactive
reconfigurability in unknown environments, we have begun to lay the groundwork
for modular self-reconfigurable robots to address tasks in the real world.
-
This paper addresses the problem of Human-Aware Navigation (HAN), using multi
camera sensors to implement a vision-based person tracking system. The main
contributions of this paper are as follows: a novel and efficient Deep Learning
person detection and a standardization of human-aware constraints. In the first
stage of the approach, we propose to cascade the Aggregate Channel Features
(ACF) detector with a deep Convolutional Neural Network (CNN) to achieve fast
and accurate Pedestrian Detection (PD). Regarding the human awareness (that can
be defined as constraints associated with the robot's motion), we use a mixture
of asymmetric Gaussian functions, to define the cost functions associated to
each constraint. Both methods proposed herein are evaluated individually to
measure the impact of each of the components. The final solution (including
both the proposed pedestrian detection and the human-aware constraints) is
tested in a typical domestic indoor scenario, in four distinct experiments. The
results show that the robot is able to cope with human-aware constraints,
defined after common proxemics and social rules.
-
The biomechanics of the human body gives subjects a high degree of freedom in
how they can execute movement. Nevertheless, subjects exhibit regularity in
their movement patterns. One way to account for this regularity is to suppose
that subjects select movement trajectories that are optimal in some sense. We
adopt the principle that human movements are optimal and develop a general
model for human movement patters that uses variational methods in the form of
optimal control theory to calculate trajectories of movement trajectories of
the body. We find that in this approach a constant of the motion that arises
from the model and which plays a role in the optimal control model that is
analogous to the role that the mechanical energy plays in classical physics. We
illustrate how this approach works in practice by using it to develop a model
of walking gait, making all the derivations and calculations in detail. We
finally show that this optimal control model of walking gait recovers in an
appropriate limit an existing model of walking gait which has been shown to
provide good estimates of many observed characteristics of walking gait.