-
Scholars have made handwritten notes and comments in books and manuscripts
for centuries. Today's blogs and news sites typically invite users to express
their opinions on the published content; URLs allow web resources to be shared
with accompanying annotations and comments using third-party services like
Twitter or Facebook. These contributions have until recently been constrained
within specific services, making them second-class citizens of the Web.
Web Annotations are now emerging as fully independent Linked Data in their
own right, no longer restricted to plain textual comments in application silos.
Annotations can now range from bookmarks and comments to fine-grained
annotations of, for example, a section of a frame within a video stream.
Technologies and standards now exist to create, publish, syndicate, mash up,
and consume finely targeted, semantically rich digital annotations on
practically any content, as first-class Web citizens. This development is being
driven by the need for collaboration and annotation reuse amongst domain
researchers, computer scientists, scientific publishers, and scholarly content
databases.
-
Exact Maximum Inner Product Search (MIPS) is an important task that is widely
pertinent to recommender systems and high-dimensional similarity search. The
brute-force approach to solving exact MIPS is computationally expensive, thus
spurring recent development of novel indexes and pruning techniques for this
task. In this paper, we show that a hardware-efficient brute-force approach,
blocked matrix multiply (BMM), can outperform the state-of-the-art MIPS solvers
by over an order of magnitude, for some -- but not all -- inputs.
We also present a novel MIPS solution, MAXIMUS, that takes
advantage of hardware efficiency and pruning of the search space. Like BMM,
MAXIMUS is faster than other solvers by up to an order of magnitude, but again
only for some inputs. Since no single solution offers the best runtime
performance for all inputs, we introduce a new data-dependent optimizer,
OPTIMUS, that selects online with minimal overhead the best MIPS solver for a
given input. Together, OPTIMUS and MAXIMUS outperform state-of-the-art MIPS
solvers by 3.2× on average, and up to 10.9×, on widely studied
MIPS datasets.
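
As a point of reference for the brute-force baseline discussed above, the following minimal sketch scans the item vectors block by block and keeps the top-k inner products per user. It is a pure-Python illustration only: the function names, the `block_size` parameter, and the tie-breaking rule are assumptions, and it does not reproduce the hardware-efficient BMM, MAXIMUS, or OPTIMUS.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def brute_force_mips(
    users: Sequence[Sequence[float]],
    items: Sequence[Sequence[float]],
    k: int,
    block_size: int = 2,
) -> List[List[Tuple[float, int]]]:
    """For each user, return the k (score, item_index) pairs with the largest
    inner product, scanning the items one block at a time (exhaustive search)."""
    results: List[List[Tuple[float, int]]] = [[] for _ in users]
    for start in range(0, len(items), block_size):
        block = items[start:start + block_size]
        for u_idx, u in enumerate(users):
            # Score every item in the current block against this user.
            scored = [(dot(u, v), start + j) for j, v in enumerate(block)]
            # Merge with the running top-k; ties are broken by item index.
            merged = results[u_idx] + scored
            merged.sort(key=lambda pair: (-pair[0], pair[1]))
            results[u_idx] = merged[:k]
    return results
```

Every user-item pair is scored, so the result is exact; the blocking only bounds how many scores are held at once.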
-
Context-aware recommender systems extend traditional recommenders by adapting
their suggestions to users' contextual situations. CARSKit is a Java-based
open-source library specifically designed for context-aware recommendation,
in which state-of-the-art context-aware recommendation algorithms have been
implemented. This report provides a basic user's guide to CARSKit, including
how to prepare the data set, how to configure the experimental settings,
how to evaluate the algorithms, and how to interpret the outputs. The
instructions in this guide apply to CARSKit v0.3.5 and above.
-
We sketch the history of spectral ranking, a general umbrella name for
techniques that apply the theory of linear maps (in particular, eigenvalues and
eigenvectors) to matrices that do not represent geometric transformations, but
rather some kind of relationship between entities. Although recently made famous
by the ample press coverage of Google's PageRank algorithm, spectral ranking
was devised more than a century ago, and has been studied in tournament
ranking, psychology, social sciences, bibliometrics, economics and choice theory.
We describe the contributions of previous scholars in precise and modern
mathematical terms: along the way, we show how to express in a general way
damped rankings, such as Katz's index, as dominant eigenvectors of perturbed
matrices, and then use results on the Drazin inverse to go back to the dominant
eigenvectors by a limit process. The result suggests a regularized definition
of spectral ranking that yields for a general matrix a unique vector depending
on a boundary condition.
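
To make the notion of a damped ranking concrete, here is a minimal sketch of Katz's index in one common formulation: the series of damped path counts, sum over i of alpha^i * 1 * A^i, truncated after a fixed number of terms. The function name, the convention that `adj[i][j] = 1` when i endorses j, and the iteration count are assumptions made for illustration; the series converges only when alpha is smaller than the reciprocal of the dominant eigenvalue of the matrix.

```python
from typing import List, Sequence


def katz_index(adj: Sequence[Sequence[float]], alpha: float,
               iterations: int = 100) -> List[float]:
    """Approximate Katz's index by iterating s <- 1 + alpha * (s A).

    Starting from s = 1, after t iterations s equals the partial sum
    sum_{i=0..t} alpha^i * (1 A^i), where 1 is the all-ones row vector.
    """
    n = len(adj)
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            # Entry j of the row vector (scores * A), damped by alpha.
            1.0 + alpha * sum(scores[i] * adj[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores
```

For example, with nodes 0 and 1 both endorsing node 2 and node 2 endorsing node 0, node 2 receives the highest score and node 1, which nobody endorses, keeps the baseline score of 1.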
-
Near neighbor search is a powerful abstraction for data access; it allows
searching for objects similar to a query. Search indexes are data structures
designed to accelerate computing-intensive data processing, like those
routinely found in clustering and classification tasks. However, for
intrinsically high-dimensional data, competitive indexes tend to have either
impractical index construction times or memory usage. A recent turnaround in
the literature came with the approximate proximity graph (APG): a connected
graph searched by a greedy algorithm with restarts, needing sublinear time to
solve queries. The APG computes an approximation of the result set using a
small memory footprint, i.e., proportional to the underlying graph's degree.
The degree, together with the number of search restarts, determines the speed
and accuracy of the algorithm.
This manuscript introduces three new algorithms based on local-search
metaheuristics for the search graph. Two of these algorithms are direct
improvements of the original one that also reduce its number of free
parameters; the third is an entirely new method that improves both
the search speed and the accuracy of the result in most of our benchmarks. We
provide a broad experimental study to characterize our search structures
and support our claims, and report an extensive performance comparison with
the current alternatives.
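
The baseline idea of greedy search with restarts over a proximity graph can be sketched as follows. This is a minimal illustration under stated assumptions, not any of the three algorithms introduced in the manuscript: the graph is given as an adjacency mapping, the function names are hypothetical, and restarts begin from uniformly random nodes.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple


def greedy_search(graph: Dict[int, List[int]], points: Sequence,
                  query, dist: Callable, start: int) -> Tuple[int, float]:
    """Walk from `start` toward `query` until no neighbor is closer."""
    current, current_d = start, dist(points[start], query)
    while True:
        best, best_d = current, current_d
        for nb in graph[current]:
            d = dist(points[nb], query)
            if d < best_d:
                best, best_d = nb, d
        if best == current:  # local minimum: no neighbor improves
            return current, current_d
        current, current_d = best, best_d


def search_with_restarts(graph: Dict[int, List[int]], points: Sequence,
                         query, dist: Callable, restarts: int,
                         rng: random.Random) -> Tuple[int, float]:
    """Run greedy search from several random starts and keep the best hit."""
    best, best_d = -1, float("inf")
    for _ in range(restarts):
        start = rng.randrange(len(points))
        node, d = greedy_search(graph, points, query, dist, start)
        if d < best_d:
            best, best_d = node, d
    return best, best_d
```

Each greedy walk stops at a local minimum, so the answer is approximate in general; more restarts trade speed for accuracy, which mirrors the role of the two parameters described above.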
-
This paper proposes a text summarization approach for factual reports using a
deep learning model. This approach consists of three phases: feature
extraction, feature enhancement, and summary generation, which work together to
assimilate core information and generate a coherent, understandable summary. We
explore various features to improve the set of sentences selected for the
summary, and use a Restricted Boltzmann Machine to enhance and abstract
those features, improving accuracy without losing any important
information. The sentences are scored based on those enhanced features and an
extractive summary is constructed. Experiments carried out on several
articles demonstrate the effectiveness of the proposed approach. Source code
is available at: https://github.com/vagisha-nidhi/TextSummarizer
-
A recommender system is an information filtering technology which can be used
to predict preference ratings of items (products, services, movies, etc.) and/or
to output a ranking of items that are likely to be of interest to the user.
Context-aware recommender systems (CARS) learn and predict the tastes and
preferences of users by incorporating available contextual information in the
recommendation process. One of the major challenges in context-aware
recommender systems research is the lack of automatic methods to obtain
contextual information for these systems. Considering this scenario, in this
paper, we propose to use contextual information from topic hierarchies of the
items (web pages) to improve the performance of context-aware recommender
systems. The topic hierarchies are constructed by an extension of the
LUPI-based Incremental Hierarchical Clustering method that considers three
types of information: traditional bag-of-words (technical information), and the
combination of named entities (privileged information I) with domain terms
(privileged information II). We evaluated the contextual information in four
context-aware recommender systems. Different weights were assigned to each type
of information. The empirical results demonstrated that topic hierarchies with
the combination of the two kinds of privileged information can provide better
recommendations.
-
Modern popular TV series often develop complex storylines spanning several
seasons, but are usually watched in quite a discontinuous way. As a result, the
viewer generally needs a comprehensive summary of the previous season's plot
before the new one starts. Generating such summaries first requires
identifying and characterizing the dynamics of the series' subplots. One way of doing
so is to study the underlying social network of interactions between the
characters involved in the narrative. The standard tools used in the Social
Network Analysis field to extract such a network rely on an integration of
time, either over the whole considered period, or as a sequence of several
time-slices. However, they turn out to be inappropriate in the case of TV
series, because the scenes shown onscreen alternately focus on
parallel storylines, and do not necessarily respect a traditional chronology.
This makes existing extraction methods ill-suited to describing the dynamics of
relationships between characters, or to obtaining a relevant instantaneous view of
the current social state in the plot. This is especially true for characters
shown as interacting with each other at some previous point in the plot but
temporarily neglected by the narrative. In this article, we introduce narrative
smoothing, a novel, still exploratory, network extraction method. It smooths
the relationship dynamics based on the plot properties, aiming at solving some
of the limitations present in the standard approaches. In order to assess our
method, we apply it to a new corpus of 3 popular TV series, and compare it to
both standard approaches. Our results are promising, showing that narrative
smoothing leads to more relevant observations when it comes to the
characterization of the protagonists and their relationships. It could be used
as a basis for further modeling the intertwined storylines constituting TV
series plots.
-
The recent tremendous success of unsupervised word embeddings in a multitude
of applications raises the obvious question of whether similar methods could be
derived to improve embeddings (i.e., semantic representations) of word sequences as
well. We present a simple but efficient unsupervised objective to train
distributed representations of sentences. Our method outperforms the
state-of-the-art unsupervised models on most benchmark tasks, highlighting the
robustness of the produced general-purpose sentence embeddings.
-
With the ever-increasing number of patent applications filed every year, the
need for effective and efficient systems for managing such tremendous amounts
of data becomes increasingly important. Patent Retrieval (PR) is considered the
pillar of almost all patent analysis tasks. PR is a subfield of Information
Retrieval (IR) which is concerned with developing techniques and methods that
effectively and efficiently retrieve relevant patent documents in response to a
given search request. In this paper we present a comprehensive review of PR
methods and approaches. It is clear that recent successes and maturity in IR
applications such as Web search cannot be transferred directly to PR without
deliberate domain adaptation and customization. Furthermore, state-of-the-art
performance in automatic PR is still around average in terms of recall. These
observations motivate the need for interactive search tools which provide
cognitive assistance to patent professionals with minimal effort. These tools
must also be developed hand in hand with patent professionals, considering their
practices and expectations. We additionally touch on tasks related to PR, such
as patent valuation, litigation, and licensing, and highlight potential
opportunities and open directions for computational scientists in these
domains.
-
In Web search, entity-seeking queries often trigger a special Question
Answering (QA) system. It may use a parser to translate the question into a
structured query, execute that on a knowledge graph (KG), and return direct
entity responses. QA systems based on precise parsing tend to be brittle: minor
syntax variations may dramatically change the response. Moreover, KG coverage
is patchy. At the other extreme, a large corpus may provide broader coverage,
but in an unstructured, unreliable form. We present AQQUCN, a QA system that
gracefully combines KG and corpus evidence. AQQUCN accepts a broad spectrum of
query syntax, from well-formed questions to short 'telegraphic' keyword
sequences. In the face of inherent query ambiguities, AQQUCN aggregates signals
from KGs and large corpora to directly rank KG entities, rather than commit to
one semantic interpretation of the query. AQQUCN models the ideal
interpretation as an unobservable or latent variable. Interpretations and
candidate entity responses are scored as pairs, by combining signals from
multiple convolutional networks that operate collectively on the query, KG and
corpus. On four public query workloads, amounting to over 8,000 queries with
diverse query syntax, we see 5-16% absolute improvement in mean average
precision (MAP), compared to the entity ranking performance of recent systems.
Our system is also competitive at entity set retrieval, almost doubling F1
scores for challenging short queries.
-
The item recommendation task predicts a personalized ranking over a set of items
for each individual user. One paradigm is rating-based methods, which
concentrate on explicit feedback and hence face difficulties in collecting
it. Meanwhile, ranking-based methods are presented with rated items and
then rank the rated above the unrated. This paradigm takes advantage of widely
available implicit feedback. It usually ignores, however, an important kind of
information: item reviews. Item reviews not only justify the preferences of
users, but also help alleviate the cold-start problem on which collaborative
filtering fails. In this paper, we propose two novel and simple models
to integrate item reviews into Bayesian personalized ranking. In each model, we
make use of text features extracted from item reviews using word embeddings. On
top of the text features, we uncover the review dimensions that explain the
variation in users' feedback; these review factors represent a prior
preference of users. Experiments on six real-world data sets show the benefits
of leveraging item reviews on ranking prediction. We also conduct analyses to
understand the proposed models.
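
For readers unfamiliar with Bayesian personalized ranking, the pairwise criterion it optimizes can be sketched as follows: for a user u, an item i with feedback, and an item j without, the per-triple loss is -ln sigma(x_ui - x_uj). The sketch below uses dot-product scores over hypothetical latent vectors and omits regularization; it shows the generic BPR criterion only and does not include the review-based text features of the two proposed models.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def score(user_vec: Sequence[float], item_vec: Sequence[float]) -> float:
    """Predicted preference as a dot product of latent factors (an assumption)."""
    return sum(u * v for u, v in zip(user_vec, item_vec))


def bpr_loss(user_vec: Sequence[float],
             pos_item_vec: Sequence[float],
             neg_item_vec: Sequence[float]) -> float:
    """Negative log-likelihood that the observed item outranks the unobserved one."""
    x_uij = score(user_vec, pos_item_vec) - score(user_vec, neg_item_vec)
    return -math.log(sigmoid(x_uij))
```

The loss shrinks as the observed item is scored further above the unobserved one, which is what pushes rated items above unrated items in the final ranking.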
-
Video recommendation has become an essential way of helping people explore
massive video collections and discover the videos that may be of interest to
them. In existing video recommender systems, the models make recommendations
based on user-video interactions and single specific content features. When
the specific content features are unavailable, the performance of the existing
models deteriorates seriously. Inspired by the fact that rich contents
(e.g., text, audio, motion, and so on) exist in videos, in this paper, we
explore how to use these rich contents to overcome the limitations caused by
the unavailability of the specific ones. Specifically, we propose a novel
general framework, named the collaborative embedding regression (CER) model,
that incorporates an arbitrary single content feature with user-video
interactions to make effective video recommendations in both in-matrix and
out-of-matrix scenarios. Our extensive experiments on two real-world
large-scale datasets show that CER beats the existing recommender models with
any single content feature and is more time efficient. In addition, we propose
a priority-based late fusion (PRI) method to gain the benefit brought by
integrating multiple content features. The corresponding experiment shows
that PRI brings real performance improvement to the baseline and outperforms
the existing fusion methods.
-
Sequential recommendation is a fundamental task for network applications, and
it usually suffers from the item cold start problem due to insufficient
user feedback. There are currently three kinds of popular approaches, which are
respectively based on matrix factorization (MF) of collaborative filtering,
Markov chains (MC), and recurrent neural networks (RNN). Although widely used,
they have some limitations. MF-based methods cannot capture users' dynamic
interests. The strong Markov assumption greatly limits the performance of
MC-based methods. RNN-based methods are still in the early stage of incorporating
additional information. Based on these basic models, many methods with
additional information only validate incorporating one modality in a separate
way. In this work, to make sequential recommendations and deal with the item
cold start problem, we propose a Multi-View Recurrent Neural Network (MV-RNN)
model. Given the latent feature, MV-RNN can alleviate the item cold start
problem by incorporating visual and textual information. First, at the input of
MV-RNN, three different combinations of multi-view features are studied:
concatenation, fusion by addition, and fusion by reconstructing the original
multi-modal data. MV-RNN applies the recurrent structure to dynamically capture
the user's interest. Second, we design a separate structure and a united
structure on the hidden state of MV-RNN to explore a more effective way to
handle multi-view features. Experiments on two real-world datasets show that
MV-RNN can effectively generate the personalized ranking list, tackle the
missing modalities problem and significantly alleviate the item cold start
problem.
-
Rule-based techniques to extract relational entities from documents allow
users to specify desired entities with natural language questions, finite state
automata, regular expressions and structured query language. They require
linguistic and programming expertise and lack support for Arabic morphological
analysis. We present a morphology-based entity and relational entity extraction
framework for Arabic (MERF). MERF requires basic knowledge of linguistic
features and regular expressions, and provides the ability to interactively
specify Arabic morphological and synonymity features, tag types associated with
regular expressions, and relations and code actions defined over matches of
subexpressions. MERF constructs entities and relational entities from matches
of the specifications. We evaluated MERF with several case studies. The results
show that MERF requires less development time and effort than
existing application-specific techniques and produces reasonably accurate
results with a reasonable run-time overhead.
-
We introduce a large-scale MAchine Reading COmprehension dataset, which we
name MS MARCO. The dataset comprises 1,010,916 anonymized questions, sampled
from Bing's search query logs, each with a human-generated answer, and 182,669
completely human-rewritten generated answers. In addition, the dataset contains
8,841,823 passages, extracted from 3,563,535 web documents retrieved by Bing,
that provide the information necessary for curating the natural language
answers. A question in the MS MARCO dataset may have multiple answers or no
answers at all. Using this dataset, we propose three different tasks with
varying levels of difficulty: (i) predict if a question is answerable given a
set of context passages, and extract and synthesize the answer as a human
would; (ii) generate a well-formed answer (if possible) based on the context
passages that can be understood with the question and passage context; and
finally (iii) rank a set of retrieved passages given a question. The size of
the dataset and the fact that the questions are derived from real user search
queries distinguish MS MARCO from other well-known publicly available datasets
for machine reading comprehension and question-answering. We believe that the
scale and the real-world nature of this dataset make it attractive for
benchmarking machine reading comprehension and question-answering models.
-
Recently, multi-view representation learning has become a rapidly growing
direction in machine learning and data mining areas. This paper introduces two
categories for multi-view representation learning: multi-view representation
alignment and multi-view representation fusion. Accordingly, we first review
the representative methods and theories of multi-view representation learning
based on the perspective of alignment, such as correlation-based alignment.
Representative examples are canonical correlation analysis (CCA) and its
several extensions. Then from the perspective of representation fusion we
investigate the advancement of multi-view representation learning that ranges
from generative methods including multi-modal topic learning, multi-view sparse
coding, and multi-view latent space Markov networks, to neural network-based
methods including multi-modal autoencoders, multi-view convolutional neural
networks, and multi-modal recurrent neural networks. Further, we also
investigate several important applications of multi-view representation
learning. Overall, this survey aims to provide an insightful overview of
theoretical foundation and state-of-the-art developments in the field of
multi-view representation learning and to help researchers find the most
appropriate tools for particular applications.
-
In recent decades, diversity and accuracy have been regarded as two important
measures in evaluating a recommendation model. However, a clear concern is that
a model focusing excessively on one measure will put the other one at risk,
so it is not easy to greatly improve diversity and accuracy simultaneously.
In this paper, we propose to enhance the Resource-Allocation (RA) similarity in
resource transfer equations of diffusion-like models, by giving a tunable
exponent to the RA similarity, and traversing the value of the exponent to
achieve the optimal recommendation results. In this way, we can increase the
recommendation scores (allocated resource) of many unpopular objects.
Experiments on three benchmark data sets, MovieLens, Netflix, and RateYourMusic,
show that the modified models can yield remarkable performance improvement
compared with the original ones.
-
Music summarization allows for higher efficiency in processing, storage, and
sharing of datasets. Machine-oriented approaches, being agnostic to human
consumption, optimize these aspects even further. Such summaries have already
been successfully validated in some MIR tasks. We now generalize previous
conclusions by evaluating the impact of generic summarization of music from a
probabilistic perspective. We estimate Gaussian distributions for original and
summarized songs and compute their relative entropy, in order to measure
information loss incurred by summarization. Our results suggest that relative
entropy is a good predictor of summarization performance in the context of
tasks relying on a bag-of-features model. Based on this observation, we further
propose a straightforward yet expressive summarizer, which minimizes relative
entropy with respect to the original song, that objectively outperforms
previous methods and is better suited to avoid potential copyright issues.
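
As an illustration of the measurement described above, the relative entropy (Kullback-Leibler divergence) between two Gaussians has a closed form. The sketch below assumes diagonal covariances for simplicity, which may differ from the covariance structure used in the paper; the function names, the frame-list input format, and the direction of the divergence (first argument relative to second) are likewise assumptions.

```python
import math
from typing import List, Sequence, Tuple


def fit_diag_gaussian(frames: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Estimate per-dimension mean and (maximum-likelihood) variance of feature frames."""
    n, dims = len(frames), len(frames[0])
    mu = [sum(f[d] for f in frames) / n for d in range(dims)]
    var = [sum((f[d] - mu[d]) ** 2 for f in frames) / n for d in range(dims)]
    return mu, var


def kl_diag_gaussians(mu_p: Sequence[float], var_p: Sequence[float],
                      mu_q: Sequence[float], var_q: Sequence[float]) -> float:
    """KL(P || Q) for Gaussians with diagonal covariance.

    Per dimension: 0.5 * (var_p/var_q + (mu_q - mu_p)^2/var_q - 1 + ln(var_q/var_p)).
    All variances must be strictly positive.
    """
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp)
    return 0.5 * total
```

A divergence of zero means the summary's feature distribution matches the original's exactly; larger values indicate more information lost under this Gaussian model.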
-
When convolutional neural networks are used to tackle learning problems based
on music or, more generally, time series data, raw one-dimensional data are
commonly pre-processed to obtain spectrogram or mel-spectrogram coefficients,
which are then used as input to the actual neural network. In this
contribution, we investigate, both theoretically and experimentally, the
influence of this pre-processing step on the network's performance and ask
whether replacing it with adaptive or learned filters applied directly
to the raw data can improve learning success. The theoretical results show
that approximately reproducing mel-spectrogram coefficients by applying
adaptive filters and subsequent time-averaging is in principle possible. We
also conducted extensive experimental work on the task of singing voice
detection in music. The results of these experiments show that for
classification based on Convolutional Neural Networks the features obtained
from adaptive filter banks followed by time-averaging perform better than the
canonical Fourier-transform-based mel-spectrogram coefficients. Alternative
adaptive approaches with center frequencies or time-averaging lengths learned
from training data perform equally well.
-
With the ever-growing volume of online information, recommender systems have
become an effective strategy for overcoming information overload. The utility
of recommender systems cannot be overstated, given their widespread adoption in
many web applications, along with their potential to ameliorate many
problems related to over-choice. In recent years, deep learning has garnered
considerable interest in many research fields such as computer vision and
natural language processing, owing not only to stellar performance but also the
attractive property of learning feature representations from scratch. The
influence of deep learning is also pervasive, recently demonstrating its
effectiveness when applied to information retrieval and recommender systems
research. Evidently, the field of deep learning in recommender systems is
flourishing. This article aims to provide a comprehensive review of recent
research efforts on deep learning based recommender systems. More concretely,
we devise a taxonomy of deep learning based recommendation models
and provide a comprehensive summary of the state of the art. Finally,
we expand on current trends and provide new perspectives pertaining to this
exciting new development of the field.
-
While generative models such as Latent Dirichlet Allocation (LDA) have proven
fruitful in topic modeling, they often require detailed assumptions and careful
specification of hyperparameters. Such model complexity issues only compound
when trying to generalize generative models to incorporate human input. We
introduce Correlation Explanation (CorEx), an alternative approach to topic
modeling that does not assume an underlying generative model, and instead
learns maximally informative topics through an information-theoretic framework.
This framework naturally generalizes to hierarchical and semi-supervised
extensions with no additional modeling assumptions. In particular, word-level
domain knowledge can be flexibly incorporated within CorEx through anchor
words, allowing topic separability and representation to be promoted with
minimal human intervention. Across a variety of datasets, metrics, and
experiments, we demonstrate that CorEx produces topics that are comparable in
quality to those produced by unsupervised and semi-supervised variants of LDA.
-
We propose a hybrid model of differential privacy that considers a
combination of regular and opt-in users who desire the differential privacy
guarantees of the local privacy model and the trusted curator model,
respectively. We demonstrate that within this model, it is possible to design a
new type of blended algorithm for the task of privately computing the head of a
search log. This blended approach provides significant improvements in the
utility of obtained data compared to related work while providing users with
their desired privacy guarantees. Specifically, on two large search click data
sets, comprising 1.75 and 16 GB respectively, our approach attains NDCG values
exceeding 95% across a range of privacy budget values.
-
Text preprocessing is often the first step in the pipeline of a Natural
Language Processing (NLP) system, with potential impact on its final
performance. Despite its importance, text preprocessing has not received much
attention in the deep learning literature. In this paper we investigate the
impact of simple text preprocessing decisions (particularly tokenizing,
lemmatizing, lowercasing and multiword grouping) on the performance of a
standard neural text classifier. We perform an extensive evaluation on standard
benchmarks from text categorization and sentiment analysis. While our
experiments show that a simple tokenization of input text is generally
adequate, they also highlight significant degrees of variability across
preprocessing techniques. This reveals the importance of paying attention to
this usually-overlooked step in the pipeline, particularly when comparing
different models. Finally, our evaluation provides insights into the best
preprocessing practices for training word embeddings.
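
Three of the preprocessing decisions named above (tokenizing, lowercasing, and multiword grouping) can be sketched with the standard library alone. The regular expression, the function names, and the underscore-joined phrase format are assumptions for illustration and are not the tools used in the paper; lemmatizing is omitted because it needs a linguistic resource.

```python
import re
from typing import Iterable, List, Set, Tuple


def tokenize(text: str) -> List[str]:
    """Split text into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def lowercase(tokens: Iterable[str]) -> List[str]:
    """Lowercase every token."""
    return [t.lower() for t in tokens]


def group_multiwords(tokens: List[str], phrases: Set[Tuple[str, ...]]) -> List[str]:
    """Merge known multiword expressions into single underscore-joined tokens,
    preferring the longest match at each position."""
    max_len = max((len(p) for p in phrases), default=1)
    out: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, down to length 2.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            if tuple(tokens[i:i + n]) in phrases:
                out.append("_".join(tokens[i:i + n]))
                i += n
                break
        else:
            out.append(tokens[i])
            i += 1
    return out
```

Chaining the functions in different combinations yields the alternative inputs whose effect on a classifier can then be compared.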
-
We propose the Neural Vector Space Model (NVSM), a method that learns
representations of documents in an unsupervised manner for news article
retrieval. In the NVSM paradigm, we learn low-dimensional representations of
words and documents from scratch using gradient descent and rank documents
according to their similarity with query representations that are composed from
word representations. We show that NVSM performs better at document ranking
than existing latent semantic vector space methods. The addition of NVSM to a
mixture of lexical language models and a state-of-the-art baseline vector space
model yields a statistically significant increase in retrieval effectiveness.
Consequently, NVSM adds a complementary relevance signal. Next to semantic
matching, we find that NVSM performs well in cases where lexical matching is
needed.
NVSM learns a notion of term specificity directly from the document
collection without feature engineering. We also show that NVSM learns
regularities related to Luhn significance. Finally, we give advice on how to
deploy NVSM in situations where model selection (e.g., cross-validation) is
infeasible. We find that an unsupervised ensemble of multiple models trained
with different hyperparameter values performs better than a single
cross-validated model. Therefore, NVSM can safely be used for ranking documents
without supervised relevance judgments.