• ### PDD Graph: Bridging Electronic Medical Records and Biomedical Knowledge Graphs via Entity Linking(1707.05340)

July 24, 2017 cs.AI, cs.DB, cs.IR
Electronic medical records (EMRs) contain multi-format electronic medical data that hold an abundance of medical knowledge. When faced with a patient's symptoms, experienced caregivers make the right medical decisions based on professional knowledge that accurately captures the relationships between symptoms, diagnoses, and corresponding treatments. In this paper, we aim to capture these relationships by constructing a large, high-quality heterogeneous graph linking patients, diseases, and drugs (PDD) in EMRs. Specifically, we propose a novel framework to extract important medical entities from MIMIC-III (Medical Information Mart for Intensive Care III) and automatically link them with existing biomedical knowledge graphs, including the ICD-9 ontology and DrugBank. The PDD graph presented in this paper is accessible on the Web via a SPARQL endpoint and provides a pathway for medical discovery and applications, such as effective treatment recommendations.
• ### Exploiting Source-Object Network to Resolve Object Conflicts in Linked Data(1604.08407)

April 21, 2017 cs.DB
Considerable effort has been made to increase the scale of Linked Data. However, an inevitable problem when integrating data from multiple sources is that different sources often provide conflicting objects for a certain predicate of the same real-world entity, the so-called object conflicts problem. So far, the object conflicts problem has not received sufficient attention in the Linked Data community. In this paper, we first formalize the object conflict resolution problem as computing the joint distribution of variables on a heterogeneous information network called the Source-Object Network, which captures all the correlations among objects and Linked Data sources. Then, we introduce a novel approach based on network effects, called ObResolution (Object Resolution), to identify the true object among multiple conflicting objects. ObResolution adopts a pairwise Markov Random Field (pMRF) to model all the evidence under a unified framework. Extensive experimental results on six real-world datasets show that our method achieves higher accuracy than existing approaches and is robust and consistent across various domains. Keywords: Linked Data, Object Conflicts, Linked Data Quality, Truth Discovery.
• ### TruthDiscover: Resolving Object Conflicts on Massive Linked Data(1603.02056)

April 21, 2017 cs.DB
Considerable effort has been made to increase the scale of Linked Data. However, because of the openness of the Semantic Web and the ease of extracting Linked Data from semi-structured sources (e.g., Wikipedia) and unstructured sources, many Linked Data sources often provide conflicting objects for a certain predicate of a real-world entity. Existing methods cannot be trivially extended to resolve conflicts in Linked Data because Linked Data has a scale-free property. In this demonstration, we present a novel system called TruthDiscover to identify the truth in Linked Data with a scale-free property. First, TruthDiscover leverages the topological properties of the Source Belief Graph to estimate the prior beliefs of sources, which are used to smooth the trustworthiness of sources. Second, a Hidden Markov Random Field is used to model the interdependencies among objects so that the trust values of objects can be estimated accurately. TruthDiscover can also visualize the process of resolving conflicts in Linked Data. Experimental results on four datasets show that TruthDiscover exhibits satisfactory accuracy when confronted with data having a scale-free property.
• ### Truth Discovery to Resolve Object Conflicts in Linked Data(1509.00104)

April 21, 2017 cs.DB
In the Linked Data community, anyone can publish their data as Linked Data on the web because of the openness of the Semantic Web. As a result, RDF (Resource Description Framework) triples describing the same real-world entity can be obtained from multiple sources, which inevitably results in conflicting objects for a certain predicate of a real-world entity. The objective of this study is to identify the one true object among multiple conflicting objects for a certain predicate of a real-world entity. An intuitive principle based on common sense is that an object from a reliable source is trustworthy and, conversely, a source that provides trustworthy objects is reliable. Many truth discovery methods based on this principle have been proposed to estimate source reliability and identify the truth. However, the effectiveness of existing truth discovery methods is significantly affected by the number of objects provided by each source. Therefore, these methods cannot be trivially extended to resolve conflicts in Linked Data with a scale-free property, i.e., most of the sources provide few conflicting objects, whereas only a few sources provide many conflicting objects. To address this challenge, we propose a novel approach called TruthDiscover to identify the truth in Linked Data with a scale-free property. Two strategies are adopted in TruthDiscover to reduce the effect of the scale-free property on truth discovery. First, this approach leverages the topological properties of the Source Belief Graph to estimate the prior beliefs of sources, which are used to smooth the trustworthiness of sources. Second, this approach uses a Hidden Markov Random Field to model the interdependencies between objects so that the trust values of objects can be estimated accurately. Experiments are conducted on six datasets to evaluate TruthDiscover.
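The mutual-reinforcement principle mentioned in the abstract above (reliable sources provide trustworthy objects, and sources providing trustworthy objects are reliable) can be illustrated with a minimal sketch. This is a generic iterative baseline, not the TruthDiscover method: it uses no Source Belief Graph and no Hidden Markov Random Field. The function name `simple_truth_discovery`, the claim format, the initial trust of 0.5, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def simple_truth_discovery(claims, iterations=10):
    """Generic iterative truth discovery (illustrative baseline only).

    claims: list of (source, key, obj) tuples, where key identifies an
    (entity, predicate) pair and obj is the object value a source provides.
    Returns (truths, trust): the best-scoring object per key and the
    estimated trust of each source.
    """
    # Assumption: every source starts with the same trust value.
    trust = {source: 0.5 for source, _, _ in claims}
    scores = {}

    for _ in range(iterations):
        # Step 1: an object's score is the summed trust of the sources
        # that claim it.
        scores = defaultdict(lambda: defaultdict(float))
        for source, key, obj in claims:
            scores[key][obj] += trust[source]

        # Normalise scores per key so they behave like beliefs in [0, 1].
        belief = {}
        for key, objs in scores.items():
            total = sum(objs.values())
            for obj, value in objs.items():
                belief[(key, obj)] = value / total if total else 0.0

        # Step 2: a source's trust is the mean belief of the objects
        # it provides.
        per_source = defaultdict(list)
        for source, key, obj in claims:
            per_source[source].append(belief[(key, obj)])
        trust = {s: sum(v) / len(v) for s, v in per_source.items()}

    truths = {key: max(objs, key=objs.get) for key, objs in scores.items()}
    return truths, trust


# Toy example (made-up data): sources A and B agree, C disagrees.
toy_claims = [
    ("A", "e1.birthPlace", "x"),
    ("B", "e1.birthPlace", "x"),
    ("C", "e1.birthPlace", "y"),
    ("A", "e2.birthPlace", "u"),
    ("B", "e2.birthPlace", "u"),
    ("C", "e2.birthPlace", "v"),
]
toy_truths, toy_trust = simple_truth_discovery(toy_claims)
```

In a baseline like this, a source's trust estimate rests on however many objects it provides, so sources with only one or two claims get noisy estimates. That is the sensitivity to the number of objects per source that the abstract identifies as the obstacle on scale-free Linked Data.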
• ### The Plastic Scintillator Detector at DAMPE(1703.00098)

March 1, 2017 physics.ins-det, astro-ph.IM
The DArk Matter Particle Explorer (DAMPE) is a general-purpose satellite-borne high-energy $\gamma$-ray and cosmic-ray detector. Among the scientific objectives of DAMPE are the search for the origin of cosmic rays and an understanding of Dark Matter particles. As one of the four detectors in DAMPE, the Plastic Scintillator Detector (PSD) plays an important role in particle charge measurement and photon/electron separation. The PSD has 82 modules arranged in two layers, each consisting of a long organic plastic scintillator bar with a PMT at each end for readout, and it covers an overall active area larger than 82 cm $\times$ 82 cm. It can identify the charge states of relativistic ions from H to Fe, and the detection efficiency for Z=1 particles can reach 0.9999. The PSD was successfully launched with DAMPE on Dec. 17, 2015. In this paper, the design, assembly, and qualification tests of the PSD, and some of its performance as measured on the ground, are described in detail.