Latest articles in artificial intelligence [1022]
Views: 3668200
2019-10-22
AI: latest articles covering the artificial intelligence field (cs.AI)

[1] Multi-agent Hierarchical Reinforcement Learning with Dynamic Termination
Link: http://arxiv.org/abs/1910.09508v1
Note: PRICAI 2019
Authors: Dongge Han; Wendelin Boehmer; Michael Wooldridge; Alex Rogers
Abstract: In a multi-agent system, an agent's optimal policy will typically depend on the policies chosen by others. Therefore, a key issue in multi-agent systems research is predicting the behaviours of others and responding promptly to changes in those behaviours.
[2] A Logic-Based Framework Leveraging Neural Networks for Studying the Evolution of Neurological Disorders
Link: http://arxiv.org/abs/1910.09472v1
Note: Under consideration in Theory and Practice of Logic Programming (TPLP)
Authors: Francesco Calimeri; Francesco Cauteruccio; Luca Cinelli; Aldo Marzullo; Claudio Stamile; Giorgio Terracina; Francoise Durand-Dubief; Dominique Sappey-Marinier
Abstract: Deductive formalisms have been strongly developed in recent years; among them, Answer Set Programming (ASP) has gained momentum and has lately been fruitfully employed in many real-world scenarios.
[3] Recurrent neural network approach for cyclic job shop scheduling problem
Link: http://arxiv.org/abs/1910.09437v1
Note: Journal of Manufacturing Systems, Volume 32, Issue 4, October 2013, Pages 689-699
Authors: M-Tahar Kechadi; Kok Seng Low; G. Goncalves
Abstract: While cyclic scheduling is involved in numerous real-world applications, solving the derived problem is still of exponential complexity.
[4] Redistribution Mechanism Design on Networks
Link: http://arxiv.org/abs/1910.09335v1
Authors: Wen Zhang; Dengji Zhao; Hanyu Chen
Abstract: Redistribution mechanisms have been proposed for more efficient resource allocation but not for profit. We consider redistribution mechanism design for the first time in a setting where participants are connected and the resource owner is only aware of her neighbours.
[5] Solving dynamic multi-objective optimization problems via support vector machine
Link: http://arxiv.org/abs/1910.08747v1
Authors: Min Jiang; Weizhen Hu; Liming Qiu; Minghui Shi; Kay Chen Tan
Abstract: Dynamic Multi-objective Optimization Problems (DMOPs) are optimization problems whose objective functions change over time.
[6] Optimal Immunization Policy Using Dynamic Programming
Link: http://arxiv.org/abs/1910.08677v1
Authors: Atiye Alaeddini; Daniel Klein
Abstract: Decisions in public health are almost always made in the context of uncertainty. Policy makers responsible for making important decisions are faced with the daunting task of choosing from many possible options.
[7] Blameworthiness in Security Games
Link: http://arxiv.org/abs/1910.08647v1
Authors: Pavel Naumov; Jia Tao
Abstract: Security games are an example of a successful real-world application of game theory.
[8] Maximum Probability Principle and Black-Box Priors
Link: http://arxiv.org/abs/1910.09417v1
Authors: Amir Emad Marvasti; Ehsan Emad Marvasti; Hassan Foroosh
Abstract: We present an axiomatic way of assigning probabilities to black box models. In particular, we quantify an upper bound on the probability of a model or, in information-theoretic terms, a lower bound on the amount of information stored in a model.
[9] Making Bayesian Predictive Models Interpretable: A Decision Theoretic Approach
Link: http://arxiv.org/abs/1910.09358v1
Authors: Homayun Afrabandpey; Tomi Peltola; Juho Piironen; Aki Vehtari; Samuel Kaski
Abstract: A salient approach to interpretable machine learning is to restrict modeling to simple and hence understandable models. In the Bayesian framework, this can be pursued by restricting the model structure and prior to favor interpretable models.
[10] A Neural Entity Coreference Resolution Review
Link: http://arxiv.org/abs/1910.09329v1
Authors: Nikolaos Stylianou; Ioannis Vlahavas
Abstract: Entity Coreference Resolution is the task of resolving all the mentions in a document that refer to the same real-world entity, and it is considered one of the most difficult tasks in natural language understanding.
[11] On Semi-Supervised Multiple Representation Behavior Learning
Link: http://arxiv.org/abs/1910.09292v1
Note: 18 pages, 7 figures
Authors: Ruqian Lu; Shengluan Hou
Abstract: We propose a novel paradigm of semi-supervised learning (SSL): semi-supervised multiple representation behavior learning (SSMRBL).
[12] Dealing with Sparse Rewards in Reinforcement Learning
Link: http://arxiv.org/abs/1910.09281v1
Authors: Joshua Hare
Abstract: Successfully navigating a complex environment to obtain a desired outcome is a difficult task that, until recently, was believed to be achievable only by humans.
[13] Human-Like Decision Making: Document-level Aspect Sentiment Classification via Hierarchical Reinforcement Learning
Link: http://arxiv.org/abs/1910.09260v1
Authors: Jingjing Wang; Changlong Sun; Shoushan Li; Jiancheng Wang; Luo Si; Min Zhang; Xiaozhong Liu; Guodong Zhou
Abstract: Recently, neural networks have shown promising results on Document-level Aspect Sentiment Classification (DASC). However, these approaches often offer little transparency w.r.t. their inner working mechanisms and lack interpretability.
[14] Regularization Matters in Policy Optimization
Link: http://arxiv.org/abs/1910.09191v1
Note: Code link: https://github.com/xuanlinli17/po-rl-regularization
Authors: Zhuang Liu; Xuanlin Li; Bingyi Kang; Trevor Darrell
Abstract: Deep Reinforcement Learning (Deep RL) has been receiving increasingly more attention thanks to its encouraging performance on a variety of control tasks. Yet, conventional regularization techniques in training neural networks (e.g.
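The abstract above breaks off at its example of a conventional regularizer. As an illustration of the kind of technique the paper studies, here is a minimal sketch of adding an L2 weight penalty to a policy-optimization loss; the function and variable names are hypothetical, not taken from the paper's code:

```python
import numpy as np

def l2_regularized_policy_loss(policy_loss, weights, coef=1e-4):
    """Add an L2 weight penalty to a scalar policy-gradient loss.

    `policy_loss` is the unregularized scalar loss and `weights` is a
    list of parameter arrays; both are illustrative stand-ins for a
    real training setup.
    """
    l2_penalty = coef * sum(np.sum(w ** 2) for w in weights)
    return policy_loss + l2_penalty

# Toy check: penalty = 0.1 * (4 + 3) = 0.7, so total loss = 1.7
weights = [np.ones((2, 2)), np.ones(3)]
loss = l2_regularized_policy_loss(1.0, weights, coef=0.1)
```

In practice such a penalty is added to the optimizer's objective (or applied as weight decay) rather than computed by hand like this.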
[15] Two Case Studies of Experience Prototyping Machine Learning Systems in the Wild
Link: http://arxiv.org/abs/1910.09137v1
Note: This is an accepted position paper for the ACM CHI'19 Workshop "Emerging Perspectives in Human-Centered Machine Learning"
Authors: Qian Yang
Abstract: Throughout the course of my Ph.D., I have been designing the user experience (UX) of various machine learning (ML) systems. In this workshop, I share two projects as case studies in which people engage with ML in much more complicated and nuanced ways than the technical HCML work might assume.
[16] All-Action Policy Gradient Methods: A Numerical Integration Approach
Link: http://arxiv.org/abs/1910.09093v1
Note: 9 pages, 2 figures. NeurIPS 2019 Optimization Foundations of Reinforcement Learning Workshop
Authors: Benjamin Petit; Loren Amdahl-Culleton; Yao Liu; Jimmy Smith; Pierre-Luc Bacon
Abstract: While often stated as an instance of the likelihood ratio trick [Rubinstein, 1989], the original policy gradient theorem [Sutton, 1999] involves an integral over the action space.
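To make the distinction in this abstract concrete: for a discrete action space, the policy gradient can be computed exactly as a sum over all actions, whereas the likelihood-ratio form estimates it from sampled actions only. A small illustrative sketch (the softmax policy, action values, and sample count are made up for demonstration):

```python
import numpy as np

rng = np.random.default_rng(0)

theta = np.array([0.2, -0.1, 0.4])   # logits of a softmax policy over 3 actions
q = np.array([1.0, 2.0, 3.0])        # illustrative fixed action values Q(a)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

pi = softmax(theta)

def grad_log_pi(a):
    # For a softmax policy, grad_theta log pi(a) = onehot(a) - pi.
    g = -pi.copy()
    g[a] += 1.0
    return g

# "All-action" gradient: the exact sum over the whole action space,
# grad_theta E_pi[Q] = sum_a pi(a) * Q(a) * grad_theta log pi(a).
all_action_grad = sum(pi[a] * q[a] * grad_log_pi(a) for a in range(3))

# Likelihood-ratio estimator: unbiased but uses only sampled actions,
# so it carries Monte Carlo variance that the exact sum avoids.
samples = rng.choice(3, size=50_000, p=pi)
mc_grad = np.mean([q[a] * grad_log_pi(a) for a in samples], axis=0)
```

With enough samples the two agree; the paper's setting is the continuous case, where the sum becomes an integral and numerical integration replaces the exact enumeration shown here.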
[17] Amortized Rejection Sampling in Universal Probabilistic Programming
Link: http://arxiv.org/abs/1910.09056v1
Authors: Saeid Naderiparizi; Adam Ścibior; Andreas Munk; Mehrdad Ghadiri; Atılım Güneş Baydin; Bradley Gram-Hansen; Christian Schroeder de Witt; Robert Zinkov; Philip H. S. Torr; Tom Rainforth; Yee Whye Teh; Frank Wood
Abstract: Existing approaches to amortized inference in probabilistic programs with unbounded loops can produce estimators with infinite variance. An instance of this is importance sampling inference in programs that explicitly include rejection sampling as part of the user-programmed generative procedure.
[18] Computer-supported Analysis of Positive Properties, Ultrafilters and Modal Collapse in Variants of Gödel's Ontological Argument
Link: http://arxiv.org/abs/1910.08955v1
Note: 21 pages, 6 figures
Authors: Christoph Benzmüller; David Fuenmayor
Abstract: Three variants of Kurt Gödel's ontological argument, as proposed by Dana Scott, C. Anthony Anderson and Melvin Fitting, are encoded and rigorously assessed on the computer.
[19] Autonomous Industrial Management via Reinforcement Learning: Self-Learning Agents for Decision-Making -- A Review
Link: http://arxiv.org/abs/1910.08942v1
Authors: Leonardo A. Espinosa Leal; Magnus Westerlund; Anthony Chapman
Abstract: Industry has always pursued greater economic efficiency, and the current focus has been on reducing human labour using modern technologies.
[20] Policy Learning for Malaria Control
Link: http://arxiv.org/abs/1910.08926v1
Authors: Van Bach Nguyen; Belaid Mohamed Karim; Bao Long Vu; Jörg Schlötterer; Michael Granitzer
Abstract: Sequential decision making is a typical problem in reinforcement learning with plenty of algorithms to solve it. However, only a few of them can work effectively with a very small number of observations.
[21] RLScheduler: Learn to Schedule HPC Batch Jobs Using Deep Reinforcement Learning
Link: http://arxiv.org/abs/1910.08925v1
Note: 10 pages; conference in submission
Authors: Di Zhang; Dong Dai; Youbiao He; Forrest Sheng Bao
Abstract: We present RLScheduler, a deep reinforcement learning based job scheduler for scheduling independent batch jobs in high-performance computing (HPC) environments.
[22] Neuro-SERKET: Development of Integrative Cognitive System through the Composition of Deep Probabilistic Generative Models
Link: http://arxiv.org/abs/1910.08918v1
Note: Submitted to New Generation Computing
Authors: Tadahiro Taniguchi; Tomoaki Nakamura; Masahiro Suzuki; Ryo Kuniyasu; Kaede Hayashi; Akira Taniguchi; Takato Horii; Takayuki Nagai
Abstract: This paper describes a framework for the development of an integrative cognitive system based on probabilistic generative models (PGMs) called Neuro-SERKET.
[23] Reverse Experience Replay
Link: http://arxiv.org/abs/1910.08780v1
Note: 7 pages, 6 figures
Authors: Egor Rotinov
Abstract: This paper describes an improvement to Deep Q-learning called Reverse Experience Replay (RER) that addresses the problem of sparse rewards and helps with reward-maximizing tasks by sampling transitions successively in reverse order.
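The core idea named in this abstract (replaying an episode's transitions from last to first, so that a sparse terminal reward propagates backward through value updates) can be sketched as a minimal buffer. The class name and interface are illustrative assumptions, not the paper's actual code:

```python
import random
from collections import deque

class ReverseExperienceReplay:
    """Minimal sketch of a replay buffer that returns a stored episode's
    transitions in reverse order, as the abstract describes.
    """

    def __init__(self, capacity=10_000):
        # Oldest episodes are evicted once capacity is reached.
        self.episodes = deque(maxlen=capacity)

    def add_episode(self, transitions):
        """Store one full episode as an ordered list of transitions."""
        self.episodes.append(list(transitions))

    def sample_reversed(self):
        """Pick a stored episode and return its transitions from the
        final (reward-bearing) step back to the first, so Q-value
        updates propagate backward from the reward."""
        episode = random.choice(self.episodes)
        return list(reversed(episode))

buf = ReverseExperienceReplay()
buf.add_episode([("s0", "a0", 0.0), ("s1", "a1", 0.0), ("s2", "a2", 1.0)])
replay = buf.sample_reversed()  # terminal transition comes first
```

A learner would then apply its Q-update to each transition of `replay` in order, updating the terminal step before its predecessors.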
[24] Explainable AI: Deep Reinforcement Learning Agents for Residential Demand Side Cost Savings in Smart Grids
Link: http://arxiv.org/abs/1910.08719v1
Authors: Hareesh Kumar
Abstract: Motivated by the recent advancements in deep Reinforcement Learning (RL), we develop an RL agent to manage the operation of storage devices in a household, designed to maximize demand-side cost savings.
[25] OffWorld Gym: open-access physical robotics environment for real-world reinforcement learning benchmark and research
Link: http://arxiv.org/abs/1910.08639v1
Authors: Ashish Kumar; Toby Buckley; Qiaozhi Wang; Alicia Kavelaars; Ilya Kuzovkin
Abstract: Success stories of applied machine learning can be traced back to the datasets and environments that were put forward as challenges for the community. The challenge that the community sets as a benchmark is usually the challenge that the community eventually solves.