articles+ search results
242 articles+ results
1  20
Next

Bonet, Blai and Geffner, Hector
 Subjects

Computer Science  Artificial Intelligence
 Abstract

It has been observed that many classical planning domains with atomic goals can be solved by means of a simple polynomial exploration procedure, called IW, that runs in time exponential in the problem width, which in these cases is bounded and small. Yet, while the notion of width has become part of stateoftheart planning algorithms such as BFWS, there is no good explanation for why so many benchmark domains have bounded width when atomic goals are considered. In this work, we address this question by relating bounded width with the existence of general optimal policies that in each planning instance are represented by tuples of atoms of bounded size. We also define the notions of (explicit) serializations and serialized width that have a broader scope as many domains have a bounded serialized width but no bounded width. Such problems are solved nonoptimally in polynomial time by a suitable variant of the Serialized IW algorithm. Finally, the language of general policies and the semantics of serializations are combined to yield a simple, meaningful, and expressive language for specifying serializations in compact form in the form of sketches, which can be used for encoding domain control knowledge by hand or for learning it from small examples. Sketches express general problem decompositions in terms of subgoals, and sketches of bounded width express problem decompositions that can be solved in polynomial time.
 Full text View this record from Arxiv

Bonet, Blai and Geffner, Hector
 Subjects

Computer Science  Artificial Intelligence
 Abstract

Consider the finite state graph that results from a simple, discrete, dynamical system in which an agent moves in a rectangular grid picking up and dropping packages. Can the state variables of the problem, namely, the agent location and the package locations, be recovered from the structure of the state graph alone without having access to information about the objects, the structure of the states, or any background knowledge? We show that this is possible provided that the dynamics is learned over a suitable domainindependent firstorder causal language that makes room for objects and relations that are not assumed to be known. The preference for the most compact representation in the language that is compatible with the data provides a strong and meaningful learning bias that makes this possible. The language of structured causal models (SCMs) is the standard language for representing (static) causal models but in dynamic worlds populated by objects, firstorder causal languages such as those used in "classical AI planning" are required. While "classical AI" requires handcrafted representations, similar representations can be learned from unstructured data over the same languages. Indeed, it is the languages and the preference for compact representations in those languages that provide structure to the world, uncovering objects, relations, and causes.
 Full text View this record from Arxiv

Ståhlberg, Simon, Bonet, Blai, and Geffner, Hector
 Subjects

Computer Science  Artificial Intelligence and Computer Science  Machine Learning
 Abstract

We consider the problem of learning generalized policies for classical planning domains using graph neural networks from small instances represented in lifted STRIPS. The problem has been considered before but the proposed neural architectures are complex and the results are often mixed. In this work, we use a simple and general GNN architecture and aim at obtaining crisp experimental results and a deeper understanding: either the policy greedy in the learned value function achieves close to 100% generalization over instances larger than those used in training, or the failure must be understood, and possibly fixed, logically. For this, we exploit the relation established between the expressive power of GNNs and the $C_{2}$ fragment of firstorder logic (namely, FOL with 2 variables and counting quantifiers). We find for example that domains with general policies that require more expressive features can be solved with GNNs once the states are extended with suitable "derived atoms" encoding role compositions and transitive closures that do not fit into $C_{2}$. The work follows the GNN approach for learning optimal general policies in a supervised fashion (Stahlberg, Bonet, Geffner, 2022); but the learned policies are no longer required to be optimal (which expands the scope, as many planning domains do not have general optimal policies) and are learned without supervision. Interestingly, valuebased reinforcement learning methods that aim to produce optimal policies, do not always yield policies that generalize, as the goals of optimality and generality are in conflict in domains where optimal planning is NPhard.
Comment: Proceedings of the 19th International Conference on Principles of Knowledge Representation and Reasoning (KR22)
 Full text View this record from Arxiv

Liberman, Andrés Occhipinti, Bonet, Blai, and Geffner, Hector
 Subjects

Computer Science  Artificial Intelligence
 Abstract

Two main approaches have been developed for learning firstorder planning (action) models from unstructured data: combinatorial approaches that yield crisp action schemas from the structure of the state space, and deep learning approaches that produce action schemas from states represented by images. A benefit of the former approach is that the learned action schemas are similar to those that can be written by hand; a benefit of the latter is that the learned representations (predicates) are grounded on the images, and as a result, new instances can be given in terms of images. In this work, we develop a new formulation for learning crisp firstorder planning models that are grounded on parsed images, a step to combine the benefits of the two approaches. Parsed images are assumed to be given in a simple O2D language (objects in 2D) that involves a small number of unary and binary predicates like "left", "above", "shape", etc. After learning, new planning instances can be given in terms of pairs of parsed images, one for the initial situation and the other for the goal. Learning and planning experiments are reported for several domains including Blocks, Sokoban, IPC Grid, and Hanoi.
 Full text View this record from Arxiv

Drexler, Dominik, Seipp, Jendrik, and Geffner, Hector
 Subjects

Computer Science  Artificial Intelligence
 Abstract

Recently, sketches have been introduced as a general language for representing the subgoal structure of instances drawn from the same domain. Sketches are collections of rules of the form C > E over a given set of features where C expresses Boolean conditions and E expresses qualitative changes. Each sketch rule defines a subproblem: going from a state that satisfies C to a state that achieves the change expressed by E or a goal state. Sketches can encode simple goal serializations, general policies, or decompositions of bounded width that can be solved greedily, in polynomial time, by the SIW_R variant of the SIW algorithm. Previous work has shown the computational value of sketches over benchmark domains that, while tractable, are challenging for domainindependent planners. In this work, we address the problem of learning sketches automatically given a planning domain, some instances of the target class of problems, and the desired bound on the sketch width. We present a logical formulation of the problem, an implementation using the ASP solver Clingo, and experimental results. The sketch learner and the SIW_R planner yield a domainindependent planner that learns and exploits domain structure in a crisp and explicit form.
Comment: This work will appear in the Proceedings of the 32nd International Conference on Automated Planning and Scheduling (ICAPS2022)
 Full text View this record from Arxiv

Ståhlberg, Simon, Bonet, Blai, and Geffner, Hector
 Subjects

Computer Science  Artificial Intelligence
 Abstract

It has been recently shown that general policies for many classical planning domains can be expressed and learned in terms of a pool of features defined from the domain predicates using a description logic grammar. At the same time, most description logics correspond to a fragment of $k$variable counting logic ($C_k$) for $k=2$, that has been shown to provide a tight characterization of the expressive power of graph neural networks. In this work, we make use of these results to understand the power and limits of using graph neural networks (GNNs) for learning optimal general policies over a number of tractable planning domains where such policies are known to exist. For this, we train a simple GNN in a supervised manner to approximate the optimal value function $V^{*}(s)$ of a number of sample states $s$. As predicted by the theory, it is observed that general optimal policies are obtained in domains where general optimal value functions can be defined with $C_2$ features but not in those requiring more expressive $C_3$ features. In addition, it is observed that the features learned are in close correspondence with the features needed to express $V^{*}$ in closed form. The theory and the analysis of the domains let us understand the features that are actually learned as well as those that cannot be learned in this way, and let us move in a principled manner from a combinatorial optimization approach to learning general policies to a potentially, more robust and scalable approach based on deep learning.
Comment: Proceedings of the 32nd International Conference on Automated Planning and Scheduling (ICAPS22)
 Full text View this record from Arxiv

Geffner, Hector
 AAAI 2022
 Subjects

Computer Science  Artificial Intelligence
 Abstract

Recent breakthroughs in AI have shown the remarkable power of deep learning and deep reinforcement learning. These developments, however, have been tied to specific tasks, and progress in outofdistribution generalization has been limited. While it is assumed that these limitations can be overcome by incorporating suitable inductive biases, the notion of inductive biases itself is often left vague and does not provide meaningful guidance. In the paper, I articulate a different learning approach where representations do not emerge from biases in a neural architecture but are learned over a given target language with a known semantics. The basic ideas are implicit in mainstream AI where representations have been encoded in languages ranging from fragments of firstorder logic to probabilistic structural causal models. The challenge is to learn from data the representations that have traditionally been crafted by hand. Generalization is then a result of the semantics of the language. The goals of this paper are to make these ideas explicit, to place them in a broader context where the design of the target language is crucial, and to illustrate them in the context of learning to act and plan. For this, after a general discussion, I consider learning representations of actions, general policies, and subgoals ("intrinsic rewards"). In these cases, learning is formulated as a combinatorial problem but nothing prevents the use of deep learning techniques instead. Indeed, learning representations over languages with a known semantics provides an account of what is to be learned, while learning representations with neural nets provides a complementary account of how representations can be learned. The challenge and the opportunity is to bring the two together.
 Full text View this record from Arxiv

Ståhlberg, Simon, Bonet, Blai, and Geffner, Hector
 Proceedings of the ThirtySecond International Conference on Automated Planning and Scheduling (ICAPS 2022). :629637
 Subjects

Natural Sciences, Computer and Information Sciences, Computer Sciences, Naturvetenskap, Data och informationsvetenskap, and Datavetenskap (datalogi)
 Abstract

It has been recently shown that general policies for many classical planning domains can be expressed and learned in termsof a pool of features defined from the domain predicates usinga description logic grammar. At the same time, most description logics correspond to a fragment of kvariable countinglogic (Ck ) for k = 2, that has been shown to provide a tightcharacterization of the expressive power of graph neural networks. In this work, we make use of these results to understand the power and limits of using graph neural networks(GNNs) for learning optimal general policies over a numberof tractable planning domains where such policies are knownto exist. For this, we train a simple GNN in a supervised manner to approximate the optimal value function V ∗(s) of anumber of sample states s. As predicted by the theory, it is observed that general optimal policies are obtained in domainswhere general optimal value functions can be defined withC2 features but not in those requiring more expressive C3 features. In addition, it is observed that the features learned are inclose correspondence with the features needed to express V ∗in closed form. The theory and the analysis of the domainslet us understand the features that are actually learned as wellas those that cannot be learned in this way, and let us movein a principled manner from a combinatorial optimization approach to learning general policies to a potentially, more robust and scalable approach based on deep learning.
 Full text View record in SwePub

Rodriguez, Ivan D., Bonet, Blai, Sardina, Sebastian, and Geffner, Hector
 The journal of artificial intelligence research. 74:887916
 Subjects

Natural Sciences, Mathematics, Discrete Mathematics, Naturvetenskap, Matematik, and Diskret matematik
 Abstract

We consider the problem of reaching a propositional goal condition in fullyobservable non deterministic (FOND) planning under a general class of fairness assumptions that are given explicitly. The fairness assumptions are of the form A/B and say that state trajectories that contain infinite occurrences of an action a from A in a state s and finite occurrence of actions from B, must also contain infinite occurrences of action a in s followed by each one of its possible outcomes. The infinite trajectories that violate this condition are deemed as unfair, and the solutions are policies for which all the fair trajectories reach a goal state. We show that strong and strongcyclic FOND planning, as well as QNP planning, a planning model introduced recently for generalized planning, are all special cases of FOND planning with fairness assumptions of this form which can also be combined. FOND+ planning, as this form of planning is called, combines the syntax of FOND planning with some of the versatility of LTL for expressing fairness constraints. A sound and complete FOND+ planner is implemented by reducing FOND+ planning to answer set programs, and its performance is evaluated in comparison with FOND and QNP planners, and LTL synthesis tools. Two other FOND+ planners are introduced as well which are more scalable but are not complete.
 Full text View on content provider's site

Rodriguez, Ivan D., Bonet, Blai, Romero, Javier, and Geffner, Hector
 Subjects

Computer Science  Artificial Intelligence
 Abstract

Recently Bonet and Geffner have shown that firstorder representations for planning domains can be learned from the structure of the state space without any prior knowledge about the action schemas or domain predicates. For this, the learning problem is formulated as the search for a simplest firstorder domain description D that along with information about instances I_i (number of objects and initial state) determine state space graphs G(P_i) that match the observed state graphs G_i where P_i = (D, I_i). The search is cast and solved approximately by means of a SAT solver that is called over a large family of propositional theories that differ just in the parameters encoding the possible number of action schemas and domain predicates, their arities, and the number of objects. In this work, we push the limits of these learners by moving to an answer set programming (ASP) encoding using the CLINGO system. The new encodings are more transparent and concise, extending the range of possible models while facilitating their exploration. We show that the domains introduced by Bonet and Geffner can be solved more efficiently in the new approach, often optimally, and furthermore, that the approach can be easily extended to handle partial information about the state graphs as well as noise that prevents some states from being distinguished.
 Full text View this record from Arxiv

Drexler, Dominik, Seipp, Jendrik, and Geffner, Hector
 Subjects

Computer Science  Artificial Intelligence
 Abstract

Widthbased planning methods deal with conjunctive goals by decomposing problems into subproblems of low width. Algorithms like SIW thus fail when the goal is not easily serializable in this way or when some of the subproblems have a high width. In this work, we address these limitations by using a simple but powerful language for expressing finer problem decompositions introduced recently by Bonet and Geffner, called policy sketches. A policy sketch over a set of Boolean and numerical features is a set of sketch rules that express how the values of these features are supposed to change. Like general policies, policy sketches are domain general, but unlike policies, the changes captured by sketch rules do not need to be achieved in a single step. We show that many planning domains that cannot be solved by SIW are provably solvable in low polynomial time with the SIW_R algorithm, the version of SIW that employs userprovided policy sketches. Policy sketches are thus shown to be a powerful language for expressing domainspecific knowledge in a simple and compact way and a convenient alternative to languages such as HTNs or temporal logics. Furthermore, they make it easy to express general problem decompositions and prove key properties of them like their width and complexity.
Comment: This work will appear in the Proceedings of the 18th International Conference on Principles of Knowledge Representation and Reasoning
 Full text View this record from Arxiv

Rodriguez, Ivan D., Bonet, Blai, Sardina, Sebastian, and Geffner, Hector
 Journal of Artificial Intelligence Research 2022
 Subjects

Computer Science  Artificial Intelligence
 Abstract

We consider the problem of reaching a propositional goal condition in fullyobservable nondeterministic (FOND) planning under a general class of fairness assumptions that are given explicitly. The fairness assumptions are of the form A/B and say that state trajectories that contain infinite occurrences of an action a from A in a state s and finite occurrence of actions from B, must also contain infinite occurrences of action a in s followed by each one of its possible outcomes. The infinite trajectories that violate this condition are deemed as unfair, and the solutions are policies for which all the fair trajectories reach a goal state. We show that strong and strongcyclic FOND planning, as well as QNP planning, a planning model introduced recently for generalized planning, are all special cases of FOND planning with fairness assumptions of this form which can also be combined. FOND+ planning, as this form of planning is called, combines the syntax of FOND planning with some of the versatility of LTL for expressing fairness constraints. A new planner is implemented by reducing FOND+ planning to answer set programs, and the performance of the planner is evaluated in comparison with FOND and QNP planners, and LTL synthesis tools.
Comment: Extended version of ICAPS21 paper
 Full text View this record from Arxiv

Francès, Guillem, Bonet, Blai, and Geffner, Hector
 Subjects

Computer Science  Artificial Intelligence
 Abstract

Generalized planning is concerned with the computation of general policies that solve multiple instances of a planning domain all at once. It has been recently shown that these policies can be computed in two steps: first, a suitable abstraction in the form of a qualitative numerical planning problem (QNP) is learned from sample plans, then the general policies are obtained from the learned QNP using a planner. In this work, we introduce an alternative approach for computing more expressive general policies which does not require sample plans or a QNP planner. The new formulation is very simple and can be cast in terms that are more standard in machine learning: a large but finite pool of features is defined from the predicates in the planning examples using a general grammar, and a small subset of features is sought for separating "good" from "bad" state transitions, and goals from nongoals. The problems of finding such a "separating surface" while labeling the transitions as "good" or "bad" are jointly addressed as a single combinatorial optimization problem expressed as a Weighted MaxSAT problem. The advantage of looking for the simplest policy in the given feature space that solves the given examples, possibly nonoptimally, is that many domains have no general, compact policies that are optimal. The approach yields general policies for a number of benchmark domains.
Comment: AAAI 2021, version extended with appendix containing full proofs and experimental details
 Full text View this record from Arxiv

Bonet, Blai and Geffner, Hector
 Subjects

Computer Science  Artificial Intelligence
 Abstract

It has been observed that in many of the benchmark planning domains, atomic goals can be reached with a simple polynomial exploration procedure, called IW, that runs in time exponential in the problem width. Such problems have indeed a bounded width: a width that does not grow with the number of problem variables and is often no greater than two. Yet, while the notion of width has become part of the stateoftheart planning algorithms like BFWS, there is still no good explanation for why so many benchmark domains have bounded width. In this work, we address this question by relating bounded width and serialized width to ideas of generalized planning, where general policies aim to solve multiple instances of a planning problem all at once. We show that bounded width is a property of planning domains that admit optimal general policies in terms of features that are explicitly or implicitly represented in the domain encoding. The results are extended to much larger class of domains with bounded serialized width where the general policies do not have to be optimal. The study leads also to a new simple, meaningful, and expressive language for specifying domain serializations in the form of policy sketches which can be used for encoding domain control knowledge by hand or for learning it from traces. The use of sketches and the meaning of the theoretical results are all illustrated through a number of examples.
Comment: Longer version of AAAI2021 paper that includes proofs and more explanations
 Full text View this record from Arxiv

Bonet, Blai and Geffner, Hector
 Journal of Artificial Intelligence Research 69 (2020) 923961
 Subjects

Computer Science  Artificial Intelligence
 Abstract

Qualitative numerical planning is classical planning extended with nonnegative real variables that can be increased or decreased "qualitatively", i.e., by positive indeterminate amounts. While deterministic planning with numerical variables is undecidable in general, qualitative numerical planning is decidable and provides a convenient abstract model for generalized planning. The solutions to qualitative numerical problems (QNPs) were shown to correspond to the strong cyclic solutions of an associated fully observable nondeterministic (FOND) problem that terminate. This leads to a generateandtest algorithm for solving QNPs where solutions to a FOND problem are generated one by one and tested for termination. The computational shortcomings of this approach for solving QNPs, however, are that it is not simple to amend FOND planners to generate all solutions, and that the number of solutions to check can be doubly exponential in the number of variables. In this work we address these limitations while providing additional insights on QNPs. More precisely, we introduce two polynomialtime reductions, one from QNPs to FOND problems and the other from FOND problems to QNPs both of which do not involve termination tests. A result of these reductions is that QNPs are shown to have the same expressive power and the same complexity as FOND problems.
 Full text View this record from Arxiv

Bonet, Blai and Geffner, Hector
 Subjects

Computer Science  Artificial Intelligence
 Abstract

Belief tracking is a basic problem in planning with sensing. While the problem is intractable, it has been recently shown that for both deterministic and nondeterministic systems expressed in compact form, it can be done in time and space that are exponential in the problem width. The width measures the maximum number of state variables that are all relevant to a given precondition or goal. In this work, we extend this result both theoretically and practically. First, we introduce an alternative decomposition scheme and algorithm with the same time complexity but different completeness guarantees, whose space complexity is much smaller: exponential in the causal width of the problem that measures the number of state variables that are causally relevant to a given precondition, goal, or observable. Second, we introduce a fast, meaningful, and powerful approximation that trades completeness by speed, and is both time and space exponential in the problem causal width. It is then shown empirically that the algorithm combined with simple heuristics yields stateoftheart realtime performance in domains with high widths but low causal widths such as Minesweeper, Battleship, and Wumpus.
Comment: Proceedings IJCAI13
 Full text View this record from Arxiv
17. Factored Probabilistic Belief Tracking [2019]

Bonet, Blai and Geffner, Hector
 Subjects

Computer Science  Artificial Intelligence
 Abstract

The problem of belief tracking in the presence of stochastic actions and observations is pervasive and yet computationally intractable. In this work we show however that probabilistic beliefs can be maintained in factored form exactly and efficiently across a number of causally closed beams, when the state variables that appear in more than one beam obey a form of backward determinism. Since computing marginals from the factors is still computationally intractable in general, and variables appearing in several beams are not always backwarddeterministic, the basic formulation is extended with two approximations: forms of belief propagation for computing marginals from factors, and sampling of nonbackwarddeterministic variables for making such variables backwarddeterministic given their sampled history. Unlike, RaoBlackwellized particlefiltering, the sampling is not used for making inference tractable but for making the factorization sound. The resulting algorithm involves sampling and belief propagation or just one of them as determined by the structure of the model.
Comment: Proceedings IJCAI13
 Full text View this record from Arxiv

Bonet, Blai, De Giacomo, Giuseppe, Geffner, Hector, and Rubin, Sasha
 Subjects

Computer Science  Artificial Intelligence and Computer Science  Logic in Computer Science
 Abstract

We study the characterization and computation of general policies for families of problems that share a structure characterized by a common reduction into a single abstract problem. Policies $\mu$ that solve the abstract problem P have been shown to solve all problems Q that reduce to P provided that $\mu$ terminates in Q. In this work, we shed light on why this termination condition is needed and how it can be removed. The key observation is that the abstract problem P captures the common structure among the concrete problems Q that is local (Markovian) but misses common structure that is global. We show how such global structure can be captured by means of trajectory constraints that in many cases can be expressed as LTL formulas, thus reducing generalized planning to LTL synthesis. Moreover, for a broad class of problems that involve integer variables that can be increased or decreased, trajectory constraints can be compiled away, reducing generalized planning to fully observable nondeterministic planning.
Comment: Proceedings IJCAI17
 Full text View this record from Arxiv

Bonet, Blai and Geffner, Hector
 Subjects

Computer Science  Artificial Intelligence
 Abstract

In the presence of nonadmissible heuristics, A* and other bestfirst algorithms can be converted into anytime optimal algorithms over OR graphs, by simply continuing the search after the first solution is found. The same trick, however, does not work for bestfirst algorithms over AND/OR graphs, that must be able to expand leaf nodes of the explicit graph that are not necessarily part of the best partial solution. Anytime optimal variants of AO* must thus address an explorationexploitation tradeoff: they cannot just "exploit", they must keep exploring as well. In this work, we develop one such variant of AO* and apply it to finitehorizon MDPs. This Anytime AO* algorithm eventually delivers an optimal policy while using nonadmissible random heuristics that can be sampled, as when the heuristic is the cost of a base policy that can be sampled with rollouts. We then test Anytime AO* for action selection over large infinitehorizon MDPs that cannot be solved with existing offline heuristic search and dynamic programming algorithms, and compare it with UCT.
Comment: Proceedings AAAI12
 Full text View this record from Arxiv
20. Learning FirstOrder Symbolic Representations for Planning from the Structure of the State Space [2019]

Bonet, Blai and Geffner, Hector
 Subjects

Computer Science  Artificial Intelligence
 Abstract

One of the main obstacles for developing flexible AI systems is the split between databased learners and modelbased solvers. Solvers such as classical planners are very flexible and can deal with a variety of problem instances and goals but require firstorder symbolic models. Databased learners, on the other hand, are robust but do not produce such representations. In this work we address this split by showing how the firstorder symbolic representations that are used by planners can be learned from nonsymbolic inputs that encode the structure of the state space. The representation learning problem is formulated as the problem of inferring planning instances over a common but unknown firstorder domain that account for the structure of the observed state space. This means to infer a complete firstorder representation (i.e. general action schemas, relational symbols, and objects) that explains the observed state space structures. The inference problem is cast as a twolevel combinatorial search where the outer level searches for values of a small set of hyperparameters and the inner level, solved via SAT, searches for a firstorder symbolic model. The framework is shown to produce general and correct firstorder representations for standard problems like Gripper, Blocksworld, and Hanoi from input graphs that encode the flat statespace structure of a single instance.
Comment: Proc. ECAI2020
 Full text View this record from Arxiv
Catalog
Books, media, physical & digital resources
Guides
Course and topicbased guides to collections, tools, and services.
1  20
Next