Using AI to aid automation of proof search in Formal Methods

Publication listing for 2015

  • Ewen Denney, Ganesh J. Pai, and Iain Whiteside. Formal foundations for hierarchical safety cases. In HASE 2015, pages 52–59. IEEE, 2015.


    Safety cases are increasingly being required in many safety-critical domains to assure, using structured argumentation and evidence, that a system is acceptably safe. However, comprehensive system-wide safety arguments present appreciable challenges to develop, understand, evaluate, and manage, partly due to the volume of information that they aggregate, such as the results of hazard analysis, requirements analysis, testing, formal verification, and other engineering activities. Previously, we have proposed hierarchical safety cases, hicases, to aid the comprehension of safety case argument structures. In this paper, we build on a formal notion of safety case to formalise the use of hierarchy as a structuring technique, and show that hicases satisfy several desirable properties. Our aim is to provide a formal, theoretical foundation for safety cases. In particular, we believe that tools for high assurance systems should be granted similar assurance to the systems to which they are applied. To this end, we formally specify and prove the correctness of key operations for constructing and managing hicases, which gives the specification for implementing hicases in Advocate, our toolset for safety case automation. We motivate and explain the theory with the help of a simple running example, extracted from a real safety case and developed using Advocate.

Publication listing for 2014

  • Leo Freitas, Cliff B. Jones, Andrius Velykis, and Iain Whiteside. A model for capturing and replaying proof strategies. In Dimitra Giannakopoulou and Daniel Kroening, editors, VSTTE 2014, volume 8471 of LNCS, pages 183–199. Springer, 2014.


    Modern theorem provers can discharge a significant proportion of Proof Obligation (POs) that arise in the use of Formal Method (FMs). Unfortunately, the residual POs require tedious manual guidance. On the positive side, these “difficult” POs tend to fall into families each of which requires only a few key ideas to unlock. This paper outlines a system that can lessen the burden of FM proofs by identifying and characterising ways of discharging POs of a family by tracking an interactive proof of one member of the family. This opens the possibility of capturing ideas - represented as proof strategies - from an expert and/or maximising reuse of ideas after changes to definitions. The proposed system has to store a wealth of meta-information about conjectures, which can be matched against previously learned strategies, or can be used to construct new strategies based on expert guidance.

  • Gudmund Grov, Aleks Kissinger, and Yuhui Lin. Tinker, tailor, solver, proof. In Christoph Benzmüller and Bruno Woltzenlogel Paleo, editors, UITP 2014, volume 167 of EPTCS, pages 23–34, 2014.


    We introduce Tinker, a tool for designing and evaluating proof strategies based on proof-strategy graphs, a formalism previously introduced by the authors. We represent proof strategies as open-graphs, which are directed graphs with additional input/output edges. Tactics appear as nodes in a graph, and can be ‘piped’ together by adding edges between them. Goals are added to the input edges of such a graph, and flow through the graph as the strategy is evaluated. Properties of the edges ensure that only the right ‘type’ of goals are accepted. In this paper, we detail the Tinker tool and show how it can be integrated with two different theorem provers: Isabelle and ProofPower.

  • Leo Freitas and Iain Whiteside. Proof patterns for formal methods. In Cliff B. Jones, Pekka Pihlajasaari, and Jun Sun, editors, FM 2014, volume 8442 of LNCS, pages 279–295. Springer, 2014.


    Design patterns represent a highly successful technique in software engineering, giving a reusable ‘best practice’ solution to commonly occurring problems in software design. Taking inspiration from this approach, this paper introduces proof patterns, which aim to provide a common vocabulary for solving formal methods proof obligations by capturing and describing solutions to common patterns of proof.

  • Gudmund Grov, Leo Freitas and Iain Whiteside, editors. Contribution of the AI4FM 2014 Workshop, Singapore, 2014. Department of Computer Science, Heriot-Watt University. Technical report HW-MACS-TR-0100.


    The AI4FM 2014 workshop was being held 13 May 2014 in Singapore, as a satellite event of FM 2014. This report contains a collection of abstracts for all the workshop talks. The main goal of the AI4FM workshop series is to bring together researchers in the areas of formal methods and artificial intelligence to address the issue of how artificial intelligence can be utilised to support the formal development process. This is the fifth workshop in this series, and previous events have been held in: Newcastle (2010), Edinburgh (2011), Schloss Dagstuhl (Germany, 2012) and Rennes (France, part of ITP 2013).

  • Leo Freitas, Cliff B. Jones, and Andrius Velykis. Can a system learn from interactive proofs? In Andrei Voronkov and Margarita Korovina, editors, HOWARD-60. A Festschrift on the Occasion of Howard Barringer’s 60th Birthday, pages 124-139. EasyChair, 2014.


    This paper sets out the on-going research in a project which is investigating how to learn from one interactive proof so that other similar proofs can be completed automatically.

Publication listing for 2013

  • Gudmund Grov, Aleks Kissinger and Yuhui Lin. A Graphical Language for Proof Strategies. In Logic for Programming Artificial Intelligence and Reasoning, LPAR-19th, in Stellenbosch, South Africa.


    Complex automated proof strategies are often difficult to extract, visualise, modify, and debug. Traditional tactic languages, often based on stack-based goal propagation, make it easy to write proofs that obscure the flow of goals between tactics and are fragile to minor changes in input, proof structure or changes to tactics themselves. Here, we address this by introducing a graphical language called PSGraph for writing proof strategies. Strategies are constructed visually by “wiring together” collections of tactics and evaluated by propagating goal nodes through the diagram via graph rewriting. Tactic nodes can have many output wires, and use a filtering procedure based on goal-types (predicates describing the features of a goal) to decide where best to send newly-generated sub-goals.

    In addition to making the flow of goal information explicit, the graphical language can fulfil the role of many tacticals using visual idioms like branching, merging, and feedback loops. We argue that this language enables development of more robust proof strategies and provide several examples, along with a prototype implementation in Isabelle.

  • Jonathan Heras, Ekaterina Komendantskaya, Moa Johannson and Ewen Maclean. Proof-Pattern Recognition and Lemma Discovery in ACL2. In Logic for Programming Artificial Intelligence and Reasoning, LPAR-19th, in Stellenbosch, South Africa.


    We present a novel technique for combining statistical ma- chine learning for proof-pattern recognition with symbolic methods for lemma discovery. The resulting tool, ACL2(ml), gathers proof statis- tics and uses statistical pattern-recognition to pre-processes data from libraries, and then suggests auxiliary lemmas in new proofs by analogy with already seen examples. This paper presents the implementation of ACL2(ml) alongside theoretical descriptions of the proof-pattern recog- nition and lemma discovery methods involved in it.

  • Leo Freitas, Cliff B. Jones, Andrius Velykis, and Iain Whiteside. How to say why (in AI4FM). Technical Report CS-TR-1398, School of Computing Science, Newcastle University, October 2013.


    In the AI4FM project we have set ourselves the challenge of building a system that can learn high-level proof strategies by monitoring expert users. A typical level of ambition is users who are proving the feasibility and reification of medium-sized specifications. The purpose of this report is to provide a source document. In particular, it (a) summarises some experiments in the use of verification tools to determine how realistic the ambition is of extracting the “why” from experts’ use of verification tools; and, (b) provides a revision of an earlier description of an abstract model of an AI4FM system that is linked to the case studies.

  • Cliff B. Jones, Leo Freitas, and Andrius Velykis. Ours is to reason why. In Zhiming Liu, Jim Woodcock, and Huibiao Zhu, editors, Theories of Programming and Formal Methods - Essays Dedicated to Jifeng He on the Occasion of His 70th Birthday, volume 8051 of LNCS, pages 227–243, Shanghai, China, September 2013. Springer.


    It is now widely understood how to write formal specifications so as to be able to justify designs (and thus implementations) against such specifications. In many formal approaches, a “posit and prove” approach allows a designer to record an engineering design decision from which a collection of “proof obligations” are generated; their discharge justifies the design step. Modern theorem proving tools greatly simplify the discharge of such proof obligations. In typical industrial applications, however, there remain sufficiently many proof obligations that require manual intervention that an engineer finds them a hurdle to the deployment of formal proofs. This problem is exacerbated by the need to repeat proofs when changes are made to specifications or designs. This paper outlines how a key additional resource can be brought to bear on the discharge of proof obligations: the central idea is to “learn” new ways of discharging families of proof obligations by tracking one interactive proof performed by an expert. Since what blocks any fixed set of heuristics from automatically discharging proof obligations is issues around data structures and/or functions, it is expected that what the system can learn from one interactive proof will facilitate the discharge of significant “families” of recalcitrant proof tasks.

  • Gudmund Grov, Ewen Maclean, and Leo Freitas, editors. Contributions to AI4FM 2013, Rennes, France, 2013. Department of Computer Science, Heriot-Watt University. Technical report HW-MACS-TR-0100.


    This report contains the contributions to the 4th international workshop on Artificial Intelligence for Formal Methods (AI4FM 2013). This was held as a satellite event of the 4th conference on Interactive Theorem Proving (ITP 2013), in Rennes (France) on July 22nd 2013

  • Andrew Ireland, Gudmund Grov, Maria Teresa Llano, and Michael Butler. Reasoned modelling critics: Turning failed proofs into modelling guidance. Science of Computer Programming, 78(3):293–309, 2013. Special issue: Abstract State Machines, Alloy, B and Z – Selected Papers from ABZ 2010.


    The activities of formal modelling and reasoning are closely related. But while the rigour of building formal models brings significant benefits, formal reasoning remains a major barrier to the wider acceptance of formalism within design. Here we propose reasoned modelling critics – an approach which aims to abstract away from the complexities of low-level proof obligations, and provide high-level modelling guidance to designers when proofs fail. Inspired by proof planning critics, the technique combines proof-failure analysis with modelling heuristics. Here, we present the details of our proposal, implement them in a prototype and outline future plans.

Publication listing for 2012

  • Alan Bundy, Dieter Hutter, Cliff B. Jones, and J Strother Moore. AI meets Formal Software Development (Dagstuhl Seminar 12271). Dagstuhl Reports, 2(7):1–29, 2012.


    This report documents the program and the outcomes of Dagstuhl Seminar 12271 “AI meets Formal Software Development”. This seminar brought together researchers from formal methods and AI. The participants addressed the issue of how AI can aid the formal software development process, including modelling and proof. There was a pleasing number of participants from industry and this made it possible to ground the discussions on industrial-scale problems.

  • Andrius Velykis. Inferring the Proof Process. In Christine Choppy, David Delayahe, and Kaïs Klaï, editors, Doctoral Symposium at 18th International Symposium on Formal Methods (FM2012), Paris, France, August 2012.


    This PhD project aims to investigate how enough information can be collected from an interactive formal proof to capture an expert’s ideas as a high-level proof process. It would then serve for extracting proof strategies to facilitate proof automation. Ways of inferring this proof process automatically are explored; and a family of tools is developed to capture the different proof processes and their features.

  • Iain Whiteside, David Aspinall, and Gudmund Grov. An essence of SSReflect. In Johan Jeuring, John A. Campbell, Jacques Carette, Gabriel Dos Reis, Petr Sojka, Makarius Wenzel, and Volker Sorge, editors, Conferences on Intelligent Computer Mathematics—CICM 2012, volume 7362 of LNCS, pages 186–201. Springer, 2012.


    SSReflect is a powerful language for proving theorems in the Coq system. It has been used for some of the largest proofs in formal mathematics thus far. However, although it constructs proofs in a formal system, like most other proof languages the semantics is informal making it difficult to reason about such proof scripts. We give a semantics to a subset of the language, using a hierarchical notion of proof tree, and show some simple transformations on proofs that preserve the semantics.

  • Gudmund Grov, Ekaterina Komendantskaya, and Alan Bundy. A statistical relational learning challenge – extracting proof strategies from exemplar proofs. In ICML-12 Workshop on Statistical Relational Learning (SRL 2012), Edinburgh, Scotland, June 2012.


    Proof automation is becoming a bottleneck in both mechanised mathematics and particularly industrial use of formal development methods. In this challenge paper, we argue that statistical relational learning can be used to discover and extract proof strategies, which can aid with this automation. We outline what we believe are key problems and possible approaches to solve these questions.

  • Yuhui Lin, Alan Bundy, and Gudmund Grov. The use of Rippling to automate Event-B invariant preservation proofs. In IJCAR Workshop on Automated Theory eXploration (ATX 2012), Manchester, UK, June 2012.


    Proof automation is a common bottleneck for industrial adoption of formal methods. In Event-B, a significant proportion of proof obligations which require human interaction fall into a family called invariant preservation. In this paper we show that a rewriting technique called rippling can increase the automation of proofs in this family, and extend this technique by combining two existing approaches.

    An earlier version of this paper has previously been published in NFM12.

  • Gudmund Grov, Andrew Ireland, and Maria Teresa Llano. Refinement plans for informed formal design. In John Derrick, John A. Fitzgerald, Stefania Gnesi, Sarfraz Khurshid, Michael Leuschel, Steve Reeves, and Elvinia Riccobene, editors, Abstract State Machines, Alloy, B, VDM, and Z - Third International Conference, ABZ 2012, volume 7316 of LNCS, pages 208–222. Springer, 2012.


    Refinement is a powerful technique for tackling the complexities that arise when formally modelling systems. Here we focus on a posit-and-prove style of refinement, and specifically where a user requires guidance in order to overcome a failed refinement step. We take an integrated approach – combining the complementary strengths of top-down planning and bottom-up theory formation. In this paper we focus mainly on the planning perspective. Specifically, we propose a new technique called refinement plans which combines both modelling and reasoning perspectives. When a refinement step fails, refinement plans provide a basis for automatically generating modelling guidance by abstracting away from the details of low-level proof failures. The refinement plans described here are currently being implemented for the Event-B modelling formalism, and have been assessed on paper using case studies drawn from the literature. Longer-term, our aim is to identify refinement plans that are applicable to a range of modelling formalisms.

  • Yuhui Lin, Alan Bundy, and Gudmund Grov. The use of Rippling to automate Event-B invariant preservation proofs. In Alwyn Goodloe and Suzette Person, editors, NASA Formal Methods—4th International Symposium, NFM 2012, volume 7226 of LNCS, pages 231–236. Springer, 2012.


    Proof automation is a common bottleneck for industrial adoption of formal methods. In Event-B, a significant proportion of proof obligations which require human interaction fall into a family called invariant preservation. In this paper we show that a rewriting technique called rippling can increase the automation of proofs in this family, and extend this technique by combining two existing approaches.

Publication listing for 2011

  • Iain Whiteside, David Aspinall, Lucas Dixon, and Gudmund Grov. Towards formal proof script refactoring. In James H. Davenport, William M. Farmer, Josef Urban, and Florian Rabe, editors, Intelligent Computer Mathematics—18th Symposium, Calculemus 2011, and 10th International Conference, MKM 2011, volume 6824 of LNCS, pages 260–275. Springer, 2011.


    We propose proof script refactorings as a robust tool for constructing, restructuring, and maintaining formal proof developments. We argue that a formal approach is vital, and illustrate by defining and proving correct a number of valuable refactorings in a simplified proof script and declarative proof language of our own design.

  • Productive use of failure in top-down formal methods

    A. Bundy, G. Grov, and Y. Lin

    Abstract: Proof obligations (POs) arise when applying formal methods (FMs). The undischarged POs can become a bottleneck in the use of FMs in industry. The work we proposed here, as a part of AI4FM project, aims at increasing the proportion of discharged POs by analysing failed proofs and related patches to classify POs as families and construct proof strategies, which can be used to guide proof search for the POs in the same families.

Publication listing for 2010

  • Learning from an expert's proof

    L. Freitas and C. B. Jones
    In proceedings of UV10 (Usable Verification) workshop.

    Abstract: This position paper outlines the background and current approaches taken within AI4FM, a 4-year research project aimed at combining AI methodologies to aid proof discovery of certain families of interest. Namely, those repeated proofs often appearing in the application of verification to industrial applications.

  • A small experiment in Event-B rippling

    G. Grov, A. Bundy, and L. Dixon
  • AI4FM — A new project seeking challenges!

    G. Grov and C. B. Jones
    In proceedings of Tools & Experiments workshop at VSTTE 2010.

    Abstract: The "proof obligations" generated from many formal method tend to be simple and can often be discharged by modern automatic theorem provers or SMT systems. However, those proof tasks that need hand –or interactive– intervention present a barrier to the use of formal methods. Theorem proving was one of the earliest challenges addressed by researchers in the area of Artificial Intelligence and enormous progress has been made in the provision of general purpose heuristics. The approach in the recently started AI4FM project is different: we hope to devise a system that will learn from an expert user how they tackle one interactive proof and then apply the discovered high-level strategy to other related proof tasks. We are fortunate in having access to many such problems through the DEPLOY project but are aware of the dan- gers of devising an overly specific approach. This short paper appeals for challenge problems from other sources.

  • Formal Modelling of Separation Kernel Components

    A. Velykis and L. Freitas

    Abstract: Separation kernels are key components in embedded applications. Their small size and widespread use in high-integrity environments make them good targets for formal modelling and verification. We summarise results from the mechanisation of a separation kernel scheduler using the Z/Eves theorem prover. We concentrate on key data structures to model scheduler operations. The results are part of an experiment in a Grand Challenge in software verification, as part of a pilot project in verified OS kernels. The project aims at creating a mechanised formal model of kernel components that gets refined to code. This provides a set of reusable components, proof strategies, and general lemmas. Important findings about properties and requirements are also discussed.

  • Ideas for a high-level proof strategy language

    C. B. Jones, G. Grov, and A. Bundy
    In proceedings of Fifth Automated Formal Methods workshops (AFM'10) — In association with Computer-Aided Verification (CAV) 2010, part of the Federated Logic Conference (FLoC).

    Abstract: Finding ways to prove theorems mechanically was one of the earliest challenges tackled by the AI community. Notable progress has been made but there is still always a limit to any set of heuristic search techniques. From a proof done by human users, we wish to find out whether AI techniques can also be used to learn from a human user. AI4FM (Artificial Intelligence for Formal Methods) is a four-year project that starts officially in April 2010. It focuses on helping users of "formal methods" many of which give rise to proof obligations that have to be (mechanically) verified (by a theorem prover). In industrial-sized developments, there are often a large number of proof obligations and, whilst many of them succumb to similar proof strategies, those that remain can hold up engineers trying to use formal methods. The goal of AI4FM is to learn enough from one manual proof, to discharge proof obligations automatically that yield to similar proof strategies. To achieve this, a high-level (proof) strategy language is required, and in this paper we outline some ideas of such language, and towards extracting them.

  • The AI4FM approach for proof automation within formal methods — A Grand Challenge 6 "Dependable Systems Evolution" project

    G. Grov, A. Bundy, C. B. Jones, and A. Ireland
    In UKCRC Submissions to Grand Challenges in Computing Research 2010 GCCR’10 (April, 2010)

Publication listing for 2009

  • An outline of a proposed system that learns from experts how to discharge proof obligations automatically

    A. Bundy, G. Grov, and C. B. Jones
    Extended abstract to appear in Dagstuhl proceedings on Refinement Based Methods (Seminar 09381; September, 2009).
  • Learning from experts to aid the automation of proof search

    A. Bundy, G. Grov, and C. B. Jones
    Short contribution to AVoCS’09

    Abstract: Most formal methods give rise to proof obligations (POs) which are putative lemmas that need proof. Discharging these POs can become a bottleneck in the use of formal methods in practical applications. It is our aim to increase the repertoire of techniques for reducing this bottleneck by tackling learning from proof attempts. In many cases where a correct PO has not been discharged, an expert can easily see how to complete a proof. We believe that it would be acceptable to rely on such expert intervention to do one proof if this would enable a system to kill off others "of the same form".



  • Capturing Proof Process

    Andrius Velykis. Capturing Proof Process. PhD thesis, School of Computing Science, Newcastle University, UK, 2015.

    Andrius' PhD thesis investigates capture, abstraction and inferring of interactive proof processes. The captured data can be used to extract reusable proof strategies as well as improve proof maintenance, teaching and training, etc. Associated tools, namely the ProofProcess framework, are available on GitHub.


    Proof automation is a common bottleneck for industrial adoption of formal methods. Heuristic search techniques fail to discharge every proof obligation (PO), and significant effort is spent on proving the remaining ones interactively. Luckily, they usually fall into several proof families, where a single idea is required to discharge all similar POs. However, interactive formal proof requires expertise and is expensive: repeating the ideas over multiple proofs adds up to significant costs.

    The AI4FM research project aims to alleviate the repetitive effort by “learning” from an expert doing interactive proof. The expert’s proof attempts can give rise to reusable strategies, which capture the ideas necessary to discharge similar POs. Automatic replay of these strategies would complete the remaining proof tasks within the same family, enabling the expert to focus on novel proof ideas.

    This thesis presents an architecture to capture the expert’s proof ideas as a high-level proof process. Expert insight is not reflected in low-level proof scripts, therefore a generic ProofProcess framework is developed to capture high-level proof information, such as proof intent and important proof features of the proof steps taken. The framework accommodates branching to represent the actual proof structure as well as layers of abstraction to accommodate different granularities. The full history of how the proof was discovered is recorded, including multiple attempts to capture alternative, failed or unfinished versions.

    A prototype implementation of the ProofProcess framework is available, including integrations with Isabelle and Z/EVES theorem provers. Two case studies illustrate how the ProofProcess systems are used to capture high-level proof processes in examples from industrial-style formal developments. Reuse of the captured information to discharge similar proofs within the examples is also explored.

    The captured high-level information facilitates extraction of reusable proof strategies. Furthermore, the data could be used for proof maintenance, training, proof metrics, and other use cases.

  • Extending the proof methods and critics of a proof planner

    MSc thesis in Cognitive Science by Daniel Raggi (School of Informatics, University of Edinburgh, 2011)
  • A study in the use of Event-B for system development from a software engineering viewpoint

    MSc thesis by Sasan Padidar (School of Informatics, University of Edinburgh, 2010)

AI4FM documents

  • Case for support — AI4FM research project proposal

    C. B. Jones, A. Bundy, and A. Ireland

Background information

The following resources have not been published as part of the AI4FM project. They are included as background information supporting the ideas behind the research.