Mining Input Grammars from Dynamic Control Flow

conference contribution
posted on 2023-11-29, 18:13 authored by Rahul Gopinath, Björn Mathis, Andreas Zeller
One of the key properties of a program is its input specification. Having a formal input specification can be critical in fields such as vulnerability analysis, reverse engineering, software testing, clone detection, or refactoring. Unfortunately, accurate input specifications for typical programs are often unavailable or out of date. In this paper, we present a general algorithm that takes a program and a small set of sample inputs and automatically infers a readable context-free grammar capturing the input language of the program. We infer the syntactic input structure solely by observing how input characters are accessed at different locations of the input parser. This works on all stack-based recursive-descent input parsers, including parser combinators, and works entirely without program-specific heuristics. Our Mimid prototype produced accurate and readable grammars for a variety of evaluation subjects, including complex languages such as JSON, TinyC, and JavaScript.
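The core idea in the abstract, attributing each input character to the parser location that accessed it, can be illustrated with a minimal sketch. This is not the Mimid implementation: the toy parser (`expr`, `term`, `number`), the `Tracer` class, the `traced` decorator, and the `mine` function are all hypothetical names introduced here, and the sketch tracks function calls only, without the control-flow tracking or generalization steps a full approach would need.

```python
from collections import defaultdict


class Tracer:
    """Hypothetical input wrapper: records which parser call consumed each character."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.root = ("<start>", [])   # node = (nonterminal name, children)
        self.stack = [self.root]      # currently active parser calls

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def consume(self):
        ch = self.text[self.pos]
        self.stack[-1][1].append(ch)  # attribute the character to the active call
        self.pos += 1
        return ch


def traced(fn):
    """Decorator: every call to a parser function becomes a node in the derivation tree."""
    def wrapper(t):
        node = (f"<{fn.__name__}>", [])
        t.stack[-1][1].append(node)
        t.stack.append(node)
        try:
            return fn(t)
        finally:
            t.stack.pop()
    return wrapper


# A toy recursive-descent parser for sums of numbers with parentheses.
@traced
def expr(t):
    term(t)
    while t.peek() == "+":
        t.consume()
        term(t)


@traced
def term(t):
    if t.peek() == "(":
        t.consume()
        expr(t)
        t.consume()  # closing ")"; assumes well-formed sample inputs
    else:
        number(t)


@traced
def number(t):
    while t.peek().isdigit():
        t.consume()


def mine(samples):
    """Run the parser on each sample and read one grammar rule off every tree node."""
    grammar = defaultdict(set)

    def walk(node):
        name, children = node
        # Characters become terminals; nested calls become nonterminals.
        rule = tuple(c if isinstance(c, str) else c[0] for c in children)
        grammar[name].add(rule)
        for child in children:
            if not isinstance(child, str):
                walk(child)

    for sample in samples:
        tracer = Tracer(sample)
        expr(tracer)
        walk(tracer.root)
    return grammar


if __name__ == "__main__":
    for nonterminal, rules in mine(["1+(2+34)"]).items():
        for rule in sorted(rules):
            print(nonterminal, "::=", " ".join(rule))
```

In this sketch the function names double as nonterminal names, which is one reason a grammar mined this way can stay readable. The rules for `<number>` remain as specific as the samples (here `1`, `2`, and `3 4`); widening them to a digit class would require a separate generalization step that is not shown.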

Preferred Citation

Rahul Gopinath, Björn Mathis and Andreas Zeller. Mining Input Grammars from Dynamic Control Flow. In: European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). 2020.

Primary Research Area

  • Secure Connected and Mobile Systems

Name of Conference

European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE)

Legacy Posted Date

2020-06-09

Open Access Type

  • Unknown

BibTeX

@inproceedings{cispa_all_3101,
  title     = "Mining Input Grammars from Dynamic Control Flow",
  author    = "Gopinath, Rahul and Mathis, Björn and Zeller, Andreas",
  booktitle = "{European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE)}",
  year      = "2020",
}
