
Inputs from Hell: Learning Input Distributions for Grammar-Based Test Generation

journal contribution
posted on 2023-11-29, 18:07 authored by Ezekiel Soremekun, Esteban Pavese, Nikolas Havrikov, Lars Grunske, Andreas Zeller
Grammars can serve as producers for structured test inputs that are syntactically correct by construction. A probabilistic grammar assigns probabilities to individual productions, thus controlling the distribution of input elements. Using the grammars as input parsers, we show how to learn input distributions from input samples, which allows us to create inputs that are similar to the sample; by inverting the probabilities, we can create inputs that are dissimilar to the sample. This allows for three test generation strategies: 1) “Common inputs”: by learning from common inputs, we can create inputs that are similar to the sample; this is useful for regression testing. 2) “Uncommon inputs”: learning from common inputs and inverting probabilities yields inputs that are strongly dissimilar to the sample; this is useful for completing a test suite with “inputs from hell” that test uncommon features, yet are syntactically valid. 3) “Failure-inducing inputs”: learning from inputs that caused failures in the past gives us inputs that share similar features and thus also have a high chance of triggering bugs; this is useful for testing the completeness of fixes. Our evaluation on three common input formats (JSON, JavaScript, CSS) shows the effectiveness of these approaches. Results show that “common inputs” reproduced 96% of the methods induced by the samples. In contrast, for almost all subjects (95%), the “uncommon inputs” covered significantly different methods from the samples. Learning from failure-inducing samples reproduced all exceptions (100%) triggered by the failure-inducing samples and discovered new exceptions not found in any of the samples learned from.
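The learn-then-invert idea described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy grammar, the function names `learn`, `invert`, and `generate`, the reciprocal-weight inversion scheme, and the smoothing constant `eps`. The sketch also assumes that a parser has already reported which production alternatives each sample used; it does not implement parsing.

```python
import random
from collections import Counter

# Hypothetical toy grammar: nonterminal -> list of alternatives (lists of symbols).
GRAMMAR = {
    "<value>": [["<number>"], ["<string>"], ["[", "<value>", "]"]],
    "<number>": [["0"], ["1"]],
    "<string>": [['"a"'], ['"b"']],
}

def learn(grammar, used):
    """Estimate alternative probabilities from (nonterminal, index) usage records."""
    counts = Counter(used)
    probs = {}
    for nt, alts in grammar.items():
        total = sum(counts[(nt, i)] for i in range(len(alts)))
        if total == 0:
            # Nonterminal never seen in the samples: fall back to uniform.
            probs[nt] = [1 / len(alts)] * len(alts)
        else:
            probs[nt] = [counts[(nt, i)] / total for i in range(len(alts))]
    return probs

def invert(probs, eps=1e-3):
    """Assumed inversion scheme: weight by 1 / (p + eps), then renormalize."""
    inverted = {}
    for nt, ps in probs.items():
        weights = [1 / (p + eps) for p in ps]
        total = sum(weights)
        inverted[nt] = [w / total for w in weights]
    return inverted

def generate(grammar, probs, symbol="<value>", rng=random, depth=0, max_depth=8):
    """Expand `symbol` by sampling alternatives according to `probs`."""
    if symbol not in grammar:
        return symbol  # terminal symbol
    alts = grammar[symbol]
    if depth >= max_depth:
        # Force the shortest alternative so that recursion terminates.
        alt = min(alts, key=len)
    else:
        alt = rng.choices(alts, weights=probs[symbol])[0]
    return "".join(
        generate(grammar, probs, s, rng, depth + 1, max_depth) for s in alt
    )

# Hypothetical usage records, as a parser might report them for a sample set.
sample_uses = (
    [("<value>", 0)] * 8 + [("<value>", 1)] * 2 + [("<number>", 0)] * 8
)
common = learn(GRAMMAR, sample_uses)   # favors what the samples used
uncommon = invert(common)              # favors what the samples avoided

rng = random.Random(0)
common_input = generate(GRAMMAR, common, rng=rng)
uncommon_input = generate(GRAMMAR, uncommon, rng=rng)
```

With these records, the learned distribution never selects the nested-list alternative because the samples never used it, while the inverted distribution selects it most of the time. The third strategy from the abstract would reuse `learn` unchanged, fed with usage records from failure-inducing samples instead of common ones.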

Preferred Citation

Ezekiel Soremekun, Esteban Pavese, Nikolas Havrikov, Lars Grunske and Andreas Zeller. Inputs from Hell: Learning Input Distributions for Grammar-Based Test Generation. In: IEEE Transactions on Software Engineering. 2020.

Primary Research Area

  • Secure Connected and Mobile Systems

Legacy Posted Date

2020-08-03

Journal

IEEE Transactions on Software Engineering

Open Access Type

  • Green

Sub Type

  • Article

BibTeX

@article{cispa_all_3167,
  title = "Inputs from Hell: Learning Input Distributions for Grammar-Based Test Generation",
  author = "Soremekun, Ezekiel and Pavese, Esteban and Havrikov, Nikolas and Grunske, Lars and Zeller, Andreas",
  journal = "{IEEE Transactions on Software Engineering}",
  year = "2020",
}
