Identifying Selection Bias from Observational Data

conference contribution
posted on 2024-04-04, 06:56 authored by David Kaltenpoth, Jilles Vreeken
Access to a representative sample from the population is an assumption that underpins all of machine learning. Selection effects can cause observations to come from a subpopulation instead, so that our inferences may be biased. It is therefore important to know whether a sample is affected by selection effects. We study the conditions under which selection bias can be identified and give results for both parametric and non-parametric families of distributions. Based on these results, we develop two practical methods to determine whether an observed sample comes from a distribution subject to selection bias. Through extensive evaluation on synthetic and real-world data, we verify that our methods beat the state of the art both in detecting and in characterizing selection bias.

Editor

Williams B; Chen Y; Neville J

Primary Research Area

  • Trustworthy Information Processing

Name of Conference

National Conference of the American Association for Artificial Intelligence (AAAI)

Journal

Proceedings of the AAAI Conference on Artificial Intelligence

Volume

37

Page Range

8177-8185

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Open Access Type

  • Gold

BibTeX

@inproceedings{Kaltenpoth:Vreeken:2023,
  title = "Identifying Selection Bias from Observational Data",
  author = "Kaltenpoth, David and Vreeken, Jilles",
  editor = "Williams, Brian and Chen, Yiling and Neville, Jennifer",
  year = 2023,
  month = 6,
  booktitle = "Proceedings of the AAAI Conference on Artificial Intelligence",
  volume = "37",
  number = "7",
  pages = "8177--8185",
  publisher = "Association for the Advancement of Artificial Intelligence (AAAI)",
  issn = "2159-5399",
  doi = "10.1609/aaai.v37i7.25987"
}
