13 Papers accepted at IEEE S&P

May 18, 2026

Scientists at MPI-SP published 13 research articles at one of the top security conferences in the world. The 47th IEEE Symposium on Security and Privacy will take place this week in San Francisco.

"I Wonder if These Warnings Are Accurate": Security and Privacy Advice in Nine Majority World Countries

Authors: Collins W. Munyendo, Veronica A. Rivera, Jackie Hu, Emmanuel Tweneboah, Amna Shahnawaz, Karen Sowon, Dilara Keküllüoğlu, Marcos Silva, Yue Deng, Mercy Omeiza, Gayatri Priyadarsini Kancherla, Maria Rosario Niniz Silva, Maryam Mustafa, Abhishek Bichhawat, Francisco Marmolejo-Cossio, Elissa M. Redmiles, Yixin Zou

Security and privacy (S&P) advice plays a crucial role in how people stay safe online. While prior work shows that the plethora of advice from varied sources makes it difficult for users to prioritize advice, the insights are primarily based on studies conducted in Western contexts. Other work shows that users outside the West have different S&P needs and thus, we cannot simply rely on advice curated in the West to generalize to the majority world—regions of Africa, Asia, Latin America, and the Middle East, where most of the world’s population lives. We fill this gap by investigating S&P advice across nine majority world countries via 70 semi-structured interviews with local experts: cybercafe operators, tech repair specialists, and other community figures that people commonly rely on for tech support and S&P advice. We find that the advice provided by local experts in the majority world largely matches the advice they provide to their constituents and the advice from the West. However, we surface various significant barriers that hinder majority world users from implementing advice, including economic constraints, language barriers, and social friction from taking protective measures. Our findings further show how factors such as social norms and gender shape advice practices, e.g., by driving gendered advice-seeking. We discuss how S&P advice in the majority world can be improved and reflect on how the S&P community can better engage with local communities in conducting similar research.

Original publication: https://www.computer.org/csdl/proceedings-article/sp/2026/606500a076/2bojv9k3yOk

SmuFuzz: Enable Deep System Management Mode Fuzzing in Fully Featured UEFI Runtime Environment

Authors: Jianqiang Wang, Yi Xiang, Meng Wang, Qinying Wang, Ali Abbasi, Thorsten Holz

As part of the UEFI standard, System Management Mode (SMM) was introduced on x86 processors to handle critical hardware events. With strict access control to this operating mode, SMM applications run at a high privilege level (known as Ring -2), in which they have (almost) unlimited access to system resources. However, vendors commonly use memory-unsafe system programming languages to develop SMM applications, which makes them vulnerable to memory corruption and an appealing target for attackers. Fuzzing is an effective method for detecting memory corruption vulnerabilities across a wide range of applications. Unfortunately, existing approaches for testing SMM applications lack a UEFI runtime environment to properly support SMM application execution. Without this environment, application data is often not correctly initialized. Once such uninitialized data is accessed during fuzzing, it causes premature exits or unintentional crashes. As a result, existing methods can only explore shallow parts and often produce high false-positive rates. In this paper, we propose SmuFuzz, a fuzzing framework designed to detect vulnerabilities in closed-source SMM applications distributed by vendors. SmuFuzz overcomes prior limitations by partially rehosting SMM applications within a custom infrastructure that provides a fully featured UEFI runtime environment. This infrastructure provides the necessary dependencies and runtime for SMM application preparation, initialization, and finalization. In addition, SmuFuzz automatically infers the complex SMM application input semantics for deep exploration. In our experiment, SmuFuzz achieved 4.45x higher unique basic block coverage compared to state-of-the-art fuzzers. It also found more vulnerabilities while significantly reducing false positives. Using SmuFuzz, we identified 38 new vulnerabilities in firmware from major vendors, all of which were disclosed responsibly.

Original publication: https://www.computer.org/csdl/proceedings-article/sp/2026/606500a187/2bojvesfTCE

Toward Inclusive Security and Privacy for Deaf and Hard-of-Hearing People: A Community-Based Interview Study

Authors: Mindy Tran, Xinru Tang, Adryana Hutchinson, Adam J. Aviv, Yixin Zou

About 5% of the world’s population experience disabling hearing loss. Nevertheless, deaf and hard-of-hearing (DHH) communities remain an understudied and underserved population in security and privacy (S&P) research. We conducted 24 semi-structured interviews with DHH participants (n=17) and their supporters (n=7) in Germany to explore (1) how DHH people perceive S&P risks in assistive technologies, (2) concerns about disclosing their identity and sharing sign language content online, and (3) sources of advice and common challenges. Our findings highlight participants’ limited awareness of S&P risks in assistive hearing devices and limited interest in sign language video anonymization tools. DHH participants expressed concerns about identity disclosure (whether voluntary, involuntary, or mediated by third parties) and found existing S&P mechanisms and resources largely inaccessible. As a result, they often relied on trusted networks for support. While supporters were generally willing to help, their limited S&P knowledge, social dynamics within the DHH community, and translation challenges between spoken and sign languages hindered effective information sharing. Our research provides implications for researchers, industry practitioners, and policymakers to develop more effective and inclusive S&P tools and resources for DHH communities.

Original publication: https://www.computer.org/csdl/proceedings-article/sp/2026/606500a494/2bojvrh2JPi

Cottontail: Large Language Model-Driven Concolic Execution for Highly Structured Test Input Generation

Authors: Haoxin Tu, Seongmin Lee, Yuxian Li, Peng Chen, Lingxiao Jiang, Marcel Böhme

How can we perform concolic execution to generate highly structured test inputs for systematically testing parsing programs? Existing concolic execution engines are significantly restricted by (1) input structure-agnostic path constraint selection, leading to the waste of testing effort or missing coverage; (2) limited constraint-solving capability, yielding many syntactically invalid test inputs; (3) reliance on manual acquisition of highly structured seed inputs, resulting in non-continuous testing. This paper proposes Cottontail, a new Large Language Model (LLM)-driven concolic execution engine, to mitigate the above limitations. A more complete program path representation, named Expressive Coverage Tree (ECT), is first constructed to help select structure-aware path constraints. Later, an LLM-driven constraint solver based on a Solve-Complete paradigm is designed to solve the path constraints smartly to get test inputs that are not only satisfiable to the constraints but also valid to the input syntax. Finally, a history-guided seed acquisition is employed to obtain new highly structured test inputs either before testing starts or after testing is saturated. We implemented Cottontail on top of SymCC and evaluated eight extensively tested open-source libraries across four different formats (XML, SQL, JavaScript, and JSON). The experimental result is promising: Cottontail significantly outperforms baseline approaches by 30.73% and 41.32% on average in terms of line and branch coverage. Besides, Cottontail found six previously unknown vulnerabilities (six CVEs assigned). We have reported these issues to developers, and four out of them have been fixed so far.

Original publication: https://www.computer.org/csdl/proceedings-article/sp/2026/606500c046/2bojwAJpW0g

LeakyLinks: Measuring the Security and Privacy Risks of URL Scanning Services

Authors: Ali Mustafa, Jannis Rautenstrauch, Florian Hantke, Shubham Agarwal, Stefano Calzavara, Ben Stock

URL scanning services are widely used in security workflows to detect malicious websites and protect users from online threats. However, their common practice of publicly indexing scanned URLs may unintentionally expose sensitive user information through URL-embedded access credentials. Although isolated accounts of such privacy incidents exist, a systematic assessment of their prevalence is still lacking. We present LEAKYLINKS, an automated analysis pipeline that combines URL filtering with LLM-driven semantic classification to identify URLs exposing Sensitive Personal Information (SPI). Using LEAKYLINKS, we analyze URLs collected from public feeds of six prominent URL scanning services over a period of three weeks. With the framework, we visited 338k URLs, identifying over 4k URLs which leak SPI with a precision of 97%. To further assess the extent to which published URLs are actively accessed by third parties, we deploy honeypages and submit their links to the selected URL scanning services. Our measurements confirm that external entities access URLs submitted to these scanners, often from potentially suspicious IPs exhibiting behavior commonly associated with reconnaissance or opportunistic probing. Taken together, these findings indicate that URL scanning services represent a valuable target for web adversaries and may already be subject to active exploitation in the wild.
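
To illustrate what "URL-embedded access credentials" can look like in practice, here is a minimal Python sketch of a keyword-based pre-filter. It is not the LEAKYLINKS pipeline: the parameter names in `SENSITIVE_KEYS` and the helper functions are hypothetical, and the abstract's LLM-driven classification step is not modeled at all.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical parameter names that often carry access credentials.
# This list is an assumption for illustration, not taken from the paper.
SENSITIVE_KEYS = {
    "token", "access_token", "auth", "key", "sig",
    "signature", "session", "password", "reset_token",
}


def sensitive_params(url: str) -> list[str]:
    """Return names of query or fragment parameters that look like credentials."""
    parts = urlsplit(url)
    # Credentials may sit in the query string or, for some login flows,
    # in the fragment, so both are parsed as key=value pairs.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs += parse_qsl(parts.fragment, keep_blank_values=True)
    return [name for name, value in pairs
            if name.lower() in SENSITIVE_KEYS and value]


def prefilter(urls: list[str]) -> list[str]:
    """Keep only URLs with at least one credential-like parameter."""
    return [u for u in urls if sensitive_params(u)]


if __name__ == "__main__":
    feed = [
        "https://example.com/reset?token=abc123&lang=en",
        "https://example.com/index.html?lang=en",
        "https://example.com/cb#access_token=xyz",
    ]
    for url in prefilter(feed):
        print(url, sensitive_params(url))
```

A name-based filter like this misses credentials embedded in path segments and flags harmless parameters that happen to share a name, which is one plausible reason to follow it with a semantic classification step.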

Original publication: https://www.computer.org/csdl/proceedings-article/sp/2026/606500c347/2geEW3Coo7K

Jazzer: Coverage-Guided Fuzzing for Semantic Vulnerabilities in the Java Ecosystem

Authors: Sergej Dechand, Tobias Wienand, Fabian Meumertzheim, Peter Samarin, Simon Resch, Khaled Yakdan, Thorsten Holz, Flavio Toffalini

Fuzz testing has proven highly effective in uncovering software faults in low-level languages such as C and C++. Yet, memory-safe ecosystems like the Java Virtual Machine (JVM), which powers the majority of enterprise applications, have received limited attention from fuzzing research. Recent high-impact vulnerabilities such as Log4Shell and Spring4Shell highlight that memory-safe languages remain susceptible to severe security risks, including logic errors, injection vulnerabilities, and unsafe deserialization. Such vulnerability classes typically lie beyond the detection capabilities of traditional fuzzing frameworks, which are primarily designed to detect memory safety violations. In this paper, we address this gap with Jazzer, a fuzzing framework specifically designed for JVM applications. Jazzer adapts proven fuzzing techniques to the JVM via bytecode instrumentation, translating Java's high-level constructs into low-level coverage and trace feedback. To detect vulnerabilities beyond memory corruption, it complements C/C++ sanitizers with guiding oracles that hook into JVM APIs and provide guidance within sinks to uncover Java-specific vulnerabilities. Our comprehensive evaluation against JQF, the state-of-the-art Java fuzzer, shows that Jazzer achieves higher coverage and faster execution speed across eleven diverse libraries, while discovering 18 bugs missed by prior work. Finally, we demonstrate real-world impact through large-scale deployment in OSS-Fuzz, where Jazzer has continuously fuzzed 205 open-source Java projects over a period of three years. This field study resulted in the discovery of 1217 confirmed and fixed security issues.

Original publication: https://www.computer.org/csdl/proceedings-article/sp/2026/606500c407/2geEW7Xu7qU

Human-Centered Threat Modeling in Practice: Lessons, Challenges, and Paths Forward

Authors: Warda Usman, Yixin Zou, Daniel Zappala

Human-centered threat modeling (HCTM) is an emerging area within security and privacy research that focuses on how people define and navigate threats in various social, cultural, and technological contexts. While researchers increasingly approach threat modeling from a human-centered perspective, little is known about how they prepare for and engage with HCTM in practice. In this work, we conduct 23 semi-structured interviews with researchers to examine the state of HCTM, including how researchers design studies, elicit threats, and navigate values, constraints, and long-term goals. We find that HCTM is not a prescriptive process but a set of evolving practices shaped by relationships with participants, disciplinary backgrounds, and institutional structures. Researchers approach threat modeling through sustained groundwork and participant-centered inquiry, guided by values such as care, justice, and autonomy. They also face challenges including emotional strain, ethical dilemmas, and structural barriers that complicate efforts to translate findings into real-world impact. We conclude by identifying opportunities to advance HCTM through shared infrastructure, broader recognition of diverse contributions, and stronger mechanisms for translating findings into policy, design, and societal change.

Original publication: https://arxiv.org/abs/2511.13781

The Battle of Metasurfaces: Understanding Security in Smart Radio Environments

Authors: Paul Staat, Christof Paar, Swarun Kumar

Metasurfaces, or Reconfigurable Intelligent Surfaces (RISs), have emerged as a transformative technology for next-generation wireless systems, enabling digitally controlled manipulation of electromagnetic wave propagation. By turning the traditionally passive radio environment into a smart, programmable medium, metasurfaces promise advances in communication and sensing. However, metasurfaces also present a new security frontier: both attackers and defenders can exploit them to alter wireless propagation for their own advantage. While prior security research has primarily explored unilateral metasurface applications (empowering either attackers or defenders), this work investigates symmetric scenarios, where both sides possess comparable metasurface capabilities. Using both theoretical modeling and real-world experiments, we analyze how competing metasurfaces interact for diverse objectives, including signal power and sensing perception. Thereby, we present the first systematic study of context-agnostic metasurface-to-metasurface interactions and their implications for wireless security. Our results reveal that the outcome of metasurface "battles" depends on an interplay of timing, placement, algorithmic strategy, and hardware scale. Across multiple case studies in Wi-Fi environments, including wireless jamming, channel obfuscation for sensing and communication, and sensing spoofing, we demonstrate that opposing metasurfaces can substantially or fully negate each other's effects. By undermining previously proposed security and privacy schemes, our findings open new opportunities for designing resilient and high-assurance physical-layer systems in smart radio environments.

Original publication: https://arxiv.org/abs/2511.13939

Heap Localization: Cache Side-Channel based Linux Kernel Heap Exploit Techniques

Authors: Yoochan Lee, Sihyun Roh, Hyuk Kwon, Byoungyoung Lee, Thorsten Holz

As kernel mitigations that reduce exploit success rates continue to be deployed, exploitation techniques have become increasingly sophisticated to maintain high reliability under such constrained environments. These techniques, however, fundamentally rely on precise knowledge of the locations of kernel heap objects—information that is not available to unprivileged users and forces attackers to depend on coarse and speculative inferences about allocator behavior. As a result, existing exploit techniques inevitably exhibit structural failure cases when vulnerable or target objects occupy unexpected intra-page positions. To address this limitation, we present Heap Localization, the first primitive that enables object-level heap layout inference in the Linux kernel. Heap Localization recovers the precise intra-page offset of kernel heap objects by exploiting deterministic VIPT L1 cache behavior, enabling deterministic object placement without requiring memory disclosure. By providing exact object-location information, Heap Localization eliminates layout-induced failure cases and transforms several previously probabilistic heap exploitation techniques into deterministic ones. Our evaluation demonstrates that Heap Localization consistently localizes and reliably positions objects, achieving average success rates of 99.3% in the idle state and 95.7% under heavy load. We further demonstrate its practicality by applying Heap Localization to real-world kernel vulnerabilities, where it significantly increases exploit reliability.
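
The link between VIPT L1 cache behavior and intra-page offsets comes down to address arithmetic, sketched below in Python. The sketch assumes 4 KiB pages and an L1 data cache with 64-byte lines and 64 sets (a configuration common on x86 cores, but an assumption here). The function names are hypothetical, and the code shows only the indexing relationship, not the measurement technique or the authors' implementation.

```python
# Assumed parameters (not taken from the paper): 4 KiB pages,
# 64-byte cache lines, 64 L1 sets (e.g. a 32 KiB, 8-way cache).
PAGE_SIZE = 4096
LINE_SIZE = 64
NUM_SETS = 64


def l1_set_index(addr: int) -> int:
    """L1 set selected by an address: address bits 6..11 under the assumptions."""
    return (addr // LINE_SIZE) % NUM_SETS


def offsets_for_set(set_index: int, object_size: int) -> list[int]:
    """Intra-page slot offsets whose first cache line maps to set_index.

    Slots are assumed to be packed back to back from offset 0,
    which is a simplification of real slab layouts.
    """
    last_start = PAGE_SIZE - object_size
    return [off for off in range(0, last_start + 1, object_size)
            if l1_set_index(off) == set_index]


if __name__ == "__main__":
    # LINE_SIZE * NUM_SETS equals PAGE_SIZE, so the set index depends only
    # on the page offset and is identical for every page.
    addr = 0xFFFF888012345AC0
    print(l1_set_index(addr))        # set hit by this address
    print(offsets_for_set(44, 256))  # candidate slots for 256-byte objects
```

Because the index bits lie entirely inside the page offset under these assumptions, learning which L1 set an object touches narrows its position within the page to cache-line granularity.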

Original publication: https://www.computer.org/csdl/proceedings-article/sp/2026/606500d092/2geEWSgn6Zq

Hardware Trojans from Invisible Inversions: On the Trojanizability of Standard Cell Libraries

Authors: Kolja Dorschel, René Walendy, Lukas Plätz, Thorben Moos, Christof Paar, Steffen Becker

At S&P 2023, Puschner et al. made a valuable dataset for hardware Trojan detection research publicly available. It contains a complete set of Scanning Electron Microscope (SEM) images of four different digital Integrated Circuits (ICs) fabricated at progressively smaller semiconductor technology nodes. Puschner et al. reported preliminary evidence that feature sizes affect Trojan detection performance, but they were unable to disentangle effects caused by insertion strategies or by degrading image quality from those intrinsic to the underlying standard cell libraries. Distinguishing those causes, however, is crucial to understand whether improved tooling (e.g., higher resolution imaging equipment) can remove the observed technology bias, or whether susceptibility to stealthy hardware Trojans is indeed an inherent property of a cell library. In this work, we dive deep into the S&P 2023 dataset to answer these questions. We devise alternative metrics to those of Puschner et al., in order to assess and compare the potential susceptibility of standard cell libraries more meaningfully. We find clear differences between the evaluated process nodes. However, in all cases we identify cells that implement distinct logic functions yet are visually indistinguishable in backside SEM images. We exploit this property to construct stealthy, standard-cell-based hardware Trojans and present a concrete case study: a privilege-escalation backdoor in an Ibex RISC-V core. Our results demonstrate that cell libraries can – and should – be evaluated for their potential “Trojanizability”, and we recommend practical defenses.

Original publication: https://www.computer.org/csdl/proceedings-article/sp/2026/606500d378/2geEX7igd1e

Evaluating Concept Filtering Defenses against Child Sexual Abuse Material Generation by Text-to-Image Models

Authors: Ana-Maria Cretu, Klim Kireev, Amro Abdalla, Wisdom Obinna, Raphael Meier, Sarah Adel Bargal, Elissa M. Redmiles, Carmela Troncoso

We evaluate the effectiveness of filtering child images from training datasets of text-to-image models to prevent model misuse to create child sexual abuse material (CSAM). First, we capture the complexity of preventing CSAM generation using a game-based security definition. Second, we show that current detection methods cannot remove all children from a dataset. Third, using an ethical proxy for CSAM (a child wearing glasses), we show that even when only a small percentage of child images are left in the training dataset after filtering, there exist prompting strategies that generate a child wearing glasses using only a few more queries than when the model is trained on the unfiltered data. Fine-tuning the filtered model on child images further reduces the additional query overhead. We also show that re-introducing a concept is possible via fine-tuning even if filtering is perfect. Our results show that current child filtering methods offer limited protection to closed-weight models and no protection to open-weight models, while reducing the generality of the model by hindering the generation of child-related concepts or changing their representation. We conclude by outlining challenges in conducting evaluations that establish robust evidence on the impact of concept filtering defenses for CSAM.

Original publication: https://arxiv.org/abs/2512.05707

LISA: A Scale-Optimized and Psychometrically-Validated Instrument for the Lightweight Assessment of Organizational Information Security Awareness in Heterogeneous Organizations

Authors: David Langer, Jan Tolsdorf, Luigi Lo Iacono

Human factors are central to an organization’s information security. Information Security Awareness (ISA) is a key construct in behavioral and organizational models explaining employees’ security compliance. However, existing ISA measures often lack theoretical grounding, psychometric rigor, and organizational relevance, or are too lengthy and complex for practical application. These shortcomings hinder empirical testing of behavioral models and the integration of ISA as a variable in organizational research. This paper introduces the Lightweight Information Security Awareness (LISA) scale – the first theory-based, psychometrically validated, and cross-language scale for efficiently assessing ISA in heterogeneous organizational contexts, balancing measurement precision with practical feasibility. Validation involved 1,182 participants from survey panels and 579 employees of a large German university hospital, representing a heterogeneous workforce. LISA demonstrates high internal consistency, measurement invariance across English and German, and strong construct and ecological validity. By correlating LISA with 11 enablers and barriers of organizational information security and differentiating it by a heterogeneous workforce in a hospital context, we demonstrate its ability to support both scientific investigations and practical assessments. LISA provides a quick, reliable, valid, and practical solution for measuring organizational ISA, ultimately offering researchers and practitioners without psychometric expertise a validated tool that is applicable in both behavioral models and everyday organizational environments.

Original publication: https://www.computer.org/csdl/proceedings-article/sp/2026/606500c869/2geEWCuuZTa

Language-Agnostic Detection of Computation-Constraint Inconsistencies in ZKP Programs via Value Inference

Authors: Arman Kolozyan, Bram Vandenbogaerde, Janwillem Swalens, Lode Hoste, Stefanos Chaliasos, Coen De Roover

Zero-knowledge proofs (ZKPs) allow a prover to convince a verifier of a statement's truth without revealing any other information. In recent years, ZKPs have matured into a practical technology underpinning major applications. However, implementing ZKP programs remains challenging, as they operate over arithmetic circuits that encode the logic of both the prover and the verifier. Therefore, developers must not only express the computations for generating proofs, but also explicitly specify the constraints for verification. As recent studies have shown, this decoupling may lead to critical ZKP-specific vulnerabilities. Unfortunately, existing tools for detecting them are limited, as they: (1) are tightly coupled to specific ZKP languages, (2) are confined to the constraint level, preventing reasoning about the underlying computations, (3) target only a narrow class of bugs, and (4) suffer from scalability bottlenecks due to reliance on SMT solvers. To address these limitations, we propose a language-agnostic formal model, called the Domain Consistency Model (DCM), which captures the relationship between computations and constraints. Using this model, we provide a taxonomy of vulnerabilities based on computation-constraint mismatches, including novel subclasses overlooked by existing models. Next, we implement an IR-based bug detection tool, called CCC-Check, which is based on abstract interpretation. Our evaluation shows that CCC-Check is, on average, two orders of magnitude faster than the SoTA verification tool CIVER, while achieving comparable precision. Finally, using the DCM, we examine six widely adopted ZKP projects and uncover 15 previously unknown vulnerabilities. We reported these bugs to the projects' maintainers, 13 of which have since been patched. Of these 15 vulnerabilities, 12 could not be captured by existing models.

Original publication: https://www.computer.org/csdl/proceedings-article/sp/2026/606500d224/2geEWZFmujK