Lucas Waye

Publications Distribution Security Privacy

The majority of my time as a computer scientist has been spent on communicating my ideas (in emails, meetings, presentations, design reviews, etc.) rather than actually developing them. Here you can find my published write-ups of some of what I have worked on in the past.

Click on one of the buttons above to see my publications on that topic.

Policy Zones: How Meta enforces purpose limitation at scale in batch processing systems , Blog Post (with Wenlong Dong, Boris Shulman, Yizhou Yan, and Yuxuan Zhang)
Meta's use of Information-Flow Control for batch processing systems

abstract: In this post, we walk through how Policy Zones supports data management in batch processing systems—including data warehouses, streaming, and AI systems—using an example we’re all familiar with: messaging via Facebook and the Family of Apps.
It’s easy to overlook just how vast Meta’s backend tech stack is. Orchestrating all the data that runs through our enormous batch processing systems is no trivial feat. Policy Zones integrates with Meta’s exabyte-scale data warehouse and processing systems using runtime enforcement and SQL parsing to propagate and support data annotations across millions of daily data flows, performing trillions of checks per hour, and through our stream processing systems which transport multiple petabytes per hour.
Lucas Waye, Wenlong Dong, Boris Shulman, Yizhou Yan, and Yuxuan Zhang
Blog post. July 2025. [Blog Post]
How Meta enforces purpose limitation via Privacy Aware Infrastructure at scale , Blog Post (with Wenlong Dong, Rituraj Kirti, Diana Marsala, Kendall Hopkins, Hannes Roth, Jonathan Bergeron, Alex Ponomarenko, and Avtar Brar)
Meta's use of Information-Flow Control

abstract: At Meta, we’ve been diligently working to incorporate privacy into different systems of our software stack over the past few years. Today, we’re excited to share some cutting-edge technologies that are part of our Privacy Aware Infrastructure (PAI) initiative. These innovations mark a major milestone in our ongoing commitment to honoring user privacy.
PAI offers efficient and reliable first-class privacy constructs embedded in Meta infrastructure to address complex privacy issues. For example, we built Policy Zones that apply across our infrastructure to address restrictions on data, such as using it only for allowed purposes, providing strong guarantees for limiting the purposes of its processing.
As we expanded PAI across Meta, increasing its maturity, we gained valuable insights. Our understanding of the technology evolved, revealing the need for a larger investment than initially planned to create a cohesive ecosystem of libraries, tool suites, integrations, and more. These investments have been crucial in enforcing complex purpose limitation scenarios while ensuring scalability, reliability, and a streamlined developer experience.
Wenlong Dong, Rituraj Kirti, Diana Marsala, Kendall Hopkins, Lucas Waye, Hannes Roth, Jonathan Bergeron, Alex Ponomarenko, and Avtar Brar.
Blog post. August 2024. [Blog Post]
Systems and methods for displaying supplemental content for print media using augmented reality , US Patent (with Theresa Tokesky, Michael A. Montalto, and Kanagasabai Sivanadian)
Integrate print media into augmented reality

abstract: Systems and methods are provided for displaying supplemental content for print media using augmented reality. A user profile for a user of an augmented reality device is determined. Content of the print media is searched to identify a first portion of the print media that matches the user profile and a second portion of the print media that does not match the user profile. Supplemental content is obtained based on content of the first portion of the print media. A display of the supplemental content is positioned over the second portion of the print media.
Lucas Waye, Theresa Tokesky, Michael A. Montalto, and Kanagasabai Sivanadian
US Patent 10,580,215. March 2020.
[Google Patent]
Systems and methods for alerting a user to missed content in previously accessed media , US Patent (with Clayton Kim, Richard Eric Miller, and Matthew John Emerson)
Intelligent alerts about missed content while watching TV

abstract: Methods and systems are described herein for notifying a user about content that the user has previously missed in media. Users may get easily distracted when viewing media by mobile device notifications or other users viewing the media with the user. Consequentially, the user may miss content in media. The user may miss the same content in the media when consuming the media for a second time if the distractions are still present. An interactive media guide determines when the user has previously missed content in media and alerts the user when the content is going to be missed for a second time.
Clayton Kim, Lucas Waye, Richard Eric Miller, and Matthew John Emerson
US Patent 10,341,742. July 2019.
[Google Patent]
MAC A Verified Static Information-Flow Control Library , JLAMP 2018 (with Marco Vassena, Alejandro Russo, and Pablo Buiras)
A fully verified core model of the dynamic information-flow control library LIO

abstract: The programming language Haskell plays a unique, privileged role in information-flow control (IFC) research: it is able to enforce information security via libraries. Many state-of-the-art IFC libraries (e.g., LIO and HLIO) support a variety of advanced features like mutable data structures, exceptions, and concurrency, whose subtle interaction makes verification of security guarantees challenging. In this work, we focus on MAC, a statically-enforced IFC library for Haskell. In MAC, like other IFC libraries, computations have a well-established algebraic structure for computations (i.e., monads) responsable to manipulate labeled values—values coming from an abstract data type which associates a sensitivity label to a piece of information. In this work, we enrich labeled values with a functor structure and provide an applicative functor operator which encourages a more functional programming style and simplifies code. Furthermore, we present a full-fledged, mechanically-verified model of MAC. Specifically, we show progress-insensitive noninterference for our sequential calculus and pinpoint sufficient requirements on the scheduler to prove progress-sensitive noninterference for our concurrent calculus. For that, we study the security guarantees of MAC using term erasure, a proof technique that ensures that the same public output should be produced if secrets are erased before or after program execution. As another contribution, we extend term erasure with two-steps erasure, a flexible novel technique that greatly simplifies the noninterference proof and helps to prove many advanced features of MAC.
Marco Vassena, Alejandro Russo, Pablo Buiras, and Lucas Waye
In Journal of Logical and Algebraic Methods in Programming (JLAMP), 2018.
[Paper]
Cryptographically secure information flow control on key-value stores , CCS 2017 (with Pablo Buiras, Owen Arden, Alejandro Russo, and Stephen Chong)
An extension to LIO that automatically and correctly uses cryptography when interacting with an untrusted store

abstract: We present Clio, an information flow control (IFC) system that transparently incorporates cryptography to enforce confidentiality and integrity policies on untrusted storage. Clio insulates developers from explicitly manipulating keys and cryptographic primitives by leveraging the policy language of the IFC system to automatically use the appropriate keys and correct cryptographic operations. We prove that Clio is secure with a novel proof technique composing cryptographic proof techniques with standard programming language techniques. We present a prototype Clio implementation along with a case study that demonstrates Clio's practicality.
Lucas Waye, Pablo Buiras, Owen Arden, Alejandro Russo and Stephen Chong
In Proceedings of the 24th ACM Conference on Computer and Communications Security (CCS), October 2017.
[Paper | Presentation Video | arXiv (with proofs) | Artifact]
Composable enhancements for gradual assurances , Ph.D. dissertation, Harvard University, June 2017.
A presentation of Whip, Clio, and restricted privileges

abstract: This dissertation presents three enhancements to software components that increase the trustworthiness and usability of the systems they comprise. The enhancements are composable: they are local to the components they enhance and are self-contained. Since they are self-contained and local, composable enhancements can be partially deployed in a system without adversely affecting system behavior. The assurances provided by the enhancements are gradual: as the number of enhanced components in the system increases, the overall assurances provided by the entire system also increases. This dissertation investigates these enhancements in three different settings. In each setting, it identifies an inherent security or functional correctness problem with the components in that setting and provides a composable enhancement that remedies the problem gradually.
In the first setting, this dissertation identifies that although modern services expose interfaces that are higher-order in spirit, the simplicity of the network protocols used forces services to rely on brittle encodings. To bridge the semantic gap, this dissertation presents Whip, a higher-order contract system that allows programmers to detect when services do not live up to their advertised higher-order interfaces. As more services use Whip, it becomes easier to identify errant services in the system.
In the second setting, this dissertation identifies that in the Disjunction Category label model, capability-like privileges are used to downgrade information, which when used inappropriately can compromise security. To ensure privileges are used as intended, this dissertation presents restricted privileges, an enhancement to privileges to control downgrading based on a specification of security conditions for when they can be legitimately used. As more privileges are restricted, information in the system becomes more protected from the accidental or malicious exercise of privileges to downgrade more information than intended.
In the third setting, this dissertation identifies that information-flow control (IFC) programs often need to interact with a key-value store that can also be accessed by non-IFC programs. These non-IFC programs may inadvertently (or maliciously) fail to respect the policies enforced by the IFC programs. In order to ensure the information protected by the IFC programs is not exfiltrated or corrupted by non-IFC programs through the key-value store, this dissertation presents Clio, an extension to a popular IFC language that transparently incorporates cryptography for data on the untrustworthy key-value store. As more IFC programs use Clio, the more information is protected from non-IFC programs in the system.
Lucas Waye. Harvard University, Ph.D. dissertation, June 2017.
[Dissertation]
Whip: Higher-order contracts for modern services , ICFP 2017 (with Christos Dimoulas and Stephen Chong)
A run-time software contract monitor for microservices

abstract: Modern service-oriented applications forgo semantically rich protocols and middleware when composing services. Instead, they embrace the loosely-coupled development and deployment of services that communicate via simple network protocols. Even though these applications do expose interfaces that are higher-order in spirit, the simplicity of the network protocols forces them to rely on brittle low-level encodings. To bridge the apparent semantic gap, programmers introduce ad-hoc and error-prone defensive code. Inspired by Design by Contract, we choose a different route to bridge this gap. We introduce Whip, a contract system for modern services. Whip (i) provides programmers with a higher-order contract language tailored to the needs of modern services; and (ii) monitors services at run time to detect services that do not live up to their advertised interfaces. Contract monitoring is local to a service. Services are treated as black boxes, allowing heterogeneous implementation languages without modification to services' code. Thus, Whip does not disturb the loosely coupled nature of modern services.
Lucas Waye, Christos Dimouas, and Stephen Chong
In Proceedings of the 22nd ACM SIGPLAN International Conference on Functional Programming (ICFP), September 2017.
[Paper | Presentation Video | Artifact | Appendices]
Flexible manipulation of labeled values for information-flow control libraries , ESORICS 2016 (with Marco Vassena, Pablo Buiras, and Alejandro Russo)
Allow secure computation on labeled values in IFC systems without unlabeling by using the functor structure and applicative operator

abstract: The programming language Haskell plays a unique, privileged role in Information-Flow Control (IFC) research: it is able to enforce information security via libraries. Many state-of-the- art libraries (e.g., LIO, HLIO, and MAC) allow computations to manipulate data with different security labels by introducing the notion of labeled values, which protect values with explicit labels by means of an abstract data type. While computations have an underlying algebraic structure in such libraries (i.e. monads), there is no research on structures for labeled values and their impact on the programming model. In this paper, we add the functor structure to labeled values, which allows programmers to conveniently and securely perform computations without side-effects on such values, and an applicative operator, which extends this feature to work on multiple labeled values combined by a multi- parameter function. This functionality simplifies code, as it does not force programmers to spawn threads to manipulate sensitive data with side-effect free operations. Additionally, we present a relabel primitive which securely modifies the label of labeled values. This operation also helps to simplify code when aggregating data with heterogeneous labels, as it does not require spawning threads to do so. We provide mechanized proofs of the soundness our contributions for the security library MAC, although we remark that our ideas apply to LIO and HLIO as well.
Marco Vassena, Pablo Buiras, Lucas Waye, and Alejandro Russo
In Proceedings of the 21st European Symposium on Research in Computer Security (ESORICS), September 2016.
[Paper]
It's my privilege: controlling downgrading in DC-labels , STM 2015 (with Pablo Buiras, Dan King, Stephen Chong, and Alejandro Russo)
The use of privileges in IFC systems can be limited to prevent their misuse by restricting where in the security lattice they may be used and who may influence their use.

abstract: Disjunction Category Labels (DC-labels) are an expressive label format used to classify the sensitivity of data in information-flow control systems. DC-labels use capability-like privileges to downgrade information. Inappropriate use of privileges can compromise security, but DC-labels provide no mechanism to ensure appropriate use. We extend DC-labels with the novel notions of bounded privileges and robust privileges. Bounded privileges specify and enforce upper and lower bounds on the labels of data that may be downgraded. Bounded privileges are simple and intuitive, yet can express a rich set of desirable security policies. Robust privileges can be used only in downgrading operations that are robust, i.e., the code exercising privileges cannot be abused to release or certify more information than intended. Surprisingly, robust downgrades can be expressed in DC-labels as downgrading operations using a weakened privilege. We provide sound and complete run-time security checks to ensure downgrading operations are robust. We illustrate the applicability of bounded and robust privileges in a case study as well as by identifying a vulnerability in an existing DC-label-based application.
Lucas Waye, Pablo Buiras, Dan King, Stephen Chong, and Alejandro Russo
In Proceedings of the 11th International Workshop on Security and Trust Management (STM), September 2015.
[Paper | Slides]
Exploring and enforcing security guarantees via program dependence graphs , PLDI 2015 (with Andrew Johnson, Scott Moore, and Stephen Chong)
A program analysis and understanding tool that enables the specification and enforcement of precise application-specific information security guarantees

abstract: We present Pidgin, a program analysis and understanding tool that enables the specification and enforcement of precise application-specific information security guarantees. Pidgin also allows developers to interactively explore the information flows in their applications to develop policies and investigate counter-examples.
Pidgin combines program-dependence graphs (PDGs), which precisely capture the information flows in a whole application, with a custom PDG query language. Queries express properties about the paths in the PDG; because paths in the PDG correspond to information flows in the application, queries can be used to specify global security policies.
Pidgin is scalable. Generating a PDG for a 335k line Java application takes under 90 seconds, and checking each of our policies on that PDG takes under 14 seconds. The query language is expressive, supporting a large class of precise, application-specific security guarantees. Policies are separate from the code and do not interfere with testing or development, and can be used for security regression testing.
We describe the design and implementation of Pidgin and report on using it: (1) to explore information security guarantees in legacy programs; (2) to develop and modify security policies concurrently with application development; and (3) to develop policies based on known vulnerabilities.
Andrew Johnson, Lucas Waye, Scott Moore, and Stephen Chong
In Proceedings of the 36th ACM SIGPLAN Conference on Programming Langauge Design and Implementation (PLDI), June 2015.
[Paper | Website]
Privacy integrated data stream queries , PSP 2014
A practical framework for which a non-expert can perform differentially private operations on data streams

abstract: Research on differential privacy is generally concerned with examining data sets that are static. Because the data sets do not change, every computation on them produces "one-shot" query results; the results do not change aside from randomness introduced for privacy. There are many circumstances, however, where this model does not apply, or is simply infeasible. Data streams are examples of non-static data sets where results may change as more data is streamed. Theoretical support for differential privacy with data streams has been researched in the form of differentially private streaming algorithms. In this paper, we present a practical framework for which a non-expert can perform differentially private operations on data streams. The system is built as an extension to PINQ (Privacy Integrated Queries), a differentially private programming framework for static data sets. The streaming extension provides a programmatic interface for the different types of streaming differential privacy from the literature so that the privacy trade-offs of each type of algorithm can be understood by a non-expert programmer
Lucas Waye
In Proceedings of the 1st International Workshop on Privacy and Security in Programming (PSP), October 2014.
[Paper | Slides | GitHub]
Framework for generating programs to process beacons , US Patent (with Kevin Seng, Viral Bajaria, and Shane Moriah)
A translator from high-level, declarative beacon specifications to programs that process and validate those beacons

abstract: A method receives a specification for processing beacons. The beacons are associated with an event occurring at a client while a user is interacting with a web application and include unstructured data. The method parses the specification to determine an object model including objects determined from the specification where different specifications are parsed into a format of the object model. A generator is determined and each generator is configured to process the format of the object model to generate a different type of target program to process the beacons and multiple generators can process different specifications that are parsed into the format of the object model. The method runs the generator with the object model to generate a target program configured to identify the beacons for the specification, determine unstructured data in the beacons that were specified in the specification, and transform the unstructured data into structured data.
Lucas Waye, Kevin Seng, Viral Bajaria, and Shane Moriah
US Patent 8,725,750. May 2014.
[Google Patent | Slides]
Fabric: A platform for secure distributed computation and storage , SOSP 2009 (with Jed Liu, Michael D. George, K. Vikram, Xin Qi, and Andrew C. Myers)
A high-level programming language for building open distributed applications with strong security

abstract: Fabric is a new system and language for building secure distributed information systems. It is a decentralized system that allows heterogeneous network nodes to securely share both information and computation resources despite mutual distrust. Its high-level programming language makes distribution and persistence largely transparent to programmers. Fabric supports data-shipping and function-shipping styles of computation: both computation and information can move between nodes to meet security requirements or to improve performance. Fabric provides a rich, Java-like object model, but data resources are labeled with confidentiality and integrity policies that are enforced through a combination of compile- time and run-time mechanisms. Optimistic, nested transactions ensure consistency across all objects and nodes. A peer-to-peer dissemination layer helps to increase availability and to balance load. Results from applications built using Fabric suggest that Fabric has a clean, concise programming model, offers good performance, and enforces security.
Jed Liu, Michael D. George, K. Vikram, Xin Qi, Lucas Waye, and Andrew C. Myers
In Proceedings of the 22nd ACM Symposium on Operating Systems Principles (SOSP), October 2009.
[Paper | Website]

Lucas Waye

About Me

Publications Distribution Security Privacy

National Parks Checklist

Work Experience