OSDI 2021 Accepted Papers
Paper Submission Information: All submissions must be received by 11:59 PM AoE (UTC-12) on the day of the corresponding deadline. Although SSDs can be simplified under the current ZNS interface, its counterpart LFS must bear segment compaction overhead. For realistic workloads, KEVIN improves throughput by 68% on average. Our evaluation shows that, compared to existing participant selection mechanisms, Oort improves time-to-accuracy performance by 1.2X-14.1X and final model accuracy by 1.3%-9.8%, while efficiently enforcing developer-specified model testing criteria at the scale of millions of clients. HotNets provides a venue for discussing innovative ideas and for debating future research agendas in networking. This distinction forces a re-design of the scheduler. Existing algorithms are designed to work well for certain workloads. High-performance tensor programs are critical for efficiently deploying deep neural network (DNN) models in real-world tasks. PET then automatically corrects results to restore full equivalence. We also verified a simple NFS server using GoJournal's specs, which confirms that they are helpful for application verification: a significant part of the proof doesn't have to consider concurrency and crashes. Nico Lehmann and Rose Kunkel, UC San Diego; Jordan Brown, Independent; Jean Yang, Akita Software; Niki Vazou, IMDEA Software Institute; Nadia Polikarpova, Deian Stefan, and Ranjit Jhala, UC San Diego. The conference papers and full proceedings are available to registered attendees now and will be available to everyone beginning Wednesday, July 14, 2021. The full program will be available in May 2021. OSDI '21 Technical Sessions: All the times listed below are in Pacific Daylight Time (PDT).
In particular, I'll argue for re-engaging with what computer hardware really is today and give two suggestions (among many) about how the OS research community can usefully do this, and exploit what is actually a tremendous opportunity. Federated Learning (FL) is an emerging direction in distributed machine learning (ML) that enables in-situ model training and testing on edge data. To achieve low overhead, selective profiling gathers runtime execution information selectively and incrementally. Our further evaluation on 38 CVEs from 10 commonly-used programs shows that SanRazor-reduced checks suffice to detect at least 33 out of the 38 CVEs. We propose a learning-based framework that instead explicitly optimizes concurrency control via offline training to maximize performance. Sam Kumar, David E. Culler, and Raluca Ada Popa, University of California, Berkeley. Thanks to selective profiling, DMon's profiling overhead is 1.36% on average, making it feasible for production use. For instance, the following are not sufficient grounds to specify a conflict with a PC member: they have reviewed the work before, they are employed by your competitor, they are your personal friend, they were your post-doc advisor or advisee, or they had the same advisor as you. Each new model trained with DP increases the bound on data leakage and can be seen as consuming part of a global privacy budget that should not be exceeded. Writing a correct operating system kernel is notoriously hard. Main conference program: 5-8 April 2022.
In contrast, CLP achieves a significantly higher compression ratio than all commonly used compressors, yet delivers fast search performance that is comparable to or even better than Elasticsearch and Splunk Enterprise. Sanitizers detect unsafe actions such as invalid memory accesses by inserting checks that are validated during a program's execution. Camera-ready submission (all accepted papers): 2 April 2021; Main conference program: 27-28 April 2021. Leveraging this information, Pollux dynamically (re-)assigns resources to improve cluster-wide goodput, while respecting fairness and continually optimizing each DL job to better utilize those resources. PLDI is a premier forum for programming language research, broadly construed, including design, implementation, theory, applications, and performance. To remedy this, we introduce DeSearch, the first decentralized search engine that guarantees the integrity and privacy of search results for decentralized services and blockchain apps. The key to our solution, Horcrux, is to account for the non-determinism intrinsic to web page loads and the constraints placed by the browser's API for parallelism. AI enables principled representation of knowledge, complex strategy optimization, learning from data, and support to human decision making. The biennial ACM Symposium on Operating Systems Principles is the world's premier forum for researchers, developers, programmers, and teachers of computer systems technology. Our evaluation shows that NrOS scales to 96 cores with performance that nearly always dominates Linux at scale, in some cases by orders of magnitude, while retaining much of the simplicity of a sequential kernel. It then feeds those invariants and the desired safety properties to an SMT solver to check if the conjunction of the invariants and the safety properties is inductive.
NrOS is primarily constructed as a simple, sequential kernel with no concurrency, making it easier to develop and reason about its correctness. We built a functional NFSv3 server, called GoNFS, to use GoJournal. Such centralized engines are in a perfect position to censor content and violate users' privacy, undermining some of the key tenets behind decentralization. In addition, CLP outperforms Elasticsearch and Splunk Enterprise's log ingestion performance by over 13x, and we show CLP scales to petabytes of logs. CLP's gains come from using a tuned, domain-specific compression and search algorithm that exploits the significant amount of repetition in text logs. We present DPF (Dominant Private Block Fairness), a variant of the popular Dominant Resource Fairness (DRF) algorithm that is geared toward the non-replenishable privacy resource but enjoys similar theoretical properties as DRF. This formulation of memory management, which we call memory programming, is a generalization of paging that allows MAGE to provide a highly efficient virtual memory abstraction for SC. Authors are also encouraged to contact the program co-chairs, osdi21chairs@usenix.org, if needed to relate their OSDI submissions to relevant submissions of their own that are simultaneously under review or awaiting publication at other venues. She also has made contributions in network security, including scalable data expiration, distributed algorithms despite malicious participants, and DDoS prevention techniques. Pollux is implemented and publicly available as part of an open-source project at https://github.com/petuum/adaptdl.
Session Chairs: Moshe Gabel, University of Toronto, and Joseph Gonzalez, University of California, Berkeley. John Thorpe, Yifan Qiao, Jonathan Eyolfson, and Shen Teng, UCLA; Guanzhou Hu, UCLA and University of Wisconsin, Madison; Zhihao Jia, CMU; Jinliang Wei, Google Brain; Keval Vora, Simon Fraser; Ravi Netravali, Princeton University; Miryung Kim and Guoqing Harry Xu, UCLA. Youngseok Yang, Seoul National University; Taesoo Kim, Georgia Institute of Technology; Byung-Gon Chun, Seoul National University and FriendliAI. In this talk, I'll speculate on how we came to this unfortunate state of affairs, and what might be done to fix it. These are hard deadlines, and no extensions will be given. Existing systems that hide voice call metadata either require trusted intermediaries in the network or scale to only tens of users. Distributed Trust: Is Blockchain the answer? In addition, increasing CPU core counts further complicate kernel development. Devices employ adaptive interrupt coalescing heuristics that try to balance between these opposing goals. Today, privacy controls are enforced by data curators with full access to data in the clear. Paper abstracts and proceedings front matter are available to everyone now. Reviews will be available for response on Wednesday, March 3, 2021. We present selective profiling, a technique that locates data locality problems with overhead low enough to be suitable for production use. This yielded 6% fewer TLB miss stalls and a 26% reduction in memory wasted due to fragmentation. The blk-switch evaluation over a variety of scenarios shows that it consistently achieves μs-scale average and tail latency (at both 99th and 99.9th percentiles), while allowing applications to near-perfectly utilize the hardware capacity. Uniquely, Dorylus can take advantage of serverless computing to increase scalability at a low cost.
We prove that DistAI is guaranteed to find the ∃-free inductive invariant that proves the desired safety properties in finite time, if one exists. As an emerging trend in graph-based deep learning, Graph Neural Networks (GNNs) excel for their capability to generate high-quality node feature vectors (embeddings). Authors may use this for content that may be of interest to some readers but is peripheral to the main technical contributions of the paper. Furthermore, such performance can be achieved without any modification in applications, network hardware, kernel CPU schedulers and/or kernel network stack. As has been standard practice in OSDI and SOSP in recent years, we will allow authors to submit quick responses to PC reviews: they will be made available to the PC before the final online discussion and PC meeting. She developed the technology for making network routing self-stabilizing, largely self-managing, and scalable. We develop rigorous theoretical foundations to simplify equivalence examination and correction for partially equivalent transformations, and design an efficient search algorithm to quickly discover highly optimized programs by combining fully and partially equivalent optimizations at the tensor, operator, and graph levels. A glance at this year's OSDI program shows that Operating Systems are a small niche topic for this conference, not even meriting their own full session. She is the author of the textbook Interconnections (about network layers 2 and 3) and coauthor of Network Security. Papers not meeting these criteria will be rejected without review, and no deadline extensions will be granted for reformatting. The chairs will review paper conflicts to ensure the integrity of the reviewing process, adding or removing conflicts if necessary. However, Addra improves message latency in this architecture, which is a key performance metric for voice calls. Dorylus is up to 3.8× faster and 10.7× cheaper compared to existing sampling-based systems.
Manuela M. Veloso is the Head of J.P. Morgan AI Research, which pursues fundamental research in areas of core relevance to financial services, including data mining and cryptography, machine learning, explainability, and human-AI interaction. Tao Luo, Mingen Pan, Pierre Tholoniat, Asaf Cidon, and Roxana Geambasu, Columbia University; Mathias Lécuyer, Microsoft Research. Performance experiments show that GoNFS provides similar performance (e.g., at least 90% throughput across several benchmarks on an NVMe disk) to Linux's NFS server exporting an ext4 file system, suggesting that GoJournal is a competitive journaling system. Computation separation makes it possible to construct a deep, bounded-asynchronous pipeline where graph and tensor parallel tasks can fully overlap, effectively hiding the network latency incurred by Lambdas. Our approach outperforms existing file systems on a block SSD by a wide margin (6.2× on average) for metadata-intensive benchmarks. Evaluation on a four-node machine with Optane DC Persistent Memory shows that Nap can improve the throughput by up to 2.3× and 1.56× under write-intensive and read-intensive workloads, respectively. First, GNNAdvisor explores and identifies several performance-relevant features from both the GNN model and the input graph, and uses them as a new driving force for GNN acceleration. Some recent schedulers choose job resources for users, but do so without awareness of how DL training can be re-optimized to better utilize the provided resources. Professor Veloso is the Past President of AAAI (the Association for the Advancement of Artificial Intelligence), and the co-founder, Trustee, and Past President of RoboCup. Typically, monolithic kernels share state across cores and rely on one-off synchronization patterns that are specialized for each kernel structure or subsystem. These limitations require state-of-the-art systems to distribute training across multiple machines.
How can we design systems that will be reliable despite misbehaving participants? Many application domains can benefit from hybrid transaction/analytical processing (HTAP) by executing queries on real-time datasets produced by concurrent transactions. We describe Fluffy, a multi-transaction differential fuzzer for finding consensus bugs in Ethereum. In this paper, we present P3, a system that focuses on scaling GNN model training to large real-world graphs in a distributed setting. blk-switch uses this insight to adapt techniques from the computer networking literature (e.g., multiple egress queues, prioritized processing of individual requests, load balancing, and switch scheduling) to the Linux kernel storage stack. All submissions will be treated as confidential prior to publication on the USENIX OSDI '21 website; rejected submissions will be permanently treated as confidential. We propose a new framework for computing the embeddings of large-scale graphs on a single machine. This paper presents Zeph, a system that enables users to set privacy preferences on how their data can be shared and processed. Secure hardware enclaves have been widely used for protecting security-critical applications in the cloud. To evaluate the security guarantees of Storm, we build a formally verified reference implementation using the Labeled IO (LIO) IFC framework. Jaehyun Hwang and Midhul Vuppalapati, Cornell University; Simon Peter, UT Austin; Rachit Agarwal, Cornell University. This talk will discuss several examples with very different solutions. Amy Tai, VMware Research; Igor Smolyar, Technion - Israel Institute of Technology; Michael Wei, VMware Research; Dan Tsafrir, Technion - Israel Institute of Technology and VMware Research.
We implement and evaluate a suite of applications, including MICA, Raft and Set Algebra for document retrieval; and we demonstrate that the nanoPU can be used as a high performance, programmable alternative for one-sided RDMA operations. We also propose two file system techniques for ZNS+-aware LFS. Jiang Zhang, University of Southern California; Shuai Wang, HKUST; Manuel Rigger, Pinjia He, and Zhendong Su, ETH Zurich. Attaching supplementary material is optional; if your paper says that you have source code or formal proofs, you need not attach them to convince the PC of their existence. In this paper, we show how to address this inefficiency without requiring pages to be rewritten or browsers to be modified. Yuke Wang, Boyuan Feng, Gushu Li, Shuangchen Li, Lei Deng, Yuan Xie, and Yufei Ding, University of California, Santa Barbara. We propose PET, the first DNN framework that optimizes tensor programs with partially equivalent transformations and automated corrections. Pollux simultaneously considers both aspects. SanRazor adopts a novel hybrid approach: it captures both dynamic code coverage and static data dependencies of checks, and uses the extracted information to perform a redundant check analysis. If you have any questions about conflicts, please contact the program co-chairs. These results outperform state-of-the-art HTAP systems by several orders of magnitude on transactional performance, while incurring only a small performance slowdown (5% over pure OLTP workloads) and still enjoying data freshness for analytical queries (less than 20 ms of maximum delay) in the failure-free case. Mothy joined the Computer Science Department at ETH Zurich in January 2007 and was named Fellow of the ACM in 2013 for contributions to operating systems and networking research. We compare Marius against two state-of-the-art industrial systems on a diverse array of benchmarks.
We describe PrivateKube, an extension to the popular Kubernetes datacenter orchestrator that adds privacy as a new type of resource to be managed alongside other traditional compute resources, such as CPU, GPU, and memory. Ethereum is the second-largest blockchain platform next to Bitcoin. Haojie Wang, Jidong Zhai, Mingyu Gao, Zixuan Ma, Shizhi Tang, and Liyan Zheng, Tsinghua University; Yuanzhi Li, Carnegie Mellon University; Kaiyuan Rong and Yuanyong Chen, Tsinghua University; Zhihao Jia, Carnegie Mellon University and Facebook. See www.cs.cmu.edu/~mmv/Veloso.html for her scientific publications. A PC member is a conflict if any of the following three circumstances applies: Institution: You are currently employed at the same institution, have been previously employed at the same institution within the past two years (not counting concluded internships), or are going to begin employment at the same institution during the review period. This paper describes the design, implementation, and evaluation of Addra, the first system for voice communication that hides metadata over fully untrusted infrastructure and scales to tens of thousands of users. Mothy's current research centers on Enzian, a powerful hybrid CPU/FPGA machine designed for research into systems software. We develop MAGE, an execution engine for SC that efficiently runs SC computations that do not fit in memory. DeSearch then introduces a witness mechanism to make sure the completed tasks can be reused across different pipelines, and to make the final search results verifiable by end users. By monitoring the status of each job during training, Pollux models how their goodput (a novel metric we introduce that combines system throughput with statistical efficiency) would change by adding or removing resources.
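The goodput idea mentioned above can be illustrated with a small sketch. The text only says that goodput combines system throughput with statistical efficiency; treating it as their product is an assumption made here for illustration. The names `JobEstimate`, `goodput`, and `best_allocation`, and the example numbers, are hypothetical and are not taken from the Pollux code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobEstimate:
    """Hypothetical per-allocation estimate for one training job."""
    throughput: float  # training examples processed per second
    efficiency: float  # statistical efficiency, a fraction in [0, 1]


def goodput(est: JobEstimate) -> float:
    # Assumption: goodput is modeled as throughput * efficiency.
    if not 0.0 <= est.efficiency <= 1.0:
        raise ValueError("efficiency must lie in [0, 1]")
    return est.throughput * est.efficiency


def best_allocation(candidates: dict[int, JobEstimate]) -> int:
    """Return the GPU count whose estimate gives the highest goodput."""
    return max(candidates, key=lambda gpus: goodput(candidates[gpus]))


# Made-up estimates: throughput rises with more GPUs, while the larger
# batch sizes that come with them lower statistical efficiency.
candidates = {
    1: JobEstimate(throughput=100.0, efficiency=0.95),
    2: JobEstimate(throughput=190.0, efficiency=0.90),
    4: JobEstimate(throughput=340.0, efficiency=0.60),
    8: JobEstimate(throughput=560.0, efficiency=0.30),
}
chosen = best_allocation(candidates)
```

With these invented numbers the highest-throughput allocation is not the one with the highest goodput, which is the kind of trade-off a goodput-aware scheduler would weigh when adding or removing resources.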
Using selective profiling, we build DMon, a system that can automatically locate data locality problems in production, identify access patterns that hurt locality, and repair such patterns using targeted optimizations. Session Chairs: Gennady Pekhimenko, University of Toronto / Vector Institute, and Shivaram Venkataraman, University of Wisconsin-Madison. Aurick Qiao, Petuum, Inc. and Carnegie Mellon University; Sang Keun Choe and Suhas Jayaram Subramanya, Carnegie Mellon University; Willie Neiswanger, Petuum, Inc. and Carnegie Mellon University; Qirong Ho, Petuum, Inc.; Hao Zhang, Petuum, Inc. and UC Berkeley; Gregory R. Ganger, Carnegie Mellon University; Eric P. Xing, MBZUAI, Petuum, Inc., and Carnegie Mellon University. Instead of choosing among a small number of known algorithms, our approach searches in a "policy space" of fine-grained actions, resulting in novel algorithms that can outperform existing algorithms by specializing to a given workload. Extensive experiments show that GNNAdvisor outperforms the state-of-the-art GNN computing frameworks, such as Deep Graph Library (3.02× faster on average) and NeuGraph (up to 4.10× faster), on mainstream GNN architectures across various datasets. If you submit a paper to either of those venues, you may not also submit it to OSDI '21. The OSDI Symposium emphasizes innovative research as well as quantified or insightful experiences in systems design and implementation. We demonstrate the above using design, implementation and evaluation of blk-switch, a new Linux kernel storage stack architecture. Our evaluation shows that PET outperforms existing systems by up to 2.5×, by unlocking previously missed opportunities from partially equivalent transformations. First, Fluffy mutates and executes multi-transaction test cases to find consensus bugs which cannot be found using existing fuzzers for Ethereum.
One classical approach is to increase the efficiency of an allocator to minimize the cycles spent in the allocator code. We implemented the ZNS+ SSD on an SSD emulator and a real SSD. For example, optimistic concurrency control (OCC) is better than two-phase-locking (2PL) under low contention, while the converse is true under high contention. Conference Dates: Apr 12, 2021 - Apr 14, 2021. Upon these two primitives, our system can scale to thousands of concurrent enclaves with high resource utilization and eliminate the high-cost initialization of secure memory using fork-style enclave creation without weakening the security guarantees. She is the recipient of several best paper awards, the Einstein Chair of the Chinese Academy of Science, the ACM/SIGART Autonomous Agents Research Award, an NSF Career Award, and the Allen Newell Medal for Excellence in Research. The NAL maintains 1) per-node partial views in PM for serving insert/update/delete operations with failure atomicity and 2) a global view in DRAM for serving lookup operations. While compiler-based techniques have been proposed to improve data locality, they depend on heuristics, which can sometimes hurt performance. Pages should be numbered, and figures and tables should be legible in black and white, without requiring magnification. Despite their extensive use for debugging and vulnerability discovery, sanitizer checks often induce a high runtime cost. We introduce a hybrid cryptographic protocol for privacy-adhering transformations of encrypted data. Papers must be in PDF format and must be submitted via the submission form. To help more profitably utilize sanitizers, we introduce SanRazor, a practical tool aiming to effectively detect and remove redundant sanitizer checks.