Searchlight

Papers, software, and other artifacts in support of the DARPA Searchlight program on MergeTB.

The Test and Evaluation (T&E) team, consisting of researchers at Sandia National Laboratories (SNL) and USC/ISI, developed tools and methodology on Merge testbeds to evaluate performer technologies for the DARPA Searchlight program.

We have released a variety of datasets and software, and published research papers.

Send questions, comments, and bug fixes to the corresponding author(s) of the relevant artifact, or email calvin@isi.edu for a pointer in the right direction.

Distribution Statement “A” (Approved for Public Release, Distribution Unlimited).
This research was developed with funding from the Defense Advanced Research Projects Agency (DARPA). The views, opinions and/or findings expressed are those of the author and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.

Datasets

Software

Experimentation on Network Emulation Testbeds

Traffic Generators

Research Papers

2022

  • Improving Fidelity in Video Streaming Experimentation on Testbeds with a CDN

    Calvin Ardi, Alefiya Hussain, Michael Collins, and Stephen Schwab. 2022. Improving Fidelity in Video Streaming Experimentation on Testbeds with a CDN. In Workshop on Design, Deployment, and Evaluation of Network-assisted Video Streaming (ViSNext ’22), December 9, 2022, Rome, Italy. Association for Computing Machinery, New York, NY, USA, 1–7. DOI: 10.1145/3565476.3569097

    Video streaming is the leading network traffic on the Internet, yet there are few tools to run high fidelity experiments with video streaming traffic on network emulation-based testbeds. In this paper, we present a framework to enable higher fidelity and principled experimentation with 36 different video streaming traffic scenario combinations that can be configured and deployed on a notional CDN and data metrics infrastructure. This framework can be used to further study and experiment with adaptive bitrate algorithms and other AI/ML solutions for video delivery.

    DOI PDF

    
    @inproceedings{10.1145/3565476.3569097,
      author = {Ardi, Calvin and Hussain, Alefiya and Collins, Michael and Schwab, Stephen},
      title = {Improving Fidelity in Video Streaming Experimentation on Testbeds with a CDN},
      year = {2022},
      isbn = {9781450399364},
      publisher = {Association for Computing Machinery},
      address = {New York, NY, USA},
      url = {https://doi.org/10.1145/3565476.3569097},
      doi = {10.1145/3565476.3569097},
      abstract = {Video streaming is the leading network traffic on
          the Internet, yet there are few tools to run high fidelity
          experiments with video streaming traffic on network
          emulation-based testbeds. In this paper, we present a framework
          to enable higher fidelity and principled experimentation with 36
          different video streaming traffic scenario combinations that can
          be configured and deployed on a notional CDN and data metrics
          infrastructure. This framework can be used to further study and
          experiment with adaptive bitrate algorithms and other AI/ML
          solutions for video delivery.},
      booktitle = {Proceedings of the 2nd International Workshop on
          Design, Deployment, and Evaluation of Network-Assisted Video
          Streaming},
      pages = {1--7},
      numpages = {7},
      keywords = {network experimentation, content distribution
          network, network traffic, video streaming applications},
      location = {Rome, Italy},
      series = {ViSNext '22}
    }
    
  • The DARPA SEARCHLIGHT Dataset of Application Network Traffic

    Calvin Ardi, Connor Aubry, Brian Kocoloski, Dave DeAngelis, Alefiya Hussain, Matt Troglia, and Stephen Schwab. 2022. The DARPA SEARCHLIGHT Dataset of Application Network Traffic. In Proceedings of the 15th Workshop on Cyber Security Experimentation and Test (CSET ’22). Association for Computing Machinery, New York, NY, USA, 59–64. DOI: 10.1145/3546096.3546103.

    Researchers are in constant need of reliable data to develop and evaluate AI/ML methods for networks and cybersecurity. While Internet measurements can provide realistic data, such datasets lack ground truth about application flows. We present a ~750 GB dataset that includes ~2000 systematically conducted experiments and the resulting packet captures with video streaming, video teleconferencing, and cloud-based document editing applications. This curated and labeled dataset has bidirectional and encrypted traffic with complete ground truth that can be widely used for assessments and evaluation of AI/ML algorithms.

    DOI PDF Data

    
    @inproceedings{10.1145/3546096.3546103,
    author    = {Ardi, Calvin and Aubry, Connor and Kocoloski, Brian and
        DeAngelis, Dave and Hussain, Alefiya and Troglia, Matt and Schwab,
        Stephen},
    title     = {The DARPA SEARCHLIGHT Dataset of Application Network Traffic},
    year      = 2022,
    month     = aug,
    isbn      = {9781450396844},
    publisher = {Association for Computing Machinery},
    address   = {New York, NY, USA},
    url       = {https://doi.org/10.1145/3546096.3546103},
    doi       = {10.1145/3546096.3546103},
    abstract  = {Researchers are in constant need of reliable data to
        develop and evaluate AI/ML methods for networks and cybersecurity.
        While Internet measurements can provide realistic data, such
        datasets lack ground truth about application flows. We present a
        $\sim$750 GB dataset that includes $\sim$2000 systematically conducted
        experiments and the resulting packet captures with video streaming,
        video teleconferencing, and cloud-based document editing
        applications. This curated and labeled dataset has bidirectional and
        encrypted traffic with complete ground truth that can be widely used
        for assessments and evaluation of AI/ML algorithms.},
    booktitle = {Proceedings of the 15th Workshop on Cyber Security Experimentation and Test},
    pages     = {59--64},
    numpages  = {6},
    keywords  = {datasets, network experimentation, network traffic},
    location  = {Virtual, CA, USA},
    series    = {CSET '22}
    }
    
  • Generating Representative Video Teleconferencing Traffic

    David DeAngelis, Alefiya Hussain, Brian Kocoloski, Calvin Ardi, and Stephen Schwab. 2022. Generating Representative Video Teleconferencing Traffic. In Cyber Security Experimentation and Test Workshop (CSET ’22). Association for Computing Machinery, New York, NY, USA, 100–104. DOI: 10.1145/3546096.3546107.

    Video teleconferencing (VTC) is a dominant network application, yet there is a dearth of tools to generate such traffic for systematic and reproducible experimentation. We present a framework to create representative video teleconferencing traffic and discuss our methodology for behavioral control of multiple bots to create human-like dialog coordination, including interactive talking and silence patterns. Our framework can be coupled with proprietary commercial VTC applications as well as deployed completely within a testbed environment to benchmark emerging networking technology and evaluate the next generation of traffic classification, quality of service (QoS) algorithms, and traffic engineering systems.

    DOI PDF Code

    
    @inproceedings{10.1145/3546096.3546107,
    author = {DeAngelis, David and Hussain, Alefiya and Kocoloski, Brian and
        Ardi, Calvin and Schwab, Stephen},
    title = {Generating Representative Video Teleconferencing Traffic},
    year = {2022},
    isbn = {9781450396844},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3546096.3546107},
    doi = {10.1145/3546096.3546107},
    abstract = {Video teleconferencing (VTC) is a dominant network
        application, yet there is a dearth of tools to generate such traffic
        for systematic and reproducible experimentation. We present a framework
        to create representative video teleconferencing traffic and discuss our
        methodology for behavioral control of multiple bots to create
        human-like dialog coordination, including interactive talking and
        silence patterns. Our framework can be coupled with proprietary
        commercial VTC applications as well as deployed completely within a
        testbed environment to benchmark emerging networking technology and
        evaluate the next generation of traffic classification, quality of
        service (QoS) algorithms, and traffic engineering systems.},
    booktitle = {Cyber Security Experimentation and Test Workshop},
    pages = {100--104},
    numpages = {5},
    keywords = {video teleconference, VoIP, network traffic generation,
        cybersecurity testbeds},
    location = {Virtual, CA, USA},
    series = {CSET 2022}
    }
    
  • Towards an Operations-Aware Experimentation Methodology

    Michael Collins, Alefiya Hussain, and Stephen Schwab. 2022. Towards an Operations-Aware Experimentation Methodology. In 2022 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). IEEE Computer Society, Los Alamitos, CA, USA, 384–393. DOI: 10.1109/EuroSPW55150.2022.00046.

    Security Operations Centers (SOCs) serve a critical role in protecting enterprise networks and systems. Despite this critical role, only a limited number of researchers in the field have an awareness of the obstacles and challenges in applying cyber ranges and cybersecurity testbeds to the area of SOC training, exercises and evaluation. This paper introduces a systematic approach to incorporating SOCs into cybersecurity experiments, including both training and evaluation. We present a reference SOC model, an implementation of that model and downloadable software distributions suitable for deploying on cyber ranges, and guidance towards a methodology to promote rigorous experiments including those involving human cyber operators. Metrics focused on analyst event load are presented in the context of measuring the impact of new threats, technologies and procedures on SOC performance. Collectively, these contributions serve as a basis for future work to engage the research and operational communities to work together to advance the state-of-the-art of SOC technologies and SOC operators.

    DOI PDF

    
    @INPROCEEDINGS{10.1109/EuroSPW55150.2022.00046,
      author = {M. Collins and A. Hussain and S. Schwab},
      booktitle = {2022 IEEE European Symposium on Security and Privacy Workshops (EuroS\&PW)},
      title = {Towards an Operations-Aware Experimentation Methodology},
      year = {2022},
      pages = {384--393},
      abstract = {Security Operations Centers (SOCs) serve a critical
        role in protecting enterprise networks and systems. Despite this
        critical role, only a limited number of researchers in the field
        have an awareness of the obstacles and challenges in applying
        cyber ranges and cybersecurity testbeds to the area of SOC
        training, exercises and evaluation. This paper introduces a
        systematic approach to incorporating SOCs into cybersecurity
        experiments, including both training and evaluation. We present a
        reference SOC model, an implementation of that model and
        downloadable software distributions suitable for deploying on
        cyber ranges, and guidance towards a methodology to promote
        rigorous experiments including those involving human cyber
        operators. Metrics focused on analyst event load are presented in
        the context of measuring the impact of new threats, technologies
        and procedures on SOC performance. Collectively, these
        contributions serve as a basis for future work to engage the
        research and operational communities to work together to advance
        the state-of-the-art of SOC technologies and SOC operators.},
      keywords = {training;measurement;systematics;software;computer security;load modeling},
      doi = {10.1109/EuroSPW55150.2022.00046},
      url = {https://doi.ieeecomputersociety.org/10.1109/EuroSPW55150.2022.00046},
      publisher = {IEEE Computer Society},
      address = {Los Alamitos, CA, USA},
      month = jun
    }
    

2021

  • Case Studies in Experiment Design on a Minimega Based Network Emulation Testbed

    Brian Kocoloski, Alefiya Hussain, Matthew Troglia, Calvin Ardi, Steven Cheng, Dave DeAngelis, Christopher Symonds, Michael Collins, Ryan Goodfellow, and Stephen Schwab. 2021. Case Studies in Experiment Design on a minimega Based Network Emulation Testbed. In Cyber Security Experimentation and Test Workshop (CSET ’21). Association for Computing Machinery, New York, NY, USA, 83–90. DOI: 10.1145/3474718.3474730.

    This paper describes our team’s experience using minimega, a network emulation system using node and network virtualization, to support evaluation of a set of networked and distributed systems for topology discovery, traffic classification and engineering in the DARPA Searchlight program. We present the methodology we developed to encode network and traffic definitions into an experiment description model (EDM), and how our tools compile this model onto the underlying minimega API. We then present three case studies which demonstrate the ability of our EDM to support experiments with diverse network topologies, diverse traffic mixes, and networks with specialized layer-2 connectivity requirements. We conclude with the overall takeaways from using minimega to support our evaluation process.

    DOI PDF Presentation Code

    
    @inproceedings{10.1145/3474718.3474730,
    author    = {Kocoloski, Brian and Hussain, Alefiya and Troglia, Matthew and
    Ardi, Calvin and Cheng, Steven and DeAngelis, Dave and Symonds,
    Christopher and Collins, Michael and Goodfellow, Ryan and Schwab,
    Stephen},
    title     = {Case Studies in Experiment Design on a Minimega Based Network
    Emulation Testbed},
    year      = 2021,
    isbn      = {9781450390651},
    publisher = {Association for Computing Machinery},
    address   = {New York, NY, USA},
    url       = {https://doi.org/10.1145/3474718.3474730},
    doi       = {10.1145/3474718.3474730},
    abstract  = {This paper describes our team’s experience using minimega, a
        network emulation system using node and network virtualization,
        to support evaluation of a set of networked and distributed
        systems for topology discovery, traffic classification and
        engineering in the DARPA Searchlight program. We present the
        methodology we developed to encode network and traffic
        definitions into an experiment description model, and how our
        tools compile this model onto the underlying minimega API. We
        then present three case studies which demonstrate the ability
        of our EDM to support experiments with diverse network
        topologies, diverse traffic mixes, and networks with specialized
        layer-2 connectivity requirements. We conclude with the overall
        takeaways from using minimega to support our evaluation
        process.},
    booktitle = {Cyber Security Experimentation and Test Workshop},
    pages     = {83--90},
    numpages  = {8},
    location  = {Virtual, CA, USA},
    series    = {CSET '21}
    }
    
  • Building Reproducible Video Streaming Traffic Generators

    Calvin Ardi, Alefiya Hussain, and Stephen Schwab. 2021. Building Reproducible Video Streaming Traffic Generators. In Cyber Security Experimentation and Test Workshop (CSET ’21). Association for Computing Machinery, New York, NY, USA, 91–95. DOI: 10.1145/3474718.3474721.

    Video streaming traffic dominates Internet traffic. However, there is a dearth of tools to generate such traffic on emulation-based testbeds. In this paper we present tools to create representative and reproducible video streaming traffic to evaluate the next generation of traffic classification, Quality of Service (QoS) algorithms and traffic engineering systems. We discuss 27 different combinations of streaming video traffic types in this preliminary work, and illustrate the diversity of network-level dynamics in these protocols.

    DOI PDF Code

    
    @inproceedings{10.1145/3474718.3474721,
    author    = {Ardi, Calvin and Hussain, Alefiya and Schwab, Stephen},
    title     = {Building Reproducible Video Streaming Traffic Generators},
    year      = 2021,
    month     = aug,
    isbn      = {9781450390651},
    publisher = {Association for Computing Machinery},
    address   = {New York, NY, USA},
    url       = {https://doi.org/10.1145/3474718.3474721},
    doi       = {10.1145/3474718.3474721},
    abstract  = {Video streaming traffic dominates Internet traffic.
        However, there is a dearth of tools to generate such traffic on
        emulation-based testbeds. In this paper we present tools to
        create representative and reproducible video streaming traffic
        to evaluate the next generation of traffic classification,
        Quality of Service (QoS) algorithms and traffic engineering
        systems. We discuss 27 different combinations of streaming video
        traffic types in this preliminary work, and illustrate the
        diversity of network-level dynamics in these protocols.},
    booktitle = {Cyber Security Experimentation and Test Workshop},
    pages     = {91--95},
    numpages  = {5},
    location  = {Virtual, CA, USA},
    series    = {CSET '21}
    }