Datasets

The DARPA SEARCHLIGHT dataset contains approximately 2,000 systematically conducted experiments, and the resulting packet captures, covering contemporary video streaming, video teleconferencing, and cloud-based document editing applications.

For more information about this dataset and our motivations for curating it, please see our research paper of the same name.

Software (testbed, traffic generators, etc.) used to create this dataset can be found on the Searchlight project page.

Experiment Naming Conventions

Phase 3 experiments were focused on generating large amounts of traffic across multiple topologies using a specific configuration for each application network traffic generator.

  • Experiment configurations: 213
  • Number of .pcaps: 687
  • Size: 1.2TB

Phase 2 experiments were focused on the breadth of each traffic generator, iterating over all possible configurations.

  • Number of .pcaps: 3512
  • Size: 750GB

Instructions: Filtering by Tags

In general, the experiment/tag selections you make on the left side will be ANDed between groups and ORed within a group.

Example #1: selecting “ipsec” (VPN), “sts” (VPN-Topology), and “ptp” (VPN-Topology) will result in the equivalent query: (vpn: ipsec) && (vpn-topology: sts || vpn-topology: ptp).

Example #2: selecting “video: resolution-1080”, “video: streaming-dash”, “video: transport-http3”, “video: transport-http1” is equivalent to: (video: resolution-1080) && (video: streaming-dash) && (video: transport-http3 || video: transport-http1).
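
The AND-between-groups, OR-within-a-group rule can be sketched in a few lines of Python. This is only an illustration of the logic, not the site's actual implementation: the function name matches and the representation of each tag as a (group, value) pair are assumptions made for this example.

```python
from collections import defaultdict


def matches(selected, experiment_tags):
    """Return True if an experiment satisfies the selected tags.

    selected        -- iterable of (group, value) pairs chosen by the user
    experiment_tags -- set of (group, value) pairs attached to one experiment
    Both shapes are hypothetical; the real tag encoding may differ.
    """
    # Collect the selected values under their group.
    wanted = defaultdict(set)
    for group, value in selected:
        wanted[group].add(value)

    # AND across groups, OR across the values inside each group.
    return all(
        any((group, value) in experiment_tags for value in values)
        for group, values in wanted.items()
    )


# Example #1 from the text: (vpn: ipsec) && (vpn-topology: sts || vpn-topology: ptp)
selection = [("vpn", "ipsec"), ("vpn-topology", "sts"), ("vpn-topology", "ptp")]
```

With this selection, an experiment tagged ipsec and ptp matches, while an experiment tagged only ipsec does not, because the vpn-topology group is left unsatisfied.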

Mouse over the icon next to each category for more info.

Raw Data (.csv)

If you want to download data in bulk or downselect on more complex tag queries, please download the dataset listing in CSV format.

The CSV header is:

	url,tags,sha256sum
  • url contains the resource path (e.g., dataset/.../foo.pcap)
    • add the prefix https://lhst0.sphere-testbed.net/ to build the full URL (this prefix may change over time and will be updated here)
    • optionally, add one of the following suffixes: .summary.txt (Wireshark summary), .pps.png (Packets per Second over time), .bw.png (Bytes over time)
  • tags are delimited by |.
  • sha256sum is the SHA-256 hash of the .pcap file
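
As a rough sketch of how the listing might be processed, the Python below parses the CSV, builds full URLs, splits the tags, and computes a SHA-256 digest for comparison with the sha256sum column. The function names, the sample path dataset/example/foo.pcap, and the sample tag strings are hypothetical. The sketch also assumes that the optional suffixes are appended to the full .pcap URL and that the prefix shown above is still current.

```python
import csv
import hashlib
import io

# Prefix taken from the text above; it may change over time.
BASE_URL = "https://lhst0.sphere-testbed.net/"
SUFFIXES = (".summary.txt", ".pps.png", ".bw.png")


def parse_listing(text):
    """Parse CSV text with the header url,tags,sha256sum into dicts."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "url": BASE_URL + row["url"].lstrip("/"),
            # Tags are delimited by "|".
            "tags": set(row["tags"].split("|")) if row["tags"] else set(),
            "sha256sum": row["sha256sum"].strip().lower(),
        })
    return rows


def companion_url(pcap_url, suffix):
    """Build the URL of a summary or plot (assumes the suffix is appended)."""
    if suffix not in SUFFIXES:
        raise ValueError(f"unknown suffix: {suffix}")
    return pcap_url + suffix


def sha256_of(fileobj, chunk_size=1 << 20):
    """Hash a binary file object in chunks so large .pcaps fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()
```

After downloading a capture, opening it in binary mode and comparing sha256_of(f) with the row's sha256sum value gives a simple integrity check. Filtering on the parsed tags set then allows more complex queries than the web interface offers.
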

License

This dataset is licensed under Creative Commons Attribution 4.0 International (CC-BY-4.0).

2024. The DARPA SEARCHLIGHT Dataset of Application Network Traffic. [DISTAR Case 38213]

These documents were cleared by DARPA on August 15, 2023. All copies should carry Distribution Statement "A" (Approved for Public Release, Distribution Unlimited).

Searchlight Traffic Generation Dataset and Tools of Application Network Traffic. [DISTAR Case 36697]

This document was cleared by DARPA on August 22, 2022. All copies should carry the Distribution Statement "A" (Approved for Public Release, Distribution Unlimited).

This research was developed with funding from the Defense Advanced Research Projects Agency (DARPA). The views, opinions and/or findings expressed are those of the author and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.

Dataset created and curated by researchers at USC/ISI and SNL.

Website originally designed by Mahek Savani.