About this Website

Why? What? How?

While doing research in cyber forensics, we often had difficulty finding an appropriate dataset and found ourselves spending hours on Google looking for the best one. This website is a collection of datasets that are available to researchers.

In Datasets, we link directly to available datasets and provide some general statistics about each set. Services summarizes third-party services that we identified as helpful when doing research. Lastly, Other Repositories lists websites similar to this one that will help you find good, comprehensive datasets.

We are actively working on this website and will try to maintain it as well as possible. If you have a dataset that you would like to add, or you find an incorrect link, please notify us using the contact form on this website.

About the Project

Who? When? Where?

This work was mainly carried out by Cinthya Grajeda, an undergraduate researcher at the University of New Haven (class of '17), during her internship with the University of New Haven Cyber Forensics Research and Education Lab. During her internship, Cinthya was supervised by Dr. Frank Breitinger and Dr. Ibrahim Baggili. The creation of this website was supported by Mateusz Topor, a graduate researcher at the University of New Haven.

The project, which analyzed 715 research articles published between 2010 and 2015, started in February 2016, and the resulting article was accepted for publication in April 2017. The results were published at the Digital Forensics Research Workshop '17 (US). The article can be freely downloaded from Digital Investigation. The reference to this article is: Cinthya Grajeda, Frank Breitinger, and Ibrahim Baggili. “Availability of Datasets for digital forensics - and what is missing”. In: Digital Investigation (2017).