Scraping the TOR for rare contents

Scraping the “TOR hidden world” is a quite complex topic. First of all you need an exceptional computational power (RAM mostly) for letting multiple runners grab web-pages, extracting new links and re-run the scraping-code against the just extracted links. Plus a queue manager system to manage scrapers conflicts and a database to store scraped data […]

Read more "Scraping the TOR for rare contents"

“Collection #I” Data Breach Analysis – Part 2

On January 19th we downloaded Collectoin #1 to make statistics on its content (you might find more information here). During these days we finished the two main activities to be able to answer some more questions about it data: (i) ELK import and (ii) building of simple views to visualise desired informations. The following image shows […]

Read more "“Collection #I” Data Breach Analysis – Part 2"