Dealing with a large volume of Malware is a complex discipline, especially for private and independent tools that cannot rely on huge infrastructures and fast database rings. The aim of this dashboard is to monitor trends over thousands, even millions, of samples, providing quantitative analyses of what has been observed during the automated analyses. The data in this dashboard is entirely auto-generated, with no manual control and no post-processing. You should consider it raw data from which you can start your own research and to which you can apply your own filters or considerations. If you do that, be aware that false positives could be around the corner. Let’s move on to the current graphs and to what I’d like to show with them. Before getting into them, note that all the figures on the graphs are percentages, not absolute numbers.
- Malware Families Trends. Detection distribution over time. In other words, the time frames in which specific families are more active than others.
- Malware Families. Automatic Yara rules classify samples into families. Many samples are not classified into any family; this happens when no signature matches the sample or when signatures from multiple families match the same sample. In both cases I cannot be sure which family the sample belongs to, so it is classified as “unknown” and is not shown on this graph. The missing slice of the pie is attributed to “unknown”.
- Distribution Types. Based on the file magic bytes, this graph tracks the percentages of file types that Malware used as a carrier.
- Threat Level Distribution. The threat level ranges from 0 to 3, with higher values being more dangerous. It is also interesting to look at the threat level of unknown families, in order to understand whether Malware or false positives are hiding among them. For this reason a dedicated graph named Unknown Families Threat Level Distribution has been created.
- Stereotypes. Studying stereotypes is useful for analyzing similarities in clusters. In other words, it is nice to see which patterns malware uses in domain names, file names and process names. This is important for detection and even for preemptive blocking.
- TOP domains, TOP processes and TOP File Names. With a sliding window over the last 300 analyzed samples, the backend extracts the TOP (in terms of frequency) contacted domains, spawned processes and used file names. Again, there is no filter and no post-processing on these fields, meaning you could well find “google.com” or “microsoft update” as a TOP domain. That is fine: if the sample queried them before carrying out its malicious intent, it is simply recorded and brought to your attention. The same goes for processes and file names. Indeed, these fields include the term “involved” in their titles: if something is involved, it does not mean that it is malicious, only that it was observed as part of a malicious chain.
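
The family classification rule and the sliding-window TOP lists described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the dashboard's actual backend: the names `classify_family` and `SlidingTop` are hypothetical, and I assume each percentage is the share of samples in the window in which a value was seen at least once.

```python
from collections import Counter, deque

UNKNOWN = "unknown"


def classify_family(matched_families):
    """Return a family only when exactly one family's signatures matched.

    `matched_families` is the list of family names whose Yara rules hit
    the sample (hypothetical input format).
    """
    distinct = set(matched_families)
    if len(distinct) == 1:
        return next(iter(distinct))
    # No match at all, or several competing families: "unknown".
    return UNKNOWN


class SlidingTop:
    """Track the most frequent values over the last `size` samples."""

    def __init__(self, size=300):
        # deque(maxlen=...) silently drops the oldest sample when full.
        self.window = deque(maxlen=size)

    def add(self, values):
        # One entry per analyzed sample; duplicates within a sample
        # are collapsed so each sample counts a value at most once.
        self.window.append(set(values))

    def top(self, n=10):
        counts = Counter()
        for values in self.window:
            counts.update(values)
        total = len(self.window)
        # Assumption: percentage = share of samples in the window.
        return [(value, 100.0 * count / total)
                for value, count in counts.most_common(n)]
```

For example, feeding the contacted domains of each sample into `SlidingTop(size=300).add(...)` and calling `top(10)` would return benign domains such as “google.com” alongside malicious ones, since nothing is filtered, which matches the unfiltered behavior described above.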
If you have suggestions or feature requests, please contact me HERE.
Updated every 24h
Families Distribution Over Time
Distribution Types No “Exe” (%)
Threat Level Distribution (%)
Unknown Malware Families Threat Level (%)