Common Vulnerabilities and Exposures (CVE) databases such as Mitre's CVE List and NIST's NVD database identify every disclosed vulnerability affecting any public software. However, during the early hours of a vulnerability disclosure, the metadata associated with these vulnerabilities is either missing, wrong, or at best sparse. This creates a challenge for robust automated analysis of new vulnerabilities. We present a new technique based on TF-IDF to map newly disclosed vulnerabilities to the most probably affected software products, formulated as an ordered list of relevant entries in the Common Platform Enumeration (CPE) database. For doing so we rely only on the human readable description of the vulnerability without any need for metadata.
- Poster