Today I ran the newly collected malware through a couple of scanners. For the most part, the vulnerabilities being exploited matched those in the samples I already had. A few files differed, though, in that they used CVE-2009-4324, either as a single exploit or as part of a group. After merging the result sets, I ended up with a total of 53 unique PDF malware samples.
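A merge like this can be sketched in a few lines. The version below is only an illustration and assumes each scanner result set is a dict mapping a sample's SHA-256 hash to the set of CVE IDs reported for it. That layout, the `merge_results` name, and the sample data are all made up for the example and are not the output format of any particular scanner.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest used here as a sample's identity."""
    return hashlib.sha256(data).hexdigest()


def merge_results(*result_sets):
    """Merge result sets keyed by hash, unioning the CVEs seen per sample."""
    merged = {}
    for results in result_sets:
        for digest, cves in results.items():
            merged.setdefault(digest, set()).update(cves)
    return merged


# Hypothetical data: byte strings stand in for real PDF contents, and
# "CVE-EXAMPLE-0001" is a placeholder, not a real identifier.
existing = {
    sha256_of(b"sample-a"): {"CVE-EXAMPLE-0001"},
    sha256_of(b"sample-b"): {"CVE-EXAMPLE-0001"},
}
new = {
    sha256_of(b"sample-b"): {"CVE-2009-4324"},
    sha256_of(b"sample-c"): {"CVE-2009-4324"},
}

merged = merge_results(existing, new)
print(len(merged), "unique samples")
```

Keying on a content hash means a file that shows up in both sets is counted once, while the exploits reported for it are combined.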
At this point, I think the best way to proceed with this project is to come up with a database schema that stores as much data about each sample as I can. Once the data is loaded, I can expose a web front end where users can search the malware samples and find characteristics shared across multiple files. Doing so should reveal more ways to detect malicious files.
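One possible starting point for such a schema, sketched with Python's built-in `sqlite3` module. Everything here is an assumption for illustration: the table names (`samples`, `traits`, `sample_traits`), the idea of storing each characteristic as a generic kind/value pair, and the helper functions are hypothetical, not a finished design.

```python
import sqlite3

# Hypothetical schema: one row per sample, one row per distinct
# characteristic, and a link table joining the two.
SCHEMA = """
CREATE TABLE samples (
    id       INTEGER PRIMARY KEY,
    sha256   TEXT NOT NULL UNIQUE,
    filename TEXT
);
CREATE TABLE traits (
    id    INTEGER PRIMARY KEY,
    kind  TEXT NOT NULL,   -- e.g. 'cve'
    value TEXT NOT NULL,   -- e.g. 'CVE-2009-4324'
    UNIQUE (kind, value)
);
CREATE TABLE sample_traits (
    sample_id INTEGER NOT NULL REFERENCES samples(id),
    trait_id  INTEGER NOT NULL REFERENCES traits(id),
    PRIMARY KEY (sample_id, trait_id)
);
"""


def open_db(path=":memory:"):
    """Open a database at `path` and create the tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_trait(conn, sha256, kind, value):
    """Record that the sample with this hash has the given characteristic."""
    conn.execute("INSERT OR IGNORE INTO samples (sha256) VALUES (?)", (sha256,))
    conn.execute(
        "INSERT OR IGNORE INTO traits (kind, value) VALUES (?, ?)", (kind, value)
    )
    conn.execute(
        """INSERT OR IGNORE INTO sample_traits (sample_id, trait_id)
           SELECT s.id, t.id FROM samples s, traits t
           WHERE s.sha256 = ? AND t.kind = ? AND t.value = ?""",
        (sha256, kind, value),
    )
    conn.commit()


def shared_traits(conn, min_samples=2):
    """List characteristics that appear in at least `min_samples` samples."""
    return conn.execute(
        """SELECT t.kind, t.value, COUNT(*) AS n
           FROM sample_traits st JOIN traits t ON t.id = st.trait_id
           GROUP BY t.id HAVING COUNT(*) >= ?
           ORDER BY n DESC, t.kind, t.value""",
        (min_samples,),
    ).fetchall()
```

Storing characteristics as kind/value pairs keeps the schema open-ended: a new type of observation needs no new column, and a query like `shared_traits` is the kind of search a web front end could expose.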