Drexel Lab Releases Software to Discern and Disguise Authorship
January 23, 2012
A set of new software programs, developed by Drexel University computer scientists, could soon help protect the speech of the disenfranchised and ensure the voice of the whistle-blower, all by confirming or contorting one’s writing style.
JStylo and Anonymouth are a pair of competing programs designed, respectively, to divine the authorship of a disputed text and to help authors remain anonymous. The open-source software, which originated in the Drexel College of Engineering’s Privacy, Security and Automation Lab, was recently released in its first version at the Chaos Communication Congress.
“JStylo and Anonymouth are intended to fill holes that we see in both the research and privacy communities,” said Michael Brennan, a lead developer of the project who is in his fifth year of doctoral study at Drexel.
“JStylo allows for more effective stylometry research and Anonymouth enables people to maintain their anonymity when publishing sensitive writing.”
JStylo, the program intended to identify authors, uses a rigorous set of filters to sift out patterns in the text. Starting with broad categories such as sentence length and lines per paragraph and winnowing down to such characteristics as word choice and frequency of certain letter combinations, the program generates an author profile, which is then compared to a baseline writing sample from the suspected author. With a writing sample of about 6,500 words as a comparison, the software can select an author from a pool of 40 candidates with 80-85 percent accuracy, according to Brennan. The tool becomes even more accurate as the number of possible authors decreases.
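The general idea of profiling a text and comparing it against known samples can be illustrated with a minimal Python sketch. This is not JStylo's code: the short word list, the scaled sentence-length feature, the cosine-similarity comparison, and the names `features` and `attribute` are all assumptions chosen for illustration, and a real tool would use a far richer feature set and a trained classifier.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny list of common "function words".
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "it", "is", "was"]


def features(text):
    """Turn a text into a small numeric style profile (illustrative only)."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    counts = Counter(words)
    total = max(len(words), 1)
    # Relative frequency of each function word.
    profile = [counts[w] / total for w in FUNCTION_WORDS]
    # Average sentence length in words, scaled down so it does not dominate.
    profile.append(len(words) / max(len(sentences), 1) / 100.0)
    return profile


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def attribute(disputed, samples):
    """Return the candidate whose baseline sample is most similar in style."""
    target = features(disputed)
    return max(samples, key=lambda name: cosine(target, features(samples[name])))
```

Calling `attribute(disputed_text, {"author_a": sample_a, "author_b": sample_b})` returns the name whose sample profile lies closest to the disputed text. With samples of only a few sentences the result means little; the article's accuracy figures refer to baselines of roughly 6,500 words.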
“JStylo and authorship recognition in general can be used to discover instances of deception,” Brennan said. “There have been occurrences of deceptive authors such as the American male who wrote the blog ‘Gay Girl in Damascus’ during the Arab Spring movement, that authorship recognition has unveiled after the fact.”
Conversely, Anonymouth is designed to help cloak an author’s unique writing characteristics to the point where the text cannot be traced back to them using authorship recognition software such as JStylo. Anonymouth does not encode the writing; instead, it runs a set of analyses similar to those of its counterpart and suggests changes that the author could make in order to mask his or her writing tendencies.
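To make the notion of "suggesting changes" concrete, here is a minimal Python sketch of one possible heuristic: flag the common words whose usage rate in a draft differs most from the average of other writers. This is an assumption made for illustration, not Anonymouth's actual method; the word list and the names `word_rates` and `suggest_changes` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny list of common "function words".
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "it", "is", "was"]


def word_rates(text):
    """Relative frequency of each tracked word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = max(len(words), 1)
    return {w: counts[w] / total for w in FUNCTION_WORDS}


def suggest_changes(draft, other_samples, top_n=3):
    """List the words whose usage in the draft stands out most from the crowd."""
    draft_rates = word_rates(draft)
    crowd_rates = [word_rates(s) for s in other_samples]
    n = max(len(crowd_rates), 1)
    suggestions = []
    for w in FUNCTION_WORDS:
        crowd_avg = sum(r[w] for r in crowd_rates) / n
        gap = draft_rates[w] - crowd_avg
        if gap != 0:
            advice = "use less" if gap > 0 else "use more"
            suggestions.append((abs(gap), w, advice))
    # Largest deviations first.
    suggestions.sort(key=lambda s: s[0], reverse=True)
    return [(w, advice) for _, w, advice in suggestions[:top_n]]
```

The output is a short list of `(word, advice)` pairs that a writer could act on by hand, which mirrors the article's point that the tool proposes edits rather than encoding or rewriting the text itself.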
“When people want to speak anonymously, whether it be for reporting on human rights issues or whistleblowing or simply voicing unpopular opinions, they need to know how to be safe and whether stylometry may reveal their identity,” said Dr. Rachel Greenstadt, the director of Drexel’s Privacy, Security and Automation Lab. “I am hopeful that these programs will help us answer these questions and also provide a mechanism for people outside our lab to learn about these issues.”
The products’ launch at the Chaos Communication Congress marked the first release of open-source software of this kind. Brennan and Greenstadt are projecting a beta release in the spring, and an accompanying research paper produced by the lab is currently under review.