SMU Office of Research & Tech Transfer – In recognition of the lasting impact of his research, Associate Professor Jiang Lingxiao of the Singapore Management University (SMU) School of Information Systems has received the 2018 ACM SIGSOFT Impact Paper Award. The annual award is presented to authors of papers published at least ten years ago that have been deemed highly influential by the international software engineering community.
Professor Jiang received the award together with his co-authors Mr Ghassan Misherghi, formerly of the University of California, Davis; Professor Su Zhendong of ETH Zurich; and Dr Stéphane Glondu of the French Institute for Research in Computer Science and Automation for their paper titled ‘DECKARD: Scalable and Accurate Tree-Based Detection of Code Clones’.
In the paper, which was presented at the 29th International Conference on Software Engineering in 2007, Professor Jiang described a method of identifying code clones: similar code fragments that arise when developers copy and paste existing code into software. “If there is a bug in the source code base, then it may also appear in the new software because it has been duplicated by the developer,” he explains.
“Another use case of code clone detection is in malware detection. People typically use signatures to find malware, but because malware is constantly evolving, the exact signature may not be a good match. Automatically identifying related code clones accurately and efficiently on a large scale would help us classify malware into different families and improve the detection engine.”
At the time of publication, the prevailing methods of detecting code clones were either text-based or semantic. Although semantic approaches that considered the meaning in the code were more accurate than text-based methods, they were also more expensive to run. As an alternative, Professor Jiang and his colleagues borrowed an idea from the field of information retrieval: the concept of vectorisation.
Converting code into a numerical vector allowed them to find code clones efficiently by simply comparing vectors, rather than exhaustively searching through the whole code base. The team also released an open-source tool called DECKARD, which allows others to apply their algorithm easily.
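To make the idea of comparing vectors concrete, here is a minimal sketch in Python. It is an illustration only and not DECKARD's implementation: it assumes that a code fragment can be summarised by counting the kinds of syntax-tree nodes it contains, and the function names `characteristic_vector` and `distance` are hypothetical.

```python
import ast
import math
from collections import Counter


def characteristic_vector(source: str) -> Counter:
    """Count how often each syntax-tree node type occurs in a code fragment.

    Illustrative assumption: node-type counts stand in for the vector.
    """
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def distance(a: Counter, b: Counter) -> float:
    """Euclidean distance between two count vectors (missing keys count as 0)."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in keys))


original = (
    "def total(xs):\n"
    "    s = 0\n"
    "    for x in xs:\n"
    "        s += x\n"
    "    return s\n"
)
# A copy-pasted fragment in which only the identifiers were renamed.
clone = (
    "def add_all(items):\n"
    "    acc = 0\n"
    "    for item in items:\n"
    "        acc += item\n"
    "    return acc\n"
)
# A fragment with a different structure.
unrelated = (
    "def greet(name):\n"
    "    if name:\n"
    "        print('hello', name)\n"
)

v_original = characteristic_vector(original)
v_clone = characteristic_vector(clone)
v_unrelated = characteristic_vector(unrelated)

print(distance(v_original, v_clone))      # 0.0: same structure, different names
print(distance(v_original, v_unrelated))  # greater than 0: different structure
```

Because renaming variables does not change the node counts, the copy-pasted fragment lands on exactly the same vector as the original, while the unrelated fragment does not. A practical tool would also need a distance threshold for near-identical fragments and a faster way of finding nearby vectors than comparing every pair.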
Since then, other software developers have extended the original idea of vectorisation, reimplementing the technique in their own bug and malware detection tools. While DECKARD relied on manual identification of important patterns for vectorisation, Professor Jiang and others are exploring the use of deep learning to identify patterns in code and turn them into vectors automatically.
“I am excited to receive this award as it means that the tool we developed ten years ago – and continued to maintain and add features to – has made a real impact and been used by many people,” Professor Jiang says. “With this recognition, the impact of the paper will be further strengthened.”
In addition to receiving an award plaque and an honorarium to be split between the authors, Professor Jiang has been invited to deliver a keynote talk at the SIGSOFT Foundations of Software Engineering symposium, which will be held on 4-8 November 2018 in Florida.