Keegan Kang’s research is in locality-sensitive hashing (LSH) algorithms, machine learning, statistical theory, and pedagogy. He explores the use of Monte Carlo methods to complement random projections and other LSH algorithms, in conjunction with machine learning algorithms, to increase their predictive power without incurring a large speed tradeoff. He received his Ph.D. from the Department of Statistics and Data Science at Cornell University, and was a faculty fellow in the Science, Math, and Technology cluster at SUTD before joining ESD.


I completed my PhD at Cornell University in 2017. Before that, I read the four-year integrated master’s MMORSE degree at the University of Warwick, graduating with First Class Honours in 2012.

Research Interests

My research interests are in locality-sensitive hashing (LSH) algorithms, machine learning, statistical theory, and pedagogy. I am currently exploring the use of Monte Carlo methods, such as control variates, to complement random projections and other LSH algorithms, in conjunction with machine learning algorithms, to increase their predictive power without incurring a large speed tradeoff. I am also constantly seeking ways to improve the teaching of undergraduate courses.
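
As a toy illustration of the control variates idea that recurs in my publications: for a Gaussian projection matrix R, the mean of the elementwise products of Rx and Ry is an unbiased estimate of the inner product of x and y, while the squared coordinates of Rx have the known mean ||x||², so their observed fluctuation can be subtracted out to reduce variance. The numpy sketch below uses made-up dimensions and data, not settings from any particular paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 1000, 200                      # data dimension, number of projections

x = rng.standard_normal(d)
y = rng.standard_normal(d)

# Gaussian random projection: entries of R are i.i.d. N(0, 1).
R = rng.standard_normal((k, d))
px, py = R @ x, R @ y

# Plain estimator: E[(r.x)(r.y)] = <x, y>, so average the products.
plain = np.mean(px * py)

# Control variate: E[(r.x)^2] = ||x||^2 is known exactly, so the observed
# deviation of the mean of px**2 from ||x||^2 can be subtracted out.
a, w = px * py, px ** 2
C = np.cov(a, w)                      # 2x2 sample covariance matrix
c = C[0, 1] / C[1, 1]                 # estimated optimal coefficient
adjusted = plain - c * (np.mean(w) - x @ x)

print(f"true={x @ y:.1f}  plain={plain:.1f}  adjusted={adjusted:.1f}")
```

Here the coefficient c is estimated from the same projections; how much the adjusted estimator improves on the plain one depends on the data and on how c is chosen, which is part of what this line of work studies.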

Research Projects

I am currently involved in the following projects.

  • Improving Random Projections and LSH Schemes with Statistical Techniques {PI} (MOE2018-T2-2-013)
    This is joint and ongoing work with co-PIs Wong Wei Pin and Sergey Kushnarev, and collaborators Karthyek Murthy and Bob Durrant. This work is funded by the Singapore Ministry of Education’s Tier 2 AcRF grant.
  • Autograder for R Code
    I have been involved in developing an autograder for R code since becoming a TA for Statistical Computing in Spring 2016 at Cornell University. The autograder is written so that a TA with minimal programming experience can use it to grade code.


Publications

  1. Jeremy Chew and Keegan Kang. “Control Variates for Similarity Search”. To appear in PRCV 2021.
  2. Daniel Jing En Toh, Matthew Rui Kan Xian, and Keegan Kang. “Applying James-Stein Estimation to b-bit Minwise Hashing”. To appear in IRC-SET 2021.
  3. Keegan Kang, Wong Wei Pin, Sergey Kushnarev, Haikal Yeo, Rameshwar Pratap, and Chen Yijia. “Improving Hashing Algorithms for Similarity Search via MLE and the Control Variates Trick”. To appear in ACML 2021.
  4. Keegan Kang, Wong Wei Pin, Sergey Kushnarev, and Haikal Yeo. “Improving Locality Sensitive Hashing Algorithms with Maximum Likelihood Estimators”. (Submitted)
  5. Foo Lin Geng, Jiang Yan Li, Haikal Yeo, Sergey Kushnarev, and Keegan Kang. “Improving Random Projections with Control Variates”. (Submitted)
  6. Keegan Kang. “Correlations Between Random Projections and the Bivariate Normal”. Data Mining and Knowledge Discovery, May 2021.
  7. Hengyue Wang, Hsin Wei Kuo, Ryan Nathaniel Thesman, and Keegan Kang. “Improving b-bit Minwise Hashing with Addition of Optimal Standard Vectors”. In IRC-SET 2020, pages 349–364. Springer Singapore.
  8. Sergey Kushnarev, Keegan Kang, and Shubham Goyal. “Assessing the Efficacy of Personalized Online Homework in a First-Year Engineering Multivariate Calculus Course”. In 2020 IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE), pages 770–773, 2020.
  9. Yulong Li, Zhihao Kuang, Jiang Yan Li, and Keegan Kang. “Improving Random Projections with Extra Vectors to Approximate Inner Products”. IEEE Access, 8:78590–78607, 2020.
  10. Weijie Chi, Jie Chen, Wenjuan Liu, Chao Wang, Qingkai Qi, Qinglong Qiao, Tee Meng Tan, Kangming Xiong, Xiao Liu, Keegan Kang, Young-Tae Chang, Zhaochao Xu, and Xiaogang Liu. “A General Descriptor ΔE Enables the Quantitative Development of Luminescent Materials Based on Photoinduced Electron Transfer”. Journal of the American Chemical Society, 142(14):6777–6785, 2020. PMID: 32182060.
  11. Keegan Kang, Sergey Kushnarev, Wong Wei Pin, Omar Ortiz, and Jacob Chen Shihang (2020). “Impact of Virtual Reality on the Visualization of Partial Derivatives in a Multivariable Calculus Class”. IEEE Access, 8:58940–58947.
  12. Keegan Kang and Wong Wei Pin (2018). “Improving Sign Random Projections with Extra Information”. In Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, Stockholm, Sweden, Jul 10–15, 2018, pp. 2484–2492.
  13. Keegan Kang (2017). “Random Projections with Bayesian Priors”. In Natural Language Processing and Chinese Computing – 6th CCF International Conference, NLPCC 2017, Dalian, China, November 8–12, 2017, Proceedings, pp. 170–182.
  14. Keegan Kang (2017). “Using the Multivariate Normal to Improve Random Projections”. In Intelligent Data Engineering and Automated Learning – IDEAL 2017: 18th International Conference, Guilin, China, October 30 – November 1, 2017, Proceedings, pp. 397–405.
  15. Keegan Kang and Giles Hooker (2017). “Control Variates as a Variance Reduction Technique for Random Projections”. In Pattern Recognition Applications and Methods – 6th International Conference, ICPRAM 2017, Porto, Portugal, February 24–26, 2017, Revised Selected Papers, pp. 1–20.
  16. Keegan Kang and Giles Hooker (2017). “Random Projections with Control Variates”. In Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods, Volume 1: ICPRAM, pp. 138–147.
  17. Keegan Kang and Giles Hooker (2016). “Improving the Recovery of Principal Components with Semi-Deterministic Random Projections”. In 2016 Annual Conference on Information Science and Systems, CISS 2016, Princeton, NJ, USA, March 16–18, 2016, pp. 596–601.
  18. Keegan Kang and Giles Hooker (2016). “Block Correlated Deterministic Random Projections”. 5th Annual International Conference on Computational Mathematics, Computational Geometry and Statistics.
  19. Ben Athiwaratkun and Keegan Kang (2015). “Feature Representation in Convolutional Neural Networks”. arXiv preprint arXiv:1507.02313.

Service and Outreach Activities

  • Science Mentorship Programme (SMP) 2019-2020. Project titles are shown below.

    Wang Jingtao, Chiew Chun Jia, and Chin Yi Ling. “Application of Least Significant Bit Watermarking with Image Segmentation to Enhance Cloud Security” (2020)

    Andrew Chua, Amelyn Siew, and Randall So. “Scaling Behaviour of Key Rates With Transmittance and Optimisation of Angles for Optimal Quantum Key Distribution” (co-mentored with Tan Da Yang and Wong Wei Pin) (2020)

    Toh Jing En Daniel and Kan Rui Xian Matthew. “Investigating the James-Stein Paradox and Applying James-Stein Estimation to Machine Learning Problems” (2020)
    Project won the SUTD R&I Award for Artificial Intelligence at the Singapore Science and Engineering Fair (SSEF) 2021, and was published in IRC-SET 2021.

    Daniel Ng and Dylon Wong. “James-Stein Estimator Paradox” (2020)

    Glenda Chong Rui Ting, Tan Wee Le, Ryan Tan Zi Lin, and Keegan Kang. “New b-bit Minwise Hashing” (2019)
    Project won the 3M Best Project Award in the Computer Science / Mathematics category.

    Wang Hengyue, Kuo Hsin Wei, Ryan Nathaniel Thesman, and Keegan Kang. “Improving b-bit Minwise Hashing with Addition of Standard Vectors” (2019)
    Project entered the Final Judging Round of SSEF 2020, and was published in IRC-SET 2020.

    Kuai En Kai Ethan, Krithikh Gopalakrishnan, Tan Jun Wei, and Keegan Kang. “Improving Simple and Efficient Minwise Hashing with Extra Information” (2019)
    Project entered the Final Judging Round of the Singapore Science and Engineering Fair (SSEF) 2020.

    Leow May Gwen Veronica, Carol Gan Jianing, Caithlin Ho, and Keegan Kang. “Comparison of Different Hashing Algorithms Using Permutations with Binary Data” (2019)

  • I conduct the SUTD Academy course “(Statistical) Reading and Writing for the 21st Century”.
  • SkillsFuture Festival Activity: “Don’t Trust Numbers” (June 2018)

Previous Projects

  • “Big Data” and Theoretical Calculations Aided Molecular Design of Fluorophores: from Trial-and-Error to Molecular Engineering {co-PI} (IDG31800104)
    This was joint work with PI Liu Xiaogang and co-PIs Richmond Lee and Michinao Hashimoto, funded by a grant from the SUTD-MIT International Design Centre.
  • Identifying bottlenecks in teaching and learning mathematics at university {PI}
    This was joint work with co-PIs Wong Wei Pin, Nachamma Sockalingam, Sergey Kushnarev, and Tan Da Yang to understand the difficulties students face when transitioning to university mathematics, in order to develop a mathematics course that caters to their needs. This work was funded in part by the SUTD Faculty Early Career Award.

Honours and Awards

  • NUS High Inspiring Mentor Award for SMP (2020)
  • SMP Outstanding Mentor Award (2020)
  • SMP Outstanding Mentor Award (2019)
  • SUTD Faculty Early Career Award (2017)
  • Outstanding Graduate Teaching Assistant (2017)
  • Giving to Warwick student prize (2011)
  • Giving to Statistics student prize (2011)
  • Warwick Advantage gold award (2011)

Miscellaneous Links

  • When I was an undergraduate, I wrote a few revision guides and scribed lecture notes for some mathematics and statistics courses, which can be found here. They may still be of some use.
  • I once appeared on the BBC.