My Résumé


Dailun Li

Room 404, 1910 Oxford St, Berkeley, CA, 94704 (510) 316-4347 | lidl200147@gmail.com

GitHub Repo: github.com/lidl2019

EDUCATION

————————————————————————————————————————————————–

University of California, Berkeley, United States Expected Graduation: 2024.6

McGill University, Montreal, Canada 2019.9 - 2023.1

  • Major in CS, Minor in Economics; CGPA 3.98/4.00; Dean’s Honour List

University of Oxford, Oxford, United Kingdom 2021.4 – 2021.7

  • 7 weeks of online research in deep learning on EEG signal processing

EXPERIENCE

————————————————————————————————————————————————–

Algorithm Engineer 2022.5 - 2022.8

ChromX Health Ltd., Guangzhou, China

  • Developed two techniques for evaluating mass spectra of breath samples to help predict lung cancer (specifically squamous cell carcinoma and adenocarcinoma). The methods use machine learning techniques such as PCA, logistic regression, and MLP on a dataset of 2216 sample points and over 400 chemicals.
  • Developed algorithms for wave detection, wave alignment, and wave separation to measure the magnitude of each chemical on the spectrum.
  • The two algorithms (one using preprocessed wave magnitudes, the other using data points on the mass spectra) robustly separate patients with malignant cancer from healthy people, with a 94.6% F1 score over 628 healthy subjects and 605 patients.
  • Typical extracted biomarkers, including isoprene and acetone, have been shown to be indicators of lung cancer.
Research Assistant 2022.5 - 2022.9

McGill University, Montreal, Quebec

  • Supervised by Professor Bettina Kemme
  • Helped formalize and implement operations for SQL support in AIDA, using pandas-like APIs. The implementation translates pandas-style transformations into SQL table operations to speed up processing. For example, a scheduler estimates the operation time and decides whether an operation runs locally or in a remote database.
Software Development Engineer (Intern) 2021.7 - 2021.9

Beijing Venustech Technology Ltd., Beijing, China

  • Developed web pages in Vue.js that monitor network streams to detect abnormal messages
  • Developed a honeypot system that analyzes attempts by malicious users
  • Used a logistic regression model to label abnormal messages, reaching 78% accuracy
  • Stored user request logs in a MySQL database using SQLAlchemy
Research Assistant 2021.4 - 2021.7

University of Oxford, Oxford, United Kingdom

  • Supervised by Professor Maarten De Vos
  • Collected human sleep EEG signals and analyzed the corresponding sleep quality with an RNN
  • The RNN model reaches 87.5% accuracy on 5-category classification over 5,600,000 samples

*The report, to be published on IEEE Xplore in late 2022, can be found at https://github.com/lidl2019/Automated-EEG-grading

PROJECTS

————————————————————————————————————————————————–

VITS Text-to-Speech Model Using Anime Characters
  • Convert .xp3 files to .wav files with a sampling frequency of 22050 Hz
  • Align the speech with the corresponding text extracted from subtitles
  • Apply transfer learning with VITS to the aligned speech
  • Evaluate the trained model by synthesizing speech with HiFi-GAN / WaveGlow
Detailed Analysis of Self-Attention in RNNs
  • Explore how the choice of dataset (Yelp, Xinhua, etc.) affects the self-attention mechanism
  • Explore the effects of various types of RNN models (LSTM, GRU) on the self-attention layer
  • Show that the self-attention layer can have a negative effect, contrary to common belief

*Code and report can be found at https://github.com/lidl2019/Comp550-Final

CNN for Multi-MNIST Handwritten Character Classifier
  • Implement a CNN classifier that classifies handwritten digits and letters in 52x52 images
  • Experiment with various CNN models including AlexNet, VGG-16, ResNet, etc.
  • Apply random forest, data augmentation, and data filtering to optimize performance
  • Reach 96.5% accuracy, surpassing human accuracy, and rank first among 86 teams

*Code can be found at https://github.com/lidl2019/Multi-minist-detector

CYK Parser of French Grammar
  • Implement a CYK parser for French grammar from scratch
  • Implement a TF-IDF vectorizer, the CYK algorithm, and parsers that convert CFG into CNF
  • Define a detailed French grammar in CFG (context-free grammar)

*Code can be found at https://github.com/lidl2019

AWARDS

————————————————————————————————————————————————–

  • Gold Medal, Canwa CUP Go Competition, Canada 2022
  • Faculty of Science Scholarship, McGill University, Canada 2020 - 2021
  • IEEExtreme 15.0, ranked 203rd globally and 4th in Canada, IEEE, U.S. 2020 - 2021
  • Dean’s Honour List, McGill University, Canada 2020 - 2021
  • Official Go player (5 dan), Canadian Go Association, Canada 2019
  • AP Scholar National Honour Roll, College Board, U.S. 2019

SKILLS AND LANGUAGES
————————————————————————————————————————————————–

  • Skills (Statistics): pandas, seaborn, bokeh, matplotlib, BeautifulSoup
  • Skills (Database): MySQL, MongoDB
  • Skills (Programming Languages): Python, Java, JavaScript, C, OCaml, Bash, MIPS
  • Skills (Web Framework): Django, Vue
  • Languages: English, Mandarin (Native), Japanese (Limited)
  • Additional Skills: Go, Guitar, Music Theory

Author: Dailun Li
Reprint policy: Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If you reproduce this content, please credit Dailun Li as the source.