Resume
Harsh Sharma
Interests: Medical AI, Climate AI
Summary
I am an artificial intelligence expert pursuing a Master of Science in AI at Boston University, with experience researching and developing techniques in natural language processing and prompt engineering. I am skilled in data science and machine learning, with a proven track record of designing and implementing models that improve operational efficiency and reduce costs. I have applied these skills to real-world problems in vehicle lifecycle management, route optimization, and medical research. I am seeking a summer intern position where I can use my expertise to advance the field of artificial intelligence and contribute to the organization’s mission.
Check out my certifications: link
Education
Boston University
Master of Science in Artificial Intelligence
GPA: 3.8/4.0
Expected Dec 2023
Courses: Deep Learning (CS523), Image and Video Computing (CS585), Algorithms for Big Data (CS551)
Indian Institute of Technology, Guwahati
Bachelor of Technology in Engineering Physics
Minor in Electronics and Communication Engineering
Jul 2015 - Jun 2019
Courses: Data Structures and Algorithms, Pattern Recognition and Machine Learning, Parallel Computing
Currently
Graph Prediction on Medical Multimodal Data, Advisor: Prof. Vijaya Kolachalama
- Alzheimer's disease prediction using whole-slide images represented as graphs
- Creating heterogeneous graphs that combine data from multiple sources
Course Projects
Multi-object Tracking and Segmentation with BDD100k dataset
Image and Video Computing, CS585
- Improved the MOTSP metric over the current SOTA by developing an end-to-end framework combining MaskDINO for segmentation and a customized DeepSORT for tracking
Graphics Generation using Natural Language
Deep Learning, CS523
- Implemented a code generation and modification loop driven by natural language prompts, using Python and PyTorch
- Demonstrated adding and removing objects in CARLA simulations using GPT-3
Work Experience
Data Science Intern, WeaveGrid (EV B2B SaaS)
Jun 2023 - Aug 2023
Vehicle Park Time Prediction
- Implemented models predicting EV charging behavior, including plug-in times (average error <1.5 hours) and plug-in demand (average error <5 kWh)
- Reduced peak load by 30% for managed group EV charging, leading to substantial cost savings and significantly improved grid performance
Machine Learning Engineer, OlaElectric (Electric Two Wheeler Manufacturer)
Feb 2022 - Aug 2022
Vehicle Lifecycle Management
Predicting Battery Faults
- Designed attention-based LSTM models to predict battery faults with a recall of 80% using the past 100 km of driving data
- The model comprises two parts: the first LSTM forecasts the time series, and the second LSTM predicts whether the forecasted time series contains a fault
Data Pipeline Automation for Efficient Complaint Resolution
- Designed and deployed a data pipeline to access all organization data through RESTful APIs, reducing complaint resolution time by ~5 days (40%)
- The Django-based application runs ETL jobs, scheduled with Celery, that fetch new data from identified data sources and load it into an AWS bucket
Real-time Analytics using Apache Spark and Airflow
- Designed and implemented a real-time fault monitoring system for electric scooters using Apache Spark and Airflow.
- The system provided daily fault statistics and trend analysis to stakeholders via custom email templates
- Automated the data collection and presentation process, reducing manual intervention and improving efficiency.
Route Optimization
- Implemented a mixed integer programming solution that would reduce the cost of transporting electric bikes by ~USD 6M per year (30%)
- Reduced per-bike transportation cost by 30% and turnaround time (TAT), from order placement to fulfillment, by 20%
Technologies used: Python, PyTorch, PySpark, Django, PostgreSQL, Power BI
Machine Learning Engineer, Fractal Analytics (Management Consulting)
Jun 2019 - Feb 2022
Lead Generation for Relationship Managers, Standard Chartered (Multinational Bank)
- Created an ETL pipeline using Hive to clean and process transactional data used in 3 downstream applications
- Improved write speed into Hive tables by 50% by exploiting Hive’s properties in the internal Python module
Spot Award: Awarded for going above and beyond for the client
- Created and deployed an Elastic Net regression model that accurately forecasts retail sales and cut run time by 20 hours (80%) compared with the earlier deployment
- Success of the solution led to winning contracts for 7 more markets
Star Award: Awarded for out-of-the-box thinking and innovation in deploying solutions
Baseline Forecasting during COVID, Reckitt Benckiser (Consumer Goods Company)
- Built a new pipeline with a three-person team to forecast baseline sales for all retail stores in a market during the COVID period
- Created novel features based on COVID geographic data, which helped the model forecast accurately even in markets destabilized by COVID
- The model achieved <15% MAPE on unseen near-future forecasts
Entity Mapping, Standard Chartered (Multinational Bank)
- Developed an entity recognition model using a combination of GloVe embeddings and expectation maximization
- Enriched the embeddings using alternate names extracted from news data and by matching child entities
- Achieved a 90% match score on all client names within the organization
Feedback Capturing, Standard Chartered (Multinational Bank)
- Designed and implemented a scalable mechanism for capturing and processing feedback on generated signals, which was extended to 16 use cases
Technologies used: Python, PySpark, TensorFlow, Hadoop, Power BI
Projects
Contradictory Claims Identification, Coronawhy.org
link
Blog
Apr 2020 - Apr 2021
- Developed an end-to-end machine learning pipeline to identify contradictory claims from medical research papers
- Achieved an ROC-AUC score of 0.8 using a sentence-encoder model built with Python and PyTorch
Research
Bachelor’s Thesis Project, IIT Guwahati, Advisor: Prof. Prabin K Bora
link
May 2018 - Mar 2019
- Conducted a study of denoising techniques for speckle noise reduction in OCT images
- Designed a novel composite loss function in Keras and Python, resulting in a 3x improvement in the Speckle Suppression Index compared to existing methods
Network Analysis Intern, Center for Development of Advanced Computing
link
Pune, May 2018 - Jul 2018
- Implemented semi-supervised clustering in Python and C++ to identify anomalous network activities
- Annotated clusters to uncover malicious network access and malfunctioning devices
- Anomalous clusters helped identify open ports, malicious devices on the network, and devices whose malfunctioning software generated high levels of network activity