Machine Learning

Federated Learning

Growing concerns about privacy and unrestricted access to data led to the development of a machine learning paradigm called Federated Learning (FL). FL borrows many ideas from distributed machine learning, but because the models are trained on edge devices, it raises its own challenges, which make it an interesting engineering problem. Google introduced FL in 2016, and since then research has been active in several areas, including federated optimization algorithms, model and update compression, differential privacy, robustness and attacks, federated GANs, and privacy-preserving personalization. Many challenges in building federated machine learning systems remain open. This project focuses on two of them, the communication bottleneck and non-IID data (data that is not independent and identically distributed across devices), and on their effect on model performance. Both issues are characterized on a baseline model, model performance is evaluated, and ways to overcome them are discussed.
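To make the setting concrete, here is a minimal sketch of one common federated optimization scheme, federated averaging, run on deliberately non-IID clients. It is an illustration under stated assumptions, not the baseline model used in this project: the one-dimensional linear model, the synthetic data, and every function name and hyperparameter below are hypothetical. Each client trains locally on its own slice of the input range, and only the model parameters (never the raw data) are sent back to the server, once per round, which is where the communication cost arises.

```python
import random


def local_train(weights, data, lr=0.02, epochs=5):
    """Run a few gradient-descent epochs on one client's data (model: y = w*x + b)."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Server step: average client parameters, weighted by local sample count."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def make_non_iid_clients(num_clients=4, samples_per_client=50, seed=0):
    """Hypothetical non-IID split: client k only ever sees inputs x in [k, k+1]."""
    rng = random.Random(seed)
    clients = []
    for k in range(num_clients):
        data = []
        for _ in range(samples_per_client):
            x = k + rng.random()
            y = 3.0 * x + 1.0 + rng.gauss(0, 0.1)  # assumed ground truth plus noise
            data.append((x, y))
        clients.append(data)
    return clients


def mse(weights, clients):
    """Mean squared error of the global model over all clients' data."""
    w, b = weights
    points = [p for data in clients for p in data]
    return sum((w * x + b - y) ** 2 for x, y in points) / len(points)


def run_rounds(clients, rounds=200):
    """Each round costs one model download and one model upload per client."""
    global_weights = (0.0, 0.0)
    sizes = [len(d) for d in clients]
    for _ in range(rounds):
        updates = [local_train(global_weights, d) for d in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


if __name__ == "__main__":
    clients = make_non_iid_clients()
    final = run_rounds(clients)
    print("global model (w, b):", final)
    print("global MSE:", mse(final, clients))
```

In this sketch, raising epochs reduces the number of communication rounds needed but lets each client drift further toward its own local data, which is the tension between the communication bottleneck and non-IID data described above.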

Movie Rating Prediction

The Hollywood movie production business relies on a low-tech decision-making process, driven largely by instinct and personal contacts, to choose the portfolio of movies that a production house funds in any given year. Movie stars and their agents use the same kind of process to decide which projects to pursue and which to pass on. The result is a high degree of variation in project success, as measured by gross box office receipts and return on investment. Most production houses take a portfolio-driven approach and diversify their risk across low-, medium-, and high-budget movies. I have tried several data-centric ML approaches to this predictive problem.