Fotis Savva

Past & Current Projects:

Predicting Latencies in Networks
Predicting latencies between communicating hosts in large networks. Uses Matrix Factorization and Time Series Forecasting to predict the latency between arbitrary hosts from incomplete data.
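
To illustrate the Matrix Factorization part, here is a minimal sketch (not the project's actual code). It assumes measurements arrive as a dictionary of (source, destination) pairs and fits two low-rank factor matrices with plain stochastic gradient descent over the observed pairs only. The function names, the input format and all hyperparameters are hypothetical, and latencies are assumed to be scaled to a small range beforehand.

```python
import random


def factorize(observed, n_hosts, rank=2, lr=0.05, reg=0.001, epochs=300, seed=0):
    """Fit latency[i][j] ~ dot(U[i], V[j]) using only the observed pairs.

    observed: dict mapping (src, dst) -> measured latency (assumed format).
    """
    rng = random.Random(seed)
    # Small positive starting values for both factor matrices.
    U = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_hosts)]
    V = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_hosts)]
    pairs = list(observed.items())
    for _ in range(epochs):
        rng.shuffle(pairs)
        for (i, j), latency in pairs:
            err = latency - predict(U, V, i, j)
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                # Gradient step on squared error with L2 regularisation.
                U[i][k] += lr * (err * v - reg * u)
                V[j][k] += lr * (err * u - reg * v)
    return U, V


def predict(U, V, i, j):
    """Estimated latency from host i to host j, measured or not."""
    return sum(u * v for u, v in zip(U[i], V[j]))
```

After fitting, predict(U, V, i, j) can be called for a pair that was never measured, which is the point of working from incomplete data. The Time Series Forecasting component is not covered by this sketch.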

Explaining Aggregates for Exploratory Analytics
Building explanatory regression functions to facilitate large-scale data exploration. A tool to guide data analysts when exploring unknown data sets.

Query-Driven Learning for Approximate Query Processing
Building Machine Learning models that learn to predict the answers of queries using previously executed queries. This approach offers order-of-magnitude speedups in executing aggregate queries by trading off some accuracy.
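
The idea can be sketched with a deliberately simple stand-in model. The example below is an assumption for illustration and not the models used in the project: each past query is encoded as a numeric vector (for instance the centre and width of a range predicate) paired with its exact answer, and a new query is answered by a distance-weighted average over the k most similar past queries, without touching the underlying data. The function name and the query encoding are hypothetical.

```python
import math


def knn_predict(past_queries, new_query, k=3):
    """Estimate the answer of new_query from previously executed queries.

    past_queries: list of (query_vector, exact_answer) pairs (assumed format).
    new_query: vector with the same encoding as the stored query vectors.
    """
    # Keep the k past queries closest to the new one in query space.
    nearest = sorted(
        (math.dist(vector, new_query), answer) for vector, answer in past_queries
    )[:k]
    # Closer queries get larger weights; the epsilon avoids division by zero.
    weights = [1.0 / (distance + 1e-9) for distance, _ in nearest]
    weighted_sum = sum(w * answer for w, (_, answer) in zip(weights, nearest))
    return weighted_sum / sum(weights)
```

The speedup comes from replacing a scan over the data with a lookup over a much smaller log of queries. The price is an approximate answer, which is the accuracy trade-off mentioned above.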

Detecting Interesting Regions in Data
Using Query-Driven Learning models to automatically detect interesting regions in large data sets. This approach fuses Machine Learning and Evolutionary Optimization to discover regions that are potentially interesting to the user.

Exploratory Data Analysis for Chicago Crimes
An exploratory data analysis of the Chicago Crimes data set, focusing on its spatio-temporal dimensions. In this Kaggle kernel I constructed visualisations to answer questions that I had when I first encountered the data set.

Research Paper Recommender
This project aims to help researchers and other readers of scientific papers find new papers that might be of interest to them. It uses arXiv and recent developments in word embeddings (through the popular gensim library) to automatically fetch new papers for a given category. Fetched papers that are found to be similar are added to a specified folder.
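
One common way to compare papers with word embeddings is to average the word vectors of each abstract and take the cosine similarity of the two averages. The sketch below shows that idea under stated assumptions: the embeddings argument is a plain dictionary from word to vector standing in for a trained gensim model, and the function names and the 0.8 threshold are hypothetical rather than taken from the project.

```python
import math


def mean_vector(tokens, embeddings):
    """Average the vectors of all tokens that have an embedding."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_similar(abstract_a, abstract_b, embeddings, threshold=0.8):
    """True if the averaged embeddings of the two abstracts are close enough."""
    va = mean_vector(abstract_a.lower().split(), embeddings)
    vb = mean_vector(abstract_b.lower().split(), embeddings)
    if va is None or vb is None:
        return False
    return cosine(va, vb) >= threshold
```

With a real model, the dictionary lookup would be replaced by lookups into the trained word vectors, and the threshold would need tuning per category.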

Growing Networks
An implementation of the algorithm described in the paper "A self-organising network that grows when required". This algorithm can be used for clustering, vector quantization and dimensionality reduction tasks.

Cyprus Budget Explorer
This project aims to build a system for exploring the budget of the Republic of Cyprus. The data is available in PDF format here: http://bit.ly/2biYQ0y. Because the data is not available in any other format, the PDF file was parsed and cleaned using Tabula and Google Refine.

MapReduce/HBase code examples
As part of the Big Data course, I implemented a number of algorithms performing top-k queries, exploratory analysis and more. The code examples show how to use secondary sorting and how to efficiently leverage the clean-up phase of Mappers and Reducers.
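
The clean-up trick for top-k queries can be sketched in a few lines. The example below is a plain-Python simulation written for illustration, not the course code and not the Hadoop API: the class TopKMapper and the function top_k are hypothetical names that mimic a Mapper whose map step runs once per record and whose clean-up step runs once at the end of its input split.

```python
import heapq


class TopKMapper:
    """Simulated mapper that buffers its local top k and emits them at clean-up."""

    def __init__(self, k):
        self.k = k
        self.heap = []  # min-heap of the k largest (score, key) pairs seen so far

    def map(self, key, score):
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, (score, key))
        elif (score, key) > self.heap[0]:
            # The new record beats the current smallest candidate, so swap it in.
            heapq.heapreplace(self.heap, (score, key))

    def cleanup(self):
        # Emit at most k records per mapper instead of one per input record.
        return sorted(self.heap, reverse=True)


def top_k(splits, k):
    """Run one simulated mapper per split, then merge as a single reducer would."""
    candidates = []
    for split in splits:
        mapper = TopKMapper(k)
        for key, score in split:
            mapper.map(key, score)
        candidates.extend(mapper.cleanup())
    return heapq.nlargest(k, candidates)
```

Because every mapper forwards at most k records, the data shuffled to the reducer shrinks from the size of the input to k times the number of mappers. Secondary sorting is not covered by this sketch.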