I am a software, database and website developer.
Please visit the links below or scroll down and explore this page to learn more about me.
I grew up near Cologne and studied Computer Science in Aachen, Germany. After finishing my Bachelor's and Master's degrees (the latter with distinction) with a focus on software development and data mining, I moved to Scotland to join the James Hutton Institute's Information & Computational Sciences Group as a Bioinformatics Software Developer. I spend my time working with international collaborators, gathering requirements and developing new features for our software tools.
I write various kinds of software. From full-stack web development to desktop and Android development - I've got it covered. I mainly work on projects developed by small teams, so I'm involved in all development steps:
Clustering graphs annotated with feature vectors has recently gained much attention. The goal is to detect groups of vertices that are densely connected in the graph as well as similar with respect to their feature values. While early approaches treated all dimensions of the feature space as equally important, more advanced techniques consider the varying relevance of dimensions for different groups. In this work, we propose a novel clustering method for graphs with feature vectors based on the principle of spectral clustering. Following the idea of subspace clustering, our method detects, for each cluster, an individual set of relevant features. Since spectral clustering is based on the eigendecomposition of the affinity matrix, which strongly depends on the choice of features, our method simultaneously learns the grouping of vertices and the affinity matrix. To tackle the fundamental challenge of comparing the clustering structures for different feature subsets, we define an objective function that is unbiased regarding the number of relevant features. We develop the algorithm SSCG and demonstrate its application on multiple real-world datasets.
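To give a flavour of the spectral-clustering backbone the abstract refers to, here is a minimal, self-contained sketch in Python: it builds a Gaussian affinity matrix from feature-weighted distances, takes the smallest eigenvectors of the graph Laplacian, and clusters the resulting embedding. The fixed `weights` vector stands in for the per-cluster feature relevance that SSCG learns jointly with the grouping; everything here (function names, parameters, the toy data) is illustrative, not the paper's actual algorithm.

```python
import numpy as np

def spectral_clusters(X, weights, k, sigma=1.0, iters=50):
    """Toy spectral clustering on a feature-weighted affinity matrix.

    `weights` scales each feature's contribution to the affinity -- a
    fixed stand-in for the feature relevance SSCG learns per cluster.
    """
    Xw = X * weights                                   # emphasise relevant features
    sq = ((Xw[:, None, :] - Xw[None, :, :]) ** 2).sum(-1)
    A = np.exp(-sq / (2 * sigma ** 2))                 # Gaussian affinity
    L = np.diag(A.sum(1)) - A                          # unnormalised graph Laplacian
    _, vecs = np.linalg.eigh(L)
    emb = vecs[:, :k]                                  # k smallest eigenvectors
    # plain k-means on the spectral embedding, farthest-point initialisation
    centers = [emb[0]]
    for _ in range(1, k):
        d = np.min([((emb - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(emb[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        labels = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = emb[labels == j].mean(0)
    return labels

# two well-separated blobs in the first feature; the second feature is noise
X = np.vstack([np.random.default_rng(1).normal(0, 0.1, (10, 2)) + [0, 0],
               np.random.default_rng(2).normal(0, 0.1, (10, 2)) + [5, 0]])
labels = spectral_clusters(X, weights=np.array([1.0, 0.0]), k=2)
```

With the noise feature weighted to zero, the affinity matrix is nearly block-diagonal and the two blobs separate cleanly; with equal weights the separation degrades, which is exactly why choosing the feature subspace and the affinity matrix together matters.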
In today's applications, data analysis tasks are hindered by many attributes per object as well as by faulty data with missing values. Subspace clustering tackles the challenge of many attributes by cluster detection in any subspace projection of the data. However, it poses novel challenges for handling missing values of objects, which are part of multiple subspace clusters in different projections of the data. In this work, we propose a general fault tolerance definition enhancing subspace clustering models to handle missing values. We introduce a flexible notion of fault tolerance that adapts to the individual characteristics of subspace clusters and ensures a robust parameterization. Allowing missing values in our model increases the computational complexity of subspace clustering. Thus, we prove novel monotonicity properties for an efficient computation of fault tolerant subspace clusters. Experiments on real and synthetic data show that our fault tolerance model yields high quality results even in the presence of many missing values.
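The core idea of fault-tolerant membership can be sketched with a simple test: an object belongs to a subspace cluster if every relevant attribute it actually has a value for lies close to the cluster, and the number of missing relevant attributes stays within a tolerance. This is purely illustrative with fixed thresholds (`eps`, `max_missing`); the paper's notion instead adapts the tolerance to each cluster's individual characteristics.

```python
import math

def belongs(obj, center, relevant_dims, eps=0.5, max_missing=1):
    """Fault-tolerant membership test for a subspace cluster (sketch).

    `relevant_dims` is the cluster's subspace; NaN marks a missing value.
    Accept if at most `max_missing` relevant attributes are missing and
    every observed relevant attribute is within `eps` of the center.
    """
    missing = sum(1 for d in relevant_dims if math.isnan(obj[d]))
    if missing > max_missing:
        return False
    return all(abs(obj[d] - center[d]) <= eps
               for d in relevant_dims if not math.isnan(obj[d]))

nan = float("nan")
center = [1.0, 5.0, 9.0]
belongs([1.2, nan, 9.1], center, relevant_dims=[0, 2])  # True: fits on both relevant dims
belongs([1.2, 5.0, nan], center, relevant_dims=[0, 2])  # True: one missing relevant dim tolerated
belongs([1.2, nan, 3.0], center, relevant_dims=[0, 2])  # False: dim 2 deviates too far
```

Note that only missing values in the cluster's *relevant* dimensions count against the tolerance; a NaN in an irrelevant dimension (like dim 1 here) is ignored, which is what lets one object participate in different clusters in different projections.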