Sebastian Raubach

Software / Web / Database Developer

My name is Sebastian Raubach. I am a Bioinformatics Database Developer at the James Hutton Institute.

I completed my Master of Science in Computer Science at RWTH Aachen University with distinction in 2012.

During my studies I focused on data mining and software development while maintaining my own small software projects.
I mainly develop in Java, but I also have experience with other OOP languages and scripting languages. Recently I started developing for Android.

This website contains some of my software projects as well as an up-to-date CV.

If you are interested in my skills and want to hire me, please send an e-mail to:

Email Address

Work Experience

  • Bioinformatics Database Developer
    Dec. 2012 - Present
    at the James Hutton Institute
    Responsibilities:
    • MySQL, Perl and Java programming
    • Development of a new web interface with GWT
    • Development of an Android interface for the Germinate Data Management System
    • Maintenance of the Seeds of Discovery website
    • Development of visualizations for the SeeD project
    • Visualization with JavaScript and d3.js
  • Database/Web Developer
    Oct. 2012 - Dec. 2012
    at the James Hutton Institute
    Responsibilities:
    • MySQL, Perl and Java programming
    • Germinate Data Management System
    • Visualization with JavaScript and d3.js

Education

  • Master Thesis
    Sep. 2011 - Mar. 2012
    Topic: "Spectral Projected Clustering on Graphs with Feature Vectors"
    • Grade: 1.0
    Clustering graphs annotated with feature vectors has recently gained much attention. The goal is to detect groups of vertices that are densely connected in the graph as well as similar with respect to their feature values. While early approaches treated all dimensions of the feature space as equally important, more advanced techniques consider the varying relevance of dimensions for different groups. In this work, we propose a novel clustering method for graphs with feature vectors based on the principle of spectral clustering. Following the idea of subspace clustering, our method detects for each cluster an individual set of relevant features. Since spectral clustering is based on the eigendecomposition of the affinity matrix, which strongly depends on the choice of features, our method simultaneously learns the grouping of vertices and the affinity matrix. To tackle the fundamental challenge of comparing the clustering structures for different feature subsets, we define an objective function that is unbiased regarding the number of relevant features. We develop the algorithm SSCG and we show its application for multiple real-world datasets.
  • Master of Science in Computer Science
    Apr. 2010 - Mar. 2012
    at RWTH Aachen University
    • Grade: 1.1
  • Bachelor Thesis
    Dec. 2009 - Mar. 2010
    Topic: "Fault-tolerant Subspace Clustering"
    • Grade: 1.3
    In today's applications, data analysis tasks are hindered by many attributes per object as well as by faulty data with missing values. Subspace clustering tackles the challenge of many attributes by cluster detection in any subspace projection of the data. However, it poses novel challenges for handling missing values of objects, which are part of multiple subspace clusters in different projections of the data. In this work, we propose a general fault tolerance definition enhancing subspace clustering models to handle missing values. We introduce a flexible notion of fault tolerance that adapts to the individual characteristics of subspace clusters and ensures a robust parameterization. Allowing missing values in our model increases the computational complexity of subspace clustering. Thus, we prove novel monotonicity properties for an efficient computation of fault tolerant subspace clusters. Experiments on real and synthetic data show that our fault tolerance model yields high quality results even in the presence of many missing values.
  • Bachelor of Science in Computer Science
    Oct. 2006 - Mar. 2010
    at RWTH Aachen University
    • Grade: 2.3

Publications

A Tool for Automated Evaluation of Algorithms

P. Kranen, S. Wels, T. Rohlfs, S. Raubach and T. Seidl

In Proceedings of the 2012 ACM Conference on Information and Knowledge Management, pages 2692 - 2694, 2012

Flexible Fault Tolerant Subspace Clustering for Data with Missing Values

S. Günnemann, E. Müller, S. Raubach and T. Seidl

In Proceedings of the 2011 IEEE International Conference on Data Mining, pages 231 - 240, 2011

Android apps

Desktop apps