Career Profile
I am a solution-oriented data professional with strong analytical and numerical skills. I have ample experience analyzing large datasets and working with Big Data frameworks, broad knowledge of IT and data engineering, and a solid background in machine learning and econometrics. I am a networker who is equally capable of working independently: critical, innovative, and detail-oriented while keeping the larger picture in view. My reports are clear and comprehensive. I am results-focused, take responsibility, and keep the interests of all stakeholders in mind.
Experience
I am the technical lead of a team of specialized data engineers building a Kimball-style data lakehouse for Shell’s HR department. Shell runs Databricks on Azure; I implement the Databricks workflows with dbt and PySpark, test them using dbt, pytest and Great Expectations, and handle CI/CD with GitHub Actions.
I migrated legacy SQL Server solutions to the cloud, making extensive use of Python, PySpark, Spark SQL, Databricks, Airflow, Kubernetes, Snowflake, SQL Server, Teradata and related technologies. I also developed an integration testing framework for Nike’s custom data pipeline library.
I designed the data science capability for APG’s conversation platform, which will serve conversations with nearly 5 million participants. Working on an Azure platform, I used Databricks, PySpark, MLflow, BentoML and Docker.
I was responsible for developing data pipelines, which I implemented on an AWS Kubernetes (k8s) cluster, and I designed and built a data monitoring system on top of Elasticsearch.
Extracting source data from a MySQL database, I built machine-learning models and produced predictions with bootstrap prediction intervals.
I led the development of an automated workflow system for satellite remote sensing. Liaising between two development teams, I designed and implemented solutions using Spark, YARN, Docker, Linux, Airflow, Elasticsearch, PostgreSQL, Kafka, NiFi, Dask, Python and Pandas.
I worked with Python, R, time series modeling (ARIMA/ETS/Prophet/LSTM neural networks), neural networks/elastic net/MARS/Poisson regression, clustering techniques and cross-validation. I made intensive use of Python libraries such as Pandas, NumPy, scikit-learn, Keras/TensorFlow and rpy2.
I analyzed large datasets of financial transactions using Oracle, Hadoop, PySpark, Python, Kafka, Hive and SAS.
I guided a team of data scientists in applying advanced methods such as random forests, experimental design and calibration, implementing these methods in SAS.