Kacper Noniewicz

Earlier work / Diploma project

Futura

My diploma project at HTBLA Grieskirchen: a team project in which I owned the entire ML pipeline, from raw market data to trained model to database output.

What it does

Takes historical stock data, engineers features from it, trains a classifier to predict next-day price direction, and writes results to MongoDB for the rest of the team's app.

Why I built it

Can you take historical stock data, engineer enough useful features from it, and train a model that predicts whether the price goes up or down the next day? Not a production trading system. A structured exercise in building an end-to-end ML workflow that actually runs.

How it works

Raw OHLCV data → ~100 technical indicators (TA-Lib) → Lag + date features → RFE feature selection (60 features) → LightGBM classifier (GPU) → Evaluation → MongoDB writeback

I pulled OHLCV data from Yahoo Finance, generated around 100 technical indicators with TA-Lib across multiple time horizons, added lag features and date signals, then used Recursive Feature Elimination to cut the noise down to 60 usable features.
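A minimal sketch of that feature stage. The project used yfinance for data and TA-Lib for its ~100 indicators; neither is assumed available here, so the prices are a synthetic random walk and two representative indicators (SMA, RSI) are computed by hand, with a decision tree standing in as the RFE estimator. All column names and parameters are illustrative, not the project's actual configuration.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeClassifier

# Synthetic random-walk closes stand in for a yfinance download
rng = np.random.default_rng(0)
n = 300
close = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, n)))
df = pd.DataFrame({"Close": close},
                  index=pd.bdate_range("2022-01-03", periods=n))

# Indicator examples (TA-Lib equivalents: talib.SMA, talib.RSI)
df["sma_20"] = df["Close"].rolling(20).mean()
delta = df["Close"].diff()
gain = delta.clip(lower=0).rolling(14).mean()
loss = (-delta.clip(upper=0)).rolling(14).mean()
df["rsi_14"] = 100 - 100 / (1 + gain / loss)

# Lag features and a date signal
for k in (1, 2, 5):
    df[f"ret_lag_{k}"] = df["Close"].pct_change(k)
df["day_of_week"] = df.index.dayofweek

# Target: does the price close higher tomorrow?
df["target"] = (df["Close"].shift(-1) > df["Close"]).astype(int)
df = df.iloc[:-1].dropna()  # drop the final row (no next day) and warm-up NaNs

# Recursive Feature Elimination: keep the k most useful features
X = df.drop(columns=["Close", "target"])
y = df["target"]
selector = RFE(DecisionTreeClassifier(random_state=0), n_features_to_select=3)
selector.fit(X, y)
print(list(X.columns[selector.support_]))
```

Scaled up (~100 indicators in, 60 features out), this is the shape of the actual pipeline stage.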

The model was a LightGBM classifier trained on GPU, with hyperparameter search via GridSearchCV. Results got written back to MongoDB so the rest of the team's app could consume predictions. The whole thing ran as a Jupyter pipeline. Messy in places, honest about what it was.

Stack

Python 3.10 · Jupyter · pandas · TA-Lib · LightGBM (CUDA) · scikit-learn · yfinance · MongoDB

Status

Archived educational work. The value is in demonstrating end-to-end ML workflow execution, not in claiming trading performance.