# Mathlas repository

This section is still a work in progress. The content already available is final, but the wording may change and some sections still need to be completed or expanded.

## Introduction

At Mathlas Consulting we developed a set of methods that we used on various projects for our clients. The mathlas repo, as we call it, is now publicly available here under an Apache 2.0 license.

This set of methods reflects the almost 4 years that Luis S. Lorente Manzanares, PhD, and Diego Alonso Fernández, PhD, spent at Mathlas, my own almost 3 years there, and our combined previous experience in the various fields required to assemble the collection.

The repo contains the project-independent parts of the projects we worked on and is composed of different packages, described here:

• analytical: Analytical function definitions, mainly used for testing optimization methods. Includes Branin, humps and Rosenbrock functions.
• dimensionality_reduction: Methods related to data dimensionality reduction, including Incremental Association Markov Blanket routines and categorizers.
• doe: Design of experiments methods, including LHC and others.
• geo: Geographical routines, including a fuzzy street name matcher (for use with Spanish land registry data) and a geographical distance calculator, amongst others.
• machine_learning: Various machine learning algorithms, including decision tree based methods (Random Forest, AdaBoost), neural networks & Hidden Markov Models.
• misc: Some stuff we found useful like colour codes for printing formatted text to an ANSI terminal, some graphical notifications code, a progress bar and a simple profiler.
• not_for_clients: A dragon exception I like to include to indicate conditions which should not happen.
• optimization: A number of numerical optimization routines using various algorithms.
• plotting: Various plotting routines on top of Matplotlib (to get corporate colours and use presentation-friendly font sizes by default), interactive map creation routines based on OpenLayers and some colourmaps.
• statistics: Various probability distributions, as well as statistical tests and some helper code.
• surrogate_modelling: Surrogate modelling routines, including several types of probability density mixture models, Kriging models and others.
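
To give a flavour of what the analytical package is for, here is a minimal sketch of the three test functions named above, written from their standard textbook definitions. It is not the code in the mathlas repo: the function names, signatures and default constants are assumptions made for illustration only.

```python
import math


def rosenbrock(x, y, a=1.0, b=100.0):
    """Two-dimensional Rosenbrock function (hypothetical signature).

    With the usual constants a=1, b=100 the global minimum is 0 at (1, 1).
    """
    return (a - x) ** 2 + b * (y - x ** 2) ** 2


def branin(x1, x2):
    """Branin function with its commonly used constants (hypothetical signature).

    Usually evaluated on x1 in [-5, 10], x2 in [0, 15], where it has
    three global minima of roughly 0.397887.
    """
    a = 1.0
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    r = 6.0
    s = 10.0
    t = 1.0 / (8.0 * math.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1.0 - t) * math.cos(x1) + s


def humps(x):
    """One-dimensional 'humps' function, following the classic MATLAB demo definition."""
    return 1.0 / ((x - 0.3) ** 2 + 0.01) + 1.0 / ((x - 0.9) ** 2 + 0.04) - 6.0
```

Functions like these are handy for testing optimizers because their minima are known in advance, so the result of an optimization run can be checked against the expected location and value.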

## References

While we developed new methods, we routinely relied on Christopher M. Bishop's excellent Pattern Recognition and Machine Learning book. I personally also found Jeremy Kun's Math ∩ Programming blog to be very instructive.

## Disclaimer

This page does not aim to provide a rigorous description of the implemented methods; instead, it serves as an informal explanation of what some of the methods do. Links to rigorous descriptions are provided where deemed appropriate.

## Contact Information

Should you have any questions about the Mathlas repo, please let me know at: joseba.gar@gmail.com