Blog

2017

Machine Learning Algorithms Explained – Random Forests
Random Forests are supervised ensemble-learning models used for classification and regression. Ensemble learning models aggregate multiple machine learning models, allowing for better overall performance. The logic behind this is that each model is weak on its own, but the models are strong when combined in an ensemble. In the case of Random Forests... [Read more]
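
The ensemble logic in this teaser can be illustrated with a minimal sketch. This is not the Random Forest algorithm itself: it only computes the probability that a majority vote is correct when every model is right independently with the same probability. The independence assumption, the function name `majority_vote_accuracy`, and the values 0.6 and 101 are all illustrative assumptions.

```python
from math import comb


def majority_vote_accuracy(n_models: int, p_correct: float) -> float:
    """Probability that more than half of n independent voters are right."""
    needed = n_models // 2 + 1  # smallest strict majority
    return sum(
        comb(n_models, k) * p_correct**k * (1 - p_correct) ** (n_models - k)
        for k in range(needed, n_models + 1)
    )


# Assumed figures: each model is only slightly better than a coin flip.
single = majority_vote_accuracy(1, 0.6)
ensemble = majority_vote_accuracy(101, 0.6)
print(f"single model: {single:.3f}, ensemble of 101: {ensemble:.3f}")
```

Under these assumptions the vote of many weak models is far more reliable than any one of them. Real models are rarely fully independent, so the gain in practice is smaller.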

Fraud Detection by Stacking Cost-Sensitive Decision Trees
Recently, we published a research paper showing how it is possible to detect fraudulent credit card transactions with a high level of accuracy and a low number of false positives. By using ensembles of cost-sensitive decision trees, we can save up to 73 percent of losses stemming from fraud. Here’s how. [Read more]

Machine Learning Algorithms Explained – Decision Trees
In our new series, Machine Learning Algorithms Explained, our goal is to give you a good sense of how the algorithms behind machine learning work, as well as the strengths and weaknesses of different methods. Each post in this series will briefly explain a different algorithm. [Read more]

From Real-Time Learning to Reinforcement Learning with Asynchronous Feedback
Online, or real-time, transactional fraud detection systems have recently created quite the buzz in the info security industry. They are an appealing concept: Because we know that fraud patterns change over time, the ability to use machine-learning algorithms to automatically learn new patterns instantly allows us to have a stronger defense system. [Read more]

Building AI Applications Using Deep Learning
Recently, we have seen a huge boom around the field of deep learning; it is currently being implemented in a wide variety of fields, from driverless cars to product recommendation. In their most primitive form, deep learning algorithms originated in the 1960s... [Read more]

Classifying Phishing URLs Using Recurrent Neural Networks
In a recent research paper, we showed how we are able to detect with a high level of accuracy whether a website is a phishing site just by looking at the URL. This post lays out in greater detail how, by using a deep recurrent neural network, we’re able to accurately classify more than 98 percent of URLs... [Read more]

Machine Learning Explained
Machine learning models are often dismissed on the grounds of lack of interpretability. There is a popular story about modern algorithms that goes as follows: Simple linear statistical models such as logistic regression yield interpretable models. On the other hand, advanced models such as random forests or deep neural networks are black boxes, meaning it is nearly impossible to understand how a model is making a prediction... [Read more]

2016

5 Minutes with a Data Scientist: Alejandro Correa Bahnsen of Easy Solutions
Interview by James Powell from TDWI [Read more]

How to Use Isolation Forests for Anomaly Detection
Reprint Inside Big Data [Read more]

Benefits of Anomaly Detection Using Isolation Forests
One of the newest techniques to detect anomalies is called Isolation Forests. The algorithm is based on the fact that anomalies are data points that are few and different. As a result of these properties, anomalies are susceptible to a mechanism ... [Read more]
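
The "few and different" idea can be sketched in one dimension. This is a simplified illustration, not the full Isolation Forest algorithm (which builds many trees over random features and subsamples): it counts how many random splits are needed before a point ends up alone. The data, the helper names `isolation_depth` and `mean_depth`, and the trial count are hypothetical.

```python
import random


def isolation_depth(points, target, rng):
    """Count random splits needed until `target` is alone in its partition."""
    subset = list(points)
    depth = 0
    while len(subset) > 1:
        split = rng.uniform(min(subset), max(subset))
        # Keep only the side of the split that contains the target.
        if target < split:
            subset = [x for x in subset if x < split]
        else:
            subset = [x for x in subset if x >= split]
        depth += 1
    return depth


def mean_depth(points, target, trials=200, seed=0):
    """Average the isolation depth over several random runs."""
    rng = random.Random(seed)
    return sum(isolation_depth(points, target, rng) for _ in range(trials)) / trials


# Hypothetical data: a tight cluster plus one point far away.
data = [10.0 + 0.1 * i for i in range(10)] + [50.0]
outlier_depth = mean_depth(data, 50.0)
inlier_depth = mean_depth(data, data[5])
print(f"outlier: {outlier_depth:.2f} splits, inlier: {inlier_depth:.2f} splits")
```

The distant point is usually separated by the first split, while a point inside the cluster needs several. A short average path is what marks a point as a likely anomaly.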

The Technical Side of Phishing and How to Prevent It
Phishing, by definition, is the act of defrauding an online user and tricking them into clicking on a malicious link ... [Read more]

Applying Data Science to Fraud Prevention
Eighty thousand Kindle users. Sixty-five million Tumblr users. What do they have in common? Both groups had their login credentials breached, courtesy of hackers. While these attacks didn’t directly target financial ... [Read more]

Fraud Detection That Accounts for Misclassification Using Cost-Sensitive Logistic Regression
Fraud detection is a cost-sensitive problem, in the sense that falsely flagging a transaction as fraudulent carries a significantly different financial cost than missing an actual fraudulent transaction. In order to take these costs ... [Read more]
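
A small worked example shows why the two error types cannot be treated alike. This is a sketch under stated assumptions, not the cost model from the post: a false alarm is assumed to cost a fixed administrative fee, a missed fraud is assumed to cost the transaction amount, and correct predictions are assumed to cost nothing. The function name `total_cost` and all figures are hypothetical.

```python
def total_cost(y_true, y_pred, amounts, admin_cost):
    """Sum per-transaction costs of the two kinds of misclassification."""
    cost = 0.0
    for truth, pred, amount in zip(y_true, y_pred, amounts):
        if pred == 1 and truth == 0:
            cost += admin_cost  # false positive: assumed fixed review fee
        elif pred == 0 and truth == 1:
            cost += amount  # false negative: assumed loss of the full amount
    return cost


# Hypothetical transactions: 1 = fraud, 0 = legitimate.
y_true = [0, 0, 1, 1]
amounts = [20.0, 35.0, 500.0, 80.0]

flag_everything = [1, 1, 1, 1]  # two false positives
miss_one_fraud = [0, 0, 0, 1]   # one false negative

cost_a = total_cost(y_true, flag_everything, amounts, admin_cost=5.0)
cost_b = total_cost(y_true, miss_one_fraud, amounts, admin_cost=5.0)
print(cost_a, cost_b)
```

With these made-up numbers, the second set of predictions makes fewer mistakes yet costs far more, which is why plain accuracy is a poor yardstick here.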

Phishing Attack Analysis: Estimating Key Cluster Features and Why It’s Important
A recent report showed how we can gain a better understanding of phishing attacks and attackers by using cluster analysis. Subsequently, in a recent post we showed how ... [Read more]

Clustering of Phishing Attacks
Easy Solutions data scientists, including the author of this article, will present extensive research on phishing patterns and correlations between attacks. ... [Read more]

Evaluating a Fraud Detection Using Cost-Sensitive Predictive Analytics
A credit card fraud detection algorithm identifies those transactions with a high probability of being fraudulent, based on historical fraud patterns. ... [Read more]

2015

Feature Engineering for Fraud Detection Models
As cybercriminals are constantly updating their strategies to avoid being detected, traditional fraud detection tools, such as expert rules, are less effective as ... [Read more]

Fraud Detection with Advanced Outlier Detection Algorithms
Online fraud costs the global economy more than $400 billion, with more than 800 million personal records stolen in 2013 alone. Increasingly, fraud has diversified to different digital channels ... [Read more]

Hello! Let Me Introduce Myself
As one of the newest employees at Easy Solutions, I’d like to take this opportunity to introduce myself. I am joining the Company as a Data Scientist. Before becoming part of Easy Solutions, I spent my time working at SIX Financial Services ... [Read more]