It’s Final Four time in the NCAA Tournament, and my next step was to create a model to give me another set of predictions for this year’s tournament and see how it does with the data I have. The last part will come next week, after the tournament ends, when I will try to find the best model, in some cases using fewer statistics based on some feature analysis, and then find the best parameters for it. But for this one I wanted to get a base model down.

For my model I went with XGBoost. This…


Wow, the tournament has been absolutely wild so far, with exciting games and ruined brackets everywhere. It was an exciting year for me: I had a perfect bracket for all of an hour and a half, then the first game ended and with it my perfect bracket dream. But what can we pull out of this weekend’s games from the first 2 rounds statistically?

As I said in part 1, sports in general are very hard to predict because they are played by people, not robots, and people are impossible to predict. But millions of people try anyway, myself included…


March Madness is so many people’s favorite time of the year. Everyone who enjoys college basketball, and even some who don’t, always has something to root for. Maybe it’s the great underdog stories, maybe it’s your alma mater, or maybe, if you’re like me, you’re trying to beat your friends in that March Madness bracket challenge without having really watched any college basketball throughout the year. But not to worry, that’s where data analysis comes into play. So that’s what I did, and I’m going to show it over a series of posts here.

First let me start with some quick background…


Until recently, a player’s position in the NBA was never something that was argued over or something people tried to change. However, about 6 or 7 years ago teams started to realize that a player’s skill set might be a more effective way to assign positions. Traditionally, a player’s position has been decided by size: a shorter player is a point guard, a bigger player is a center, and everyone else falls in between. So a player like PJ Tucker, who was a small forward just 4 years ago, is now playing center or power forward for the Houston Rockets exclusively because he…


Winning basketball games is hard, and every year a lot of different people across 30 teams put in a lot of work to do just that. With this project I’m trying to help by finding some in-game factors that lead to these wins. I took team data from the last decade, ran models to predict wins based on those stats, and compared the predictions to the actual results. …


A project with Jude Buenaseda

This project classifies the functionality of wells in Tanzania, a country in East Africa known for its safaris and Serengeti National Park. It is one of the fastest growing economies in Africa, but there are still a lot of communities that get left behind, especially in rural areas. One of the biggest issues in these communities is access to clean water. Only about 50% of the population have access to safe water. And the other half? They collect water from wells that are sometimes far away and sometimes nonfunctional. …


I decided to write this blog while thinking about a project idea I had: to analyze soccer statistics and make predictions. I ran into a problem, which is that soccer statistics in general are not great. The base statistics that everyone talks about and argues over are things like goals and assists, which are not a great measure of value by themselves. Obviously, the whole point of the game is to score, so goals are not entirely dismissible, but there is so much more that goes into the value of a player than just how many goals they score. But I…


Classification algorithms are supervised learning methods that categorize data into classes. For example, if you wanted to sort dogs into large or small, you could use a classification model to take the data for each dog and label it as either large or small. Data used for a classification model can be structured or unstructured. Classification models can output binary results: yes or no, 1 or 0, win or loss, etc. They are very commonly used in speech recognition, face detection systems, etc.
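
To make the dog example concrete, here is a minimal sketch in plain Python. It uses a nearest-centroid rule, which is just one simple way to build a classifier and is my choice for illustration only. The weights, heights, and the helper names `fit` and `predict` are all made up for this example.

```python
# Toy binary classifier: label dogs "large" or "small" from weight and height.
# All numbers below are invented for illustration; they are not real data.

def centroid(points):
    """Average each feature across a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def fit(samples, labels):
    """Learn one centroid per class from labeled training samples."""
    by_class = {}
    for features, label in zip(samples, labels):
        by_class.setdefault(label, []).append(features)
    return {label: centroid(pts) for label, pts in by_class.items()}

def predict(model, features):
    """Return the class whose centroid is closest (squared Euclidean distance)."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(model, key=lambda label: dist2(model[label]))

# Hypothetical training examples: (weight in kg, shoulder height in cm)
train_X = [(4, 25), (7, 30), (9, 35), (30, 60), (38, 65), (45, 70)]
train_y = ["small", "small", "small", "large", "large", "large"]

model = fit(train_X, train_y)
print(predict(model, (6, 28)))   # a light, short dog
print(predict(model, (40, 68)))  # a heavy, tall dog
```

The "supervised" part is the `train_y` list: the model only learns what large and small mean because someone labeled the examples first.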

Classifiers are algorithms that map input data into specific…


Video game movies have been very popular in general. Adaptations like Warcraft or The Witcher come out with a large number of people already interested in the game. So what games out there now could follow in the footsteps of those adaptations and become extremely popular, blockbuster-type movies? To investigate this we looked at data from 3 main sites: Metacritic, which had critic and user scores as well as the number of reviews; GameSpot, which had critic reviews in its API; and VGChartz, which had game sales. …


Sports betting has always been a big part of sports, whether it’s just between friends or betting on Vegas lines. I’m not a big gambler myself, but I always like to predict which teams will win against the spread. Before I get into that, here is some information on Vegas lines for those who don’t know. The spread is the margin by which Vegas predicts one team will beat another. So if the Knicks are favored (just an example, obviously) by 3.5 points, then to win back the same amount of money you put in, the Knicks would have to win…

Antonio Hila
