Monday, March 12, 2012

Pythagorean Expectation Part 1.5: 2011 San Francisco Giants

Wanted to get out another example of Pythagorean Expectation: the San Francisco Giants' 2011 season. Unless you are also a Giants fan, I cannot even begin to tell you how damn stressful it is to watch these games sometimes. They are torture. I don't know how many hours of sleep I've lost. Leading 3-2 in the bottom of the 9th. Brian Wilson on the mound. Walk. Out. Out. Walk. Walk. Count 3-2. Foul. Foul. Foul. Strike three. Whew. That's Giants baseball. One-run squeakers. We can do this all season. Not. They were just setting themselves up for an epic fail. Let's take a look at the season-ending numbers.

No surprise to anyone who follows baseball: the Giants were one of the bottom-feeders in offense, third to last in my ratings. However, they maneuvered themselves into an 86-76 record while scoring ONLY 3.52 runs per game. That's 570 runs scored all season. And how many runs did they allow? 578. Now what does that tell us?

How do the Giants have a winning record? They allowed more runs than they scored. What happens when you allow more runs? You lose. With 570 runs scored and 578 runs allowed, the Giants ended with a PW% of 0.494. Over 162 games, this translates to a Pwin of 79.97, giving us a Pdiff of -6.03: they won about six more games than their scoring says they should have. From what I remember, they already had a Pdiff of about -5.0 (estimate) at the All-Star Break. The Arizona Diamondbacks, however, had a positive Pdiff. The SF Giants were not going to repeat as division champions (although it got close at the end as the Dbacks tried to screw themselves over as well).
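For anyone who wants to check those numbers, here is a quick sketch in Python. It assumes the formula PW% = PS^x / (PS^x + PA^x) with the MLB exponent of 1.82 used on this blog; the variable names are my own.

```python
# Pythagorean check for the 2011 Giants, using the figures quoted in this post.
MLB_EXPONENT = 1.82      # exponent this blog uses for MLB
runs_scored = 570
runs_allowed = 578
games_played = 162       # 86 wins + 76 losses
actual_wins = 86

# PW% = PS^x / (PS^x + PA^x)
pw_pct = runs_scored ** MLB_EXPONENT / (
    runs_scored ** MLB_EXPONENT + runs_allowed ** MLB_EXPONENT
)
pwin = pw_pct * games_played    # expected wins over the season
pdiff = pwin - actual_wins      # negative: more real wins than expected

print(f"PW%   = {pw_pct:.3f}")  # 0.494
print(f"Pwin  = {pwin:.2f}")    # 79.97
print(f"Pdiff = {pdiff:.2f}")   # -6.03
```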

If you updated the stats daily, you could see the Pdiff for the Giants grow progressively more negative with all of their one- and two-run wins. When they lost a game, they would not lose by one. They lost by three, seven, thirteen. What happens as we allow more runs in the equation? The denominator gets bigger, giving us a lower win percentage. And then we lose. We lose. We lose. Alright, I'm done ranting. Peace.
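To see that drift in action, here is a rough sketch that updates Pdiff game by game. The scores below are made up to mimic the pattern (narrow wins, lopsided losses); they are NOT actual 2011 results, and the function name is just for illustration.

```python
MLB_EXPONENT = 1.82  # exponent this blog uses for MLB

# Hypothetical (runs scored, runs allowed) per game, not real results.
games = [(3, 2), (2, 1), (1, 7), (4, 3), (0, 13), (2, 1), (3, 6), (5, 4)]

def pythag_pct(scored, allowed, x=MLB_EXPONENT):
    """PW% = PS^x / (PS^x + PA^x)."""
    return scored ** x / (scored ** x + allowed ** x)

wins = total_scored = total_allowed = 0
history = []  # Pdiff after each game
for played, (scored, allowed) in enumerate(games, start=1):
    wins += int(scored > allowed)
    total_scored += scored
    total_allowed += allowed
    pdiff = pythag_pct(total_scored, total_allowed) * played - wins
    history.append(pdiff)
    print(f"game {played}: {scored}-{allowed}  record {wins}-{played - wins}  "
          f"Pdiff {pdiff:+.2f}")
```

In this toy log the team goes 5-3 while being outscored 37 to 20, so the record looks fine but the Pdiff keeps sinking.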

Pythagorean Expectation Part I: The Luck Factor

Let's start with the most basic of basics. How does a team win? Score more runs (or goals, or points) than you allow. That is true in any given game. It is also true over the course of a season: the team with the better run differential tends to win more often because it scores more than it allows. From this concept, the following formula, known as the Pythagorean Expectation, was created:

PW% = PS^x / (PS^x + PA^x)

where PW% = Pythag win percentage
      PS = Runs/Goals/Points scored
      PA = Runs/Goals/Points allowed
      x = exponent: 1.82 for MLB, 2.0 for NHL, 2.37 for NFL and 13.91 for NBA. More on this later.
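As a minimal sketch, the formula can be written as a small Python function. The exponents are the ones listed above; the function and dictionary names are my own, and the 105-to-100 scoring totals are made up purely to show how the exponent changes the result.

```python
# Exponents as listed in this post; other sources use slightly different values.
EXPONENTS = {"MLB": 1.82, "NHL": 2.0, "NFL": 2.37, "NBA": 13.91}

def pythag_win_pct(scored: float, allowed: float, league: str) -> float:
    """Return PW% = PS^x / (PS^x + PA^x) using the league's exponent x."""
    x = EXPONENTS[league]
    return scored ** x / (scored ** x + allowed ** x)

# The same 5% scoring edge (hypothetical totals) in each league:
for league in EXPONENTS:
    print(league, round(pythag_win_pct(105, 100, league), 3))
```

The bigger the exponent, the more a given scoring edge is worth in expected win percentage, which is why the same 5% edge counts for far more with the NBA exponent than with the MLB one.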


This formula approximates a team's win percentage based on how much it scores and allows. We can compare this approximation to the team's actual win percentage to determine whether a team has been "lucky" or "unlucky" and whether it has been underperforming or overperforming. Let's take a look at an example of how Pythag is calculated:

As of 3/12/12, the Boston Bruins (40-28) had scored 222 goals and allowed 164 goals. Plugging in the numbers gives the Bruins a PW% of 0.647, as opposed to their actual win percentage of 0.588. This implies that the Bruins have been either underperforming or unlucky. Have they been getting a lot of bad breaks? Have they been lackadaisical in their games? Have injuries become a factor? Over enough games, this "luck" tends to even itself out and the record regresses toward the Pythagorean Expectation. A PW% of 0.647 suggests that the Bruins are really a 44-win team with 24 losses. Note that PW% = 0.5 when PS = PA.
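Written out step by step, with the NHL exponent x = 2.0 and the 68 games played (40 + 28), the Bruins arithmetic looks like this:

```latex
\[
\mathrm{PW\%} = \frac{222^{2}}{222^{2} + 164^{2}}
              = \frac{49284}{49284 + 26896}
              = \frac{49284}{76180} \approx 0.647,
\qquad
0.647 \times 68 \approx 44 \text{ wins}.
\]
```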

Using this theory, when two teams are matched up, we can calculate each team's Pythagorean difference to find value in the matchup. It can be calculated as follows:

Pwin = PW% * (GP) 
Pdiff = Pwin - W

where Pwin = Pythag wins
      GP = Games played
      W = Actual wins
      Pdiff = Difference between Pwin and actual wins
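Here is a short sketch of those two steps in Python, run on the Bruins numbers from this post. The function name and argument names are mine, not any standard library.

```python
def pythag_wins(scored, allowed, games_played, actual_wins, x):
    """Return (Pwin, Pdiff), where Pdiff = Pwin - actual wins."""
    pw_pct = scored ** x / (scored ** x + allowed ** x)
    pwin = pw_pct * games_played
    return pwin, pwin - actual_wins

# Bruins as of 3/12/12: 40-28, 222 goals for, 164 against, NHL exponent 2.0
pwin, pdiff = pythag_wins(222, 164, games_played=68, actual_wins=40, x=2.0)
print(f"Pwin = {pwin:.1f}, Pdiff = {pdiff:+.1f}")  # Pwin = 44.0, Pdiff = +4.0
```

With this sign convention, a positive Pdiff means the team has fewer wins than its scoring suggests (underperforming or unlucky), and a negative Pdiff means it has more.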

The Boston Bruins have a Pwin of 44 and a Pdiff of 4.0. Tomorrow, 3/13/12, the Bruins play the Tampa Bay Lightning, who hold a Pdiff of -3.53. Put the two together, and the Bruins have a 7.53-game advantage in wins (4.0 minus -3.53). Looking at their actual records, the Bruins already look like the better team (the Lightning are 30-37). Take into account the "luck" and performance factor, and the Bruins become an even stronger favorite. By looking at the Pdiff alone, the Bruins should win the matchup. However, we must keep in mind that this is only the first filter; other statistics, such as goaltending and offensive power, must be taken into consideration.
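The matchup comparison itself is just a subtraction. A bare-bones sketch, using the two Pdiff values quoted above (the helper name is hypothetical, and this is only the first filter, not a full pick):

```python
def pdiff_edge(pdiff_team: float, pdiff_opponent: float) -> float:
    """Gap between two teams' Pdiff values; positive favors the first team."""
    return pdiff_team - pdiff_opponent

bruins_pdiff = 4.0        # from this post
lightning_pdiff = -3.53   # from this post

edge = pdiff_edge(bruins_pdiff, lightning_pdiff)
favored = "Bruins" if edge > 0 else "Lightning"
print(f"Pdiff edge: {edge:.2f} games, first filter favors the {favored}")
```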

More to come soon... and for those of you who don't believe in the "luck" factor, watch the following videos and think about it. Watch enough sports, and you'll start seeing some crazy, unbelievable and WTF moments.

Rangers vs. Giants: Game 2 of 2010 World Series


Sharks vs Avalanche: Game 3 of 2010 Western Conference Quarterfinals 

Sunday, March 4, 2012

Intro.


Welcome to Xeziometrix. Over the past year and a half or so, I have had several requests from friends asking about my approach to sabermetrics and how I evaluate team performance in professional sports. For those of you who are not familiar with the concept, just think Moneyball. Using statistics, Billy Beane and the Oakland A's revolutionized the way baseball is managed and played. Before that, Bill James used mathematical models to determine the value of players. I stumbled upon these ideas a few years ago when I got tired of losing in sports wagering. How did Las Vegas bookies settle on the point spread of a game? How did they determine the odds of a team winning straight up? These were some questions I asked myself. In the summer of 2010, with the help of a few people I've gotten to know along the way, I searched for the answers.

Prior to this, I never really enjoyed watching baseball. It was boring, and I never really understood it. However, because sabermetrics was founded on baseball's extensive statistics, that is where I had to start. It was also a great time to start tracking the sport as the San Francisco Giants went on to secure the World Series title that year. Call me a bandwagoner if you will.

My approach to sabermetrics is primarily focused on evaluating a team as a whole and predicting the winner of each game. I couldn't care less whether Albert Pujols' 10-year contract is really worth $254 million. The goal of this blog is to demonstrate how to interpret the numbers, how to exploit bookies' mistakes and public perception, and how to evaluate sports wagers and rank sports teams using algebra and statistics. Just look at the stats and find out what works and what doesn't. Simple as that. We will explore four major sports: MLB, NHL, NFL and NBA. Eventually, I will also provide a winning NCAAF system that hit at a rate of 68% last season. The beauty of this system is, you won't even have to think for a second when you make these picks.

Also, please be aware that I am no longer running any analyses or tracking any games besides my home teams' for the upcoming sports seasons. However, that does not mean I am not paying attention. I will not shut the door on what could be a great opportunity. Let me tell you now, once you get all these numbers in your head, it is difficult to watch a game without thinking about them. I am willing to share what I have personally learned and explain ideas utilized by sabermetricians in hopes that you will be able to build upon them and formulate new ideas. As games evolve and change, equations and focus will need tweaking as well. There is a saying, "We cannot direct the wind, but we can adjust the sails." Remember, whatever is published on the internet, Vegas has access to it. Develop your own methods to beat the game.

Questions and comments are encouraged. Please feel free to post to the blog.