North Shore Tavern Mound Visit: Answering your analytics questions

Nov 28, 2023 • 12:25 am

I often get analytics questions in my weekly Live Qs, and while I always try to answer them to the best of my ability, I try to keep answers to just a couple of sentences, which limits how in-depth I can go.

So I wanted to open up this week's Mound Visit to the DK Pittsburgh Sports readers for an analytical mailbag.

I could not get to every question asked, but I do want to thank everyone for the comments! I may have to do this again, and some questions may be expanded into their own Mound Visit down the road. But for now, let's take a look at five:

Sloshyj: "Over his last 200 PA starting August 8, Hayes had a .933 OPS that included 10 HR. Is this the long-awaited emergence of Hayes as an offensive threat? Did he have a breakthrough or was this just a streak? Wondering what we should expect from him in 2024."

I wrote a couple pieces on Hayes' footwork and how it was a catalyst for his offensive breakthrough last year. I always try to look for a "why" for the data rather than just buying into pure results, and he seemed a lot more comfortable at the plate and was able to repeat his mechanics more consistently, leading to better results.

To answer your question, I do think this was a breakthrough year, assuming he's able to maintain that good approach.

I've written for years that Hayes hits the ball hard (his 92.2 mph average exit velocity ranked in the 93rd percentile league-wide), but so much of that hard contact was wasted on ground balls. Last year, between his better timing cues and emphasis on getting backspin on the ball, his ground ball rate plummeted to levels we had barely seen before:

Mix in that he maintained his good contact rate and that's a formula for success. He is still hitting the ball hard, but if he can elevate it more, that should lead to more extra-base hits instead of ground outs, and it doesn't come with the trade-off of more whiffs.

His walk rate did drop last year, and I'm not sure if that .900+ OPS is maintainable over a full year. But assuming he's healthy, I saw a lot of good things from him starting in June. If he's able to post an OPS over .800 to pair with his Gold Glove defense, I think he stakes a claim as one of the best third basemen in the game.

bozii: The Pirates have said they are happy with the progress made by team in regards to hitting. To the average fan there does not seem to be much, if any, progress. What stats are the Pirates looking at to evaluate the hitters that we don't see?

I would argue first that even at surface level, the Pirates' offense improved in 2023. They did score over 100 more runs, after all (591 in 2022, 692 in 2023). With that said, they were still just 22nd in runs scored and 22nd in team OPS (.707). They need to take another big step to get to even average.

To answer your question, a lot of it has to do with swing decisions and approach. A big part of that is the Pirates, as a whole, didn't chase much in 2023. Their 25.6% chase percentage was the lowest in the National League, which is a two and a half point improvement from 2022 (that improvement was the second-best in the National League). They also swung at more pitches over the heart of the plate, as defined by Baseball Savant, going from 69.8% in 2022 to 73.2% in 2023. That may not seem like a huge shift, but continuing to trend in the right direction in terms of what to swing at will make a difference over time. Especially since last year was the first time since 2015 that they had a better than average hard-hit rate:

There was an improvement in run total last year, and peripherals like these are a big reason why. If you swing at the right pitch, you'll generally make better contact which should lead to better results.

Rutherford B: How much is a spike in BABIP or a high BABIP in a small sample size an indication that a hitter will regress? For example, thinking of a specific player, in Jared Triolo's high 2023 BABIP vs. future expectations?

Batting average on balls in play (BABIP) is an interesting stat in that the largest sample sizes are very uniform year by year, but the smaller a sample size you take, the less reliable it becomes. It's one of the ultimate small sample size red flag stats. League-wide, the average BABIP has ranged between .290 and .300 over the last 10 years, so there is only a couple points of fluctuation every season.

That's not to say every hitter is going to have a BABIP around .300. A speedy hitter is going to leg out more infield singles than Jacob Stallings. A hitter who makes consistently quality contact is almost certainly going to have a higher BABIP than a hitter who hits a lot of soft worm-killers. Using Triolo as an example, his BABIP was .440, but it's no coincidence that it really started to rise once he started hitting the ball harder:

I took a look at Triolo more in-depth in an earlier Mound Visit this offseason, so for anyone interested about him specifically, I would start there.

But as for how to use BABIP as a stat, it's difficult with these young hitters. I had an analyst once tell me it can take 1,000 to 2,000 plate appearances to get a true indicator of what their BABIP should be moving forward. Obviously you don't have that with young players, though you could dip into the minors a bit if you want some real back of the envelope math.
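To put a rough number on why small samples are so shaky, here is a minimal sketch. It assumes every ball in play is an independent coin flip with the same chance of falling for a hit, which is a simplification and not how any projection system actually works. The function name and the sample sizes are hypothetical, and note that it counts balls in play, which are only a portion of plate appearances.

```python
import math


def babip_standard_error(true_babip, balls_in_play):
    """Approximate one-standard-deviation noise in an observed BABIP.

    Assumption: each ball in play is an independent trial with a fixed
    hit probability (a simple binomial model, not a real projection).
    """
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    return math.sqrt(true_babip * (1 - true_babip) / balls_in_play)


# Hypothetical sample sizes for a hitter whose "true" BABIP is .300
for n in (100, 400, 1500):
    se = babip_standard_error(0.300, n)
    print(f"{n:>5} balls in play: .300 plus or minus about {se * 1000:.0f} points")
```

Under that simplified model, the noise is about 46 points of BABIP over 100 balls in play and closer to 12 points over 1,500, which lines up with the idea that it takes a long time for the number to settle.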

I'm more inclined to buy into expected stats now. I would refer to Andrew Krutz's analysis for Pitcher List on the difference between expected stats and BABIP for projecting future results, where he found that xBA has a better year-to-year correlation than BABIP. For Triolo, I took a look at xBA on just balls in play, minus homers, and it came out to .370. That's significantly lower than his .440 BABIP, but it indicates that he should be well above the league average. That also took about five extra steps of research to get to what I think most people would have assumed just watching Triolo or looking at his BABIP: This isn't maintainable, but the overall body of results is still really good.

Be mindful of sample sizes when it comes to stats like BABIP, but it's a good tool if you keep in mind that not everyone is going to hit around .300.
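For anyone who wants to run the number themselves, BABIP is one of the easier stats to calculate by hand: (H - HR) / (AB - K - HR + SF). Here is a quick sketch of that formula. The stat line in the example is made up for illustration and does not belong to any real player.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("stat line has no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical season line, not a real player's numbers
example = babip(hits=150, home_runs=20, at_bats=550, strikeouts=120, sac_flies=5)
print(f"{example:.3f}")  # prints 0.313
```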

Neil Pert: I'm sure you saw this Fangraphs piece on Henry. They are not bullish on his approach so far. Nutshell: He's not swinging at the right pitches to do damage. Question: do you agree, and what analytics do you look at when you see a young hitter who needs to improve as much as he does?

Here's the link for the story.

The short version is that Davis needs to improve his swing decisions, and the data does suggest that. He's also a rookie who had only a cup of coffee with Class AAA Indianapolis before being promoted to the majors and battled an injury and a new position. There was a lot on his plate, and I can't help but wonder if he was pressing. After all, his highest chase rates came right before he hit the injured list:

I'm keeping this answer short because I think Michael Baumann covered this subject well, but I would say Davis is young, there's plenty of room to grow and he's the type of player who usually rises to those personal challenges.

franciseii: Are Defensive Runs Saved (DRS) and Wins Above Replacement (WAR) calculable by the layman? The *value* of DRS & WAR as measures of overall player impact & defensive impact are clear to me. But I have as little idea of how one *arrives* at these numbers as Moses might have had about the Ten Commandments.

I had to keep the 10 Commandments kicker in there. I'm a sucker for a good closer.

Well, yes and no. I can give you the formula for WAR: (Batting Runs + Base Running Runs + Fielding Runs + Positional Adjustment + League Adjustment + Replacement Runs) / (Runs Per Win). I can even give more context for what that all means, or how DRS is calculated. But if we're talking about whether I can calculate WAR on the back of an envelope the same way I could slugging percentage or ERA, no. Heck, FanGraphs and Baseball Reference can't even agree, which is why there is some deviation between the two (though in many cases it's minimal).
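To show how simple the last step is once someone else has done the hard part, here is a sketch of that formula in code. Every input below is a made-up placeholder, not a real player's numbers, and the runs-per-win value of 10 is only a rough rule of thumb that I'm assuming for illustration. The real work, and the part a layman can't easily reproduce, is producing each of those run values in the first place.

```python
def position_player_war(batting_runs, base_running_runs, fielding_runs,
                        positional_adjustment, league_adjustment,
                        replacement_runs, runs_per_win):
    """Sum the run components and convert runs to wins."""
    if runs_per_win <= 0:
        raise ValueError("runs_per_win must be positive")
    total_runs = (batting_runs + base_running_runs + fielding_runs
                  + positional_adjustment + league_adjustment
                  + replacement_runs)
    return total_runs / runs_per_win


# Hypothetical inputs for illustration only
war = position_player_war(
    batting_runs=20.0,
    base_running_runs=2.0,
    fielding_runs=10.0,
    positional_adjustment=2.0,
    league_adjustment=1.0,
    replacement_runs=20.0,
    runs_per_win=10.0,  # assumed rule of thumb, varies by season
)
print(f"{war:.1f}")  # prints 5.5
```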

But I think this helps reinforce that analytics and statistics should be tools for evaluating players. They can help explain why a player is excelling or struggling, or make a compelling case for why one player is better than another, but there is a layer of subjectivity to all of this. And if you properly vet the source, it's ok to use WAR, even if it may not have quite come from the mountain top.