If you are trying to lose weight, there is no shortage of “information” available today from “experts” telling you how to do it. Take a look at the New York Times Best Seller list for Non-Fiction, and you will inevitably find a book about dieting. Pick up a random newspaper or visit a random news website. Chances are, there will be some sort of article about how to lose weight.
Take a look at this list of diets. People try anything and everything to lose weight. They hear stories about people going “low-carb” and losing a bunch of weight or about people going “Paleo” and losing a bunch of weight. Everyone knows someone who swears by a diet. But how often do you actually see data attached to such claims? Not very.
Over the last 11 weeks, I have been meticulously tracking everything I eat so as to determine how variation in my diet influences my weight fluctuation. A story isn’t enough for me–I want evidence. I want to know what really causes me to lose weight. And the only way to find that out, I’ve come to believe, is to track it.
Between August 29 and November 13 of 2014, I lost 30.2 pounds. I didn’t exercise at all; I only dieted. During that time frame,
- I lost 16.2% of my body weight.
- I went from weighing in at 186.7 pounds to weighing in at 156.5 pounds.
- I dropped 2 1/2 inches off my waist line.
- My BMI dropped from 27.97 to 23.45.
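A quick way to double-check figures like these is a few lines of Python. This sketch assumes the standard imperial BMI formula (703 times pounds, divided by inches squared) and uses the height of 5'8.5" mentioned later in this post; the function name `bmi` is just illustrative.

```python
# Figures quoted in the post.
start_lb = 186.7
end_lb = 156.5
height_in = 68.5  # 5'8.5", the height given later in the post


def bmi(weight_lb, height_in):
    """Body mass index from pounds and inches (imperial formula)."""
    return 703 * weight_lb / height_in ** 2


lost_lb = start_lb - end_lb
pct_lost = 100 * lost_lb / start_lb

print(f"Lost {lost_lb:.1f} lb ({pct_lost:.1f}% of starting weight)")
print(f"BMI: {bmi(start_lb, height_in):.2f} -> {bmi(end_lb, height_in):.2f}")
```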
Below is a nice before and after, but if you want the really good stuff, you can download my complete data set here.
Okay, so I know the above image does not show a particularly dramatic change, but I’m not trying to sell anything. I’m just trying to explore how one set of variables influences another–in this case, how dietary consumption influences weight change.
Although it’s quite likely there are some variables I’m leaving out that could affect my metabolic rate, I tried to be consistent for each weigh in. I didn’t exercise at all the entire time. I ate nothing after 11pm and weighed myself every morning (after using the restroom) between 5am and 6am–wearing the same amount of clothing (see above).
All in all, I measured 10 variables and found only 2 that influence weight fluctuation. Here are the results at the 99% confidence level (meaning that, for a variable to count, there had to be less than a 1% chance of seeing an effect that strong if the variable truly had no effect)…
- Calories: No Effect
- Fat: Increases Weight
- Saturated Fat: No Effect
- Cholesterol: No Effect
- Sodium: Increases Weight
- Potassium: No Effect
- Carbohydrates: No Effect
- Fiber: No Effect
- Sugar: No Effect
- Protein: No Effect
Now, several of these findings do not make much sense intuitively. I’m not a dietitian or a biologist, so I cannot explain to you why I got the results I did. All I can tell you is what happened to my body when I put these things into it.
If you look at the correlation matrix at the beginning of the post, you will see that many of these variables are highly correlated with weight loss. In particular, saturated fat and calories are highly correlated with weight loss. However, as any good statistician will tell you, correlation does not equal causation. For example, my weight loss is also highly correlated with the recent outbreak of Ebola in the Democratic Republic of the Congo, as well as with the declining temperature in the last few months; however, it would be a large leap in logic to conclude that these things have caused me to lose weight.
So, how do we tease out the variables that are actually explanatory from those which just happen to be occurring simultaneously? Statisticians use a method called multivariate regression. Without getting too technical, this approach discovers the explanatory power of any given variable by holding other variables in the model constant. Given my examples above, the method would look at the data and ask, “what happens on the occasions when the temperature stays the same and there are no new cases of Ebola? Is Doug still losing weight?” If it turns out that I am, there is probably something else explaining it.
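To make the idea of holding other variables constant concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `ols` helper, the variable names, and the numbers are invented and are not taken from the data set in this post. The weight change is constructed to depend on fat alone, while the temperature simply happens to fall over the same days.

```python
def ols(rows, y):
    """Least-squares coefficients for y ~ intercept + the columns in rows."""
    X = [[1.0] + list(r) for r in rows]  # prepend a column of 1s for the intercept
    k = len(X[0])
    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix.
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Made-up daily values: fat intake (g) and outdoor temperature (F) both decline.
fat = [80, 75, 78, 70, 66, 68, 60, 55, 58, 50]
temp = [75, 73, 70, 69, 66, 62, 61, 58, 54, 52]
# By construction, weight change depends on fat only.
change = [0.02 * f - 1.2 for f in fat]

# Temperature on its own looks like it "explains" the weight change...
_, slope_temp_alone = ols([[t] for t in temp], change)
# ...but with fat in the model, temperature's coefficient is essentially zero.
_, b_fat, b_temp = ols(list(zip(fat, temp)), change)

print(f"temperature alone: {slope_temp_alone:.4f}")
print(f"with fat in the model: fat={b_fat:.4f}, temperature={b_temp:.4f}")
```

On its own, temperature appears to drive the weight change. Once fat is included, the regression assigns the effect to fat and leaves temperature with a coefficient of roughly zero.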
After accounting for multicollinearity and throwing out Fiber and Potassium as variables adding unnecessary clutter to the model, I came up with the following regression output. Again, I would encourage you to download the data set to check my math and intuition, but the results were pretty much the same no matter how I played with the model.
Interpreting the Model
Okay, so you see in the image above a bunch of random numbers. If you are not trained in statistics, this will look like a bunch of gobbledygook. So, let me explain a few things. For the purposes of this post, there are only 5 items that really matter:
- Observations: This is important, because your model can’t be valid without enough samples. There is no hard-and-fast rule for how many samples you need, but most statisticians use a minimum of 30 as a rule of thumb. For my model, I tracked my weight change and dietary consumption for 77 days, so there are 77 observations.
- Significance F: This is the statistical validity of the model as a whole. If you want to say that your model is statistically significant at the 99% confidence level (which I do), this number cannot be greater than 0.01 (or 1%). Many statisticians use 95%, which would allow a Significance F of up to 0.05 (or 5%) instead. It all depends on how confident you want to be. As for me, I’m going with 99%.
- R Square: This number is about variation. It answers the question: what percentage of the variation in your output variable (the one you’re trying to explain, weight loss in my case) is explained by the variation in your other variables? In other words, if the model were a line, how close would the data points be to that line? My R Square is about 55%–not terrible, but not fantastic either. A lower R Square means less precision when you are trying to predict a particular occurrence.
- P Value: This is like the “Significance F” but for individual variables within the model. Again, since I am trying to find an explanation for my weight changes with 99% confidence, I am looking for P Values lower than 0.01. As you can see, there are only two in this model–fat and sodium. This means that, holding other variables constant, the only factors that explain changes in my weight are fat and salt consumption. Not carbs (sorry, Atkins!). Not cholesterol (sorry, Cheerios!). Not sugar (sorry, Dr. Brunetti!). Just salt and fat. Turns out, at least as far as I’m concerned, one author was 2/3 of the way right.
- Coefficients: These numbers explain the magnitude of the effect for each variable. So, I know that salt and fat influence my weight changes, but how much do they influence those fluctuations? That’s what the coefficients tell us. For the other variables in the model–sugar, carbs, and cholesterol–we would replace the coefficients with “0” since they are not statistically significant.
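For anyone curious how R Square is actually calculated, here is a minimal Python sketch using the standard definition (1 minus the unexplained variation divided by the total variation). The function name `r_squared` and the sample numbers are made up for illustration; they are not values from the regression in this post.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` that the predictions account for."""
    mean = sum(actual) / len(actual)
    ss_total = sum((a - mean) ** 2 for a in actual)                  # total variation
    ss_resid = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained part
    return 1 - ss_resid / ss_total


# Invented daily weight changes (lb) and what a model might have predicted.
actual = [-0.6, 0.2, -1.0, -0.2, 0.4, -0.8]
predicted = [-0.4, 0.0, -0.8, -0.4, 0.2, -0.6]

print(f"R Square: {r_squared(actual, predicted):.2f}")
```

The closer the predictions sit to the observed values, the smaller the unexplained part and the closer R Square gets to 1.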
Okay, so I’ve explained what everything here means. How do I put it together in order to determine how much fat and salt I should consume each day in order to maintain or lose weight? Here’s the formula:
So, if I input the data from the Regression Output above, here’s what it looks like:
For example, if I wanted to maintain weight, one combination could be consuming 55g of Fat and 3,000 mg of Sodium on a daily basis. If I followed the FDA’s recommendations, I would actually lose a tenth of a pound each day. Check out this table I created. You can change the data in the yellow cell and see how the weight would fluctuate based on varying levels of fat and sodium intake in the green cell.
So, did cutting back on fat and sodium cause me to lose weight? Well, no, not exactly. Sodium and fat do tend to be the variables that influence my weight fluctuation, but it turns out that I ended up eating–on average–the daily recommended amount of sodium. Here’s how my average daily consumption compares to the recommendations from the FDA:
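The approach described above (keep the coefficients of the significant variables, treat the rest as 0, and plug in a day's intake) can be sketched in Python. This is only an illustration: the function `predicted_change`, the variable names, and every number below are hypothetical placeholders, not the coefficients or P Values from the regression output in this post.

```python
def predicted_change(intercept, coefs, p_values, intake, alpha=0.01):
    """Predicted daily weight change (lb) from a fitted linear model.

    Coefficients whose P Value is not below `alpha` are treated as 0,
    following the approach described in the post.
    """
    total = intercept
    for name, coef in coefs.items():
        if p_values[name] < alpha:
            total += coef * intake[name]
    return total


# Placeholder values for illustration only (NOT the post's regression output).
intercept = -1.0
coefs = {"fat_g": 0.010, "sodium_mg": 0.0002, "sugar_g": 0.003}
p_values = {"fat_g": 0.001, "sodium_mg": 0.004, "sugar_g": 0.40}

day = {"fat_g": 60, "sodium_mg": 2500, "sugar_g": 90}
print(f"Predicted change: {predicted_change(intercept, coefs, p_values, day):+.2f} lb")
```

With these placeholders, sugar is ignored because its P Value is above 0.01, so changing the sugar intake leaves the prediction unchanged.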
As you can see, I consumed just about exactly what the FDA recommends with regard to sodium. However, I consumed 32% less fat than is recommended by the FDA. So, while sodium and fat both influenced my weight fluctuation, cutting back on fat throughout my diet made the real difference in helping me to shed those 30 pounds.
For those of you who are visual, here’s how my fat consumption changed as my weight fluctuation went from high (gaining weight) to low (losing weight):
So, How Do YOU Lose 30 Pounds in 11 Weeks?
All that I’ve discussed above, I will freely admit, can only be applied with certainty to me–or, to a lesser extent, others like me. A sample of 1 is high in a population of 1 (me), but it is extremely low in a population of millions (everyone who wants to lose weight, including perhaps you). So, if you’re asking me how I think you can lose weight, the honest answer is, “I don’t know.”
I can only tell you what I’ve done to lose weight. I’m a 27 year old, 5’8.5″ male with countless genetic and environmental factors influencing everything about me. Will cutting back on fat and sodium help you lose weight, too? I don’t know. It’s possible. But, I can tell you how to find out…
COLLECT THE DATA
If you want to know something for sure, whether it’s what causes you to lose weight, what kind of books you like to read, or what activities make you the happiest, the only way to find out is to measure it.
Life is data.
Step 1: Collect the Data
Use any kind of nutritional intake tracker. I used the MyFitnessPal app for the iPhone. Record everything you eat in the tracker. The next morning, after your food has digested, use the bathroom and then weigh yourself. Keep a spreadsheet in Excel (or whatever software you use), listing your weight each morning and the data from whatever nutritional variables you want to track.
To determine the effect of consumption on weight change, subtract the previous day’s weight from your current weight and put the resulting number on the row for the day before you weigh yourself. (You want the weight fluctuation to represent the food you consumed the previous day and not the food you are about to consume after weighing in).
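If you would rather compute that column with code than by hand, here is a minimal Python sketch of the alignment just described. The function name `daily_changes` is made up, and apart from the 186.7 starting weight the sample weigh-ins are invented for illustration.

```python
def daily_changes(morning_weights):
    """Weight change to record on each day's row.

    morning_weights[i] is the weigh-in on the morning of day i. The change
    for day i is (next morning's weight) - (this morning's weight), so it
    reflects the food eaten on day i. The last day has no next weigh-in
    yet, so its change is None.
    """
    changes = [round(nxt - cur, 1)
               for cur, nxt in zip(morning_weights, morning_weights[1:])]
    return changes + [None]


weights = [186.7, 186.1, 186.4, 185.6]  # invented after the first value
for day, (w, c) in enumerate(zip(weights, daily_changes(weights)), start=1):
    print(f"Day {day}: weighed {w}, change to record: {c}")
```

Rounding to one decimal place matches a scale that reads in tenths of a pound and hides floating-point noise from the subtraction.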
At the end of your project, your data set should look something like mine. Remember, track your consumption for at least 30 days, but the longer the better.
Step 2: Run a Correlation
If you haven’t already done so, install the Analysis ToolPak in Microsoft Excel. This will permit you to run correlations and regressions. Once you’ve installed the plug-in, go to your data set, select “Data Analysis” from the Data tab, and choose “Correlation” in the drop down menu. It should look something like this:
Once you select “Correlation”, click on the little box beside the “Input Range” cell. Then, highlight all of the variables (including the headers) for which you want to see correlations. Then, check the box for “Labels in First Row.” By default, the program will create a correlation in a new tab of your spreadsheet. Go ahead and click “OK.”
The result from this should be exactly what you see at the beginning of this post. A correlation doesn’t really explain how one thing causes another, but it does show you how different variables occur together. For example, in my model, saturated fat is highly correlated with weight loss but, in the end, I discovered that it was because fat influences weight change and saturated fat is highly correlated with fat. It takes some experience to begin to see relationships, but it can really help you in building toward your final model.
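If you don't have Excel handy, the same kind of correlation table can be produced with a short Python sketch. The helper names `pearson` and `correlation_matrix` are made up, and the three columns of numbers are invented for illustration rather than taken from the data set in this post.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlation_matrix(columns):
    """Correlation of every column with every other column."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Invented sample rows: one list per spreadsheet column.
columns = {
    "fat_g": [70, 55, 62, 48, 66],
    "sat_fat_g": [24, 18, 21, 15, 23],
    "weight_change": [0.4, -0.6, -0.2, -0.8, 0.2],
}

matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name.ljust(14), "  ".join(f"{v:+.2f}" for v in row.values()))
```

In this invented sample, fat and saturated fat move together almost perfectly, which is the kind of overlap that makes it hard to tell from correlations alone which variable is doing the work.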
Step 3, 4, 5, 6, etc.: Run Regressions
Okay, this is where it gets tricky. I’m going to show you how to run a regression in Excel, but you will probably have to do it more than once. What I’m showing includes only the variables I decided to include after playing around for a bit.
Go to the same place where you would go to run a correlation but, this time, select “Regression” from the drop down menu.
When the dialogue box pops up, click on the box beside the “Input Y Range” cell. You will then want to highlight only the data for your output variable (in this case, “weight change”)–and be sure to include the header. Then, click on the box beside the “Input X Range” cell. Highlight all of the data for your other variables, including the headers for the columns. Then, check the box that says “Labels.” By default, the “Confidence Level” will be set at 95. You can change it to 99 if you would like, but it isn’t necessary.
Go ahead and click “OK.” The result should be the regression that I ran earlier in this post.
Step 43: Take a Course on Statistics
There are dozens of great free tutorials on YouTube that show how to use multivariate regression, but if you really want to know how to do it, I would recommend taking a legitimate course on statistics. Go to your local college or university and see if you can take a course. Seriously, it’s that important. It’s worth shelling out $1,000, or whatever it might cost. Statistics can change your life.
Remember, this isn’t just about dieting. It’s about everything. If you know how to look at things statistically, you can look at them objectively. Yes, there is even an element of human judgment in statistics and biases can creep into the interpretation. But, at the end of the day, you’ll make much better decisions and form much stronger beliefs when they are based on the data.
So, you want my advice? Stop asking for advice.
Start collecting the data.