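Over the past 30 years, McDonald's has come to recognize the importance of a balanced diet. The company has spent years exploring and developing healthier menus, running studies and trials, and completing laboratory certification of its official suppliers and other sources. Rigorous research on McDonald's food is said to show that it can be part of a healthy lifestyle. Some qualified professionals from the American Dietetic Association have found that McDonald's items such as salads, soups, grilled sandwiches, and fruit pies are genuinely healthy.
Although McDonald's serves a lot of hamburgers and French fries, eating them in moderation causes no problems. But because the food is so tasty, many children like to eat large amounts of it, and that is where the problems come from: eating too much of anything is bad for your health. Many people are interested in the new McDonald's nutrition guide because they need to lose weight yet often eat this kind of fast food. The issue, however, is not as simple as controlling calorie intake.
This project analyzes the nutritional content of the McDonald's menu and lets the data speak.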
# Import the required packages
import os
import pandas as pd
# Path to the dataset
dataset_path = '../data'
datafile = os.path.join(dataset_path, 'menu.csv')
# Load the data
menu_data = pd.read_csv(datafile)
# Preview the data
menu_data.head()
| | Category | Item | Serving Size | Calories | Calories from Fat | Total Fat | Total Fat (% Daily Value) | Saturated Fat | Saturated Fat (% Daily Value) | Trans Fat | ... | Carbohydrates | Carbohydrates (% Daily Value) | Dietary Fiber | Dietary Fiber (% Daily Value) | Sugars | Protein | Vitamin A (% Daily Value) | Vitamin C (% Daily Value) | Calcium (% Daily Value) | Iron (% Daily Value) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Breakfast | Egg McMuffin | 4.8 oz (136 g) | 300 | 120 | 13.0 | 20 | 5.0 | 25 | 0.0 | ... | 31 | 10 | 4 | 17 | 3 | 17 | 10 | 0 | 25 | 15 |
1 | Breakfast | Egg White Delight | 4.8 oz (135 g) | 250 | 70 | 8.0 | 12 | 3.0 | 15 | 0.0 | ... | 30 | 10 | 4 | 17 | 3 | 18 | 6 | 0 | 25 | 8 |
2 | Breakfast | Sausage McMuffin | 3.9 oz (111 g) | 370 | 200 | 23.0 | 35 | 8.0 | 42 | 0.0 | ... | 29 | 10 | 4 | 17 | 2 | 14 | 8 | 0 | 25 | 10 |
3 | Breakfast | Sausage McMuffin with Egg | 5.7 oz (161 g) | 450 | 250 | 28.0 | 43 | 10.0 | 52 | 0.0 | ... | 30 | 10 | 4 | 17 | 2 | 21 | 15 | 0 | 30 | 15 |
4 | Breakfast | Sausage McMuffin with Egg Whites | 5.7 oz (161 g) | 400 | 210 | 23.0 | 35 | 8.0 | 42 | 0.0 | ... | 30 | 10 | 4 | 17 | 2 | 21 | 6 | 0 | 25 | 10 |
5 rows × 24 columns
# Dataset summary (column types and non-null counts)
menu_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 260 entries, 0 to 259
Data columns (total 24 columns):
Category 260 non-null object
Item 260 non-null object
Serving Size 260 non-null object
Calories 260 non-null int64
Calories from Fat 260 non-null int64
Total Fat 260 non-null float64
Total Fat (% Daily Value) 260 non-null int64
Saturated Fat 260 non-null float64
Saturated Fat (% Daily Value) 260 non-null int64
Trans Fat 260 non-null float64
Cholesterol 260 non-null int64
Cholesterol (% Daily Value) 260 non-null int64
Sodium 260 non-null int64
Sodium (% Daily Value) 260 non-null int64
Carbohydrates 260 non-null int64
Carbohydrates (% Daily Value) 260 non-null int64
Dietary Fiber 260 non-null int64
Dietary Fiber (% Daily Value) 260 non-null int64
Sugars 260 non-null int64
Protein 260 non-null int64
Vitamin A (% Daily Value) 260 non-null int64
Vitamin C (% Daily Value) 260 non-null int64
Calcium (% Daily Value) 260 non-null int64
Iron (% Daily Value) 260 non-null int64
dtypes: float64(3), int64(18), object(3)
memory usage: 48.8+ KB
menu_data.describe()
| | Calories | Calories from Fat | Total Fat | Total Fat (% Daily Value) | Saturated Fat | Saturated Fat (% Daily Value) | Trans Fat | Cholesterol | Cholesterol (% Daily Value) | Sodium | ... | Carbohydrates | Carbohydrates (% Daily Value) | Dietary Fiber | Dietary Fiber (% Daily Value) | Sugars | Protein | Vitamin A (% Daily Value) | Vitamin C (% Daily Value) | Calcium (% Daily Value) | Iron (% Daily Value) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 260.000000 | 260.000000 | 260.000000 | 260.000000 | 260.000000 | 260.000000 | 260.000000 | 260.000000 | 260.000000 | 260.000000 | ... | 260.000000 | 260.000000 | 260.000000 | 260.000000 | 260.000000 | 260.000000 | 260.000000 | 260.000000 | 260.000000 | 260.000000 |
mean | 368.269231 | 127.096154 | 14.165385 | 21.815385 | 6.007692 | 29.965385 | 0.203846 | 54.942308 | 18.392308 | 495.750000 | ... | 47.346154 | 15.780769 | 1.630769 | 6.530769 | 29.423077 | 13.338462 | 13.426923 | 8.534615 | 20.973077 | 7.734615 |
std | 240.269886 | 127.875914 | 14.205998 | 21.885199 | 5.321873 | 26.639209 | 0.429133 | 87.269257 | 29.091653 | 577.026323 | ... | 28.252232 | 9.419544 | 1.567717 | 6.307057 | 28.679797 | 11.426146 | 24.366381 | 26.345542 | 17.019953 | 8.723263 |
min | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
25% | 210.000000 | 20.000000 | 2.375000 | 3.750000 | 1.000000 | 4.750000 | 0.000000 | 5.000000 | 2.000000 | 107.500000 | ... | 30.000000 | 10.000000 | 0.000000 | 0.000000 | 5.750000 | 4.000000 | 2.000000 | 0.000000 | 6.000000 | 0.000000 |
50% | 340.000000 | 100.000000 | 11.000000 | 17.000000 | 5.000000 | 24.000000 | 0.000000 | 35.000000 | 11.000000 | 190.000000 | ... | 44.000000 | 15.000000 | 1.000000 | 5.000000 | 17.500000 | 12.000000 | 8.000000 | 0.000000 | 20.000000 | 4.000000 |
75% | 500.000000 | 200.000000 | 22.250000 | 35.000000 | 10.000000 | 48.000000 | 0.000000 | 65.000000 | 21.250000 | 865.000000 | ... | 60.000000 | 20.000000 | 3.000000 | 10.000000 | 48.000000 | 19.000000 | 15.000000 | 4.000000 | 30.000000 | 15.000000 |
max | 1880.000000 | 1060.000000 | 118.000000 | 182.000000 | 20.000000 | 102.000000 | 2.500000 | 575.000000 | 192.000000 | 3600.000000 | ... | 141.000000 | 47.000000 | 7.000000 | 28.000000 | 128.000000 | 87.000000 | 170.000000 | 240.000000 | 70.000000 | 40.000000 |
8 rows × 21 columns
used_cols = ['Calories', 'Calories from Fat', 'Total Fat', 'Cholesterol', 'Sugars']
# Item with the highest value for each nutrient
# (idxmax returns the row label; on ties it returns the first occurrence)
max_idxs = [menu_data[col].idxmax() for col in used_cols]
for col, max_idx in zip(used_cols, max_idxs):
    print('Item with the highest {}: {}'.format(col, menu_data.loc[max_idx, 'Item']))
Item with the highest Calories: Chicken McNuggets (40 piece)
Item with the highest Calories from Fat: Chicken McNuggets (40 piece)
Item with the highest Total Fat: Chicken McNuggets (40 piece)
Item with the highest Cholesterol: Big Breakfast with Hotcakes (Regular Biscuit)
Item with the highest Sugars: McFlurry with M&M’s Candies (Medium)
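# Item with the lowest value for each nutrient
# (idxmin returns the row label; when several items share the minimum,
# only the first one is reported)
min_idxs = [menu_data[col].idxmin() for col in used_cols]
for col, min_idx in zip(used_cols, min_idxs):
    print('Item with the lowest {}: {}'.format(col, menu_data.loc[min_idx, 'Item']))
Item with the lowest Calories: Diet Coke (Small)
Item with the lowest Calories from Fat: Side Salad
Item with the lowest Total Fat: Side Salad
Item with the lowest Cholesterol: Hash Brown
Item with the lowest Sugars: Hash Brown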
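Several items can share the same minimum, and `argmin`/`idxmin` report only the first such row, so a single "lowest" item can be misleading. The following is a minimal sketch of listing every tied item. It uses a small made-up table rather than the real `menu.csv`, and `items_at_min` is a hypothetical helper, not part of the original analysis.

```python
import pandas as pd

# Hypothetical toy data standing in for menu.csv (values are made up)
toy = pd.DataFrame({
    'Item': ['Hash Brown', 'Side Salad', 'Diet Coke (Small)', 'Egg McMuffin'],
    'Cholesterol': [0, 0, 0, 260],
})

def items_at_min(df, col):
    """Return the names of all items whose value in `col` equals the column minimum."""
    return df.loc[df[col] == df[col].min(), 'Item'].tolist()

# idxmin gives only the first row that reaches the minimum
first_only = toy.loc[toy['Cholesterol'].idxmin(), 'Item']
# The boolean mask keeps every row that reaches the minimum
all_tied = items_at_min(toy, 'Cholesterol')

print(first_only)
print(all_tied)
```

Applied to the real data, the same mask would show how many items are tied at the minimum for each nutrient.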
# Number of items in each menu category
cat_grouped = menu_data.groupby('Category')
print('Number of items per menu category:')
print(cat_grouped.size().sort_values(ascending=False))
Number of items per menu category:
Category
Coffee & Tea 95
Breakfast 42
Smoothies & Shakes 28
Chicken & Fish 27
Beverages 27
Beef & Pork 15
Snacks & Sides 13
Desserts 7
Salads 6
dtype: int64
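# Average nutritional content of each menu category
print('Average nutritional content per menu category:')
used_cols = ['Calories', 'Calories from Fat', 'Total Fat', 'Cholesterol', 'Sugars']
print(cat_grouped[used_cols].mean())
Average nutritional content per menu category: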
Calories Calories from Fat Total Fat Cholesterol \
Category
Beef & Pork 494.000000 224.666667 24.866667 87.333333
Beverages 113.703704 0.740741 0.092593 0.555556
Breakfast 526.666667 248.928571 27.690476 152.857143
Chicken & Fish 552.962963 242.222222 26.962963 75.370370
Coffee & Tea 283.894737 71.105263 8.021053 27.263158
Desserts 222.142857 64.285714 7.357143 15.000000
Salads 270.000000 108.333333 11.750000 51.666667
Smoothies & Shakes 531.428571 127.678571 14.125000 45.000000
Snacks & Sides 245.769231 94.615385 10.538462 18.461538
Sugars
Category
Beef & Pork 8.800000
Beverages 27.851852
Breakfast 8.261905
Chicken & Fish 7.333333
Coffee & Tea 39.610526
Desserts 26.142857
Salads 6.833333
Smoothies & Shakes 77.892857
Snacks & Sides 4.076923
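# Menu category with the highest average for each nutrient
# (idxmax returns the category label; in current pandas, argmax returns an integer position instead)
max_cats = [cat_grouped[col].mean().idxmax() for col in used_cols]
for col, cat in zip(used_cols, max_cats):
    print('Category with the highest average {}: {}'.format(col, cat))
Category with the highest average Calories: Chicken & Fish
Category with the highest average Calories from Fat: Breakfast
Category with the highest average Total Fat: Breakfast
Category with the highest average Cholesterol: Breakfast
Category with the highest average Sugars: Smoothies & Shakes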
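# Menu category with the lowest average for each nutrient
# (idxmin returns the category label; in current pandas, argmin returns an integer position instead)
min_cats = [cat_grouped[col].mean().idxmin() for col in used_cols]
for col, cat in zip(used_cols, min_cats):
    print('Category with the lowest average {}: {}'.format(col, cat))
Category with the lowest average Calories: Beverages
Category with the lowest average Calories from Fat: Beverages
Category with the lowest average Total Fat: Beverages
Category with the lowest average Cholesterol: Beverages
Category with the lowest average Sugars: Snacks & Sides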
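The per-column loops can also be collapsed into a single call, since `DataFrame.idxmax` and `DataFrame.idxmin` return, for each column, the index label of its extreme value. The sketch below uses made-up numbers (the category names appear in the dataset, the values do not), so it only illustrates the pattern.

```python
import pandas as pd

# Hypothetical toy data; the values are made up for illustration
toy = pd.DataFrame({
    'Category': ['Breakfast', 'Breakfast', 'Beverages', 'Beverages', 'Desserts'],
    'Calories': [300, 450, 0, 140, 250],
    'Sugars': [3, 2, 0, 39, 30],
})

# Mean of each nutrient per category (one row per category)
cat_means = toy.groupby('Category')[['Calories', 'Sugars']].mean()

# One Series each: nutrient name -> category label with the highest/lowest mean
highest = cat_means.idxmax()
lowest = cat_means.idxmin()

print(highest)
print(lowest)
```

Because the result is indexed by nutrient name, `highest['Sugars']` gives the category directly, with no need for `zip`.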
menu_data['Serving Size'].head()
0 4.8 oz (136 g)
1 4.8 oz (135 g)
2 3.9 oz (111 g)
3 5.7 oz (161 g)
4 5.7 oz (161 g)
Name: Serving Size, dtype: object
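# Filter the data, keeping only items whose serving size contains 'g'
# (items whose size is given only by volume, e.g. in fl oz, are dropped)
sel_menu_data = menu_data[menu_data['Serving Size'].str.contains('g')].copy()
def proc_size_str(size_str):
    """
    Parse a serving-size string such as '4.8 oz (136 g)' and return the weight in grams.
    """
    start_idx = size_str.index('(') + 1
    # Look for 'g' only after the opening parenthesis
    end_idx = size_str.index('g', start_idx)
    size_val = size_str[start_idx:end_idx]
    return float(size_val)
sel_menu_data['Size'] = sel_menu_data['Serving Size'].apply(proc_size_str)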
sel_menu_data.head()
| | Category | Item | Serving Size | Calories | Calories from Fat | Total Fat | Total Fat (% Daily Value) | Saturated Fat | Saturated Fat (% Daily Value) | Trans Fat | ... | Carbohydrates (% Daily Value) | Dietary Fiber | Dietary Fiber (% Daily Value) | Sugars | Protein | Vitamin A (% Daily Value) | Vitamin C (% Daily Value) | Calcium (% Daily Value) | Iron (% Daily Value) | Size |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Breakfast | Egg McMuffin | 4.8 oz (136 g) | 300 | 120 | 13.0 | 20 | 5.0 | 25 | 0.0 | ... | 10 | 4 | 17 | 3 | 17 | 10 | 0 | 25 | 15 | 136.0 |
1 | Breakfast | Egg White Delight | 4.8 oz (135 g) | 250 | 70 | 8.0 | 12 | 3.0 | 15 | 0.0 | ... | 10 | 4 | 17 | 3 | 18 | 6 | 0 | 25 | 8 | 135.0 |
2 | Breakfast | Sausage McMuffin | 3.9 oz (111 g) | 370 | 200 | 23.0 | 35 | 8.0 | 42 | 0.0 | ... | 10 | 4 | 17 | 2 | 14 | 8 | 0 | 25 | 10 | 111.0 |
3 | Breakfast | Sausage McMuffin with Egg | 5.7 oz (161 g) | 450 | 250 | 28.0 | 43 | 10.0 | 52 | 0.0 | ... | 10 | 4 | 17 | 2 | 21 | 15 | 0 | 30 | 15 | 161.0 |
4 | Breakfast | Sausage McMuffin with Egg Whites | 5.7 oz (161 g) | 400 | 210 | 23.0 | 35 | 8.0 | 42 | 0.0 | ... | 10 | 4 | 17 | 2 | 21 | 6 | 0 | 25 | 10 | 161.0 |
5 rows × 25 columns
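The filter-then-parse step relies on every kept string having the shape `... (N g)`. As an alternative, a regular expression can do the filtering and parsing in one pass. The sketch below assumes the gram weight always appears as a number followed by `g` inside parentheses; the sample strings are illustrative and not taken from `menu.csv`.

```python
import pandas as pd

# Illustrative serving-size strings (assumed formats, not read from the dataset)
sizes = pd.Series([
    '4.8 oz (136 g)',
    '16 fl oz cup',
    '1 carton (236 ml)',
    '10.1 oz (285 g)',
])

# Capture the number directly before "g)" inside the parentheses.
# Strings without a gram weight do not match and become NaN.
grams = sizes.str.extract(r'\((\d+(?:\.\d+)?)\s*g\)', expand=False).astype(float)

print(grams.tolist())
```

Rows without a gram weight can then be dropped with `dropna()`, which avoids matching a stray `g` elsewhere in the string.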
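# Items with the largest and smallest serving size
# (idxmax/idxmin return index labels, so rows are looked up with .loc rather than .iloc,
# because the filtered frame no longer has a contiguous index)
max_idx = sel_menu_data['Size'].idxmax()
print('Largest serving: {}, {} g'.format(sel_menu_data.loc[max_idx, 'Item'], sel_menu_data['Size'].max()))
min_idx = sel_menu_data['Size'].idxmin()
print('Smallest serving: {}, {} g'.format(sel_menu_data.loc[min_idx, 'Item'], sel_menu_data['Size'].min()))
Largest serving: Chicken McNuggets (40 piece), 646.0 g
Smallest serving: Kids Ice Cream Cone, 29.0 g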
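# Categories with the largest and smallest average serving size
# (computed only over items that have a gram weight)
sel_cat_grouped = sel_menu_data.groupby('Category')
print('Category with the largest average serving: {}, {} g'.format(sel_cat_grouped['Size'].mean().idxmax(),
                                                                  sel_cat_grouped['Size'].mean().max()))
print('Category with the smallest average serving: {}, {} g'.format(sel_cat_grouped['Size'].mean().idxmin(),
                                                                   sel_cat_grouped['Size'].mean().min()))
Category with the largest average serving: Smoothies & Shakes, 304.75 g
Category with the smallest average serving: Desserts, 101.57142857142857 g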
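Because the analysis keeps only items whose serving size contains a gram weight, a category average may rest on a small fraction of that category's items. A quick way to check is to compare item counts before and after the filter. This is a sketch on made-up rows; the `coverage` table and its column names are assumptions introduced here.

```python
import pandas as pd

# Hypothetical toy rows; serving-size strings are illustrative only
toy = pd.DataFrame({
    'Category': ['Desserts', 'Desserts',
                 'Smoothies & Shakes', 'Smoothies & Shakes', 'Smoothies & Shakes'],
    'Serving Size': ['1 oz (29 g)', '1 cookie (33 g)',
                     '12 fl oz cup', '16 fl oz cup', '10.1 oz (285 g)'],
})

# Same filter as in the analysis: keep rows whose serving size contains 'g'
has_grams = toy['Serving Size'].str.contains('g')

# Items per category before and after the filter, aligned on the category label
coverage = pd.DataFrame({
    'total': toy.groupby('Category').size(),
    'with_grams': toy[has_grams].groupby('Category').size(),
})
coverage['share'] = coverage['with_grams'] / coverage['total']

print(coverage)
```

A low `share` for a category suggests its average serving size describes only a subset of its items.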