Vision-based food nutrition estimation via RGB-D fusion network

Wenjing Shao^a, Weiqing Min^b,c,*, Sujuan Hou^a,*, Mengjiang Luo^b,c, Tianhao Li^b,c, Yuanjie Zheng^a, Shuqiang Jiang^b,c

^aSchool of Information Science and Engineering, Shandong Normal University

^bThe Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences
^cUniversity of Chinese Academy of Sciences

RGB-D Net is a food nutrition estimation model that fuses multi-modal features from RGB and depth images. It exploits the complementarity of the two modalities to obtain richer feature information and achieve robust nutrition estimation. Experiments on the Nutrition5k dataset demonstrate the effectiveness of the proposed method.

Download

The following files are provided:

myresnet.py       
bfp.py 
util_data.py
mydataset.py
train_RGBD_multi_fusion.py
test_RGBD_multi_fusion.py
food2k_resnet101_0.0001.pth

To download all files as a zip package, click here.

Usage

Train

Before training the model, load the ResNet-101 weights pre-trained on Food2k by setting the path to the weight file in the code:

resnet101_food2k=torch.load("root/to/weight file")

Next, modify the output paths in the code, such as the paths for the predicted nutritional values and the training logs. For example:

log_file_path="root/to/you"

In addition, some parameters must be set for the model to train properly. Simply run:

python train_RGBD_multi_fusion.py --model resnet101  --dataset nutrition_rgbd  --data_root root/to/dataset  --rgbd  --direct_prediction

The --model option sets the network to train (here resnet101), and --dataset sets the dataset (here nutrition_rgbd). The --rgbd flag enables the RGB and depth (D) modes, --direct_prediction makes the model predict the results directly, and --data_root gives the path to the dataset.
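
The options in the command above could be declared with argparse roughly as follows. This is a minimal sketch, not the parser actually used in train_RGBD_multi_fusion.py: the flag names come from the command, while the build_parser helper, the help strings, and the treatment of --rgbd and --direct_prediction as boolean switches are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the training command."""
    parser = argparse.ArgumentParser(description="RGB-D nutrition estimation (sketch)")
    parser.add_argument("--model", type=str, help="network to train, e.g. resnet101")
    parser.add_argument("--dataset", type=str, help="dataset name, e.g. nutrition_rgbd")
    parser.add_argument("--data_root", type=str, help="path to the dataset root")
    # Assumed to be boolean switches because the command passes them without values.
    parser.add_argument("--rgbd", action="store_true", help="use RGB and depth inputs")
    parser.add_argument("--direct_prediction", action="store_true",
                        help="predict the nutrition values directly")
    return parser


# Parse the same arguments as the training command above.
args = build_parser().parse_args([
    "--model", "resnet101",
    "--dataset", "nutrition_rgbd",
    "--data_root", "root/to/dataset",
    "--rgbd",
    "--direct_prediction",
])
print(args.model, args.dataset, args.rgbd, args.direct_prediction)
```

Omitting a switch such as --rgbd leaves its value at False in this sketch; the real script may use different defaults.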

Test

Before testing, change the dataset path in the code:

data_root="root/to/dataset"

Then simply run:

python test_RGBD_multi_fusion.py

If necessary, the following arguments can also be set:

--lr : learning rate
--wd : weight decay
--b : batch size
--resume : resume from checkpoint
--print_freq : print frequency
--bn_momentum : batchnorm momentum

Config dataset

Dataset metadata is stored in nutrition5k_dataset. The dataset is organized as follows:

|-nutrition5k_dataset
    |---imagery
        |---realsense_overhead
            |---Dish1
                |---depth_color.png
                |---rgb.png
            |---Dish2
                |---depth_color.png
                |---rgb.png
            ......
            |---DishM
                |---
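
Before training, it can help to confirm that every dish folder contains both images. The following is a minimal sketch of such a check and is not part of the provided files. The function name find_incomplete_dishes is hypothetical, and the sketch assumes data_root points at the nutrition5k_dataset folder shown above.

```python
from pathlib import Path

# File names taken from the directory tree above.
REQUIRED_FILES = ("rgb.png", "depth_color.png")


def find_incomplete_dishes(data_root):
    """Return {dish_name: [missing file names]} for dishes lacking an image.

    Assumes data_root is the nutrition5k_dataset folder.
    """
    overhead = Path(data_root) / "imagery" / "realsense_overhead"
    incomplete = {}
    for dish_dir in sorted(p for p in overhead.iterdir() if p.is_dir()):
        missing = [name for name in REQUIRED_FILES
                   if not (dish_dir / name).is_file()]
        if missing:
            incomplete[dish_dir.name] = missing
    return incomplete
```

Calling find_incomplete_dishes("root/to/dataset") returns an empty dictionary when every dish folder holds both rgb.png and depth_color.png.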

The label files for the training and testing sets are as follows:

Training set labels:
rgbd_train_processed.txt
rgb_in_overhead_train_processed.txt
Testing set labels:
rgbd_test_processed.txt
rgb_in_overhead_test_processed.txt

Before you start training, store the label files in the imagery folder like this:

|-nutrition5k_dataset
    |---imagery
        |---realsense_overhead
            |---Dish1
            ......
            |---DishM
        |---rgbd_train_processed.txt
        |---rgb_in_overhead_train_processed.txt
        |---rgbd_test_processed.txt
        |---rgb_in_overhead_test_processed.txt
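
Placing the label files can be scripted. The following is a minimal sketch under stated assumptions and is not part of the provided files. The helper copy_label_files and its label_dir argument (the folder where the four files currently live) are hypothetical, and data_root is assumed to be the nutrition5k_dataset folder.

```python
import shutil
from pathlib import Path

# Label file names as listed above.
LABEL_FILES = (
    "rgbd_train_processed.txt",
    "rgb_in_overhead_train_processed.txt",
    "rgbd_test_processed.txt",
    "rgb_in_overhead_test_processed.txt",
)


def copy_label_files(label_dir, data_root):
    """Copy the label files from label_dir into <data_root>/imagery.

    Returns the names of label files that were not found in label_dir.
    """
    imagery = Path(data_root) / "imagery"
    imagery.mkdir(parents=True, exist_ok=True)
    not_found = []
    for name in LABEL_FILES:
        source = Path(label_dir) / name
        if source.is_file():
            shutil.copy2(source, imagery / name)
        else:
            not_found.append(name)
    return not_found
```

An empty return value means all four label files now sit next to realsense_overhead, matching the layout above.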

Reference

If you use our work, please cite our paper:

    @article{SHAO2023136309,
    title = {Vision-based food nutrition estimation via RGB-D fusion network},
    journal = {Food Chemistry},
    volume = {424},
    pages = {136309},
    year = {2023},
    issn = {0308-8146},
    doi = {https://doi.org/10.1016/j.foodchem.2023.136309},
    url = {https://www.sciencedirect.com/science/article/pii/S0308814623009275},
    }