IC Hack 2018 (Imperial College London’s 2018 hackathon) has just finished, and in this post I’d like to share my experiences from the event and what our team built. Over the course of 24 hours, four of my friends from Imperial and I made an application that uses a camera to track users’ consumption of soft drinks and shows them real-time health-related data, such as sugar and energy intake over time. To do this, we used technologies including YOLO, the Microsoft Cognitive Services API, Flask, Chart.js and Bootstrap. Read on to find out more!
To give proper credit, the idea for the program came (to the best of my knowledge) from my teammates Piers and Rajan. The program works as follows. In the first part, done by Haaris, the user’s camera (in this case a webcam) is continuously pointed at the user’s face, and the camera input is fed to the YOLO object detection system to detect bottles and cans in real time. These are then extracted from the video frames and saved as image files (with filenames containing the date of processing) for further processing.
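As a small illustration of the saving step, here is a sketch of how a timestamped filename for each extracted detection might be built. The function name and the ‘captures’ directory are my own assumptions for illustration; the actual script may name things differently:

```python
import os
from datetime import datetime

def capture_filename(directory="captures", ext="jpg"):
    """Build a filename stamped with the current date and time,
    as the detector does when saving a cropped bottle/can frame."""
    stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    return os.path.join(directory, f"drink_{stamp}.{ext}")
```

Embedding the timestamp in the filename means the next stage can tell new captures from already-processed ones without any extra bookkeeping.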
In the second part, done by me, the program looks through the directory containing all the saved images and checks whether the newest image there is more recent than the latest processed one (by comparing the newest saved image’s date with the date of the latest processed image stored in the food_log.json file). If it is, the picture is sent to the Microsoft Cognitive Services server for more precise object recognition using a neural network that I trained through Microsoft’s web interface. The training covered 4 types of drinks (Cola can, Fanta can, water bottle and Oasis bottle), with around 5 samples per class, which were just photographs of the drinks from different angles; despite the small number of examples, the network achieved quite good precision and recall of around 75%. After analysis, Microsoft’s server returns a label and a corresponding confidence for the submitted picture (e.g. water, 85%). Based on the label, the program then logs the newly analysed drink into the file ‘food_log.json’, saving the drink label, its energy content, its sugar content and the processing time as a new JSON entry to be used for visualisation later (a separate file, nutritional_information.json, contains the energy and sugar content for each drink, entered manually). Finally, the client requests food_log.json from our Flask server and plots the graphs in the browser.
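The watch-and-log step described above can be sketched roughly like this. This is a minimal reconstruction rather than the actual hackathon code: the helper names, the ‘captures’ directory and the exact JSON layout are my assumptions, though the log file names match those mentioned above.

```python
import json
import os

CAPTURE_DIR = "captures"     # assumed directory of saved detections
FOOD_LOG = "food_log.json"   # log file mentioned in the post

def newest_capture(directory=CAPTURE_DIR):
    """Return the most recently modified image in the capture directory, or None."""
    images = [os.path.join(directory, f) for f in os.listdir(directory)
              if f.lower().endswith((".jpg", ".png"))]
    return max(images, key=os.path.getmtime) if images else None

def log_drink(label, timestamp, nutrition, log_path=FOOD_LOG):
    """Append a new entry (label, energy, sugar, time) to the JSON food log."""
    entries = []
    if os.path.exists(log_path):
        with open(log_path) as f:
            entries = json.load(f)
    entries.append({
        "drink": label,
        "energy": nutrition[label]["energy"],  # from nutritional_information.json
        "sugar": nutrition[label]["sugar"],
        "time": timestamp,
    })
    with open(log_path, "w") as f:
        json.dump(entries, f, indent=2)
    return entries
```

Keeping the log as a plain JSON list makes the later visualisation step trivial: the Flask server can hand the file to the client as-is for Chart.js to plot.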
All in all, the event was really fun and also a great learning experience (for example, I learned more about client-server interaction thanks to Haaris’s explanations). We finished the project in time and had a (mostly) successful demonstration for several groups of judges, including Microsoft, DoCSoc and others. And although our project didn’t get selected for the final presentations, the experience was still worth it! 🙂
Note: you can find the project on GitHub or Devpost and see a demo video on YouTube. If you’re using the GitHub repo, you first need to build and run the YOLO object detection/recognition part of the program located in the ‘yolo/darknet/’ subfolder, then run the ‘recognition.py’ script for further object recognition using Microsoft’s API, and finally start the Flask server by (installing Flask and) running ‘flask run’, after which you can view the graphs on localhost, most likely at ‘http://127.0.0.1:5000/’.
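Summarised as shell commands, those steps would look roughly like this. The exact detector invocation depends on the repo, so it is left as a placeholder rather than guessed:

```shell
# 1. Build the YOLO detector (then run it per the repo's instructions)
cd yolo/darknet
make

# 2. From the repo root, run the Microsoft Cognitive Services recognition step
python recognition.py

# 3. Install Flask and start the server, then open http://127.0.0.1:5000/
pip install Flask
flask run
```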