Voice clips deleted post-transcription.
Three input modes, one calorie tracker. Type a quick description, snap a photo, or speak it aloud — Calorai's AI handles food recognition, portion estimation and macro breakdown.
Most calorie apps assume you'll happily type "190 g grilled chicken breast skinless" at 9pm. Calorai assumes you won't, and gives you two faster ways out.
"Two eggs, sourdough, avocado." The LLM parses your sentence, matches each item to the database, returns calories + macros. About a hundred milliseconds, no menus.
The vision model identifies up to four items in one frame (composite meal detection), estimates portion size from visual cues, and returns a single entry. Editable in two taps.
Hold the mic, describe the meal. Whisper-based transcription shows what was heard, then the LLM logs it. Built for the moment your hands are full. Try it →
The friction lives in how you log, not in the calories. Calorai removes it three different ways.
Calorai isn't trying to be the only nutrition app you ever use. It's trying to be the one that survives because logging at lunch doesn't feel like a chore.
One sentence beats five database searches. The LLM parses "two eggs and a slice of sourdough" faster than you can find the right egg.
The dish on the menu doesn't exist in the MyFitnessPal database. The plate in front of you does — point the camera, get a working estimate.
Driving, holding the baby, in the middle of cooking. Hold the mic for two seconds and the meal is logged before the rice is on the plate.
Breakfast typed (it's always the same). Lunch photographed (variety). Dinner spoken (cooking with hands). The data doesn't care.
Six features built around the multi-modal input idea. Things on the roadmap that haven't earned their place are still on the roadmap.
The same Calorai entry is reachable from a chat bubble, a camera frame, or a microphone. Different friction, same data structure. Open Calorai →
The vision model lists up to four items in one frame — chicken + rice + avocado + spinach — and totals them as one entry.
Bidirectional sync with Apple Health and Health Connect writes meals and reads steps and workouts. The calorie budget adjusts automatically on training days, with no manual offset math.
Voice transcription always shows the heard text and the AI's interpretation before committing. Edit in two taps, log when right.
Database-cached typed logging keeps working without a signal. Photo and voice entries queue locally and sync when you're back online.
Photos and voice clips are processed server-side and deleted unless you opt into model improvement. Analytics opt-out leaves the app fully functional.
Where multi-modal input wins, where the established databases still win, and the spots where you should probably keep MyFitnessPal in the rotation.
The 4-star one is here because pretending nobody's frustrated insults the customer who is.
"I chat-log breakfast in five seconds. That's the entire reason I'm still tracking — every previous app I'd quit by week two when typing got tedious."
"Photo mode at restaurants is the killer feature — the dish isn't in any database, but the camera nails the entry close enough that I just edit the portion."
"Voice is genuinely magical when it works. In a quiet kitchen it's flawless. On a busy street with traffic, it logged 'chicken thighs' as 'kitchen tights' — twice. Edit-and-fix takes a few taps, then you give up."
A lot of nutrition apps now have one good AI feature bolted onto a 2014-era logging UX. Calorai started from the input side, not the analytics side.
The hypothesis was uncomfortable: the calorie-tracking problem isn't a database problem, it's a typing problem. MyFitnessPal has 14 million food items and the best barcode coverage in the industry, and the median user still logs for 14 days and stops. Cronometer has the most verified nutrition data in any consumer app, and the drop-off curve looks the same.
People don't stop tracking because they ran out of foods to log. They stop because at the end of a 14-hour day, typing "200 g chicken thigh, skin on, baked" into a search bar feels like a third job. Calorai was built to answer one specific question: what if you could log a meal in the time it takes to say what you ate?
The architecture is three input pipelines that converge into one entry: an LLM-parsed chat field for typed input, a vision model trained on plated meals for photo input, and a Whisper-based transcription pipeline for voice. All three resolve into the same Calorai log structure — same shape, same macros, same Apple Health sync. The free tier covers typed and photo logging. Premium ($4.99/month) unlocks voice, offline-cached typing, composite meal detection and weekly analytics.
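For the curious, here is a minimal sketch of the "three pipelines, one entry" idea in Python. It is an illustration under stated assumptions, not Calorai's actual code: every name (`FoodItem`, `LogEntry`, `log_typed`, `log_photo`, `log_voice`) is hypothetical, the comma split stands in for the LLM parse, the label list stands in for the vision model's output, and the nutrition numbers are placeholders rather than real data.

```python
from dataclasses import dataclass
from typing import Literal

Source = Literal["typed", "photo", "voice"]


@dataclass(frozen=True)
class FoodItem:
    name: str
    calories: float
    protein_g: float
    carbs_g: float
    fat_g: float


@dataclass(frozen=True)
class LogEntry:
    """One log shape shared by all three input modes (hypothetical)."""
    source: Source
    items: tuple[FoodItem, ...]

    @property
    def calories(self) -> float:
        return sum(item.calories for item in self.items)


# Placeholder values for illustration only, not real nutrition data.
PLACEHOLDER_DB = {
    "egg": FoodItem("egg", 100.0, 10.0, 0.0, 5.0),
    "sourdough": FoodItem("sourdough", 200.0, 5.0, 40.0, 2.0),
    "avocado": FoodItem("avocado", 300.0, 3.0, 15.0, 25.0),
}


def build_entry(source: Source, names: list[str]) -> LogEntry:
    # Every pipeline ends here, so the resulting entry has the same shape.
    return LogEntry(source, tuple(PLACEHOLDER_DB[n] for n in names))


def log_typed(text: str) -> LogEntry:
    # Stand-in for the LLM parse: split the sentence on commas.
    names = [part.strip().lower() for part in text.split(",") if part.strip()]
    return build_entry("typed", names)


def log_photo(detected_labels: list[str]) -> LogEntry:
    # Stand-in for the vision model; keeps at most four items per frame.
    return build_entry("photo", detected_labels[:4])


def log_voice(transcript: str) -> LogEntry:
    # Stand-in for transcription: the transcript is parsed like typed text.
    return LogEntry("voice", log_typed(transcript).items)
```

Calling `log_typed("egg, sourdough")`, `log_photo(["egg", "sourdough"])` and `log_voice("egg, sourdough")` yields three entries that differ only in their `source` field, which is the point of the design: the input mode changes, the stored data does not.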
None of those make Calorai the wrong choice — they make it a specific choice. If your nutrition is mostly whole-food and home-cooked, multi-modal input is meaningfully faster than database search. If most of what you eat is packaged with a barcode, MyFitnessPal is probably still the right tool. The two pair well. Start a week →
Three input modes, three seconds, one calorie log. Free tier covers typed and photo logging — voice is unlocked with Premium. Apple Health and Health Connect ready out of the box.
Get Calorai free →