The process of creating my third Amazon Echo skill – Banana Bread Baker

This is the first recipe in what I hope will become a series for Cros Cooking, my cooking service.

Much of this project is a continuation of my work on Visual Voice Assistant with Computer Vision.

For this project I wanted to try working with this modified version of the double diamond created by Dan Nessler.

The double diamond with more detailed steps

What could be-

I had a hypothesis that I wanted to test with these new Amazon Echo Show devices. Could voice commands make the cooking process easier with the aid of video? I was pretty sure the answer was yes, because all of the ads I saw for these devices featured some cooking, but I hadn't seen anything that used video in the way I wanted to.

These are screenshots from ads for the Google Home Hub and the Amazon Echo Show devices

Figure 1.1

I wanted to bring the looping feature and the human personality from my last project into an experience that would help people cook. I imagined it would look like Figure 1.2.

Post-it notes with the author's imagined steps for his idea

Figure 1.2

Primary Research –

I began my research by analyzing the hypothesis that I had.

Could voice commands make the cooking process easier with the aid of video?

I color-coded the key words in this question because I planned to research each of them as a separate topic, so that I could answer the question more thoughtfully as a whole.

First I looked into voice-command cooking. It already existed, but it was cumbersome to use. On devices without screens, the recipe is simply narrated to you as you try to cook along. I kept having to say "next" as the device read the ingredients list one by one, and I kept forgetting steps only to find myself asking, "What was that again?" The other option for cooking with a voice-only interface was having the recipe sent directly to your phone, in which case you would probably be better off visually selecting what kind of food you want instead of having the assistant serve you a random one.

For voice command cooking with a screen, or using “smart displays” to cook, the findings were mixed. There are three different ways you can cook with these devices.

The first is finding a video you want to cook with on YouTube or another video provider. This is clunky at best on Amazon devices: they do not have native YouTube support, so to reach a video you have to open the Silk browser first and then navigate to it. On Google devices with a screen you can just ask for a video; this is the predominantly advertised way to cook with Google devices, as shown in Figure 1.1.

The second way to cook with these devices is with recipes ported over from websites like Kitchen Stories, SideCHEF, or allrecipes. This method is fairly easy to use and is available on both Amazon and Google devices. The interface varies slightly between the two, but the content is roughly the same. Figure 2.1 shows what it looks like to cook this way on the second-generation Amazon Echo Show. In the image only two ingredients are visible; the rest are below if you scroll to find them, and the only signifier is that you were probably expecting more than two ingredients. If you start the recipe, you will find that the steps have the same problem: you have to scroll to see the full picture. These recipes were not optimized for a voice-first experience but were imported from phone recipes, where scrolling while you cook is the norm.

Classic spaghetti carbonara with only two ingredients listed

Figure 2.1

The third method for cooking with smart displays is limited to the Amazon ecosystem, and it was the closest to what I wanted to create. Amazon has claimed prime Voice User Interface (VUI) real estate for cooking on these devices: this method opens when you say "Alexa, let's cook." The interface is optimized for voice; you can select a recipe via voice command and use your voice to navigate every step. First, you are shown all the ingredients presented nicely on one screen, as in figure 2.2. Then, as you continue through the built-in skill by saying "next," Alexa tells you what to do for each step, with the text of what she said displayed at the top of the screen. A video loops showing you how to complete that step, along with a slider indicating which step you are on and two buttons, one that reads "Next Step" and another with "<".

Ingredients for a fish curry dish shown on one screen in Amazon's built-in cooking skill

Figure 2.2

However, this method is not well suited for home cooking because it lacks measurements for the ingredients. The measurements are missing because this method is part of Amazon's meal kit system. You can buy a meal kit for $8.99 per serving, but to get one you must live in Seattle (for now), be an Amazon Prime member and an Amazon Fresh member, and pay $9.99 for delivery unless you order more than $40 of food. I did secondary research on this method later, as it was the closest to what I wanted to create.

While working on this project I was cooking a lot more in my daily life. I found recipes online, in videos, in meal kits, and on the backs of products like a bag of crawfish tails. As I cooked more I began to think about what I could do to make the cooking process easier, and what that meant to the people around me who cook as well. So I asked just about everyone I knew how they cooked and what problems they ran into while cooking. From my friends who cooked the most, I heard that they cooked a lot of recipes they already knew by heart, simple recipes they had been taught growing up, and that this was habitual. They also said that sometimes they would use videos to help them cook something they craved, or that they were cooking something they saw their favorite food YouTube channel whip up. They even told me the names of the channels without me asking (Tasty, Munchies' Matty Matheson) because they felt a connection with the chefs. This connection was the reason they subscribed and why they wanted to cook with these chefs' recipes; they trusted that the recipes were good because the chefs are good.

When cooking with the back of a crawfish tail bag, I had to trust myself not to mess up. Unfortunately, I messed up several times, but I learned from it. I forgot some of the ingredients I needed at the store, cooked ingredients out of order, and skipped a step as I skimmed through the recipe. Luckily, the food still came out alright. For all my errors, I came away with a few findings.

– The cooking process started at the grocery store, not the kitchen

– The tools you cook with are important to the process, even if they aren't in the recipe

– Segmenting the steps would help me follow them one by one

The people I talked to about cooking, mostly UTD students and academics, told me that cooking with video was nice, and that seeing the steps being done on screen helped them pull off the recipes themselves. I believe this familiarity with the chefs not only helped them be more comfortable when cooking, but also helped them remember the techniques in the video, as everyone I talked to could recall a detail of the process to tell me. From my own experience, video definitely helps me cook. Being able to see the color of the food, the techniques, and the consistency of the food on screen helps in a way other cooking guides can't. The downside of using video to cook is that I usually pull up the recipe on my phone. When using the phone, the screen times out, I have to scrub backwards in the video multiple times to repeat a step I might have missed, and by the end of it all my screen is filthy.

Secondary research-

I had seen firsthand that the cooking process is difficult. There are a lot of things that can go wrong even before you get to the kitchen. It's time consuming, requires planning, and as a result is being replaced by more convenient alternatives like fast food and prepackaged food. These alternatives are certainly not cheaper than cooking, but they are starting to resemble cooking again in the form of meal kits. I wanted to research a couple of different meal kits to see what conveniences they bring to the cooking process. The two I researched were the Amazon meal kit and Blue Apron.

With the Amazon meal kit, the method most similar to what I wanted to create, I could not actually get a kit. As described earlier, I would need an Amazon Prime membership, an Amazon Fresh membership, and a Seattle address to obtain one.

So in order to use this process at all, I changed what I was researching. Instead of studying the meal kit itself, I used this method to see what the technology could do to help people cook without all the prepackaging. I went to the grocery store with Julian, a friend who went to culinary school, and we estimated what we would need to recreate the recipe in figure 2.2, a fish curry dish with rice. With the ingredients in our cart, the total came out to roughly $58; I already had rice and the spices at my house. The process of cooking (figure 2.3) with the on-screen video and voice commands was very easy to follow, which made it fun. We did have to guess the amounts to use at every step, made harder by the fact that we were cooking for four people, but the difficulties didn't matter: the food came out excellent! We both found the voice dictation annoying, as the on-screen text was more useful to reference while we worked through each action.

Julian and I cooking along with our modified Amazon meal kit

Figure 2.3 – Still frame from footage

After eating, I had some time to reflect on our research. These Amazon meal kits are typically $8.99 a serving, which means the meal we whipped up for four ($58) was more expensive to make from scratch than if we had just bought four servings ($35.96). That is, until you consider that I had enough extra food after shopping to make the same four-person meal three more times (16 servings total, at about $3.62 each). That is a drastic savings even if you account for the rice and spices I already had at my house. The food we already have at home can save us huge amounts of money in the long run, but it is treated as a liability by some meal kit companies. Takeout Kit, a company that sells on Amazon, says this about their meal kit as a selling point: "Takeout Kits can sit patiently in your pantry until you want to cook. No food waste = no cash wasted. Refreshing? We think so!" I'm sure they're talking about perishables that might go uneaten, but the rice and spices at my house were there because they last a long time and are staple foods in my diet. I began to think about how a staple food had saved me money in this instance, and how it was just as likely that there were staple foods in other people's kitchens that could save them money. I could see the ease of cooking with smart displays helping people cook with ingredients they already had.
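
For anyone who wants to check my math, here is the comparison laid out; the $58 grocery total is our rough cart estimate, and the 16 servings assume the leftovers really do stretch to three more four-person meals.

```python
# Rough cost comparison: Amazon meal kit versus cooking the same dish from groceries.
KIT_PRICE_PER_SERVING = 8.99        # advertised meal kit price per serving
GROCERY_TOTAL = 58.00               # estimated cost of our shopping cart
SERVINGS_FROM_GROCERIES = 16        # 4 servings per meal x 4 meals from the same haul

kit_cost_for_four = KIT_PRICE_PER_SERVING * 4                        # $35.96
grocery_cost_per_serving = GROCERY_TOTAL / SERVINGS_FROM_GROCERIES   # ~$3.62

print(f"Meal kit, four servings: ${kit_cost_for_four:.2f}")
print(f"Groceries, per serving:  ${grocery_cost_per_serving:.2f}")
```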

I had been really interested in meal kits ever since I saw the article "90% of Americans don't like to cook — and it's costing them thousands each year," because even though we supposedly don't like cooking, these meal kits get people to cook in order to enjoy their food. Meal kits were doing something new that made cooking easier, and I needed to find out what that was. The one I heard the most about from friends was Blue Apron. One friend told me they used Blue Apron after getting a referral code from their girlfriend; they described it as fun, tasty, and kind of a treat for themselves, since it showed up at their door in a nice little package. They did not place any orders after using the referral. Another friend told me she got a Blue Apron subscription after getting into a serious car accident. She said she didn't want to be in a vehicle if she didn't have to be, and having the food delivered right to her instead of driving to get ingredients meant less driving in her life. The process of getting your delivery starts at your computer, where Blue Apron tailors the experience by asking some vital questions like the ones below (one of them is shown in figure 2.4).

“What foods can your household eat?”

“How experienced are you at cooking?”

“How adventurous is your household when it comes to eating?”

“How do you feel about spending time in the kitchen?”

“Which cuisines does your household enjoy?”

Blue Apron question with graphics about time spent in the kitchen

Figure 2.4

Then you put in all your information and the kit shows up at your door a couple of weeks later. The food comes in an insulated package with ice packs layered between the items to keep everything cold but not frozen. Inside are all the pre-portioned packages you need to assemble your food and thick paper instructions for putting it all together.

Cooking with Blue Apron was incredibly easy: all the steps are displayed clearly in text and everything is well labeled. I did a little chopping, but most of the prep work amounted to opening a bunch of tiny plastic packages. I felt a little bad about how much plastic waste came out of making this one dish for just me, especially after cooking for myself more during this time and creating very little non-perishable garbage. After cooking this meal I had enough left over to bring another meal to work; the food smelled pretty good, so my co-workers were interested in what I had cooked up. It always feels good to tell someone that you cooked the thing they're interested in eating, even if it came from a meal kit.

Market research-

How many people have smart displays?

More than 133 million households own smart speakers, and ~5.9% of smart speaker owners have a smart display. That comes out to ~7,847,000 households with smart displays.

How many people have these devices in their kitchens?

~51% of Amazon Echo owners have their devices in the kitchen. Additionally, Echo Show devices are designed for the kitchen.

Total market size = ~4,001,970 households with a smart display in the kitchen, and growing as these devices get cheaper and more widely adopted.
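
The estimate above, spelled out with the same figures:

```python
# Back-of-the-envelope market sizing from the figures cited above.
smart_speaker_households = 133_000_000
smart_display_share = 0.059   # share of smart speaker owners who also have a smart display
kitchen_share = 0.51          # share of Amazon Echo owners who keep the device in the kitchen

smart_display_households = smart_speaker_households * smart_display_share  # ~7,847,000
kitchen_smart_displays = smart_display_households * kitchen_share          # ~4,001,970

print(f"{kitchen_smart_displays:,.0f} smart displays likely sitting in kitchens")
```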

Why release the app for free?

Currently, if you want to buy things with your voice through Alexa, which is a growing method of purchasing goods, Amazon holds what is called position 0. This puts Amazon's products first when users shop with their voice, meaning users will rarely hear an alternative. If I can generate sales for consumer packaged goods companies through an ecosystem of my own recipes, I believe I can monetize with branded recipes in the future. Sending the ingredients list to pickup generates a direct sale for those companies.

If you are interested in this business, reach out to me. Links on this website.

Insights-

Could voice commands make the cooking process easier with the aid of video?

To answer this question with my own research, I laid out my findings on the wall and grouped them into loose themes. (The answer is given in Opportunity Areas.)

Wall with sticky notes

Figure 3.1

Rapport-

Most cooking guides I found either had a rating system or a compelling host who built a connection with the person cooking their recipe. It makes sense: when making a meal for yourself, you want some reassurance that your food is going to taste good.

Ingredients-

Cooking starts with ingredients, and if you're like me you get them at the grocery store. You may not get all of your ingredients, especially if you don't know what they look like, but having a visual aid might help you get everything. Or you might skip ingredients because you dislike the drive. This made me realize that grocery store pickup and delivery can solve both of these problems by gathering all of the ingredients in one place and, optionally, sending them straight to your door. This service is available at all major grocery stores.

Additionally, if you have a staple food at your house like rice or potatoes, your meals can be significantly cheaper, especially if you cook these foods regularly or habitually.

Positive feelings from cooking-

People can get a sense of accomplishment and pride from the food they create, and have fun while learning techniques and pairings. While we cooked together, my friend Julian taught me multiple things that opened my eyes. I learned that I was already doing things people in the industry do, and that they weren't that hard to pick up. That was really fun; it felt like a personal cooking lesson. I wanted to give people a similar experience when using my skill.

Visuals-

From cooking with all these other methods, I noted some things I liked and disliked. I liked seeing tools included as part of the recipe after cooking with recipes that didn't include them. I liked knowing the serving sizes and cooking times of recipes, and not getting my screen dirty. Lastly, I liked seeing steps segmented into different parts after using a recipe that was a wall of text.

Opportunity areas-

Yes, voice commands could make the cooking process easier with the aid of video. But so far this technology was mostly being used to help people assemble meal kits. After doing the research, I saw some clear improvements we could make right away to this kind of cooking.

I saw the opportunity to help people cook with grocery-store-sourced ingredients at nearly the same level of convenience by utilizing grocery store pickup.

I saw the opportunity to help them save money with this process by utilizing the food they probably already have in their pantry to cook regularly.

I also saw an opportunity to use this technology to teach people techniques they could integrate into their own cooking.

I took these findings to Julian and we brainstormed:

How Might We change recipes in a way that could teach people about cooking?

Ideation-

Julian and the Author brainstorm on a whiteboard

Figure 4.1 – Yes, this is also from footage. I film everything for documentation purposes.

In figure 4.1, Julian and I talked about what this process would look like for the simplest recipe we could think of: the omelet. We discussed the issues we had with cooking and what we thought cooking looked like at its best, and we hoped to express that through our content. We cooked a couple of omelets and they were tasty, but the recipe was too simple. We decided to scale up.

I asked Julian to show me a couple of medium-difficulty recipes we might use. When he did, they were all baked goods; he had trained in baking at school and it was what he was most comfortable with. We settled on a cheesecake with blueberry pate and started the recordings we planned to put into the skill. The total process, including cooling, took four hours. While the cheesecake was delicious, it was clear to me that this was not the recipe we were going to launch with. I was exhausted, and I knew it would be extremely hard to user test as the first recipe. So we moved on to some simpler recipes.

Julian pulled out a couple of gigantic cookbooks and we browsed for simple recipes with simple ingredients. We wanted to use ingredients that people probably already had in their kitchens, and we narrowed the choice down to either cheese bites or vegan banana bread. Ultimately, we chose the banana bread because it was more accommodating of dietary restrictions and was healthier.

Game plan for recording day

Figure 4.2

Planning how you are going to shoot this kind of content needs to be coordinated with whomever you are filming. Figure 4.2 shows the rough game plan I made with Julian after first making a test banana bread. I knew that being quick and efficient would be important if we ever needed to scale this up. To do that, we had a few simple shots in mind that we could repeat: overhead, process, time-lapse, and lesson shots. We arranged all the ingredients and took a picture we intended to use for the mise en place, then started filming. We then took some glamour shots of the final product for thumbnails and intros. After editing the footage myself, I turned it into a slideshow, a low-fidelity prototype that would give people a clear picture of what this project did.

Build Test Iterate-

I spent my spring break building the skill and coding everything I needed to make it work. I had experience with the Amazon Echo development portal, but I still needed to learn a new tool, the Alexa Presentation Language (APL), which would render the interfaces for my skill. I was aiming to build a "minimum viable product" to see whether people could understand how to use whatever I built before I started adding features. The code for my first working draft is on my GitHub.

The interface was simple: it opened with a page that told you that you could navigate with your voice by saying "next" and "go back" as needed, then it showed you the ingredients, then every video step one by one with a little play/pause button in the top-left corner. It was a lot like the slideshow I had before, but now it was driven by your voice.
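
For anyone curious how this kind of voice navigation hangs together, here is a minimal sketch in the ASK SDK for Python. It is not the exact code from my GitHub; the step text and handler names are placeholders for illustration.

```python
# Minimal sketch of voice-driven recipe navigation with the ASK SDK for Python.
# The step text and handler names are placeholders, not the production skill code.
from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_request_type, is_intent_name

STEPS = [
    "Preheat the oven to 350 degrees.",
    "Mash the bananas in a large bowl.",
    "Fold in the dry ingredients.",
]

class LaunchHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return is_request_type("LaunchRequest")(handler_input)

    def handle(self, handler_input):
        # Start the session at step zero and explain the voice commands up front.
        handler_input.attributes_manager.session_attributes["step"] = 0
        return (handler_input.response_builder
                .speak("Welcome to Banana Bread Baker. Say next to move through the "
                       "recipe, or go back to repeat a step.")
                .ask("Say next when you are ready.")
                .response)

class NextStepHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return is_intent_name("AMAZON.NextIntent")(handler_input)

    def handle(self, handler_input):
        session = handler_input.attributes_manager.session_attributes
        step = session.get("step", 0)
        if step < len(STEPS):
            speech = STEPS[step]
            session["step"] = step + 1
        else:
            speech = "That was the last step. Enjoy your banana bread!"
        # The real skill also attaches an APL directive here so the looping
        # video for the current step appears on screen.
        return handler_input.response_builder.speak(speech).ask("Say next to continue.").response

sb = SkillBuilder()
sb.add_request_handler(LaunchHandler())
sb.add_request_handler(NextStepHandler())
lambda_handler = sb.lambda_handler()
```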

With a working prototype on my device, I could user test. I started by giving people paper ingredients to put together my recipe and using the five-act interview that I picked up from reading Jake Knapp's book "Sprint." These are the steps.

1. Friendly welcome

2. Context questions

3. Introduction to the prototype

4. Tasks

5. Quick debrief

User testing room, Interview time

Figure 5.1

This process really helped the people testing my prototype become more comfortable in the testing room (figure 5.1) and helped me fill out my heuristic evaluation. I found that a major issue was people not knowing they could say "next" after the first screen, and I quickly patched it. However, testing with real people trying to cook with my skill showed me much more. I cooked with the skill alone a few times and found more errors to add to my heuristic evaluation, but I learned even more when I had two of my friends come over to cook with it. Like before, I used the five-act interview to get into it. When cooking with real ingredients I realized that I, as the editor, had not noticed that "tsb" is not a measurement and had used it for both tablespoon and teaspoon amounts. No wonder my bread had come out differently. I also realized that if the skill closed out for any reason, in this case crashing, it would take a lot of "Alexa, next" commands to get back to where you were.

After these tests I had a long list of issues on my heuristic evaluation, many of which required re-shooting video. So we did, and I spent some time fixing things that were broken or buggy and adding functionality that should have been there. The ingredients could now be asked for, the wording was clearer, the mise en place was now accurate, baking times were corrected, and a new page was added to solve the navigation problem. The updated pages are shown below in the device simulator because it gives more context; the red annotations in figure 5.2 are tips on how to read the simulator interface.

Alexa simulator first page of Banana bread baker

Figure 5.2

Mise en place in banana bread baker

Figure 5.3

Julian introduces you to the recipe

Figure 5.4

Audio free looping video for preheating the oven

Figure 5.5

A list that looks like a traditional recipe: a touchscreen list used for navigation in Banana Bread Baker

Figure 5.6

Functionality of each of the pages and future plans for improvement-

On the first page, Figure 5.2

– There is an image that gives you a preview of what you are about to cook, along with instructions for how to use the skill. Alexa also tells you how to use the skill out loud. Throughout the skill, Alexa's voice only shows up when there is no video to help you navigate. In user tests I found that people sometimes ignore text, especially on static screens or when they assume they already know how to use something.

Improving this page

– We plan to improve this page by adding serving sizes and total cooking time.

– The functionality to see the ingredients exists in the skill but is not mentioned here, and neither is "help."

– We may add an accessibility option that plays audio narration throughout for visually impaired users.

In Figure 5.3

– The signifier to say "next" was needed because some people did not know they could still say next later in the skill; this helped. Everything you'd need, including the tools, is laid out so you can find it in your pantry. You can also use this picture as a visual aid while shopping.

Improving this page

– I had intended to connect this with Walmart's API, or another grocery store's pickup service, but then I thought about how users would need to enter all of their account data in order to get the cart sent to the store. I'm now thinking an easier solution would be to show a QR code that links to the ingredients you need for the recipe, for example this shared list: https://www.walmart.com/lists/shared/66b05975-aa2c-4d4e-87cd-2b136bbaa57e. The list can be used to shop with or sent to pickup, just from your phone rather than the Amazon Echo Show device. As noted in the market research section, the brands have been pre-selected for you, and if you simply click "send to cart," those are the brands you buy. A sketch of generating such a code is shown below.
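
Generating that QR code is straightforward; here is a minimal sketch using the Python qrcode library (the output filename is just an example).

```python
# Generate a QR code that points at the shared shopping list for the recipe.
# Requires: pip install qrcode[pil]
import qrcode

SHOPPING_LIST_URL = "https://www.walmart.com/lists/shared/66b05975-aa2c-4d4e-87cd-2b136bbaa57e"

img = qrcode.make(SHOPPING_LIST_URL)          # build the QR image with default settings
img.save("banana-bread-shopping-list.png")    # save the PNG for use as an on-screen image asset
```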

In Figure 5.4

– Julian introduces himself and his credentials and also introduces the recipe you're about to make. This builds a little trust and gets you acquainted with what you're about to cook. In every video of the chef, he explains part of the cooking process and then passes the baton to you. These videos automatically skip ahead to an instructional step with no audio; you can use voice commands to go back and listen again. The only visual control is a play/pause button. Even though all audio-free videos loop, I wanted to give users the option of pausing in case they wanted to study a detail in the video. You can pause and play with your voice as well. Touch interfaces are kept limited because the point of this skill is to keep your hands on your food and away from the screen.

Improving this page

– Signifiers for saying "next" or "go back" may be needed, but no users so far have needed them, and I would prefer to keep the video clear of clutter. I would like to integrate a visual indicator of which step you are on out of the whole recipe and have it link to the list in figure 5.6. Additionally, the play/pause button is very large and might be better placed in a different part of the screen.

In Figure 5.5

– A line of text tells you what you need to do, and the accompanying looping, audio-free video gives you as much time as you need to complete the step, free of annoyances (a sketch of how such a looping video can be declared in APL follows below).

Improving this page

– Similar notes as above, with the additional option of having Alexa dictate what is on each screen for visually impaired users.
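
Because APL documents are plain JSON, the looping step video can be declared with a small fragment like the one below. This is only a sketch of the idea, written as the Python dictionary you would hand to a RenderDocumentDirective; the video URL, sizing, and document details are placeholders, not the skill's actual assets.

```python
# Sketch of an APL document whose main template is a single looping, audio-free video.
# The URL and dimensions are placeholders.
from ask_sdk_model.interfaces.alexa.presentation.apl import RenderDocumentDirective

step_video_document = {
    "type": "APL",
    "version": "1.1",
    "mainTemplate": {
        "items": [{
            "type": "Video",
            "width": "100%",
            "height": "100%",
            "autoplay": True,          # start playing as soon as the page renders
            "audioTrack": "none",      # the step videos carry no audio
            "source": [{
                "url": "https://example.com/videos/preheat-the-oven.mp4",
                "repeatCount": -1      # loop until the user says "next"
            }]
        }]
    }
}

directive = RenderDocumentDirective(token="stepPage", document=step_video_document)
# Inside a request handler: handler_input.response_builder.add_directive(directive)
```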

In Figure 5.6

– This page solved the issue of people constantly having to say "next" to reach a step deeper in the recipe; now you can just select it. The page resembles a traditional recipe, but each step is linked to one of the videos that will help you cook (a sketch of handling those touch selections is shown after the list of improvements below).

Improving this page

– This page needs work because it is the newest addition to the skill.

– Not all of the text fits on one screen, which makes it impossible to know how many steps there are through voice navigation alone.

– There is no voice navigation yet.

– The steps are abbreviated, maybe too much.

– The steps need some space between each one.

– The title is too large, and might read better as "Vegan chocolate chip banana bread."

– Pictures of each step might be nice.
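
A tap on one of those list items comes back to the skill as an APL UserEvent. Here is a minimal sketch of routing that tap to the right step, again with the ASK SDK for Python; it assumes each item's TouchWrapper sends the step index as its first argument, which is a convention chosen purely for illustration.

```python
# Sketch of handling a touch selection from the recipe-style navigation list.
# Assumes each list item's TouchWrapper SendEvent command passes the step index
# as its first argument; that convention is illustrative, not the shipped code.
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_request_type

class StepSelectedHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return is_request_type("Alexa.Presentation.APL.UserEvent")(handler_input)

    def handle(self, handler_input):
        # SendEvent arguments arrive on the UserEvent request.
        step_index = int(handler_input.request_envelope.request.arguments[0])
        handler_input.attributes_manager.session_attributes["step"] = step_index
        # The real skill would also render the APL page with that step's looping video.
        return (handler_input.response_builder
                .speak(f"Jumping to step {step_index + 1}.")
                .response)
```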

Final thoughts-

I really believe this methodology could make it easier for people to save money by cooking for themselves with staple foods they probably already have. It could save them time at the grocery store by sending the ingredients list to their phone, or even to their house with delivery. This method already makes cooking extremely easy by showing people exactly what to do. However, as of now the project only helps people make banana bread. To help people save money by cooking for themselves, I need to add more recipes. I plan to add more recipes and organize them by the staple foods they pair with, so that people can cook regularly and consistently for themselves until it becomes habitual.

So far, my metrics show 68 unique users with zero advertising (a unique user means someone who has logged in from a different device). Since its release I have seen my little sister and my older sister's boyfriend use the skill to make delicious banana bread easily, without me needing to be there. I am afraid that if I advertise, my compute costs will be too high for me to keep running this skill for free, so I do not expect much more growth. You are free to check the skill out for yourself. It's live! https://www.amazon.com/Darrell-Keller-Banana-bread-baker/dp/B07TYBW3GW

This project has been featured at the ArtSciLab at UTDallas and at the Institute for Innovation and Entrepreneurship at Blackstone for CometX.