Amazon Provides DIY Echo Plans for Raspberry Pi

escobar · on March 25, 2016

I have had reservations about the Echo line because of the whole "always listening" thing, regardless of what anyone's said about how it's not recording, how I can unplug it, etc. The whole "always listening" thing isn't what interests me about playing with Alexa.

As someone who's spent a fair amount of time with hardware, I think this is what will make me tinker with the Alexa service - I am interested to see what it can do and I like keeping up with Amazon's hardware projects. I've got all the parts lying around to throw this together without spending anything, so it's a neat way for them to grab some interest from a different user demographic. This should also be fairly easy to get running on a BeagleBone, which I tend to lean towards (more I/O, PRU can be useful).

SomeCallMeTim · on March 26, 2016

>I have had reservations about the Echo line because of the whole "always listening" thing

I've spoken with other people about feeling this way. I think that the difference here is actually 100% psychological, and that always-listening devices like the Echo are exactly as trustworthy as the company that makes them.

I am currently standing next to at least 4 different devices with microphones and internet connections that are on or in "sleep" mode. Just because they aren't "listening" to me in a way that is obvious (e.g., they respond to a command) doesn't mean that they're not recording every sound I make. There are in fact trojans designed to do exactly that.

Either you trust the manufacturer of these devices or you don't. The fact that there's a secondary processor on the Echo that does low-power constant voice recognition for the word "Alexa" (and similarly for some phones which can be activated with "OK Google") doesn't make it suddenly more likely to be storing all of your audio, all the time.

The only salient difference is just that it makes it obvious that it was in fact listening, whereas any internet-connected device around you could be listening to you right now and simply never let on.

I'm keeping my Echo plugged in. :)

ipsin · on March 26, 2016

Even if you trust the company, you also have to trust that they have no logs that can be subpoenaed and that they cannot be compelled or hacked to wiretap you.

ketralnis · on March 27, 2016

Sure, but that misses the point: if your phone were doing that it would be functionally equivalent. You wouldn't be able to tell any more than you would with Alexa

cptskippy · on March 26, 2016

I'm pretty sure Alexa isn't recording or transmitting what's said at all times, there's just a local process looking for the queue to start recording a sample to upload. That queue being the word "Alexa".

FungalRaincloud · on March 26, 2016

I believe you're right, but it still feels icky to have something always listening for that special phrase, to me. Maybe if I wrote and maintained the code myself that always listened, I'd feel more comfortable with it.

Oh, I think the proper word was cue, not queue, by the way.

AckSyn · on March 26, 2016

It would scare you to know that OnStar can be remotely activated without the driver knowing, wouldn't it?

fsckin · on March 26, 2016

Or any cell phone.

XorNot · on March 26, 2016

This has always seemed pretty marginal to me. You might be able to turn the microphone on, but when my battery dies an hour later I'll be suspicious.

jayd16 · on March 26, 2016

Lithium ion batteries don't get old and lose capacity. That's just the NSA backing off the sleep interval.

jayd16 · on March 26, 2016

>Maybe if I wrote and maintained the code myself that always listened, I'd feel more comfortable with it.

Maybe? Maybe you'd be comfortable? There's a world where you wrote the code yourself and still don't trust that it's not sending data back?

baocin · on March 26, 2016

You could try adding in some simple motion, like raising your hand, before the system would start listening for hot words. Maybe connect a cheap infrared sensor to GPIO and block its view so it only detects motion at or above certain height.
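
A minimal sketch of that gating idea, assuming the sensor driver reports motion by calling a method with a timestamp. The class name `MotionGate`, its methods, and the 10-second window are all made up for illustration; no real GPIO library is used here, so the sensor wiring is left out entirely:

```python
from typing import Optional


class MotionGate:
    """Allow hot-word listening only for a short window after motion is seen.

    Hypothetical helper: a real setup would call motion_detected() from
    whatever code reads the infrared sensor on the GPIO pin.
    """

    def __init__(self, window_seconds: float = 10.0) -> None:
        self.window_seconds = window_seconds
        self._last_motion: Optional[float] = None

    def motion_detected(self, now: float) -> None:
        # Remember when the sensor last fired (seconds, any monotonic clock).
        self._last_motion = now

    def listening_allowed(self, now: float) -> bool:
        # No motion seen yet means the microphone stays ignored.
        if self._last_motion is None:
            return False
        elapsed = now - self._last_motion
        return 0 <= elapsed <= self.window_seconds
```

The hot-word loop would check `listening_allowed(time.monotonic())` before handing any audio to the recognizer, so audio captured outside the window is simply dropped.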

fn1 · on March 26, 2016

"queue" -> "cue"?

JohnBooty · on March 26, 2016

Looking at what Amazon's posted, it looks like what they've released doesn't even give you the "always listening" option.

You have to click on the "start listening" button and then the "stop listening" button.

ChuckMcM · on March 26, 2016

Now all you need is a remote microphone in the shape of a star trek communicator for your shirt pocket. Tap, "Alexa, three to beam aboard." :-)

samstave · on March 26, 2016

Just needs a vocera badge integration and a connection to Coding Insight from Talix to allow for NLP voice access to patient records...

djhworld · on March 26, 2016

I believe the terms and conditions for the Alexa SDK state you can't activate the Alexa Voice Service via voice, your user has to purposefully interact with something like a button to use it.

EDIT: I'm wondering if this is a legal thing, i.e. they don't want any Tom, Dick or Harry creating "always listening" devices associated with their brand, or they just want to differentiate their Echo product and not have competitors.

halite · on March 26, 2016

Confirmed.

This is not an always-listening app. You have to click "start listening".

escobar · on March 26, 2016

Yes, agreed - this is perfectly fine with me. If I wanted the "always listening" feature I'd grab an "official" Alexa device :)

techdragon · on March 26, 2016

It's worth remembering that these services are dependent on the network 'carrying their packets'. If you're worried about them 'phoning home', just make sure your home network is configured to record such events to the best of your ability, and set up some simple alerts or blocking.

x5n1 · on March 25, 2016

Agreed, I think this can be generalized to say that this should be the case with all these cloud services. The hardware should be open source, so you can at least control that part. At least you have that much control over the cloud APIs.

dperfect · on March 25, 2016

This may bring me one step closer to my personal "holy grail" of home automation: every room in the house[1] working with seamless voice-activated home automation. This is what I'm ultimately after:

- A cheap device (DIY if possible) in the form factor of a small plug-in unit. Ideally the device itself should be practically "invisible" in each room, and won't require any special home wiring. This is definitely in the realm of possibility for a Raspberry Pi (or similar).

- A microphone for the device that works at least as well as the Echo's far-field mic. I have not been able to find any good options for this, apart from some obscure parts that are too expensive for me to test, let alone buy for every room.

- Software that allows for voice-activated operation. There's probably a suitable workaround for doing this with the Alexa Voice Service now, though it may require more CPU power than is available on the Raspberry Pi.

- Ideally, I could host the voice service myself and wouldn't have to worry about the privacy implications of going through someone like Amazon. I know there are several existing software packages that claim to do this, but none that I've found can match the quality of Echo/Alexa for everyday interaction.

- Audio feedback does not need to be high quality, but at least audible. A small speaker within the device is probably enough. For other areas of the house, it would be nice for the output to be connected to a bluetooth speaker in the room or a home audio system (if available).

The Echo Dot appears to be a pretty close match for this (though I haven't tried it) - at least in terms of functionality, but the form factor still seems a bit off. I'd rather have a self-contained plug-in unit than something that sits on a desk or table.

[1] Or most of the house anyway

Swizec · on March 25, 2016

> This may bring me one step closer to my personal "holy grail" of home automation: every room in the house[1] working with seamless voice-activated home automation.

But why? What do you want to automate?

That's the part I never understood about this stuff. What is there to automate in the first place?

dperfect · on March 25, 2016

That's a good question. Maybe "home automation" isn't exactly the right name for it, because that's just a part of it. It is nice to be able to control things by voice (lights, locks, window shades, music, TV, etc), but to call any of those things "essential" just sounds lazy :)

For me, the Amazon Echo is more about having a connection to the world without looking at a screen. Getting answers to random questions (or utilities like timers), latest traffic conditions for travel, weather, etc are all really great applications for a voice interface. If that could also be combined with a good communication platform (voice calls, text/email messaging, etc), it would be even better.

kingnothing · on March 28, 2016

What does Echo offer that Siri, etc. don't? You already have all of that functionality through your phone.

JohnBooty · on March 25, 2016

This is what I always wonder too.

If I lived alone with no pets I'd love to be able to automate the temperature to save money when I wasn't home. But there are usually people in my house, and there are always pets there, so I have to keep it pretty temperate in there at all times.

Other automatable things (lights, lawn sprinklers) are so easily accomplished with simple analog timers that I really do not see the attraction in controlling them via an app. Maybe it's just the programmer in me thinking... at this point in my life, I'm very jaded when it comes to software.

Security is a big one, I guess. It would be nice to be able to monitor that remotely. Though honestly I already have a dog in the house which is probably more effective than 90% of the solutions on the market.

My big thing is whole-house streaming audio, and we've had that for years with Airplay and Sonos and now with Chromecast as well. So that's cool.

lstamour · on March 26, 2016

Things I've considered automating: different light settings (temperature, which light to keep on), watch-style notifications/interactions when not at my device (usually because it's charging), instructions to nearby screens or audio systems, reminders... A lot of it relates to integrations and interactions which require too many button pushes. Voice control is one step closer to "mind-reading"-like experiences and having knowledge of who is in which room would improve the context-awareness of existing apps and whole-home systems. For example, ideally if I said "play some music" it would know who I was and what I liked vs others in the house. This is more than voice control, and is likely phone- or app-integrated, but every small step gets us closer. ;-)

modoc · on March 25, 2016

Not the OP but here's some of the stuff I have/want:

Have:

Adding things to my shopping list (via Alexa - currently in the kitchen and bathroom - two most frequent rooms where I'm like "hey I need more XXXX")

Changing the temperature (Nest via phone app and command line tool I wrote, and now Alexa just did a native integration)

Finding out my schedule for the day.

Finding out the weather forecast for the day.

Playing music (via Sonos or Alexa)

Locking front door when I'm going to bed.

Getting notified of movement or sound in the house while I'm away (Nest Cam and Smart Things)

......

Like the OP I'd love to have the voice control available in any room of the house. I'd love to have better security features (i.e. if I drive up my driveway at 3 AM turn on the outside lights, if a stranger drives up my driveway at 3 AM WAKE ME UP WITH AN ALARM). I'd love better voice controlled Sonos music selection. Composing emails, having handsfree phone or video calls, warming up the oven at a certain time, etc...

alexcaps · on March 25, 2016

It's not low cost yet, but check out http://josh.ai. Starting high end but price will drop significantly.

modoc · on March 25, 2016

I will, thanks!

stepanhruda · on March 25, 2016

Sounds like all of these could also be handled by a phone/watch instead of needing a device for each room.

For adding to shopping list, check out Amazon Dash.

modoc · on March 25, 2016

I have a few Dash buttons, but it's SUPER limited what you can link them to. Plus I'd need hundreds of them:) The "Alexa, add XXXX to my shopping list" works really well for me.

I don't wear a watch around the house, and I don't really like smart watches honestly - I like big heavy metal automatics. Likewise I enjoy being able to leave my phone on the charger somewhere else. I dunno. I mean long term dream would be integrating things like "smart" mirrors, semi-smart AI-esque assistants, and so on. Then again, I'm currently in a zero-tech house on the beach in Costa Rica, with a barely there internet connection, so it's not like any of this is that important...

robbiet480 · on March 25, 2016

Let me introduce you to Dasher: https://github.com/maddox/dasher

serge2k · on March 25, 2016

Personally I'd love to automate

* lights

* alarm clock (setting, turning it off and on, 15 more minutes)

* be able to turn on my xbox/fire tv and tell it I want to watch <x>, then it figures out how

* taking the dog out when it's 1am and I just want to go to bed (okay this one might take a little longer).

omginternets · on March 25, 2016

I'd also like to ask: how is inviting more networked microphones into your house a good idea?

oh_sigh · on March 25, 2016

OP more describes home control than home automation. Home automation is the house intelligently doing things for you, without your prompting. Home control is the house doing things for you when you ask it to do things for you.

E.g. Home automation: Every day at sunset, lower the blinds

E.g. Home control : You say "House, lower the blinds now".

Aleman360 · on March 25, 2016

Agreed. I want minimal technology at home. I deal with it enough all day at work.

BatFastard · on March 27, 2016

I just want technology that WORKS at home. Too much of my time is spent fixing other people's "mistakes". I find Alexa fits that bill. Takes a bit of learning the right cadence of speaking to her. And unfortunately she has the same name as my daughter, which causes some interesting interactions.

vollmond · on March 25, 2016

lots of things would be nice to have control of from work, or even just from bed at night. climate control, door locks, lights, etc.

faitswulff · on March 25, 2016

Yeah, and what about the endless false positives in every room of the house?

hirsin · on March 25, 2016

It's pretty hard to order something via Alexa - I know, my friends have tried to order me several things ranging from sex toys to more Echos. Never worked. Please don't spread nonsense.

faitswulff · on March 25, 2016

Didn't know that. Edited to reflect the actual concern, minus the humor.

ocdtrekkie · on March 25, 2016

Yeah, if anyone found something equivalent to the Echo in mic tech that I can hook up to my own setup, I'd die to know. That's the one thing I can't really replicate on my own system.

oh_sigh · on March 25, 2016

The kinect has gotten good reviews for its microphone array

voltagex_ · on March 26, 2016

I haven't bothered getting the SDK for the Kinect 2 yet but both of them seem to only be good for about 135 degrees around the device - the Echo is a much better omnidirectional microphone.

chris11 · on March 26, 2016

The kinect seems pretty cool. I've heard of one person who brought it into work and used it to identify and greet people who walked into his office. So there are probably some pretty cool user identification projects you could do with it.

_asummers · on March 26, 2016

My (admittedly anecdotal) experience having an Xbox One is that it would pick up phrases that barely sounded like built in triggers and do unexpected things at unexpected times, such that I felt forced to disable it.

SomeCallMeTim · on March 26, 2016

We've had an Echo since before they were available to the public (ex-Amazon employee here), and it does occasionally beep when it thinks it hears "Alexa."

But it's very rare that it goes beyond that; usually we hear the beep and one of us will yell "never mind!", and then the Echo will go silent. Sometimes it will cancel itself, too. I think that the always-listening part is less discriminating than the active voice recognition, and that it can re-parse the last couple of seconds and decide, in retrospect, that you didn't say "Alexa."

That said, I just recently said to my wife, "I wonder if Alexa knows 'sudo make me a sandwich'". The timing was such that it parsed "sudo make me a sandwich" and answered, "Well, if you ask like that, how can I refuse?" :)

We had a good laugh. Funny thing was, looking at the app later, it was actually parsed as "Pseudo make me a sandwich." I bet Google Now would have corrected it to sudo. :)

eiopa · on March 25, 2016

tl;dr:

It's literally a tutorial on configuring Alexa Voice Services + their sample code on Debian.

The way you interact with it is by clicking on a button in a Java app. No trigger phrase like Echo.

Nexxxeh · on March 25, 2016

But presumably you could have the Pi listen for a trigger word or whistle or whatever using software running locally, and when triggered, kick over to the Alexa API?

tmuir · on March 26, 2016

You could set up an IFTTT "Do" button [1] with the Maker Channel, which allows you to make an arbitrary web request. Then have a server running locally that can receive the request and trigger the recording. Node-RED [2] would make setting that server up pretty simple.

[1] https://ifttt.com/products/do/button

[2] http://nodered.org/
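
A rough sketch of what such a local server could look like using only Python's standard library rather than Node-RED. The `/listen` path, the port, and the `on_trigger` callback are assumptions made for illustration; nothing here calls the Alexa Voice Service, and the server would have to be reachable from wherever the web request originates:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable


def handle_trigger(method: str, path: str, on_trigger: Callable[[], None]) -> int:
    """Decide the HTTP status for a request; fire on_trigger only for POST /listen."""
    if path != "/listen":  # hypothetical endpoint name
        return 404
    if method != "POST":
        return 405
    on_trigger()
    return 204  # success, no response body


def make_trigger_server(on_trigger: Callable[[], None],
                        host: str = "127.0.0.1",
                        port: int = 8080) -> HTTPServer:
    """Build (but do not start) an HTTP server that calls on_trigger on POST /listen."""

    class TriggerHandler(BaseHTTPRequestHandler):
        def _respond(self, method: str) -> None:
            status = handle_trigger(method, self.path, on_trigger)
            self.send_response(status)
            self.end_headers()

        def do_GET(self) -> None:
            self._respond("GET")

        def do_POST(self) -> None:
            self._respond("POST")

        def log_message(self, format: str, *args) -> None:
            pass  # keep the console quiet

    return HTTPServer((host, port), TriggerHandler)
```

Calling `make_trigger_server(start_recording).serve_forever()` would run it, where `start_recording` stands in for whatever function kicks off the sample app's recording.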

IshKebab · on March 26, 2016

There's no good open source software to do this.

Also you need a microphone array to do it reliably (the Echo has 7 microphones).

jjwiseman · on March 26, 2016

Yes, but it won't work as well as the Echo, especially in a noisy environment.

jjwiseman · on March 26, 2016

To expand a little on this: The Echo has a 7-microphone array which is crucial to speech recognition accuracy. This gives it the best far-field recognition ability of any consumer product I've seen, with the ability to stay accurate even if you're across the room, with music playing. That's just the hardware, and replicating its abilities will not be easy.

On the software side, supposedly they're using Nuance for recognition. Nuance isn't cutting edge: In the tests I've done, Nuance has a Word Error Rate (WER) that's 10%-20% higher than Google's, but it's still much better than something like Pocketsphinx or any other open source recognizer.

There are a lot of factors that go into making a speech interface a good experience for users: Good recognition accuracy even with background noise, good voice activity detection (even with background noise), very accurate word spotting, low latency. It's hard to hit all these things well enough to make the interface usable.
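
For anyone unfamiliar with the metric: Word Error Rate is the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and the recognizer's output, divided by the number of words in the reference. A minimal sketch of the calculation, not tied to any particular recognizer; the function name `word_error_rate` is just illustrative:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, hearing "pseudo make me a sandwich" when the speaker said "sudo make me a sandwich" is one substitution over five reference words, a WER of 0.2.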

Slippery_John · on March 26, 2016

That's against the Alexa Voice Service ToS though

jxy · on March 25, 2016

Does it mean that we can run it on any computer that runs Java? I read through the tutorial but couldn't find anything that specifically tied to raspberry pi.

rtpg · on March 26, 2016

Probably. The nice thing about the Pi is that it's cheap and has crazy low energy consumption

mnglkhn2 · on March 26, 2016

the trick left for the "makers" is to add a button to Raspberry Pi that will let you press it and have the app "listen" to your voice.

lolmycat · on March 26, 2016

From what I understand, the echo has a specific piece of hardware in it that is 'always listening', and once triggered via voice command the echo actually begins to listen. So unless you have something connected to the device that could reproduce that initial voice analysis hardware, you can't have the 'always listening' feature.

Viper007Bond · on March 26, 2016

Probably for power reasons, much like the most recent iPhones can be controlled by "Hey Siri". However, older iPhones can always listen too, but are required to be plugged in to do it because they're using their main processor and doing it at a software level.

In short, always listening isn't difficult on non-battery devices, it's just a software problem.

hiharryhere · on March 26, 2016

I made an Alexa clone and use PocketSphinx to listen out for a wake word.

There's a phrase detection function you can configure to trigger audio streaming to the cloud.

torbjorn · on March 25, 2016

This is awesome. I am strongly considering setting this up as I just purchased a fresh raspberry pi.

The only limitation appears to be you have to click a "start listening" button to get it to start recording audio. You can't simply say "Alexa" to get the raspberry pi + alexa web service to listen for your query.

Anyone have any ideas for a work around/ solution to this?

gt565k · on March 25, 2016

On a related note, check this project out. http://jasperproject.github.io/

You can perhaps trigger alexa to start listening through it by wiring the voice recognition to click the "start listening" button.

soared · on March 26, 2016

Jasper is outdated and incredibly difficult to install on any recent model (2, 3 or 0).

llamataboot · on March 25, 2016

yes this is what I came in to say. You can hardwire a phrase similar to how jasper does it and use that to trigger the start listening method

incongruity · on March 25, 2016

This actually sounds more like my ideal.

I've heard the anecdotes about Alexa responding erroneously when people weren't home and doing things like turning the furnace on, etc. That combined with the general privacy concerns make me much more comfortable with being able to push a button to get her to listen – but I'd rather it be a button on my person – like on my fitbit, watch, etc. – easy, always available – plus maybe a way to turn it on for listening over a duration – say while cooking.

deegles · on March 25, 2016

Haven't tried it yet, but I hear Pocketsphinx has a keyword spotting mode.

https://github.com/cmusphinx/pocketsphinx

alexcaps · on March 25, 2016

Look into OpenEars. It's pretty crummy but it's free :/

131hn · on March 25, 2016

I wrote a simple clap trigger (using the Raspberry Pi's mic) that is always listening. I use it to shuffle my Philips Hue lights, but it can easily be used to trigger echo voice: https://github.com/131/clap-trigger
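
One simple way a clap trigger can work is to watch for short spikes in loudness. The sketch below is an assumption about the general technique, not a description of how the linked project does it; the function names, frame size, threshold, and refractory period are all illustrative, and samples are assumed to be floats in the range -1.0 to 1.0:

```python
import math
from typing import List, Sequence


def frame_rms(samples: Sequence[float], frame_size: int) -> List[float]:
    """Split samples into non-overlapping frames and return each frame's RMS level."""
    levels = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        levels.append(math.sqrt(sum(s * s for s in frame) / frame_size))
    return levels


def detect_claps(samples: Sequence[float],
                 frame_size: int = 256,
                 threshold: float = 0.5,
                 refractory_frames: int = 4) -> List[int]:
    """Return indexes of frames whose RMS reaches the threshold.

    Frames within refractory_frames of the previous hit are skipped so one
    clap (and its echo) is not counted several times.
    """
    claps: List[int] = []
    last_hit = None
    for index, level in enumerate(frame_rms(samples, frame_size)):
        if level < threshold:
            continue
        if last_hit is not None and index - last_hit <= refractory_frames:
            continue
        claps.append(index)
        last_hit = index
    return claps
```

A real trigger would feed this from the microphone in chunks and would likely need a threshold tuned to the room, since a fixed value will also fire on slammed doors and loud music.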

jimmcslim · on March 26, 2016

I'm imagining carrying around a little bell.

pre · on March 25, 2016

I have Blather configured to do various key-presses and so control pentadactyl and so control firefox.

It could be set up to provide whatever signals a button can provide.

http://www.jezra.net/projects/blather

nrub · on March 25, 2016

I think part of the terms for the Alexa voice service forbid auto-listening.

_jcwu · on March 25, 2016

You could use a bluetooth remote with 1 button or so to trigger that listening. I don't think the RPI is powerful enough to do voice recognition.

llamataboot · on March 25, 2016

It is indeed for a limited number of phrases! I have used it that way for at least 2 years with Jasper and PocketSphinx

ViViDboarder · on March 26, 2016

I wish I could go one step further and instead of even having a mic on the device, use a web app on my phone to record and send the audio to the pi

Implicated · on March 25, 2016

Privacy concerns aside, this is pretty damn cool.

I've been looking for an excuse to tinker with a raspberry pi for a while - this seems like something I could have some fun with then give away to someone less paranoid/concerned with the privacy issues.

jMyles · on March 25, 2016

Well, isn't the point here that you can verify that it isn't listening except when you want it to?

criddell · on March 25, 2016

That's a great point. A little openness can only help Amazon sell even more of these things. I don't have one, but everybody I know who does really likes it.

CaptSpify · on March 25, 2016

You do have more control, sure, but it's still cloud dependent

jMyles · on March 25, 2016

But you can hard-interrupt the microphone. I mean it's a completely different dynamic as far as security.

CaptSpify · on March 27, 2016

Sure, but you still have to send your data through their API

blacksmith_tb · on March 25, 2016

Nice to see them walking through pretty much everything from getting your RPi running to making it work with AVS. That said, Sam Machin's Python CHIP / RPi client was there first, and has a smaller footprint: https://github.com/sammachin/AlexaCHIP

daveloyall · on March 25, 2016

Props to Amazon for putting this up. There are hundreds of steps and a lot of it is manual drudgery. 10/10 would hack again.

haack · on March 25, 2016

Out of curiosity, does anyone know what amazon's incentive is to do this?

freyr · on March 25, 2016

The value is in Amazon's voice services and speech recognition platform, not in the Echo device itself. As machine learning improves, voice and speech may play a bigger part in user interaction, and Amazon wants to be out in front of that.

The Echo hardware was just a way to get the ball rolling with this platform.

sisk · on March 25, 2016

I'd imagine the incentive is more points for the speech recognition model. The race is for who can handle completely unstructured speech best, and the field is vast at this stage.

atarian · on March 25, 2016

A direct response to Google opening their Voice Recognition API?

make3 · on March 26, 2016

Like with their hackable instant order button, I would expect that at least for the short term, they want to use it as a way to make purchasing as fast and easy as possible, obviously to maximize their income.

Longer term, conversational AI is clearly the next huge thing in sales and customer interaction, and they probably want to start progressively building grounds in that domain

lotso · on March 25, 2016

Guessing they'd make more money off of people ordering products through any echo device than they would on their own hardware?

nerfhammer · on March 25, 2016

Hypothetically, amazon wants to make money off of what people use echo for and not the hardware itself.

danifel · on March 25, 2016

What I think you guys are really looking for is something like this: http://www.microsemi.com/products/audio-processing/home-auto.... Ambarella uses those in their IP camera designs, so it should be straightforward to integrate...

sp332 · on March 25, 2016

Does anyone know a way to use the Android Alexa app without buying an Echo device or Fire TV first?

schlarpc · on March 26, 2016

I'm able to use it on an account with only a device generated via the Alexa Voice Service. I made a device profile on the developer site, then authenticated to it via the OAuth flow.

dharma1 · on March 26, 2016

Has anyone done hardware tinkering with the Echo? Does it run Linux? what does the mic array look like? Possible to use just the mic array and pipe the audio elsewhere?

regularfry · on March 26, 2016

Has anyone found a decent solution to hooking more than one mic input into an RPi? Something that would allow doing some simple DSP across, say, a 4-input array?

brooklyndude · on March 26, 2016

We have 100% totally pivoted on this one. Every proposal we put out, now has Echo front and center. As we say "screens", you mean like your father/mother used to use? How old school. A screen? Oh boy ... :-)

As Woz says, "bigger than the iPhone." That sounds like a hell of a prediction to me. Woz knows all. :-)

jarmitage · on March 25, 2016

Would there be a way to dodge the privacy issues with this, by spoofing the service somehow?

sp332 · on March 25, 2016

I suppose, if you feel like implementing all the API calls yourself on your own server.

newman314 · on March 26, 2016

Self-signed cert?

Missed opportunity for Amazon to push Let's Encrypt...

thesimon · on March 26, 2016

For what reason? Amazon offers their own free certs [0] and I doubt you can get an LE cert for a local IP address.

[0]: https://aws.amazon.com/de/blogs/aws/new-aws-certificate-mana...

Irishsteve · on March 26, 2016

Anyone know how to buy one if based outside the US?