In an Alexa custom app, can I get the query as a voice stream/audio file of a user? - alexa-skills-kit

After the user invokes an app on Alexa, is there a way to get the query as a voice stream/audio file of the user? Through Alexa, I want to send the stream to a web service/Lambda that the invoked app will call, and analyze the intent there.
We have some proprietary code that we want to use for analyzing intent, hence we can't do it on the Alexa side.
Since I am sending the query after the user has invoked the app, and through the app, there are (hopefully) no privacy concerns.
Thanks

No, that is not possible, and I don't think it will be.
Echo devices connect to Amazon only, and Amazon uses its own speech recognition (the same technology it offers developers as Lex on AWS) to parse the audio. As a skill developer, you will only receive the parsed results: intent, slots - and maybe, when Amazon implements user differentiation, an anonymous ID for the speaker.
There is no way to access the original speech audio in your skill. Since the audio is also used by Amazon to train its speech recognition, I doubt they will open up that part of the ecosystem.
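For context, here is a minimal sketch (a Python Lambda handler) of what a skill does receive; the intent name AnalyzeQueryIntent and slot name Query are purely illustrative:

    def build_response(text):
        # Minimal Alexa response envelope
        return {
            "version": "1.0",
            "response": {
                "outputSpeech": {"type": "PlainText", "text": text},
                "shouldEndSession": True,
            },
        }

    def lambda_handler(event, context):
        request = event["request"]
        if request["type"] == "IntentRequest":
            intent_name = request["intent"]["name"]           # e.g. "AnalyzeQueryIntent"
            slots = request["intent"].get("slots", {})        # parsed slot values only
            query_text = slots.get("Query", {}).get("value")  # text -- never audio
            # Forward the *text* to your proprietary analyzer here.
            return build_response(f"Got {intent_name}: {query_text}")
        return build_response("Welcome!")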
The only option I see currently: build your own Echo with e.g. a Raspberry Pi; then you have full control, but you can't leverage the installed base of Echo devices.
Same applies to Google Home and Microsoft Cortana, so it's not just Amazon.

Related

What is a solution to using the scanning of a QR code to execute a smart contract function on Ethereum?

So I have been working with Ethereum's MetaMask and implemented a web app where a user, while logged into MetaMask, can call a function on my smart contract through it. Tutorials for this exist and it's not very hard.
I want to implement calling a smart contract when a user comes into a store (physically). I want to do it the following way:
A customer comes in and uses some application $A on his phone, which may be a wallet app or some other application that has access to his Ethereum wallet.
On my POS application, I will render a QR code.
When he scans the QR code with his mobile phone, it is equivalent to either:
sending Ethereum funds to our company account, while I am able to verify via an event that this has happened, or
calling a function on a smart contract. This seems more appealing because I know that smart contract calls can emit events.
So my question is:
Do any applications with the functionality of application $A exist, and are they widely used?
What can I use to expose a QR code that acts, in effect, as a visual API for my program?
What you need is a library in your application to create a QR code encoding the function/method you want to call on the smart contract. You can use https://github.com/jibrelnetwork/ethereum-qr-code/blob/master/README.md
The user will scan the QR code with the wallet app on their mobile phone. The resulting transaction will trigger the smart contract function and allow your application to proceed.
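For the QR side, a minimal sketch in Python using the qrcode library (pip install qrcode[pil]); the contract address, the checkIn function name, and the EIP-681-style URI format are assumptions here, and wallet support for function-call URIs varies:

    import qrcode

    CONTRACT = "0x0123456789abcdef0123456789abcdef01234567"  # hypothetical address

    # ethereum:<contract>/<function>?<typed args> -- EIP-681 function-call form
    uri = f"ethereum:{CONTRACT}/checkIn?uint256=42"

    img = qrcode.make(uri)       # render the URI as a QR code image
    img.save("checkout_qr.png")  # display this on the POS screen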
Hope this helps.
Cheers,
Answer to Q1:
Yes, there are applications like this (I made one myself). Most existing mobile wallets use this approach, e.g. TenX.
Answer to Q2:
Sorry but this question is not totally clear to me. Are you asking for some sort of library for creating a QR code?

HTTP POST from Google Assistant to a private server and convert the response to voice

I want to use Google Assistant from my phone to send an HTTP POST command to my server. I have a simple WebNMS app running on it; the server supports a REST API, and now I want to use Google Assistant to fire GET or POST commands at that server and read back the output.
Is this possible? I am not a full-time developer.
Yes, as @Prisoner says, it is possible. It is not what you asked, but have you seen these ways that Google provides to get Actions published without requiring a lot of developer savvy?
https://developers.google.com/actions/content-actions/
https://developers.google.com/actions/templates/first-app
I don't speak for them, but IMO Google's target audience for Action building, apart from the above, is those who have at least some familiarity with JavaScript and its runtime, Node.
There is also this - which I haven't tried by the way.
https://www.techadvisor.co.uk/how-to/digital-home/easy-actions-google-assistant-3665372/
In case it is not obvious, Google Actions are essentially websites that interact with Google's Assistant running on a Home device or a smartphone, say. Think of the Assistant as a browser initiating requests and your Action as serving them. If you can build and deploy a server that handles POSTs over HTTPS on a publicly addressable URL, and if you can understand the JSON payload that the Assistant sends and respond with the appropriate JSON to carry out your application's logic, then you are good to go.
Where you don't have a public IP address - e.g. in testing - you can use a tool like ngrok ( https://ngrok.com/ ) to reverse proxy requests emanating from the Assistant to your server.
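To illustrate the "server handling POSTs" idea, here is a minimal sketch of a fulfillment webhook in Python with Flask; the JSON field names assume the Dialogflow v2 fulfillment format, so adjust them to whatever integration you actually use:

    from flask import Flask, request, jsonify

    app = Flask(__name__)

    @app.route("/webhook", methods=["POST"])
    def webhook():
        body = request.get_json(force=True)
        # What the user actually said, as parsed by Dialogflow
        query = body.get("queryResult", {}).get("queryText", "")
        # Echo it back; a real Action would call your REST API here instead
        return jsonify({"fulfillmentText": f"You said: {query}"})

    if __name__ == "__main__":
        app.run(port=5000)

For local testing, running ngrok http 5000 gives you a public HTTPS URL to paste into the fulfillment settings.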
I have slides for a presentation I did targeting fledgling developers who had never built an Action here
https://docs.google.com/presentation/d/1lGxmoMDZLFSievf5phoQVmlp85ofWZ2LDjNnH6wx7UY/edit?usp=sharing
and the code that goes with it here
https://github.com/unclewill/parrot
On the upside, the code is about as simple as it gets. On the downside, it does almost nothing. In particular, it doesn't try to understand language. As @Prisoner says, you'll likely need a tool like Dialogflow for that.
Yes, it is possible.
Your server will need to implement the Actions on Google API. This is a REST API which will accept JSON containing what the user is intending to do and specific information about what they have said. Your server will need to send back JSON indicating the reply, along with additional information about how to continue the conversation.
You will likely also want to use a tool such as Dialogflow to handle building the conversational script and converting a user's phrases into something that makes sense to you. You'll also need to use the Actions on Google console to manage your Action and provide additional details about how users contact your Action. All of this is explained in the Actions on Google documentation.
Simple Actions are fairly easy to develop, and can certainly be done by a developer as a hobby. Good Actions, however, take a lot more thought and planning. Google offers you the tools - it is up to you to take best advantage of them.
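To give a feel for the reply side, here is a hedged sketch of the JSON a webhook might send back through Dialogflow's payload field for Actions on Google; these field names follow the v2 formats and have changed between versions, so check the current documentation:

    # Reply that answers the user and keeps the conversation open
    reply = {
        "fulfillmentText": "There are 3 alarms active on the server.",
        "payload": {
            "google": {
                # Keep the microphone open and wait for the user's next phrase
                "expectUserResponse": True,
                "richResponse": {
                    "items": [
                        {"simpleResponse": {
                            "textToSpeech": "There are 3 alarms active on the server."
                        }}
                    ]
                },
            }
        },
    }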
I've found the solution.
In the "Action" console https://console.actions.google.com/project/sandbox-csuite/scenes/Start
Go to menu "Webhook", click "Change fulfillment method", and then select "HTTPS endpoint"

How to capture incoming WeChat group messages and save them

I need some advice from WeChat experts.
How can I get all incoming messages from a specific WeChat group and send them to a server database?
Is there an available API, or do I need to use Xposed and hook WeChat's functions?
Please advise. Thanks.
I don't know the specifics of WeChat, but since it is an Android app you can always intercept its text and images.
Check if the following suits your needs: angel bitbucket.
It is basically an Xposed module that intercepts TextViews and ImageViews in apps. It then sends them from the third-party apps to a standalone stub (running in the Xposed app itself). This stub sends them to a Node.js server that shows them in "real-time" in the browser.
Note that by default it is set to intercept only a few apps (WhatsApp, Facebook...), and while I made it modular, you will still have to build it yourself.

Can an Alexa custom skill get access to the voice stream/audio file of a user?

I would like to have a custom skill, but it would need direct access to the user's voice (or the output as recorded audio). Can/will Alexa relay the stream rather than sending the request invocations (launch/intent/session-end)?
I understand custom skills can send back mp3s as responses, but being able to gain access to the actual voice requests, either the stream or an mp3, would be awesome.
Edit:
It seems that there is no mp3 provided in the request object: https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/alexa-skills-kit-interface-reference#LaunchRequest
Alexa does not provide this service.
Having an always-on device in a domestic setting that can hear everything said, plus background noise and side conversations, is a huge security concern. Amazon mitigates this concern by filtering the input, performing the difficult speech-to-text work, and providing only the resulting text (after further processing by your interaction model).
In short, no. I can't find this stated specifically anywhere in the documentation, but I just created a Python library that encapsulates all the JSON structures, so I know you can't do this yet.
The only control over audio is output, through embedding links in SSML.
https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/handling-requests-sent-by-alexa#Including%20Pre-Recorded%20Audio%20in%20your%20Response
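As an example of that output-only control, here is a minimal sketch of an Alexa response embedding a pre-recorded clip via SSML; the audio URL is a placeholder, and Alexa requires the clip to be an MP3 served over HTTPS (see the linked documentation for the exact format limits):

    # Minimal Alexa response that plays a hosted MP3 clip via SSML <audio>
    response = {
        "version": "1.0",
        "response": {
            "outputSpeech": {
                "type": "SSML",
                "ssml": (
                    "<speak>"
                    "Here is the clip you asked for. "
                    "<audio src='https://example.com/clips/greeting.mp3'/>"
                    "</speak>"
                ),
            },
            "shouldEndSession": True,
        },
    }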

Check SIM status by USSD query from site

I want to check any phone number's online status by querying from the server side, without any notice to the user. Is that even possible?
An operator can see the latest location updates etc. from a mobile device, but I am guessing this is not your use case.
To do it as a third-party service, you would need to either implement an instant-messaging-style availability service, or leverage or extend an existing open-source one like Tox (https://tox.chat), Telegram (https://telegram.org), Linphone (http://www.linphone.org), etc.
