Mechanism behind QR code scanning of WhatsApp web/desktop app - qr-code

I could not find any answers related to the working mechanism of QR code scanning used on WhatsApp Web.
How does the authentication happen when the phone (any smartphone running WhatsApp) scans the QR code in the browser?
I don't want to know about the technology stack behind it, e.g. that WhatsApp uses a modified version of XMPP, uses Erlang, or uses web technologies like socket.io and Ajax for the web version to implement such functionality.
The question might be broad, but I am eager to know about the implementation behind it.

It works like this:
1- You open the following URL in your browser: https://web.whatsapp.com/
2- The browser loads the page with all sorts of JS and CSS, but also opens a WebSocket (wss://w4.web.whatsapp.com/ws).
2.1- Every 20000 milliseconds you see traffic on the WebSocket for a refresh of the QR code you have on your screen. This is sent by the server to the browser through the WebSocket (WS from now on).
2.2- On each QR code refresh received on the WS, your browser does a GET request for the new QR code, Base64-encoded.
2.3- Notice that this specific WS that the server has open to the browser is associated with the unique QR code! So, knowing the QR code, the server knows which WS is associated with it.
At this stage your browser is ready to do the WhatsApp app work, but it does not know your ID (the WhatsApp identifier, which is your mobile number), because it can't get your phone number out of thin air.
It also does not ask you to type it, because the server couldn't be sure that the number really belongs to you.
So, to let the servers know that the WS session belongs to a specific phone, you need to use the phone to read the QR code.
3- You grab your phone, which is authenticated (otherwise you wouldn't have access to the section for scanning QR codes), and scan the QR code.
4- When your phone reads the QR code, it contacts the WhatsApp servers and tells them: my number is XXXX, my auth creds are YYYYY, and the WS associated with this QR code can now receive my data.
5- The server now knows that it can direct traffic to the specific WS that belongs to that QR code, and does so.
6- On the browser WS you can see the server sending data about the user, about the conversations you are having, and about which photo thumbnails to go and grab.
7- The browser gets this data from the WebSocket and makes the corresponding GET requests to fetch the thumbnails and other resources it needs, like an MP3 for notifications.
7.1- The WS listener in the browser also calls into the JavaScript files that were received at step 1 to redraw the page DOM with the new interface.
8- The interface is now redrawn to look like the WhatsApp app. You continue to receive data on the WS, send data when needed, and the interface is updated as data arrives on the WS.
That is it.
Using Chrome and its Developer Tools, you can see all this happening live. You can also see the WS communication (most of it; for the binary frames you would need another tool) and follow what is happening every step of the way.
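The QR refresh in steps 2.1 to 2.3 can be sketched as a small registry that binds the currently displayed token to one browser connection and expires it after 20 seconds. This is only an illustration under assumptions: QrRegistry, rotate, lookup and the connection id "ws-1" are made-up names, and nothing here reflects WhatsApp's actual server code.

```python
import secrets
import time

QR_TTL_SECONDS = 20  # mirrors the 20000 ms refresh interval described above


class QrRegistry:
    """Hypothetical registry: current QR token -> browser connection showing it."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._by_token = {}  # token -> (connection_id, expires_at)
        self._by_conn = {}   # connection_id -> current token

    def rotate(self, connection_id):
        """Issue a fresh token for a connection and invalidate its previous one."""
        old = self._by_conn.pop(connection_id, None)
        if old is not None:
            self._by_token.pop(old, None)
        token = secrets.token_urlsafe(16)
        self._by_token[token] = (connection_id, self._clock() + QR_TTL_SECONDS)
        self._by_conn[connection_id] = token
        return token

    def lookup(self, token):
        """Return the connection bound to a token, or None if unknown or expired."""
        entry = self._by_token.get(token)
        if entry is None:
            return None
        connection_id, expires_at = entry
        if self._clock() >= expires_at:
            return None
        return connection_id


registry = QrRegistry()
token = registry.rotate("ws-1")          # value the server would render as a QR code
assert registry.lookup(token) == "ws-1"  # knowing the token identifies the WS
```

Rotating the token means a photographed or leaked QR code stops being useful shortly after it was shown.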
Also:
Check a complete Tutorial on this : HERE
Source code for the Tutorial : Android Client
Source code for the Tutorial : Java Play Server

It uses something like below.
The WhatsApp web application is opened by the user in a web browser.
The server creates a UNIQUE token (number) and embeds that number in a QR code.
The WhatsApp phone application reads the QR code and decodes the token.
The WhatsApp phone application sends information about its current user and this newly read token to the WhatsApp server.
The WhatsApp server matches the token (+ phone app user information) with the web browser.
It automatically authenticates the user and opens a new web page with his/her information on it.
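The steps above can be sketched in a few lines of Python. This is a minimal in-memory model, not the real protocol: PairingServer, open_browser_session, claim and user_for are hypothetical names, and the sketch assumes the phone's own credentials were already verified before claim is called.

```python
import secrets


class PairingServer:
    """Hypothetical in-memory model of token-based QR login."""

    def __init__(self):
        self._pending = {}   # token -> browser session id
        self._sessions = {}  # browser session id -> user id, or None if not logged in

    def open_browser_session(self):
        """Browser opens the page: create a session and the token shown as a QR code."""
        session_id = secrets.token_hex(8)
        token = secrets.token_urlsafe(16)
        self._sessions[session_id] = None
        self._pending[token] = session_id
        return session_id, token

    def claim(self, token, user_id):
        """Phone scanned the QR code: bind its (already authenticated) user to the session."""
        session_id = self._pending.pop(token, None)  # each token works only once
        if session_id is None:
            return False
        self._sessions[session_id] = user_id
        return True

    def user_for(self, session_id):
        """Who, if anyone, is logged in on this browser session."""
        return self._sessions.get(session_id)


server = PairingServer()
session_id, token = server.open_browser_session()
server.claim(token, "user-123")  # sent by the phone app after decoding the QR code
assert server.user_for(session_id) == "user-123"
```

The token is removed on first use so that a second scan of the same code cannot attach another account to the session.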

There are two ways to implement QR login like WhatsApp:
Ajax polling
Websocket
I've made demos in PHP of both:
QR Login with Websocket
QR Login with Ajax polling
Note: the Websocket approach requires 2 ports, one for the main app and another for listening for websocket connections.
The HTTP server and the websocket server can also run on the same port, with a proxy or some other way.
I found an example in Node.js too:
QR login Websocket with nodejs
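To illustrate the polling variant, here is a rough client-side loop in Python (the linked demos are in PHP and Node.js, so this is not their code). The check_status callable is an assumed stand-in for an HTTP request that asks the server whether the QR token has been claimed yet; the interval and timeout values are arbitrary.

```python
import time


def poll_until_authenticated(check_status, interval=2.0, timeout=60.0,
                             sleep=time.sleep, clock=time.monotonic):
    """Call check_status() repeatedly until it returns a user or the timeout passes.

    check_status is a hypothetical callable returning the logged-in user,
    or None while the QR code has not been scanned yet.
    """
    deadline = clock() + timeout
    while True:
        user = check_status()
        if user is not None:
            return user
        if clock() + interval > deadline:
            return None  # give up; the page would show a fresh QR code here
        sleep(interval)


# Simulated server that reports success on the third poll.
answers = iter([None, None, "alice"])
print(poll_until_authenticated(lambda: next(answers), interval=0.01, timeout=1.0))
```

With a websocket the server pushes the result as soon as the phone scans the code; with polling the browser learns about it at the next request, so the interval trades latency against server load.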

Related

Blazor Server - How to handle the situation when the user is faster than SignalR?

I have a complex Blazor Server app, and I run into situations where the user interacts with the form and, while the server is still trying to manipulate the DOM via SignalR, the user manages to do other things in the form.
When I debug the app or use a PC on the LAN, it works fine because it is very fast.
But when I browse from a cell phone, it is noticeable.
One of those situations is reading a barcode using a barcode reader.
The flow is:
the user scans a barcode into a textbox
the barcode is validated
whether it is OK or bad, the user gets a message
the textbox is cleared
What happens is that the user is able to scan another barcode before any render has been applied to the DOM.
It seems to me that this has to be handled with some JS locally.
Any solution ideas would be appreciated.
Tx
Yaron

How to capture wechat group incoming message and save it

I need some advice from WeChat experts.
How can I get all incoming messages from a specific WeChat group and send them to a server DB?
Is there an available API, or do I need to use Xposed and hook WeChat's functions?
Please advise. Thanks.
I don't know the specifics of WeChat, but if it is an Android app you can always intercept its text and images.
Check if the following suits your needs: angel bitbucket.
It is basically an Xposed module that intercepts TextViews and ImageViews in apps. It then sends them from the 3rd-party apps to a standalone stub (running in the Xposed app itself). This stub sends them to a Node.js server that shows them in "real time" in the browser.
Note that by default it is set to intercept only a few apps (WhatsApp, Facebook...), and while I made it modular, you will still have to build it yourself.

Photo streaming uploading protocol

In this age of *chat applications and various messaging software, I was wondering if there is already an official protocol (RFC) that would follow the following basic flow:
1. Client connects to Server for a new session
2. Client uploads an image (or video) with metadata information (size, resolution, format) to Server
3. Server does some work (not part of the protocol)
4. Server replies with REJECT, then Client goes to 1.
5. Server replies with ACCEPT, then Client stops and gathers the result as part of the reply from Server
I have a proprietary solution now that covers the basics (supports basic formats), and since the devil is in the details, I wonder if some existing protocol would cover the stream format and more unhappy paths I may have missed with this simple design.
I'm not aware of any protocol that can handle file probing for you.
ffprobe is a good open source solution to do this but requires processing power and scale.
So this step must be done on the server side, after the upload. You cannot trust the client for such information.
I suggest the cloud approach. Here, we're using Amazon Cloud.
Upload your file to AWS S3. You can use multipart upload for faster uploads. No need to scale anything, AWS will do it for you.
Your clients just request a signed URL from your web server. The server returns the URL and an ID for this new asset. Your clients upload to AWS S3 using the URL.
Once the upload is done, your client makes a call to your server again to say: "I'm done with Asset ID blah". Your server knows the asset is now uploaded and can initiate transcoding, analysis, DB updates, etc.
We do this exact scenario in our project.
For transcoding at scale we use our own open source project: https://github.com/sportarchive/CloudTranscode
This is not an easy business, especially if you want to handle videos.
If you restrict yourself to pictures, then a lot can be done on the user side. You can create several versions of the image in JS, directly in the browser for example or in the mobile app, and upload them to your server. The load is much smaller and you may not need this decoupled architecture.
If you handle videos, you need a solid backend.
Hope this helps

Sending a RESTful url (endpoint) from Band

I just have a general question. Can you send a URL from a button on the Band? I have a home automation system in which you can trigger events by sending a RESTful URL (endpoint) to it. Basically, I can put the URL in any web browser and trigger the event. It would be great if this could be done through the Band. I don't really need a response from the URL, just to send it.
Does that make sense?
Thanks,
Scott
No, the Band communicates only via Bluetooth to (applications on) its paired device. On Windows (Phone), the application must be running, with a connection to the Band, and subscribed to the Tile button pressed event in order to receive such notifications. This generally rules out scenarios that require ad-hoc input from the Band unless you're willing to use voice commands via Cortana.
But I think it's possible by creating a custom tile and handling custom tile events. I haven't tried it in my project, but I can see it in the SDK documentation.
For Android you can implement a broadcast receiver and listen to tile events. Check: SDK doc, chap. 9, page 51
In short, yes it is possible.
However, the problem would be that the button would be single use to only send that ONE URL command and it actually wouldn't be done via the Band.
You can create custom layouts for your applications with the Microsoft Band SDK which will allow you to create a button. You'll then need to register to the click event from the Band which then would get fired on the device the app is running on. From there, you'd be able to send the URL but it would be sent from the Windows Phone or Windows PC rather than the Band so you'd need to be connected. The documentation covers how you can do this here: http://developer.microsoftband.com/Content/docs/Microsoft%20Band%20SDK.pdf
A downside to doing this with WinRT is that as soon as the app is closed and the connection to the Band is lost, your button click won't have any action. The best way to get around this is to create the connection to the Band in a background task, but unfortunately you can't keep hold of the connection to the Band for an infinite amount of time, and you'd have to live with the possibility that there may be times when it doesn't work. I have a GitHub sample which shows you how to connect to the Band in a background task for an indefinite amount of time.
The Microsoft Band has really been developed for the health aspect and collecting data rather than for interactions with other apps, which it does in some way support.

How can I launch an external application from the Browsers (IE, Firefox, Chrome, Safari) in windows

Is it possible to embed an external application inside the browser (IE, Chrome, Safari, Firefox) so that it looks like a native web application but actually has access to the USB ports of the client machine? I have heard that I need to make an ActiveX control. I would like to use the .NET Framework, but if that is not possible, maybe Java or C++ will be fine.
I have to make an application that will allow users to connect an external device to a USB port. This device will take a backup of the information contained in a SIM card and send it to the user's online agenda account, so the user can restore it later using the same application. This should be a web application, or at least look like one.
If the first is not possible, is there any way to launch an external application from all the browsers, and then pass information to the browser window to allow it to refresh after the backup has been made?
Thanks for your help in advance.
First off, this seems to be a big security issue, which is why you might be finding it tricky.
What I would do is look at it from a different angle: what am I trying to achieve? How is the user going to use the data? Where is the user going to use the data?
From your question I have answered those questions with the following; I hope I've not misinterpreted anything.
I want to copy the data from an external SIM card to a central location
I want the user to see this data from the central location, preferably from a web application.
The user is going to see and use the data from the web app
Assuming all of these things are true; one design option is the following:
1 - Have a client-based application which can read data from the USB device
2 - Have a secure web service which the client-based application can upload the data to
3 - Have a web application which can view this data and see refreshes
Let me go into a bit more detail for each step.
1 - If you write a small client application, it is installed or at least runs on the client computer. Because of this, it can access local client resources such as USB and interface with them. This means users can read the SIM data through this app, but also potentially save it locally as well as upload it. To access the web service they would enter their username/password so you could authenticate them for the upload.
2 - This web service would handle authentication from the client application, and also receive the data submitted by the client app. Accessing web services from .NET nowadays is really straightforward. Using this web service, the client application could also do some checking to make sure the data has been updated, and it could handle retries if the network dropped, etc.
3 - The web front end of the system would interface with the same data source. This site would take the username/password to authenticate users on the site, and also let them see the uploaded data. As for the refreshes: if the user is logged in and looking at the data, you could have a JavaScript timer polling an action/service to see when new records have been added, etc. This could then display a message through jQuery or similar to notify the user. This could be similar to the notifications which StackOverflow gives when you visit for the first time or get a new badge, etc.
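As an illustration of the retry handling mentioned in point 2, here is a small sketch, written in Python for brevity rather than .NET. upload_with_retries and the send callable are hypothetical names; send stands in for whatever call actually posts the SIM data to the web service, and it is assumed to raise ConnectionError when the network drops.

```python
import time


def upload_with_retries(send, payload, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send(payload) until it succeeds, doubling the wait after each failure.

    Re-raises the last ConnectionError once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return send(payload)
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))


# Simulated web service call that fails twice before succeeding.
calls = []

def flaky_send(data):
    calls.append(data)
    if len(calls) < 3:
        raise ConnectionError("network dropped")
    return "stored"

print(upload_with_retries(flaky_send, b"sim-backup", base_delay=0.01))
```

Only connection failures are retried here; an authentication error from the web service should surface immediately, since retrying it would not help.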
Hope this helps :-)
