How to parse the JSON data in flume httpSource - flume-ng

I want to parse the data arriving at a Flume HTTP source, extract some content, and put it into the channel. Can anyone help me write Java code to parse the data at the Flume HTTP source?

Interceptors are simple pluggable components that sit between a source and the channel(s) it writes to. Events received by sources can be transformed or dropped by interceptors before they are written to the corresponding channels.
The regex filtering interceptor can be used to filter events passing through it. The filtering is based on a regular expression (regex) supplied in the configuration. Each regex filtering interceptor converts the event’s body into a UTF-8 string and matches that string against the regex provided.
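The filtering step described above can be sketched in plain Java. This is only an illustration of the idea (decode the body as UTF-8, test it against a regex, keep or drop it); it does not use the Flume API, the class name RegexBodyFilter is hypothetical, and the choice of find() rather than a full match is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Stand-alone sketch of regex filtering on event bodies; not the Flume API. */
final class RegexBodyFilter {
    private final Pattern pattern;

    RegexBodyFilter(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    /** Decodes the body as UTF-8 and reports whether the regex matches anywhere in it. */
    boolean accepts(byte[] body) {
        String text = new String(body, StandardCharsets.UTF_8);
        // Assumption: "matches" means the pattern is found somewhere in the body.
        return pattern.matcher(text).find();
    }

    /** Keeps only the bodies that match the regex; all others are dropped. */
    List<byte[]> filter(List<byte[]> bodies) {
        List<byte[]> kept = new ArrayList<>();
        for (byte[] body : bodies) {
            if (accepts(body)) {
                kept.add(body);
            }
        }
        return kept;
    }
}
```

For example, a filter built with a regex that looks for a particular JSON field would pass only the events whose body contains that field.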

Related

Decode JSON RPC request to a contract

I am currently using a website to read some useful data. Using the browser's Inspect>Network I can see this data comes from JSON RPC requests to (https://bsc-dataseed1.defibit.io/), the publicly available BSC explorer API endpoint.
These requests have the following format:
Request params:
{"jsonrpc":"2.0","id":43,"method":"eth_call","params":[{"data":"...LONGBYTESTRING!!!","to":"0x1ee38d535d541c55c9dae27b12edf090c608e6fb"},"latest"]}
Response:
{"jsonrpc":"2.0","id":43,"result":"...OTHERVERYLONGBYTESTRING!!!"}
I know that the to field corresponds to the address of a smart contract 0x1ee38d535d541c55c9dae27b12edf090c608e6fb.
It looks like this request "queries" the contract for some data (but it costs 0 gas?).
From (the very little) I understand, the encoded data can be decoded with the schema, which I think I could get from the smart contract address. (perhaps this is it? https://api.bscscan.com/api?module=contract&action=getabi&address=0x1ee38d535d541c55c9dae27b12edf090c608e6fb)
My goal is to understand the data being sent in the request and the data given in the response so I can reproduce the data from the website without having to scrape this data from the website.
Thanks.
The zero cost is because of the eth_call method. It's a read-only method which doesn't record any state changes to the blockchain (and is mostly used for getter functions, marked as view or pure in Solidity).
The data field consists of:
0x
4 bytes (8 hex characters) function signature
And the rest is arguments passed to the function.
You can find an example that converts the function name to the signature in this other answer.
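The layout of the data field described above can be sketched as a small parser. The class name CallData is hypothetical, and splitting the arguments into 32-byte words (64 hex characters each) is an addition based on how ABI-encoded arguments are padded; it does not decode the values, which still requires the contract's ABI.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits an eth_call "data" string into its selector and argument words. */
final class CallData {
    final String selector;     // 4 bytes = 8 hex characters
    final List<String> words;  // argument area cut into 32-byte (64 hex character) chunks

    private CallData(String selector, List<String> words) {
        this.selector = selector;
        this.words = words;
    }

    static CallData parse(String data) {
        if (!data.startsWith("0x")) {
            throw new IllegalArgumentException("missing 0x prefix");
        }
        String hex = data.substring(2);
        if (hex.length() < 8) {
            throw new IllegalArgumentException("too short to contain a function signature");
        }
        String selector = hex.substring(0, 8);
        String args = hex.substring(8);
        List<String> words = new ArrayList<>();
        for (int i = 0; i < args.length(); i += 64) {
            // The last chunk may be shorter if the input is not a multiple of 32 bytes.
            words.add(args.substring(i, Math.min(i + 64, args.length())));
        }
        return new CallData(selector, words);
    }
}
```

Calling CallData.parse on the data value from the request gives the selector to look up in the ABI, plus the raw argument words to decode according to that function's parameter types.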

Using a substring of a return value in a subsequent request

I'm attempting to construct a series of Paw calls using the variables feature. I have one situation I'm unable to solve.
At authentication into the server I'm using, I get a JSON response, with one value that looks like this:
endpoint = "https://sub.something.com/thingone/thingtwo.php?token=sometoken&id=blahblah"
The endpoint portion "https://sub.something.com/" is then used as the base for subsequent calls, where a call might be "GET https://sub.something.com/data?id=123".
I don't want to hardcode the endpoint in Paw, as the endpoint will vary based on factors I can't predict at my end.
Is there a way to do basic string processing like this either in Paw, or by calling out to a shell script and using the return value of said script as a Paw variable?
That's doable using that RegExp Match dynamic value extension. Click on that previous link and hit Install Extension.
Type "Regexp" in the field you expect this value to be used. Pick Regexp Match from the completion results:
Then enter a regexp that matches your need, https?://[^/]+/? should be good:
I've put your example string in the screenshot above to show that it works, but you can instead put a "pointer" (Response Dynamic Value) to the response you want:
In the choices, pick Response Parsed Body if you want to parse JSON or XML from the response. If the string is simply in plain text in the response body, pick Response Raw Body.
Once these steps are completed, you've got a working "Pointer" + "Parser" to the response that extracts the part of the string you need. You can do the same operation with another regex for the token…
Tip: these dynamic value tokens can be selected like text and copy/pasted (Cmd+C/Cmd+V) :-)
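Outside Paw, the same regex can be checked with a few lines of Java. This is only a sketch to show what the pattern https?://[^/]+/? extracts from the example endpoint; the class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Extracts the scheme-and-host prefix of a URL with the regex from the answer. */
final class EndpointBase {
    private static final Pattern BASE = Pattern.compile("https?://[^/]+/?");

    /** Returns the first match of the pattern, or empty if the input has none. */
    static Optional<String> baseOf(String endpoint) {
        Matcher m = BASE.matcher(endpoint);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }
}
```

For the endpoint in the question, baseOf returns "https://sub.something.com/", which can then be used as the prefix for calls such as /data?id=123.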

JMeter: How to capture overall response data using Regular Expression Extractor

I want to re-use the Response Data received in the Listener, as shown in the image below.
I would like to know, how can I capture overall response so that I can re-use the same for uploading.
Scenario is:
Download 1KB of string data using TCP Sampler (Port: XYZW)
Upload the text response received (Port: ASDF)
As per How to Extract Data From Files With JMeter the relevant Regular Expression should be:
(?s)(^.*)
Entire configuration:
With an HTTP sampler, I add a BeanShell PostProcessor as a child of the sampler and use the script below to retrieve all of the response data. I think it works the same way with a TCP sampler, so give it a try:
// get all response data
String dashboardData = prev.getResponseDataAsString();
// do something with the data
// and then put the retrieved data into parameter to use later
vars.put("dataTobeUsed", dashboardData);
and we can use ${dataTobeUsed} for other samplers
If you want to get the response data via regular expression extractor, you can use the pattern ([^"]+)
Hope it's helpful!
Hope I understood your question right,
You can use the regular expression [a-z0-9]* with any reference name, let's say "TCP_Data", in your first TCP request.
Now you can use the same reference name in TCP request 2, by ${TCP_Data}.
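To see how much of a response each suggested pattern actually captures, the patterns can be compared on a sample string. This sketch uses java.util.regex as a stand-in, which is an assumption: JMeter's extractor may use a different regex engine, although these simple patterns should behave alike. The helper name and the sample text are made up for illustration, and [a-z0-9]* is wrapped in parentheses so that it has a capture group.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Shows what group 1 captures for a given pattern and input. */
final class CapturePatterns {
    /** Returns group 1 of the first match, or null when the pattern does not match. */
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }
}
```

With the two-line sample `line one "quoted"` followed by `line two`, the pattern (?s)(^.*) captures the whole text including the line break, ([^"]+) stops just before the first double quote, and ([a-z0-9]*) stops at the first character that is not a lowercase letter or digit. Only the first pattern is suitable when the entire response has to be re-sent.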

Servlet stripping parameter values because of # character

My URL is http://175.24.2.166/download?a=TOP#0;ONE=1;TWO2.
How should I encode the parameter so that when I print the parameter in the Servlet, I get the value in its entirety? Currently when I print the value by using request.getParameter("a") I get the output as TOP instead of TOP#0;ONE=1;TWO2.
You should encode it like this: http://175.24.2.166/download?a=TOP%230%3BONE%3D1%3BTWO2. There are many encoders available in Java; you can try URLEncoder, or an online encoder for experiments.
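A minimal sketch of the URLEncoder approach follows. The class and method names are hypothetical; URLEncoder.encode(String, String) is the standard JDK call, and it is meant for encoding a single parameter value, not a whole URL.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Encodes one query parameter value before it is appended to a URL. */
final class QueryParamEncoder {
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Builds the download URL from the question with an encoded "a" parameter. */
    static String downloadUrl(String a) {
        return "http://175.24.2.166/download?a=" + encode(a);
    }
}
```

Calling downloadUrl("TOP#0;ONE=1;TWO2") produces the encoded URL shown above, and request.getParameter("a") on the servlet side then returns the decoded original value.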
This is known as the "fragment identifier".
As mentioned in the wiki:
The fragment identifier introduced by a hash mark # is the optional last part of a URL for a document. It is typically used to identify a portion of that document.
The part after the # is information for the client. Put everything your client needs there.
You need to encode your query string.
You can use the encodeURIComponent() function in JavaScript, which encodes a URI component, including special characters.
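The effect of the fragment identifier can be observed by parsing both forms of the URL with java.net.URI. This is a small sketch with hypothetical class and method names; it only shows how the URL is split, not what a particular servlet container does.

```java
import java.net.URI;

/** Shows how a URL is split into query and fragment around the # character. */
final class FragmentDemo {
    /** Returns the decoded query component, or null if there is none. */
    static String queryOf(String url) {
        return URI.create(url).getQuery();
    }

    /** Returns the decoded fragment component, or null if there is none. */
    static String fragmentOf(String url) {
        return URI.create(url).getFragment();
    }
}
```

For the unencoded URL the query is only a=TOP and everything after the # ends up in the fragment, which is handled by the client rather than passed to the server as a parameter. For the encoded URL there is no fragment and the decoded query is a=TOP#0;ONE=1;TWO2.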

Designing proper REST URIs

I have a Java component which scans through a set of folders (input/processing/output) and returns the list of files in JSON format.
The REST URL for the same is:
GET http://<baseurl>/files/<foldername>
Now, I need to perform certain actions on each of the files, like validate, process, delete, etc. I'm not sure of the best way to design the REST URLs for these actions.
Since it's a direct file manipulation, I don't have any unique identifier for the files, except their paths. So I'm not sure if the following is a good URL:
POST http://<baseurl>/file/validate?path=<filepath>
Edit: I would have ideally liked to use something like /file/fileId/validate. But the only unique id for files is its path, and I don't think I can use that as part of the URL itself.
And finally, I'm not sure which HTTP verb to use for such custom actions like validate.
Thanks in advance!
Regards,
Anand
When you implement a route like http://<baseurl>/file/validate?path=<filepath>, you encode the action in your resource, which is not a desired effect when modelling a resource service.
You could do the following for read operations
GET http://api.example.com/files will return all files as URL reference such as
http://api.example.com/files/path/to/first
http://api.example.com/files/path/to/second
...
GET http://api.example.com/files/path/to/first will return validation results for the file (I'm using JSON for readability)
{
  "name": "first",
  "valid": true
}
That was the simple read only part. Now to the write operations:
DELETE http://api.example.com/files/path/to/first will of course delete the file
Modelling the file processing is the hard part. But you could model that as top level resource. So that:
POST http://api.example.com/FileOperation?operation=somethingweird will create a virtual file processing resource and execute the operation given by the URL parameter 'operation'. Modelling these file operations as resources lets you perform the operations asynchronously and return a result that gives additional information about the progress of the operation, and so on.
You can take a look at the Amazon S3 REST API for additional examples and inspiration on how to model resources. I can highly recommend reading RESTful Web Services.
Now, I need to perform certain actions on each of the files, like validate, process, delete, etc. I'm not sure of the best way to design the REST URLs for these actions. Since it's a direct file manipulation, I don't have any unique identifier for the files, except their paths. So I'm not sure if the following is a good URL: POST http://<baseurl>/file/validate?path=<filepath>
It's not. /file/validate doesn't describe a resource, it describes an action. That means it is functional, not RESTful.
Edit: I would have ideally liked to use something like /file/fileId/validate. But the only unique id for files is its path, and I don't think I can use that as part of the URL itself.
Oh yes you can! And you should do exactly that. Except for that final validate part; that is not a resource in any way, and so should not be part of the path. Instead, clients should POST a message to the file resource asking it to validate itself. Luckily, POST allows you to send a message to the file as well as receive one back; it's ideal for this sort of thing (unless there's an existing verb to use instead, whether in standard HTTP or one of the extensions such as WebDAV).
And finally, I'm not sure which HTTP verb to use for such custom actions like validate.
POST, with the action to perform determined by the content of the message that was POSTed to the resource. Custom “do something non-standard” actions are always mapped to POST when they can't be mapped to GET, PUT or DELETE. (Alas, a clever POST is not hugely discoverable and so causes problems for the HATEOAS principle, but that's still better than violating basic REST principles.)
REST requires a uniform interface, which in HTTP means limiting yourself to GET, PUT, POST, DELETE, HEAD, etc.
One way you can check on each file's validity in a RESTful way is to think of the validity check not as an action to perform on the file, but as a resource in its own right:
GET /file/{file-id}/validity
This could return a simple True/False, or perhaps a list of the specific constraint violations. The file-id could be a file name, an integer file number, a URL-encoded path, or perhaps an unencoded path like:
GET /file/bob/dir1/dir2/somefile/validity
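One way a server could recover the file id from such an unencoded path is to strip the fixed prefix and suffix. This is a minimal sketch under that assumption; the class name ValidityRoute is hypothetical and a real service would normally rely on its web framework's routing instead.

```java
import java.util.Optional;

/** Extracts the file id from a request path of the form /file/{file-id}/validity. */
final class ValidityRoute {
    private static final String PREFIX = "/file/";
    private static final String SUFFIX = "/validity";

    /** Returns the file id (which may itself contain slashes), or empty if the path does not fit. */
    static Optional<String> fileIdOf(String requestPath) {
        if (!requestPath.startsWith(PREFIX) || !requestPath.endsWith(SUFFIX)) {
            return Optional.empty();
        }
        int start = PREFIX.length();
        int end = requestPath.length() - SUFFIX.length();
        if (end <= start) {
            return Optional.empty(); // nothing between prefix and suffix
        }
        return Optional.of(requestPath.substring(start, end));
    }
}
```

For the example path above, fileIdOf yields bob/dir1/dir2/somefile. A file that is itself named validity would make this scheme ambiguous, which is one reason to consider a URL-encoded path instead.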
Another approach would be to ask for a list of the invalid files:
GET /file/invalid
And still another would be to prevent invalid files from being added to your service in the first place, ie, when your service processes a PUT request with bad data:
PUT /file/{file-id}
it rejects it with an HTTP 400 (Bad Request). The body of the 400 response could contain information on the specific error.
Update: To delete a file you would of course use the standard HTTP REST verb:
DELETE /file/{file-id}
To 'process' a file, does this create a new file (resource) from one that was uploaded? For example Flickr creates several different image files from each one you upload, each with a different size. In this case you could PUT an input file and then trigger the processing by GET-ing the corresponding output file:
PUT /file/input/{file-id}
GET /file/output/{file-id}
If the processing isn't near-instantaneous, you could generate the output files asynchronously: every time a new input file is PUT into the web service, the web service starts up an asynchronous activity that eventually results in the output file being created.
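The asynchronous variant can be sketched with CompletableFuture. Everything here is illustrative: the class name, the in-memory map, and the upper-casing that stands in for real processing are all assumptions, and an actual service would also map the empty result to a suitable HTTP status.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory sketch: a PUT of an input starts processing, a GET of the output returns it once ready. */
final class AsyncFileService {
    private final Map<String, CompletableFuture<String>> outputs = new ConcurrentHashMap<>();

    /** PUT /file/input/{file-id}: accept the input and start processing in the background. */
    CompletableFuture<String> putInput(String fileId, String content) {
        CompletableFuture<String> job = CompletableFuture.supplyAsync(() -> process(content));
        outputs.put(fileId, job);
        return job;
    }

    /** GET /file/output/{file-id}: the output if processing has finished, otherwise empty. */
    Optional<String> getOutput(String fileId) {
        CompletableFuture<String> job = outputs.get(fileId);
        if (job == null || !job.isDone() || job.isCompletedExceptionally()) {
            return Optional.empty();
        }
        return Optional.of(job.join());
    }

    /** Placeholder for the real processing step. */
    private static String process(String content) {
        return content.toUpperCase(Locale.ROOT);
    }
}
```

A client would PUT the input, then poll the output resource until it stops coming back empty.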

Resources