I want to build a web crawler. Currently I am reading a text file with 12,000 URLs, and I want to process them concurrently, but the requests don't work.
typealias escHandler = ( URLResponse?, Data? ) -> Void
func getRequest(url : URL, _ handler : @escaping escHandler){
let session = URLSession(
configuration: .default,
delegate: nil,
delegateQueue: nil)
var request = URLRequest(url:url)
request.httpMethod = "GET"
let task = session.dataTask(with: request){ (data,response,error) in
handler(response,data)
}
task.resume()
}
for sUrl in textFile.components(separatedBy: "\n"){
let url = URL(string: sUrl)!
getRequest(url: url){ response,data in
print("RESPONSE REACHED")
}
}
If you have your URLSessions working correctly, all you need to do is create a separate OperationQueue, create an Operation for each of the async tasks you want completed, add it to your operation queue, and set your OperationQueue's maxConcurrentOperationCount to control how many of your tasks can run at one time. Pseudocode:
let operationQueue = OperationQueue()
operationQueue.qualityOfService = .utility
let exOperation = BlockOperation(block: {
//Your URLSessions go here.
})
exOperation.completionBlock = {
// A completionBlock if needed
}
operationQueue.addOperation(exOperation)
// Do not also call exOperation.start(): the queue starts the operation itself.
Using an OperationQueue subclass and an Operation subclass will give you additional utilities for dealing with multiple threads.
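To make the effect of maxConcurrentOperationCount concrete, here is a minimal sketch. The ConcurrencyProbe class and the runLimited function are hypothetical names made up for this example, and Thread.sleep stands in for real blocking work. Note that a block operation counts as finished as soon as its block returns, so the limit shown here applies to blocking work; a request that is merely started inside the block and completes later would not be throttled this way.

```swift
import Foundation

// Hypothetical helper that records how many jobs ran and the highest number
// that were ever running at the same moment.
final class ConcurrencyProbe: @unchecked Sendable {
    private let lock = NSLock()
    private var running = 0
    private(set) var peak = 0
    private(set) var completed = 0

    func jobStarted() {
        lock.lock(); defer { lock.unlock() }
        running += 1
        peak = max(peak, running)
    }

    func jobFinished() {
        lock.lock(); defer { lock.unlock() }
        running -= 1
        completed += 1
    }
}

// Runs `jobCount` blocking jobs, never more than `limit` at once.
func runLimited(jobCount: Int, limit: Int) -> ConcurrencyProbe {
    let probe = ConcurrencyProbe()
    let queue = OperationQueue()
    queue.qualityOfService = .utility
    queue.maxConcurrentOperationCount = limit

    for _ in 0..<jobCount {
        queue.addOperation {
            probe.jobStarted()
            Thread.sleep(forTimeInterval: 0.01)   // stand-in for blocking work
            probe.jobFinished()
        }
    }
    // Blocking here is fine in a command line tool; avoid it on an app's main thread.
    queue.waitUntilAllOperationsAreFinished()
    return probe
}

let probe = runLimited(jobCount: 20, limit: 4)
print("completed: \(probe.completed), peak concurrency: \(probe.peak)")
```

The peak value never exceeds the limit passed in, which is the property you want when crawling thousands of URLs without opening thousands of connections at once.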
I am working on a simple Lambda function and I was wondering if I could pass a client (DynamoDB in this case) to the handler, so that we do not reconnect on every request.
The macro is defined here:
https://docs.rs/lambda_http/0.1.1/lambda_http/macro.lambda.html
My function so far:
fn main() -> Result<(), Box<dyn Error>> {
simple_logger::init_with_level(log::Level::Debug)?;
info!("Starting up...");
let dynamodb_client = DynamoDbClient::new(Region::EuCentral1);
lambda!(router);
return Ok(());
}
fn router(req: Request, ctx: Context) -> Result<impl IntoResponse, HandlerError> {
let h_req = HReq {
http_path: req.uri().path(),
http_method: req.method(),
};
match h_req {
HReq {
http_path: "/login",
http_method: &Method::POST,
} => user_login(req, ctx),
_ => {
error!(
"Not supported http method or path {}, {}",
h_req.http_path, h_req.http_method
);
let mut resp = Response::default();
*resp.status_mut() = StatusCode::METHOD_NOT_ALLOWED;
Ok(resp)
}
}
}
Is it possible to extend this macro to take a second option, so that I can pass the client all the way down to the functions that actually talk to the database?
DynamoDB is a web service, each request to it is treated as a distinct API call.
There is no functionality to keep a client connection alive in the same way you would with a regular database connection (e.g. MySQL).
My Rust knowledge is a little lacking, so I don't know whether HTTP keep-alive is enabled by default in DynamoDbClient, but making sure HTTP keep-alive is enabled will help performance.
After considering all the options I decided to implement this with lazy_static.
#[macro_use]
extern crate lazy_static;
lazy_static! {
static ref DYNAMODB_CLIENT: DynamoDbClient = DynamoDbClient::new(Region::EuCentral1);
}
This is instantiated at run time, on first use, and can be used within the module without any problems.
I come from a C# background and would like to implement awaiting functionality in my Swift app. I've achieved my desired results, but I had to use a semaphore, which I'm not sure is good practice. I have a function with an Alamofire request that returns JSON with a success value, and as I understand it, that request function is asynchronous with a completion handler. The handler fires once the request is complete. The problem is returning the success value from that operation. Here's a pseudocode example of what I'm doing:
func AlamoTest() -> Bool{
var success = false
//Do some things...
//...
//Signal from async code
let semaphore = DispatchSemaphore(value: 0)
Alamofire.request("blah blah blah", method: .post, parameters: parameters, encoding: URLEncoding.default).responseJSON { response in
success = response["success"]
if(success){
//Do some more things
}
semaphore.signal() //Signal async code is done
}
//Wait until async code done to get result
semaphore.wait(timeout: DispatchTime.distantFuture)
return success
}
Is there a "better" way of achieving my goal? I'm new to Swift and its async constructs.
The best solution I found is what I call "callback chaining". An example of my method looks like this:
func postJSON(json: NSDictionary, function: ServerFunction, completionHandler: ((_ jsonResponse: NSDictionary) -> Void)? = nil) {
//Create json payload from dictionary object
guard let payload = serializeJSON(json: json) else {
print("Error creating json from json parameter")
return
}
//Send request
Alamofire.request(urlTemplate(function.rawValue), method: .post, parameters: payload, encoding: URLEncoding.default).validate().responseJSON { response in
//Check response from server
switch response.result {
case .success(let data):
let jsonResponse = data as! NSDictionary
print("\(jsonResponse)")
//Execute callback post request handler
completionHandler?(jsonResponse)
case .failure(let error):
print("Request failed: \(error)")
}
}
}
The last parameter is a closure that executes once the original async operation is complete. You pass the result to the closure and do what you want with it. In my case I wanted to disable the view while the async operations were running. You can re-enable the view in your closure argument, since Alamofire calls the response handler on the main thread by default. completionHandler defaults to nil, so if you don't need the result you can omit it, which ends the chain.
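Here is a minimal sketch of the same chaining idea without Alamofire, so it can be run on its own. The functions login, loadProfile and loginThenLoadProfile are hypothetical stand-ins made up for this example; each one answers on a background queue the way a real request would. The DispatchGroup at the end is only there to keep a command line script alive until the final callback fires; in an app you would not block like this.

```swift
import Foundation

// Hypothetical stand-ins for two network calls; each answers later on a
// background queue, as a real request would.
func login(user: String, completionHandler: ((_ success: Bool) -> Void)? = nil) {
    DispatchQueue.global().async {
        completionHandler?(!user.isEmpty)   // does nothing if no handler was passed
    }
}

func loadProfile(user: String, completionHandler: @escaping (_ profile: String) -> Void) {
    DispatchQueue.global().async {
        completionHandler("profile of \(user)")
    }
}

// Chaining: the second request starts inside the first request's callback,
// and the caller gets exactly one final callback.
func loginThenLoadProfile(user: String, completionHandler: @escaping (_ message: String) -> Void) {
    login(user: user) { success in
        guard success else {
            completionHandler("login failed")
            return
        }
        loadProfile(user: user, completionHandler: completionHandler)
    }
}

// The group only keeps this command line sketch alive until the callback fires.
let finished = DispatchGroup()
finished.enter()
loginThenLoadProfile(user: "alice") { message in
    print(message)   // prints "profile of alice"
    finished.leave()
}
finished.wait()
```

The function never returns a value; everything that depends on the result lives inside the closure, which is why no semaphore is needed in the calling code of a real app.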
You can use this framework for Swift coroutines - https://github.com/belozierov/SwiftCoroutine
func AlamoTest() throws -> Bool {
try Coroutine.await() { callback in
Alamofire.request("blah blah blah", method: .post, parameters: parameters, encoding: .default).responseJSON { response in
let success = response["success"]
callback(success)
}
}
}
and then call this method inside a coroutine:
DispatchQueue.main.startCoroutine {
let result = try AlamoTest()
}
My objective is to:
1. start a GUI effect,
2. await some async work without freezing the GUI,
3. do a final GUI effect.
I've prepared a first demo using a view model with the following member:
member this.RunSetStatus() =
async {
this.Status <- "!Start resetting #" + DateTime.Now.ToString "yy.MM.dd hh:mm:ss"
let! task = async {
do! Async.Sleep (10 * 1000)
return "!Reset done #" + DateTime.Now.ToString "yy.MM.dd hh:mm:ss"
}
this.Status <- task
} |> Async.StartImmediate
It behaves as expected so I'm happy with the above.
The issue arises when I replace the Sleep in the demo with real blocking work, such as a WCF consumer retrieving some results.
member this.CheckReport(user : string) =
async {
let endpoint = new ServiceEndpoint(ContractDescription.GetContract(typeof<IClaimService>),
new BasicHttpBinding(),
new EndpointAddress(address))
let factory = new ChannelFactory<IClaimService>(endpoint)
let channel = factory.CreateChannel()
let resp = channel.CheckReport(user)
factory.Close()
return resp
}
called from my final delegate command
let RefreshLogic() =
this.RefreshIsActive <- true
async {
let cons = ConsumerLib.ConsumerWCF()
let! task, msg = async {
try
let! resp = cons.CheckReport(Environment.UserName.ToLower())
return resp , ""
with
|exc -> return [||], (ConsumerLib.FindInner(exc).Message + ConsumerLib.FindInner(exc).StackTrace)
}
this.Reports <- task
this.RefreshIsActive <- false
this.StatusMsg <- msg
this.ExportCommand.RaiseCanExecuteChanged()
} |> Async.StartImmediate
Unfortunately, it freezes the GUI while refreshing. Why?
The problem is your CheckReport function. While it's an async block, it never actually calls any asynchronous work (i.e., nothing is bound via let! or do!), so the entire block runs synchronously.
Even though the work is inside an asynchronous workflow, when you use StartImmediate, the work runs synchronously up to the first actual asynchronous function call, which would be bound by let! or do!. Since your work is completely synchronous, this propagates upwards and the whole workflow ends up running synchronously, blocking the UI.
If your WCF bindings were set up to include asynchronous, Task-returning versions, the best approach here would be to use the asynchronous version of the WCF method, which would look something like:
let! resp = channel.CheckReportAsync(user) |> Async.AwaitTask
I'm having problems executing HTTPS requests: even when the request has no error, I never get the message from the response handler. This is a command line tool application, and I have a plist that allows HTTP requests. I always see the operation's completion block run.
typealias escHandler = ( URLResponse?, Data? ) -> Void
func getRequest(url : URL, _ handler : @escaping escHandler){
let session = URLSession.shared
var request = URLRequest(url:url)
request.cachePolicy = .reloadIgnoringLocalCacheData
request.httpMethod = "GET"
let task = session.dataTask(with: request){ (data,response,error) in
handler(response,data)
}
task.resume()
}
func startOp(action : @escaping () -> Void) -> BlockOperation{
let exOp = BlockOperation(block: action)
exOp.completionBlock = {
print("Finished")
}
return exOp
}
for sUrl in textFile.components(separatedBy: "\n"){
let url = URL(string: sUrl)!
let queu = startOp {
getRequest(url: url){ response, data in
print("REACHED")
}
}
operationQueue.addOperation(queu)
}
operationQueue.waitUntilAllOperationsAreFinished()
One problem is that your operation is merely starting the request, but because the request is performed asynchronously, the operation is immediately completing, not actually waiting for the request to finish. You don't want to complete the operation until the asynchronous request is done.
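The early completion can be reproduced without any networking. In this rough sketch, EventLog and demonstrateEarlyFinish are hypothetical names made up for the example, and the block handed to DispatchQueue.global() stands in for a request that completes later. The two semaphores exist only to make the order of events deterministic: the fake request is held back until after the queue has reported that it is finished, which shows that the queue never waited for it.

```swift
import Foundation

// Hypothetical thread-safe log used only to record the order of events.
final class EventLog: @unchecked Sendable {
    private let lock = NSLock()
    private var events: [String] = []

    func add(_ event: String) {
        lock.lock(); defer { lock.unlock() }
        events.append(event)
    }

    var all: [String] {
        lock.lock(); defer { lock.unlock() }
        return events
    }
}

func demonstrateEarlyFinish() -> [String] {
    let log = EventLog()
    let gate = DispatchSemaphore(value: 0)          // holds the fake request back
    let requestDone = DispatchSemaphore(value: 0)   // signals that it finally ran
    let queue = OperationQueue()

    queue.addOperation {
        // Stand-in for task.resume(): hands work to another queue and returns.
        DispatchQueue.global().async {
            gate.wait()
            log.add("request finished")
            requestDone.signal()
        }
    }

    // Returns even though the fake request cannot have finished yet.
    queue.waitUntilAllOperationsAreFinished()
    log.add("queue finished")

    gate.signal()        // only now let the fake request complete
    requestDone.wait()
    return log.all
}

print(demonstrateEarlyFinish())   // prints ["queue finished", "request finished"]
```

If the operation really waited for the asynchronous work, this sketch would deadlock instead of printing anything, because the fake request is only released after the queue reports completion.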
If you want to do this with operation queues, the trick is that you must subclass Operation and perform the necessary KVO for isExecuting and isFinished. You then change isExecuting when you start the request and isFinished when you finish the request, with the associated KVO for both. This is all outlined in the Concurrency Programming Guide: Defining a Custom Operation Object, notably in the Configuring Operations for Concurrent Execution section. (Note: this guide is a little outdated (it refers to the isConcurrent property, which has been replaced by isAsynchronous; it focuses on Objective-C; etc.), but it introduces you to the issues.)
Anyway, this is an abstract class that I use to encapsulate all of this asynchronous operation silliness:
/// Asynchronous Operation base class
///
/// This class performs all of the necessary KVN of `isFinished` and
/// `isExecuting` for a concurrent `NSOperation` subclass. So, to develop
/// a concurrent NSOperation subclass, you instead subclass this class which:
///
/// - must override `main()` with the tasks that initiate the asynchronous task;
///
/// - must call `completeOperation()` function when the asynchronous task is done;
///
/// - optionally, periodically check `isCancelled` status, performing any clean-up
/// necessary and then ensuring that `completeOperation()` is called; or
/// override `cancel` method, calling `super.cancel()` and then cleaning-up
/// and ensuring `completeOperation()` is called.
public class AsynchronousOperation : Operation {
override public var isAsynchronous: Bool { return true }
private let lock = NSLock()
private var _executing: Bool = false
override private(set) public var isExecuting: Bool {
get {
return lock.synchronize { _executing }
}
set {
willChangeValue(forKey: "isExecuting")
lock.synchronize { _executing = newValue }
didChangeValue(forKey: "isExecuting")
}
}
private var _finished: Bool = false
override private(set) public var isFinished: Bool {
get {
return lock.synchronize { _finished }
}
set {
willChangeValue(forKey: "isFinished")
lock.synchronize { _finished = newValue }
didChangeValue(forKey: "isFinished")
}
}
/// Complete the operation
///
/// This will result in the appropriate KVN of isFinished and isExecuting
public func completeOperation() {
if isExecuting {
isExecuting = false
isFinished = true
}
}
override public func start() {
if isCancelled {
isFinished = true
return
}
isExecuting = true
main()
}
}
And I use this Apple extension to NSLocking to make sure I synchronize the state changes in the above (theirs was an extension called withCriticalSection on NSLock, but this is a slightly more generalized rendition, working on anything that conforms to NSLocking and handles closures that throw errors):
extension NSLocking {
/// Perform closure within lock.
///
/// An extension to `NSLocking` to simplify executing critical code.
///
/// - parameter block: The closure to be performed.
func synchronize<T>(block: () throws -> T) rethrows -> T {
lock()
defer { unlock() }
return try block()
}
}
Then, I can create a NetworkOperation which uses that:
class NetworkOperation: AsynchronousOperation {
var task: URLSessionTask!
init(session: URLSession, url: URL, requestCompletionHandler: @escaping (Data?, URLResponse?, Error?) -> ()) {
super.init()
task = session.dataTask(with: url) { data, response, error in
requestCompletionHandler(data, response, error)
self.completeOperation()
}
}
override func main() {
task.resume()
}
override func cancel() {
task.cancel()
super.cancel()
}
}
Anyway, having done that, I can now create operations for network requests, e.g.:
let queue = OperationQueue()
queue.name = "com.domain.app.network"
let url = URL(string: "http://...")!
let operation = NetworkOperation(session: .shared, url: url) { data, response, error in
guard let data = data, error == nil else {
print("\(error)")
return
}
let string = String(data: data, encoding: .utf8)
print("\(string)")
// do something with `data` here
}
let operation2 = BlockOperation {
print("done")
}
operation2.addDependency(operation)
queue.addOperations([operation, operation2], waitUntilFinished: false) // in a command line app you might use `true` for `waitUntilFinished`, but in standard Cocoa apps you generally would not
Note, in the above example, I added a second operation that just printed something, making it dependent on the first operation, to illustrate that the first operation isn't completed until the network request is done.
Obviously, you would generally never use the waitUntilAllOperationsAreFinished of your original example, nor the waitUntilFinished option of addOperations in my example. But because you're dealing with a command line app that you don't want to exit until these requests are done, this pattern is acceptable. (I only mention this for the sake of future readers who are surprised by the free-wheeling use of waitUntilFinished, which is generally inadvisable.)
I'm creating an app that needs to make parallel HTTP requests, and I'm using HttpClient for this.
I loop over the URLs, and for each URL I start a new Task to make the request.
After the loop, I wait until every task finishes.
However, when I check the calls with Fiddler, I see that the requests are made sequentially: not a batch of requests at once, but one by one.
I've searched for a solution and found that other people have experienced this too, but not with UWP. Their solution was to increase DefaultConnectionLimit on ServicePointManager.
The problem is that ServicePointManager does not exist for UWP. I've looked through the APIs and thought I could set DefaultConnectionLimit on HttpClientHandler, but no.
So I have a few questions:
Is DefaultConnectionLimit still a property that can be set somewhere?
If so, where do I set it?
If not, how do I increase the connection limit?
Is there still a connection limit in UWP?
This is my code:
var requests = new List<Task>();
var client = GetHttpClient();
foreach (var show in shows)
{
requests.Add(Task.Factory.StartNew((x) =>
{
((Show)x).NextEpisode = GetEpisodeAsync(((Show)x).NextEpisodeUri, client).Result;
}, show));
}
await Task.WhenAll(requests.ToArray());
and this is the request:
public async Task<Episode> GetEpisodeAsync(string nextEpisodeUri, HttpClient client)
{
try
{
if (String.IsNullOrWhiteSpace(nextEpisodeUri)) return null;
HttpResponseMessage content = await client.GetAsync(nextEpisodeUri);
if (content.IsSuccessStatusCode)
{
return JsonConvert.DeserializeObject<EpisodeWrapper>(await content.Content.ReadAsStringAsync()).Episode;
}
}
catch (Exception ex)
{
Debug.WriteLine(ex.Message);
}
return null;
}
OK, I have the solution. I do need to use async/await inside the task. The problem was that I was using StartNew instead of Run, but I have to use StartNew because I'm passing along a state object.
With StartNew, the task inside the task is not awaited unless you call Unwrap, so: Task.Factory.StartNew(.....).Unwrap(). This way Task.WhenAll() will wait until the inner task is complete.
When you use Task.Run(), you don't have to do this, because it unwraps the inner task for you.
Task.Run vs Task.StartNew
The stackoverflow answer
var requests = new List<Task>();
var client = GetHttpClient();
foreach (var show in shows)
{
requests.Add(Task.Factory.StartNew(async (x) =>
{
((Show)x).NextEpisode = await GetEpisodeAsync(((Show)x).NextEpisodeUri, client);
}, show)
.Unwrap());
}
await Task.WhenAll(requests.ToArray());
I think an easier way to solve this is not to start the requests "manually", but instead to use LINQ with an async delegate to query the episodes and then set them afterwards.
You basically make it a two-step process:
1. Get all next episodes.
2. Set them in the foreach loop.
This also has the benefit of decoupling your querying code from the side effect of setting the show's next episode.
var shows = Enumerable.Range(0, 10).Select(x => new Show());
var client = new HttpClient();
(Show, Episode)[] nextEpisodes = await Task.WhenAll(shows
.Select(async show =>
(show, await GetEpisodeAsync(show.NextEpisodeUri, client))));
foreach ((Show Show, Episode Episode) tuple in nextEpisodes)
{
tuple.Show.NextEpisode = tuple.Episode;
}
Note that I am using the tuple syntax introduced in C# 7. Change to the older Tuple types if it is not available.