Haskell Https Get Proxy Request - http

I have a working getRequest via a proxy:
main = do
  rsp <- browse $ do
    setProxy . fromJust $ parseProxy "128.199.232.117:3128"
    request $ getRequest "https://www.youtube.com/watch?v=yj_wyw6Xrq4"
  print $ rspBody <$> rsp
But the URL is HTTPS, so I get an exception. I found out here that HTTPS requests can also work:
import Network.Connection (TLSSettings (..))
import Network.HTTP.Conduit

main :: IO ()
main = do
  request <- parseUrl "https://github.com/"
  let settings = mkManagerSettings (TLSSettingsSimple True False False) Nothing
  manager <- newManager settings
  res <- httpLbs request manager
  print res
But I have no idea how to integrate this into my proxy getRequest code.
Could someone show me, please? Thanks

It looks like you are using the HTTP package in the first snippet and http-conduit in the second one.
Unfortunately, HTTP doesn't support HTTPS, so you can't "integrate" the second snippet into the first one. But http-conduit supports proxies, so you can use the addProxy function to set the proxy host and port (not tested):
{-# LANGUAGE OverloadedStrings #-}
...
request <- do
  req <- parseUrl "https://github.com/"
  return $ addProxy "128.199.232.117" 3128 req
...
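For completeness, here is a minimal sketch of what a whole program along these lines might look like. It has not been run against the proxy above, and it assumes a recent http-conduit in which parseRequest and tlsManagerSettings are available. viaProxy is a hypothetical helper name, not part of the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Network.HTTP.Conduit
import qualified Data.ByteString.Lazy.Char8 as L

-- Hypothetical helper: route a request through the proxy from the question.
viaProxy :: Request -> Request
viaProxy = addProxy "128.199.232.117" 3128

main :: IO ()
main = do
  -- tlsManagerSettings gives a manager that can talk HTTPS.
  manager <- newManager tlsManagerSettings
  req <- viaProxy <$> parseRequest "https://github.com/"
  res <- httpLbs req manager
  -- Print only the first 200 bytes of the body.
  L.putStrLn (L.take 200 (responseBody res))
```

Keeping the proxy in a small Request -> Request function makes it easy to apply the same proxy to several requests.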

Related

How to get the client IP address using HTTP.jl

I'm trying to get both the client request and the IP address from HTTP requests to my HTTP.jl server (based on the basic server example in the docs).
using HTTP
using Sockets

const APP = HTTP.Router()

# My request handler function can see the request's method
# and target but not the IP address it came from
HTTP.@register(APP, "GET", "/", req::HTTP.Request -> begin
    println("$(req.method) request to $(req.target)")
    "Hello, world!"
end)

HTTP.serve(
    APP,
    Sockets.localhost,
    8081;
    # My tcpisvalid function can see the client's
    # IP address but not the HTTP request
    tcpisvalid=sock::Sockets.TCPSocket -> begin
        host, port = Sockets.getpeername(sock)
        println("Request from $host:$port")
        true
    end
)
My best guess would be that there's a way to parse the TCPSocket.buffer into an HTTP request but I can't find any methods to do it.
Can you suggest a way to get an HTTP.Request from a TCPSocket or a different way to approach this problem?
Thanks in advance!
The router (APP) is a (collection of) "request handler(s)", which can only access the HTTP.Request; you cannot get the stream from it. Instead, you can define a "stream handler", which is passed the stream. From the stream you can get the client's IP address using Sockets.getpeername (requires HTTP.jl version 0.9.7 when called on an HTTP.Stream as in the examples below).
using HTTP, Sockets

const APP = HTTP.Router()

function request_handler(req::HTTP.Request)
    println("$(req.method) request to $(req.target)")
    return "Hello, world!"
end
HTTP.@register APP "GET" "/" request_handler

function stream_handler(http::HTTP.Stream)
    host, port = Sockets.getpeername(http)
    println("Request from $host:$port")
    return HTTP.handle(APP, http) # regular handling
end

# HTTP.serve with stream=true to specify that stream_handler is a function
# that expects a HTTP.Stream as input (and not a HTTP.Request)
HTTP.serve(stream_handler, Sockets.localhost, 8081; stream=true) # <-- Note stream=true

# or HTTP.listen
HTTP.listen(stream_handler, Sockets.localhost, 8081)

How to reload code when HTTP server is running?

When starting an http server using HTTP.serve there is apparently no way to reload the code that is actually handling the HTTP request.
In the example below I would like to have the modifications in my_httphandler taken into account without having to restart the server.
For the moment I need to stop the server from the REPL by pressing CTRL+C twice and then run the script again.
Is there a workaround?
module MyModule

using HTTP
using Mux
using JSON
using Sockets

function my_httphandler(req::HTTP.Request)
    return HTTP.Response(200, "Hello world")
end

const MY_ROUTER = HTTP.Router()
HTTP.@register(MY_ROUTER, "GET", "/*", my_httphandler)
HTTP.serve(MY_ROUTER, Sockets.localhost, 8081)

end
I'm not sure whether Mux caches handlers. As long as it does not, this should work:
module MyModule

using HTTP
using Mux
using JSON
using Sockets

function my_httphandler(req::HTTP.Request)
    return HTTP.Response(200, "Hello world")
end

const functionref = Any[my_httphandler]
const MY_ROUTER = HTTP.Router()
HTTP.@register(MY_ROUTER, "GET", "/*", functionref[1])
HTTP.serve(MY_ROUTER, Sockets.localhost, 8081)

end

function newhandler(req::HTTP.Request)
    return HTTP.Response(200, "Hello world 2")
end

MyModule.functionref[1] = newhandler
Revise.jl lets you automatically update code in a live Julia session. You may be especially interested in entr; see Revise's documentation for details.
When using HTTP.jl: just add @async before HTTP.serve
module MyModule

using HTTP
using Sockets

function my_httphandler(req::HTTP.Request)
    return HTTP.Response(200, "Hello world")
end

const MY_ROUTER = HTTP.Router()
HTTP.@register(MY_ROUTER, "GET", "/*", my_httphandler)
@async HTTP.serve(MY_ROUTER, Sockets.localhost, 8081)

end # module
When using Mux.jl: nothing to do, the server is started in the background
using Mux

function sayhellotome(name)
    return("hello " * name * "!!!")
end

@app test = (
    Mux.defaults,
    route("/sayhello/:user", req -> begin
        sayhellotome(req[:params][:user])
    end),
    Mux.notfound())

Mux.serve(test, 8082)
I've added a ticket #587 to the HTTP.jl project for developer workflow support. I'm not sure whether this is your use case.
# hello.jl -- an example showing how Revise.jl works with HTTP.jl
# julia> using Revise; includet("hello.jl"); serve();
using HTTP
using Sockets

homepage(req::HTTP.Request) =
    HTTP.Response(200, "<html><body>Hello World!</body></html>")

const ROUTER = HTTP.Router()
HTTP.@register(ROUTER, "GET", "/", homepage)

serve() = HTTP.listen(request -> begin
    Revise.revise()
    Base.invokelatest(HTTP.handle, ROUTER, request)
end, Sockets.localhost, 8080, verbose=true)
Alternatively, you could have a test/serve.jl file that assumes MyModule has a top-level HTTP.jl router called ROUTER. You'll need to remove the call to serve in your main module.
#!/usr/bin/env julia
using HTTP
using Sockets
using Revise
using MyModule: ROUTER

HTTP.listen(request -> begin
    Revise.revise()
    Base.invokelatest(HTTP.handle, ROUTER, request)
end, Sockets.localhost, 8080, verbose=true)
A more robust solution would catch errors; however, I had challenges getting this to work and reported my experience at #541 in Revise.jl.

simpleHttp causing 'unsupported browser' response?

I'm making a simpleHttp request to an HTTPS domain, yet the response HTML shows 'unsupported browser' messages. I believe this is because simpleHttp does not support HTTPS.
My function:
import Network.HTTP.Simple

makeRequest :: IO LAZ.ByteString
makeRequest = do
  response <- simpleHttp "https://www.example.com"
  return (response)
Which Haskell libraries support HTTPS?
Wreq provides a very easy to follow tutorial on http/s requests using basic lens syntax.
An HTTPS-compatible request is as simple as:
main = do
  r <- get "https://www.example.com"
Response statuses and bodies can be accessed respectively:
r ^. responseStatus . statusCode
r ^. responseBody
This code doesn't compile. Even adding in the LAZ import, the Network.HTTP.Simple module does not provide the simpleHttp function. You can do this with httpLBS:
{-# LANGUAGE OverloadedStrings #-}
import Network.HTTP.Simple
import qualified Data.ByteString.Lazy as LAZ

makeRequest :: IO LAZ.ByteString
makeRequest = do
  response <- httpLBS "https://www.example.com"
  return (getResponseBody response)

main :: IO ()
main = makeRequest >>= LAZ.putStr
Or by using the simpleHttp function from Network.HTTP.Conduit:
{-# LANGUAGE OverloadedStrings #-}
import Network.HTTP.Conduit
import qualified Data.ByteString.Lazy as LAZ

makeRequest :: IO LAZ.ByteString
makeRequest = simpleHttp "https://www.example.com"

main :: IO ()
main = makeRequest >>= LAZ.putStr
Note that wreq uses the same HTTP engine under the surface as http-conduit (http-client). My guess is that you were originally trying to use one of the functions from http-client itself, but I'm not sure what that code would have looked like.

Increasing request timeout for Network.HTTP.Conduit

I use the http-conduit library version 2.0+ to fetch the contents from a HTTP webservice:
import Network.HTTP.Conduit
main = do
  content <- simpleHttp "http://stackoverflow.com"
  print $ content
As stated in the docs, the default timeout is 5 seconds.
Note: This question was answered by me immediately and therefore intentionally does not show further research effort.
Similar to this previous question you can't do that with simpleHttp alone. You need to use a Manager together with httpLbs in order to be able to set the timeout.
Note that you don't need to set the timeout in the manager but you can set it for each request individually.
Here is a full example that behaves like your function above, but allows you to modify the timeout:
import Network.HTTP.Conduit
import Control.Monad (liftM)
import qualified Data.ByteString.Lazy.Char8 as LB

-- | A simpleHttp alternative that allows to specify the timeout
-- | Note that the timeout parameter is in microseconds!
downloadHttpTimeout :: Manager -> String -> Int -> IO LB.ByteString
downloadHttpTimeout manager url timeout = do
  req <- parseUrl url
  let req' = req {responseTimeout = Just timeout}
  liftM responseBody $ httpLbs req' manager

main = do
  manager <- newManager conduitManagerSettings
  let timeout = 15000000 -- Microseconds --> 15 secs
  content <- downloadHttpTimeout manager "http://stackoverflow.com" timeout
  print $ content
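In newer releases of http-conduit (roughly 2.2 and later, built on http-client 0.5+), responseTimeout is no longer a Maybe Int, and the timeout is set with responseTimeoutMicro instead. Here is a minimal sketch under that assumption. secondsToMicros and downloadWithTimeout are hypothetical helper names, not library functions.

```haskell
import Network.HTTP.Conduit
import qualified Data.ByteString.Lazy.Char8 as LB

-- Hypothetical helper: convert whole seconds to the microseconds
-- that responseTimeoutMicro expects.
secondsToMicros :: Int -> Int
secondsToMicros s = s * 1000000

-- Hypothetical helper: fetch a URL with a per-request timeout in seconds.
downloadWithTimeout :: Manager -> String -> Int -> IO LB.ByteString
downloadWithTimeout manager url seconds = do
  req <- parseRequest url
  let req' = req { responseTimeout = responseTimeoutMicro (secondsToMicros seconds) }
  responseBody <$> httpLbs req' manager

main :: IO ()
main = do
  manager <- newManager tlsManagerSettings
  content <- downloadWithTimeout manager "http://stackoverflow.com" 15
  -- Print only the first 200 bytes of the body.
  LB.putStrLn (LB.take 200 content)
```

Taking the timeout in seconds and converting in one place avoids scattering large microsecond literals through the calling code.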
I've found the following to be a version of Uli's downloadHttpTimeout that resembles simpleHTTP more closely:
simpleHTTPWithTimeout :: Int -> Request -> IO (Response LB.ByteString)
simpleHTTPWithTimeout timeout req = do
  mgr <- newManager tlsManagerSettings
  -- Set the timeout (in microseconds) on a copy of the request.
  let req' = req { responseTimeout = Just timeout }
  httpLbs req' mgr
The only difference from simpleHTTP is a slightly different return type, so to extract e.g. the response body, one uses conduit's responseBody, not Network.HTTP.getResponseBody.

Haskell http response result unreadable

import Network.URI
import Network.HTTP
import Network.Browser

get :: URI -> IO String
get uri = do
  let req = Request uri GET [] ""
  resp <- browse $ do
    setAllowRedirects True -- handle HTTP redirects
    request req
  return $ rspBody $ snd resp

main = do
  case parseURI "http://cn.bing.com/search?q=hello" of
    Nothing -> putStrLn "Invalid search"
    Just uri -> do
      body <- get uri
      writeFile "output.txt" body
Here is the diff between haskell output and curl output
It's probably not a good idea to use String as the intermediate data type here, as it will cause character conversions both when reading the HTTP response and when writing to the file. This can cause corruption if these conversions are not consistent, as appears to be the case here.
Since you just want to copy the bytes directly, it's better to use a ByteString. I've chosen to use a lazy ByteString here, so that it does not have to be loaded into memory all at once, but can be streamed lazily into the file, just like with String.
import Network.URI
import Network.HTTP
import Network.Browser
import qualified Data.ByteString.Lazy as L

get :: URI -> IO L.ByteString
get uri = do
  let req = Request uri GET [] L.empty
  resp <- browse $ do
    setAllowRedirects True -- handle HTTP redirects
    request req
  return $ rspBody $ snd resp

main = do
  case parseURI "http://cn.bing.com/search?q=hello" of
    Nothing -> putStrLn "Invalid search"
    Just uri -> do
      body <- get uri
      L.writeFile "output.txt" body
Fortunately, the functions in Network.Browser are overloaded so that the change to lazy bytestrings only involves changing the request body to L.empty, replacing writeFile with L.writeFile, as well as changing the type signature of the function.
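The kind of mismatch described in this answer can be sketched without any network access. The example below is only an illustration, not a trace of what the HTTP package does internally. It assumes the response bytes are read one byte per Char and the resulting String is then written out as UTF-8.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- UTF-8 bytes of the single character U+4F60, as a server might send them.
original :: B.ByteString
original = B.pack [0xE4, 0xBD, 0xA0]

-- Assumption: reading into a String maps each byte to one Char.
asString :: String
asString = BC.unpack original

-- Assumption: writing the String encodes each of those Chars as UTF-8 again.
reencoded :: B.ByteString
reencoded = TE.encodeUtf8 (T.pack asString)

main :: IO ()
main = do
  print (B.unpack original)   -- [228,189,160]
  print (B.unpack reencoded)  -- [195,164,194,189,194,160]
```

Three bytes go in and six come out, which is why the file written through String no longer matches what curl saves. Keeping the body as a ByteString skips both conversions.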

Resources