In golang, how to determine the final URL after a series of redirects?

So, I'm using the net/http package. I'm GETting a URL that I know for certain is redirecting. It may even redirect a couple of times before landing on the final URL. Redirection is handled automatically behind the scenes.
Is there an easy way to figure out what the final URL was without a hackish workaround that involves setting the CheckRedirect field on an http.Client object?
I guess I should mention that I think I came up with a workaround, but it's kind of hackish, as it involves using a global variable and setting the CheckRedirect field on a custom http.Client.
There's got to be a cleaner way to do it. I'm hoping for something like this:
package main
import (
"fmt"
"log"
"net/http"
)
func main() {
// Try to GET some URL that redirects. Could be 5 or 6 unseen redirections here.
resp, err := http.Get("http://some-server.com/a/url/that/redirects.html")
if err != nil {
log.Fatalf("http.Get => %v", err.Error())
}
// Find out what URL we ended up at
finalURL := magicFunctionThatTellsMeTheFinalURL(resp)
fmt.Printf("The URL you ended up at is: %v", finalURL)
}

package main
import (
"fmt"
"log"
"net/http"
)
func main() {
resp, err := http.Get("http://stackoverflow.com/q/16784419/727643")
if err != nil {
log.Fatalf("http.Get => %v", err.Error())
}
// Your magic function. The Request in the Response is the last URL the
// client tried to access.
finalURL := resp.Request.URL.String()
fmt.Printf("The URL you ended up at is: %v\n", finalURL)
}
Output:
The URL you ended up at is: http://stackoverflow.com/questions/16784419/in-golang-how-to-determine-the-final-url-after-a-series-of-redirects
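
To try this without depending on an external site, here is a minimal sketch that runs a local httptest server with a two-step redirect chain. The routes /a, /b and /c and the helper names newRedirectServer and finalPath are made up for illustration; only resp.Request.URL comes from the answer above.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// newRedirectServer starts a local test server (hypothetical routes) where
// /a redirects to /b, /b redirects to /c, and /c answers 200 OK.
func newRedirectServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/a", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/b", http.StatusFound)
	})
	mux.HandleFunc("/b", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/c", http.StatusMovedPermanently)
	})
	mux.HandleFunc("/c", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "done")
	})
	return httptest.NewServer(mux)
}

// finalPath GETs startURL, lets the default client follow the redirects,
// and returns the path of the last request the client made.
func finalPath(startURL string) (string, error) {
	resp, err := http.Get(startURL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return resp.Request.URL.Path, nil
}

func main() {
	srv := newRedirectServer()
	defer srv.Close()

	p, err := finalPath(srv.URL + "/a")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("final path:", p)
}
```

No CheckRedirect hook or global variable is needed here, because the client replaces the request on each hop and the response keeps a pointer to the last one.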

I would add a note that the http.Head method should be enough to retrieve the final URL. In theory it should be faster than http.Get, because the server is expected to send back only the headers:
resp, err := http.Head("http://stackoverflow.com/q/16784419/727643")
...
finalURL := resp.Request.URL.String()
...
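
One caveat: some servers reject HEAD requests. Below is a hedged sketch of a helper that tries HEAD first and falls back to GET when it sees 405 Method Not Allowed. The helper name resolveFinalURL, the local test server and its /old and /new routes are assumptions for illustration, not part of the answer above.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// newGetOnlyServer starts a local test server (hypothetical routes) whose
// /old route refuses anything but GET, and redirects GET requests to /new.
func newGetOnlyServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/old", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		http.Redirect(w, r, "/new", http.StatusFound)
	})
	mux.HandleFunc("/new", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	return httptest.NewServer(mux)
}

// resolveFinalURL tries a cheap HEAD request first. If the server answers
// 405 Method Not Allowed, it repeats the request with GET.
func resolveFinalURL(rawURL string) (string, error) {
	resp, err := http.Head(rawURL)
	if err != nil {
		return "", err
	}
	resp.Body.Close()

	if resp.StatusCode == http.StatusMethodNotAllowed {
		resp, err = http.Get(rawURL)
		if err != nil {
			return "", err
		}
		resp.Body.Close()
	}
	return resp.Request.URL.String(), nil
}

func main() {
	srv := newGetOnlyServer()
	defer srv.Close()

	final, err := resolveFinalURL(srv.URL + "/old")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("final URL:", final)
}
```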

Related

Why is there a 60 second delay on my HTTP POST request when using a Go HTTP client?

My goal is to scrape a website that requires me to log in first using HTTP requests in Golang. I actually succeeded by finding out I can send a post request to the website writing form-data into the body of the request. When I test this through an API development software I use called Postman, the response is instantaneous with no delays. However, when performing the request with an HTTP client in Go, there is a consistent 60 second delay every single time. I end up getting a logged in page, but for my program I need the response to be nearly instantaneous.
As you can see in my code, I've tried adding a bunch of headers to the request like "Connection", "Content-Type", "User-Agent" since I thought maaaaaybe the website can tell I'm requesting from a program and is forcing me to wait 60 seconds for a response. Adding these headers to make my request more legitimate(?) doesn't work at all.
Is the delay coming from Go's HTTP client being slow, or is there something wrong with how I'm forming my HTTP POST request? Also, was I on to something with my headers, and is the HTTP client rewriting them when it sends them out?
Here's my simple program...
package main
import (
"bytes"
"fmt"
"mime/multipart"
"net/http"
"net/http/cookiejar"
"os"
)
func main() {
url := "https://easypronunciation.com/en/log-in"
method := "POST"
payload := &bytes.Buffer{}
writer := multipart.NewWriter(payload)
_ = writer.WriteField("email", "foo@bar.com")
_ = writer.WriteField("password", "*********")
_ = writer.WriteField("persistent_login", "on")
_ = writer.WriteField("submit", "")
err := writer.Close()
if err != nil {
fmt.Println(err)
}
cookieJar, _ := cookiejar.New(nil)
client := &http.Client{
Jar: cookieJar,
}
req, err := http.NewRequest(method, url, payload)
if err != nil {
fmt.Println(err)
}
req.Header.Set("Content-Type", writer.FormDataContentType())
req.Header.Set("Connection", "Keep-Alive")
req.Header.Set("Accept-Language", "en-US")
req.Header.Set("User-Agent", "Mozilla/5.0")
res, err := client.Do(req)
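if err != nil {
fmt.Println(err)
return // res is nil on error, so stop before touching res.Body
}
defer res.Body.Close()
f, err := os.Create("response.html")
if err != nil {
fmt.Println(err)
return
}
defer f.Close()
res.Write(f)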
}
I, too, doubt that this is the Go client library. I would suggest printing out the latency of each component to see if and where the 60-second delay occurs. I would also try the same request against different URLs.
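
One way to print per-phase latencies is the standard net/http/httptrace package. The following is only a sketch: it times a GET against a local, deliberately slow test server rather than the real login POST, and the names timings, timedGet, newSlowServer and demoDelay are made up for illustration. The same trace can be attached to a POST request.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
	"time"
)

// demoDelay is how long the hypothetical test server waits before answering.
const demoDelay = 50 * time.Millisecond

// timings records how long each phase of one request took.
type timings struct {
	Connect   time.Duration // TCP connect only
	FirstByte time.Duration // from start until the first response byte
	Total     time.Duration // from start until the body was fully read
}

// timedGet performs a GET and measures it with an httptrace.ClientTrace.
// ClientTrace also offers DNS and TLS hooks, left out here for brevity.
func timedGet(target string) (timings, error) {
	var t timings
	var connStart time.Time
	start := time.Now()

	trace := &httptrace.ClientTrace{
		ConnectStart: func(network, addr string) { connStart = time.Now() },
		ConnectDone: func(network, addr string, err error) {
			t.Connect = time.Since(connStart)
		},
		GotFirstResponseByte: func() { t.FirstByte = time.Since(start) },
	}

	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return t, err
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return t, err
	}
	defer resp.Body.Close()
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return t, err
	}
	t.Total = time.Since(start)
	return t, nil
}

// newSlowServer starts a local server that sleeps before it responds.
func newSlowServer(delay time.Duration) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(delay)
		fmt.Fprintln(w, "ok")
	}))
}

func main() {
	srv := newSlowServer(demoDelay)
	defer srv.Close()

	t, err := timedGet(srv.URL)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("connect=%v firstByte=%v total=%v\n", t.Connect, t.FirstByte, t.Total)
}
```

If nearly all of the 60 seconds shows up between the end of the connect phase and the first response byte, the wait is on the server side rather than in the client.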

Handling custom 404 pages with http.FileServer

I'm currently using a basic http.FileServer setup to serve a simple static site. I need to handle 404 errors with a custom not found page. I've been looking into this issue quite a bit, and I cannot determine what the best solution is.
I've seen several responses on GitHub issues along the lines of:
You can implement your own ResponseWriter which writes a custom message after WriteHeader.
It seems like this is the best approach but I'm a bit unsure of how this would actually be implemented. If there are any simple examples of this implementation, it'd be greatly appreciated!
I think this can be solved with your own middleware. You can try to open the file first and if it doesn't exist, call your own 404 handler. Otherwise just dispatch the call to the static file server in the standard library.
Here is how that could look:
package main
import (
"fmt"
"net/http"
"os"
"path"
)
func notFound(w http.ResponseWriter, r *http.Request) {
// Here you can send your custom 404 back.
w.WriteHeader(http.StatusNotFound) // without this the status would be 200 OK
fmt.Fprint(w, "404")
}
func customNotFound(fs http.FileSystem) http.Handler {
fileServer := http.FileServer(fs)
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
f, err := fs.Open(path.Clean(r.URL.Path)) // Do not allow path traversals.
if os.IsNotExist(err) {
notFound(w, r)
return
}
if err == nil {
f.Close() // Only existence was checked; release the file handle.
}
fileServer.ServeHTTP(w, r)
})
}
func main() {
http.ListenAndServe(":8080", customNotFound(http.Dir("/path/to/files")))
}
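
For comparison, the ResponseWriter approach quoted in the question could look roughly like the sketch below. It wraps the writer, and when the file server calls WriteHeader with 404 it sends a custom page and discards the file server's own error text. The names notFoundWriter, withCustom404, newSite, fetch and customPage are hypothetical, and the in-memory file system from testing/fstest only stands in for a real directory.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"testing/fstest"
)

// customPage is the hypothetical body sent instead of the default 404 text.
const customPage = "<h1>Custom 404</h1>\n"

// notFoundWriter wraps a ResponseWriter and replaces the body of any
// 404 response with customPage.
type notFoundWriter struct {
	http.ResponseWriter
	intercepted bool
}

func (w *notFoundWriter) WriteHeader(status int) {
	if status == http.StatusNotFound {
		w.intercepted = true
		w.Header().Set("Content-Type", "text/html; charset=utf-8")
		w.ResponseWriter.WriteHeader(status)
		fmt.Fprint(w.ResponseWriter, customPage)
		return
	}
	w.ResponseWriter.WriteHeader(status)
}

func (w *notFoundWriter) Write(p []byte) (int, error) {
	if w.intercepted {
		// Pretend the write succeeded but drop the default error text.
		return len(p), nil
	}
	return w.ResponseWriter.Write(p)
}

// withCustom404 is middleware that hands the wrapped writer to next.
func withCustom404(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		next.ServeHTTP(&notFoundWriter{ResponseWriter: w}, r)
	})
}

// newSite serves a one-file in-memory file system through the middleware.
func newSite() http.Handler {
	files := fstest.MapFS{"hello.txt": {Data: []byte("hi")}}
	return withCustom404(http.FileServer(http.FS(files)))
}

// fetch runs one GET through the handler and returns status and body.
func fetch(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	for _, p := range []string{"/hello.txt", "/missing.html"} {
		code, body := fetch(newSite(), p)
		fmt.Printf("%s -> %d %q\n", p, code, body)
	}
}
```

Compared with opening the file first, this avoids a second file system lookup per request, at the cost of a slightly more involved wrapper.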

Unable to read post form value in Golang with httprouter

I'm new to Golang and trying to get a basic http app running using the httprouter API. I've hit a wall with reading posted form data, despite following the advice given in another StackOverflow question.
Here's my code (minus irrelevancies):
import (
"fmt"
"net/http"
"github.com/julienschmidt/httprouter"
)
func main() {
r := httprouter.New()
r.POST("/sub", func(w http.ResponseWriter, r *http.Request, _ httprouter.Params) {
r.Header.Set("content-type", "text/html")
err := r.ParseForm()
if err != nil {
fmt.Fprintf(w, "<h1>Error: %s</h1>\n", err)
}
fmt.Fprintf(w, "<h1>Submitted message!</h1>\n<p>-%s-</p>\n", r.PostFormValue("msg"))
})
http.ListenAndServe("localhost:3000", r)
}
In the output, where I should see -hello-, I just see --. When I inspect the http request in Firefox, in the Form Data panel, I see msg:"hello", so why is r.PostFormValue("msg") returning a blank string?
Thanks to Volker for pointing out an error. When I commented out the line r.Header.Set("content-type", "text/html"), the problem was resolved. Perhaps that was the issue, or perhaps there was some issue with the IDE (LiteIDE) caching an old version of the code. In any case, I can now read the posted value.
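
A likely explanation for why that line mattered: ParseForm reads a POST body as form data only when the request's Content-Type is application/x-www-form-urlencoded, so overwriting the request header with text/html leaves the form empty. The content type of the reply belongs on the ResponseWriter instead. The sketch below uses plain net/http handlers rather than httprouter to stay self-contained; the names submit, brokenSubmit and post are made up for illustration.

```go
package main

import (
	"fmt"
	"html"
	"net/http"
	"net/http/httptest"
	"strings"
)

// brokenSubmit reproduces the question: it overwrites the REQUEST header,
// so ParseForm no longer treats the body as a URL-encoded form.
func brokenSubmit(w http.ResponseWriter, r *http.Request) {
	r.Header.Set("Content-Type", "text/html")
	r.ParseForm()
	fmt.Fprintf(w, "<p>-%s-</p>\n", html.EscapeString(r.PostFormValue("msg")))
}

// submit sets the content type on the RESPONSE and leaves the request alone.
func submit(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/html")
	if err := r.ParseForm(); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	fmt.Fprintf(w, "<p>-%s-</p>\n", html.EscapeString(r.PostFormValue("msg")))
}

// post sends msg=hello to the handler the way a browser form would.
func post(h http.HandlerFunc) string {
	req := httptest.NewRequest(http.MethodPost, "/sub", strings.NewReader("msg=hello"))
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	rec := httptest.NewRecorder()
	h(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Print("broken: ", post(brokenSubmit))
	fmt.Print("fixed:  ", post(submit))
}
```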

How to process GET operation (CRUD) in go lang via Postman?

I want to perform a GET operation. I am passing a name as a resource in the URL.
The URL I am hitting in Postman is localhost:8080/location/{titan rolex} (I chose the GET method in the dropdown list).
When I hit the URL in Postman, the GetUser function is executed. Its body is:
func GetUser(rw http.ResponseWriter, req *http.Request) {
}
Now I wish to get the resource value i.e 'titan rolex' in the GetUser method.
How can I achieve this in golang?
In main(), I have this :
http.HandleFunc("/location/{titan rolex}", GetUser)
Thanks in advance.
What you are doing is binding the complete path /location/{titan rolex} to be handled by GetUser.
What you really want is to bind /location/<every possible string> to be handled by one handler (e.g. LocationHandler).
You can do that with either the standard library or another router. I will present both ways:
Standard lib:
import (
"fmt"
"net/http"
"log"
)
func locationHandler(w http.ResponseWriter, r *http.Request) {
name := r.URL.Path[len("/location/"):]
fmt.Fprintf(w, "Location: %s\n", name)
}
func main() {
http.HandleFunc("/location/", locationHandler)
log.Fatal(http.ListenAndServe(":8080", nil))
}
Note however, more complex paths (such as /location/<every possible string>/<some int>/<another string>) will be tedious to implement this way.
The other way is to use github.com/julienschmidt/httprouter, especially if you encounter these situations more often (and have more complex paths).
Here's an example for your use case:
import (
"fmt"
"github.com/julienschmidt/httprouter"
"net/http"
"log"
)
func LocationHandler(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
fmt.Fprintf(w, "Location: %s\n", ps.ByName("loc"))
}
func main() {
router := httprouter.New()
router.GET("/location/:loc", LocationHandler)
log.Fatal(http.ListenAndServe(":8080", router))
}
Note that httprouter uses a slightly different signature for handlers. This is because, as you can see, it passes the route parameters to the handler as a third argument.
Oh, and another note: you can just hit http://localhost:8080/location/titan rolex with your browser (or something else). If that something else is decent enough, it will URL-encode that to http://localhost:8080/location/titan%20rolex.
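
If the client is a Go program rather than a browser, the escaping has to be done by hand. Here is a small sketch, assuming the standard-library handler from above mounted on a local test server; the helper names newLocationServer and lookup are made up for illustration. url.PathEscape turns the space into %20 on the way out, and r.URL.Path arrives at the handler already decoded.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// locationHandler echoes whatever follows /location/ in the request path.
func locationHandler(w http.ResponseWriter, r *http.Request) {
	name := strings.TrimPrefix(r.URL.Path, "/location/")
	fmt.Fprintf(w, "Location: %s\n", name)
}

// newLocationServer starts a local test server with the handler mounted.
func newLocationServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/location/", locationHandler)
	return httptest.NewServer(mux)
}

// lookup escapes name for use as a path segment and returns the reply body.
func lookup(baseURL, name string) (string, error) {
	resp, err := http.Get(baseURL + "/location/" + url.PathEscape(name))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	srv := newLocationServer()
	defer srv.Close()

	reply, err := lookup(srv.URL, "titan rolex")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(reply)
}
```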

Unexpected EOF using Go http client

I am learning Go and came across this problem.
I am just downloading web page content using HTTP client:
package main
import (
"fmt"
"io/ioutil"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://mail.ru/", nil)
if err != nil {
log.Fatal(err)
}
req.Close = true
response, err := client.Do(req)
if err != nil {
log.Fatal(err)
}
defer response.Body.Close()
content, err := ioutil.ReadAll(response.Body)
if err != nil {
fmt.Println(err)
}
fmt.Println(string(content)[:100])
}
I get an unexpected EOF error when reading the response body. At the same time, the content variable holds the full page content.
This error appears only when I download the content of https://mail.ru/. With other URLs everything works fine, without any errors.
I used curl to download this page's content, and everything works as expected.
I am a bit confused: what's happening here?
Go v1.2, tried on Ubuntu and MacOS X
It looks like that server (Apache 1.3, wow!) is serving up a truncated gzip response. If you explicitly request the identity encoding (preventing the Go transport from adding gzip itself), you won't get the ErrUnexpectedEOF:
req.Header.Add("Accept-Encoding", "identity")
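
The effect can be reproduced locally. In the sketch below, a hypothetical test server imitates the behavior described: it cuts the last bytes off its gzip responses but serves uncompressed responses intact. The names newTruncatingServer and fetch are made up for illustration, and the real server's truncation point is not known.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// newTruncatingServer serves page in full to clients that do not accept
// gzip, and a gzip stream with a cut-off trailer to clients that do.
func newTruncatingServer(page string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
			io.WriteString(w, page)
			return
		}
		var buf bytes.Buffer
		zw := gzip.NewWriter(&buf)
		zw.Write([]byte(page))
		zw.Close()
		w.Header().Set("Content-Encoding", "gzip")
		w.Write(buf.Bytes()[:buf.Len()-4]) // drop the last 4 trailer bytes
	}))
}

// fetch downloads target. With identity set, it asks for an uncompressed
// response, which also stops the transport from requesting gzip itself.
func fetch(target string, identity bool) (string, error) {
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return "", err
	}
	if identity {
		req.Header.Set("Accept-Encoding", "identity")
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	srv := newTruncatingServer("hello from a flaky server")
	defer srv.Close()

	_, err := fetch(srv.URL, false)
	fmt.Println("default transport:", err)

	body, err := fetch(srv.URL, true)
	fmt.Printf("identity: %q, err=%v\n", body, err)
}
```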
