Firestore snapshots simply stop working in Go - firebase

I have a Go app running on a GCloud Compute Engine instance with Ubuntu 20.04 and the latest version of Go available for 20.04.
The app relies heavily on a Firestore snapshot listener. After a random period of time (anywhere from hours to a week), the listener just stops working: no errors, nothing. It works flawlessly until then. I can even update Firestore myself and it still doesn't see the change. Restarting the app fixes it.
I've reported this possible bug to Firebase but haven't heard back yet. Does anyone see anything wrong with this code that might cause that?
I don't think there are any coding mistakes here; if there are, it's because I anonymized the code a bit.
func UsersSnapShots() {
    ctx := context.Background()
    iter := usersdb.Where("connected", "==", 1).Where("status", "==", 1).OrderBy("name", firestore.Asc).Snapshots(ctx)
    defer iter.Stop()
    for {
        next, err := iter.Next()
        log.Printf("User change detected")
        users = make(map[string]interface{})
        if err == iterator.Done {
            log.Printf("users done. That shouldn't happen! Run this function again.")
            go UsersSnapShots()
            break
        }
        if err != nil {
            sendOnlineUsers()
            log.Printf("USERS SNAPS ERROR: %v", err)
            go UsersSnapShots()
            return
        }
        doc := next.Documents
        for {
            docIter, err := doc.Next()
            if err == iterator.Done {
                break
            }
            t := docIter.Data()
            temp := []string{}
            // clients is a map of my websocket clients.
            safeClients := clients // maps are not thread safe. So let's make a copy.
            // Add the websocket IDs for these certain users to the map.
            for k, v := range safeClients {
                if v.email == t["email"] {
                    temp = append(temp, k)
                }
            }
            t["sockets"] = temp
            users[t["email"].(string)] = t
        }
        sendOnlineUsers() // Send all the logged-in users to all the users' browsers.
        //log.Printf("users: %v", users)
    }
}

Related

Authenticate Service Account for Remote Config REST API using Go

Over here the Firebase docs explain how you can retrieve a token required to make requests to the Remote Config REST API.
It provides example code for Python, Java and Node.js. Because there is no code for Go, it sends me to the Google Client Library (for Go). You might be able to understand why I am getting lost there...
The examples use GoogleCredential in Java, ServiceAccountCredentials in Python and google.auth.JWT in Node.js. I was not able to find any of those here. I do not know why there are no clear naming conventions.
I have found
firebaseremoteconfig-gen.go: The code looks like it already implements what the Firebase documentation page tries to achieve "manually". Comparison: doc, package.
Help
Because the "Usage example" of the package ends strangely abrupt and is the opposite of extensive, I do not understand how to make use of it.
I would be helped if someone could tell me how I can use this:
firebaseremoteconfigService, err := firebaseremoteconfig.New(oauthHttpClient)
I could not figure out where I would get oauthHttpClient from. There is an oauth2 package in the repository, but there I face the same problem:
oauth2Service, err := oauth2.New(oauthHttpClient)
I need oauthHttpClient again, so this cannot be a solution.
http.Client could be anything, but I need to authenticate with a service-account.json file, as shown in the three example snippets here.
Tags explanation
I hope that someone either has experience integrating Firebase Remote Config with Go, knows how Google Client API authentication works, or is good enough with Go to work out how the usage works.
There are a few main ways of authenticating with the Google APIs; they are documented here:
Link to docs
The ways documented are "3-legged OAuth", "Using API Keys" and finally "Service Accounts".
From the links that you've included in the question; you are looking at the Python / Java / Node examples of "Service Accounts".
Using Service Accounts in Go
The oauthHttpClient that you are referring to is an HTTP client that attaches the authentication information to requests automatically.
You can create one using this package:
https://godoc.org/golang.org/x/oauth2/google
The examples linked in other languages use a "service account json key file".
Using the method linked below, you can read that keyfile and create a jwt.Config struct that will give you access to the client that you need.
https://godoc.org/golang.org/x/oauth2/google#JWTConfigFromJSON
The Go equivalent of the other language examples linked is:
data, err := ioutil.ReadFile("/path/to/your-project-key.json")
if err != nil {
    log.Fatal(err)
}
conf, err := google.JWTConfigFromJSON(data, "https://www.googleapis.com/auth/firebase.remoteconfig")
if err != nil {
    log.Fatal(err)
}
// Initiate an http.Client. The following GET request will be
// authorized and authenticated on behalf of your service account.
client := conf.Client(oauth2.NoContext)
client.Get("...")
I just started using the same library (from an AppEngine Standard project). This is how I am creating the service client:
import (
    "context"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"

    "golang.org/x/oauth2/google"
    fb "google.golang.org/api/firebaseremoteconfig/v1"
    "google.golang.org/appengine"
    "google.golang.org/appengine/log"
)

const (
    // Name of our service account file
    saFileName = "my-firebase-sa.json"
    // OAuth scopes used for remote config API
    scopeRemoteConfig = "https://www.googleapis.com/auth/firebase.remoteconfig"
)

func createFirebaseService(ctx context.Context) (*fb.Service, error) {
    data, err := ioutil.ReadFile(saFileName)
    if err != nil {
        return nil, err
    }
    conf, err := google.JWTConfigFromJSON(data, scopeRemoteConfig)
    if err != nil {
        return nil, err
    }
    return fb.New(conf.Client(ctx))
}
And I call it as such:
func fetchConfig(ctx context.Context) (*fb.RemoteConfig, error) {
    s, err := createFirebaseService(ctx)
    if err != nil {
        log.Errorf(ctx, "Failed to create firebase service: %v", err)
        return nil, fmt.Errorf("Failed to initialize Firebase service")
    }
    projectID := "projects/" + appengine.AppID(ctx)
    cfg, err := s.Projects.GetRemoteConfig(projectID).Do()
    if err != nil {
        log.Errorf(ctx, "Failed to call Firebase remote config API: %v", err)
        return nil, err
    }
    return cfg, nil
}
The code uses the project ID to form its path; after reading through the library code, I noticed the path was missing the projects/ prefix, so I just prepended that to my project ID and it works ;-) At least until they fix that and my code stops working...
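For completeness, here is a sketch of reading one value out of the fetched template. The field names (Parameters, DefaultValue, Value) are my reading of the generated firebaseremoteconfig package, and "welcome_message" is a made-up key:

cfg, err := fetchConfig(ctx)
if err == nil {
    // Look up a single parameter's default value in the template.
    if p, ok := cfg.Parameters["welcome_message"]; ok && p.DefaultValue != nil {
        log.Infof(ctx, "welcome_message = %s", p.DefaultValue.Value)
    }
}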
Hopefully this helps someone.

Async work after response

I am trying to implement an HTTP server that:
Calculates the redirect target using some logic
Redirects the user
Logs user data
The goal is to achieve maximum throughput (at least 15k rps). To do this, I want to save logs asynchronously. I'm using Kafka as the logging system, and the logging block of code is separated into its own goroutine. An overall example of the current implementation:
package main

import (
    "encoding/json"
    "net/http"
    "time"

    "github.com/confluentinc/confluent-kafka-go/kafka"
)

type log struct {
    RuntimeParam  string `json:"runtime_param"`
    AsyncParam    string `json:"async_param"`
    RemoteAddress string `json:"remote_address"`
}

var (
    producer, _ = kafka.NewProducer(&kafka.ConfigMap{
        "bootstrap.servers":      "localhost:9092,localhost:9093",
        "queue.buffering.max.ms": 1 * 1000,
        "go.delivery.reports":    false,
        "client.id":              1,
    })
    topicName = "log"
)

func main() {
    siteMux := http.NewServeMux()
    siteMux.HandleFunc("/", httpHandler)
    srv := &http.Server{
        Addr:         ":8080",
        Handler:      siteMux,
        ReadTimeout:  2 * time.Second,
        WriteTimeout: 5 * time.Second,
        IdleTimeout:  10 * time.Second,
    }
    if err := srv.ListenAndServe(); err != nil {
        panic(err)
    }
}

func httpHandler(w http.ResponseWriter, r *http.Request) {
    handlerLog := new(log)
    handlerLog.RuntimeParam = "runtimeDataString"
    http.Redirect(w, r, "http://google.com", 301)
    go func(goroutineLog *log, request *http.Request) {
        goroutineLog.AsyncParam = "asyncDataString"
        goroutineLog.RemoteAddress = request.RemoteAddr
        jsonLog, err := json.Marshal(goroutineLog)
        if err == nil {
            producer.ProduceChannel() <- &kafka.Message{
                TopicPartition: kafka.TopicPartition{Topic: &topicName, Partition: kafka.PartitionAny},
                Value:          jsonLog,
            }
        }
    }(handlerLog, r)
}
The questions are:
Is it correct/efficient to use a separate goroutine to implement async logging, or should I use a different approach (workers and channels, for example)?
Maybe there is a way to further improve the server's performance that I'm missing?
Yes, this is a correct and efficient use of a goroutine, as Flimzy pointed out in the comments; I couldn't agree more, it's a good approach.
The problem is that the handler may finish executing before the goroutine has processed everything, at which point the request (which is a pointer) may be gone, or you may hit races down the middleware stack. I read in your comments that this isn't your case, but in general you shouldn't pass a request to a goroutine. As far as I can see from your code, you only use RemoteAddr from the request, so why not redirect straight away and put the logging in a defer statement? I'd rewrite your handler a bit:
func httpHandler(w http.ResponseWriter, r *http.Request) {
    http.Redirect(w, r, "http://google.com", 301)
    defer func(runtimeDataString, remoteAddr string) {
        handlerLog := new(log)
        handlerLog.RuntimeParam = runtimeDataString
        handlerLog.AsyncParam = "asyncDataString"
        handlerLog.RemoteAddress = remoteAddr
        jsonLog, err := json.Marshal(handlerLog)
        if err == nil {
            producer.ProduceChannel() <- &kafka.Message{
                TopicPartition: kafka.TopicPartition{Topic: &topicName, Partition: kafka.PartitionAny},
                Value:          jsonLog,
            }
        }
    }("runtimeDataString", r.RemoteAddr)
}
The goroutines are unlikely to improve your server's performance; you just send the response earlier, while those Kafka sends can pile up in the background and slow down the whole server. If you find this to be the bottleneck, you may consider saving logs locally and sending them to Kafka in another process (or a pool of workers) outside of your server. This may spread the workload over time (e.g., sending fewer logs when you have more requests, and vice versa).
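If you do later reach for the workers-and-channels approach mentioned in the question, a minimal sketch could look like this (logCh, startLogWorkers, and enqueueLog are illustrative names, not from the original code):

// A bounded queue of pending log entries; 1024 is an arbitrary cap.
var logCh = make(chan *log, 1024)

func startLogWorkers(n int) {
    for i := 0; i < n; i++ {
        go func() {
            for l := range logCh {
                jsonLog, err := json.Marshal(l)
                if err != nil {
                    continue
                }
                producer.ProduceChannel() <- &kafka.Message{
                    TopicPartition: kafka.TopicPartition{Topic: &topicName, Partition: kafka.PartitionAny},
                    Value:          jsonLog,
                }
            }
        }()
    }
}

// In the handler, a non-blocking send drops a log entry rather than
// stalling the request path when the buffer is full.
func enqueueLog(l *log) {
    select {
    case logCh <- l:
    default:
    }
}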

How to avoid running into max open files limit

I'm building an application that will download roughly 5,000 CSV files concurrently using goroutines and plain old HTTP GET requests, downloading the files in parallel.
I'm currently running into the open file limit imposed by OS X.
The CSV files are served over HTTP. Are there any other network protocols I can use to batch each request into one? I don't have access to the server, so I can't zip them. I'd also prefer not to change the ulimit because, once in production, I probably won't have access to that configuration.
You probably want to limit active concurrent requests to a more sensible number than 5000. Possibly spin up 10-20 workers and send individual file URLs to them over a channel.
The HTTP client will reuse connections for requests, assuming you always read the entire response body and close it.
Something like this:
package main

import (
    "io"
    "io/ioutil"
    "net/http"
    "sync"
)

func main() {
    http.DefaultTransport.(*http.Transport).MaxIdleConnsPerHost = 100
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go worker()
    }
    var csvs = []string{"http://example.com/a.csv", "http://example.com/b.csv"}
    for _, u := range csvs {
        ch <- u
    }
    close(ch)
    wg.Wait()
}

var ch = make(chan string)
var wg sync.WaitGroup

func worker() {
    defer wg.Done()
    for u := range ch {
        get(u)
    }
}

func get(u string) {
    resp, err := http.Get(u)
    if err != nil {
        return // handle/log the error as appropriate
    }
    // Make sure we always read the rest of the body, then close it,
    // so the connection can be reused.
    defer resp.Body.Close()
    defer io.Copy(ioutil.Discard, resp.Body)
    // Read and decode / handle the response here. Make sure to read all of the body.
}
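If you'd rather not maintain a worker pool, a buffered channel used as a semaphore bounds concurrency the same way; a minimal sketch (the limit of 20 is arbitrary):

// Cap the number of in-flight downloads with a channel-based semaphore.
sem := make(chan struct{}, 20)
var wg sync.WaitGroup
for _, u := range csvs {
    wg.Add(1)
    go func(u string) {
        defer wg.Done()
        sem <- struct{}{}        // acquire a slot
        defer func() { <-sem }() // release it
        get(u)
    }(u)
}
wg.Wait()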

Program halts after successive timeout while performing GET request

I'm making a crawler that fetches HTML, CSS, and JS pages. The crawler is a typical one with 4 goroutines running concurrently to fetch the resources. For study purposes, I've been using 3 test sites. The crawler works fine and shows the program-completion log while testing two of them.
On the third website, however, too many timeouts happen while fetching CSS links. This eventually causes my program to stop. It fetches the links, but after 20+ successive timeouts the program stops showing log output. Basically, it halts. I don't think it's a problem with the event log console.
Do I need to handle timeouts separately? I'm not posting the full code because it won't relate to the conceptual answer I'm seeking. However, the code goes something like this:
for {
    site, more := <-sites
    if more {
        u, err := url.Parse(site)
        if err != nil {
            continue
        }
        response, err := http.Get(u.String())
        if err != nil {
            fmt.Println("There was an error with the GET request: ", err.Error())
            continue
        }
        // Crawl function
    }
}
The default behavior of the HTTP client is to block forever. Set a timeout when you create the client (http://godoc.org/net/http#Client):
func main() {
    client := http.Client{
        Timeout: time.Second * 30,
    }
    res, err := client.Get("http://www.google.com")
    if err != nil {
        panic(err)
    }
    fmt.Println(res)
}
After 30 seconds Get will return an error.
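If you want a deadline per request instead of per client, the context package works with the standard net/http API as well; a minimal sketch (Go 1.7+):

// Cancel the request if it takes longer than 30 seconds.
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
req, err := http.NewRequest("GET", "http://www.google.com", nil)
if err != nil {
    panic(err)
}
res, err := http.DefaultClient.Do(req.WithContext(ctx))
if err != nil {
    panic(err) // a timeout surfaces here as a context deadline error
}
defer res.Body.Close()
fmt.Println(res.Status)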

Inspecting the body of an HTTP request with gocraft Middleware

I've been using the gocraft/web package so far to do some development on an HTTP service. It's really great because you can stick middleware in it to check for stuff like the presence of a Cookie in the header.
At the moment I want to implement request signing. Getting the client to sign the request is easy enough, but I want to check the signature for all endpoints with a common piece of middleware. Basically, the middleware needs to find the key to check against, compute the request HMAC, and check it against the supplied HMAC (presumably in the Authorization header).
Computing the actual HMAC is really easy in go.
The problem is: reading the message in middleware makes it unavailable to the final endpoint.
The best solution I have come up with (example shown below) is to read everything from the Request in the middleware and stuff it back into a bytes.Buffer for later reading. Is there a better way to do this? The current implementation seems a bit hackish.
Reading everything into memory sucks, but I can probably just put my service behind a proxy and limit the size of requests anyway. The actual content will always be pretty small (under 5 kilobytes). The extra copy introduced by this approach adds overhead, but computing the HMAC of a message is not exactly cheap to begin with.
The advantage of this is that it is transparent: it will work with any other Go HTTP code that just expects to read from Request.Body, without any magic.
I suppose I could be a bit slicker and use an io.TeeReader.
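That variant would look roughly like this (a sketch of the idea, not tested code from my app):

// io.TeeReader copies whatever is read from req.Body into buf as a side
// effect, so the manual read loop below goes away.
buf := &bytes.Buffer{}
hash := sha512.New()
hash.Write([]byte(req.Method))
if _, err := io.Copy(hash, io.TeeReader(req.Body, buf)); err != nil {
    panic(err)
}
req.Body = echoer{buf}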
This is my solution so far. If you post some JSON to localhost:3300, it prints the SHA-512 to the terminal in the server process, and the response contains a listing of the keys and values in it.
package main

import (
    "bytes"
    "crypto/sha512"
    "encoding/hex"
    "encoding/json"
    "fmt"
    "io"
    "net/http"

    "github.com/gocraft/web"
)

type Context struct{}

type echoer struct {
    *bytes.Buffer
}

func (e echoer) Close() error {
    // Just do nothing to make the interface happy
    return nil
}

func middlewareThatLooksAtBody(rw web.ResponseWriter, req *web.Request, next web.NextMiddlewareFunc) {
    var replacement echoer
    replacement.Buffer = &bytes.Buffer{}
    hash := sha512.New()
    hash.Write([]byte(req.Method))
    reader := req.Body
    buf := make([]byte, 64)
    for {
        amount, err := reader.Read(buf)
        fmt.Printf("Read %d bytes\n", amount)
        if amount > 0 {
            // Write only the bytes actually read, not the whole 64-byte buffer.
            hash.Write(buf[:amount])
            replacement.Write(buf[:amount])
        }
        if err == io.EOF {
            break
        }
        if err != nil {
            panic(err)
        }
        if amount == 0 {
            break
        }
    }
    //Is this needed?
    reader.Close()
    //replacement.Seek(0, 0)
    req.Body = replacement
    fmt.Printf("%v\n", hex.EncodeToString(hash.Sum(nil)))
    next(rw, req)
}

func echoJson(rw web.ResponseWriter, req *web.Request) {
    dec := json.NewDecoder(req.Body)
    var obj map[string]interface{}
    err := dec.Decode(&obj)
    if err != nil {
        rw.WriteHeader(http.StatusBadRequest)
        fmt.Fprintf(rw, "%v\n", err)
        return
    }
    for k, v := range obj {
        fmt.Fprintf(rw, "%v = %v\n", k, v)
    }
}

func main() {
    router := web.New(Context{})
    router.Middleware(middlewareThatLooksAtBody)
    router.Post("/", echoJson)
    http.ListenAndServe("localhost:3300", router)
}
From your description, it looks like you need to read all the bytes from the request body, regardless of what your handlers will do.
If so, then you have at least a couple of options that would avoid the extra copy:
1) Store the read contents inside your gocraft context.
2) Do all body data processing and validation in the middleware and store the results of the processing in the context.
Granted, this means that your handlers now must know to look for the contents in the context instead of in req.Body.
I think it's a decent trade-off though, given your requirements.
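As a minimal sketch of option 1, extending the question's Context type with a hypothetical RawBody field (the captureBody name is also made up; gocraft/web accepts context-aware middleware of this shape):

type Context struct {
    RawBody []byte
}

func captureBody(c *Context, rw web.ResponseWriter, req *web.Request, next web.NextMiddlewareFunc) {
    body, err := ioutil.ReadAll(req.Body)
    req.Body.Close()
    if err != nil {
        rw.WriteHeader(http.StatusBadRequest)
        return
    }
    c.RawBody = body // handlers can read the raw contents from the context
    // Optionally keep req.Body usable for code that doesn't know about the context:
    req.Body = ioutil.NopCloser(bytes.NewReader(body))
    next(rw, req)
}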
