In golang, how to write a pipeline stage that introduces a delay for the next stage?

I am following the https://blog.golang.org/pipelines article to implement a few stages.
I need one of the stages to introduce a delay of a few seconds before the events are passed on to the next stage in the pipeline.
My concern with the code below is that it will spawn an unbounded number of goroutines, each calling time.Sleep() before passing its event along. Is there a better way to do this?
Thanks!
func fooStage(inChan <-chan *Bar) <-chan *Bar {
    out := make(chan *Bar, 10000)
    go func() {
        defer close(out)
        wg := sync.WaitGroup{}
        // range ends when inChan is closed; a bare break inside a select
        // would only leave the select, not the enclosing for loop
        for event := range inChan {
            wg.Add(1)
            // one sleeping goroutine per event: this is the unbounded part
            go func(e *Bar) {
                defer wg.Done()
                time.Sleep(5 * time.Second)
                out <- e
            }(event)
        }
        wg.Wait()
    }()
    return out
}

You could use another channel to limit the number of active goroutines your loop is able to create.
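const numRoutines = 10

func fooStage(inChan <-chan *Bar) <-chan *Bar {
    out := make(chan *Bar, 10000)
    // routines acts as a counting semaphore with numRoutines slots
    routines := make(chan struct{}, numRoutines)
    go func() {
        defer close(out)
        wg := sync.WaitGroup{}
        // range ends when inChan is closed; a bare break inside a select
        // would only leave the select, not the enclosing for loop
        for event := range inChan {
            wg.Add(1)
            routines <- struct{}{} // blocks while numRoutines goroutines are active
            go func(e *Bar) {
                defer wg.Done()
                time.Sleep(5 * time.Second)
                out <- e
                <-routines // release the slot
            }(event)
        }
        wg.Wait()
    }()
    return out
}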

You can fix the number of goroutines up front, starting only as many as you need.
func sleepStage(in <-chan *Bar) <-chan *Bar {
    // out must be bidirectional here; a receive-only channel cannot be sent on
    out := make(chan *Bar)
    var wg sync.WaitGroup
    for i := 0; i < N; i++ { // N: number of goroutines in parallel
        wg.Add(1)
        go func() {
            defer wg.Done()
            for e := range in {
                time.Sleep(5 * time.Second)
                out <- e
            }
        }()
    }
    // close out once every worker has finished draining in
    go func() {
        wg.Wait()
        close(out)
    }()
    return out
}

You can use time.Ticker:
func fooStage(inChan <-chan *Bar) <-chan *Bar {
    //... some code
    ticker := time.NewTicker(5 * time.Second)
    <-ticker.C    // the delay: the first tick arrives 5 seconds after NewTicker
    ticker.Stop() // release the ticker; ticker.C is receive-only and must not be closed
    //... rest code
}
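The timer idea can be turned into a complete stage that uses a fixed number of goroutines. The following is only a sketch under assumptions that are not in the question: events are plain ints instead of *Bar, and delayStage, stamped, queueSize and demo are hypothetical names. One goroutine stamps each event with the time at which it may be released, and a second one waits on a timer until that time is reached, so every event is delayed by roughly d from the moment it was read, and order is preserved.

```go
package main

import (
	"fmt"
	"time"
)

// stamped pairs an event with the earliest time it may be released.
type stamped struct {
	value int
	due   time.Time
}

// delayStage forwards each value from in roughly d after it was read,
// using two goroutines in total. queueSize bounds how many events may be
// waiting; when the queue is full the stage stops reading from in.
func delayStage(in <-chan int, d time.Duration, queueSize int) <-chan int {
	queue := make(chan stamped, queueSize)
	out := make(chan int)

	// Goroutine 1: stamp every event with its release time.
	go func() {
		defer close(queue)
		for v := range in {
			queue <- stamped{value: v, due: time.Now().Add(d)}
		}
	}()

	// Goroutine 2: wait until each event is due, then pass it on.
	go func() {
		defer close(out)
		for s := range queue {
			if wait := time.Until(s.due); wait > 0 {
				<-time.NewTimer(wait).C
			}
			out <- s.value
		}
	}()
	return out
}

// demo pushes n values through the stage and reports whether they all
// arrived in order, plus the elapsed time in milliseconds.
func demo(n, delayMs int) (bool, int64) {
	in := make(chan int)
	go func() {
		defer close(in)
		for i := 0; i < n; i++ {
			in <- i
		}
	}()
	start := time.Now()
	next, ordered := 0, true
	for v := range delayStage(in, time.Duration(delayMs)*time.Millisecond, 16) {
		if v != next {
			ordered = false
		}
		next++
	}
	return ordered && next == n, time.Since(start).Milliseconds()
}

func main() {
	ok, ms := demo(5, 50)
	fmt.Println("in order:", ok, "elapsed ms:", ms)
}
```

The trade-off is memory versus backpressure: a larger queueSize lets more events sit in the delay window, while a smaller one makes the upstream stage block sooner.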

This is what you should use for a pipeline application. The context allows for faster tear down.
Whatever is responsible for managing your in channel must close it during tear down. Always close your channels.
// Delay delays each `interface{}` coming in through `in` by `duration`.
// If the context is canceled, `in` will be flushed until it closes.
// Delay is very useful for throttling back CPU usage in your pipelines.
func Delay(ctx context.Context, duration time.Duration, in <-chan interface{}) <-chan interface{} {
    out := make(chan interface{})
    go func() {
        // Close out when done so that downstream stages can finish
        defer close(out)
        // Keep reading from in until it's closed
        for i := range in {
            // Take one element from in and pass it to out
            out <- i
            select {
            // Wait duration before reading from in again
            case <-time.After(duration):
            // Don't wait if the context is canceled
            case <-ctx.Done():
            }
        }
    }()
    return out
}
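To illustrate the "faster tear down" claim, here is a small usage sketch. It assumes an int-typed copy of the stage (called delay here to keep the example short) and a hypothetical helper drainAfterCancel. The context is canceled before any value is read, so the stage flushes its input without waiting, even though the configured delay is one hour.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// delay mirrors the Delay stage above, typed for ints to keep the sketch short.
func delay(ctx context.Context, d time.Duration, in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for v := range in {
			out <- v
			select {
			case <-time.After(d):
			case <-ctx.Done(): // canceled: skip the wait and keep flushing
			}
		}
	}()
	return out
}

// drainAfterCancel cancels the context up front, feeds n values through a
// stage configured with a one-hour delay, and returns how many values came
// out and how long that took in milliseconds.
func drainAfterCancel(n int) (int, int64) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()

	in := make(chan int)
	go func() {
		defer close(in) // the producer owns in and closes it during teardown
		for i := 0; i < n; i++ {
			in <- i
		}
	}()

	start := time.Now()
	count := 0
	for range delay(ctx, time.Hour, in) {
		count++
	}
	return count, time.Since(start).Milliseconds()
}

func main() {
	count, ms := drainAfterCancel(3)
	fmt.Println("values:", count, "elapsed ms:", ms)
}
```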

I have solved this kind of problem with my pipeline library, like this:
import "github.com/nazar256/parapipe"
//...
pipeline := parapipe.NewPipeline(10).
Pipe(func(msg interface{}) interface{} {
//some code
}).
Pipe(func(msg interface{}) interface{} {
time.Sleep(3*time.Second)
return msg
}).
Pipe(func(msg interface{}) interface{} {
//some other code
})

Related

goroutine deadlock with waitgroup

I was reading through the Go concurrency book, and I came across the following example.
func condVarBroadcast() {
type button struct {
clicked *sync.Cond
}
b := button{sync.NewCond(&sync.Mutex{})}
// for each call to subscribe, start up a new goroutine
// and wait using the condition variable.
subscribe := func(c *sync.Cond, fn func()) {
var w sync.WaitGroup
w.Add(1)
go func() {
w.Done()
c.L.Lock()
defer c.L.Unlock()
c.Wait()
fn()
}()
w.Wait()
}
var wg sync.WaitGroup
wg.Add(3)
subscribe(b.clicked, func() {
fmt.Println("Maximize window")
wg.Done()
})
subscribe(b.clicked, func() {
fmt.Println("displaying dialog box")
wg.Done()
})
subscribe(b.clicked, func() {
fmt.Println("mouse clicked")
wg.Done()
})
b.clicked.Broadcast()
wg.Wait()
}
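I have a couple of questions:
If I replace "w.Done()" with "defer w.Done()", there is a deadlock. Why? What is the point of calling w.Done() inside a goroutine when it clearly has not finished executing? Is this normal practice?
Why do we even need the WaitGroup w in this example? If I don't use it, I get "deadlock: all goroutines are asleep".
The race here is that main is very fast and can call Broadcast on the sync.Cond before the goroutines have started and begun waiting. The purpose of the WaitGroup w is to make sure each goroutine has started before subscribe returns, so that all of them are running before the main goroutine broadcasts. Strictly speaking, w.Done() only shows that the goroutine is running, not that it has already reached c.Wait(), so a small window remains. This also explains the deadlock with defer w.Done(): the deferred call only runs when the goroutine returns, which requires c.Wait() to be woken by Broadcast, but Broadcast is only reached after subscribe returns, and subscribe is blocked in w.Wait().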
func condVarBroadcast() {
type button struct {
clicked *sync.Cond
}
b := button{sync.NewCond(&sync.Mutex{})}
// for each call to subscribe, start up a new goroutine
// and wait using the condition variable.
subscribe := func(c *sync.Cond, fn func()) {
var w sync.WaitGroup
w.Add(1)
go func() {
w.Done() // <-------- this will notify that the goroutine has started
c.L.Lock()
defer c.L.Unlock()
c.Wait()
fn()
}()
w.Wait() // <--- this will block till the go routine starts and then continue
}
var wg sync.WaitGroup
wg.Add(3)
subscribe(b.clicked, func() {
fmt.Println("Maximize window")
wg.Done()
}) // < ----- This call will block till the go routine starts
subscribe(b.clicked, func() {
fmt.Println("displaying dialog box")
wg.Done()
}) // < ----- This call will block till the go routine starts
subscribe(b.clicked, func() {
fmt.Println("mouse clicked")
wg.Done()
}) // < ----- This call will block till the go routine starts
b.clicked.Broadcast() // <------------ Once this happens all the go routines will continue the processing
wg.Wait()
}
If you want to avoid the WaitGroup w, you have to make sure in some other way that the Broadcast call happens after all the goroutines have started. In that case you could add a simple time.Sleep(time.Second * 1) before the Broadcast call and remove w from the code. Note that a sleep only makes the race unlikely; it does not guarantee the ordering.
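For a version that does not depend on timing, one option is to signal readiness while holding the condition's lock. This is a sketch of a different technique than the one in the book example, and broadcastTo is a hypothetical name. Because Wait atomically unlocks c.L and suspends the goroutine, the broadcaster can only acquire the lock once the subscriber that holds it is parked in Wait.

```go
package main

import (
	"fmt"
	"sync"
)

// broadcastTo starts n subscribers, waits until all of them are parked in
// cond.Wait, broadcasts once, and returns how many subscribers woke up.
func broadcastTo(n int) int {
	cond := sync.NewCond(&sync.Mutex{})
	var ready, done sync.WaitGroup
	woken := 0 // guarded by cond.L

	for i := 0; i < n; i++ {
		ready.Add(1)
		done.Add(1)
		go func() {
			defer done.Done()
			cond.L.Lock()
			defer cond.L.Unlock()
			ready.Done() // signalled while the lock is held
			cond.Wait()  // atomically unlocks and suspends; relocks on wake-up
			woken++
		}()
	}

	ready.Wait()
	// Every subscriber has signalled while holding the lock. Acquiring the
	// lock here therefore succeeds only after the last of them has released
	// it inside Wait, so all n are waiting when Broadcast is called.
	cond.L.Lock()
	cond.L.Unlock()
	cond.Broadcast()

	done.Wait()
	return woken
}

func main() {
	fmt.Println("subscribers woken:", broadcastTo(3))
}
```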

how to use context.Done() with nested http middleware

I would like to know how to properly use the context's Done() channel within an HTTP server with middleware. My goal is to cancel subsequent work across nested middleware when a client disconnects.
For testing I wrote the following code. I don't know whether it is the correct approach, since I had to create a channel inside the handler and a goroutine to do the work, and then tie the two together with a select statement.
package main
import (
"fmt"
"log"
"net/http"
"time"
)
func hello(w http.ResponseWriter, r *http.Request) {
ctx := r.Context()
log.Println("handler started")
defer log.Println("hander ended")
ch := make(chan struct{})
go func() {
time.Sleep(5 * time.Second)
fmt.Fprintln(w, "Hello")
ch <- struct{}{}
}()
select {
case <-ch:
case <-ctx.Done():
err := ctx.Err()
log.Println(err)
http.Error(w, err.Error(), http.StatusPartialContent)
}
}
func main() {
http.HandleFunc("/", hello)
log.Fatal(http.ListenAndServe(":8080", nil))
}
Here the handler simulates load by sleeping for 5 seconds and then prints Hello. If the client cancels the request, for example by running:
$ curl 0:8080
and then pressing Ctrl+C, this is logged:
2017/07/07 22:22:40 handler started
2017/07/07 22:22:42 context canceled
2017/07/07 22:22:42 hander ended
This works, but I am wondering whether this pattern (the goroutine and the select) has to be used in every nested handler, or whether there is a better way of implementing this:
ch := make(chan struct{})
go func() {
// some logic
ch <- struct{}{}
}()
select {
case <-ch:
case <-ctx.Done():
err := ctx.Err()
log.Println(err)
http.Error(w, err.Error(), http.StatusPartialContent)
}
At Google, we require that Go programmers pass a Context parameter as the first argument to every function on the call path between incoming and outgoing requests.
-- Go Concurrency Patterns: Context
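Applied to the handler in the question, the quoted advice could look like the sketch below. It assumes the slow work can itself watch the context; slowStep and canceledEarly are hypothetical helpers. The handler passes r.Context() down the call path instead of starting a goroutine and selecting in every layer, and each function that can block returns ctx.Err() when the request goes away.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log"
	"net/http"
	"time"
)

// slowStep stands in for real work. It takes the context as its first
// argument and returns early with ctx.Err() if the context ends first.
func slowStep(ctx context.Context, d time.Duration) error {
	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case <-timer.C:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// hello hands the request context to the work it calls; nested handlers
// and helpers would do the same with the context they receive.
func hello(w http.ResponseWriter, r *http.Request) {
	if err := slowStep(r.Context(), 5*time.Second); err != nil {
		log.Println("request aborted:", err)
		return // the client is gone, so there is nothing useful to write
	}
	fmt.Fprintln(w, "Hello")
}

// canceledEarly reports whether slowStep stops with context.Canceled when
// its context is canceled before the work finishes.
func canceledEarly() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return errors.Is(slowStep(ctx, time.Hour), context.Canceled)
}

func main() {
	fmt.Println("stops on cancel:", canceledEarly())
	http.HandleFunc("/", hello)
	// To actually serve requests, add:
	// log.Fatal(http.ListenAndServe(":8080", nil))
}
```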

Recursion in golang is giving deadlock or negative WaitGroup counter when using goroutines, channels and sync.Waitgroup

I am trying to find the list of all directories using a recursive function. The code of the function is:
func FindDirs(dir string, nativePartitions []int64, wg *sync.WaitGroup, dirlistchan chan string) {
// defer wg.Done here will give negative waitgroup panic, commenting it will give negative waitgroup counter panic
fd, err := os.Open(dir)
if err != nil {
panic(err)
}
filenames, err := fd.Readdir(0)
if err != nil {
panic(err)
}
for _, i := range filenames {
var buff bytes.Buffer
buff.WriteString(dir)
switch dir {
case "/":
default:
buff.WriteString("/")
}
buff.WriteString(i.Name())
/*err := os.Chdir(dir)
if err != nil {
return err
}*/
t := new(syscall.Statfs_t)
err = syscall.Statfs(buff.String(), t)
if err != nil {
//fmt.Println("Error accessing", buff.String())
}
if checkDirIsNative(t.Type, nativePartitions) && i.IsDir(){
dirlistchan <- buff.String()
FindDirs(buff.String(), nativePartitions, wg, dirlistchan) //recursion happens here
} else {
//fmt.Println(i.Name(), "is not native")
}
}
}
and in the main function, I am calling it as
wg := new(sync.WaitGroup)
dirlistchan := make(chan string, 1000)
wg.Add(1)
go func() {
filtermounts.FindDirs(parsedConfig.ScanFrom, []int64{filtermounts.EXT4_SUPER_MAGIC}, wg, dirlistchan)
}()
go func() {
wg.Wait()
close(dirlistchan)
}()
for i := range dirlistchan {
fmt.Println(i)
}
wg.Wait()
and I am getting a
fatal error: all goroutines are asleep - deadlock!
I was able to get this working by printing the results instead of using channels, or by appending to a slice guarded by a mutex (verified against the Linux find command to check that the results are the same). Here is the function without channels, using sync.Mutex and append:
func FindDirs(dir string, nativePartitions []int64, dirlist *[]string, mutex *sync.Mutex) []string{
fd, err := os.Open(dir)
defer fd.Close()
if err != nil {
panic(err)
}
filenames, err := fd.Readdir(0)
if err != nil {
panic(err)
}
for _, i := range filenames {
var buff bytes.Buffer
buff.WriteString(dir)
switch dir {
case "/":
default:
buff.WriteString("/")
}
buff.WriteString(i.Name())
/*err := os.Chdir(dir)
if err != nil {
return err
}*/
t := new(syscall.Statfs_t)
err = syscall.Statfs(buff.String(), t)
if err != nil {
//fmt.Println("Error accessing", buff.String())
}
if checkDirIsNative(t.Type, nativePartitions) && i.IsDir(){
//dirlistchan <- buff.String()
mutex.Lock()
*dirlist = append(*dirlist, buff.String())
mutex.Unlock()
//fmt.Println(buff.String())
FindDirs(buff.String(), nativePartitions, dirlist, mutex)
} else {
//fmt.Println(i.Name(), "is not native")
}
}
return *dirlist
}
But I cannot think of a way to make this work with channels and goroutines. Any help is greatly appreciated.
Note: Here is a link to the golang playground with the code. I couldn't find a workaround to get the syscall thing to work on the playground either. It works on my system though.
Thanks.
Short answer: you are not closing the channel.
Fix: add defer wg.Done() at the beginning of the goroutine that calls FindDirs:
go func() {
defer wg.Done()
filtermounts.FindDirs(parsedConfig.ScanFrom, []int64{filtermounts.EXT4_SUPER_MAGIC}, wg, dirlistchan)
}()
Why it happened:
The goroutine that is responsible for closing the channel waits on wg, but there is no wg.Done() anywhere in the code above, so close never happens.
The for loop then blocks forever on the channel, waiting for either a value or a close, which causes the error:
fatal error: all goroutines are asleep - deadlock!
So here is your code; it can be run as:
go run filename.go /path/to/folder
Code
package main
import (
"bytes"
"fmt"
"os"
"sync"
"syscall"
)
func main() {
wg := new(sync.WaitGroup)
dirlistchan := make(chan string, 1000)
wg.Add(1)
go func() {
defer wg.Done()
FindDirs(os.Args[1], []int64{61267}, wg, dirlistchan)
}()
go func() {
wg.Wait()
close(dirlistchan)
}()
for i := range dirlistchan {
fmt.Println(i)
}
wg.Wait()
}
func FindDirs(dir string, nativePartitions []int64, wg *sync.WaitGroup, dirlistchan chan string) {
fd, err := os.Open(dir)
if err != nil {
panic(err)
}
filenames, err := fd.Readdir(0)
if err != nil {
panic(err)
}
for _, i := range filenames {
var buff bytes.Buffer
buff.WriteString(dir)
switch dir {
case "/":
default:
buff.WriteString("/")
}
buff.WriteString(i.Name())
/*err := os.Chdir(dir)
if err != nil {
return err
}*/
t := new(syscall.Statfs_t)
err = syscall.Statfs(buff.String(), t)
if err != nil {
//fmt.Println("Error accessing", buff.String())
}
if checkDirIsNative(t.Type, nativePartitions) && i.IsDir() {
dirlistchan <- buff.String()
FindDirs(buff.String(), nativePartitions, wg, dirlistchan) //recursion happens here
} else {
//fmt.Println(i.Name(), "is not native")
}
}
}
func checkDirIsNative(dirtype int64, nativetypes []int64) bool {
for _, i := range nativetypes {
if dirtype == i {
return true
}
}
return false
}
As has already been stated, you need to close the channel if you want the main goroutine to exit.
Example of an implementation:
In FindDirs, you could create an additional channel for every recursive FindDirs call that the function is going to make and pass that new channel as an argument. Then listen to all those new channels simultaneously and forward the strings to the channel that the function itself received as an argument.
After all the new channels have been closed, close the channel given in the argument.
In other words, every call should have its own channel that it sends to. Each string is then forwarded all the way up to the main function.
A dynamic select is described here: how to listen to N channels? (dynamic select statement)
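A simpler variant than one channel per call is a single shared channel with the WaitGroup counting the outstanding FindDirs calls. The sketch below rests on assumptions: it leaves out the Statfs filter from the question, starts one goroutine per directory (which may be too many for very large trees), and uses hypothetical names findDirs, listDirs and demo. The key points are that wg.Add(1) happens before each go statement, every call ends with wg.Done(), and a separate goroutine closes the channel once the counter reaches zero.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"sync"
)

// findDirs sends every directory below dir on out and recurses into each
// one in its own goroutine. The caller must wg.Add(1) before calling it.
func findDirs(dir string, wg *sync.WaitGroup, out chan<- string) {
	defer wg.Done()
	entries, err := os.ReadDir(dir)
	if err != nil {
		return // unreadable directory: skip it in this sketch
	}
	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		path := filepath.Join(dir, e.Name())
		out <- path
		wg.Add(1) // count the child before it starts
		go findDirs(path, wg, out)
	}
}

// listDirs collects all directories below root in sorted order.
func listDirs(root string) []string {
	var wg sync.WaitGroup
	out := make(chan string)
	wg.Add(1)
	go findDirs(root, &wg, out)
	go func() {
		wg.Wait() // all calls have returned, so nobody sends any more
		close(out)
	}()
	var dirs []string
	for d := range out {
		dirs = append(dirs, d)
	}
	sort.Strings(dirs)
	return dirs
}

// demo builds a small temporary tree (a, a/b, a/c, d) and counts its directories.
func demo() (int, error) {
	root, err := os.MkdirTemp("", "finddirs")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(root)
	for _, p := range []string{"a/b", "a/c", "d"} {
		if err := os.MkdirAll(filepath.Join(root, p), 0o755); err != nil {
			return 0, err
		}
	}
	return len(listDirs(root)), nil
}

func main() {
	n, err := demo()
	fmt.Println("directories found:", n, "error:", err)
}
```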

Go race condition in timeout handler

I can see two main issues in the example code below, but I don't know how to solve them correctly.
If the timeout handler does not receive a signal through errCh that the next handler has completed or that an error occurred, it replies "408 Request timeout" to the request.
The problem here is that the ResponseWriter is not safe for use by multiple goroutines, and the timeout handler starts a new goroutine to execute the next handler.
Issues:
How can I prevent the next handler from writing to the ResponseWriter once the ctx's Done channel has fired in the timeout handler?
How can I prevent the timeout handler from replying with a 408 status code when the next handler is in the middle of writing to the ResponseWriter and the ctx's Done channel fires in the timeout handler?
package main
import (
"context"
"fmt"
"net/http"
"time"
)
func main() {
http.Handle("/race", handlerFunc(timeoutHandler))
http.ListenAndServe(":8080", nil)
}
func timeoutHandler(w http.ResponseWriter, r *http.Request) error {
const seconds = 1
ctx, cancel := context.WithTimeout(r.Context(), time.Duration(seconds)*time.Second)
defer cancel()
r = r.WithContext(ctx)
errCh := make(chan error, 1)
go func() {
// w is not safe for concurrent use by multiple goroutines
errCh <- nextHandler(w, r)
}()
select {
case err := <-errCh:
return err
case <-ctx.Done():
// w is not safe for concurrent use by multiple goroutines
http.Error(w, "Request timeout", 408)
return nil
}
}
func nextHandler(w http.ResponseWriter, r *http.Request) error {
// just for fun to simulate a better race condition
const seconds = 1
time.Sleep(time.Duration(seconds) * time.Second)
fmt.Fprint(w, "nextHandler")
return nil
}
type handlerFunc func(w http.ResponseWriter, r *http.Request) error
func (fn handlerFunc) ServeHTTP(w http.ResponseWriter, r *http.Request) {
if err := fn(w, r); err != nil {
http.Error(w, "Server error", 500)
}
}
Here is a possible solution, based on @Andy's comment.
A new responseRecorder is passed to nextHandler, and the recorded response is copied back to the client only if nextHandler finishes before the timeout:
func timeoutHandler(w http.ResponseWriter, r *http.Request) error {
const seconds = 1
ctx, cancel := context.WithTimeout(r.Context(),
time.Duration(seconds)*time.Second)
defer cancel()
r = r.WithContext(ctx)
errCh := make(chan error, 1)
w2 := newResponseRecorder()
go func() {
errCh <- nextHandler(w2, r)
}()
select {
case err := <-errCh:
if err != nil {
return err
}
w2.cloneHeader(w.Header())
w.WriteHeader(w2.status)
w.Write(w2.buf.Bytes())
return nil
case <-ctx.Done():
http.Error(w, "Request timeout", 408)
return nil
}
}
And here is the responseRecorder:
type responseRecorder struct {
http.ResponseWriter
header http.Header
buf *bytes.Buffer
status int
}
func newResponseRecorder() *responseRecorder {
return &responseRecorder{
header: http.Header{},
buf: &bytes.Buffer{},
}
}
func (w *responseRecorder) Header() http.Header {
return w.header
}
func (w *responseRecorder) cloneHeader(dst http.Header) {
for k, v := range w.header {
tmp := make([]string, len(v))
copy(tmp, v)
dst[k] = tmp
}
}
func (w *responseRecorder) Write(data []byte) (int, error) {
if w.status == 0 {
w.WriteHeader(http.StatusOK)
}
return w.buf.Write(data)
}
func (w *responseRecorder) WriteHeader(status int) {
w.status = status
}
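The standard library's http.TimeoutHandler applies the same buffering idea and may be enough if a 503 reply is acceptable, since it answers with 503 Service Unavailable rather than 408 on timeout. The following is a usage sketch; nextHandler is reshaped into a handler factory and statusFor is a hypothetical helper that runs one request through httptest.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// nextHandler simulates work of the given duration and then writes a body.
func nextHandler(work time.Duration) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(work)
		// After a timeout this write goes to the wrapper's own writer
		// and fails there; it never reaches the real ResponseWriter.
		fmt.Fprint(w, "nextHandler")
	})
}

// statusFor wraps nextHandler in http.TimeoutHandler, serves one request
// against a recorder, and returns the resulting status code.
func statusFor(workMs, timeoutMs int) int {
	work := time.Duration(workMs) * time.Millisecond
	timeout := time.Duration(timeoutMs) * time.Millisecond
	h := http.TimeoutHandler(nextHandler(work), timeout, "Request timeout")
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/race", nil))
	return rec.Code
}

func main() {
	fmt.Println("fast handler:", statusFor(0, 1000))
	fmt.Println("slow handler:", statusFor(200, 20))
}
```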

golang http timeout and goroutines accumulation

I use goroutines to implement a timeout for http.Get, and I found that the number of goroutines keeps rising steadily. When it reaches 1000 or so, the program exits.
Code:
package main
import (
"errors"
"io/ioutil"
"log"
"net"
"net/http"
"runtime"
"time"
)
// timeout dialler
func timeoutDialler(timeout time.Duration) func(network, addr string) (net.Conn, error) {
return func(network, addr string) (net.Conn, error) {
return net.DialTimeout(network, addr, timeout)
}
}
func timeoutHttpGet(url string) ([]byte, error) {
// change dialler add timeout support && disable keep-alive
tr := &http.Transport{
Dial: timeoutDialler(3 * time.Second),
DisableKeepAlives: true,
}
client := &http.Client{Transport: tr}
type Response struct {
resp []byte
err error
}
ch := make(chan Response, 0)
defer func() {
close(ch)
ch = nil
}()
go func() {
resp, err := client.Get(url)
if err != nil {
ch <- Response{[]byte{}, err}
return
}
defer resp.Body.Close()
body, err := ioutil.ReadAll(resp.Body)
if err != nil {
ch <- Response{[]byte{}, err}
return
}
tr.CloseIdleConnections()
ch <- Response{body, err}
}()
select {
case <-time.After(5 * time.Second):
return []byte{}, errors.New("timeout")
case response := <-ch:
return response.resp, response.err
}
}
func handler(w http.ResponseWriter, r *http.Request) {
_, err := timeoutHttpGet("http://google.com")
if err != nil {
log.Println(err)
return
}
}
func main() {
go func() {
for {
log.Println(runtime.NumGoroutine())
time.Sleep(500 * time.Millisecond)
}
}()
s := &http.Server{
Addr: ":8888",
ReadTimeout: 15 * time.Second,
WriteTimeout: 15 * time.Second,
}
http.HandleFunc("/", handler)
log.Fatal(s.ListenAndServe())
}
http://play.golang.org/p/SzGTMMmZkI
Create your channel with a buffer of 1 instead of 0:
ch := make(chan Response, 1)
And remove the defer block that closes ch and sets it to nil.
See: http://blog.golang.org/go-concurrency-patterns-timing-out-and
Here is what I think is happening:
after the 5s timeout, timeoutHttpGet returns
the deferred function runs, closing ch and then setting it to nil
the goroutine that was started to do the actual fetch finishes and attempts to send its data to ch
but ch is nil, and a send on a nil channel blocks forever, so that statement never completes and the goroutine never exits
I assume you are setting ch = nil because, before you had that, you got run-time panics: that is what happens when you send on a closed channel, as described by the spec.
Giving ch a buffer of 1 means that the fetch goroutine can send to it without needing a receiver. If the handler has already returned due to the timeout, the channel and its value will simply be garbage collected later on.
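The effect of the buffer can be checked in isolation. In this sketch the HTTP fetch is replaced by a plain function, and getWithTimeout, lateWorkerExits and the extra exited channel are hypothetical names that exist only so the demo can observe the worker. On the timeout path the caller returns first, and the worker still completes its send and exits. For real HTTP calls, the Timeout field of http.Client is another way to bound a request.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// getWithTimeout runs fetch in its own goroutine and waits at most timeout
// for the result. The returned exited channel is closed once the worker
// goroutine has finished.
func getWithTimeout(fetch func() string, timeout time.Duration) (string, <-chan struct{}, error) {
	ch := make(chan string, 1) // buffer of 1: the send never needs a receiver
	exited := make(chan struct{})
	go func() {
		defer close(exited)
		ch <- fetch()
	}()
	select {
	case v := <-ch:
		return v, exited, nil
	case <-time.After(timeout):
		return "", exited, errors.New("timeout")
	}
}

// lateWorkerExits reports whether the worker goroutine still terminates
// after the caller has already given up.
func lateWorkerExits() bool {
	slow := func() string {
		time.Sleep(100 * time.Millisecond)
		return "body"
	}
	_, exited, err := getWithTimeout(slow, 10*time.Millisecond)
	if err == nil {
		return false // the call was expected to time out
	}
	select {
	case <-exited:
		return true
	case <-time.After(2 * time.Second):
		return false
	}
}

func main() {
	fmt.Println("late worker exits:", lateWorkerExits())
}
```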

Resources