Golang multiple timers with map+channel+mutex - dictionary

I'm implementing multiple timers using a map, channels, and a mutex. So that a timer can be canceled, I keep a map of channels that carries the cancel signal. Here is the code:
var timerCancelMap = make(map[string]chan interface{})
var mutexLocker sync.Mutex
func cancelTimer(timerIndex string) {
mutexLocker.Lock()
defer mutexLocker.Unlock()
timerCancelMap[timerIndex] = make(chan interface{})
timerCancelMap[timerIndex] <- struct{}{}
}
func timerStart(timerIndex string) {
fmt.Println("###### 1. start timer: ", timerIndex)
timerStillActive := true
newTimer := time.NewTimer(time.Second * 10)
for timerStillActive {
mutexLocker.Lock()
select {
case <-newTimer.C:
timerStillActive = false
fmt.Println("OOOOOOOOO timer time's up: ", timerIndex)
case <-timerCancelMap[timerIndex]:
timerCancelMap[timerIndex] = nil
timerStillActive = false
fmt.Println("XXXXXXXXX timer canceled: ", timerIndex)
default:
}
mutexLocker.Unlock()
}
fmt.Println("###### 2. end timer: ", timerIndex)
}
func main() {
for i := 0; i < 10; i++ {
go timerStart(strconv.Itoa(i))
if i%10 == 0 {
cancelTimer(strconv.Itoa(i))
}
}
}
This gives me a deadlock. If I remove all the mutex Lock/Unlock calls, I get "concurrent map read and map write" instead. So what am I doing wrong?
I know sync.Map solves my problem, but the performance suffers significantly, so I kinda wanna stick with the map solution.
Thanks in advance!

There are a few things going on here that will cause problems in your program:
cancelTimer creates a channel with make(chan interface{}), which has no buffer (a buffered channel would be created with, e.g., make(chan interface{}, 1)). This means that sending to the channel blocks until another goroutine receives from that same channel. So when you call cancelTimer from the main goroutine, it locks mutexLocker and then blocks on sending the cancellation; meanwhile no other goroutine can lock mutexLocker to receive from the cancellation channel, which causes the deadlock.
After adding a buffer, the cancelTimer call will return immediately.
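The difference between the two kinds of channel can be seen in a small sketch. The helper trySend is hypothetical (it is not part of the code in the question); it uses a select with a default case to report whether a send would go through without blocking.

```go
package main

import "fmt"

// trySend is a hypothetical helper: it reports whether a value could be
// sent on ch right now without blocking.
func trySend(ch chan struct{}) bool {
	select {
	case ch <- struct{}{}:
		return true
	default:
		return false
	}
}

func main() {
	unbuffered := make(chan struct{})
	buffered := make(chan struct{}, 1)

	fmt.Println(trySend(unbuffered)) // false: no receiver is waiting
	fmt.Println(trySend(buffered))   // true: the value fits in the buffer
	fmt.Println(trySend(buffered))   // false: the buffer is now full
}
```

The first case is the situation cancelTimer is in: the send cannot complete until a receiver shows up, and the receiver is waiting for the mutex.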
We will then run into a few other little issues. The first is that the program will immediately quit without printing anything. This happens because after launching the test goroutines and sending the cancel, the main goroutine has done all of its work and returns from main, which ends the program even though other goroutines are still running. So we need to make the main goroutine wait for the others, which sync.WaitGroup is well suited for:
func main() {
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1) // register the goroutine before starting it
		go func(i int) {
			defer wg.Done()
			timerStart(strconv.Itoa(i))
		}(i)
		if i%10 == 0 {
			cancelTimer(strconv.Itoa(i))
		}
	}
	wg.Wait() // block until every timer goroutine has returned
}
I can see you've added the mutexLocker to protect the map and later added the for loop to give each goroutine an opportunity to acquire mutexLocker to check its timer. This results in a lot of work for the computer (every goroutine spins in a busy loop), and more complicated code than is necessary. Instead of having timerStart look up its index in the cancellations map, we can provide the cancellation channel as an argument:
func testTimer(i int, cancel <-chan interface{}) {
and have the main function create the channels. You will then be able to remove the map access, the mutexLocker locking, and the for loop from testTimer (the timer function, timerStart in your code). If you still require the map for purposes not shown here, you can put the same channel in the map that you pass to testTimer, and if not you can remove all of that code too.
This all ends up looking something like https://play.golang.org/p/iQUvc52B6Nk
Hope that helps 👍

Related

Why does my code work correctly when I run wg.Wait() inside a goroutine?

I have a list of urls that I am scraping. What I want to do is store all of the successfully scraped page data into a channel, and when I am done, dump it into a slice. I don't know how many successful fetches I will get, so I cannot specify a fixed length. I expected the code to reach wg.Wait() and then wait until all the wg.Done() methods are called, but I never reached the close(queue) statement. Looking for a similar answer, I came across this SO answer
https://stackoverflow.com/a/31573574/5721702
where the author does something similar:
ports := make(chan string)
toScan := make(chan int)
var wg sync.WaitGroup
// make 100 workers for dialing
for i := 0; i < 100; i++ {
wg.Add(1)
go func() {
defer wg.Done()
for p := range toScan {
ports <- worker(*host, p)
}
}()
}
// close our receiving ports channel once all workers are done
go func() {
wg.Wait()
close(ports)
}()
As soon as I wrapped my wg.Wait() inside the goroutine, close(queue) was reached:
urls := getListOfURLS()
activities := make([]Activity, 0, limit)
queue := make(chan Activity)
for i, activityURL := range urls {
wg.Add(1)
go func(i int, url string) {
defer wg.Done()
activity, err := extractDetail(url)
if err != nil {
log.Println(err)
return
}
queue <- activity
}(i, activityURL)
}
// calling it like this without the goroutine causes the execution to hang
// wg.Wait()
// close(queue)
// calling it like this successfully waits
go func() {
wg.Wait()
close(queue)
}()
for a := range queue {
// block channel until valid url is added to queue
// once all are added, close it
activities = append(activities, a)
}
Why does the code not reach the close if I don't use a goroutine for wg.Wait()? I would think that all of the defer wg.Done() statements are called, so eventually it would clear up because it gets to the wg.Wait(). Does it have to do with receiving values in my channel?
You need to wait for the goroutines to finish in a separate goroutine because queue needs to be read from at the same time. When you do the following:
queue := make(chan Activity)
for i, activityURL := range urls {
wg.Add(1)
go func(i int, url string) {
defer wg.Done()
activity, err := extractDetail(url)
if err != nil {
log.Println(err)
return
}
queue <- activity // nothing is reading data from queue.
}(i, activityURL)
}
wg.Wait()
close(queue)
for a := range queue {
activities = append(activities, a)
}
Each goroutine blocks at queue <- activity since queue is unbuffered and nothing is reading data from it. This is because the range loop on queue is in the main goroutine after wg.Wait.
wg.Wait will only unblock once all the goroutines return. But as mentioned, all the goroutines are blocked at the channel send.
When you use a separate goroutine to wait, code execution actually reaches the range loop on queue.
// wg.Wait does not block the main goroutine.
go func() {
wg.Wait()
close(queue)
}()
This results in the goroutines unblocking at the queue <- activity statement (the main goroutine starts reading off queue) and running to completion, which in turn calls each individual wg.Done.
Once the waiting goroutine gets past wg.Wait, queue is closed and the main goroutine exits the range loop on it.
The queue channel is unbuffered, so every goroutine trying to write to it blocks because the reader has not started yet. So no goroutine can write and they all hang; as a result wg.Wait waits forever.
Try to launch reader in a separate goroutine:
go func() {
for a := range queue {
// block channel until valid url is added to queue
// once all are added, close it
activities = append(activities, a)
}
}()
and then start waiter:
wg.Wait()
close(queue)
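This way you do not accumulate all the data in the channel; you receive each value as it arrives and append it to the target slice. Note that the reader goroutine may still be appending when close(queue) returns, so wait for it to finish before using activities.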
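A minimal sketch of that arrangement, with ints standing in for Activity values. The function name collect, the done channel, and the "odd inputs fail" rule are assumptions made up for the example; the done channel is what tells the caller that the reader has drained the queue.

```go
package main

import (
	"fmt"
	"sync"
)

// collect (hypothetical) starts one producer per input and gathers the
// successful results through a reader goroutine.
func collect(inputs []int) []int {
	queue := make(chan int)
	done := make(chan struct{})
	var results []int

	// Reader: started before any producer can block on the send.
	go func() {
		defer close(done)
		for v := range queue {
			results = append(results, v)
		}
	}()

	var wg sync.WaitGroup
	for _, in := range inputs {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			if n%2 != 0 {
				return // simulated failed fetch: nothing is sent
			}
			queue <- n * n
		}(in)
	}

	wg.Wait()    // all producers have returned
	close(queue) // lets the reader's range loop end
	<-done       // wait until the reader has appended the last value
	return results
}

func main() {
	fmt.Println(len(collect([]int{1, 2, 3, 4}))) // prints 2
}
```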

Golang persistent channel accepting input from multiple function calls

I have a function a:
func a(input *some_type) {
// do sth.
b(input)
}
This function gets called multiple times.
I want a function b to wait indefinitely for input from function a and perform an action when it has collected n inputs.
func b(input *some_type) {
// wait until received n inputs then do sth. with all inputs
}
How would I go about doing this? My first thought was to use a sync.WaitGroup with a channel between a and b.
This is a common producer-consumer problem. Use channels to wait on the input from another routine. Does something like this help?
In this particular example, you would have to call go b(c) again after collecting the inputs as it terminates, but you could easily wrap whatever b does in an infinite for loop. Or whatever needs to happen.
Please note that in this example an unbuffered channel is used, which forces both goroutines to meet at the same time to "hand off" the *Thing. If you want the producer (a's process) to not have to wait, you can use a buffered channel, which is created like so:
c := make(chan *Thing, n)
Where n is the number of items the channel can store. This allows several to be queued by the producer.
https://play.golang.org/p/X14_QsSSU4
package main
import (
"fmt"
"time"
)
type Thing struct {
N int
}
func a(t *Thing, c chan (*Thing)) {
// stuff happens. whee
c <- t
}
func b(c chan (*Thing)) {
things := []*Thing{}
for i := 0; i < 10; i++ {
t := <-c
things = append(things, t)
fmt.Printf("I have %d things\n", i+1)
}
fmt.Println("I now have 10 things! Let's roll!")
// do stuff with your ten things
}
func main() {
fmt.Println("Hello, playground")
c := make(chan (*Thing))
go b(c)
// this would probably be done producer-consumer like in a go-routine
for i := 0; i < 10; i++ {
a(&Thing{i}, c)
time.Sleep(time.Second)
}
time.Sleep(time.Second)
fmt.Println("Program finished")
}
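If b should keep running and act on every group of n inputs rather than terminate after the first group, the loop can be wrapped as mentioned above. This is only a sketch: the name batch, the handle callback, and the decision to drop an incomplete final group are my assumptions, not part of the original example.

```go
package main

import "fmt"

type Thing struct {
	N int
}

// batch (hypothetical) reads from in and calls handle once for every n
// items received. It returns when in is closed; a final group with fewer
// than n items is dropped in this sketch.
func batch(in <-chan *Thing, n int, handle func([]*Thing)) {
	buf := make([]*Thing, 0, n)
	for t := range in {
		buf = append(buf, t)
		if len(buf) == n {
			handle(buf)
			buf = make([]*Thing, 0, n) // start a fresh group
		}
	}
}

func main() {
	in := make(chan *Thing, 4) // buffered, so the producer rarely waits
	done := make(chan struct{})
	go func() {
		defer close(done)
		batch(in, 3, func(b []*Thing) {
			fmt.Println("got a batch of", len(b))
		})
	}()
	for i := 0; i < 7; i++ {
		in <- &Thing{N: i}
	}
	close(in)
	<-done // prints "got a batch of 3" twice; the 7th item is dropped
}
```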

Golang http server implementation

I have read that net/http starts a goroutine for each connection, and I have a few questions. I haven't seen any parameter to limit the number of goroutines it spawns. For example, if I have to handle 1 million concurrent requests per second, what will happen? Do we have any control over the spawned goroutines? If it spawns one goroutine per connection, won't it choke my entire system? What is the recommended way of handling a huge number of concurrent requests in a Go web server? I have to handle both asynchronous and synchronous responses.
The job/worker pattern is a common Go concurrency pattern that is well suited to this task.
Multiple goroutines can read from a single channel, distributing the work between CPU cores, hence the name "workers". In Go, this pattern is easy to implement: start a number of goroutines with the channel as a parameter and send values to that channel. The distributing and multiplexing is done by the Go runtime.
package main
import (
	"fmt"
	"sync"
	"time"
)
// worker processes tasks until tasksCh is closed.
func worker(tasksCh <-chan int, wg *sync.WaitGroup) {
	defer wg.Done()
	for {
		task, ok := <-tasksCh
		if !ok {
			return // channel closed: no more work
		}
		d := time.Duration(task) * time.Millisecond
		time.Sleep(d) // simulate work
		fmt.Println("processing task", task)
	}
}
// pool starts the workers, feeds them the tasks, then closes the channel.
func pool(wg *sync.WaitGroup, workers, tasks int) {
	tasksCh := make(chan int)
	for i := 0; i < workers; i++ {
		go worker(tasksCh, wg)
	}
	for i := 0; i < tasks; i++ {
		tasksCh <- i
	}
	close(tasksCh)
}
func main() {
	var wg sync.WaitGroup
	wg.Add(36) // one per worker, added before any worker starts
	go pool(&wg, 36, 50)
	wg.Wait()
}
All the worker goroutines run concurrently, waiting for the channel to give them work, and they receive their tasks almost immediately, one after another.
Here is a great article about how you can handle 1 million requests per minute in go: http://marcio.io/2015/07/handling-1-million-requests-per-minute-with-golang/
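On the question of limiting how much work runs at once inside an HTTP server, one common approach is to use a buffered channel as a counting semaphore in a middleware. The following is a sketch under my own assumptions: limit, okHandler, and statusFor are hypothetical names, the 503 response for excess requests is one possible policy, and httptest is used only so the example runs without opening a socket. In a real server the wrapped handler would be passed to http.ListenAndServe instead.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// limit (hypothetical) lets at most n requests run next at the same time.
// Requests that arrive while all n slots are taken get a 503 response.
func limit(n int, next http.Handler) http.Handler {
	sem := make(chan struct{}, n) // each buffered slot is one permit
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case sem <- struct{}{}: // acquired a slot
			defer func() { <-sem }() // release it when the handler returns
			next.ServeHTTP(w, r)
		default: // no slot free
			http.Error(w, "server busy", http.StatusServiceUnavailable)
		}
	})
}

// okHandler is a stand-in for the real request handler.
var okHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintln(w, "ok")
})

// statusFor sends one in-memory request to h and returns the status code.
func statusFor(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest("GET", "/", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor(limit(2, okHandler))) // 200: a slot was free
	fmt.Println(statusFor(limit(0, okHandler))) // 503: there are no slots at all
}
```

Dropping the default case would make excess requests wait for a slot instead of being rejected; which behavior is preferable depends on the service.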

Golang: Why does increasing the size of a buffered channel eliminate output from my goroutines?

I am trying to understand why making the buffer size of a channel larger causes my code to behave unexpectedly. If the buffer is smaller than my input (100 ints), the output is as expected, i.e., 7 goroutines each read a subset of the input and send output on another channel which prints it. If the buffer is the same size or larger than the input, I get no output and no error. Am I closing a channel at the wrong time? Do I have the wrong expectation about how buffers work? Or, something else?
package main
import (
"fmt"
"sync"
)
var wg1, wg2 sync.WaitGroup
func main() {
share := make(chan int, 10)
out := make(chan string)
go printChan(out)
for j:= 1; j<=7; j++ {
go readInt(share, out, j)
}
for i:=1; i<=100; i++ {
share <- i
}
close(share)
wg1.Wait()
close(out)
wg2.Wait()
}
func readInt(in chan int, out chan string, id int) {
wg1.Add(1)
for n := range in {
out <- fmt.Sprintf("goroutine:%d was sent %d", id, n)
}
wg1.Done()
}
func printChan(out chan string){
wg2.Add(1)
for l := range out {
fmt.Println(l)
}
wg2.Done()
}
To run this:
Small buffer, expected output. http://play.golang.org/p/4r7rTGypPO
Big buffer, no output. http://play.golang.org/p/S-BDsw7Ctu
This has nothing directly to do with the size of the buffer. Adding the buffer exposes a bug in where you're calling wg1.Add(1) and wg2.Add(1).
You have to add to the WaitGroup before you dispatch the goroutine; otherwise you may end up calling Wait() before Add(1) executes, in which case Wait() returns immediately.
http://play.golang.org/p/YaDhc6n8_B
The reason it works in the first version and not the second is that, with a small buffer, the sends on share block until the goroutines start receiving, which ensures that they have executed at least as far as their Add(1) calls. In the second version, the for loop fills up the channel, closes it, and calls Wait before anything else can happen.
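A sketch of the corrected structure, in case the playground link is unavailable. It is my own reconstruction rather than the linked code: the function fanOut is hypothetical, it collects the lines instead of printing them so the result can be checked, and it uses a done-style channel (printed) in place of the second WaitGroup.

```go
package main

import (
	"fmt"
	"sync"
)

// fanOut (hypothetical) sends 1..n through a channel with the given buffer
// size to a set of reader goroutines and returns every line they produce.
func fanOut(n, workers, buf int) []string {
	share := make(chan int, buf)
	out := make(chan string)
	printed := make(chan struct{})
	var lines []string

	// Collector: stands in for printChan.
	go func() {
		defer close(printed)
		for l := range out {
			lines = append(lines, l)
		}
	}()

	var readers sync.WaitGroup
	for id := 1; id <= workers; id++ {
		readers.Add(1) // Add happens before the goroutine is started
		go func(id int) {
			defer readers.Done()
			for v := range share {
				out <- fmt.Sprintf("goroutine:%d was sent %d", id, v)
			}
		}(id)
	}

	for i := 1; i <= n; i++ {
		share <- i
	}
	close(share)
	readers.Wait() // cannot return early: the counter is already at workers
	close(out)
	<-printed
	return lines
}

func main() {
	// The buffer size no longer affects the result.
	fmt.Println(len(fanOut(100, 7, 10)), len(fanOut(100, 7, 200))) // prints 100 100
}
```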

How to find out nothing is being received in an unbuffered channel without closing it?

Is there a way to know if all the values in channel has been consumed? I'm making a crawler which recursively fetches sites from seed site. I'm not closing the channel because it consumes from the server and should crawl every time new site is sent. For a given seed site, I can't find a better way to know completion of a subtask other than timing out. If there was a way to know that there is no value in channel(left to be consumed), my program could get out of the sub task and continue listening to the server.
There is no such thing as "queued in an unbuffered channel." If the channel is unbuffered, it is by definition always empty. If it is buffered, then it may hold some number of elements, up to its capacity. Go does let you ask with len(ch), but the answer can be out of date by the time you act on it, so a design that depends on it is prone to race conditions; don't design that way.
Ideally, avoid designs that need to know when children are complete, but when you must, send them a channel to respond to you on. When they respond, then you know they're complete.
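As a rough sketch of "send them a channel to respond to you on": each child receives a done channel and reports on it when it has finished, and the parent counts the replies. The names crawl and waitAll are hypothetical, and the crawl body is a placeholder for the real work.

```go
package main

import "fmt"

// crawl (hypothetical) stands in for one subtask. When its work is
// finished it reports back on the done channel it was given.
func crawl(site string, done chan<- string) {
	// ... fetch and process the site here ...
	done <- site
}

// waitAll starts one crawl per site and returns once each has reported.
func waitAll(sites []string) []string {
	done := make(chan string)
	for _, s := range sites {
		go crawl(s, done)
	}
	finished := make([]string, 0, len(sites))
	for range sites { // exactly one reply is expected per child
		finished = append(finished, <-done)
	}
	return finished
}

func main() {
	got := waitAll([]string{"a.example", "b.example", "c.example"})
	fmt.Println(len(got)) // prints 3
}
```

The parent never inspects the channel's contents; it knows the subtask is complete because it has received the expected number of replies.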
The kind of problem you're describing is well covered in the Go blogs and talks:
Go Concurrency Patterns: Pipelines and cancellation
Go Concurrency Patterns: Context
Concurrency is not parallelism
Go Concurrency Patterns
Advanced Go Concurrency Patterns
You can determine whether or not a goroutine is blocked on the other end of a channel by using default in a select statement. For example:
package main
import (
"fmt"
"time"
)
var c = make(chan int)
func produce(i int) {
c <- i
}
func consume() {
for {
select {
case i := <-c:
fmt.Println(i)
default:
return
}
}
}
func main() {
for i := 0; i < 10; i++ {
go produce(i)
}
time.Sleep(time.Millisecond)
consume()
}
Keep in mind that this isn't a queue, though. If a single producer goroutine sends several values in a loop, then in the gap between one send completing and the producer coming back around to the next send, the consumer's default case would fire and the consumer would move on even though more values were coming.
You could use a timeout:
case <-time.After(time.Second):
Which would give your producer a second to produce another value, but you're probably better off using a terminal value. Wrap whatever you're sending in a struct:
type message struct {
err error
data theOriginalType
}
And send that thing instead. Then use io.EOF or a custom error var Done = errors.New("DONE") to signal completion.
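Put together, the terminal-value approach might look like the sketch below. The data type is int here purely for illustration (standing in for theOriginalType), and produce, consume, and sumOf are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Done is the sentinel error that marks the end of the stream.
var Done = errors.New("DONE")

type message struct {
	err  error
	data int // stands in for theOriginalType
}

// produce sends every value and then one terminal message.
func produce(values []int, out chan<- message) {
	for _, v := range values {
		out <- message{data: v}
	}
	out <- message{err: Done}
}

// consume adds up the data until it sees the terminal message.
// Other errors are not handled in this sketch.
func consume(in <-chan message) int {
	sum := 0
	for {
		m := <-in
		if errors.Is(m.err, Done) {
			return sum
		}
		sum += m.data
	}
}

// sumOf wires one producer to one consumer over an unbuffered channel.
func sumOf(values []int) int {
	ch := make(chan message)
	go produce(values, ch)
	return consume(ch)
}

func main() {
	fmt.Println(sumOf([]int{1, 2, 3})) // prints 6
}
```

The channel is never closed, which matches the constraint in the question: the same channel can carry the next subtask's values after the consumer has seen Done.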
Since you have a recursive problem, why not use a WaitGroup? Each time you start a new task, increment the wait group, and each time a task completes, decrement it. Then have an outer task waiting on completion. For example, here's a really inefficient way of calculating a Fibonacci number:
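package main
import (
	"fmt"
	"sync"
)
var wg sync.WaitGroup
// fib sends one value per leaf of the recursion; the leaves sum to fib(n).
func fib(c chan int, n int) {
	defer wg.Done()
	if n < 2 {
		c <- n
	} else {
		wg.Add(2) // register both child tasks before starting them
		go fib(c, n-1)
		go fib(c, n-2)
	}
}
func main() {
	wg.Add(1)
	c := make(chan int)
	go fib(c, 18)
	// Close c once every task has finished, which ends the range loop below.
	go func() {
		wg.Wait()
		close(c)
	}()
	sum := 0
	for i := range c {
		sum += i
	}
	fmt.Println(sum)
}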

Resources