problem description
Recently I learned about the -race option for detecting race conditions in Go. The full command is go run -race xxx.go, and it has helped me a lot. But for the code below I think the report is wrong. I tried several approaches (see "My try to get a panic" below) to trigger a REAL panic, but failed. So I want to know whether my code is correct and the race report is wrong, or whether you can revise my code so that I can SEE a real panic. Thanks a lot.
The code
package main
import "fmt"

type myType struct {
	A int
}

func main() {
	c := make(chan bool)
	x := new(myType)
	go func() {
		x = new(myType) // write to x
		c <- true
	}()
	_ = *x // read from x
	<-c
	fmt.Println("end")
}
The race check result
go run -race test.go
==================
WARNING: DATA RACE
Write at 0x00c00009c010 by goroutine 6:
  main.main.func1()
      /Users/yaodongen/test.go:12 +0x56

Previous read at 0x00c00009c010 by main goroutine:
  main.main()
      /Users/yaodongen/test.go:15 +0xe2

Goroutine 6 (running) created at:
  main.main()
      /Users/yaodongen/test.go:11 +0xd4
==================
end
Found 1 data race(s)
exit status 66
My point
I tried to find the reason for the race report. A post (in Chinese) mentions that an operation such as a = int64(0) is not atomic. For example, on a 32-bit machine an int64 is 64 bits long, so the CPU could copy half of the data and then be interrupted. The code under "Prove the golang copy is not atomic" below demonstrates that this is true. But in my case, x = new(myType) copies a pointer value, and I think that can be done in a single CPU copy. In other words, the operation is atomic and can never produce a race condition.
Prove the golang copy is not atomic
package main

import "fmt"
import "time"

func main() {
	var x = [...]int{1, 1, 1, 1, 1, 1}
	c := make(chan int, 100)
	go func() {
		for i := 0; ; i++ {
			if i&1 == 0 {
				x = [...]int{2, 2, 2, 2, 2, 2} // write to x
			} else {
				x = [...]int{1, 1, 1, 1, 1, 1} // write to x
			}
			c <- 0 // let other goroutine see the change of x
			<-c
		}
	}()
	go func() {
		for {
			d := x // read from x
			if d[0] != d[5] {
				fmt.Println(d)
				panic("error") // proved the copy operation is not atomic
			}
			c <- 0
			<-c
		}
	}()
	time.Sleep(time.Millisecond * 10000)
	fmt.Println("end")
}
My try to get a panic
But it failed. The code below should panic if a race condition exists (because *x would visit a wrong memory address).
package main

import "fmt"
import "time"

type myType struct {
	A int
}

func main() {
	x := new(myType)
	c := make(chan int, 100)
	go func() {
		for {
			x = new(myType) // write to x
			c <- 0
			<-c
		}
	}()
	for i := 0; i < 4; i++ {
		go func() {
			for {
				_ = *x // if exists a race condition, `*x` will visit a wrong memory address, and will panic
				c <- 0
				<-c
			}
		}()
	}
	time.Sleep(time.Second * 10)
	fmt.Println("end")
}
Go's race detector never gives false positives. If it tells you there's a race, then there is a race. It might not recognize every race (a race has to actually happen during the run to be detected), but what it reports is always a real race (bugs in the race detector not counting).
The race condition in your example is clear and simple. You have 2 goroutines: one reads a variable and the other writes it, without synchronization. This is the recipe for a race condition.
Race conditions make your app unpredictable. The behavior of a program with a race condition is undefined, and anything you observe falls under undefined, including the lack of a panic. Don't tempt the devil: if there's a race condition, use proper synchronization. End of story.
See Is it safe to read a function pointer concurrently without a lock?
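To make "proper synchronization" concrete, here is a minimal sketch of the question's first program with the shared pointer guarded by a mutex. The guarded type and its Set and Get methods are hypothetical names introduced for illustration; a channel or the sync/atomic package would be equally valid choices.

```go
package main

import (
	"fmt"
	"sync"
)

type myType struct {
	A int
}

// guarded is a hypothetical wrapper: every access to the pointer goes
// through the mutex, so reads and writes are ordered with respect to
// each other.
type guarded struct {
	mu sync.Mutex
	x  *myType
}

// Set replaces the stored pointer while holding the lock.
func (g *guarded) Set(v *myType) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.x = v
}

// Get returns the stored pointer while holding the lock.
func (g *guarded) Get() *myType {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.x
}

func main() {
	g := &guarded{x: new(myType)}
	c := make(chan bool)
	go func() {
		g.Set(new(myType)) // synchronized write
		c <- true
	}()
	_ = *g.Get() // synchronized read
	<-c
	fmt.Println("end")
}
```

With both accesses going through the same lock, go run -race should no longer report anything for this version.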
Related
I've run into an issue in my current project where I have two modules, one implementing an interface for testing purposes, and one just a concrete struct, which each depend on a method from the other.
In order to resolve this tension, I've attempted to create a top-level "container" struct that holds a reference to the dependent struct and interface, and then with a method on the container struct, assign as a member of each component struct that top level container's pointer to the other struct. I am doing this instead of using globals in order to be able to better encapsulate my code for testing purposes.
However, it seems that whichever struct is initialized first does not see the change in the other struct's address when the second struct is initialized. I do not understand why, and I don't seem to be able to make this function as expected.
Since there are many extraneous details in the actual code I've created this toy example to illustrate what I'm talking about.
type container struct {
	r requestor
	a *A
}

type requestor interface {
	Request()
}

type A struct {
	r requestor
}

type R struct {
	a *A
}

func (r R) Request() {
	log.Info("I requested")
	return
}

func (container *container) NewA() *A {
	log.Info("New A received container.r: ", container.r)
	a := &A{
		r: container.r,
	}
	container.a = a
	return a
}

func (container *container) NewR() *R {
	r := &R{
		a: container.a,
	}
	container.r = r
	return r
}

func TestDepResolution(t *testing.T) {
	top := container{}
	top.NewR()
	top.NewA()
	// top.a.r = r
	log.Infof("top: %+v", top)
	log.Infof("R: %+v", top.r)
	log.Infof("A: %+v", top.a)
}
It's setup as a test so I can easily execute it within my project. The output is as such:
=== RUN TestDepResolution
INFO[0000] New A received container.r: <nil>
INFO[0000] top: {r:0xc000010028 a:0xc00006abc0}
INFO[0000] R: &{a:0xc00006abc0}
INFO[0000] A: &{r:<nil>}
I expected that A's r variable would become equal to top's r variable after NewR() was called, but it doesn't seem to change. The same issue occurs the other way around if I switch the order of NewA() and NewR().
I expected since I am using pointers and interfaces here that the values would be connected when top's values changed, but it's apparent I must be misunderstanding something. I've tried playing around with the pointers quite a bit to no avail.
So why doesn't this work as I expected? Is there a way to make this work as I've proposed? Or am I thinking about this issue in an entirely wrongheaded way? I have tried to think about extracting functionality from the modules so that they are not mutually dependent and I could avoid this issue entirely, but I have not been able to come up with a good way to do so.
To be able to utilize pointers the way you seem to want to, you first need actual pointers (i.e. not nil pointers) and you also need to use pointer indirection to be able to "share" the updates to the pointed values.
For example:
type T struct { F string }
a := &T{"foo"} // non-nil pointer
b := a
fmt.Println(b) // output: &{foo}
*a = T{"bar"} // pointer indirection
fmt.Println(b) // output: &{bar}
For comparison, here's what your code is attempting to do:
type T struct { F string }
a := (*T)(nil) // nil pointer
b := a
fmt.Println(b) // output: <nil>
a = &T{"bar"} // plain assignment
fmt.Println(b) // output: <nil>
And note that even if you used pointer indirection, it is illegal to do so on a nil pointer and the runtime, if it encounters such an operation, will panic.
a := (*T)(nil) // nil pointer
b := a
fmt.Println(b) // output: <nil>
*a = T{"bar"} // pointer indirection on nil, will crash the program
fmt.Println(b)
So, your example doesn't work because it does not properly initialize the pointers and it does not use pointer indirection, rather, it uses simple assignment which just updates the target variable's pointer and not the pointed-to value.
To initialize the container properly you should do it in one step:
func NewContainer() *container {
	c := &container{a: &A{}}
	c.r = &R{a: c.a}
	c.a.r = c.r
	return c
}
https://play.golang.com/p/hfbqJEVyAHZ
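As a self-contained check of the one-step initialization, the sketch below repeats the types from the question (with fmt in place of the logging package, which is an assumption made only to keep the example runnable) and verifies that each side sees the other.

```go
package main

import "fmt"

type requestor interface {
	Request()
}

type container struct {
	r requestor
	a *A
}

type A struct {
	r requestor
}

type R struct {
	a *A
}

func (r R) Request() {
	fmt.Println("I requested")
}

// NewContainer wires both halves together before returning, so neither
// struct is ever left holding a nil reference to the other.
func NewContainer() *container {
	c := &container{a: &A{}}
	c.r = &R{a: c.a}
	c.a.r = c.r
	return c
}

func main() {
	c := NewContainer()
	fmt.Println(c.a.r == c.r)      // true: A holds the container's requestor
	fmt.Println(c.r.(*R).a == c.a) // true: R holds the container's A
	c.a.r.Request()                // prints "I requested"
}
```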
Or, if you want to do it in two, you can do something like this:
func (c *container) NewA() *A {
	log.Println("New A received c.r: ", c.r)
	a := &A{
		r: c.r,
	}
	if c.a != nil {
		*c.a = *a
	} else {
		c.a = a
	}
	return a
}

func (c *container) NewR() *R {
	if c.a == nil {
		c.a = new(A)
	}
	r := &R{
		a: c.a,
	}
	c.r = r
	c.a.r = r
	return r
}
https://play.golang.com/p/krmUQOsACdU
But, as you can see, the multi-step approach to initializing such tightly coupled dependencies gets unnecessarily convoluted and ugly, which makes it complex and very error-prone. Avoid it if you can.
All that said, personally, I would consider this kind of circular dependency a smell and would start thinking about redesign, but maybe that's just me.
type A struct {
	x1 []int
	x2 []string
}

func (this *A) Test() {
	fmt.Printf("this is: %p, %+v\n", this, *this)
}

func main() {
	var fn func()
	{
		a := &A{}
		a.x1 = []int{1, 2, 3}
		a.x2 = []string{"one", "two", "three"}
		fn = a.Test
	}
	fn()
}
Please see: https://play.golang.org/p/YiwHG0b1hW-
My question is:
Will 'a' be released when it goes out of the {} local scope?
Is the lifetime of 'a' the same as that of 'fn'?
Go is a garbage collected language. As long as you don't touch package unsafe (or similar like Value.UnsafeAddr() in package reflect), all values remain in memory as long as they are reachable. You don't have to worry about memory management.
That's why it's also safe to return addresses (pointers) of local variables created inside functions. And also it's safe to refer to local variables from function values (closures) that will be out of scope when the function value is executed some time in the future, like this one:
func counter() func() int {
	i := 0
	return func() int {
		i++
		return i
	}
}
This counter() returns a function (closure) which when called, returns increasing values:
c := counter()
fmt.Println(c())
fmt.Println(c())
fmt.Println(c())
This outputs (try it on the Go Playground):
1
2
3
counter() creates a local variable i which is not returned, but is accessed from the function value returned by it. As long as the returned function value is accessible, the local variable i is not freed. Also if you call counter() again, that creates a new i variable distinct from the previous one.
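The same reasoning applies to the method value fn = a.Test in the question: the method value holds the receiver pointer, so the struct stays reachable for as long as fn is. A minimal sketch, where the Sum method and the makeSum helper are hypothetical names used only for illustration:

```go
package main

import "fmt"

type A struct {
	x1 []int
}

// Sum adds up the elements of x1.
func (a *A) Sum() int {
	total := 0
	for _, v := range a.x1 {
		total += v
	}
	return total
}

// makeSum is a hypothetical helper. The variable a goes out of scope when
// makeSum returns, but the method value a.Sum keeps the pointer, so the
// struct remains reachable and is not collected.
func makeSum() func() int {
	a := &A{x1: []int{1, 2, 3}}
	return a.Sum
}

func main() {
	fn := makeSum()
	fmt.Println(fn()) // 6
}
```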
See related questions:
How to delete struct object in go?
Cannot free memory once occupied by bytes.Buffer
I am trying to understand why making the buffer size of a channel larger causes my code to behave unexpectedly. If the buffer is smaller than my input (100 ints), the output is as expected, i.e., 7 goroutines each read a subset of the input and send output on another channel, which prints it. If the buffer is the same size as or larger than the input, I get no output and no error. Am I closing a channel at the wrong time? Do I have the wrong expectation about how buffers work? Or is it something else?
package main

import (
	"fmt"
	"sync"
)

var wg1, wg2 sync.WaitGroup

func main() {
	share := make(chan int, 10)
	out := make(chan string)
	go printChan(out)
	for j := 1; j <= 7; j++ {
		go readInt(share, out, j)
	}
	for i := 1; i <= 100; i++ {
		share <- i
	}
	close(share)
	wg1.Wait()
	close(out)
	wg2.Wait()
}

func readInt(in chan int, out chan string, id int) {
	wg1.Add(1)
	for n := range in {
		out <- fmt.Sprintf("goroutine:%d was sent %d", id, n)
	}
	wg1.Done()
}

func printChan(out chan string) {
	wg2.Add(1)
	for l := range out {
		fmt.Println(l)
	}
	wg2.Done()
}
To run this:
Small buffer, expected output. http://play.golang.org/p/4r7rTGypPO
Big buffer, no output. http://play.golang.org/p/S-BDsw7Ctu
This has nothing directly to do with the size of the buffer. Adding the buffer exposes a bug in where you're calling waitGroup.Add(1).
You have to add to the WaitGroup before you dispatch the goroutine, otherwise you may end up calling Wait() before the waitGroup.Add(1) executes.
http://play.golang.org/p/YaDhc6n8_B
The reason it works in the first example and not the second is that the blocking sends on the small buffer ensure the goroutines have executed at least that far. In the second example, the for loop fills up the channel, closes it, and calls Wait before anything else can happen.
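A minimal sketch of the fix described above, with every Add call moved in front of the corresponding go statement. The run helper and its parameters are hypothetical; it collects the lines instead of printing them so the result can be checked.

```go
package main

import (
	"fmt"
	"sync"
)

// run is a hypothetical helper that wires up the pipeline from the
// question and returns every line that reached the final stage.
func run(inputs, workers, buffer int) []string {
	var wg1, wg2 sync.WaitGroup
	share := make(chan int, buffer)
	out := make(chan string)
	var lines []string

	wg2.Add(1) // Add before the goroutine is started
	go func() {
		defer wg2.Done()
		for l := range out {
			lines = append(lines, l)
		}
	}()

	for j := 1; j <= workers; j++ {
		wg1.Add(1) // Add before the goroutine is started
		go func(id int) {
			defer wg1.Done()
			for n := range share {
				out <- fmt.Sprintf("goroutine:%d was sent %d", id, n)
			}
		}(j)
	}

	for i := 1; i <= inputs; i++ {
		share <- i
	}
	close(share)
	wg1.Wait() // every worker has finished sending
	close(out)
	wg2.Wait() // the collector has drained out
	return lines
}

func main() {
	lines := run(100, 7, 200) // buffer larger than the input
	fmt.Println(len(lines))   // 100
}
```

Because the counters are incremented before Wait can possibly run, the result no longer depends on the buffer size.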
Is there a way to know whether all the values in a channel have been consumed? I'm making a crawler which recursively fetches sites starting from a seed site. I'm not closing the channel because it consumes from the server and should crawl every time a new site is sent. For a given seed site, I can't find a better way to detect completion of a subtask than timing out. If there were a way to know that no value is left in the channel to be consumed, my program could leave the subtask and continue listening to the server.
There is no such thing as "queued in an unbuffered channel." If the channel is unbuffered, it is by definition always empty. If it is buffered, then it may hold some number of elements, up to its capacity. But trying to read how many elements are in it is always going to cause race conditions, so don't design that way (len(ch) only gives a snapshot, which may be stale by the time you act on it).
Ideally, avoid designs that need to know when children are complete, but when you must, send them a channel to respond to you on. When they respond, then you know they're complete.
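As a rough sketch of "send them a channel to respond to you on": each child gets the same done channel and sends on it when it finishes, and the parent counts the responses. The crawl and crawlAll functions and the site names are hypothetical placeholders, not the asker's code.

```go
package main

import "fmt"

// crawl is a hypothetical stand-in for one subtask. It reports on done
// once its work is finished.
func crawl(site string, done chan<- string) {
	// ... fetch and process site here ...
	done <- site
}

// crawlAll starts one child per site and waits for each to respond.
func crawlAll(sites []string) int {
	done := make(chan string)
	for _, s := range sites {
		go crawl(s, done)
	}
	finished := 0
	for range sites {
		<-done // exactly one response per child
		finished++
	}
	return finished
}

func main() {
	n := crawlAll([]string{"a.example", "b.example", "c.example"})
	fmt.Println(n, "subtasks complete") // 3 subtasks complete
}
```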
The kind of problem you're describing is well covered in the Go blogs and talks:
Go Concurrency Patterns: Pipelines and cancellation
Go Concurrency Patterns: Context
Concurrency is not parallelism
Go Concurrency Patterns
Advanced Go Concurrency Patterns
You can determine whether or not a goroutine is blocked on the other end of a channel by using default in a select statement. For example:
package main

import (
	"fmt"
	"time"
)

var c = make(chan int)

func produce(i int) {
	c <- i
}

func consume() {
	for {
		select {
		case i := <-c:
			fmt.Println(i)
		default:
			return
		}
	}
}

func main() {
	for i := 0; i < 10; i++ {
		go produce(i)
	}
	time.Sleep(time.Millisecond)
	consume()
}
Keep in mind that this isn't a queue, though. If a single producing goroutine sent values in a loop, the consumer could hit the default case in the gap between one send and the next, and move on while the producer still had values to send.
You could use a timeout:
case <-time.After(time.Second):
Which would give your producer a second to produce another value, but you're probably better off using a terminal value. Wrap whatever you're sending in a struct:
type message struct {
	err  error
	data theOriginalType
}
And send that thing instead. Then use io.EOF or a custom error var Done = errors.New("DONE") to signal completion.
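A minimal sketch of the terminal-value idea, assuming int stands in for theOriginalType. The produce and consume functions are hypothetical names; the consumer keeps receiving until it sees the Done sentinel.

```go
package main

import (
	"errors"
	"fmt"
)

// Done is the sentinel error that marks the end of the stream.
var Done = errors.New("DONE")

type message struct {
	err  error
	data int // stands in for theOriginalType
}

// produce sends n data messages followed by one terminal message.
func produce(c chan<- message, n int) {
	for i := 0; i < n; i++ {
		c <- message{data: i}
	}
	c <- message{err: Done} // terminal value
}

// consume collects data until the terminal message arrives.
func consume(c <-chan message) []int {
	var got []int
	for {
		m := <-c
		if errors.Is(m.err, Done) {
			return got
		}
		got = append(got, m.data)
	}
}

func main() {
	c := make(chan message)
	go produce(c, 3)
	fmt.Println(consume(c)) // [0 1 2]
}
```

Unlike the default or timeout approaches, the consumer here never has to guess whether the producer is finished.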
Since you have a recursive problem, why not use a WaitGroup? Each time you start a new task, increment the wait group, and each time a task completes, decrement it. Then have an outer task wait for completion. For example, here's a really inefficient way of calculating a Fibonacci number:
package main

import (
	"fmt"
	"sync"
)

var wg sync.WaitGroup

func fib(c chan int, n int) {
	defer wg.Done()
	if n < 2 {
		c <- n
	} else {
		wg.Add(2)
		go fib(c, n-1)
		go fib(c, n-2)
	}
}

func main() {
	wg.Add(1)
	c := make(chan int)
	go fib(c, 18)
	go func() {
		wg.Wait()
		close(c)
	}()
	sum := 0
	for i := range c {
		sum += i
	}
	fmt.Println(sum)
}
http://play.golang.org/p/uRHG-Th_2P
I am having a hard time understanding the concept of channels.
package main

import (
	"fmt"
)

func Fibonacci(limit int, chnvar chan int) {
	x, y := 0, 1
	for i := 0; i < limit; i++ {
		chnvar <- x
		x, y = y, x+y
	}
	close(chnvar)
	v, ok := <-chnvar
	fmt.Println(v, ok)
}

func main() {
	chn := make(chan int, 10)
	go Fibonacci(cap(chn), chn)
	for elem := range chn {
		fmt.Printf("%v ", elem)
	}
}
//1 1 2 3 5 8 13 21 34
1)
How do I get a false value from the line
v, ok := <-chnvar
It says ok is false if there are no more values to get,
and also false if the channel is closed.
But in this case the channel is closed, yet I still get true.
And if I take out the close, it panics.
How and why does it return true here?
2)
The line
go Fibonacci(cap(chn), chn)
also runs without the go keyword.
What is the difference? Is it just a matter of performance?
Thanks in advance
Your Fibonacci function stuffs 10 values into the channel (which has a buffer of 10 values), and then closes it. Assuming the v, ok := <-chnvar statement executes before the main goroutine has read everything out of the channel (very likely, but not guaranteed), there will be a value to read, so ok will be true.
If you remove the close call, the for loop in the main goroutine will eventually empty the channel's buffer and block waiting for more data. Since there is no other goroutine active to write to the channel, the runtime detects this as a deadlock.
Your sample program runs with Fibonacci called directly (not as a goroutine) because the channel it writes to is buffered, and it never overruns the buffer. Therefore it can complete without blocking and allows execution to continue to the rest of the main function.
If the channel was not buffered, or you wrote more values than would fit in the buffer, then Fibonacci would block waiting for some other goroutine to read something from the channel.
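To see when a send would block, a select with a default case can probe the channel without waiting. This is only an illustration of the blocking rules described above; trySend is a hypothetical helper.

```go
package main

import "fmt"

// trySend is a hypothetical helper: it reports whether v could be sent
// on ch without blocking.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	buffered := make(chan int, 2)
	fmt.Println(trySend(buffered, 1)) // true
	fmt.Println(trySend(buffered, 2)) // true
	fmt.Println(trySend(buffered, 3)) // false: the buffer is full, a plain send would block

	unbuffered := make(chan int)
	fmt.Println(trySend(unbuffered, 1)) // false: no receiver is waiting
}
```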
1)
The Go specification states for channel receive operations (my emphasis):
x, ok := <-ch
The value of ok is true if the value received was delivered by a successful send operation to the channel, or false if it is a zero value generated because the channel is closed and empty.
That is, because the buffered channel is not empty and you have successfully received a value (0), ok will be true. You won't receive false until the channel has been emptied.
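The rule quoted from the specification can be observed directly: receives on a closed channel keep returning ok == true until the buffer is drained. A small sketch, where recvOK is a hypothetical helper:

```go
package main

import "fmt"

// recvOK is a hypothetical helper: it receives n times from ch and
// records the ok flag of each receive.
func recvOK(ch chan int, n int) []bool {
	oks := make([]bool, 0, n)
	for i := 0; i < n; i++ {
		_, ok := <-ch
		oks = append(oks, ok)
	}
	return oks
}

func main() {
	ch := make(chan int, 2)
	ch <- 0
	ch <- 1
	close(ch)
	// The two buffered values were delivered by successful sends, so ok is
	// true for them. Only the third receive finds the channel closed and empty.
	fmt.Println(recvOK(ch, 3)) // [true true false]
}
```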
2)
By running Fibonacci(cap(chn), chn) in its own goroutine, main can start receiving and processing (printing) values while the Fibonacci function is still feeding new values to the channel.
In your case, this probably never happens, since the function will have filled the buffer and completed before main gets a chance to process anything.
If it were not running in a goroutine, Fibonacci would first have to produce all the values before main could process any of them.
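For comparison, here is a sketch of the same producer with an unbuffered channel, where the goroutine is required: each send blocks until main receives, so producing and consuming are interleaved. The collect helper is a hypothetical addition, and the trailing receive from the question is left out.

```go
package main

import "fmt"

// Fibonacci sends the first limit Fibonacci numbers on out, then closes it.
func Fibonacci(limit int, out chan<- int) {
	x, y := 0, 1
	for i := 0; i < limit; i++ {
		out <- x // blocks until the receiver takes the value
		x, y = y, x+y
	}
	close(out)
}

// collect is a hypothetical helper that runs the producer in its own
// goroutine and gathers everything it sends.
func collect(limit int) []int {
	ch := make(chan int) // unbuffered
	go Fibonacci(limit, ch)
	var got []int
	for v := range ch {
		got = append(got, v)
	}
	return got
}

func main() {
	fmt.Println(collect(10)) // [0 1 1 2 3 5 8 13 21 34]
}
```

Calling Fibonacci directly here, without go, would deadlock on the first send because nothing is receiving yet.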