Is it safe to range over a map without locking if multiple goroutines will run the notifyAll func? Also, while ranging, I sometimes need to remove entries from the map.
var mu sync.RWMutex
func (self *Server) notifyAll(event *Event) {
ch := make(chan int, 64)
num := 0
for k, v := range self.connections {
num++
ch <- num
go func(k int, conn *Conn) {
err := conn.sendMessage(event)
<-ch
if err != nil {
self.removeConn(k)
}
}(k, v)
}
}
func (self *Server) removeConn(k int) {
mu.Lock()
defer mu.Unlock()
delete(self.connections, k)
}
// Somewhere in another goroutine
func (self *Server) addConn(conn *Conn, k int) {
mu.Lock()
defer mu.Unlock()
self.connections[k] = conn
}
Or must I RLock the map before ranging over it?
func (self *Server) notifyAll(event *Event) {
mu.RLock()
defer mu.RUnlock()
// Skipped previous body...
}
Short answer: maps in Go are not safe for concurrent use (not "thread-safe", if you prefer that term).
So, if you need to access a map from different goroutines, you must employ some form of access orchestration; otherwise "uncontrolled map access can crash the program" (see the Go FAQ).
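One way to orchestrate access with the mutex from the question is to copy the map while holding a read lock and then send without holding any lock. The following is only a minimal sketch: Conn, Event and sendMessage are hypothetical stand-ins for the question's types, and the mutex is moved into the Server struct as an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Conn is a hypothetical stand-in for the question's connection type.
type Conn struct{ broken bool }

// Event is a hypothetical stand-in for the question's event type.
type Event struct{ Name string }

// sendMessage fails for connections marked as broken.
func (c *Conn) sendMessage(e *Event) error {
	if c.broken {
		return errors.New("send failed")
	}
	return nil
}

type Server struct {
	mu          sync.RWMutex
	connections map[int]*Conn
}

func (s *Server) addConn(k int, c *Conn) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.connections[k] = c
}

func (s *Server) removeConn(k int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.connections, k)
}

// notifyAll copies the map under a read lock and sends after releasing it,
// so the removeConn calls below never wait on this function's own lock.
func (s *Server) notifyAll(e *Event) {
	s.mu.RLock()
	snapshot := make(map[int]*Conn, len(s.connections))
	for k, c := range s.connections {
		snapshot[k] = c
	}
	s.mu.RUnlock()

	var wg sync.WaitGroup
	for k, c := range snapshot {
		wg.Add(1)
		go func(k int, c *Conn) {
			defer wg.Done()
			if err := c.sendMessage(e); err != nil {
				s.removeConn(k)
			}
		}(k, c)
	}
	wg.Wait()
}

// remaining broadcasts once to two healthy connections and one broken one
// and reports how many connections are left afterwards.
func remaining() int {
	s := &Server{connections: make(map[int]*Conn)}
	s.addConn(1, &Conn{})
	s.addConn(2, &Conn{broken: true})
	s.addConn(3, &Conn{})
	s.notifyAll(&Event{Name: "ping"})
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.connections)
}

func main() {
	fmt.Println("connections left:", remaining())
}
```

The trade-off is that a connection added during a broadcast may miss that event, because the broadcast works on the snapshot.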
Edit:
This is another implementation (leaving out housekeeping concerns such as timeouts, quit and logging) which avoids the mutex altogether and uses a more Go-ish approach. It is only meant to demonstrate an approach that removes the access orchestration concerns; it might or might not be right for your case:
type Server struct {
connections map[*Conn]struct{}
_removeConn, _addConn chan *Conn
_notifyAll chan *Event
}
func NewServer() *Server {
s := new(Server)
s.connections = make(map[*Conn]struct{})
s._addConn = make(chan *Conn)
s._removeConn = make(chan *Conn, 1)
s._notifyAll = make(chan *Event)
go s.agent()
return s
}
func (s *Server) agent() {
for {
select {
case c := <-s._addConn:
s.connections[c] = struct{}{}
case c := <-s._removeConn:
delete(s.connections, c)
case e := <-s._notifyAll:
for c := range s.connections {
closure := c
go func() {
err := closure.sendMessage(e)
if err != nil {
s._removeConn <- closure
}
}()
}
}
}
}
func (s *Server) removeConn(c *Conn) {
s._removeConn <- c
}
func (s *Server) addConn(c *Conn) {
s._addConn <- c
}
Edit:
I stand corrected; according to Damian Gryski, maps are safe for concurrent reads. The reason the map order changes on each iteration is "the random seed chosen for map iteration order, which is local to the goroutine iterating" (from another tweet of his). This does not affect the first edit or the suggested solution.
Following up on old post here.
I am iterating over the flatProduct.Catalogs slice and populating my productCatalog concurrent map in Go. I am using the Upsert method so that only unique product IDs are added to the productCatalog map.
The code below is called by multiple goroutines in parallel, which is why I am using a concurrent map here. It runs in the background and repopulates the concurrent map every 30 seconds.
var productRows []ClientProduct
err = json.Unmarshal(byteSlice, &productRows)
if err != nil {
return err
}
for i := range productRows {
flatProduct, err := r.Convert(spn, productRows[i])
if err != nil {
return err
}
if flatProduct.StatusCode == definitions.DONE {
continue
}
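// ProductId is an int64 (see the type assertion below), so use FormatInt; Itoa takes a single int
r.products.Set(strconv.FormatInt(flatProduct.ProductId, 10), flatProduct)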
for _, catalogId := range flatProduct.Catalogs {
catalogValue := strconv.FormatInt(int64(catalogId), 10)
r.productCatalog.Upsert(catalogValue, flatProduct.ProductId, func(exists bool, valueInMap interface{}, newValue interface{}) interface{} {
productID := newValue.(int64)
if valueInMap == nil {
return map[int64]struct{}{productID: {}}
}
oldIDs := valueInMap.(map[int64]struct{})
// value is irrelevant, no need to check if key exists
// I think problem is here
oldIDs[productID] = struct{}{}
return oldIDs
})
}
}
Below are my getters, defined on the same type as the code above. They are used by the main application threads to get data from the map or to get the whole map.
func (r *clientRepository) GetProductMap() *cmap.ConcurrentMap {
return r.products
}
func (r *clientRepository) GetProductCatalogMap() *cmap.ConcurrentMap {
return r.productCatalog
}
func (r *clientRepository) GetProductData(pid string) *definitions.FlatProduct {
pd, ok := r.products.Get(pid)
if ok {
return pd.(*definitions.FlatProduct)
}
return nil
}
This is how I am reading data from the productCatalog cmap, but my system crashes on the range statement below:
// get productCatalog map which was populated above
catalogProductMap := clientRepo.GetProductCatalogMap()
productIds, _ := catalogProductMap.Get("211")
data, _ := productIds.(map[int64]struct{})
// I get a panic here after some time
for pid := range data {
...
}
The error I am getting is: fatal error: concurrent map iteration and map write.
I think the issue is that r.productCatalog is a concurrent map, but oldIDs is a plain map, which causes problems while I am iterating over it in the for loop above.
How can I fix this race? One way I can think of is to make oldIDs a concurrent map too, but with that approach my memory usage increases a lot and the process eventually goes OOM. Below is what I have tried; it solves the race condition, but it increases memory by a lot, which is not what I want:
r.productCatalog.Upsert(catalogValue, flatProduct.ProductId, func(exists bool, valueInMap interface{}, newValue interface{}) interface{} {
productID := newValue.(int64)
if valueInMap == nil {
// return map[int64]struct{}{productID: {}}
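// seed the new concurrent map with the first product ID so it is not lost
ids := cmap.New()
ids.Set(strconv.FormatInt(productID, 10), struct{}{})
return ids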
}
// oldIDs := valueInMap.(map[int64]struct{})
oldIDs := valueInMap.(cmap.ConcurrentMap)
// value is irrelevant, no need to check if key exists
// oldIDs[productID] = struct{}{}
oldIDs.Set(strconv.FormatInt(productID, 10), struct{}{})
return oldIDs
})
Is there any other approach that fixes the race condition without increasing memory?
Note
I am still using the v1 version of cmap, without generics, which uses strings as keys.
Rather than a plain map[int64]struct{} type, you could define a struct which holds the map and a mutex to control the access to the map:
type myMap struct{
m sync.Mutex
data map[int64]struct{}
}
func (m *myMap) Add(productID int64) {
m.m.Lock()
defer m.m.Unlock()
m.data[productID] = struct{}{}
}
func (m *myMap) List() []int64 {
m.m.Lock()
defer m.m.Unlock()
var res []int64
for id := range m.data {
res = append(res, id)
}
// sort slice if you need
return res
}
With the sample implementation above, you would have to be careful to store *myMap pointers (as opposed to plain myMap structs) in your cmap.ConcurrentMap structure.
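As a rough sketch of how such a type could be constructed and shared, the following uses a hypothetical idSet type (same shape as myMap) with a constructor that returns a pointer. The names idSet, newIDSet and fill are illustrative and not part of the answer above.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// idSet has the same shape as myMap above; the name is illustrative.
type idSet struct {
	mu   sync.Mutex
	data map[int64]struct{}
}

// newIDSet returns a pointer so every holder shares one mutex and one map.
func newIDSet(ids ...int64) *idSet {
	s := &idSet{data: make(map[int64]struct{}, len(ids))}
	for _, id := range ids {
		s.data[id] = struct{}{}
	}
	return s
}

// Add inserts id while holding the lock.
func (s *idSet) Add(id int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[id] = struct{}{}
}

// List returns a sorted copy, so callers never range over the live map.
func (s *idSet) List() []int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	res := make([]int64, 0, len(s.data))
	for id := range s.data {
		res = append(res, id)
	}
	sort.Slice(res, func(i, j int) bool { return res[i] < res[j] })
	return res
}

// fill adds the IDs 0..n-1 from n goroutines and returns the shared set.
func fill(n int) *idSet {
	s := newIDSet()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int64) {
			defer wg.Done()
			s.Add(id)
		}(int64(i))
	}
	wg.Wait()
	return s
}

func main() {
	ids := fill(100).List()
	fmt.Println(len(ids), ids[0], ids[len(ids)-1])
}
```

Assuming the Upsert callback signature shown in the question, the callback could return newIDSet(productID) when valueInMap is nil, and otherwise call Add on valueInMap.(*idSet) and return that same pointer. Readers would then call List instead of ranging over the map directly.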
I wanted to know if there is a way to lock only a single key of a map during concurrent reads and writes. I am pretty new to Go and parallelism, so sorry if the answer is obvious.
func Check(a, b []string) map[string]int {
var res = make(map[string]int)
go func() {
for _, v := range a {
res[v]++
}
}()
go func() {
for _, v := range b {
res[v]++
}
}()
return res
}
Eventually this piece of code is going to crash because of the concurrent map reads and writes, so we should add a mutex to lock the map.
var m sync.Mutex
go func() {
for _, v := range a {
m.Lock()
res[v]++
m.Unlock()
}
}()
go func() {
for _, v := range b {
m.Lock()
res[v]++
m.Unlock()
}
}()
But from my understanding m.Lock() will lock my whole map. Isn't locking everything too much overhead? This bugs me, as I think this piece of code may be no faster than running sequentially. Can I lock the map only at map["some key"], so that my second goroutine can still write to map["some other key"]?
From the Go 1.9 release notes:
Concurrent Map
The new Map type in the sync package is a concurrent map with amortized-constant-time loads, stores, and deletes. It is safe for multiple goroutines to call a Map's methods concurrently.
The Go team built one for you!
Maps themselves do not take care of locking, so any manipulation of them from multiple goroutines (or reading while they are being manipulated) requires some form of synchronization (e.g., sync.Mutex). There are fancier things you can do, though.
RW Mutex
You can get a little fancier depending on your use case and use a sync.RWMutex. This will allow concurrent reads while safely blocking for any write. For example:
package main
import (
"sync"
"time"
)
func main() {
m := map[int]int{}
lock := sync.RWMutex{}
go func() {
// Writer
for range time.Tick(250 * time.Millisecond) {
// Notice that this uses Lock and NOT RLock
lock.Lock()
m[5]++
m[6] += 2
lock.Unlock()
}
}()
go func() {
for range time.Tick(250 * time.Millisecond) {
lock.RLock()
println(m[5])
lock.RUnlock()
}
}()
for range time.Tick(250 * time.Millisecond) {
lock.RLock()
println(m[6])
lock.RUnlock()
}
}
This does not give you a key by key locking mechanism though.
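If contention on one lock really is a problem, a common middle ground (a different technique from the RWMutex above, named here plainly as lock sharding) is to split the map into several independently locked shards chosen by a hash of the key. This is only a sketch; ShardedCounter, shardCount and the FNV hash are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// shardCount is arbitrary; it is an assumption for this sketch.
const shardCount = 16

type shard struct {
	mu sync.Mutex
	m  map[string]int
}

// ShardedCounter splits keys across independently locked maps, so writes
// to keys in different shards do not block each other.
type ShardedCounter struct {
	shards [shardCount]shard
}

func NewShardedCounter() *ShardedCounter {
	c := &ShardedCounter{}
	for i := range c.shards {
		c.shards[i].m = make(map[string]int)
	}
	return c
}

// shardFor picks the shard for a key from its FNV-1a hash.
func (c *ShardedCounter) shardFor(key string) *shard {
	h := fnv.New32a()
	h.Write([]byte(key))
	return &c.shards[h.Sum32()%shardCount]
}

// Inc locks only the shard that owns key.
func (c *ShardedCounter) Inc(key string) {
	s := c.shardFor(key)
	s.mu.Lock()
	s.m[key]++
	s.mu.Unlock()
}

// Get reads the count for key under the shard lock.
func (c *ShardedCounter) Get(key string) int {
	s := c.shardFor(key)
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.m[key]
}

// Check counts the strings of a and b from two goroutines and waits for
// both to finish before returning.
func Check(a, b []string) *ShardedCounter {
	c := NewShardedCounter()
	var wg sync.WaitGroup
	for _, words := range [][]string{a, b} {
		wg.Add(1)
		go func(words []string) {
			defer wg.Done()
			for _, w := range words {
				c.Inc(w)
			}
		}(words)
	}
	wg.Wait()
	return c
}

func main() {
	c := Check([]string{"x", "y", "x"}, []string{"y", "z"})
	fmt.Println(c.Get("x"), c.Get("y"), c.Get("z"))
}
```

Two keys that hash to the same shard still share a lock, so this is per-shard rather than strictly per-key locking. Whether it beats a single mutex depends on the workload and is worth measuring.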
sync.Map
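sync.Map is provided by the standard library and is robust. Its methods are safe to call from multiple goroutines without extra locking, and for some access patterns it can reduce lock contention compared with a plain map guarded by a single mutex.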
package main
import (
"sync"
"time"
)
func main() {
m := sync.Map{}
go func() {
// Writer
for range time.Tick(250 * time.Millisecond) {
value, _ := m.LoadOrStore(5, 0)
m.Store(5, value.(int)+1)
value, _ = m.LoadOrStore(6, 0)
m.Store(6, value.(int)+2)
}
}()
go func() {
for range time.Tick(250 * time.Millisecond) {
value, _ := m.LoadOrStore(5, 0)
println(value.(int))
}
}()
for range time.Tick(250 * time.Millisecond) {
value, _ := m.LoadOrStore(6, 0)
println(value.(int))
}
}
Notice that the code doesn't have any mutexes. Also notice that you get to deal with empty interfaces...
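One caveat: the example above has a single writer. With several writers, LoadOrStore followed by Store is a separate read and write, so increments could be lost. A possible way around this, sketched below, is to store a pointer to an atomic counter once and increment through it. The Counters type is hypothetical, and atomic.Int64 assumes Go 1.19 or later.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Counters maps keys to atomic counters kept in a sync.Map.
type Counters struct{ m sync.Map }

// Add stores a counter for key on first use, then increments it atomically.
func (c *Counters) Add(key int, delta int64) int64 {
	v, _ := c.m.LoadOrStore(key, new(atomic.Int64))
	return v.(*atomic.Int64).Add(delta)
}

// Get returns the current value for key, or 0 if the key was never used.
func (c *Counters) Get(key int) int64 {
	v, ok := c.m.Load(key)
	if !ok {
		return 0
	}
	return v.(*atomic.Int64).Load()
}

// total runs several writers that all increment key 5 and returns the sum.
func total(writers, perWriter int) int64 {
	var c Counters
	var wg sync.WaitGroup
	for w := 0; w < writers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWriter; i++ {
				c.Add(5, 1)
			}
		}()
	}
	wg.Wait()
	return c.Get(5)
}

func main() {
	fmt.Println(total(8, 1000))
}
```

The type assertions are still there, but they are confined to two small methods instead of being spread over every call site.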
I have a map that is constantly read and written by 3 goroutines. The program always ends up with "fatal error: concurrent map iteration and map write" despite my setting up a mutex to protect it. I know I could use sync.Map or synchronize with a channel, but I'd really like to understand what I am doing wrong. This is the code:
//book.go
type OrderbookMap map[float64]float64
type Orderbook struct {
Bids OrderbookMap
Asks OrderbookMap
Symbol string
IsInit bool
UpdateId int
mu sync.Mutex
}
func (book *Orderbook) Init() {
book.mu.Lock()
defer book.mu.Unlock()
if book.IsInit {
return
}
book.Asks = make(OrderbookMap)
book.Bids = make(OrderbookMap)
book.IsInit = true
}
//functions with mutexes
func DelBid2(b *Orderbook, price float64) {
b.mu.Lock()
defer b.mu.Unlock()
if _, ok := b.Bids[price]; ok {
delete(b.Bids, price)
} else {
fmt.Printf("VALUE NOT FOUND %v\n", price)
}
}
func AddBid2(b *Orderbook, price float64, qty float64) {
b.mu.Lock()
defer b.mu.Unlock()
b.Bids[price] = qty
}
func GetBids2(b *Orderbook) OrderbookMap {
b.mu.Lock()
defer b.mu.Unlock()
return b.Bids
}
//TesterFile.go
func TestBookRace(t *testing.T) {
var B Orderbook
B.Init()
//add
go func() {
for {
b, q := rFloat(), rFloat()
AddBid2(&B, b, q)
fmt.Printf("ADD %v NEW: %v\n", b, GetBids2(&B))
}
}()
//del
go func() {
for {
b := rFloat()
DelBid2(&B, b)
fmt.Printf("DEL %v NEW: %v\n", b, GetBids2(&B))
}
}()
//read
go func() {
for {
fmt.Printf("READ %v\n", GetBids2(&B))
}
}()
for { time.Sleep(10 * time.Second)}
}
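A likely cause in the code above: GetBids2 holds the mutex only while it returns b.Bids, and a Go map value is a reference to the underlying data. fmt.Printf then iterates the live map after the lock has been released, while the other goroutines keep writing to it. One possible fix, sketched here with a trimmed-down Orderbook and hypothetical method names, is to return a copy made under the lock.

```go
package main

import (
	"fmt"
	"sync"
)

type OrderbookMap map[float64]float64

// Orderbook is a trimmed-down version of the struct in the question.
type Orderbook struct {
	mu   sync.Mutex
	Bids OrderbookMap
}

func NewOrderbook() *Orderbook {
	return &Orderbook{Bids: make(OrderbookMap)}
}

func (b *Orderbook) AddBid(price, qty float64) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.Bids[price] = qty
}

func (b *Orderbook) DelBid(price float64) {
	b.mu.Lock()
	defer b.mu.Unlock()
	delete(b.Bids, price)
}

// BidsSnapshot copies the bids while holding the lock. The caller owns the
// copy and can print or range over it without further locking.
func (b *Orderbook) BidsSnapshot() OrderbookMap {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := make(OrderbookMap, len(b.Bids))
	for price, qty := range b.Bids {
		out[price] = qty
	}
	return out
}

// hammer adds, deletes and prints concurrently, n times each, and returns
// how many bids are left at the end.
func hammer(n int) int {
	book := NewOrderbook()
	var wg sync.WaitGroup
	wg.Add(3)
	go func() {
		defer wg.Done()
		for i := 0; i < n; i++ {
			book.AddBid(float64(i), 1)
		}
	}()
	go func() {
		defer wg.Done()
		for i := 0; i < n; i++ {
			book.DelBid(float64(i))
		}
	}()
	go func() {
		defer wg.Done()
		for i := 0; i < n; i++ {
			_ = fmt.Sprint(book.BidsSnapshot()) // formatting iterates the copy
		}
	}()
	wg.Wait()
	return len(book.BidsSnapshot())
}

func main() {
	fmt.Println("bids left:", hammer(200))
}
```

Copying costs memory and time proportional to the size of the book, so for large books it may be better to format or aggregate inside the locked section instead.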
My code:
func getSourceUrl(url string) (string, error) {
resp, err := http.Get(url)
if err != nil {
fmt.Println("Error getSourceUrl: ")
return "", err
}
defer resp.Body.Close()
body := resp.Body
// time = 0
sourcePage, err := ioutil.ReadAll(body)
// time > 5 minutes
return string(sourcePage), err
}
I have a link to a website whose source is more than 100,000 lines long. Using ioutil.ReadAll takes a very long time (more than 5 minutes for one link). Is there a way to fetch the page source faster? Thank you!
@Minato, try this code and play with the M throttling parameter. Reduce it if you get too many errors.
package main
import (
"fmt"
"io"
"io/ioutil"
"log"
"net/http"
"runtime"
"time"
)
// Token is an empty struct for signalling
type Token struct{}
// N files to get
var N = 301 // at the source 00000 - 00300
// M max go routines
var M = runtime.NumCPU() * 16
// Throttle to max M go routines
var Throttle = make(chan Token, M)
// DoneStatus is used to signal the end of a goroutine
type DoneStatus struct {
length int
sequence string
duration float64
err error
}
// ExitOK is simple exit counter
var ExitOK = make(chan DoneStatus)
// TotalBytes read
var TotalBytes = 0
// TotalErrors captured
var TotalErrors = 0
// URLTempl is the template for URL construction
var URLTempl = "https://virusshare.com/hashes/VirusShare_%05d.md5"
func close(c io.Closer) {
err := c.Close()
if err != nil {
log.Fatal(err)
}
}
func main() {
log.Printf("start main. M=%d\n", M)
startTime := time.Now()
for i := 0; i < N; i++ {
go func(idx int) {
// slow ramp-up: fire getData after idx seconds
// (use the idx parameter, not the shared loop variable i)
time.Sleep(time.Duration(idx) * time.Second)
url := fmt.Sprintf(URLTempl, idx)
_, _ = getData(url) // errors captured as data
}(i)
}
// Count N byte count signals
for i := 0; i < N; i++ {
status := <-ExitOK
TotalBytes += status.length
if status.err != nil {
TotalErrors++
log.Printf("[%d] : %v\n", i, status.err)
continue
}
log.Printf("[%d] file %s, %.1f MByte, %.1f min, %.1f KByte/sec\n",
i, status.sequence,
float64(status.length)/(1024*1024),
status.duration/60,
float64(status.length)/(1024)/status.duration)
}
// totals
duration := time.Since(startTime).Seconds()
log.Printf("Totals: %.1f MByte, %.1f min, %.1f KByte/sec\n",
float64(TotalBytes)/(1024*1024),
duration/60,
float64(TotalBytes)/(1024)/duration)
// using fatal to verify only one go routine is running at the end
log.Fatalf("TotalErrors: %d\n", TotalErrors)
}
func getData(url string) (data []byte, err error) {
var startTime time.Time
defer func() {
// release token
<-Throttle
// signal end of go routine, with some status info
ExitOK <- DoneStatus{
len(data),
url[41:46],
time.Since(startTime).Seconds(),
err,
}
}()
// acquire one of M tokens
Throttle <- Token{}
log.Printf("Started file: %s\n", url[41:46])
startTime = time.Now()
resp, err := http.Get(url)
if err != nil {
return
}
defer close(resp.Body)
data, err = ioutil.ReadAll(resp.Body)
if err != nil {
return
}
return
}
Per-transfer throughput varies between about 10 and 40 KByte/sec, and for all 301 files in total I get 928 MB in 11.1 min, at 1425 KByte/sec. I believe you should be able to get similar results.
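Stripped of the download details, the throttling idea in the program above is a buffered channel used as a counting semaphore. The following sketch isolates that pattern; runThrottled and demoPeak are hypothetical names, the job is a short sleep standing in for a download, and atomic.Int64 assumes Go 1.19 or later.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runThrottled starts n goroutines but lets at most limit of them run job
// at the same time. It returns the highest concurrency it observed.
func runThrottled(n, limit int, job func(i int)) int64 {
	tokens := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var inFlight, peak atomic.Int64
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			tokens <- struct{}{}        // acquire one of limit tokens
			defer func() { <-tokens }() // release it when done
			cur := inFlight.Add(1)
			for { // record a new peak if cur exceeds the old one
				old := peak.Load()
				if cur <= old || peak.CompareAndSwap(old, cur) {
					break
				}
			}
			job(i)
			inFlight.Add(-1)
		}(i)
	}
	wg.Wait()
	return peak.Load()
}

// demoPeak runs 20 short jobs with a limit of 4.
func demoPeak() int64 {
	return runThrottled(20, 4, func(int) { time.Sleep(time.Millisecond) })
}

func main() {
	fmt.Println("peak concurrency:", demoPeak())
}
```

Using a sync.WaitGroup instead of counting completion messages is a design choice of this sketch; the original program reports per-file status over a channel, which is useful when the results themselves are needed.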
// outside the scope of the question but maybe useful
Also give http://www.dslreports.com/speedtest/ a try: go to settings, select a bunch of US servers for testing, and set the duration to 60 sec. This will tell you what your actual effective total rate to the US is.
Good luck!
You could read the response one section at a time, something like:
responseSection := make([]byte, 128)
// Read reports how many bytes it actually filled in
n, err := body.Read(responseSection)
return string(responseSection[:n]), err
This reads up to 128 bytes per call. However, I would suggest first confirming that the download speed is not what is causing the slow load.
The 5 minutes is probably network time.
That said, you generally would not want to buffer enormous objects in memory.
resp.Body is a Reader.
So you could use io.Copy to copy its contents into a file.
Converting sourcePage into a string is a bad idea as it forces another allocation.
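A minimal sketch of the io.Copy suggestion follows. streamToFile and demo are hypothetical names, and a generated strings.Reader stands in for resp.Body so that the example runs without network access.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// streamToFile copies src to path without holding the whole body in memory.
// In real code src would be resp.Body.
func streamToFile(src io.Reader, path string) (int64, error) {
	f, err := os.Create(path)
	if err != nil {
		return 0, err
	}
	n, err := io.Copy(f, src)
	// report a Close error only if the copy itself succeeded
	if cerr := f.Close(); err == nil {
		err = cerr
	}
	return n, err
}

// demo streams a generated 100000-line body into a temporary file and
// returns the number of bytes written.
func demo() (int64, error) {
	dir, err := os.MkdirTemp("", "stream")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	body := strings.NewReader(strings.Repeat("line\n", 100000))
	return streamToFile(body, filepath.Join(dir, "page.html"))
}

func main() {
	n, err := demo()
	fmt.Println(n, err)
}
```

This keeps memory use flat, but it does not make a slow network faster; if the 5 minutes is transfer time, streaming only avoids the extra buffering and allocation.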
I have a client-server application that uses a TCP connection.
Client:
type Q struct {
sum int64
}
type P struct {
M, N int64
}
func main() {
...
//read M and N
...
tcpAddr, err := net.ResolveTCPAddr("tcp4", service)
...
var p P
p.M = M
p.N = N
err = enc.Encode(p)
}
Server:
type Q struct {
sum int64
}
type P struct {
M, N int64
}
func main() {
...
tcpAddr, err := net.ResolveTCPAddr("tcp4", service)
listener, err := net.ListenTCP("tcp", tcpAddr)
...
var connB bytes.Buffer
dec := gob.NewDecoder(&connB)
var p P
err = dec.Decode(&p) // Decode needs a pointer to fill in
fmt.Printf("{%d, %d}\n", p.M, p.N)
}
The result on the server is {0, 0}, because I don't know how to obtain a bytes.Buffer variable from a net.Conn.
Is there any way to send gob-encoded variables over TCP?
If so, how can this be done? Or is there an alternative for sending numbers over TCP?
Any help or sample code would be really appreciated.
Here's a complete example.
Server:
package main
import (
"fmt"
"net"
"encoding/gob"
)
type P struct {
M, N int64
}
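func handleConnection(conn net.Conn) {
defer conn.Close()
dec := gob.NewDecoder(conn)
p := &P{}
// report decode failures instead of printing an empty P
if err := dec.Decode(p); err != nil {
fmt.Println("decode error:", err)
return
}
fmt.Printf("Received : %+v\n", p)
}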
func main() {
fmt.Println("start")
ln, err := net.Listen("tcp", ":8080")
if err != nil {
// without a listener there is nothing to accept from
fmt.Println("listen error:", err)
return
}
for {
conn, err := ln.Accept() // this blocks until connection or error
if err != nil {
// handle error
continue
}
go handleConnection(conn) // a goroutine handles conn so that the loop can accept other connections
}
}
Client :
package main
import (
"fmt"
"log"
"net"
"encoding/gob"
)
type P struct {
M, N int64
}
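func main() {
fmt.Println("start client")
conn, err := net.Dial("tcp", "localhost:8080")
if err != nil {
log.Fatal("Connection error: ", err)
}
defer conn.Close()
encoder := gob.NewEncoder(conn)
p := &P{1, 2}
// check the Encode error so a failed send is not reported as done
if err := encoder.Encode(p); err != nil {
log.Fatal("Encode error: ", err)
}
fmt.Println("done")
}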
Launch the server, then the client, and you will see the server display the received P value.
A few observations to make it clear:
When you listen on a socket, you should pass each accepted connection to a goroutine that will handle it.
Conn implements the Reader and Writer interfaces, which makes it easy to use: you can give it to a Decoder or Encoder.
In a real application you would probably have the P struct definition in a package imported by both programs
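To try the encode and decode path without starting two programs, the round trip can be sketched over an in-memory connection from net.Pipe, which also satisfies net.Conn. The send, receive and roundTrip helpers are hypothetical names. Note that gob only transmits exported fields, so a field like sum in the question's Q struct would need to be exported before it could be sent.

```go
package main

import (
	"encoding/gob"
	"fmt"
	"net"
)

// P matches the struct used by the client and server above.
type P struct {
	M, N int64
}

// send gob-encodes p onto conn, then closes the connection.
func send(conn net.Conn, p P) error {
	defer conn.Close()
	return gob.NewEncoder(conn).Encode(p)
}

// receive decodes one P from conn, then closes the connection.
func receive(conn net.Conn) (P, error) {
	defer conn.Close()
	var p P
	err := gob.NewDecoder(conn).Decode(&p) // Decode needs a pointer
	return p, err
}

// roundTrip sends p through an in-memory connection pair and returns what
// the receiving side decoded.
func roundTrip(p P) (P, error) {
	client, server := net.Pipe()
	errc := make(chan error, 1)
	go func() { errc <- send(client, p) }()
	got, err := receive(server)
	if err != nil {
		return P{}, err
	}
	return got, <-errc
}

func main() {
	got, err := roundTrip(P{M: 1, N: 2})
	fmt.Printf("%+v %v\n", got, err)
}
```

The same send and receive helpers would work unchanged with the connections returned by net.Dial and ln.Accept.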