Split string based on byte length in golang

The HTTP request header has a 4 KB length limit.
I want to split a string that I need to include in the header so that each part fits within this limit.
Should I convert it with []byte(str), split the bytes, and then convert each part back with string([]byte)?
Is there a simpler way to do it?

In Go, a string is really just a sequence of bytes, and indexing a string produces bytes. So you can simply slice the string into 4 KB substrings.
However, since UTF-8 characters can span multiple bytes, there is the chance that you will split in the middle of a character sequence. This isn't a problem if the split strings will always be joined together again in the same order at the other end before decoding, but if you try to decode each individually, you might end up with invalid leading or trailing byte sequences. If you want to guard against this, you could use the unicode/utf8 package to check that you are splitting on a valid leading byte, like this:
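package httputil
import "unicode/utf8"
// maxLen is the maximum size of each part, in bytes.
const maxLen = 4096
// SplitHeader splits longString into parts of at most maxLen bytes,
// cutting only at the first byte of a UTF-8 encoded rune.
func SplitHeader(longString string) []string {
	splits := []string{}
	var l, r int
	for l, r = 0, maxLen; r < len(longString); l, r = r, r+maxLen {
		// Step back until r points at the first byte of a rune.
		for !utf8.RuneStart(longString[r]) {
			r--
		}
		splits = append(splits, longString[l:r])
	}
	// Whatever remains is at most maxLen bytes and forms the last part.
	splits = append(splits, longString[l:])
	return splits
}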
Slicing the string directly is more efficient than converting to []byte and back because, since a string is immutable and a []byte isn't, the data must be copied to new memory upon conversion, taking O(n) time (both ways!), whereas slicing a string simply returns a new string header backed by the same array as the original (taking constant time).
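To see the rune-boundary behaviour on a small input, here is a minimal sketch of the same idea with the limit passed as a parameter. The names splitByBytes and roundTrips are hypothetical helpers introduced only for this illustration, and the sketch assumes the input is valid UTF-8 and the limit is at least utf8.UTFMax (4) bytes.

```go
package main

import (
	"strings"
	"unicode/utf8"
)

// splitByBytes is a hypothetical variant of SplitHeader that takes the byte
// limit as a parameter. It assumes s is valid UTF-8 and limit >= utf8.UTFMax.
func splitByBytes(s string, limit int) []string {
	var parts []string
	for len(s) > limit {
		cut := limit
		// Step back to the first byte of a rune so no character is cut in half.
		for cut > 0 && !utf8.RuneStart(s[cut]) {
			cut--
		}
		if cut == 0 {
			// Only reachable when the assumptions above are violated;
			// fall back to a hard cut so the loop still makes progress.
			cut = limit
		}
		parts = append(parts, s[:cut]) // slicing shares memory, no copy
		s = s[cut:]
	}
	return append(parts, s)
}

// roundTrips reports whether every part is valid UTF-8 on its own and
// whether joining the parts reproduces the original string.
func roundTrips(original string, parts []string) bool {
	for _, p := range parts {
		if !utf8.ValidString(p) {
			return false
		}
	}
	return strings.Join(parts, "") == original
}
```

With a limit of 4 bytes, the 10-byte string "aé€b€" is split into "aé", "€b" and "€": the first cut moves back one byte so that the three-byte "€" is not broken apart.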

Related

how to sum the digits in an integer using recursion?

Write a recursive method that computes the sum of the digits in an integer. Use the following method header:
public static int sumDigits(long n)
For example, sumDigits(234) returns 2 + 3 + 4 = 9. Write a real program that prompts the user to enter an integer and displays its sum.

Receive an integer as a parameter
Convert to string
Parse the string's individual characters
Remove a character (first or last doesn't matter)
Put the remaining characters back into a single string
Cast that string back to integer
Call "result = removedChar As Integer + function(remainingChars as Integer)" <--- this is the recursion
In the future you should at least make one attempt for others to help you edit when you post an obvious homework question ;)
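The steps above can be sketched as follows. This is written in Go rather than the Java the question asks for, and it is only an illustration: the function names sumDigitsString and sumDigitsMod are made up here, and both assume n is non-negative. The second function is not part of the answer above; it is the common arithmetic alternative that avoids the string round trip.

```go
package main

import "strconv"

// sumDigitsString follows the string-based steps: convert to text, remove
// the first character, turn the rest back into an integer, and recurse.
// Assumes n >= 0.
func sumDigitsString(n int64) int {
	s := strconv.FormatInt(n, 10)
	first := int(s[0] - '0') // the removed character as a digit
	if len(s) == 1 {
		return first // base case: a single digit is its own sum
	}
	// The remaining characters are all digits, so parsing cannot fail here.
	rest, _ := strconv.ParseInt(s[1:], 10, 64)
	return first + sumDigitsString(rest)
}

// sumDigitsMod does the same with arithmetic: n%10 is the last digit and
// n/10 is everything before it. Assumes n >= 0.
func sumDigitsMod(n int64) int {
	if n < 10 {
		return int(n)
	}
	return int(n%10) + sumDigitsMod(n/10)
}
```

Both versions return 9 for 234, matching the example in the question.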

Hash collisions for golang built-in map and string keys?

I wrote this function to generate random unique IDs for my test cases:
func uuid(t *testing.T) string {
	uidCounterLock.Lock()
	defer uidCounterLock.Unlock()
	uidCounter++
	//return "[" + t.Name() + "|" + strconv.FormatInt(uidCounter, 10) + "]"
	return "[" + t.Name() + "|" + string(uidCounter) + "]"
}
var uidCounter int64 = 1
var uidCounterLock sync.Mutex
In order to test it, I generate a bunch of values from it in different goroutines, send them to the main thread, which puts the result in a map[string]int by doing map[v] = map[v] + 1. There is no concurrent access to this map, it's private to the main thread.
var seen = make(map[string]int)
for v := range ch {
	seen[v] = seen[v] + 1
	if count := seen[v]; count > 1 {
		fmt.Printf("Generated the same uuid %d times: %#v\n", count, v)
	}
}
When I just cast the uidCounter to a string, I get a ton of collisions on a single key. When I use strconv.FormatInt, I get no collisions at all.
When I say a ton, I mean I just got 1115919 collisions for the value [TestUuidIsUnique|�] out of 2227980 generated values, i.e. 50% of the values collide on the same key. The values are not equal. I do always get the same number of collisions for the same source code, so at least it's somewhat deterministic, i.e. probably not related to race conditions.
I'm not surprised integer overflow in a rune would be an issue, but I'm nowhere near 2^31, and that wouldn't explain why the map thinks 50% of the values have the same key. Also, I wouldn't expect a hash collision to impact correctness, just performance, since I can iterate over the keys in a map, so the values are stored there somewhere.
In the output, all runes printed are 0xEFBFBD. It's the same number of bits as the highest valid unicode code point, but that doesn't really match either.
Generated the same uuid 2 times: "[TestUuidIsUnique|�]"
Generated the same uuid 3 times: "[TestUuidIsUnique|�]"
Generated the same uuid 4 times: "[TestUuidIsUnique|�]"
Generated the same uuid 5 times: "[TestUuidIsUnique|�]"
...
Generated the same uuid 2047 times: "[TestUuidIsUnique|�]"
Generated the same uuid 2048 times: "[TestUuidIsUnique|�]"
Generated the same uuid 2049 times: "[TestUuidIsUnique|�]"
...
What's going on here? Did the go authors assume that hash(a) == hash(b) implies a == b for strings? Or am I just missing something silly? go test -race isn't complaining either.
I'm on macOS 10.13.2, and go version go1.9.2 darwin/amd64.

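Converting an integer with string(n) interprets n as a Unicode code point, not as a decimal number. String conversion of an invalid rune (a surrogate in the range 0xD800 to 0xDFFF, or any value above 0x10FFFF) returns a string containing the Unicode replacement character: "�" (U+FFFD, which is 0xEFBFBD in UTF-8). All of those counter values therefore produce the same key; the map is not confusing different strings.
Use the strconv package to convert an integer to text.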
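A minimal sketch of the difference, using two made-up helper names (asRune and asDecimal). The explicit rune conversion is used here in place of the question's string(uidCounter) because newer Go toolchains flag a direct integer-to-string conversion in go vet; for values that fit in a rune the result is the same.

```go
package main

import "strconv"

// asRune treats n as a Unicode code point, which is what string(uidCounter)
// does in the question. Invalid code points all become "\uFFFD".
func asRune(n int64) string {
	return string(rune(n))
}

// asDecimal formats n as decimal text, so distinct integers always give
// distinct strings.
func asDecimal(n int64) string {
	return strconv.FormatInt(n, 10)
}
```

Here asRune(65) is "A", while asRune(0xD800) and asRune(0x110000) are both "\uFFFD" and so collide as map keys. asDecimal(0xD800) is "55296".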

openresty: convert int64 to string

I am using openresty/1.7.7.2 with Lua 5.1.4. I receive an int64 in the request, and I have its string form saved in the DB (I can't change the DB schema or the request format). I am not able to match the two.
local i = 913034578410143848 --request
local p = "913034578410143848" -- stored in DB
print(p==tostring(i)) -- return false
print(i%10) -- return 0 ..this also doesn't work
Is there a way to convert an int64 to a string, and vice versa if possible?
Update:
I am getting i from a protobuf object. The proto file describes i as int64. I am using the pb4lua library for protobuf.
ngx.req.read_body()
local body = ngx.req.get_body_data()
local request, err = Request:load(body)
local i = request.id

Lua 5.1 stores numbers as doubles by default, so it cannot represent every integer larger than 2^53 exactly.
Number literals are no exception, so you cannot simply write
local i = 913034578410143848
LuaJIT, however, can represent int64 values as boxed (cdata) values.
There are also Lua libraries for working with large numbers, e.g. the bn library.
I do not know how pb4lua handles this problem.
The lua-pb library, for example, uses LuaJIT boxed values.
It also provides a way to specify a user-defined callback for creating int64 values.
First, I suggest figuring out the real type of your i value (use the type function).
Everything else depends on that.
If it is a number, then I think pb4lua is simply losing some information.
Maybe it just returns a string, in which case you can compare the values as strings.
If it provides LuaJIT cdata, then here is a basic function to convert a string
to an int64 value.
local ffi = require("ffi")
-- Converts a decimal string to a LuaJIT uint64_t cdata value.
local function to_jit_uint64(str)
  -- The first 9 digits always fit exactly in a Lua number.
  local v = tonumber(string.sub(str, 1, 9))
  v = ffi.new('uint64_t', v)
  if #str > 9 then
    -- Shift by the number of remaining digits, then add them.
    str = string.sub(str, 10)
    v = v * (10 ^ #str) + tonumber(str)
  end
  return v
end
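The precision loss can be reproduced outside Lua. The sketch below is in Go, not Lua, and only illustrates what happens when the id from the question passes through a double; throughFloat64 and idMatches are hypothetical names chosen for this example.

```go
package main

import "strconv"

// throughFloat64 converts n to a double and back, which is roughly what
// happens to a large id that is held in a plain Lua 5.1 number.
func throughFloat64(n int64) int64 {
	return int64(float64(n))
}

// idMatches compares an int64 id with its stored decimal text by parsing the
// text as a 64-bit integer, so no double is involved.
func idMatches(id int64, stored string) bool {
	parsed, err := strconv.ParseInt(stored, 10, 64)
	return err == nil && parsed == id
}
```

For 913034578410143848 the round trip through float64 does not return the original value, because doubles near that magnitude are spaced 128 apart. Values up to 2^53 survive unchanged, and idMatches(913034578410143848, "913034578410143848") is true.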

how to convert an array of characters to qstring?

I have an array -
char name[256];
sprintf(name, "hello://cert=prv:netid=%d:tsid=%d:pid=%d\0", 1010,1200, 1300);
QString private_data_string = name;
At the last offset of this string, i.e. the '\0', I try to do the following:
while(private_data_string.at(offset) != ':' &&
      private_data_string.at(offset) != ';' &&
      private_data_string.at(offset).isNull() == false)
The application aborts. It looks like the data pointer is also zero at the string's '\0'. How can I fix this?

A QString does not contain the terminating character you expect, which is why you are hitting an out-of-bounds assertion. This is the proper approach:
while(offset<private_data_string.length() &&
      private_data_string.at(offset) != ':' &&
      private_data_string.at(offset) != ';') {
    // ...
}
It looks like you are doing something strange, and the question itself seems to be the wrong one. You are asking how to fix a strange solution to an unstated problem. Instead, explain what you are trying to do and then, as a bonus, how you tried to solve it.

You need to know several facts:
Writing \0 at the end of your string literal is not necessary. String literals are null-terminated by default. The literal "abc" actually contains 4 characters, including the terminating null character. Your string literal has 2 null characters at its end.
You have used the QString(const char *) constructor. It receives no additional information about the buffer's length, so QString reads characters from the buffer until it encounters the first null character. It doesn't matter how many null characters are actually at the end. The null character is interpreted as a buffer end marker, not as part of the string.
When you have the QString "abc", its size is 3 (it would be surprising to have another value). The null character is not part of the string. The QString::at function can be used for positions 0 <= position < size(). This is explicitly specified in the documentation. So it doesn't matter whether QString's internal buffer is null-terminated or not. Either way, you don't have access to the null terminator.
If you really want the null character to be part of your data, you should use QByteArray instead of QString. It lets you specify the buffer size on construction and can contain as many null characters as you want. However, when dealing with strings, this is usually not necessary.
You should use QString::arg instead of sprintf:
QString private_data_string =
    QString("hello://cert=prv:netid=%1:tsid=%2:pid=%3")
        .arg(netid).arg(tsid).arg(pid);
sprintf is unsafe and can overrun your fixed-size buffer if you're not careful. In C++ there's no good reason to use sprintf.
"A QString that has not been assigned to anything is null, i.e., both the length and data pointer is 0" - this has nothing to do with your situation because you have assigned a value to your string.

How to get a string from TextIO in sml/ml?

I'm trying to read text from a file in SML. Eventually, I want a list of individual words; however, I'm struggling with how to convert a TextIO.elem to a string. For example, the following code returns a TextIO.elem, but I don't know how to convert it to a string so that I can concatenate it with another string:
TextIO.input1 inStream

TextIO.elem is just a synonym for char, so you can use the str function to convert it to a string. But as I replied elsewhere, I suggest using TextIO.inputAll to get a string right away.
Here is a function that takes an instream and delivers all (remaining) words in it:
val words = String.tokens Char.isSpace o TextIO.inputAll
The type of this function is TextIO.instream -> string list.
