I want to repeatedly use insert to put a new element at the front of a vector<int> (I know that with a vector it's better to insert elements at the back; I was just playing).
int main() {
vector<int> v1 = {1,2,2,2,2};
auto itr = v1.begin();
print_vector(v1);
cout<<*itr<<endl; // ok, itr is pointing to first element
v1.insert(itr,3);
cout<<*itr<<endl; // after inserting 3 itr is still pointing to 1
print_vector(v1);
cout<<*itr<<endl; // but now itr is pointing to 3
v1.insert(itr,7);
print_vector(v1);
cout<<*itr<<endl;
return 0;
}
v[]: 1 2 2 2 2
1
1
v[]: 3 1 2 2 2 2
3
v[]: 131072 3 1 2 2 2 2
Process finished with exit code 0
So I have two main questions:
After v1.insert(itr,3), itr is still pointing to 1. After the call to print_vector(), itr is pointing to 3. Why?
Now itr is pointing to 3 (the first element of v1). I call v1.insert(itr,7), but instead of placing 7 as the first element, it places 131072. Again, why?
The print_vector function I have implemented is the following:
void print_vector(vector<int> v){
cout<<"v[]: ";
for(int i:v){
cout<<i<<" ";
}
cout<<endl;
}
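Inserting into a vector invalidates every iterator at or after the insertion point, and invalidates all of its iterators if the insertion forces a reallocation. Here itr is the insertion point itself, so it is invalidated either way, and any later use of it is undefined behavior. You can find a list of iterator invalidation conditions in the answers on Iterator invalidation rules for C++ containers.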
Anything you're experiencing after the first v1.insert() call falls under undefined behavior, as you can clearly see with the placement of 131072 (an arbitrary value).
If you refresh the iterator after every insertion call, you should get normal behavior:
int main()
{
    vector<int> v1 = { 1,2,2,2,2 };
    auto itr = v1.begin();
    print_vector(v1);
    cout << *itr << endl;
    v1.insert(itr, 3);
    itr = v1.begin(); // Iterator refreshed
    cout << *itr << endl;
    print_vector(v1);
    cout << *itr << endl;
    v1.insert(itr, 7);
    itr = v1.begin(); // Iterator refreshed
    print_vector(v1);
    cout << *itr << endl;
    return 0;
}
And the output:
v[]: 1 2 2 2 2
1
3
v[]: 3 1 2 2 2 2
3
v[]: 7 3 1 2 2 2 2
7
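As an alternative to calling begin() again, insert() itself returns a valid iterator to the newly inserted element, so the iterator can be refreshed from the return value. Here is a minimal sketch of that idea; the helper name insertAllAtFront is made up for illustration and is not part of the question's code:

```cpp
#include <vector>

// Hypothetical helper: inserts each value in front of the current first
// element. The iterator is reassigned from insert()'s return value, so an
// invalidated iterator is never used.
std::vector<int>::iterator insertAllAtFront(std::vector<int>& v,
                                            const std::vector<int>& values) {
    auto itr = v.begin();
    for (int x : values) {
        // insert() returns an iterator to the element it just inserted
        itr = v.insert(itr, x);
    }
    return itr;
}
```

Called on {1,2,2,2,2} with the values 3 and 7, this leaves the vector as 7 3 1 2 2 2 2, and the returned iterator points at the 7.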
I'm new to CUDA.
I want to copy and sum values in device_vector in the following ways. Are there more efficient ways (or functions provided by thrust) to implement these?
thrust::device_vector<int> device_vectorA(5);
thrust::device_vector<int> device_vectorB(20);
Copy device_vectorA 4 times into device_vectorB in the following way:
for (size_t i = 0; i < 4; i++)
{
offset_sta = i * 5;
thrust::copy(device_vectorA.begin(), device_vectorA.end(), device_vectorB.begin() + offset_sta);
}
Sum every 5 values in device_vectorB and store the results in new device_vector (size 4):
// Example
device_vectorB = 1 2 3 4 5 | 1 2 3 4 5 | 1 2 3 4 5 | 1 2 3 4 5
device_vectorC = 15 15 15 15
thrust::device_vector<int> device_vectorC(4);
for (size_t i = 0; i < 4; i++)
{
offset_sta = i * 5;
offset_end = (i + 1) * 5; // one past the segment's last element, because the end iterator is exclusive
device_vectorC[i] = thrust::reduce(device_vectorB.begin() + offset_sta, device_vectorB.begin() + offset_end, 0);
}
Are there more efficient ways (or functions provided by thrust) to implement these?
P.S. 1 and 2 are separate instances. For simplicity, these two instances just use the same vectors to illustrate.
Step 1 can be done with a single thrust::copy operation using a permutation iterator that uses a transform iterator working on a counting iterator to generate the copy indices "on the fly".
Step 2 is a partitioned reduction, using thrust::reduce_by_key. We can again use a transform iterator working on a counting iterator to create the flags array "on the fly".
Here is an example:
$ cat t2124.cu
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/copy.h>
#include <thrust/reduce.h>
#include <thrust/sequence.h>
#include <thrust/iterator/permutation_iterator.h>
#include <thrust/iterator/transform_iterator.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/discard_iterator.h>
#include <iostream>
#include <iterator> // std::ostream_iterator
using namespace thrust::placeholders;
const int As = 5;
const int Cs = 4;
const int Bs = As*Cs;
int main(){
thrust::device_vector<int> A(As);
thrust::device_vector<int> B(Bs);
thrust::device_vector<int> C(Cs);
thrust::sequence(A.begin(), A.end(), 1); // fill A with 1,2,3,4,5
thrust::copy_n(thrust::make_permutation_iterator(A.begin(), thrust::make_transform_iterator(thrust::counting_iterator<int>(0), _1%A.size())), B.size(), B.begin()); // step 1
auto my_flags_iterator = thrust::make_transform_iterator(thrust::counting_iterator<int>(0), _1/A.size());
thrust::reduce_by_key(my_flags_iterator, my_flags_iterator+B.size(), B.begin(), thrust::make_discard_iterator(), C.begin()); // step 2
thrust::host_vector<int> Ch = C;
thrust::copy_n(Ch.begin(), Ch.size(), std::ostream_iterator<int>(std::cout, ","));
std::cout << std::endl;
}
$ nvcc -o t2124 t2124.cu
$ compute-sanitizer ./t2124
========= COMPUTE-SANITIZER
15,15,15,15,
========= ERROR SUMMARY: 0 errors
$
If we wanted to, even the device vector A could be dispensed with; that could be created "on the fly" using a counting iterator. But presumably your inputs are not actually 1,2,3,4,5.
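To see what the two transform iterators compute, here is the same index arithmetic written as plain host-side C++ loops. This is only an illustration of the mapping (read index i % As for step 1, key i / As for step 2), not Thrust code; the function names tile and segmentSums are made up, and it assumes a non-empty input and a segment length greater than zero:

```cpp
#include <cstddef>
#include <vector>

// Step 1 on the host: output element i reads input element i % a.size(),
// which is what the permutation iterator does through the transform iterator.
std::vector<int> tile(const std::vector<int>& a, std::size_t copies) {
    std::vector<int> b(a.size() * copies);
    for (std::size_t i = 0; i < b.size(); ++i) {
        b[i] = a[i % a.size()];
    }
    return b;
}

// Step 2 on the host: element i has key i / segment, and all elements that
// share a key are summed, which is what reduce_by_key does with the keys.
std::vector<int> segmentSums(const std::vector<int>& b, std::size_t segment) {
    std::vector<int> c(b.size() / segment, 0);
    for (std::size_t i = 0; i < c.size() * segment; ++i) {
        c[i / segment] += b[i];
    }
    return c;
}
```

With A = 1,2,3,4,5 and 4 copies, tile gives the 20-element B from the question and segmentSums(B, 5) gives 15 15 15 15.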
Given an undirected graph with costs on its edges, find the shortest path from a given node A to a node B, with one twist. We start at time t = 0, and every node comes with a list of timestamps at which you cannot leave it: if you are in the node at one of those times, you can do nothing but wait until it has passed. In the statement's terms, you are a prisoner who teleports between cells, each teleportation takes as long as the cost of the edge, and the blocked timestamps are the moments when a guardian is in the cell with you. Find the minimum time needed to escape the prison.
What I tried:
I tried to modify Dijkstra's algorithm like this: whenever the normal algorithm finds a new minimum time for a node, I check whether a guardian is in that node at that time. It didn't work. Any other ideas?
int checkGuardian(int min, int ind, List *guardians)
{
for (List iter = guardians[ind]; iter; iter = iter->next)
if(min == iter->value.node)
return min + iter->value.node;
return 0;
}
void dijkstra(Graph G, int start, int end, List *guardians)
{
Multiset H = initMultiset();
int *parent = (int *)malloc(G->V * sizeof(int));
for (int i = 0; i < G->V; ++i)
{
G->distance[i] = INF;
parent[i] = -1;
}
G->distance[start] = 0;
H = insert(H, make_pair(start, 0));
while(!isEmptyMultiset(H))
{
Pair first = extractMin(H);
for (List iter = G->adjList[first.node]; iter; iter = iter->next)
if(G->distance[iter->value.node] > G->distance[first.node] + iter->value.cost
+ checkGuardian(G->distance[first.node] + iter->value.cost, iter->value.node, guardians))
{
G->distance[iter->value.node] = G->distance[first.node] + iter->value.cost
+ checkGuardian(G->distance[first.node] + iter->value.cost, iter->value.node, guardians);
H = insert(H, make_pair(iter->value.node, G->distance[iter->value.node]));
parent[iter->value.node] = first.node;
}
}
printf("%d\n", G->distance[end]);
printPath(parent, end);
printf("%d\n", end);
}
with these structures:
typedef struct graph
{
int V;
int *distance;
List *adjList;
} *Graph;
typedef struct list
{
int size;
Pair value;
struct list *tail;
struct list *next;
struct list *prev;
} *List;
typedef struct multiset
{
Pair vector[MAX];
int size;
int capacity;
} *Multiset;
typedef struct pair
{
int node, cost;
} Pair;
The input starts with the number of nodes, the number of edges and the start node. Each of the following lines, one per edge, gives the two endpoints of an edge and its cost. Then there is one line per node: the character "N" if you can't escape from that cell or "Y" if you can, followed by the number of timestamps at which a guardian is in that cell, and then the timestamps themselves.
For this input:
6 7 1
1 2 5
1 4 3
2 4 1
2 3 8
2 6 4
3 6 2
1 5 10
N 0
N 4 2 3 4 7
Y 0
N 3 3 6 7
N 3 10 11 12
N 3 7 8 9
I would expect this output:
12
1 4 2 6 3
But I get this output:
10
1 4 2 6 3
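One possible reading of the waiting rule is: on arriving at a node at time t, keep adding 1 to t for as long as t is one of that node's guardian timestamps, and only then relax the outgoing edges. The sketch below shows that variant in C++ with a standard priority queue instead of the question's own Multiset and List types. The names Edge, waitForGuardians and earliestTimes are made up, and it is an assumption that waiting applies at every node, including the start and the exit:

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <set>
#include <utility>
#include <vector>

struct Edge { int to; int cost; };

// Earliest time >= t at which no guardian is in the cell.
int waitForGuardians(int t, const std::set<int>& busy) {
    while (busy.count(t)) ++t;
    return t;
}

// Earliest time at which you are free to act in each node (INF if unreachable).
std::vector<int> earliestTimes(const std::vector<std::vector<Edge>>& adj,
                               const std::vector<std::set<int>>& guardians,
                               int start) {
    const int INF = std::numeric_limits<int>::max();
    std::vector<int> dist(adj.size(), INF);
    typedef std::pair<int, int> Item; // (time, node)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    dist[start] = waitForGuardians(0, guardians[start]);
    pq.push(Item(dist[start], start));
    while (!pq.empty()) {
        Item top = pq.top();
        pq.pop();
        int t = top.first, u = top.second;
        if (t > dist[u]) continue; // stale queue entry
        for (const Edge& e : adj[u]) {
            // arrive after the edge cost, then sit out any guardian visit
            int arrive = waitForGuardians(t + e.cost, guardians[e.to]);
            if (arrive < dist[e.to]) {
                dist[e.to] = arrive;
                pq.push(Item(arrive, e.to));
            }
        }
    }
    return dist;
}
```

Under that reading, the sample input above gives 12 for node 3, which matches the expected output. Dijkstra's algorithm stays valid here because waiting never lets a later arrival leave earlier than an earlier one.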
Triangular numbers count the things that can be arranged in a triangular shape.
For example, 1, 3, 6, 10, 15... are triangular numbers.
o
o o
o o o
o o o o
is the shape of the n=4 triangular number (10).
A natural number N is given, and I have to print every way of expressing N as a sum of triangular numbers.
if N = 4
output should be
1 1 1 1
1 3
3 1
else if N = 6
output should be
1 1 1 1 1 1
1 1 1 3
1 1 3 1
1 3 1 1
3 1 1 1
3 3
6
I have searched for a few hours and couldn't find an answer...
Please help.
(I am not sure whether this helps, but I found that
if T(k) is the number of such sums for N = k, then
T(k) = T(k-1) + T(k-3) + T(k-6) + .... + T(k-p) while (k-p) >= 0
and p is a triangular number, with T(0) = 1)
Here's Code for k=-1(Read comments below)
#include <iostream>
#include <vector>
using namespace std;
long TriangleNumber(int index);
void PrintTriangles(int index);
vector<long> triangleNumList(450); //(450 power raised by 2 is about 200,000)
vector<long> storage(100001);
int main() {
int n, p;
for (int i = 0; i < 450; i++) {
triangleNumList[i] = i * (i + 1) / 2;
}
cin >> n >> p;
cout << TriangleNumber(n);
if (p == 1) {
//PrintTriangles();
}
return 0;
}
long TriangleNumber(int index) {
int iter = 1, out = 0;
if (index == 1 || index == 0) {
return 1;
}
else {
if (storage[index] != 0) {
return storage[index];
}
else {
while (triangleNumList[iter] <= index) {
storage[index] = ( storage[index] + TriangleNumber(index - triangleNumList[iter]) ) % 1000000;
iter++;
}
}
}
return storage[index];
}
void PrintTriangles(int index) {
// What Algorithm?
}
Here is some recursive Python 3.6 code that prints the sums of triangular numbers that total the inputted target. I prioritized simplicity of code in this version. You may want to add error-checking on the input value, counting the sums, storing the lists rather than just printing them, and wrapping the entire routine into a function. Setting up the list of triangular numbers could also be done in fewer lines of code.
Your code saved time but worsened memory usage by "memoizing" the triangular numbers (storing and reusing them rather than always calculating them when needed). You could do the same to the sum lists, if you like. It is also possible to make this more in the dynamic programming style: find the sum lists for n=1 then for n=2 etc. I'll leave all that to you.
""" Given a positive integer n, print all the ways n can be expressed as
the sum of triangular numbers.
"""
def print_sums_of_triangular_numbers(prefix, target):
    """Print sums totalling to target, each after printing the prefix."""
    if target == 0:
        print(*prefix)
        return
    for tri in triangle_num_list:
        if tri > target:
            return
        print_sums_of_triangular_numbers(prefix + [tri], target - tri)
n = int(input('Value of n ? '))
# Set up list of triangular numbers not greater than n
triangle_num_list = []
index = 1
tri_sum = 1
while tri_sum <= n:
    triangle_num_list.append(tri_sum)
    index += 1
    tri_sum += index
# Print the sums totalling to n
print_sums_of_triangular_numbers([], n)
Here are the printouts of two runs of this code:
Value of n ? 4
1 1 1 1
1 3
3 1
Value of n ? 6
1 1 1 1 1 1
1 1 1 3
1 1 3 1
1 3 1 1
3 1 1 1
3 3
6
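Since the question's code is C++, the same recursion could fill in the empty PrintTriangles roughly as follows. This is a sketch rather than a drop-in replacement: the names collectSums and triangularSums are made up, the sums are collected into a vector instead of being printed, and n is assumed to be a positive integer small enough that listing every sum is practical:

```cpp
#include <vector>

// Appends to out every ordered list of triangular numbers that adds up to
// target, each one starting with the numbers already chosen in prefix.
void collectSums(int target, const std::vector<int>& tris,
                 std::vector<int>& prefix,
                 std::vector<std::vector<int>>& out) {
    if (target == 0) {
        out.push_back(prefix);
        return;
    }
    for (int tri : tris) {
        if (tri > target) break; // tris is ascending, so nothing later fits
        prefix.push_back(tri);
        collectSums(target - tri, tris, prefix, out);
        prefix.pop_back();
    }
}

// All ordered sums of triangular numbers that total n.
std::vector<std::vector<int>> triangularSums(int n) {
    std::vector<int> tris; // triangular numbers not greater than n
    for (int k = 1, t = 1; t <= n; ++k, t += k) tris.push_back(t);
    std::vector<std::vector<int>> out;
    std::vector<int> prefix;
    collectSums(n, tris, prefix, out);
    return out;
}
```

triangularSums(4) returns the three lists 1 1 1 1, 1 3 and 3 1, in the same order as the Python printout, and triangularSums(6) returns seven lists.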
I have written a program that prints as many Fibonacci numbers as the user asks for. I wrote it recursively and expected it to give the right output. It does print the right numbers, but with wrong values appended. This happens when the user asks for 4 or more Fibonacci numbers. Also, in the recursive function I decrement count before passing it to the recursive call; if I instead decrease it in the call's arguments, the while loop runs endlessly. With the decrement before the call the loop does finish, and for a limit of 5 the output is
Enter the limit number....
5
Fibonacci numbers are: 0 1 1 2 3 3 2 3 3
Finished.........
Can anyone tell me the fault in my program or the exact reason behind this output. Thanks in advance for it.
Program is as follows:
import java.util.Scanner;
public class FibonacciNumbers
{
    public static void main(String[] args)
    {
        int i=0, j=1;
        Scanner sc = new Scanner(System.in);
        System.out.println("Enter the limit number....");
        int num = sc.nextInt();
        System.out.print("Fibonacci numbers are: " + i + " " + j + " " );
        fibonacci(num-2, i, j);
        System.out.println("\nFinished.........");
    }
    public static void fibonacci(int count, int i, int j)
    {
        int sum = 0;
        while(count > 0)
        {
            sum = i+j;
            i=j;
            j=sum;
            System.out.print(sum + " ");
            --count;
            fibonacci(count, i, j);
        }
    }
}
You don't need both the while loop AND the recursive function calls. You have to choose between using a loop OR recursive calls.
The recursive solution:
public static void fibonacci(int count, int i, int j) {
    if (count>0){
        int sum = i+j;
        i=j;
        j=sum;
        System.out.print(sum + " ");
        --count;
        fibonacci(count, i, j);
    }
}
The solution involving a loop:
public static void fibonacci(int count, int i, int j) {
    int sum = 0;
    while(count > 0) {
        sum = i+j;
        i=j;
        j=sum;
        System.out.print(sum + " ");
        --count;
    }
}
The problem with your code
If you look closely at the following output of your code, you can see that the output begins with the first 7 Fibonacci numbers, and after that comes an unneeded series of the same Fibonacci numbers. You printed two numbers from main, and then you expected 5 more numbers but got 31:
Enter the limit number.... 7
Fibonacci numbers are: 0 1 1 2 3 5 8 8 5 8 8 3 5 8 8 5 8 8 2 3 5 8 8 5
8 8 3 5 8 8 5 8 8
This happens because when you first call the fibonacci function with count=5, the while loop has 5 iterations, so it prints 5 Fibonacci numbers, and the fibonacci function is called 5 times from there with these count parameters: 4,3,2,1,0. When the fibonacci function is called with the parameter count=4, it prints 4 numbers and calls fibonacci 4 times with these parameters: 3,2,1,0, because the while loop then has 4 iterations. Drawn as a tree of recursive calls (omitting the f(0) calls because they don't print anything), f(5) branches into f(4), f(3), f(2) and f(1), and each of those branches again in the same way.
If you add it all up, you can see that the program prints 31 Fibonacci numbers altogether, which is far too many because you wanted to print only 5! This trouble is caused by using while and recursive calls at the same time. You want either a single chain of recursive calls with no while loop (f(5) calls f(4), which calls f(3), and so on), or one while loop and no recursion.
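To check the count without running the Java program, the loop-plus-recursion structure can be mirrored in a few lines of C++ that record each value instead of printing it. The function name buggyFibonacci and the printed vector are made up for this sketch; the body follows the question's method line by line:

```cpp
#include <vector>

// Mirrors the question's method: a while loop that also recurses on every
// iteration. Every value the Java code would print is appended to printed.
void buggyFibonacci(int count, int i, int j, std::vector<int>& printed) {
    while (count > 0) {
        int sum = i + j;
        i = j;
        j = sum;
        printed.push_back(sum);
        --count;
        buggyFibonacci(count, i, j, printed); // the extra work starts here
    }
}
```

Starting from count=3 with i=0 and j=1, it records 1 2 3 3 2 3 3, which is exactly the tail of the output for a limit of 5, and starting from count=5 it records 31 values. In general a call with count=n records n values of its own plus everything recorded by its calls with n-1, ..., 0, which works out to 2^n - 1 values.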
I'm calling the kernel below with GlobalWorkSize 64 4 1 and WorkGroupSize 1 4 1 with the argument output initialized to zeros.
__kernel void kernelB(__global unsigned int * output)
{
uint gid0 = get_global_id(0);
uint gid1 = get_global_id(1);
output[gid0] += gid1;
}
I'm expecting 6 6 6 6 ... as the sum of the gid1's (0 + 1 + 2 + 3). Instead I get 3 3 3 3 ... Is there a way to get this functionality? In general I need the sum of the results of each work-item in a work group.
EDIT: It seems it must be said, I'd like to solve this problem without atomics.
You need to use local memory to store the output from all work items. After the work items have finished their computation, you sum the results with an accumulation step.
__kernel void kernelB(__global unsigned int * output)
{
    //assumes a 1-D work group of 4 items; for the question's 1 4 1 work group, use get_local_id(1) here
    uint item_id = get_local_id(0);
    uint group_id = get_group_id(0);
    //memory size is hard-coded to the expected work group size for this example
    local unsigned int result[4];
    //the computation
    result[item_id] = item_id % 3;
    //wait for all items to write to result
    barrier(CLK_LOCAL_MEM_FENCE);
    //simple O(n) reduction using the first work item in the group
    if(item_id == 0){
        for(int i=1;i<4;i++){
            result[0] += result[i];
        }
        output[group_id] = result[0];
    }
}
Multiple work items are accessing the same elements of global memory simultaneously, so the result is undefined. You need to use atomic operations or have each work item write to its own unique location.