I am trying to have a function fire every x milliseconds without blocking the main loop. I saw some example code that does this; see the code below:
// Interval is how long we wait
// add const if this should never change
int interval=1000;
// Tracks the time since last event fired
unsigned long previousMillis=0;
void loop() {
  // Get snapshot of time
  unsigned long currentMillis = millis();
  Serial.print("TIMING: ");
  Serial.print(currentMillis);
  Serial.print(" - ");
  Serial.print(previousMillis);
  Serial.print(" (");
  Serial.print((unsigned long)(currentMillis - previousMillis));
  Serial.print(") >= ");
  Serial.println(interval);
  if ((unsigned long)(currentMillis - previousMillis) >= interval) {
    previousMillis = currentMillis;
  }
}
Now the following happens:
TIMING: 3076 - 2067 (1009) >= 1000
TIMING: 4080 - 3076 (1004) >= 1000
TIMING: 5084 - 4080 (1004) >= 1000
TIMING: 6087 - 5084 (1003) >= 1000
TIMING: 7091 - 6087 (1004) >= 1000
Why is currentMillis getting so much higher on every loop? It looks like it is sharing a pointer or something like that, because it adds the interval value every time. I am confused!
I think the code you showed us is an incomplete picture of what you have uploaded to the Arduino, since on my device I obtain the following sequence:
TIMING: 0 - 0 (0) >= 1000
TIMING: 0 - 0 (0) >= 1000
TIMING: 1 - 0 (1) >= 1000
TIMING: 32 - 0 (32) >= 1000
TIMING: 93 - 0 (93) >= 1000
TIMING: 153 - 0 (153) >= 1000
TIMING: 218 - 0 (218) >= 1000
TIMING: 283 - 0 (283) >= 1000
TIMING: 348 - 0 (348) >= 1000
TIMING: 412 - 0 (412) >= 1000
TIMING: 477 - 0 (477) >= 1000
TIMING: 541 - 0 (541) >= 1000
TIMING: 606 - 0 (606) >= 1000
TIMING: 670 - 0 (670) >= 1000
TIMING: 735 - 0 (735) >= 1000
TIMING: 799 - 0 (799) >= 1000
TIMING: 865 - 0 (865) >= 1000
TIMING: 929 - 0 (929) >= 1000
TIMING: 994 - 0 (994) >= 1000
TIMING: 1058 - 0 (1058) >= 1000
TIMING: 1127 - 1058 (69) >= 1000
TIMING: 1198 - 1058 (140) >= 1000
TIMING: 1271 - 1058 (213) >= 1000
TIMING: 1344 - 1058 (286) >= 1000
That output looks correct given the code you provided.
Are you sure there isn't any sleep() call in your original source code?
(perhaps you didn't upload the updated code on the device?)
To expand on @patrick-trentin's answer, it looks very likely that your code is not the only thing running on your Arduino. The code in your sketch is never the only code the Arduino runs: the core handles incoming Serial data for you, and if you use other modules (like SPI or networking) they bring along more code that runs in ISRs (interrupt service routines), which are functions triggered by hardware events such as a timer.
But the Arduino CPU cannot run code in parallel. To mimic parallel behaviour, it actually stops your main loop to run a subroutine (the ISR), which, for example, reads a byte coming in through Serial and buffers it so that it is available to you through a nice method on the Serial object.
So the more work is done in those interrupt-based subroutines, the less often your main loop iterates, and the less often it passes over your millis() comparison.
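As a rough illustration of why the time between passes matters, here is a small desktop C++ sketch (not Arduino code) that replays the same comparison against a simulated clock. IntervalTimer, countFirings and the loopCost parameter are names made up for this example; loopCost stands for however long one pass through loop() takes, whatever the cause. When a pass takes just over a second, the check fires on every pass with a difference slightly above the interval, which is the pattern in the question's output.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the sketch's timing state (32-bit, like millis()).
struct IntervalTimer {
    std::uint32_t interval;  // how long we wait, in ms
    std::uint32_t previous;  // time of the last firing, in ms

    // Returns true (and records the time) once at least `interval` ms have
    // passed. The unsigned subtraction stays correct across a rollover.
    bool due(std::uint32_t now) {
        if (static_cast<std::uint32_t>(now - previous) >= interval) {
            previous = now;
            return true;
        }
        return false;
    }
};

// Counts firings over `totalMs` when every pass through the loop costs
// `loopCost` ms of simulated time.
int countFirings(std::uint32_t loopCost, std::uint32_t totalMs,
                 std::uint32_t interval) {
    IntervalTimer timer{interval, 0};
    int fired = 0;
    for (std::uint32_t now = 0; now <= totalMs; now += loopCost) {
        if (timer.due(now)) {
            ++fired;
        }
    }
    return fired;
}
```

Under these assumptions, countFirings(1004, 5020, 1000) returns 5 (every pass after the first fires), while countFirings(64, 5020, 1000) returns 4 (many quiet passes between firings).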
I'm working on an ESP32 using Arduino. For some reason the values are printed differently from what I expect; what is the cause?
auto reset_time = 24L * 60 * 60 * 1000 * 1000; //86400000000
Serial.print("Reset Timer in: ");
Serial.println(reset_time);
Serial.print((reset_time / 1000));
Serial.println(" ms");
Serial.print((reset_time / 1000 / 1000));
Serial.println(" s");
Serial.print((reset_time / 1000 / 1000 / 60));
Serial.println(" m");
Serial.print((reset_time / 1000 / 1000 / 60 / 60));
Serial.println(" h");
This produces the following output:
21:05:58.310 -> Reset Timer in: 500654080
21:05:58.310 -> 500654 ms
21:05:58.310 -> 500 s
21:05:58.310 -> 8 m
21:05:58.310 -> 0 h
86400000000 mod 2^32 is 500654080.
The value is larger than fits in a 32-bit int, so what you see is the remainder.
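A quick way to check that arithmetic, and one possible fix, sketched in plain C++ with fixed-width types (low32 and kResetTimeUs are names invented here). It assumes, as is the case on the ESP32, that long is 32 bits wide, so the product in the question is computed in 32 bits; strictly speaking signed overflow is undefined behaviour, and the wrap-around is simply what was observed. Making the first operand 64-bit (24LL or 24ULL) makes the whole product 64-bit.

```cpp
#include <cassert>
#include <cstdint>

// Keeps only the low 32 bits of a value, i.e. computes value mod 2^32.
constexpr std::uint32_t low32(std::uint64_t value) {
    return static_cast<std::uint32_t>(value);
}

// Microseconds per day. The ULL suffix on the first operand promotes every
// following multiplication to 64 bits, so nothing is truncated.
constexpr std::uint64_t kResetTimeUs = 24ULL * 60 * 60 * 1000 * 1000;

static_assert(kResetTimeUs == 86400000000ULL, "full value needs 64 bits");
static_assert(low32(kResetTimeUs) == 500654080u, "what a 32-bit type keeps");
```

Dividing the truncated value step by step gives 500654, 500, 8 and 0, matching the printed ms, s, m and h lines; dividing the 64-bit value the same way ends at 24 hours.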
If I read a C17 draft correctly, a constant expression that cannot be represented in its type is a constraint violation. It requires a diagnostic message from the compiler:
6.6 Constant expressions
Constraints
[...]
4 Each constant expression shall evaluate to a constant that is in the range of representable values for its type.
5.1.1.3 Diagnostics
1 A conforming implementation shall produce at least one diagnostic message (identified in an implementation-defined manner) if a preprocessing translation unit or translation unit contains a violation of any syntax rule or constraint, [...]
I am attempting to write a program that executes Monte Carlo simulations using OpenCL, and I have run into an issue involving exponentials. When the value of the variable steps becomes large (approximately 20000), the calculation of the exponent fails unexpectedly and the program quits with "Abort Trap: 6". This seems a bizarre error, given that steps should not affect memory allocation. I have tried setting normal, alpha, and beta to 0, but this does not resolve the problem; however, commenting out the exponent and replacing it with the constant 1 seems to fix it. I have run my code on an AWS GPU instance, where it does not run into any issues. Does anybody have any ideas as to why this might be a problem on an integrated graphics card?
SOLUTION
Execute the kernel multiple times over smaller ranges to keep the kernel execution time under 5 seconds.
Code Snippet
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif
static uint MWC64X(uint2 *state) {
    enum { A = 4294883355U };
    uint x = (*state).x, c = (*state).y;
    uint res = x ^ c;
    uint hi = mul_hi(x, A);
    x = x * A + c;
    c = hi + (x < c);
    *state = (uint2)(x, c);
    return res;
}
__kernel void discreteMonteCarloKernel(...) {
float cumulativeWalk = stockPrice;
float currentValue = stockPrice;
...
uint n = get_global_id(0);
uint2 seed2 = (uint2)(n, seed);
uint random1 = MWC64X(&seed2);
uint2 seed3 = (uint2)(random1, seed);
uint random2 = MWC64X(&seed3);
float alpha = (interestRate - 0.5 * sigma * sigma) * dt;
float beta = sigma * sqrt(dt);
float u1;
float u2;
float a;
float b;
float normal;
for (int j = 0; j < steps; j++) {
random1 = MWC64X(&seed2);
if (random1 == 0) {
random1 = MWC64X(&seed2);
}
random2 = MWC64X(&seed3);
u1 = (float)random1 / (float)0xffffffff;
u2 = (float)random2 / (float)0xffffffff;
a = sqrt(-2 * log(u1));
b = 2 * M_PI * u2;
normal = a * sin(b);
exponent = exp(alpha + beta * normal);
currentValue = currentValue * exponent;
cumulativeWalk += currentValue;
...
}
Problem Report
Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Exception Note: EXC_CORPSE_NOTIFY
Application Specific Information:
abort() called
Application Specific Signatures:
Graphics hardware encountered an error and was reset: 0x00000813
Thread 0 Crashed:: Dispatch queue: opencl_runtime
0 libsystem_kernel.dylib 0x00007fffb14bad42 __pthread_kill + 10
1 libsystem_pthread.dylib 0x00007fffb15a85bf pthread_kill + 90
2 libsystem_c.dylib 0x00007fffb1420420 abort + 129
3 libGPUSupportMercury.dylib 0x00007fffa98e6fbf gpusGenerateCrashLog + 158
4 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x000000010915f13b gpusKillClientExt + 9
5 libGPUSupportMercury.dylib 0x00007fffa98e7983 gpusQueueSubmitDataBuffers + 168
6 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x00000001091aa031 IntelCLCommandBuffer::getNew(GLDQueueRec*) + 31
7 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x00000001091a9f99 intelSubmitCLCommands(GLDQueueRec*, unsigned int) + 65
8 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x00000001091b00a1 CHAL_INTEL::ChalContext::ChalFlush() + 83
9 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x00000001091aa2c3 gldFinishQueue + 43
10 com.apple.opencl 0x00007fff9ffeeb37 0x7fff9ffed000 + 6967
11 com.apple.opencl 0x00007fff9ffef000 0x7fff9ffed000 + 8192
12 com.apple.opencl 0x00007fffa000ccca 0x7fff9ffed000 + 130250
13 com.apple.opencl 0x00007fffa001029d 0x7fff9ffed000 + 144029
14 libdispatch.dylib 0x00007fffb13568fc _dispatch_client_callout + 8
15 libdispatch.dylib 0x00007fffb1357536 _dispatch_barrier_sync_f_invoke + 83
16 com.apple.opencl 0x00007fffa001011d 0x7fff9ffed000 + 143645
17 com.apple.opencl 0x00007fffa000bda6 0x7fff9ffed000 + 126374
18 com.apple.opencl 0x00007fffa00011df clEnqueueReadBuffer + 813
19 simplisticComparison 0x0000000107b953cf BinomialMultiplication::execute(int) + 1791
20 simplisticComparison 0x0000000107b9ec7f main + 767
21 libdyld.dylib 0x00007fffb138c235 start + 1
Thread 1:
0 libsystem_pthread.dylib 0x00007fffb15a50e4 start_wqthread + 0
1 ??? 0x000070000eed6b30 0 + 123145552751408
Thread 2:
0 libsystem_pthread.dylib 0x00007fffb15a50e4 start_wqthread + 0
Thread 3:
0 libsystem_pthread.dylib 0x00007fffb15a50e4 start_wqthread + 0
1 ??? 0x007865646e496d65 0 + 33888479226719589
Thread 0 crashed with X86 Thread State (64-bit):
rax: 0x0000000000000000 rbx: 0x0000000000000006 rcx: 0x00007fff58074078 rdx: 0x0000000000000000
rdi: 0x0000000000000307 rsi: 0x0000000000000006 rbp: 0x00007fff580740a0 rsp: 0x00007fff58074078
r8: 0x0000000000000000 r9: 0x00007fffb140ba50 r10: 0x0000000008000000 r11: 0x0000000000000206
r12: 0x00007f92de80a7e0 r13: 0x00007f92e0008c00 r14: 0x00007fffba29e3c0 r15: 0x00007f92de801a00
rip: 0x00007fffb14bad42 rfl: 0x0000000000000206 cr2: 0x00007fffba280128
Logical CPU: 0
Error Code: 0x02000148
Trap Number: 133
I have a guess. The driver can crash in two ways:
1. We reference a bad buffer address. This is probably not your case.
2. We time out (exceed the TDR, the timeout detection and recovery limit). A kernel has a few seconds to complete.
My money is on #2. If the larger value (steps) makes the GPU run too long, the system will kill things.
I am not familiar with the guts of Apple's Intel driver, but typically there is a way to disable the TDR in extreme cases. E.g. see the Windows documentation on TDRs to get the gist. (Linux drivers have a way to disable this too.)
Normally we want to avoid running things that take super long, and it might be a good idea to decompose the workload in some way so that you naturally don't hit this kill switch. E.g. perhaps split the "steps" into smaller chunks (pass in and save your state for the parts you can't recompute).
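The chunking idea can be sketched on the host side in plain C++ (not OpenCL). Everything here is hypothetical: WalkState lists the values that would have to be passed in and saved between runs, a simple LCG stands in for MWC64X, and a fixed up or down move of 1% stands in for exp(alpha + beta * normal). The point is only that running the steps in several short pieces, with the state carried over, ends in the same place as one long run.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// State that has to survive between chunks (hypothetical; it mirrors the
// kernel's per-work-item locals).
struct WalkState {
    std::uint32_t rng;      // random generator state
    double currentValue;    // current price along the walk
    double cumulativeWalk;  // running sum of prices
};

// Advances the walk by `count` steps. On a GPU this would be one kernel
// launch that loads the state, loops `count` times and stores it again.
void runSteps(WalkState& s, int count) {
    for (int j = 0; j < count; ++j) {
        s.rng = s.rng * 1664525u + 1013904223u;  // LCG stand-in for MWC64X
        const double factor = (s.rng & 0x80000000u) ? 1.01 : 0.99;
        s.currentValue *= factor;
        s.cumulativeWalk += s.currentValue;
    }
}

// Runs `steps` steps as a sequence of launches of at most `chunk` steps each.
WalkState runChunked(WalkState s, int steps, int chunk) {
    for (int done = 0; done < steps; done += chunk) {
        const int count = (steps - done < chunk) ? (steps - done) : chunk;
        runSteps(s, count);
    }
    return s;
}
```

The chunk size would be picked so that a single launch stays comfortably below the driver's time limit; the right value depends on the hardware and is not something this sketch can tell you.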
I wrote a sketch on my Arduino Mega when I was prototyping. Afterwards, I flashed it as is to an ATmega328 chip and got odd results all over the sketch. To fix it, I copied the code module by module over to a new IDE window, and that is when I noticed something fishy with the analogWrite functions. To rule out all other variables, I uploaded this sketch, which is a slightly modified Fade example sketch:
int led = 6;
int brightness = 0;
int fadeAmount = 5;
void setup() {
  Serial.begin(9600);
  pinMode(led, OUTPUT);
}
void loop() {
  Serial.println(brightness);
  analogWrite(led, brightness);
  brightness = brightness + fadeAmount;
  if (brightness == 0 || brightness == 255) {
    fadeAmount = -fadeAmount;
  }
  delay(1000);
}
It uploads perfectly fine with no errors, and I attached an LED and a resistor to that pin. When the chip starts running the code, all I get is the LED flashing and serial data like this:
.5
.0
.5
.0
.5
.0
.5
.0
.5
.0
.5
.0
.5
.10
What could be wrong with it?
Strange things are happening. I ran the program after copying and pasting your code and got the expected result:
0
5
10
15
20
25
30
35
40
45
50
55
60
65
70
75
80
85
90
95
100
105
110
115
120
125
130
135
140
145
150
155
160
165
170
175
180
185
190
195
200
205
210
215
220
225
230
235
240
245
250
255
250
245
240
235
230
225
220
Are you sure you have pasted the exact code that produces the unexpected results on your side? The dots in front of the numbers are just one of many strange things. The alternating values are another, as is the .10 that suddenly appears after the zeros and fives. In short, the fishiness seems to be unrelated to analogWrite.
Unless it is a hardware problem. What value is the resistor? Are the resistor and LED in series? Does the LED flash with a frequency of 0.5 Hz and a duty cycle of 0.5, or not?
BTW, RBerteig would be correct if your condition checked against 256 instead of 255. His version is indeed better, but if this were the problem you would see different behaviour.
An obvious problem is with this line:
if (brightness == 0 || brightness == 255) {
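Since you are modifying brightness by adding (or subtracting) 5 on each iteration, the == test only fires if brightness lands exactly on an endpoint. With 0 and 255 it does (255 is divisible by 5), but with a limit such as 256, or a different step, it never would.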
Change the == test to an inequality.
if (brightness <= 0 || brightness >= 255) {
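A minimal sketch of the inequality version as a plain C++ function, so it can be tried without a board. Fade and fadeStep are names made up for this example, and clamping brightness back onto the endpoint is an addition that the thread does not mention; it keeps the value in range when the step does not divide the range evenly.

```cpp
#include <cassert>

// Hypothetical container for the two globals of the Fade sketch.
struct Fade {
    int brightness;
    int fadeAmount;
};

// One pass of loop() without the I/O: step the brightness, and reverse
// direction when either end of the range [0, maxValue] is reached or passed.
void fadeStep(Fade& f, int maxValue) {
    f.brightness += f.fadeAmount;
    if (f.brightness <= 0 || f.brightness >= maxValue) {
        // Clamp so an overshoot (e.g. with a step of 7) never leaves the range.
        f.brightness = (f.brightness <= 0) ? 0 : maxValue;
        f.fadeAmount = -f.fadeAmount;
    }
}
```

With a step of 5 and a maximum of 255 this behaves like the original sketch; with a step of 7 the == version would skip past 255, while this one turns around at the limit.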
I am building a game which allows the player to control "power flow" between 10 circuits.
Each of the 10 circuits is adjusted individually and the total must always equal 100%.
For example a perfectly balanced situation would be all 10 circuits at 10% (10x10=100)
Edit 2: If what I am trying to do here is known by names other than "balancing", please comment and I will research them.
Now the player also has the ability to lock circuits, so that the power level cannot be changed by other circuits but can still be changed directly.
EDIT 3: Sometimes the requested amount may not be possible to achieve (e.g. examples 3 and 6). In these situations the nearest possible result will be the result.
EDIT: Seeing that my post is receiving down votes, I will include what I have already tried:
Dividing the sum of the changes by the number of circuits requesting a change, adding that to the circuits requesting a change and taking it off the circuits not changing. The problem with this method was that negative and positive changes at the same time could balance out and result in "deadlock" situations where no change happens.
Looping circuit by circuit, adding and taking as needed. The problem with this method is that it rarely balanced correctly.
Applying subtractions and additions first and then balancing all circuits back into range (so the total becomes 100). The problem with this was that power would end up where it shouldn't be, with circuits that should be at 0 ending up with small amounts of power.
To simplify my question we can work with just 5 circuits.
I need assistance working out the math for calculating the following. After 20 or so attempts I think I am overcomplicating it, as I keep ending up with 200-line scripts. Or is this actually very complicated?
Example 1: Addition Example
20    20    20    20    20    Start values
+10   +10   0     0     0     Change
30    30    13.3  13.3  13.3  After first iteration
50    50    0     0     0     After x iterations (e.g. key held down)
Example 2: Subtraction Example
20    20    20    20    20    Start values
-10   -10   0     0     0     Change
10    10    26.6  26.6  26.6  After first iteration
0     0     33.3  33.3  33.3  After x iterations (e.g. key held down)
Example 3: Lock + Addition (L is locked)
-     L     -     -     -     Locked (L)
2.5   90    2.5   2.5   2.5   Start values
0     0     +50   0     0     Change
0     90    10    0     0     After first iteration
0     90    10    0     0     After x iterations (e.g. key held down)
Example 4: Lock + Subtraction (L is locked)
-     L     -     -     -     Locked (L)
2.5   90    2.5   2.5   2.5   Start values
0     -10   0     0     0     Change
5     80    5     5     5     After first iteration
25    0     25    25    25    After x iterations (e.g. key held down)
Example 5: Multi Lock + Subtraction (L is locked)
-     L     L     -     -     Locked (L)
2.5   90    2.5   2.5   2.5   Start values
0     -10   0     0     0     Change
5.8   80    2.5   5.8   5.8   After first iteration
32.5  0     2.5   32.5  32.5  After x iterations (e.g. key held down)
Example 6: Balancing change from unbalanced start (This math may be a bit off)
2.5   90    2.5   2.5   2.5   Start values
+10   +10   +10   0     0     Change
16.7  66.6  16.7  0     0     After first iteration
33.3  33.3  33.3  0     0     After x iterations (e.g. key held down)
Start by retrieving all circuits that may potentially be changed by the runtime:
Candidates = AllCircuits \ (LockedCircuits u ChangedCircuits)
Here, \ denotes the set minus operator and u is the union operator.
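Calculate the average change per circuit. Here totalChange is the amount the candidates have to absorb, i.e. the negated sum of the requested changes:
targetTotalChange = totalChange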
averageChange = totalChange / |Candidates|
Now, start to change the candidates. To account for limitations, order the candidates by their current power flow. If averageChange is negative, then order them in ascending order. If it is positive, order them in descending order.
And remember how many circuits you already have processed:
processedCircuits = 0
Now iterate all candidates in the specified order:
for each candidate in Candidates
Check if the average change can be added to this circuit. Otherwise, adapt the values:
processedCircuits++
prevPower = candidate.PowerFlow
targetPower = prevPower + averageChange
if(targetPower < 0)
{
    totalChange += prevPower
    candidate.PowerFlow = 0
    //recalculate average change
}
else if(targetPower > 100)
{
    totalChange -= 100 - prevPower
    candidate.PowerFlow = 100
    //recalculate average change
}
else
{
    totalChange -= averageChange
    candidate.PowerFlow += averageChange
}
When you need to recalculate the average change, do the following:
averageChange = totalChange / (|Candidates| - processedCircuits)
Beware of division by zero.
Now you have adapted all other circuits. What remains is adapting the changed circuits. This is quite easy. We changed all other circuits by targetTotalChange - totalChange in total, so the opposite amount has to go to the changed circuits. We can just apply the corresponding percentage of each requested change:
percentage = (targetTotalChange - totalChange) / targetTotalChange
for each circuit in ChangedCircuits
    circuit.PowerFlow += percentage * targetChange[circuit]
next
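Put together, the steps above might look like the following C++ sketch. It is one possible reading of the pseudocode, not a definitive implementation: rebalance, requested and locked are names chosen for this example, the candidates are taken to absorb the negated sum of the requested changes, the average is recomputed on every iteration (which gives the same result as recomputing only after a clamp), and requests that sum to zero are simply applied directly. The changed circuits themselves are not clamped to the 0..100 range here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: `requested[i]` is the player's change for circuit i
// (0 means "not changed"), `locked[i]` marks circuits others may not touch.
std::vector<double> rebalance(std::vector<double> power,
                              const std::vector<double>& requested,
                              const std::vector<bool>& locked) {
    // Candidates = AllCircuits \ (LockedCircuits u ChangedCircuits)
    std::vector<std::size_t> candidates;
    double requestedTotal = 0.0;
    for (std::size_t i = 0; i < power.size(); ++i) {
        requestedTotal += requested[i];
        if (requested[i] == 0.0 && !locked[i]) candidates.push_back(i);
    }
    if (requestedTotal == 0.0) {
        // Assumption: requests that cancel each other are applied as they are.
        for (std::size_t i = 0; i < power.size(); ++i) power[i] += requested[i];
        return power;
    }

    const double targetTotalChange = -requestedTotal;  // candidates absorb this
    double totalChange = targetTotalChange;            // amount still to place

    // Negative change: smallest first. Positive change: largest first.
    const bool ascending = targetTotalChange < 0.0;
    std::sort(candidates.begin(), candidates.end(),
              [&](std::size_t a, std::size_t b) {
                  return ascending ? power[a] < power[b] : power[a] > power[b];
              });

    for (std::size_t k = 0; k < candidates.size(); ++k) {
        // The divisor is at least 1 inside the loop, so no division by zero.
        const double averageChange =
            totalChange / static_cast<double>(candidates.size() - k);
        double& p = power[candidates[k]];
        if (p + averageChange < 0.0) {
            totalChange += p;          // only p could be taken away
            p = 0.0;
        } else if (p + averageChange > 100.0) {
            totalChange -= 100.0 - p;  // only the headroom could be added
            p = 100.0;
        } else {
            totalChange -= averageChange;
            p += averageChange;
        }
    }

    // Give the changed circuits the share of their request that was absorbed.
    const double percentage =
        (targetTotalChange - totalChange) / targetTotalChange;
    for (std::size_t i = 0; i < power.size(); ++i) {
        power[i] += percentage * requested[i];
    }
    return power;
}
```

Fed with the question's Example 3 (start 2.5, 90, 2.5, 2.5, 2.5, second circuit locked, +50 on the third), this returns 0, 90, 10, 0, 0, and with Example 5 it returns roughly 5.8, 80, 2.5, 5.8, 5.8, which matches the "after first iteration" rows.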
I am interested in profiling some Rcpp code under OS X (Mountain Lion 10.8.2), but the profiler crashes when being run.
Toy example, using inline, just designed to take enough time for a profiler to notice.
library(Rcpp)
library(inline)
src.cpp <- "
RNGScope scope;
int n = as<int>(n_);
double x = 0.0;
for ( int i = 0; i < n; i++ )
x += (unif_rand()-.5);
return wrap(x);"
src.c <- "
int i, n = INTEGER(n_)[0];
double x = 0.0;
GetRNGstate();
for ( i = 0; i < n; i++ )
x += (unif_rand()-.5);
PutRNGstate();
return ScalarReal(x);"
f.cpp <- cxxfunction(signature(n_="integer"), src.cpp, plugin="Rcpp")
f.c <- cfunction(signature(n_="integer"), src.c)
If I use either the GUI Instruments (in Xcode, version 4.5 (4523)) or the command line sample, both crash: Instruments crashes straight away, while sample completes processing samples before crashing:
# (in R)
set.seed(1)
f.cpp(200000000L)
# (in a separate terminal window)
~ » sample R # this invokes the profiler
Sampling process 81337 for 10 seconds with 1 millisecond of run time between samples
Sampling completed, processing symbols...
[1] 81654 segmentation fault sample 81337
If I do the same process but with the C version (i.e., f.c(200000000L)) both Instruments and sample work fine, and produce output like
Call graph:
1832 Thread_6890779 DispatchQueue_1: com.apple.main-thread (serial)
1832 start (in R) + 52 [0x100000e74]
1832 main (in R) + 27 [0x100000eeb]
1832 run_Rmainloop (in libR.dylib) + 80 [0x1000e4020]
1832 R_ReplConsole (in libR.dylib) + 161 [0x1000e3b11]
1832 Rf_ReplIteration (in libR.dylib) + 514 [0x1000e3822]
1832 Rf_eval (in libR.dylib) + 1010 [0x1000aa402]
1832 Rf_applyClosure (in libR.dylib) + 849 [0x1000af5d1]
1832 Rf_eval (in libR.dylib) + 1672 [0x1000aa698]
1832 do_dotcall (in libR.dylib) + 16315 [0x10007af3b]
1382 file1412f6e212474 (in file1412f6e212474.so) + 53 [0x1007fded5] file1412f6e212474.cpp:16
+ 862 unif_rand (in libR.dylib) + 1127,1099,... [0x10000b057,0x10000b03b,...]
+ 520 fixup (in libR.dylib) + 39,67,... [0x10000aab7,0x10000aad3,...]
356 file1412f6e212474 (in file1412f6e212474.so) + 70,61,... [0x1007fdee6,0x1007fdedd,...] file1412f6e212474.cpp:16
56 unif_rand (in libR.dylib) + 1133 [0x10000b05d]
38 DYLD-STUB$$unif_rand (in file1412f6e212474.so) + 0 [0x1007fdf1c]
I would really appreciate some advice on whether there is anything I'm doing wrong, whether there is some other preferred way, or whether this is just not possible. Given that one of the main uses of Rcpp seems to be speeding up R code, I'm surprised not to find more information on this, but perhaps I'm looking in the wrong place.
This is on OS X 10.8.2 with R 2.15.1 (x86_64-apple-darwin9.8.0), Rcpp 0.9.15, and g++ --version reports "i686-apple-darwin11-llvm-g++-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)".
A solution
Thanks to Dirk's answer below, and his talk here http://dirk.eddelbuettel.com/papers/ismNov2009introHPCwithR.pdf, I have at least a partial solution using Google perftools. First, install from here http://code.google.com/p/gperftools/, and add -lprofiler to PKG_LIBS when compiling the C++ code. Then either
(a) Run R as CPUPROFILE=samples.log R, run all code and quit (or use Rscript)
(b) Use two small utility functions to turn on/off profiling:
RcppExport SEXP start_profiler(SEXP str) {
    ProfilerStart(as<const char*>(str));
    return R_NilValue;
}
RcppExport SEXP stop_profiler() {
    ProfilerStop();
    return R_NilValue;
}
Then, within R you can do
.Call("start_profiler", "samples.log")
# code that calls C++ code to be profiled
.Call("stop_profiler")
Either way, the file samples.log will contain profiling information. This can be looked at with
pprof --text /Library/Frameworks/R.framework/Resources/bin/exec/x86_64/R samples.log
which produces output like
Using local file /Library/Frameworks/R.framework/Resources/bin/exec/x86_64/R.
Using local file samples.log.
Removing __sigtramp from all stack traces.
Total: 112 samples
64 57.1% 57.1% 64 57.1% _unif_rand
30 26.8% 83.9% 30 26.8% _process_system_Renviron
14 12.5% 96.4% 101 90.2% _for_profile
3 2.7% 99.1% 3 2.7% Rcpp::internal::expr_eval_methods
1 0.9% 100.0% 1 0.9% _Rf_PrintValueRec
0 0.0% 100.0% 1 0.9% 0x0000000102bbc1ff
0 0.0% 100.0% 15 13.4% 0x00007fff5fbfe06f
0 0.0% 100.0% 1 0.9% _Rf_InitFunctionHashing
0 0.0% 100.0% 1 0.9% _Rf_PrintValueEnv
0 0.0% 100.0% 112 100.0% _Rf_ReplIteration
which would probably be more informative on a real example.
I'm confused; your example is incomplete:
you don't show the (trivial) invocation of cfunction() and cxxfunction()
you don't show how you invoke the profiler
you aren't profiling the C or C++ code (!!)
Can you maybe edit the question and make it clearer?
Also, when I run this, the two examples do give identical speed results, as they are essentially identical. [ Rcpp would let you do this in one call with Rcpp sugar's random number functions. ]
R> library(Rcpp)
R> library(inline)
R>
R> src.cpp <- "
+ RNGScope scope;
+ int n = as<int>(n_);
+ double x = 0.0;
+ for ( int i = 0; i < n; i++ )
+ x += (unif_rand()-.5);
+ return wrap(x);"
R>
R> src.c <- "
+ int i, n = INTEGER(n_)[0];
+ double x = 0.0;
+ GetRNGstate();
+ for ( i = 0; i < n; i++ )
+ x += (unif_rand()-.5);
+ PutRNGstate();
+ return Rf_ScalarReal(x);"
R>
R> fc <- cfunction(signature(n_="int"), body=src.c)
R> fcpp <- cxxfunction(signature(n_="int"), body=src.c, plugin="Rcpp")
R>
R> library(rbenchmark)
R>
R> print(benchmark(fc(10000L), fcpp(10000L)))
test replications elapsed relative user.self sys.self user.child sys.child
1 fc(10000) 100 0.013 1 0.012 0 0 0
2 fcpp(10000) 100 0.013 1 0.012 0 0 0
R>