I am using an Arduino Due with a TMP36 (for reading temperature). Here is my formula that converts the readings to °F:
tempReading = analogRead(tempPin);
voltage = tempReading * 5.0; // Saves the voltage
voltage /= 1024.0;
tempC = (voltage - 0.5) * 100 ; //Converts to Celsius
tempF = (tempC * 9.0 / 5.0) + 32; //Converts to Fahrenheit
In the serial monitor, my Arduino is printing out temperatures from 90-100 °F, while my house is set to about 70 °F. What is the problem here?
From http://arduino.cc/en/Main/ArduinoBoardDue:
"Unlike other Arduino boards, the Arduino Due board runs at 3.3V. The maximum voltage that the I/O pins can tolerate is 3.3V. Providing higher voltages, like 5V to an I/O pin could damage the board."
So you should likely multiply tempReading not by 5.0, but by 3.3.
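For reference, here is a minimal sketch of the corrected conversion, assuming the Due's default 10-bit analogRead() resolution and your existing tempPin:
int tempReading = analogRead(tempPin);       // 0-1023 by default on the Due
float voltage = tempReading * 3.3 / 1024.0;  // ADC counts -> volts (3.3 V reference)
float tempC = (voltage - 0.5) * 100.0;       // TMP36: 500 mV offset, 10 mV per °C
float tempF = tempC * 9.0 / 5.0 + 32.0;      // Celsius -> Fahrenheit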
I want to measure the RPM of a metal wheel using an inductive proximity sensor (NPN) and an LM2917N, which is supposed to convert frequency into voltage that I intend to read with an ESP32, since it has a 12-bit ADC.
The wheel has 2 holes which will be "seen" by the sensor, and a diameter of 50 mm. Basically, for one complete revolution the sensor needs to "see" 2 holes. Considering that the max speed of the wheel I intend to measure is 5 km/h, I made the following calculations:
Max RPM of the wheel will be around 530 rpm.
At that RPM the sensor will get at most 1060 pulses per minute, which means about 17.67 Hz.
The min RPM I would like to measure is 100 rpm, which means about 3.3 Hz.
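(As a quick check: wheel circumference = π × 50 mm ≈ 157 mm; 5 km/h ≈ 83,333 mm/min; 83,333 / 157 ≈ 530 rpm; and 530 rpm × 2 pulses per revolution / 60 s ≈ 17.7 Hz.)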
Now the concerns:
I see in the LM2917N datasheet that the input voltage can be between 0 and 28 V. In my setup it will be powered at 12 V, and I assume the input signal will also be 12 V, since the proximity sensor is powered from the same supply as the LM2917N.
I am not able to do the calculations of C1, C2 and R1 for my setup, and I also have to find a way to bring the output voltage into the 0-3.3 V range (the ESP32's limit).
Secondly, I need help understanding how to map the voltage I read back to a frequency (e.g. 1 V means 10 Hz, or some RPM...).
Any help will be highly appreciated.
Thanks in advance!
After reading the datasheet: the formula is Vo = R1 × C1 × VCC × f.
With VCC = 12 V, C1 = 0.1 µF and R1 = 100 kΩ, that gives Vo = 0.12 V/Hz.
For 18 Hz the output will be 2.16 V, which is already within the ESP32's 0-3.3 V input range.
So you can connect the output directly to the analog input and convert the value like this:
int getRPM() {
  // The ESP32 ADC is 12-bit, so full scale is 4095 (not 1024)
  float V0 = 3.3 * analogRead(A0) / 4095.0;
  float F = V0 / 0.12;       // 0.12 V per Hz, from the LM2917 formula above
  int rpm = F * 60.0 / 2.0;  // two pulses per wheel revolution
  return rpm;
}
How can I calculate energy accurately if I have power, current and voltage values?
This is my energy-calculation code, but the result is wrong. How can I fix it?
I want to measure apparent energy; I don't have a problem with the V, I and P values.
if (millis() >= energyLastSample + 1)
{
  energySampleCount = energySampleCount + 1;
  energyLastSample = millis();
}

if (energySampleCount >= 1000)
{
  apparent_energy_l1 = apparent_power_l1 / 3600.0;
  finalEnergyValue_l1 = finalEnergyValue_l1 + apparent_energy_l1;
  apparent_energy_l2 = apparent_power_l2 / 3600.0;
  finalEnergyValue_l2 = finalEnergyValue_l2 + apparent_energy_l2;
  apparent_energy_l3 = apparent_power_l3 / 3600.0;
  finalEnergyValue_l3 = finalEnergyValue_l3 + apparent_energy_l3;
  // Serial.print(finalEnergyValue, 2);
  // Serial.println("test");
  energySampleCount = 0;
}

energy_total = finalEnergyValue_l1 + finalEnergyValue_l2 + finalEnergyValue_l3;
}
Some tips about power calculation using Arduino or any other microcontroller,
open-source code or projects,
or guidelines to solve my problem would be appreciated.
Note that energy (W × t) is power accumulated over time, while power (W) is the rate at which work is done. That means you cannot simply divide power by 3600 (the factor for converting seconds to hours) to get an energy value. If you want to calculate the energy consumed by a device, you have to measure the power continuously, for example at 1 s intervals, and add each reading to a counter. That counter then represents watt-seconds (Ws), and from it you can calculate the Wh consumed.
Example:
You have a device which consumes 300 W of power. You keep that device running for exactly 3 hours. If you measure the power consumption every second as described, you will have accumulated 3,240,000 Ws. 3,240,000 Ws / 3600 = 900 Wh, and 900 Wh / 1000 = 0.9 kWh. You can of course change your measurement interval to fit your accuracy needs.
Pseudocode:
if (millis() >= lastmillis + 1000)
{
  lastmillis = millis();
  wattseconds = wattseconds + power;        // add the current power reading (W) for this 1 s interval
  kilowatthours = wattseconds / 3600000.0;  // 3,600,000 Ws per kWh
  Serial.println(kilowatthours);
}
You could of course use a one second interrupt with an external RTC to get a more accurate timing.
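As a rough illustration of that idea, here is a minimal sketch using the internal Timer1 instead of an external RTC (assuming an AVR-based Arduino and the TimerOne library; the power variable stands in for whatever measurement code you already have):
#include <TimerOne.h>                 // hardware-timer library for AVR-based boards

volatile bool secondElapsed = false;  // set once per second by the timer interrupt
double power = 0;                     // latest power reading (W), updated by your measurement code
double wattseconds = 0;

void onTick() {
  secondElapsed = true;               // keep the ISR short; do the work in loop()
}

void setup() {
  Serial.begin(9600);
  Timer1.initialize(1000000);         // period in microseconds: 1,000,000 us = 1 s
  Timer1.attachInterrupt(onTick);
}

void loop() {
  // power = ...;                     // update from your own power measurement
  if (secondElapsed) {
    secondElapsed = false;
    wattseconds += power;             // one second's worth of energy in Ws
    Serial.println(wattseconds / 3600000.0, 4);  // kWh accumulated so far
  }
}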
If the voltage across a 16 mF capacitor is 7 volts at t=0, find the voltage across the capacitor after 0.2 seconds of discharging through a 120 Ω resistor.
Substitute your values into the capacitor discharge equation:
Vc = Vi * exp(-t/RC)
Vc = voltage across the capacitor at time t
Vi = voltage across the capacitor at t = 0
R = resistance (Ω), C = capacitance (F), t = elapsed time (s)
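Plugging in the values from the question: RC = 120 Ω × 0.016 F = 1.92 s, so Vc = 7 V × exp(−0.2 / 1.92) ≈ 7 V × 0.901 ≈ 6.3 V.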
I want to calculate the max network throughput on 1G Ethernet link. I understand how to estimate max rate in packets/sec units for 64-bytes frame:
IFG 12 bytes
MAC Preamble 8 bytes
MAC DA 6 bytes
MAC SA 6 bytes
MAC type 2 bytes
Payload 46 bytes
FCS 4 bytes
Total Frame size -> 84 bytes
Now for 1G link we get:
1,000,000,000 bits/sec ÷ (84 bytes × 8 bits/byte) ≈ 1,488,095 fps
As I understand it, this is the data-link-layer performance, correct?
But how do I calculate throughput in megabits per second for different packet sizes, e.g. 64, 128, ..., 1518 bytes? Also, how do I calculate UDP/TCP throughput, since I have to account for header overhead?
Thanks.
Max throughput over Ethernet = (Payload_size / (Payload_size + 38)) * Link bitrate
I.e. if you send 50 bytes of payload data, max throughput would be (50 / 88) * 1,000,000,000 for a 1G link, or about 568 Mbit/s. If you send 1000 bytes of payload, max throughput is (1000/1038) * 1,000,000,000 = 963 Mbit/s.
IP+UDP adds 28 bytes of headers, so if you're looking for data throughput over UDP, you should use this formula:
Max throughput over UDP = (Payload_size / (Payload_size + 66)) * Link bitrate
And IP+TCP adds 40 bytes of headers, so that would be:
Max throughput over TCP = (Payload_size / (Payload_size + 78)) * Link bitrate
Note that these are optimistic calculations. In reality, you might have extra options in the header data that increase the size of the headers, lowering payload throughput. You could also have packet loss that causes performance to drop.
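As a quick sketch of how you might compute these figures yourself (plain C, using the 38/66/78-byte per-packet overheads derived above; the payload sizes are just examples):
#include <stdio.h>

/* Max payload throughput in bit/s for a given payload size, per-packet overhead and link rate. */
static double max_throughput(double payload_bytes, double overhead_bytes, double link_bps)
{
    return payload_bytes / (payload_bytes + overhead_bytes) * link_bps;
}

int main(void)
{
    const double link = 1e9;                     /* 1 Gbit/s link */
    const int sizes[] = { 64, 128, 512, 1472 };  /* example payload sizes in bytes */

    for (int i = 0; i < 4; i++) {
        double p = sizes[i];
        printf("payload %4d B: eth %.0f Mbit/s, udp %.0f Mbit/s, tcp %.0f Mbit/s\n",
               sizes[i],
               max_throughput(p, 38, link) / 1e6,   /* raw Ethernet payload */
               max_throughput(p, 66, link) / 1e6,   /* data over IP+UDP */
               max_throughput(p, 78, link) / 1e6);  /* data over IP+TCP */
    }
    return 0;
}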
Check out the Wikipedia article on the ethernet frame, and particularly the "Maximum throughput" section:
http://en.wikipedia.org/wiki/Ethernet_frame
Should be an easy one but my OpenCL skills are completely rusty. :)
I have a simple kernel that does the sum of two arrays:
__kernel void sum(__global float* a, __global float* b, __global float* c)
{
    __private size_t gid = get_global_id(0);
    c[gid] = log(sqrt(exp(cos(sin(a[gid]))))) + log(sqrt(exp(cos(sin(b[gid])))));
}
It's working fine.
Now I'm trying to use local memory hoping it could speed things up:
__kernel void sum_with_local_copy(__global float* a, __global float* b, __global float* c,
                                  __local float* tmpa, __local float* tmpb, __local float* tmpc)
{
    __private size_t gid = get_global_id(0);
    __private size_t lid = get_local_id(0);
    __private size_t grid = get_group_id(0);
    __private size_t lsz = get_local_size(0);
    event_t evta = async_work_group_copy(tmpa, a + grid * lsz, lsz, 0);
    wait_group_events(1, &evta);
    event_t evtb = async_work_group_copy(tmpb, b + grid * lsz, lsz, 0);
    wait_group_events(1, &evtb);
    tmpc[lid] = log(sqrt(exp(cos(sin(tmpa[lid]))))) + log(sqrt(exp(cos(sin(tmpb[lid])))));
    event_t evt = async_work_group_copy(c + grid * lsz, tmpc, lsz, 0);
    wait_group_events(1, &evt);
}
But there are two issues with this kernel:
it's something like 3 times slower than the naive implementation
the results are wrong starting at index 64
My local-size is the max workgroup size.
So my questions are:
1) Am I missing something obvious or is there really a subtlety?
2) How to use local memory to speed up the computation?
3) Should I loop inside the kernel so that each work-item does more than one operation?
Thanks in advance.
Your simple kernel is already optimal w.r.t. work-group performance.
Local memory will only improve performance in cases where multiple work-items in a work-group read from the same address in local memory. As there is no shared data in your kernel there is no gain to be had by transferring data from global to local memory, thus the slow-down.
As for point 3, you may see a gain by processing multiple values per thread (depending on how expensive your computation is and what hardware you have).
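A rough sketch of what that could look like (a grid-stride loop is just one possible arrangement, not something from your original code; n would be passed in as an extra kernel argument):
__kernel void sum_multi(__global const float* a, __global const float* b,
                        __global float* c, const unsigned int n)
{
    size_t stride = get_global_size(0);
    // Each work-item processes several elements, striding by the total number of work-items.
    for (size_t i = get_global_id(0); i < n; i += stride) {
        c[i] = log(sqrt(exp(cos(sin(a[i]))))) + log(sqrt(exp(cos(sin(b[i])))));
    }
}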
As you probably know you can explicitly set the local work group size (LWS) when executing your kernel using:
clEnqueueNDRangeKernel( ... bunch of args include Local Work Size ...);
as discussed here. But as already mentioned by Kyle, you don't really have to do this, because OpenCL tries to pick the best value for the LWS when you pass NULL for the LWS argument.
Indeed, the specification says: "local_work_size can also be a NULL value in which case the OpenCL implementation will determine how to break the global work-items into appropriate work-group instances."
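In other words (a minimal sketch, assuming the same one-dimensional launch as in the timing code below):
size_t global_work_size = n;
// Passing NULL for local_work_size lets the implementation pick the work-group size.
clEnqueueNDRangeKernel(cmd_queue, kernel[0], 1, NULL, &global_work_size,
                       NULL, 0, NULL, NULL);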
I was curious to see how this played out in your case so I setup your calculation to verify the performance against the default value chosen by OpenCL on my device.
In case you're interested, I set up some arbitrary data:
int n = 1 << 20;   // 2^20 elements
float* a = (float*)malloc(sizeof(float) * n);
float* b = (float*)malloc(sizeof(float) * n);
float* results = (float*)malloc(sizeof(float) * n);
for (int i = 0; i < n; i++) {
    a[i] = (float)i;
    b[i] = (float)(n - i);
    results[i] = 0.f;
}
and then, after defining all of the other OpenCL structures, I varied lws = VALUE from 2 to 256 (the max allowed on my device for this kernel) in powers of 2, and measured the wall-clock time (note: you could also use OpenCL events):
struct timeval timer;
int trials = 100;
gettimeofday(&timer, NULL);
double t0 = timer.tv_sec + (timer.tv_usec / 1000000.0);
// ---------- Execution ----------
size_t global_work_size = n;
size_t lws[] = {VALUE};   // VALUE was varied from 2 to 256 in powers of 2.
for (int trial = 0; trial < trials; trial++) {
    clEnqueueNDRangeKernel(cmd_queue, kernel[0], 1, NULL, &global_work_size, lws, 0, NULL, NULL);
}
clFinish(cmd_queue);
gettimeofday(&timer, NULL);
double t1 = timer.tv_sec + (timer.tv_usec / 1000000.0);
double avgTime = (t1 - t0) / trials;
I then plotted the total execution time as a function of the LWS and, as expected, the performance varies by quite a bit until the best value, LWS = 256, is reached. For LWS > 256, my device's limits are exceeded with this kernel.
FYI, for these tests I was running a laptop GPU: an AMD ATI Radeon HD 6750M, with max compute units = 6 and CL_DEVICE_LOCAL_MEM_SIZE = 32768 (so no big screamer compared to other GPUs).
Here are the raw numbers:
LWS time(sec)
2 14.004
4 6.850
8 3.431
16 1.722
32 0.866
64 0.438
128 0.436
256 0.436
Next, I checked the default value chosen by OpenCL (passing NULL for the LWS) and this corresponds to the best value that I found by profiling, i.e., LWS = 256.
So in the code you set up, you found one of the suboptimal cases, and as mentioned before, it's best to let OpenCL pick the best values for the local work-groups, especially when there is no data in your kernel shared between multiple work-items in a work-group.
As to the error you got, you probably violated a constraint (from the spec):
The total number of work-items in the work-group must be less than or equal to the CL_DEVICE_MAX_WORK_GROUP_SIZE
Did you check that in detail, by querying the CL_DEVICE_MAX_WORK_GROUP_SIZE for your device?
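For example (a minimal sketch; device and kernel[0] are assumed to be the handles you already created):
size_t max_device_wg = 0, max_kernel_wg = 0;

// Device-wide upper bound on the work-group size
clGetDeviceInfo(device, CL_DEVICE_MAX_WORK_GROUP_SIZE,
                sizeof(max_device_wg), &max_device_wg, NULL);

// Upper bound for this particular kernel on this device
clGetKernelWorkGroupInfo(kernel[0], device, CL_KERNEL_WORK_GROUP_SIZE,
                         sizeof(max_kernel_wg), &max_kernel_wg, NULL);

printf("device max WG size: %zu, kernel max WG size: %zu\n", max_device_wg, max_kernel_wg);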
Adding to what Kyle has written: it has to be multiple work-items reading from the same address. If each work-item merely reads multiple times from the same address on its own, then again local memory won't help you any; just use the work-item's private memory, i.e. variables you define within your kernel.
Also, some points not related to the use of local memory:
log(sqrt(exp(x))) = log(exp(x)) / 2 = x / 2 ... assuming it's the natural logarithm.
log(sqrt(exp(x))) = log(exp(x)) / 2 = x / (2 ln(2)) ... assuming it's the base-2 logarithm. Compute ln(2) in advance, of course.
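Applied to your first kernel (OpenCL's log() is the natural logarithm, so the first case applies), the simplification would look something like this:
__kernel void sum_simplified(__global const float* a, __global const float* b, __global float* c)
{
    size_t gid = get_global_id(0);
    // log(sqrt(exp(x))) == x / 2 for the natural log, so the whole expression collapses to:
    c[gid] = 0.5f * (cos(sin(a[gid])) + cos(sin(b[gid])));
}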
If you really did have some complex function-of-a-function-of-a-function, you might be better off using a Taylor series expansion. For example, your function expands to 1/2-x^2/4+(5 x^4)/48+O(x^6) (order 5).
The last term is an error term, which you can bound from above to choose the appropriate order for the expansion; the error term should not be that high for 'well-behaving' functions. The Taylor expansion calculation might even benefit from further parallelization (but then again, it might not).