Is there any evidence of a particular sizing unit taking longer to process? For instance, if you were to use rem to size your entire site would it take longer to calculate/paint the page than if everything were given a specific px value?
Is there any benefit to max-width: 16rem over max-width: 250px?
I'm under the impression that rem takes longer since it has to go back to the root and calculate, while em is like a steady stream of processing, and px would be the fastest because there's nothing to calculate.
Please let me know if anyone has any evidence of which method is faster.
Edit: I started off pretty much dismissing this discussion as, well, polishing the roof of a truck, but I had not considered CSS animations, which are quite heavy processor users. Since CSS is not a graphically optimized process (it is very inefficient), I think there is slightly more warrant for such a question if a website has a large number of CSS animations.
Quote from question:
I'm under the impression that rem takes longer since it has to go back to the root and calculate, while em is like a steady stream of processing, and px would be the fastest because there's nothing to calculate.
No. rem simply takes a factor of the root em value rather than the parent em value. (As the root doesn't change, I would hope the CSS engine doesn't need to keep recalculating it and can instead simply reuse the value from memory.)
rem is the way we should be writing CSS in 2016. It beats the lights out of em once more than one or two parent elements are affecting the em value. From the point of view of a developer working out what 1.2em of 1.4em of 1.2em of 14px is (14px × 1.2 × 1.4 × 1.2 ≈ 28.2px), why not just have 1.2 of the 14px root as 1.2rem (16.8px)?
As for px, that is not a straight-to-screen result either: on many modern display devices, a CSS pixel is not a hardware pixel. This may be an interesting topic for you to read.
If you care about the speed of processing rem against px, then I personally feel you're in effect trying to get better fuel efficiency from your truck by polishing the roof to reduce air resistance: your work may have a tiny impact, but there are other, far larger consumers of GPU, CPU, RAM and power, and many more of them.
You may also like to read this: How a CSS pixel size is calculated?
And because I want to entertain you, you may like to know that you can now generate full 3D computer game levels entirely through CSS. This was made in 2013! I still find it incredible!
In this game the developer used px throughout. You could perhaps take his code and apply em and/or rem; on a page that heavy, it should show whether one is indeed notably faster.
I initially have work units with a size of 11*11*6779. For the sake of simplicity I don't want to translate it into a 1D global work size. When I changed it to 21*21*6779, the performance became 5-6x slower than before. As far as I know, the code does nothing that depends on the number of threads being run.
The amount of data transferred is only 4x bigger, which I don't think is the reason the program runs slower, because I tested the memory allocation process.
Note that my device has a max work item size of 256*256*256, meaning I would be using half of all available work items, and this is not a dedicated device (it is also used for display).
I wonder if setting the work item sizes to 21*21*6779 uses too many of my work items, or whether the dimensions are simply inconvenient for OpenCL to handle?
If your max work items is 256x256x256, then why are you using 21x21x6779 (where 6779 is greater than 256)? Note that if the work group size is not specified, the runtime will try to pick one that can divide up your global work size. If your dimensions are not easily divisible, it might pick bad work group sizes. That could explain why the performance changes based on global work size. I recommend you specify the work group size and make the global work size a multiple of it (if necessary, pass in the real size as parameters and have each work item check whether it is in range; this is a typical pattern you will see a lot in OpenCL).
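As a rough sketch of that pattern (all sizes and names here are illustrative, not taken from the question's code), the kernel receives the real dimensions as arguments and bails out early for the padded-out work items:

```c
/* Illustrative kernel: guard against the padding work items. */
__kernel void process(__global float *data,
                      const int realW, const int realH, const int realD)
{
    int x = get_global_id(0);
    int y = get_global_id(1);
    int z = get_global_id(2);
    if (x >= realW || y >= realH || z >= realD)
        return;                               /* padding item, nothing to do */
    /* ... real work on data[((size_t)z * realH + y) * realW + x] ... */
}
```

On the host, pick an explicit local size and round the global size up to a multiple of it:

```c
/* Host side (error checking omitted). */
size_t local[3]  = { 8, 8, 4 };               /* 256 work items per group */
size_t real[3]   = { 21, 21, 6779 };          /* actual problem size */
size_t global[3];
for (int i = 0; i < 3; ++i)
    global[i] = ((real[i] + local[i] - 1) / local[i]) * local[i];

int realW = 21, realH = 21, realD = 6779;
clSetKernelArg(kernel, 1, sizeof(int), &realW);
clSetKernelArg(kernel, 2, sizeof(int), &realH);
clSetKernelArg(kernel, 3, sizeof(int), &realD);
clEnqueueNDRangeKernel(queue, kernel, 3, NULL, global, local, 0, NULL, NULL);
```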
According to MDN, the "px" unit can mean 2 completely different things depending on whether it's on a "low-dpi" device or a "high-dpi" device.
For low-dpi devices, the unit px represents the physical reference pixel; other units are defined relative to it.
or
For high-dpi devices, inches (in), centimeters (cm), and millimeters (mm) are the same as their physical counterparts. Therefore, the px unit is defined relative to them (1/96 of 1 inch).
But how exactly does it differentiate one from another? What is the cut off for "high dpi"? How can I tell which one of these 2 cases is being used on a particular device?
Found the answer on the w3 page:
In the past, CSS required that implementations display absolute units correctly even on computer screens. But as the number of incorrect implementations outnumbered correct ones and the situation didn't seem to improve, CSS abandoned that requirement in 2011. Currently, absolute units must work correctly only on printed output and on high-resolution devices.
CSS doesn't define what “high resolution” means. But as low-end printers nowadays start at 300 dpi and high-end screens are at 200 dpi, the cut-off is probably somewhere in between.
https://www.w3.org/Style/Examples/007/units.en.html
Hi there! I was reading the W3C spec about units of measurement (https://www.w3.org/TR/css-values-3/#reference-pixel) and the fact is that I didn't understand what a reference pixel is. Can you explain it to me, or point me to another reference or explanation that's easier to understand? Also, I'm not sure I really understand the other things about units and measurements, ha ha. Really, it seems too hard.
Thank you!
The reference pixel is an attempt to standardize what "pixel" means in web development. The reason this matters is that the physical size of a pixel can vary greatly depending on the pixel density of the display.
For example, old CRT monitors had 72 pixels per inch, whereas an iPhone 7+ has 401 pixels per inch. So a literal measurement of 100px would be 1.39 inches on the CRT monitor and 0.25 inches on the iPhone.
This article also has a pretty good explanation that helped me understand it better.
A List Apart, "A Pixel Identity Crisis" by Scott Kellum. January 17, 2012
"The w3c currently defines the reference pixel as the standard for all
pixel-based measurements. Now, instead of every pixel-based
measurement being based on a hardware pixel it is based on an optical
reference unit that might be twice the size of a hardware pixel. This
new pixel should look exactly the same in all viewing situations..."
"When using a phone that you held close, a reference pixel will be
smaller on the screen than a projection you view from a distance. If
the viewer holds their phone up so it is side-by-side with the
projection, the pixel sizes should look identical no matter the
resolution or pixel density the devices have. When implemented
properly, this new standard will provide unprecedented stability
across all designs on all platforms no matter the pixel density or
viewing distance."
I'm working on optimizing a separable image downscaler. My next step is to reduce multiple (nearest) samplings of the same texel by reading all necessary texels into local memory. Here begins the fun...
The downscaler is versatile, so it can downscale anything larger into anything smaller, and can even take sections of an image and downscale them into a destination image. Thus the final resolution divider is never a whole number; most of the time it will be something around 3.97 or so. This means I do not know the required size for that local array at compile time.
To me that means: before enqueuing a task, I'll have to create a local mem object of the required size.
How do I know what workgroup sizes OpenCL will select?
If there is no way, is there a "best practice" to overcome this problem?
P.S.: I'm writing for OpenCL 1.1 compatibility.
Since you are using images, the texture cache can be relied upon instead of using shared local memory.
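If you do still want explicit local memory despite that, one OpenCL 1.1-compatible approach (sketched below with made-up tile sizes for a roughly 3.97x reduction, not the poster's actual code) is to choose the work-group size yourself rather than letting the runtime pick it. The needed tile size is then known on the host before enqueue, and you can allocate the __local buffer by passing a size with a NULL pointer to clSetKernelArg:

```c
/* Illustrative kernel: the size of 'tile' is decided by the host. */
__kernel void downscale(__read_only image2d_t src,
                        __write_only image2d_t dst,
                        __local float4 *tile)
{
    /* ... cooperative load of the needed source texels into 'tile',
       barrier(CLK_LOCAL_MEM_FENCE), then the actual filtering ... */
}
```

```c
/* Host side (error checking omitted). */
size_t local[2] = { 8, 8 };        /* chosen explicitly, not left to the runtime */
/* Each 8x8 output block reads roughly ceil(8 * 3.97) source texels per axis,
   plus a small filter margin; the numbers are just an example. */
size_t tileW = 36, tileH = 36;
size_t tileBytes = tileW * tileH * sizeof(cl_float4);

clSetKernelArg(kernel, 2, tileBytes, NULL);   /* NULL pointer => __local allocation */
clEnqueueNDRangeKernel(queue, kernel, 2, NULL, global, local, 0, NULL, NULL);
```

Make sure tileBytes stays below CL_DEVICE_LOCAL_MEM_SIZE; otherwise the texture cache route above is the safer bet.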
These days I'm trying to program on a mobile GPU (Adreno).
The algorithm I use for image processing has 'randomness' in its memory access: it refers to some pixels in a 'fixed' range for filtering, but I can't know exactly which pixels will be referenced (it depends on the image).
As far as I understand, if multiple threads access the same local memory bank, it causes a bank conflict, so in my case it should cause bank conflicts.
My question: can I eliminate bank conflicts with random memory access, or at least reduce them?
Assuming that the distances between your randomly accessed pixels are somehow normally distributed, you could think about tiling your image into subimages.
What I mean: instead of working with a (let's say) 1024x1024 image, you might have 4x4 images of size 256x256. Each of them is kept together in memory, so "near" pixel accesses stay within the same image object; only the far-distance operations need to access different subimages.
A second option: instead of using CLImage objects, try to save your data into an array. The data in the array can be stored in Z-order-curve sorting, which also reduces the spatial spread of accesses (compared to row-order sorting).
But of course, this depends strongly on your image size.
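As a rough illustration of the Z-order idea (plain C, independent of any particular OpenCL buffer layout), the linear index is built by interleaving the bits of x and y, so pixels that are close in 2D tend to stay close in memory:

```c
#include <stdint.h>

/* Spread the lower 16 bits of v apart so there is a zero bit between each. */
static uint32_t part1by1(uint32_t v)
{
    v &= 0x0000FFFF;
    v = (v | (v << 8)) & 0x00FF00FF;
    v = (v | (v << 4)) & 0x0F0F0F0F;
    v = (v | (v << 2)) & 0x33333333;
    v = (v | (v << 1)) & 0x55555555;
    return v;
}

/* Morton (Z-order) index: interleave the bits of x and y. */
static uint32_t morton2d(uint32_t x, uint32_t y)
{
    return part1by1(x) | (part1by1(y) << 1);
}
```

Indexing the array as data[morton2d(x, y)] instead of data[y * width + x] keeps spatially close pixels close together, which is the property the tiling and Z-order suggestions above rely on.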
There are a variety of ways to deal with bank conflicts: the size of the elements you are working with, the strides between lines, and shifting the coordinates around to different memory addresses. It's never going to be as good as non-random, conflict-free access, though, so depending on the image you will see significantly different compute times.
See http://cuda-programming.blogspot.com/2013/02/bank-conflicts-in-shared-memory-in-cuda.html
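One of the cheapest of those stride tricks is padding the local tile by one element per row, so that accesses walking down a column no longer all land in the same bank. A minimal OpenCL C sketch (tile size and kernel name are made up, and it assumes the image dimensions are multiples of TILE):

```c
#define TILE 32

__kernel void filter(__global const float *src, __global float *dst, int width)
{
    /* +1 padding: the row stride is 33 floats instead of 32, so work items
       reading the same column hit different banks on a 32-bank device. */
    __local float tile[TILE][TILE + 1];

    int lx = get_local_id(0);
    int ly = get_local_id(1);
    int gx = get_global_id(0);
    int gy = get_global_id(1);

    tile[ly][lx] = src[gy * width + gx];      /* cooperative load of the tile */
    barrier(CLK_LOCAL_MEM_FENCE);

    /* ... filtering that may read tile[some_row][lx] without column conflicts ... */
    dst[gy * width + gx] = tile[ly][lx];
}
```

Padding helps the strided part of the access pattern; truly random accesses within the tile will still hit occasional conflicts, which matches the variable compute times mentioned above.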