I'm writing a program that analyzes a picture and returns the most prominent color. It's simple to get the most frequently occurring color, but I've found that very often this color is a dark black/gray/brown or a white, and not the "color" you would associate with the image. So I'd like to get the top 5 colors and compare them based on some metric to determine which color is most "vibrant/colorful", and return that color.
Saturation won't work in this case because a saturated black will be ranked above a lighter pink, and brightness/luminance won't work because a white will be ranked above a darker red. I want to know what metric I can use to judge this. I recognize this is kind of an obtuse question, but I know of other programs that do similar things, so I assume there must be some way to calculate "vibrancy/colorfulness". It doesn't need to be perfect, just work most of the time.
For what it's worth, I'm working in JavaScript, but the actual code is not the issue; I just need the equation I can use, and then I can implement it.
There is no common way to define the "vibrancy" of a color, so you can try combining multiple metrics, such as saturation, brightness, and luminance, into a distance from an "ideal" vibrant color. The lower the overall metric, the better. The following is an example in pseudocode.
// Compare each metric to its "ideal" value
var deltaSat = Saturation(thisColor) - idealSat;
var deltaBright = Brightness(thisColor) - idealBrightness;
var deltaLum = Luminance(thisColor) - idealLum;
// Calculate the overall distance from the ideal;
// the lower, the better.
var dist = Math.sqrt((deltaSat * deltaSat) +
    (deltaBright * deltaBright) +
    (deltaLum * deltaLum));
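To make this concrete, here is a minimal runnable sketch; the ideal targets and the HSL-based stand-ins for Saturation/Brightness/Luminance are assumptions to tune, not fixed recommendations:

// Minimal sketch: score a color by its distance from assumed "ideal" targets.
type RGB = [number, number, number];

// Saturation and lightness from the standard RGB -> HSL conversion
function satAndLightness([r, g, b]: RGB): [number, number] {
  const max = Math.max(r, g, b) / 255;
  const min = Math.min(r, g, b) / 255;
  const l = (max + min) / 2;
  const s = max === min ? 0 : (max - min) / (1 - Math.abs(2 * l - 1));
  return [s, l];
}

// Relative luminance (Rec. 709 weights)
function luminance([r, g, b]: RGB): number {
  return (0.2126 * r + 0.7152 * g + 0.0722 * b) / 255;
}

// Lower score = closer to the "ideal" vibrant color
function vibrancyDistance(color: RGB): number {
  const [idealSat, idealLightness, idealLum] = [1.0, 0.5, 0.5]; // assumed targets
  const [s, l] = satAndLightness(color);
  const dS = s - idealSat;
  const dL = l - idealLightness;
  const dY = luminance(color) - idealLum;
  return Math.sqrt(dS * dS + dL * dL + dY * dY);
}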
(If your issue is merely that you're having trouble calculating a metric for a given color, see my page on color topics for programmers.)
If your criteria for "vibrancy" are complex enough, you should consider using machine learning techniques such as classification algorithms. In machine learning in general:
You train a model to recognize different categories (such as "vibrant" and "non-vibrant" colors in this case).
You test the model to check how well it performs.
Once the model works well, you deploy the model and use it to predict whether a color is "vibrant" or "non-vibrant".
Machine learning is quite complex, however, so you should first try the simpler method given earlier in this answer.
After trying several different formulas, I had the most success with the following:
let colorfulness = ((max + min) * (max - min)) / max
where max and min are the highest and lowest of the color's RGB values, respectively. This page has a more detailed explanation of the formula itself.
This will return a value between 0 and 255, with 0 being least colorful and 255 being most colorful. For example, pure red (255, 0, 0) scores ((255 + 0) * (255 - 0)) / 255 = 255, while white (255, 255, 255) scores 0. From running this on a bunch of different colors, I found that for my application any value above 50 was colorful enough, though you can adjust this.
My final code is as follows
function getColorFromImage(image) {
    // gets the three most commonly occurring, distinct colors in an image
    // as [r, g, b] arrays, in order of their frequency
    let palette = getPaletteFromImage(image, 3)
    for (let color of palette) {
        let colorfulness = 0
        // (0,0,0) would cause a division by zero (NaN) in the formula,
        // so leave its colorfulness at the default 0
        if (!(color[0] === 0 && color[1] === 0 && color[2] === 0)) {
            // get the min & max RGB values
            let min = Math.min(...color)
            let max = Math.max(...color)
            // calculate the colorfulness of the color
            colorfulness = ((max + min) * (max - min)) / max
        }
        // compare the color's colorfulness against a threshold to determine
        // if the color is "colorful" enough
        // I've found 50 is a good threshold, but adjust as needed
        if (colorfulness >= 50.0) {
            return color
        }
    }
    // if none of the colors are deemed sufficiently colorful,
    // just return the most common
    return palette[0]
}
I have read some articles on Perlin noise, but each seems to have its own way of implementing it:
In this article, the gradient function returns a single double value.
In this article, the gradient is generated as a 3D vector.
In this article, a static array of 256 random gradient vectors is generated, a random one is picked using the permutation table, and then more complex details of spherical gradients are discussed.
And these are just a few of the articles I saw. With all these variations of the same algorithm, which one do I use, or which one is suitable for what purpose?
I have generated terrains and height maps with each of these techniques, and their respective outputs differ widely in their own ways. I can't tell if I am doing it right, because I don't know what to look for in the output (since it's just random values at the end).
I am just looking for some context on when to use what, so any insight would be very useful.
There are multiple ways to implement the same algorithm; some are faster or slower than others, and some are easier or harder to understand. The original implementation by Ken Perlin is difficult to understand just by looking at it. So some of the articles you linked (including #2, which I wrote, yay!) try to simplify the implementation to make it easier to understand.
But in the end, it's exactly the same algorithm:
Take the input and calculate the coordinates of the corners of the square (for 2D Perlin noise; the cube if using the 3D version) containing the input point
Calculate a random value for all 4 of them: first assign a random gradient vector to each one (there are 4 possibilities in 2D: (+1, +1), (-1, +1), (-1, -1) and (+1, -1)), then calculate the dot product between this random gradient vector and the vector from the corner of the square to the input point
Finally, smoothly interpolate between those 4 random values to get a final value (see the sketch after this list)
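For reference, here is a minimal sketch of those three steps (not taken from any of the linked articles); the shuffled 256-entry permutation table and the 4 diagonal gradients are the common simplified choices, not the only ones:

// Minimal 2D Perlin noise sketch following the three steps above
const perm: number[] = [];
for (let i = 0; i < 256; i++) perm[i] = i;
for (let i = 255; i > 0; i--) {                 // Fisher-Yates shuffle
  const j = Math.floor(Math.random() * (i + 1));
  [perm[i], perm[j]] = [perm[j], perm[i]];
}
for (let i = 0; i < 256; i++) perm[256 + i] = perm[i]; // repeat to avoid wrapping

const grads = [[1, 1], [-1, 1], [-1, -1], [1, -1]]; // the 4 possibilities in 2D

const fade = (t: number) => t * t * t * (t * (t * 6 - 15) + 10); // smooth curve
const lerp = (a: number, b: number, t: number) => a + t * (b - a);

// Dot product between a corner's gradient and the offset to the input point
function gradDot(hash: number, dx: number, dy: number): number {
  const [gx, gy] = grads[hash & 3];
  return gx * dx + gy * dy;
}

function perlin2(x: number, y: number): number {
  // Step 1: the corners of the containing square
  const xi = Math.floor(x) & 255, yi = Math.floor(y) & 255;
  const fx = x - Math.floor(x), fy = y - Math.floor(y);
  // Step 2: a pseudo-random dot product for each corner
  const n00 = gradDot(perm[perm[xi] + yi], fx, fy);
  const n10 = gradDot(perm[perm[xi + 1] + yi], fx - 1, fy);
  const n01 = gradDot(perm[perm[xi] + yi + 1], fx, fy - 1);
  const n11 = gradDot(perm[perm[xi + 1] + yi + 1], fx - 1, fy - 1);
  // Step 3: smoothly interpolate between the 4 corner values
  const u = fade(fx), v = fade(fy);
  return lerp(lerp(n00, n10, u), lerp(n01, n11, u), v);
}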
In article #1, the grad function returns the dot product directly, whereas in article #2, vector objects are created and a dot-product function is called to make explicit what is being done (this will probably be a bit slower than the other implementations, since a lot of vector objects are created and briefly used each time you run the algorithm).
Whether 2 implementations will produce the same terrains / height maps depends on whether they generate the same random values for each corner of the square/cube (the results of the dot products). If 2 algorithms generate the same random values for every single integer point on the grid (all the corners of all the possible squares/cubes), then they will produce the same results. Ken Perlin's original implementation and the 3 articles all use an array of integers to generate a random gradient vector for each corner (out of 4 possible choices) to calculate the dot product. So in theory, if the arrays are identical, they should produce the same results (unless some implementation uses another method to generate the random vectors).
I'm not really sure if that answers your questions, so don't hesitate to ask something else :)
Edit:
Generally, you would not use Perlin noise alone. So for every final value you want (for example a single pixel in a height map texture), you would call the noise function multiple times (octaves). For example:
float finalValue = 0.0f;
float amplitude = 1.0f;
float frequency = 1.0f;
int octaveCount = 8;
for (int octave = 0; octave < octaveCount; ++octave) {
    finalValue += amplitude * noise(x * frequency, y * frequency, z * frequency);
    amplitude *= 0.5f;
    frequency *= 2.0f;
}
// Do something fun with 'finalValue'
Frequency, amplitude and the number of octaves are the most common parameters you can play with to produce different values.
If, say, you are generating a terrain, you would want many octaves. The first one produces the rough shape of the mountains, so you would want a high amplitude (1.0 in the example code) and a low frequency (also 1.0 in the example code). But this octave alone would result in really smooth terrain with no details. For the small details, you would want more octaves with higher frequencies (so over the same range of inputs (x, y, z), the Perlin noise value would have many more ups and downs) and lower amplitudes. You want the details small, because if you kept the same amplitude as the first octave (1.0 in the example code), there would be a lot of tall ups and downs really close together, resulting in really rough mountains (imagine 100-meter drops and 80-degree slopes every few meters you walk).
You can play with those parameters to get different results. There is also something called "domain warping" or "warped noise" that you can look up. Basically, you feed the output of noise calls into the input of another noise call. So instead of calling:
float result = noise(x, y, z);
You would call something like:
// The numbers used are arbitrary values, you can just play around until you get something cool
float result = noise(noise(x * 1.7), 0.5 * noise(y * 4.1), noise(z * 2.3));
This can produce really interesting results.
I understand that domain coloring, or color-wheel plotting, is typical for complex functions.
Incredibly, I can't find anything in a web search that easily allows me to reproduce a piece of art like this one on Wikipedia:
There is this online resource that reproduces plots with zeros in black - not bad at all... However, I'd like to ask for some simple annotated code in Octave to produce color plots of functions of complex numbers.
Here is an example:
I see here code to plot a complex function. However, it uses a different technique, with the height representing the real part of the function's value and the color representing the imaginary part.
Peter Kovesi has some fantastic color maps. He provides a MATLAB function, called colorcet, that we can use here to get the cyclic color map we need to represent the phase. Download this function before running the code below.
Let's start with creating a complex-valued test function f, where the magnitude increases from the center, and the phase is equal to the angle around the center. Much like the example you show:
% A test function
[xx,yy] = meshgrid(-128:128,-128:128);
z = xx + yy*1i;
f = z;
Next, we'll get its phase, convert it into an index into the colorcet C2 color map (which is cyclic), and finally reshape that back into the original function's shape. out here has 3 dimensions: the first two are the original dimensions, and the last one is RGB. imshow shows such a 3D matrix as a color image.
% Create a color image according to phase
cm = colorcet('C2');
phase = floor((angle(f) + pi) * ((size(cm,1)-1e-6) / (2*pi))) + 1;
out = cm(phase,:);
out = reshape(out,[size(f),3]);
The last part is to modulate the intensity of these colors using the magnitude of f. To create the discontinuities at powers of two, we take the base-2 logarithm, apply the modulo operation, and raise 2 to that power again. A simple multiplication with out decreases the intensity of the color where necessary:
% Compute the intensity, with discontinuities for |f|=2^n
magnitude = 0.5 * 2.^mod(log2(abs(f)),1);
out = out .* magnitude;
That last multiplication works in Octave and in the later versions of MATLAB. For older versions of MATLAB you need to use bsxfun instead:
out = bsxfun(@times, out, magnitude);
Finally, display using imshow:
% Display
imshow(out)
Note that the colors here are more muted than in your example. The colorcet color maps are perceptually uniform, meaning that the same change in angle leads to the same perceptual change in color. In the example you posted, yellow is a very narrow, bright band. Such a band falsely highlights certain features in the function that might not be relevant at all. Perceptually uniform color maps are very important for proper interpretation of the data. Note also that this particular color map has easily named colors (purple, blue, green, yellow) in the four cardinal directions: a purely real value is green (positive) or purple (negative), and a purely imaginary value is blue (positive) or yellow (negative).
There is also a great online tool made by Juan Carlos Ponce Campuzano for color wheel plotting.
In my experience it is much easier to use than the Octave solution. The downside is that you cannot use perceptually uniform coloring.
I am monitoring an audio source and visualizing the power of each channel. I get a number out of the api (averagePowerForChannel, but the language/platform shouldn't be important for this problem).
When I add both numbers together I have a scale from -240...0. This makes sense as this is the decibel range.
I transform this scale to a linear representation of the same numbers from 0...1. (I understand that decibels are logarithmic; I leave that alone and just map the scale linearly.)
I then give the 0...1 value to an alpha channel that nicely represents the audio being played.
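For concreteness, a minimal sketch of the linear mapping just described (the function name is mine, and it assumes the -240...0 range above):

// Sketch of the linear mapping described above: -240 dB -> 0, 0 dB -> 1
function dbToLinear(db: number): number {
  return (db + 240) / 240;
}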
The problem is that it's not showing enough change aesthetically. The value shifts slightly and usually hovers around 0.8:
alpha: 0.820713937282562
alpha: 0.816978693008423
alpha: 0.830410122871399
...
As you might imagine, this just creates a mild flicker.
Instead I'd like to accentuate the peaks of the audio. I have thrown some different methods at it:
// var alpha = 1 / (1 + exp(1-linear)) // never gets fully bright, sits at about .45
// var alpha = 1 - exp2(-linear) // stays around .45
// var alpha = linear / linear + 1
These did not get me a good result, but then again I didn't have any idea what I was trying to do.
Goal:
Low values in the range get pushed to zero or near zero (I could even shift the range down 0.2 after the curve is calculated)
Mid values are pushed lower
High values have their differences accentuated (e.g.: 0.83 is shifted very close to 1, but 0.81 is shifted to 0.5)
I think I might want an exponential curve? I'm not sure. This is a very specific problem with known inputs so a magic number solution is acceptable.
I get a satisfactory visual by shifting the range to an interesting area, then using an exponential curve to emphasize changes from there on:
var alpha = volume / maxVolume
alpha = alpha - 0.5 // Shift the range over to the area with interesting differences in our source tracks
alpha = pow(alpha, 3) // Emphasize the changes in this range
alpha *= 10 // Fix the decimal place
Will accept a better/more pure answer--for this I just wiggled numbers until they got me a good visual result. I'm sorry for grossing out the CS folks here :)
The best answer may be frequency isolation, but there is enough interesting difference to make a good visual without it.
Not sure why you don't invert the logarithmic scale:
decibels = 10 * log10( value );
the inverse is just algebra:
value = pow(10.0, decibels/10);
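As a minimal sketch of that inversion in code (the function name is just for illustration):

// Invert decibels = 10 * log10(value); with db in [-240, 0],
// the result lies in (0, 1]
function alphaFromDecibels(db: number): number {
  return Math.pow(10, db / 10);
}
// e.g. alphaFromDecibels(-10) === 0.1, alphaFromDecibels(-20) === 0.01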
Observe that since your decibels are between -240 and 0, the value is between 0 (exclusive) and 1 (inclusive). That should ensure your values are more widely distributed. However, if they still aren't, then your audio configuration might not be detecting significant changes in average amplitude - a not-so-unlikely possibility. In that case you might have to look at decomposing the audio into particular frequencies and looking at the amplitude of each frequency.
I'm trying to achieve the ramp effect as seen here:
(source: splashdamage.com)
Blending the textures based on a distribution pattern is easy. Basically, just this (HLSL):
Result = lerp(SampleA, SampleB, DistributionPatternSample);
Which works, but without the ramp.
http://aaronm.nuclearglory.com/private/stackoverflow/result1.png
My first guess was that to incorporate "Ramp Factor" I could just do this:
Result = lerp(A, B, (1.0f - Ramp)*Distribution);
However, that does not work, because if Ramp is 1.0 the blend factor (1.0f - Ramp)*Distribution is zero everywhere, causing just 'A' to be used. This is what I get when Ramp is 1.0f with that method:
http://aaronm.nuclearglory.com/private/stackoverflow/result2.png
I've attempted to just multiply the ramp with the distribution, which is obviously incorrect. (Figured it's worth a shot to try and discover interesting effects. No interesting effect was discovered.)
I've also attempted subtracting the Ramp from the Distribution, like so:
Result = lerp(A, B, saturate(Distribution - Ramp));
But the issue with that is that the ramp is meant to control sharpness of the blend. So, that doesn't really do anything either.
I'm hoping someone can inform me what I need to do to accomplish this, mathematically. I'm trying to avoid branching because this is shader code. I can simulate branching by multiplying out results, but I'd prefer not to do this. I am also hoping someone can fill me in on why the math is formulated the way it is for the sharpness. Throwing around math without knowing how to use it can be troublesome.
For context, that top image was taken from here:
http://wiki.splashdamage.com/index.php/A_Simple_First_Megatexture
I understand how MegaTextures (the clip-map approach) and Virtual Texturing (the more advanced approach) work just fine. So I don't need any explanation on that. I'm just trying to implement this particular blend in a shader.
For reference, this is the distribution pattern texture I'm using.
http://aaronm.nuclearglory.com/private/stackoverflow/distribution.png
Their ramp width is essentially just a contrast change on the distribution map. A brute-force version of this is a simple rescale and clamp.
The things we want to preserve are that 0.5 maps to 0.5, and that the texture goes from 0 to 1 over a region of width w.
This gives
x = 0.5 + (x-0.5)/w
This means the final HLSL will look something like this:
Result = lerp(A, B, clamp( 0.5 + (Distribution-0.5)/w, 0, 1) );
Now, if this ends up looking jaggy at the edges, you can switch to using smoothstep (in HLSL, the value being smoothed is the last argument). In which case you'd get:
Result = lerp(A, B, smoothstep(0, 1, 0.5 + (Distribution-0.5)/w) );
However, one thing to keep in mind here is that this type of thresholding works best with smooth-ish distribution patterns. I'm not sure if yours is going to be smooth enough (unless that is a small version of a megatexture, in which case you're probably OK).
I am looking for a fairly simple image comparison method in AS3. I have taken an image from a web cam (with no subject) and passed it into BitmapData; then a second image is taken (this time with a subject) to compare against this data. From these two images, I would like to create a mask from the pixels that match on both bitmaps. I have been scratching my head for a while and am not really making any progress. Could anyone point me in the right direction for a pixel-comparison method, something like getPixel32()?
Cheers
Jono
Use compare() to create a difference between the two, and then use threshold() to extract the parts that interest you.
Edit: actually, it is pretty straightforward. The trick is to apply the threshold multiple times, once per channel, using the mask parameter (otherwise the comparison makes little sense, since 0x010000 (which is almost black) is considered greater than 0x0000FF (which is anything but black)). Here's how:
var dif:BitmapData; // your original BitmapData (the result of compare())
var mask:BitmapData = new BitmapData(dif.width, dif.height, true, 0);
const threshold:uint = 0x20;
for (var i:int = 0; i < 3; i++)
    mask.threshold(dif, dif.rect, new Point(), ">", threshold << (i * 8), 0xFF000000, 0xFF << (i * 8));
This creates a transparent mask. The threshold is then applied for all three channels, setting the alpha channel to fully opaque wherever the channel's value exceeds the threshold value (you might want to decrease it).
You can isolate the foreground object ("the guy in front of the webcam") by copying the alpha channel from the mask to the current video image.
One of the problems here is that you want to find whether a pixel has ANY change to it, and if it does, convert that pixel to another color (for masking). Unfortunately, a webcam's quality isn't great, so even if your scene does not change at all, the BitmapData coming from the webcam will change slightly. Therefore, when your subject steps into frame, you will get pixel changes for the subject, but also noise in other areas due to lighting changes or camera quality. What you'll need to do is write a function that analyzes the result of BitmapData.compare() for change in areas larger than _____ to determine if there is enough change to warrant an actual object being there. That will help remove noise and make your mask more accurate.
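As a rough sketch of that idea (in TypeScript rather than AS3, operating on a boolean difference mask; the block size and per-block threshold are arbitrary assumptions to tune):

// Sketch: suppress isolated noise by keeping only blocks of the
// difference mask that contain enough changed pixels.
// `changed` is a width*height array (true = pixel differs between frames).
function denoiseMask(
  changed: boolean[], width: number, height: number,
  blockSize = 8,   // assumed block size
  minChanged = 16  // assumed minimum changed pixels per block
): boolean[] {
  const out = new Array<boolean>(changed.length).fill(false);
  for (let by = 0; by < height; by += blockSize) {
    for (let bx = 0; bx < width; bx += blockSize) {
      // Count changed pixels in this block
      let count = 0;
      for (let y = by; y < Math.min(by + blockSize, height); y++)
        for (let x = bx; x < Math.min(bx + blockSize, width); x++)
          if (changed[y * width + x]) count++;
      // Keep the block only if the change is large enough to be a real object
      if (count >= minChanged)
        for (let y = by; y < Math.min(by + blockSize, height); y++)
          for (let x = bx; x < Math.min(bx + blockSize, width); x++)
            out[y * width + x] = changed[y * width + x];
    }
  }
  return out;
}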