Google vision not returning 'Max Results' - google-cloud-vision

When using Vision to scan photos for objects, no matter what I do, I can't get it to return more than 98 labels. I'm training the algorithm to identify and count all the identifiable objects in the image, but it seems to give up after 98. I've tried lowering the tolerance and upping the max results, but it still stalls out at 98 labels. Any ideas? I need it to return upwards of 300-500 labels per image.

Related

Turning a band of Sentinel 2 image into an array

I am new to Google Earth Engine and have started playing with mathematically combining different bands to define a new index. The problem I am having is the visualisation of the new index: I need to define the max and min parameters when adding it to the map, and I am having trouble understanding what these two end points should be. So here are my two questions:
Is it possible to get the matrix of my image in terms of pixel values? Then I could easily see from what values they range and hence could define min and max!
What values are taken in different bands? Is it from 0 to 1 and measures intensity at given wavelength, or is it something else?
Any help would be much appreciated, many thanks in advance!
Is it possible to get the matrix of my image in terms of pixel values? Then I could easily see from what values they range and hence could define min and max!
If this is what you want to do, there's a built in way to do it. Go to the layer list, click on the gear for the layer, and in the “Range” section, pick one of the “Stretch:” options from the menu, then click “Apply”. You can choose a range in standard deviations, or 100% (min and max).
You can then use the “Import” button to save these parameters as a value you can use in your script.
(All of this applies to the region of the image that's currently visible on screen — not the entire image.)
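For intuition about what those stretch options mean, here is a minimal sketch in plain Python. It is not Earth Engine code, and I am assuming the conventional definitions (100% is the min and max, an n-sigma stretch is the mean plus or minus n standard deviations); the Code Editor may compute its ranges differently. The function name `stretch_range` and the sample values are hypothetical.

```python
import statistics

def stretch_range(values, n_sigma=None):
    """Return a (min, max) display range for a collection of pixel values.

    n_sigma=None -> full range, i.e. a "100%" stretch (assumed meaning)
    n_sigma=k    -> mean +/- k population standard deviations (assumed meaning)
    """
    values = list(values)
    if n_sigma is None:
        return min(values), max(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return mean - n_sigma * sd, mean + n_sigma * sd

# Hypothetical index values sampled from the visible region
pixels = [-0.12, 0.05, 0.31, 0.44, 0.58, 0.61, 0.72]
print(stretch_range(pixels))             # full range
print(stretch_range(pixels, n_sigma=2))  # mean +/- 2 standard deviations
```

A sigma-based range is usually less sensitive to a few extreme pixels than the plain min and max.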
What values are taken in different bands? Is it from 0 to 1 and measures intensity at given wavelength, or is it something else?
This is entirely up to the individual dataset you are using; Earth Engine only knows about numbers stored in bands and not units of measure or spectra. There may be sufficient information in the dataset's description in the data catalog, or you may need to consult the original provider's documentation.

Google Places API using lat/long returns a mix of results in the right and wrong locations for the same call

I am using an R package, googleway, for calls to the Places API. I am currently trying to return some eateries/restaurants and testing some common search terms to see how results differ. When I put in the word "restaurant" or "food" it all seems to be fine. When there are fewer than 20 results in a particular area, the API returns between 1 and 20 as expected. Then I put in "drinks", and suddenly half the results are in a radius of the lat/long I want (currently in Florida), and the other half are near me, which is quite far from Florida.
google_places(search_string = 'drinks',
              location = c(27.638332, -81.824000),
              radius = 1000,
              key = api_key)
This particular example, with some test coordinates of a street corner in Bowling Green, FL, consistently returns a mix of results from two entirely different states. I cannot tell if this is due to the keyword I'm using, an error/bug in Maps/Places, or something else entirely. My hunch is that I should just rely on other keywords like "bar" or "nightclub", but it is rather odd that "drinks" behaves this way, so I would like to know why for future reference.

What is the maximum possible identicons count on GitHub?

GitHub provides default identicons as the profile picture. You can get one for your account too, like this.
There are many other implementations on GitHub that use a similar approach and produce similar results.
In their implementation they choose to color some of the 25 tiles and then use one color to fill all the non-blank tiles. Let's assume that they use a total of 20 colors to fill the tiles, although they don't actually use that many.
This way, the maximum possible identicon count seems to be only 20 x 2^15, because the identicon is mirrored about the vertical axis and only 15 tiles are used to generate it.
Now, 20 x 2^15 = 655360 is a very low number compared to the total number of users on GitHub.
So my question is: are collisions guaranteed in this case, or am I missing something?
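To make the arithmetic concrete, here is a small sketch under the question's own assumptions (20 colors, 15 freely chosen tiles). Neither number is confirmed as GitHub's real scheme, and the helper names are mine. It also gives the standard birthday-problem approximation for how many accounts it takes before a collision becomes more likely than not, assuming identicons are spread uniformly.

```python
import math

COLORS = 20      # assumption from the question, not GitHub's actual palette size
FREE_TILES = 15  # 5x5 grid mirrored about the vertical axis: 3 columns x 5 rows

def identicon_count(colors: int = COLORS, free_tiles: int = FREE_TILES) -> int:
    """Each free tile is either filled or blank, and one fill color is chosen."""
    return colors * 2 ** free_tiles

def users_for_half_collision_chance(n: int) -> int:
    """Birthday-problem approximation: about sqrt(2 * n * ln 2) users give
    a roughly 50% chance that at least two share an identicon."""
    return math.ceil(math.sqrt(2 * n * math.log(2)))

total = identicon_count()
print(total)  # 655360
print(users_for_half_collision_chance(total))
```

By the pigeonhole principle, any user count above `total` forces at least one collision; under the uniform assumption, collisions become likely long before that point.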

Exclude graph values above certain point

I would like to ensure that when looking at my web-server response time graphs I can see a good level of detail from 0-5k on the scale of my graph. However, occasionally there are metrics above the 5k mark (file downloads), which increase the scale of the graph and make it difficult to see what is going on around the regular range of values.
How do I exclude metric values above 5k from being plotted? Bear in mind that I do not want the metrics themselves to be excluded.
Or perhaps the best thing to do would be to scale down the high points with log, but then I lose the actual scale information, which is quite useful at a glance.
Any help appreciated.
From the Graphite Documentation:
http://graphite.readthedocs.org/en/latest/render_api.html#ymax
Default: The highest value of any of the series displayed
Manually sets the upper bound of the graph. Can be passed any integer
or floating point number.
Example:
&yMax=0.2345
It looks like the yMax parameter was only a suggestion at one point. It is reported to be strictly enforced as of 0.9.5. For more: https://bugs.launchpad.net/graphite/+bug/412663
Also, from: http://graphite.wikidot.com/url-api-reference
yMin and yMax set the minimum and maximum y-values for the generated
image. A good use of these parameters would be min=0&max=100 when the
value you are graphing is a percentage.
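As a rough sketch of how this applies to the 0-5k case, the snippet below builds a render URL with yMin and yMax set. The host `graphite.example.com` and the metric path `servers.web01.response_time` are made-up placeholders; `target`, `from`, `yMin` and `yMax` are render API parameters.

```python
from urllib.parse import urlencode

def render_url(base_url, target, y_min=0, y_max=5000, time_from="-1h"):
    """Build a Graphite render URL whose y-axis is pinned to [y_min, y_max]."""
    params = [
        ("target", target),
        ("from", time_from),
        ("yMin", y_min),
        ("yMax", y_max),
    ]
    return f"{base_url}/render?{urlencode(params)}"

# Hypothetical host and metric name, for illustration only
url = render_url("http://graphite.example.com", "servers.web01.response_time")
print(url)
```

This only changes the axis of the rendered image; the stored metrics are untouched, which matches the requirement that the values themselves not be excluded.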
Some other finds. Not sure if they're entirely relevant; might be helpful.
graphite-graph-dsl: A small DSL to describe graphite graphs
https://github.com/behrendsj/graphite-graph-dsl
Added ability to define the right y-axis min and max values: https://github.com/behrendsj/graphite-graph-dsl/commit/11e146b0b3eb82faa7c1f5db5af324c81db66144
graphene: Graphene is a realtime dashboard & graphing toolkit based on D3 and Backbone.
https://github.com/jondot/graphene
Define yMax support: https://github.com/jondot/graphene/pull/33

Accurately measuring relative distance between a set of fiducials (Augmented reality application)

Let's say I have a set of 5 markers. I am trying to find the relative distances between each marker using an augmented reality framework such as ARToolkit. In my camera feed the first 20 frames show me the first 2 markers only, so I can work out the transformation between those 2 markers. The second 20 frames show me the 2nd and 3rd markers only, and so on. The last 20 frames show me the 5th and 1st markers. I want to build up a 3D map of the positions of all 5 markers.
My question is, knowing that there will be inaccuracies with the distances due to low quality of the video feed, how do I minimise the inaccuracies given all the information I have gathered?
My naive approach would be to use the first marker as a base point, take the mean of the transformations from the first 20 frames to place the 2nd marker, and so forth for the 3rd and 4th. The 5th marker would be placed in between the 4th and 1st, at the midpoint of the positions implied by the mean transformations between the 4th and 5th and between the 5th and 1st. I feel this approach is biased towards the placement of the first marker, though, and it doesn't take into account the camera seeing more than 2 markers per frame.
Ultimately I want my system to be able to work out the map of x markers. In any given frame up to x markers can appear, and there are non-systematic errors due to the image quality.
Any help regarding the correct approach to this problem would be greatly appreciated.
Edit:
More information regarding the problem:
Let's say the real-world map is as follows:
Let's say I get 100 readings for each of the transformations between the points, as represented by the arrows in the image. The real values are written above the arrows.
The values I obtain have some error (assumed to follow a Gaussian distribution about the actual value). For instance, one of the readings obtained for marker 1 to 2 could be x:9.8 y:0.09. Given all these readings, how do I estimate the map? The result should ideally be as close to the real values as possible.
My naive approach has the following problem. If the average of the transforms from 1 to 2 is slightly off, the placement of 3 can be off even though the reading of 2 to 3 is very accurate. This problem is shown below:
The greens are the actual values, the blacks are the calculated values. The average transform of 1 to 2 is x:10 y:2.
You can use a least-squares method to find the transformation that gives the best fit to all your data. If all you want is the distance between the markers, this is just the average of the distances measured.
Assuming that your marker positions are fixed (e.g., to a fixed rigid body), and you want their relative position, then you can simply record their positions and average them. If there is a potential for confusing one marker with another, you can track them from frame to frame, and use the continuity of each marker location between its two periods to confirm its identity.
If you expect your rigid body to be moving (or if the body is not rigid, and so forth), then your problem is significantly harder. Two markers at a time is not sufficient to fix the position of a rigid body (which requires three). However, note that, at each transition, you have the location of the old marker, the new marker, and the continuous marker, at almost the same time. If you already have an expected location on the body for each of your markers, this should provide a good estimate of a rigid pose every 20 frames.
In general, if your body is moving, best performance will require some kind of model for its dynamics, which should be used to track its pose over time. Given a dynamic model, you can use a Kalman filter to do the tracking; Kalman filters are well-adapted to integrating the kind of data you describe.
By including the locations of your markers as part of the Kalman state vector, you may be able to deduce their relative locations purely from sensor data (which appears to be your goal), rather than requiring this information a priori. If you want to be able to handle an arbitrary number of markers efficiently, you may need to come up with some clever mutation of the usual methods; your problem seems designed to avoid solution by conventional decomposition methods such as sequential Kalman filtering.
Edit, as per the comments below:
If your markers yield a full 3D pose (instead of just a 3D position), the additional data will make it easier to maintain accurate information about the object you are tracking. However, the recommendations above still apply:
If the labeled body is fixed, use a least-squares fit of all relevant frame data.
If the labeled body is moving, model its dynamics and use a Kalman filter.
New points that come to mind:
Trying to manage a chain of relative transformations may not be the best way to approach the problem; as you note, it is prone to accumulated error. However, it is not necessarily a bad way, either, as long as you can implement the necessary math in that framework.
In particular, a least-squares fit should work perfectly well with a chain or ring of relative poses.
In any case, for either a least-squares fit or for Kalman filter tracking, a good estimate of the uncertainty of your measurements will improve performance.
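To illustrate the least-squares idea on a ring of relative measurements, here is a minimal sketch under strong simplifying assumptions: fixed markers, 2D translation-only offsets (no rotation), and the same uncertainty on every edge. With those assumptions the least-squares solution has a closed form: average the readings per edge, then spread the loop-closure error equally over all edges. The function name `fit_ring` and the sample numbers are hypothetical, apart from the x:10 y:2 average taken from the question.

```python
from statistics import fmean

def fit_ring(readings):
    """Least-squares marker positions for a closed ring of relative offsets.

    readings[i] is a list of (dx, dy) readings from marker i to marker
    (i + 1) % n. Marker 0 is fixed at the origin. Assumes translation-only
    measurements with equal uncertainty on every edge.
    """
    n = len(readings)
    # Average the repeated readings of each edge.
    means = [(fmean(r[0] for r in edge), fmean(r[1] for r in edge))
             for edge in readings]
    # Around a closed loop the true offsets sum to zero; whatever is left
    # over is the loop-closure error.
    err_x = sum(m[0] for m in means)
    err_y = sum(m[1] for m in means)
    # Equal-weight least squares removes an equal share from every edge.
    corrected = [(mx - err_x / n, my - err_y / n) for mx, my in means]
    # Chain the corrected offsets to get absolute positions.
    positions = [(0.0, 0.0)]
    for dx, dy in corrected[:-1]:
        x, y = positions[-1]
        positions.append((x + dx, y + dy))
    return positions

# Three markers; the first edge averages to x:10 y:2, while the other
# two edges are assumed to be measured accurately.
ring = [[(10.0, 2.0)], [(0.0, 10.0)], [(-10.0, -10.0)]]
for marker, (x, y) in enumerate(fit_ring(ring)):
    print(marker, round(x, 3), round(y, 3))
```

Compared with naive chaining, the bad first edge no longer pushes its full error onto every later marker. If the edges have different uncertainties, the error shares should be weighted accordingly, and with rotations involved a proper nonlinear least-squares solver is needed instead of this closed form.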

Resources