Rackspace Files Folder 'Alias/Redirect' - cdn

I am trying to come up with a 'versioned' data system. The different groups of data I have will be updated at different intervals and are quite large (map TIFFs, around the 50 GB mark), so I'd like to avoid duplicating content as much as possible. Say, for example, I have the following two categories of maps: Country Maps and City Maps. Country maps get updated quarterly and city maps get updated bi-annually. Over a period of six months, the folder structure I end up with is this:
RACKSPACE CONTAINER
|
|-JAN2014
| |
| |-Cities
| |-Countries
|
|-APR2014
| |
| |-[Cities] (Not a real folder, an alias/redirect to the Jan 2014 version)
| |-Countries
|
|-JUL2014
| |
| |-Cities
| |-Countries
|
|
My app is given the current data version for that time period (i.e. JAN2014, APR2014 or JUL2014) and uses it to form the URL to fetch the map file (e.g. blah.rackcdn.com/JAN2014/Cities/Map.file). I would like to be able to point an alias/redirect of blah.rackcdn.com/APR2014/Cities/Map.file (which doesn't exist, because the older cities map data is still valid) to the old folder. Hopefully that makes sense. Is there any way to accomplish this? Currently I'm using Cyberduck FTP to upload my files and directory structure to Rackspace.
If I am unable to achieve this with Rackspace, can it be done with any other file hosting service (e.g. Google Cloud Storage)?
Cheers

It is possible to simulate symlinks on Rackspace Cloud Files. See this blog post on how to do this using the Cloud Files REST API: http://developer.rackspace.com/blog/simulate-symLinks-on-cloud-files.html.
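As a rough illustration of the idea (not necessarily the blog post's exact steps): Cloud Files is built on OpenStack Swift, where a zero-byte object carrying an X-Object-Manifest header serves the content of the objects stored under the named container/prefix. The sketch below only builds such a request and sends nothing. The endpoint, container name and token are placeholders, and whether the CDN URL follows the manifest is an assumption to verify on your own account.

```python
from urllib.parse import quote


def alias_request(endpoint, container, alias_path, target_prefix, token):
    """Build the PUT request for a zero-byte manifest object.

    The object at alias_path carries no data of its own; the
    X-Object-Manifest header names the container/prefix whose
    objects should be served in its place.
    """
    url = f"{endpoint.rstrip('/')}/{quote(container)}/{quote(alias_path)}"
    headers = {
        "X-Auth-Token": token,  # placeholder credential
        "X-Object-Manifest": f"{quote(container)}/{quote(target_prefix)}",
        "Content-Length": "0",
    }
    return "PUT", url, headers


method, url, headers = alias_request(
    "https://storage.example.com/v1/ACCOUNT",  # hypothetical endpoint
    "maps",                                    # hypothetical container
    "APR2014/Cities/Map.file",                 # alias that does not exist yet
    "JAN2014/Cities/Map.file",                 # existing file to serve instead
    "TOKEN",
)
print(method, url)
print(headers["X-Object-Manifest"])
```

Because a manifest stands in for a single object, aliasing the whole APR2014/Cities folder this way would mean creating one such object per file.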

BigQuery to Data Studio : Show reliable COUNT DISTINCT regardless of the selected period

In my BigQuery project I store event data integrated from Firebase. The granularity and dimensions are such that trying to present raw data in Data Studio quickly makes the report VERY slow (1-2 min per page/interaction).
I then started to think how I could create pre-aggregated tables in BigQuery to speed everything up, but quickly realised COUNT DISTINCT metrics would be a problem with this approach.
Let me explain:
SELECT user, date
FROM UNNEST([
STRUCT("Adam" AS user, "20190923" AS date),
("Bob", "20190923"),
("Carl", "20190923"),
("Adam", "20190924"),
("Bob", "20190924"),
("Adam", "20190925"),
("Carl", "20190925"),
("Bob", "20190926")
]) AS website_visits;
+------+----------+
| User | Date     |
+------+----------+
| Adam | 20190923 |
| Bob  | 20190923 |
| Carl | 20190923 |
| Adam | 20190924 |
| Bob  | 20190924 |
| Adam | 20190925 |
| Carl | 20190925 |
| Bob  | 20190926 |
+------+----------+
The above is a table of website visits.
Clearly, creating a pre-aggregated table like
SELECT date, COUNT(DISTINCT user) FROM website_visits GROUP BY date
has the limitation that the count cannot be aggregated further (let alone dynamically) to get a total: a SUM would return 8 unique users, which is not correct, as there are only 3 unique users.
In BigQuery, this is fixed by using HLL_COUNT, which despite the approximation works ok for me.
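To make the difference concrete, here is a small Python sketch over the same eight rows. It contrasts summing per-day distinct counts with merging per-day sets before counting. The exact sets here merely stand in for HLL sketches, which can be merged in the same way but give an approximate count.

```python
from collections import defaultdict

# The website_visits rows from the query above.
visits = [
    ("Adam", "20190923"), ("Bob", "20190923"), ("Carl", "20190923"),
    ("Adam", "20190924"), ("Bob", "20190924"),
    ("Adam", "20190925"), ("Carl", "20190925"),
    ("Bob", "20190926"),
]

# One set of users per day (the exact analogue of one sketch per day).
users_by_day = defaultdict(set)
for user, date in visits:
    users_by_day[date].add(user)

# Pre-aggregated counts cannot be added up: users are counted once per day.
sum_of_daily_counts = sum(len(users) for users in users_by_day.values())

# Mergeable state can: union first, count afterwards.
merged_then_counted = len(set().union(*users_by_day.values()))

print(sum_of_daily_counts)   # 8
print(merged_then_counted)   # 3
```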
Now to the big question:
How can I do the same so that the result is displayable in Data Studio?
HLL_COUNT.EXTRACT is not available as a function there, and in the reporting I always have to keep in mind that the date range is set by the user however they like, so it's not possible to store a pre-aggregated result for ALL cases.
EDIT 1: APPROX_COUNT_DISTINCT
As per answer from Bobbylank, I tried to use APPROX_COUNT_DISTINCT.
However I found that this just seems to move the issue down the line. My fault for not explaining what's over there.
Although performance is acceptable, it does not seem possible to blend a data source with this calculated metric.
Example: After displaying the amount of unique users in the selected period (which now works), I'm also trying to display Average Revenue Per User (ARPU) in Data Studio like Firebase does.
To do this, I have to compute SUM(REVENUE) / APPROX_COUNT_DISTINCT(USER).
Clearly, REVENUE works fine with pre-aggregation and is available in the raw data. I then tried to blend the raw data with a table containing just user visits. However, APPROX_COUNT_DISTINCT can't be used in the blended data definition, as calculated metrics are not allowed.
Even using the USER field as a metric with Count Distinct aggregation, which returns the correct figures when revenue and user count are shown separately, fails when I try to divide them: the problem becomes aggregation (apply SUM or AVG to the field and the result is basically AVG(REVENUE/USERS) for each day).
I also tried to store REVENUE directly in the visits table, but was reminded by Data Studio that I can't mix dimensions and metrics in a calculated field.
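The aggregation problem can be reproduced outside Data Studio. In the sketch below the revenue figures are made up purely for illustration (only the users per day come from the example table). It shows that total revenue divided by unique users over the period is not the same number as the average of the daily revenue-per-user ratios.

```python
# Hypothetical daily revenue paired with the users seen on that day.
daily = {
    "20190923": (30.0, {"Adam", "Bob", "Carl"}),
    "20190924": (10.0, {"Adam", "Bob"}),
    "20190925": (20.0, {"Adam", "Carl"}),
    "20190926": (0.0, {"Bob"}),
}

# ARPU over the whole period: SUM(REVENUE) / COUNT(DISTINCT USER).
total_revenue = sum(revenue for revenue, _ in daily.values())
unique_users = len(set().union(*(users for _, users in daily.values())))
arpu = total_revenue / unique_users

# What an AVG over per-day ratios produces instead.
daily_ratios = [revenue / len(users) for revenue, users in daily.values()]
average_of_daily_ratios = sum(daily_ratios) / len(daily_ratios)

print(arpu)                     # 20.0
print(average_of_daily_ratios)  # 6.25
```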
APPROX_COUNT_DISTINCT might be more performance friendly for you?
https://support.google.com/datastudio/answer/9189108?hl=en
Otherwise the only way I can think of would be to pre-calculate several metrics (e.g. unique users on that day, 7-day cumulative, 14-day, etc.), as your customers require, for each single day.
Or you could provide a 2 page report with both of these methods with the caveat that the first can be used over a time period but will be much slower?
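A minimal sketch of the pre-calculation idea, in Python rather than SQL and using the example rows from the question. The function name and window sizes are illustrative. For each day it counts the unique users seen in the trailing window ending on that day, which is the kind of row a pre-aggregated table would store.

```python
from datetime import datetime, timedelta

visits = [
    ("Adam", "20190923"), ("Bob", "20190923"), ("Carl", "20190923"),
    ("Adam", "20190924"), ("Bob", "20190924"),
    ("Adam", "20190925"), ("Carl", "20190925"),
    ("Bob", "20190926"),
]


def rolling_unique_users(rows, window_days):
    """Map each visit date to the unique users in the trailing window."""
    by_day = {}
    for user, date in rows:
        day = datetime.strptime(date, "%Y%m%d").date()
        by_day.setdefault(day, set()).add(user)

    result = {}
    for day in sorted(by_day):
        window = set()
        for offset in range(window_days):
            window |= by_day.get(day - timedelta(days=offset), set())
        result[day.strftime("%Y%m%d")] = len(window)
    return result


print(rolling_unique_users(visits, 1))
print(rolling_unique_users(visits, 2))
```

Each stored row answers only its own fixed window, so a date range chosen freely in the report still cannot be served from such a table.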

Optimize complex scenario in Cucumber

I have been working on an automation project where I have to write Cucumber tests for a search filter. The search filter works dynamically, with nested parameters: each parameter's options are populated based on the previous parameter. E.g. on selecting "Subscribers", the next parameters in the dropdown are "Name", "City", "Network". Likewise, on selecting "Service Desk", the parameters in the subsequent dropdown are "Status", "Ticket no.", "Assignee". I am using a Scenario Outline as below:
Scenario Outline: As a user, I can search records
  Given I am on search page
  When I search on "<category>" and "<nestedfilter>"
  Then I see records having "<category>" category
  Examples:
    | category     | nestedfilter |
    | Subscribers  | Name         |
    | Subscribers  | City         |
    | Subscribers  | Network      |
    | Service Desk | Status       |
    | Service Desk | Ticket no.   |
    | Service Desk | Assignee     |
The filter could be more complex as there could be more nested filters based on previous nested filters.
All I need to know is whether there is a more efficient way to handle this problem. For example, passing a data table to the step definition, which I am not too sure about.
Thanks
If you really need the order of your items to be preserved, use a data table instead of a scenario outline.
A scenario outline is a shorthand notation for multiple scenarios. The execution order of those scenarios is not guaranteed, or at least it would be a mistake to assume a specific execution order. The order of the items in a data table will not change if you use a List as the argument, so a data table is a lot safer in your case.
A common mistake with Cucumber is to use Scenario Outline and example tables to do some sort of semi-exhaustive testing. This tends to hide lots of interesting things about the functionality being developed.
I would start writing single features for the searches you are working with and explore what those searches are and why they are important. So if we start with your first one we get ...
Note: all of the following assumes a background step Given I am searching
When I search on subscribers and name
Then I should see records for subscribers
and with the second one
When I search on subscribers and city
Then I should see records for subscribers
Now it becomes clear that there is a serious flaw in these scenarios, as both scenarios are looking for the same result.
So what you are actually testing is that
The subscribers search has name and city filters
A subscriber search should return subscriber results
Now you can refactor and get
When I do a subscriber search
Then I should see city, name, network filters
When I do a subscriber search
Then I should only see subscriber results
Note: This is already much more efficient, as you have reduced the number of scenarios from 3 to 2 and the number of searches you have to do from 3 to 1.
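The same refactoring can be sketched in Python, outside Cucumber: one search per category, followed by a single check of every filter it should offer. The expected filters come from the question. `fake_search_page` is a hypothetical stand-in for driving the real UI and deliberately omits one filter so the check has something to report.

```python
# Expected nested filters per category, taken from the question.
EXPECTED_FILTERS = {
    "Subscribers": ["Name", "City", "Network"],
    "Service Desk": ["Status", "Ticket no.", "Assignee"],
}


def missing_filters(category, offered):
    """Return the expected filters that the search page did not offer."""
    return [name for name in EXPECTED_FILTERS[category] if name not in offered]


def fake_search_page(category):
    # Hypothetical stand-in for performing one real search per category.
    offered = {
        "Subscribers": ["Name", "City", "Network"],
        "Service Desk": ["Status", "Assignee"],
    }
    return offered[category]


for category in EXPECTED_FILTERS:
    print(category, missing_filters(category, fake_search_page(category)))
```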
Now I have no idea if this is what you want to do, but this is what your current scenario is doing. However because you are using an Outline and Example tables you can't see this.
The fact that you have a drop-down and nested filters is an implementation detail, which describes how the user is trying to achieve what they want to achieve.
If you think of what you're trying to do as examples of how the system behaves, rather than tests, it might be easier. You're not looking for something exhaustive. You also want your scenarios to be specific, so that you're illustrating them with realistic data and concrete examples. If you would commonly have some typical data available, that's a perfect thing to set up using Background.
So for instance, I might have scenarios like:
Background:
  Given I have subscribers
    | Name | City   | Network | Status | etc.
    | Bob  | Rome   | ABC     | Alive  | ...
    | Sam  | Berlin | ABC     | Dead   | ...
    | Sue  | Berlin | DEF     | Dead   | ...
    | Ann  | Berlin | DEF     | Alive  | ...
    | Jon  | London | DEF     | Dead   | ...
Scenario: First level search
  Given I'm on the search page
  When I search for Subscribers who are in Rome
  Then I should see Bob
  But not Sue or Jon.
Scenario: Second level search
  Given I'm on the search page
  When I search for Subscribers in Berlin on the ABC network
  Then I should see Sam
  But not Sue or Ann
etc.
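To check that the example data really supports both scenarios, here is a small Python sketch that applies the same searches to the Background table in memory. The `search` helper is illustrative only and says nothing about how the real application filters.

```python
# The subscribers from the Background table (the "etc." columns are omitted).
SUBSCRIBERS = [
    {"Name": "Bob", "City": "Rome", "Network": "ABC", "Status": "Alive"},
    {"Name": "Sam", "City": "Berlin", "Network": "ABC", "Status": "Dead"},
    {"Name": "Sue", "City": "Berlin", "Network": "DEF", "Status": "Dead"},
    {"Name": "Ann", "City": "Berlin", "Network": "DEF", "Status": "Alive"},
    {"Name": "Jon", "City": "London", "Network": "DEF", "Status": "Dead"},
]


def search(records, **criteria):
    """Return the names of records matching every given column value."""
    return [
        record["Name"]
        for record in records
        if all(record[column] == value for column, value in criteria.items())
    ]


# First level search: only Bob, so neither Sue nor Jon.
print(search(SUBSCRIBERS, City="Rome"))                   # ['Bob']
# Second level search: only Sam, so neither Sue nor Ann.
print(search(SUBSCRIBERS, City="Berlin", Network="ABC"))  # ['Sam']
```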
The full-system scenarios should be just enough to understand what's going on. Don't use BDD for regression. It can help with that, but scenarios will rapidly become slow and unmaintainable if you try to cover every case. Delegate to integration and unit tests where appropriate (see "the testing pyramid").

Concatenate Google Analytics results to ignore country code in URL

Our website automatically detects a user's region. Though the site structure remains the same across all regions, the content on the page can vary.
As such, URLs are formatted like so: http://website.com/XX/pagename with XX=country code (e.g. GB, US, IT, etc.)
On Google Analytics, I want to see all of the different country versions of a single page contained as a single result.
For example, if I look at our top pages for January, I see:
| URL                     | page views |
|-------------------------|------------|
| website.com/US/page1    | 100        |
| website.com/GB/homepage | 60         |
| website.com/US/homepage | 40         |
| website.com/GB/page1    | 20         |
But what I want to see is:
| URL                  | page views |
|----------------------|------------|
| website.com/page1    | 120        |
| website.com/homepage | 100        |
Wherein the same URL (ignoring country code) is concatenated into one figure.
Is such a thing possible?
My end game here is a desire to see what our most popular pages are across the site in total, regardless of which country the user is browsing from.
Thanks!
One option is to use an advanced filter in GA that takes something like website.com/US/page1 and replaces it with website.com/page1. The filter only affects data collected after it is applied: it does not change historical data, and its effect cannot be undone once applied. This is another reason why it's always a good idea to keep a Raw view which is unfiltered.
For the Advanced Filter, you need something that looks for the pattern /{any two letters}/{anything else} and outputs just the /{anything else} part.
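The pattern can be tried outside GA first. The Python sketch below applies an equivalent regular expression to the example URLs from the question and sums the page views. The regex is one possible rendering of "/{any two letters}/{anything else}", not GA's own filter syntax.

```python
import re
from collections import Counter

# Anything before the first slash, then a two-letter segment, then the rest.
COUNTRY_PREFIX = re.compile(r"^(?P<host>[^/]*)/[A-Za-z]{2}/(?P<rest>.*)$")


def strip_country(url):
    """Drop the two-letter country segment, leaving other URLs unchanged."""
    match = COUNTRY_PREFIX.match(url)
    if not match:
        return url
    return f"{match.group('host')}/{match.group('rest')}"


page_views = {
    "website.com/US/page1": 100,
    "website.com/GB/homepage": 60,
    "website.com/US/homepage": 40,
    "website.com/GB/page1": 20,
}

totals = Counter()
for url, views in page_views.items():
    totals[strip_country(url)] += views

print(totals.most_common())
# [('website.com/page1', 120), ('website.com/homepage', 100)]
```

Any path whose first segment happens to be two letters long would be rewritten as well, so it is worth checking existing URLs for such segments before applying the filter.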

How to create a table like structure in drupal content type?

I am almost there with this but cannot seem to get this functionality going as planned.
I am creating a questionnaire using a Drupal content type. What I am trying to do is create a table-like structure, as below, in the content type. The second and third columns contain checkboxes, and the first column's data (i.e. Computers, Internet) and the first row (i.e. Everyone have access, Nobody have access) are taxonomy terms. Is it possible to display it like this in a content type by using some Drupal modules? Does anybody have any better suggestions?
|           | Everyone have access | Nobody have access |
---------------------------------------------------------
| Computers |          1           |         2          |
---------------------------------------------------------
| Internet  |          1           |         2          |
---------------------------------------------------------
| Fax       |          1           |         2          |
---------------------------------------------------------
You can use the Term Level Field module. This module provides a field type for referencing terms with a level to an entity.
You may use the Editable Fields module with the Views module. Of course, if you don't need such a display (table forms), you should use the Views Bulk Operations module.
If you need to do this in a node, use the Tableform module. But if you want to show nodes while editing a node, it is the same. A node should not be used for tasks, just for content.

Amazon MWS Flat File Modify Listing

I'm trying to find the documentation for how the flat file looks for modifying the quantity of a product on Amazon.
This is what we send at the moment, but it would be good to see the list of headings we can use.
SKU | Quantity
000 | 1
I'm guessing that this is correct:
SKU | Price | Quantity
000 | 9.99 | 1
Any links would be welcome.
Amazon's MWS site: https://developer.amazonservices.com/
You can get the full description of the flat file feed specifications, which tells you how to send the flat file to MWS in the correct format, by going to the following URL after you've logged into your Seller Central or MWS account.
https://sellercentral.amazon.com/gp/help/help.html/ref=ag_13461_cont_help?ie=UTF8&itemID=13461&language=en_US
Have a look at the scratchpad; it will bring back flat files so you can see how they look.
https://mws.amazonservices.co.uk/scratchpad/index.html - UK Version
https://mws.amazonservices.com/scratchpad/index.html - General (US) version
