WSO2 analytics server database is growing - wso2-api-manager

I am using WSO2 API Manager along with its analytics server, configured with MySQL as the database.
After a year of production use, I found that a couple of tables from the analytics module consume most of the DB space, around 95%.
I would like to know the significance of these tables, as well as the challenges involved in deleting them.
The table names are:
+--------------------------------+------------------------------------------------------+------------+
| Database | Table | Size in MB |
+--------------------------------+------------------------------------------------------+------------+
| wso2_analytics_event_store | anx___7lsekeca_ | 665.03 |
| wso2_analytics_event_store | anx___7lmnf2xa_ | 638.00 |
| wso2_analytics_event_store | anx___7lqcf_8o_ | 636.14 |
| wso2_analytics_event_store | anx___7lmk3tr0_ | 398.13 |
| analytics_processed_data_store | anx___7lpteea4_ | 282.75 |
| analytics_processed_data_store | anx___7lsn7ita_ | 249.97 |
| wso2_analytics_event_store | anx___7lsgqyce_ | 209.25 |
| wso2_analytics_event_store | anx___7lmno15m_ | 207.25 |
| wso2_analytics_event_store | anx___7lver1fy_ | 191.16 |
+--------------------------------+------------------------------------------------------+------------+

You can enable data purging for the analytics tables. See the section below, taken from the docs.
Ref: https://docs.wso2.com/display/AM220/Purging+Analytics+Data
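For reference, in the DAS-based analytics runtime the purge task is configured in <ANALYTICS_HOME>/repository/conf/analytics/analytics-config.xml. The sketch below shows the general shape of that block; the element names and values are recalled from the DAS configuration and should be verified against the docs linked above for your exact product version:

```xml
<analytics-data-purging>
    <!-- enable the scheduled purge task -->
    <purging-enable>true</purging-enable>
    <!-- run daily at midnight (Quartz cron syntax) -->
    <cron-expression>0 0 0 * * ?</cron-expression>
    <!-- purge all event-store tables; narrow the pattern if needed -->
    <purge-include-tables>
        <table>.*</table>
    </purge-include-tables>
    <!-- keep only the last 365 days of data -->
    <data-retention-days>365</data-retention-days>
</analytics-data-purging>
```

The linked docs also describe a one-off purge per table from the management console, which is useful for cleaning up the largest tables before enabling the scheduled task.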

volume backup create what is errno 22?

I'm trying to create a volume backup, both using the web UI and the CLI, and keep getting errno 22. I'm unable to find information about the error or how to fix it. Does anyone know where I should start looking?
(openstack) volume backup create --force --name inventory01_vol_backups 398ee974-9b83-4918-9935-f52882b3e6b7
(openstack) volume backup show inventory01_vol_backups
+-----------------------+------------------------------------------------------------------+
| Field | Value |
+-----------------------+------------------------------------------------------------------+
| availability_zone | None |
| container | None |
| created_at | 2021-08-03T23:45:49.000000 |
| data_timestamp | 2021-08-03T23:45:49.000000 |
| description | None |
| fail_reason | [errno 22] RADOS invalid argument (error calling conf_read_file) |
| has_dependent_backups | False |
| id | 924c6e62-789e-4e51-9748-927695fc744c |
| is_incremental | False |
| name | inventory01_vol_backups |
| object_count | 0 |
| size | 30 |
| snapshot_id | None |
| status | error |
| updated_at | 2021-08-03T23:45:50.000000 |
| volume_id | 398ee974-9b83-4918-9935-f52882b3e6b7 |
+-----------------------+------------------------------------------------------------------+
The issue was caused by a bug in Cinder version 16.2.1.dev13. Updating Cinder to the latest version solved the issue.

Microstack - Cannot access (ping/ssh) launched VMs

I am trying to access some launched VMs, without success. I followed this tutorial to create a private network, which is listed below:
+--------------------------------------+----------+--------------------------------------+
| ID | Name | Subnets |
+--------------------------------------+----------+--------------------------------------+
| 326a319c-e75d-48f1-ac36-aed342c45874 | private | f16b8b8c-482e-4cf5-a5d6-74e284b7e0f1 |
+--------------------------------------+----------+--------------------------------------+
The security groups are listed below:
microstack.openstack security group list
+--------------------------------------+---------+------------------------+----------------------------------+------+
| ID | Name | Description | Project | Tags |
+--------------------------------------+---------+------------------------+----------------------------------+------+
| 04c5c579-91bf-4497-bd01-47c7fa69df81 | default | Default security group | 9c12393bf2e54547bef78aac510ba6c6 | [] |
| 3c69498c-c210-48c8-ba43-fbf60a0c224e | default | Default security group | 37f73779b3cd42dc96044ea0fd6d1e98 | [] |
| 5a20b02a-aac4-4c62-9ea2-24dfd8c59f67 | default | Default security group | | [] |
+--------------------------------------+---------+------------------------+----------------------------------+------+
I am using the following security group:
microstack.openstack security group show 3c69498c-c210-48c8-ba43-fbf60a0c224e
+-----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+-----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| created_at | 2020-08-14T17:54:45Z |
| description | Default security group |
| id | 3c69498c-c210-48c8-ba43-fbf60a0c224e |
| location | Munch({'cloud': '', 'region_name': '', 'zone': None, 'project': Munch({'id': '37f73779b3cd42dc96044ea0fd6d1e98', 'name': 'admin', 'domain_id': None, 'domain_name': 'default'})}) |
| name | default |
| project_id | 37f73779b3cd42dc96044ea0fd6d1e98 |
| revision_number | 3 |
| rules | created_at='2020-08-14T17:54:45Z', direction='egress', ethertype='IPv6', id='1e5c2fed-7c7a-4dd4-9e11-c87d0de012ee', updated_at='2020-08-14T17:54:45Z' |
| | created_at='2020-08-14T17:54:45Z', direction='ingress', ethertype='IPv4', id='36394ec6-0f35-4b26-9788-61bf76a08088', remote_group_id='3c69498c-c210-48c8-ba43-fbf60a0c224e', updated_at='2020-08-14T17:54:45Z' |
| | created_at='2020-08-14T17:54:45Z', direction='ingress', ethertype='IPv6', id='48986d96-ec57-4f49-aee8-6e1c68e273b1', remote_group_id='3c69498c-c210-48c8-ba43-fbf60a0c224e', updated_at='2020-08-14T17:54:45Z' |
| | created_at='2020-08-14T17:56:16Z', direction='ingress', ethertype='IPv4', id='58816267-8df8-4a89-a9c5-31986a441365', port_range_max='22', port_range_min='22', protocol='tcp', remote_ip_prefix='0.0.0.0/0', updated_at='2020-08-14T17:56:16Z' |
| | created_at='2020-08-14T17:54:45Z', direction='egress', ethertype='IPv4', id='c75e9aa8-84f3-4d05-9d33-0da7892f7a07', updated_at='2020-08-14T17:54:45Z' |
| | created_at='2020-08-14T17:56:14Z', direction='ingress', ethertype='IPv4', id='d029b66c-219e-488d-93af-1f87a9d8b006', protocol='icmp', remote_ip_prefix='0.0.0.0/0', updated_at='2020-08-14T17:56:14Z' |
| tags | [] |
| updated_at | 2020-08-14T17:56:16Z |
+-----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
The command I used to launch the VM:
microstack.openstack server create --flavor m1.medium --image ubuntu_1804 --nic net-id=326a319c-e75d-48f1-ac36-aed342c45874 --key-name microstack --security-group 3c69498c-c210-48c8-ba43-fbf60a0c224e server_micro
Below, we can see the VM was launched:
microstack.openstack server list
+--------------------------------------+--------------+--------+-----------------------------------+-------------+-----------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+--------------+--------+-----------------------------------+-------------+-----------+
| 9e88311d-0907-4534-ba5d-ee80d2de06ee | server_micro | ACTIVE | private=10.0.0.127 | ubuntu_1804 | m1.medium |
+--------------------------------------+--------------+--------+-----------------------------------+-------------+-----------+
microstack.openstack server show 9e88311d-0907-4534-ba5d-ee80d2de06ee
+-------------------------------------+----------------------------------------------------------+
| Field | Value |
+-------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | jabuti |
| OS-EXT-SRV-ATTR:hypervisor_hostname | jabuti |
| OS-EXT-SRV-ATTR:instance_name | instance-0000000a |
| OS-EXT-STS:power_state | Running |
| OS-EXT-STS:task_state | None |
| OS-EXT-STS:vm_state | active |
| OS-SRV-USG:launched_at | 2020-08-31T13:54:52.000000 |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | private=10.0.0.127 |
| config_drive | |
| created | 2020-08-31T13:54:45Z |
| flavor | m1.medium (3) |
| hostId | 61fe40d2c4303db62eef04a071c6d7ee01f0465ec467f911ac05e2c0 |
| id | 9e88311d-0907-4534-ba5d-ee80d2de06ee |
| image | ubuntu_1804 (a1d60e2d-72d7-47d8-8aea-e97e8ba2a09b) |
| key_name | microstack |
| name | server_micro |
| progress | 0 |
| project_id | 37f73779b3cd42dc96044ea0fd6d1e98 |
| properties | |
| security_groups | name='default' |
| status | ACTIVE |
| updated | 2020-08-31T13:54:53Z |
| user_id | ff66b68443994bfeb2101851e7ea026d |
| volumes_attached | |
+-------------------------------------+----------------------------------------------------------+
But I cannot access the launched instance:
ping 10.0.0.127
PING 10.0.0.127 (10.0.0.127) 56(84) bytes of data.
From 10.75.211.9: icmp_seq=2 Redirect Host(New nexthop: 10.75.211.13)
From 10.75.211.9: icmp_seq=3 Redirect Host(New nexthop: 10.75.211.13)
From 10.75.211.9: icmp_seq=4 Redirect Host(New nexthop: 10.75.211.13)
From 10.75.211.9: icmp_seq=5 Redirect Host(New nexthop: 10.75.211.13)
^C
--- 10.0.0.127 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 4004ms
What am I missing? What should I do to ping/ssh the launched instance?
Once we create a VM on a private network, we need to associate a floating IP with it. Below, I list the steps needed to solve the problem.
Create a floating IP for your external network:
microstack.openstack floating ip create external
Create a router to connect the two networks (internal and external):
microstack.openstack router create router1
Add the external network to the router:
microstack.openstack router set router1 --external-gateway external
Add your private subnetwork to the router:
microstack.openstack router add subnet router1 f16b8b8c-482e-4cf5-a5d6-74e284b7e0f1
Associate the floating IP with your VM (suppose the created IP is 10.20.20.92):
microstack.openstack server add floating ip server_micro 10.20.20.92
Now you should be able to ping the VM and access it through ssh.

Does OpenstackSDK have support for usage metrics?

I'm facing a problem where I need the amount of available resources (and how they are being used) in the DCs of my OpenStack (Stein) deployment, focusing on per-project/server/network consumption, from Python code (the other functionality is in Python and I prefer not to mix languages when support exists).
Is there any support for this in the OpenstackSDK libraries? If yes, where can I find the API documentation (or usage examples)? If not, why not?
You can use the existing Nova APIs to list compute capabilities and available resources.
nova hypervisor-stats
+----------------------+-------+
| Property | Value |
+----------------------+-------+
| count | 2 |
| current_workload | 0 |
| disk_available_least | 1378 |
| free_disk_gb | 1606 |
| free_ram_mb | 47003 |
| local_gb | 1606 |
| local_gb_used | 0 |
| memory_mb | 48027 |
| memory_mb_used | 1024 |
| running_vms | 0 |
| vcpus | 28 |
| vcpus_used | 0 |
+----------------------+-------+
You can automate this by wrapping the command in a shell script or by calling the Python OpenStack client libraries directly.
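If you want the same numbers from Python rather than the CLI, the statistics can be fetched with a client library (e.g. python-novaclient's `hypervisors.statistics()`) and then post-processed. Below is a minimal sketch of the post-processing step only, with the fetch left out and the stats represented as a plain dict whose field names mirror the CLI output above:

```python
def summarize_hypervisor_stats(stats):
    """Compute utilization percentages from a nova hypervisor-stats mapping.

    `stats` mirrors the fields of `nova hypervisor-stats`; fetching it from
    a live cloud (novaclient/openstacksdk) is intentionally left out here.
    """
    def pct(used, total):
        # guard against empty clouds reporting zero capacity
        return round(100.0 * used / total, 1) if total else 0.0

    return {
        "vcpu_used_pct": pct(stats["vcpus_used"], stats["vcpus"]),
        "ram_used_pct": pct(stats["memory_mb_used"], stats["memory_mb"]),
        "disk_used_pct": pct(stats["local_gb_used"], stats["local_gb"]),
        "running_vms": stats["running_vms"],
    }

# Using the numbers from the table above:
stats = {
    "vcpus": 28, "vcpus_used": 0,
    "memory_mb": 48027, "memory_mb_used": 1024,
    "local_gb": 1606, "local_gb_used": 0,
    "running_vms": 0,
}
print(summarize_hypervisor_stats(stats))
```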

Modeling Relational Data in DynamoDB (nested relationship)

Entity Model:
I've read the AWS guide on modeling relational data in DynamoDB, but it's confusing for my access patterns.
Access Pattern
+-------------------------------------------+------------+------------+
| Access Pattern | Params | Conditions |
+-------------------------------------------+------------+------------+
| Get TEST SUITE detail and check that |TestSuiteID | |
| USER_ID belongs to project has test suite | &UserId | |
+-------------------------------------------+------------+------------+
| Get TEST CASE detail and check that | TestCaseID | |
| USER_ID belongs to project has test case | &UserId | |
+-------------------------------------------+------------+------------+
| Remove PROJECT ID, all TEST SUITE | ProjectID | |
| AND TEST CASE also removed | &UserId | |
+-------------------------------------------+------------+------------+
So, I modeled the relational entity data as in the guide:
+-------------------------+---------------------------------+
| Primary Key | Attributes |
+-------------------------+ +
| PK | SK | |
+------------+------------+---------------------------------+
| user_1 | USER | FullName | |
+ + +----------------+----------------+
| | | John Doe | |
+ +------------+----------------+----------------+
| | prj_01 | JoinedDate | |
+ + +----------------+----------------+
| | | 2019-04-22 | |
+ +------------+----------------+----------------+
| | prj_02 | JoinedDate | |
+ + +----------------+----------------+
| | | 2019-05-26 | |
+------------+------------+----------------+----------------+
| user_2 | USER | FullName | |
+ + +----------------+----------------+
| | | Harry Potter | |
+ +------------+----------------+----------------+
| | prj_01 | JoinedDate | |
+ + +----------------+----------------+
| | | 2019-04-25 | |
+------------+------------+----------------+----------------+
| prj_01 | PROJECT | Name | Description |
+ + +----------------+----------------+
| | | Facebook Test | Do some stuffs |
+ +------------+----------------+----------------+
| | t_suite_01 | | |
+ + +----------------+----------------+
| | | | |
+------------+------------+----------------+----------------+
| prj_02 | PROJECT | Name | Description |
+ + +----------------+----------------+
| | | Instagram Test | ... |
+------------+------------+----------------+----------------+
| t_suite_01 | TEST_SUITE | Name | |
+ + +----------------+----------------+
| | | Test Suite 1 | |
+ +------------+----------------+----------------+
| | t_case_1 | | |
+ + +----------------+----------------+
| | | | |
+------------+------------+----------------+----------------+
| t_case_1 | TEST_CASE | Name | |
+ + +----------------+----------------+
| | | Test Case 1 | |
+------------+------------+----------------+----------------+
If I only have UserID and TestCaseID as parameters, how can I get the test case detail and verify that the UserID has permission?
I've thought about storing the complex hierarchical data within a single item, something like this:
+------------+-------------------------+
| t_suite_01 | user_1#prj_1 |
+------------+-------------------------+
| t_suite_02 | user_1#prj_2 |
+------------+-------------------------+
| t_case_01 | user_1#prj_1#t_suite_01 |
+------------+-------------------------+
| t_case_02 | user_2#prj_1#t_suite_01 |
+------------+-------------------------+
Question: What is the best approach for this case? I'd appreciate any suggestions. (bow)
I think the schema below does what you want. Create a Partition Key only GSI on the "GSIPK" attribute and query as follows:
Get Test Suite Detail and Validate User: Query GSI - PK == ProjectId, FilterCondition [SK == TestSuiteId || PK == UserId]
Get Test Case Detail and Validate User: Query GSI - PK == TestCaseId, FilterCondition [SK = TestSuiteId:TestCaseId || PK = UserId]
Remove Project: Query GSI - PK == ProjectId, remove all items returned.
Queries 1 and 2 come back with one or two items: one is the detail item and the other is the user's permission for the test suite or test case. If only one item comes back, it's the detail item and the user has no access.
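To make the "one item vs. two items" rule concrete, here is a small Python sketch of the item layout and the access check described above. The attribute names (PK, SK, GSIPK) follow the answer; the helper functions are illustrative, not part of any SDK:

```python
def make_item(pk, sk, gsipk, **attrs):
    """Build a DynamoDB-style item dict for the adjacency-list table.

    GSIPK is the assumed name of the partition-key-only GSI attribute.
    """
    item = {"PK": pk, "SK": sk, "GSIPK": gsipk}
    item.update(attrs)
    return item

# Detail item for a test case, findable via the GSI by its own id:
case = make_item("t_case_1", "TEST_CASE", gsipk="t_case_1", Name="Test Case 1")

# Permission item: same GSIPK, PK = user id. Querying the GSI with
# GSIPK == "t_case_1" therefore returns the detail item plus the
# caller's grant, if one exists.
grant = make_item("user_1", "t_suite_01:t_case_1", gsipk="t_case_1")

def user_can_see(items, user_id):
    """Access check: the query result must contain a grant item with PK == user_id."""
    return any(i["PK"] == user_id for i in items)

print(user_can_see([case, grant], "user_1"))  # two items -> access
print(user_can_see([case], "user_1"))         # detail only -> no access
```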
The first question you should ask is: why do I want to use a key-value/document DB over a relational DB when my data clearly has strong relations?
The answer might be: I need single-digit-millisecond queries at any scale (millions of records), or I want to save money using DynamoDB on-demand. If that is not the case, you might be better off with a relational DB.
Let's say you have to go with DynamoDB. If so, most of the patterns applicable to relational DBs are anti-patterns in NoSQL. There is a useful talk from last re:Invent about design patterns for DynamoDB, which I advise watching: https://youtu.be/HaEPXoXVf2k
For your data I'd think about taking a similar approach and having two tables: users and projects.
Projects should store a subset of the test suites as a map/array of objects, and the test cases as a map/array of objects. Plus, you could add a list of user IDs as a map of strings. Of course, you will need to maintain this list when users join or leave a project.
This should satisfy your access patterns.
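As an illustration of the two-table approach, a single projects-table item might look like the sketch below (all attribute names are assumptions), together with a membership check that mirrors the access patterns:

```python
# Illustrative shape of one item in the hypothetical "projects" table.
project_item = {
    "project_id": "prj_01",
    "name": "Facebook Test",
    "user_ids": ["user_1", "user_2"],  # must be maintained on join/leave
    "test_suites": {
        "t_suite_01": {
            "name": "Test Suite 1",
            "test_cases": {
                "t_case_1": {"name": "Test Case 1"},
            },
        },
    },
}

def user_can_access_case(item, user_id, case_id):
    """Access check: the user is on the project AND the case exists in it."""
    in_project = user_id in item["user_ids"]
    has_case = any(
        case_id in suite["test_cases"]
        for suite in item["test_suites"].values()
    )
    return in_project and has_case

print(user_can_access_case(project_item, "user_1", "t_case_1"))  # True
```

Deleting the project is then a single item delete, since the suites and cases are embedded.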

R - Multiple search and replace based on partial match within a column of a dataframe

I have a list of publishers that looks like this:
+--------------+
| Site Name |
+--------------+
| Radium One |
| Euronews |
| EUROSPORT |
| WIRED |
| RadiumOne |
| Eurosport FR |
| Wired US |
| Eurosport |
| EuroNews |
| Wired |
+--------------+
I'd like to create the following result:
+--------------+----------------+
| Site Name | Publisher Name |
+--------------+----------------+
| Radium One | RadiumOne |
| Euronews | Euronews |
| EUROSPORT | Eurosport |
| WIRED | Wired |
| RadiumOne | RadiumOne |
| Eurosport FR | Eurosport |
| Wired US | Wired |
| Eurosport | Eurosport |
| EuroNews | Euronews |
| Wired | Wired |
+--------------+----------------+
I would like to understand how I can replicate this code I use in Power Query:
// search the first 4 characters
if Text.Start([Site Name],4) = "WIRE" then "Wired" else
// search the last 3 characters
if Text.End([Site Name],3) = "One" then "RadiumOne" else
// if no match is found, use "Rest"
"Rest"
It does not have to be case sensitive.
Using properCase from the ifultools package and gsub, we replace everything after the first word with "" (i.e. delete it) and treat the exceptional Radium case separately. If you have many exceptions like the Radium case, please update your post with them so that we can find a neater solution than this hack :)
library("ifultools")
siteName = c("Radium One","Euronews","EUROSPORT","WIRED","RadiumOne","Eurosport FR","Wired US","Eurosport","EuroNews","Wired")
# properCase() normalizes case ("EUROSPORT" -> "Eurosport"), the inner gsub drops
# everything from the first whitespace onward, and the outer gsub restores the
# "RadiumOne" spelling for both "Radium" and "Radiumone"
publisherName = gsub("^Radium(one)?$", "RadiumOne", gsub("\\s+.*", "", properCase(siteName)))
# [1] "RadiumOne" "Euronews"  "Eurosport" "Wired"     "RadiumOne" "Eurosport" "Wired"
# [8] "Eurosport" "Euronews"  "Wired"
