Approach to securing test data for public repositories - integration-testing

We have set up nightly testing for an open-source project (MERN stack). The Selenium tests require test data which we do not want to make public. Initially we kept the test data in environment variables on the build server (CircleCI), but this approach is not scalable. We do not own any infrastructure, so any database- or storage-bucket-based solution would incur additional cost, which is not feasible on the org's current budget. Is there a smart solution for keeping the test data files secure at no additional cost?

As you know, the challenge is that you need somewhere to put that data. If you're trying to do this without paying any providers, the best I can suggest is Amazon's free tier, for either S3 storage or a database: https://aws.amazon.com/free/
Either can be accessed securely from CircleCI by storing the API keys as project environment variables.
CircleCI's AWS S3 orb encapsulates installing and configuring the AWS CLI to simplify this:
version: 2.1
orbs:
  aws-s3: circleci/aws-s3@1.0.2
jobs:
  build:
    docker:
      - image: 'circleci/node:10'
    steps:
      - checkout
      - aws-s3/copy:
          from: 's3://your-s3-bucket-name/test_data/somefile.ext'
          to: test_data.ext
      - run: # your test code here
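
If you'd rather not use the orb, the same step can be sketched in a few lines of boto3, relying on the AWS keys stored as CircleCI project environment variables. This is only a minimal sketch; the bucket and key names below are the placeholders from the example above.

# fetch_test_data.py - minimal sketch; bucket/key are placeholders and
# credentials are read from AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY.
import boto3

def fetch_test_data(bucket="your-s3-bucket-name",
                    key="test_data/somefile.ext",
                    dest="test_data.ext"):
    s3 = boto3.client("s3")  # picks up credentials from the environment
    s3.download_file(bucket, key, dest)

if __name__ == "__main__":
    fetch_test_data()

It could then run as a plain "python fetch_test_data.py" step before the Selenium suite.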

Related

Difference between tracking_uri and the backend_store_uri in MLflow

I am using MLflow for my project, hosting it on an EC2 instance. I was wondering: in MLflow, what is the difference between the backend_store_uri we set when we launch the server and the tracking_uri?
Thanks,
tracking_uri is the URL of the MLflow server (remote, or the one built into Databricks) that will be used to log metadata & models (see the docs). In your case this is the URL pointing to your EC2 instance, and it is what should be configured in the programs that log parameters to your server.
backend_store_uri is used by the MLflow server to configure where it stores this data: on the filesystem, in a SQL-compatible database, etc. (see the docs). If you use a SQL database, you also need to provide the --default-artifact-root option to point to where generated artifacts (images, model files, etc.) should be stored.
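
To make the split concrete, here is a minimal sketch (the host, bucket and logged values are placeholders, not taken from your setup): the server is launched with backend_store_uri, plus --default-artifact-root when a SQL backend is used, while client code only ever needs the tracking URI.

# Server side (on the EC2 instance) - backend_store_uri says where metadata lives;
# --default-artifact-root is required when a SQL backend is used:
#   mlflow server --backend-store-uri sqlite:///mlflow.db \
#                 --default-artifact-root s3://my-artifact-bucket/ \
#                 --host 0.0.0.0 --port 5000

# Client side - programs that log runs only need the tracking URI:
import mlflow

mlflow.set_tracking_uri("http://ec2-xx-xx-xx-xx.compute.amazonaws.com:5000")  # placeholder URL

with mlflow.start_run():
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("rmse", 0.78)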

Is it possible to set up MySQL replication with binlog files generated from server A (which we are considering the master) to server B

We are migrating one of our projects from Magento Community to Magento Cloud, and we need access to the DB from our custom-developed CRM.
Unfortunately, Magento Cloud does not support DB replication: they have enabled binlogs, but they do not support creating a replication user or setting a server id. The binlog files can, however, be synced to our CRM server periodically.
We want to know whether we can use these binlog files to replicate the database, or whether there is any workaround to achieve the same thing.
We have tried a tunnel setup, but query execution time over the tunnel is high, which affects our CRM performance badly.
We would also like to confirm whether there are any other ways to access the Magento Cloud DB from our CRM without a performance lag.
Thanks in advance for your suggestions.
Yes, it is possible, but it may be a little fiddly in the setup you are describing. You can replay the binlogs as relay logs. Have a look at this article for more details:
https://lefred.be/content/howto-make-mysql-point-in-time-recovery-faster/
Specifically, these parts are relevant (you'll need to edit them appropriately):
[root@mysql1 mysql]# for i in $(ls /tmp/binlogs/*.0*)
do
  ext=$(echo $i | cut -d'.' -f2);
  cp $i mysql1-relay-bin.$ext;
done
[root@mysql1 mysql]# ls ./mysql1-relay-bin.0* > mysql1-relay-bin.index
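
If the periodic binlog sync ends up being scripted rather than run by hand, a rough Python equivalent of that shell loop could look like the sketch below; the paths and the relay-log base name are assumptions and must match your datadir and relay_log setting.

# Rough Python equivalent of the shell loop above: copy synced binlogs into
# relay-log names and rebuild the relay-log index. Paths are placeholders.
import glob
import shutil
from pathlib import Path

BINLOG_DIR = "/tmp/binlogs"          # where the synced binlog files land
DATA_DIR = Path("/var/lib/mysql")    # MySQL datadir on the replica
RELAY_BASE = "mysql1-relay-bin"      # must match the relay_log setting

relay_files = []
for binlog in sorted(glob.glob(f"{BINLOG_DIR}/*.0*")):
    ext = binlog.rsplit(".", 1)[1]            # numeric suffix, e.g. 000042
    target = DATA_DIR / f"{RELAY_BASE}.{ext}"
    shutil.copy2(binlog, target)
    relay_files.append(f"./{target.name}")

# The index file lists the relay logs the SQL thread should work through.
(DATA_DIR / f"{RELAY_BASE}.index").write_text("\n".join(relay_files) + "\n")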

BAD_GATEWAY when connecting Google Cloud Endpoints to Cloud SQL

I am trying to connect from Google Cloud Endpoints to a Cloud SQL (PostgreSQL) database in a different project. My Endpoints backend is an App Engine app in the flexible environment, using Python.
The endpoints API works fine for non-db requests and for db requests when run locally. But the deployed API produces this result when requiring DB access:
{
  "code": 13,
  "message": "BAD_GATEWAY",
  "details": [
    {
      "@type": "type.googleapis.com/google.rpc.DebugInfo",
      "stackEntries": [],
      "detail": "application"
    }
  ]
}
I've followed this link (https://cloud.google.com/endpoints/docs/openapi/get-started-app-engine) to create the endpoints project, and this (https://cloud.google.com/appengine/docs/flexible/python/using-cloud-sql-postgres) to link to Cloud SQL from a different project.
The one difference is that I don't use the SQLALCHEMY_DATABASE_URI env variable to connect, but take the connection string from a config file and use it with psycopg2 directly. This code works on Compute Engine (CE) servers in the same project.
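For reference, the connection is made roughly like this (the instance connection name and credentials below are placeholders matching the app.yaml further down):

# Rough sketch of the psycopg2 connection; identifiers are placeholders.
import psycopg2

conn = psycopg2.connect(
    dbname="postgres",
    user="postgres",
    password="password",
    # App Engine flex exposes the Cloud SQL instance as a unix socket under /cloudsql/
    host="/cloudsql/cloudsql-project-id:us-east1:instance-id",
)
with conn.cursor() as cur:
    cur.execute("SELECT 1")
    print(cur.fetchone())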
I also double-checked that the Endpoints project's service account was given Cloud SQL Editor access in the project that hosts the PostgreSQL db. And the db connection string works fine when the App Engine app is in the same project as the Cloud SQL db (i.e. not coming from the Endpoints project).
Not sure what else to try. How can I get more details on the BAD_GATEWAY? That's all that's in the endpoints logfile and there's nothing in the Cloud SQL logfile.
Many thanks --
Dan
Here's my app.yaml:
runtime: python
env: flex
entrypoint: gunicorn -b :$PORT main:app
runtime_config:
  python_version: 3
env_variables:
  SQLALCHEMY_DATABASE_URI: >-
    postgresql+psycopg2://postgres:password@/postgres?host=/cloudsql/cloudsql-project-id:us-east1:instance-id
beta_settings:
  cloud_sql_instances: cloudsql-project-id:us-east1:instance-id
endpoints_api_service:
  name: api-project-id.appspot.com
  rollout_strategy: managed
And requirements.txt:
Flask==0.12.2
Flask-SQLAlchemy==2.3.2
flask-cors==3.0.3
gunicorn==19.7.1
six==1.11.0
pyyaml==3.12
requests==2.18.4
google-auth==1.4.1
google-auth-oauthlib==0.2.0
psycopg2==2.7.4
(This should be a comment, but the formatting would make it much harder to read, so I will post it here and update as needed.)
I am trying to reproduce your error and came up with some questions:
How are you handling the environment variables from the tutorials? Have you hard-coded the values or are you using environment variables? They are reset with the Cloud Shell (if you are using Cloud Shell).
This is not clear to me: do you see some kind of log in Cloud SQL (just without errors), or do you not see any logs at all?
The Cloud SQL, app.yaml and requirements.txt configurations are related. Could you provide more information on these? If you update the post, be careful not to include usernames, passwords or other sensitive information.
Are both projects in the same region/zone? Sometimes this is a requirement, but I don't see anything pointing to it in the documentation.
My intuition points to a credentials issue, but it would be useful if you added more information to the post, to better understand where the issue comes from.

Multiple applications in the same Symfony2 application

This is quite a long question, but there's quite a lot to it.
It feels like it should be a reasonably common use case, so I'm hoping the Stack Overflow community can provide me with a 'best practice in Symfony2' answer.
The solution I describe below works, but there are several consequences I'd like to avoid:
In my local dev environment, if I use the wrong db connection, the tests will work in dev but fail in production
The routes of the ADMIN API are accessible on the PUBLIC API URL; they are just denied.
If I have a mirror of live in my dev environment (3 separate checkouts with the corresponding parameters.yml file) then the feature tests for the other bundles fail
Is there a 'best practice in Symfony2' way to set up my project?
We're running a LAMP stack. We use git/(Atlassian) stash for version control.
We're using doctrine for the ORM and FOS-REST with OAuth plus symfony firewalls to authenticate and authorise the users.
We're committed to use Symfony2, so I am trying to find a 'best practice' solution:
I have a project with 3 applications:
A public-facing API (which gives read-only access to the data)
A protected API (which provides admin functionality)
A set of batch processes (to e.g. import data and monitor data quality)
Each application uses a set of shared models.
I have created 4 bundles: one for each application and a 4th for the shared models.
Each application must use a different database user to access the database.
There's only one database.
There are several tables; one is called 'prices'
The admin API must be accessible only from one hostname (e.g. admin-api.server1)
The public API must be accessible only from a different hostname (e.g. public-api.server2)
Each application is hosted on a different server
In parameters.yml in my dev environment I have this
# parameters.yml
api_public_db_user: user1
api_public_db_pass: pass1
api_admin_db_user: user2
api_admin_db_pass: pass2
batch_db_user: user3
batch_db_pass: pass3
In config.yml I have this:
# config.yml
doctrine:
  dbal:
    connections:
      api_public:
        user: "%api_public_db_user%"
        password: "%api_public_db_pass%"
      api_admin:
        user: "%api_admin_db_user%"
        password: "%api_admin_db_pass%"
      batch:
        user: "%batch_db_user%"
        password: "%batch_db_pass%"
In my code I can do this (I believe this can also be done via the service container, but I haven't got that far yet):
$entityManager = $this->getContainer()->get('doctrine')->getManager('api_public');
$entityRepository = $this->getContainer()->get('doctrine')->getRepository('CommonBundle:Price', 'api_admin');
When I deploy my code to each of the live servers, I put junk values in the parameters.yml for the other applications
# parameters.yml on the public api server
api_public_db_user: user1
api_public_db_pass: pass1
api_admin_db_user: **JUNK**
api_admin_db_pass: **JUNK**
batch_db_user: **JUNK**
batch_db_pass: **JUNK**
I have locked down my application so that the database isn't accessible (and thus the other API features don't work)
I have also set up Symfony firewall security so that the different routes require different permissions
There's also security in the Apache vhost to deny access to, say, the admin API path from the public API directory.
So, I have secured my application and met the requirements of the security audit, but the dev process isn't ideal and something feels wrong.
As background:
We previously looked at splitting it up into different applications within the same project (like this: Symfony2 multiple applications and api centric application; we actually followed this method: http://jolicode.com/blog/multiple-applications-with-symfony2), but we ran into difficulties, and in any case Fabien says not to (https://groups.google.com/forum/#!topic/symfony-devs/yneojUuFiqw). That this existed in Symfony1 and was removed in Symfony2 is enough of an argument for me.
We also went down the route of splitting each bundle out and importing it with Composer, but this caused too much development overhead (for example, having to modify many repositories to implement a feature, and not being able to see all of the changes for a feature in a single pull request).
We are receiving an ever-growing number of requests to create APIs, and we're similarly worried about putting each application in its own repository.
So, putting each of the three applications in a separate Symfony project / git repository is something we want to avoid too.

What do I need for Travis-CI to decrypt secure variables on my fork?

I have forked a GitHub repository and would like to use Travis CI, as the original repository does, to run tests when I commit. However, the AWS keys, which are encrypted, are not decrypted, and that keeps the tests from succeeding. Since my workplace owns the original repository, I have access to whatever is needed, but I am unsure what information to retrieve, where to find it, or what to do with it.
For clarity, here is the pertinent part of the .travis.yml:
env:
  global:
    - NODE_ENV: test
    - [...]
    - secure: M3YSEJnWYd[...]
    - secure: kvvLABsWTq[...]
All of the environment variables are imported except the secure ones (which is to be expected, of course).
Travis documents that, for security reasons, encrypted variables are not available to forks (https://docs.travis-ci.com/user/environment-variables/#defining-encrypted-variables-in-travisyml). It should be possible, though, to set new secrets in the fork's .travis.yml or in the fork's repository settings.
