Prevent Google from indexing VM's external IP - WordPress

My setup:
Bitnami WordPress
GCP VM
GCP HTTPS Load Balancer
Load Balancer has HTTP to HTTPS redirection and www to non-www redirection
Cloud CDN
Main Problem:
The VM's external IP (22.22.22.22) is indexed by Google Search. I'm unable to remove it from Google Search because the IP is not recognized as a property that belongs to me and the indexed page (http://22.22.22.22/home) is live. http://22.22.22.22/home resolves to example.com, which is a live page.
What I have checked
The domain name and the Load Balancer's IP are properly resolved
No new mod_rewrite rules other than those from the default installation
The site does not have any reference to VM's external IP, including database
No cache plugins installed
WordPress's site address and home address both point to http://example.com
What I have done to rectify
I've added 301 redirects, though I'm not sure whether that helps. It will take some time to check if that works.
My questions
Is there a way to prevent Google from indexing IPs (any IPs in general)?
How can I prevent the VM's external IP from being indexed by Google, especially when a load balancer is in use?

There is a simple solution, but it takes time. Configure the Apache web server to redirect IP-based requests to your domain or to return an error. Google Search will eventually notice the redirect and remove the IP address from its search results.
For redirects, use a permanent redirect (301).
301 Moved Permanently
How To Create Temporary and Permanent Redirects with Apache and Nginx
There are WordPress plugins to do the same, but I prefer to configure Apache directly and not add another plugin to a site.
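Because Google takes time to recrawl, it can help to confirm on the server side that the redirect is already in place. The following is a minimal Python sketch under assumptions, not part of the original answer: both function names are made up for illustration, and the check treats 301 and 308 as permanent redirects.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status_and_location(ip, path="/", timeout=5.0):
    """Send one HEAD request to the bare IP and return (status, Location).

    http.client does not follow redirects, so this is the first response
    a crawler requesting the IP would see. The Host header defaults to
    the IP address passed in.
    """
    conn = http.client.HTTPConnection(ip, 80, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def is_permanent_redirect_to(status, location, domain):
    """Return True if the response is a permanent redirect to `domain`."""
    if status not in (301, 308) or not location:
        return False
    # A relative Location has no hostname and therefore does not count.
    return urlsplit(location).hostname == domain.lower()
```

With the placeholders from the question, calling fetch_status_and_location("22.22.22.22", "/home") and passing the result to is_permanent_redirect_to(status, location, "example.com") should return True once the redirect is active.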
Is there a way to prevent Google from indexing IPs (any IPs in general)?
No. Google can crawl and index any public site, whether it is reached by IP address or by hostname, unless crawling is restricted via robots.txt.
Introduction to robots.txt
How to prevent VM's ext. IP from being indexed by Google especially when load balancer is in use?
You can create a VPC Firewall rule that only allows traffic from the load balancer and blocks Internet ingress traffic.
Load Balancer Firewall Rules
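As a rough illustration of what such a rule permits, the Python sketch below checks a source address against 130.211.0.0/22 and 35.191.0.0/16, the ranges Google documents for its external HTTP(S) load balancer proxies and health checks. Whether these are the right ranges for your load balancer type is an assumption to verify against the linked page, and allowed_by_rule is a hypothetical helper name. VPC ingress is denied unless some rule allows it, so direct Internet traffic is only blocked once any broader allow rule for port 80 (for example one with source 0.0.0.0/0) has been removed or narrowed.

```python
import ipaddress

# Assumption: source ranges for Google's load balancer proxies and
# health checks; confirm them in the load balancer firewall rules docs.
GOOGLE_LB_RANGES = [
    ipaddress.ip_network("130.211.0.0/22"),
    ipaddress.ip_network("35.191.0.0/16"),
]


def allowed_by_rule(source_ip):
    """Return True if source_ip falls inside one of the allowed ranges."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in GOOGLE_LB_RANGES)


print(allowed_by_rule("35.191.10.20"))  # True: inside 35.191.0.0/16
print(allowed_by_rule("8.8.8.8"))       # False: an arbitrary Internet host
```

A request arriving straight from a crawler would fall into the second case and never reach Apache, which is why the VM's IP stops being reachable while the load balancer keeps working.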
Additional information:
Remove a page hosted on your site from Google

Related

Wordpress Gcloud Load Balancer HTTPS front end, but IP on port 80 still open on the back end

I'm trying to put a simple WordPress site on Google Cloud with SSL.
I set up a load balancer using an instance group, as described here: https://docs.bitnami.com/google-templates/how-to/configure-lb-ssl-google-templates/
except I used one-click deployment instead of Bitnami.
unmanaged instance group with the vm as a member
load balancer with the instance group on the backend, port 80
front end HTTPS, port 443
The front end works, this is my test site: https://thescanclub.com
However, insecure access is still possible to the backend IP and also to the database! I must have left out a firewall rule or something. How would you fix this?
Back end IP to the vm: 34.150.136.72
PS... this is a small test site and may get a new vm if I have to redo it.
The Google-managed certificate that you created resides in the front end of the load balancer. This means that Google's SSL certificate and the HTTP to HTTPS redirect apply only when you access your WordPress site through the load balancer's front end IP address or through the domain (thescanclub.com).
If you look up the A record of your domain (for example with dig), you'll see that it does not point to the VM's public IP but to a different one, which I presume is the HTTPS load balancer's front end IP.
If you wish to access the website directly from the VM's public IP, it would be best to set up a Let's Encrypt SSL certificate within the Bitnami instance and configure the HTTP to HTTPS redirection through it. Hope you find this documentation helpful.

How to prevent user access to IP address for a website hosted in GCP?

I have a WordPress website (Bitnami) that's hosted in Google Cloud Platform. The IP address is something like: 33.33.33.33. My domain is hosted on Google Domains and has 2 nameservers pointing to Cloudflare. On the Cloudflare DNS settings, I have 2 A records, one is ftp and the other is something like 'mywebsite.com'. I also have a CNAME record that maps www to mywebsite.com.
I am able to hit the full URL of my website, but I also notice I can enter the IP address and it also works and loads the website. Is it possible to prevent access to 33.33.33.33 and only allow the full URL?
Create VPC Firewall Rules that only allow Cloudflare IP addresses.
Consult this document for the current Cloudflare IP list.
Google Cloud VPC firewall rules overview
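Cloudflare publishes its ranges as CIDR blocks, so the list can be turned into the comma-separated value that gcloud's --source-ranges flag expects when creating the rule. A small Python sketch, assuming the list has been saved as text with one range per line and that the rule covers IPv4 only; to_source_ranges is a hypothetical helper, and the ranges in the usage example are reserved documentation ranges, not real Cloudflare addresses.

```python
import ipaddress


def to_source_ranges(list_text):
    """Validate a one-CIDR-per-line list and join the IPv4 ranges with commas."""
    ranges = []
    for line in list_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        net = ipaddress.ip_network(line)  # raises ValueError on a bad entry
        if net.version == 4:  # assumption: the rule is for IPv4 traffic only
            ranges.append(str(net))
    return ",".join(ranges)


# Placeholder input; substitute the current list from Cloudflare.
sample = """198.51.100.0/24
# a comment line
203.0.113.0/24
2001:db8::/32
"""
print(to_source_ranges(sample))  # 198.51.100.0/24,203.0.113.0/24
```

Validating each entry before it goes into a firewall rule catches a truncated or mistyped range early, rather than leaving part of Cloudflare's network blocked.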
Note: Cloudflare does not support FTP, so you must point your DNS resource record for FTP directly to your server's IP address and not through Cloudflare. I recommend that you do not use FTP. Use SSH/SFTP for file transfers. Configure WordPress so that installing plugins, uploading images, etc. does not require FTP. FTP is not encrypted, which means your login username and password are sent in the clear.

Web Server no longer accessible after WordPress URL rewrite

The WordPress website was accessible via a public IP; however, after a URL rewrite (IP to the domain name), it showed 'DNS_PROBE_FINISHED_NXDOMAIN'. The domain's DNS resolves to the IP and I am not seeing any issue on the web server. I tried flushing the local DNS cache and also tried from different devices, assuming it could be a DNS cache problem, but the issue remains.
Any thoughts?

WP site accessible at subpath of domain used by Squarespace site

Alright this is a super weird request and it's just a stopgap measure until the client cleans up their many web properties, so no need to tell me that it's a convoluted way to do things. Current setup:
example.com is registered with GoDaddy but points to a Squarespace site (CNAME & A records, etc.)
sub.example.com is a WordPress site on GoDaddy hosting
Question: Can I make the WP site at sub.example.com instead appear as example.com/sub? If so, how?
I can set the site_url and home_url in WP, and I can set up a forwarding rule from Squarespace to the IP of the GoDaddy shared hosting account. Would that be enough?
I had thought I would need to set an A record with the host as example.com/sub, but slashes aren't allowed. Any insight greatly appreciated. Thanks for reading!
The only way I am aware of to get this done is to use some form of load balancer, e.g.:
AWS Elastic Load Balancer
Nginx / Apache proxy
AWS CloudFront custom origins
The last option will also speed up your site.
Unless your DNS provider has some custom option, DNS alone can't do that.

How to set up a Google VM (instance grouped), https load balanced w/CDN, so that the backend resolves the domain?

Backend: "Bitnami WordPress with NGINX and SSL Stack for Google Cloud Platform" from marketplace.
I used this guide: https://www.am22tech.com/google-cloud-cdn-wordpress/
I ended up with a somewhat working system as follows:
My Domain ->(google managed cert)-> CDN + Load Balancer -> Instance Group ->(http)-> VM with bitnami stack.
All works well and seems very fast. The biggest gap in my understanding is how the VM can be told it needs to behave as if it's the original domain.
For example, in the nginx server config, any kind of reference to $host seems to return the VM's IP address or something like that.
Also, in WordPress, in a lot of places the domain is replaced by an IP address, even though the site URL and WordPress address show up correctly. This isn't a WordPress question though, as I'm quite sure there is a more general solution I'm missing, perhaps to do with NGINX or the load balancer configuration. I think PHP detects the host and passes it along to WordPress, but I'm not clear how.
I found a reference somewhere in the google documentation how to manually assign a domain to a VM but not sure that's what's needed here.
Further to this, I'm totally unclear how I would set up HTTPS between the VM and the load balancer, yet only have one domain/IP address for the global forwarding rule. Maybe a separate question.
Bitnami Engineer here. If you already have the domain, certificates and the Load Balancer in place, you will need to configure WordPress to use that domain name as the default domain of the application. To do so, edit the wp-config.php file and set these lines:
define('WP_SITEURL', 'http://DOMAIN/');
define('WP_HOME', 'http://DOMAIN/');
More info: https://docs.bitnami.com/google/apps/wordpress-pro/administration/configure-domain/
If you also want NGINX to redirect to your domain no matter how the application is accessed, you can add this configuration line
return 301 https://DOMAIN$request_uri;
in the /opt/bitnami/nginx/conf/bitnami/bitnami.conf file
More info: https://docs.bitnami.com/google/apps/wordpress-pro/administration/force-https-nginx/
I had better luck having the load balancer talk to my VM over HTTPS. Once I got that working, I didn't have to make any changes to wp-config.php. In this case I didn't bother with Varnish because I think it only supports HTTP. I'm hoping Google's CDN will be sufficient for caching, and I may try a helper plugin in WordPress.
To redirect HTTP to HTTPS, I followed the Bitnami instructions to set up two front ends on the load balancer pointing to the same static IP address. Then, in my NGINX server blocks, I added a redirect line in the https block (not the http block), since the Google load balancer communicates with my backend via HTTPS. Google sets the X-Forwarded-Proto header (exposed in NGINX as $http_x_forwarded_proto) to http for requests that reached the load balancer over plain HTTP, so I check that and redirect if necessary.
if ($http_x_forwarded_proto = "http") { return 301 https://$host$request_uri; }
The Bitnami stack is amazing, everything seems extremely fast!
