How to interpret the access log of an Apache server?

I checked this link, but I couldn't understand it as they have used a different format.
This is an entry in my access log.
127.0.0.1 - - [13/Aug/2012:13:39:53 +0530] "GET /cgi-bin/test.cgi HTTP/1.1" 200 48
I want to know the meaning of the last number.
I know that 200 is the OK status code, but what is 48 (or whatever other number appears there)?
Please help!

I think it means the size of the response in bytes.
Regards
mimiz
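That's right: in the Common Log Format, the last two fields are the status code and the size of the response body in bytes. A quick sketch of pulling both out with awk, using the log line from the question:

```shell
# In CLF, the status code is the 9th whitespace-separated field and the
# response body size (in bytes) is the 10th.
line='127.0.0.1 - - [13/Aug/2012:13:39:53 +0530] "GET /cgi-bin/test.cgi HTTP/1.1" 200 48'
echo "$line" | awk '{print "status:", $9, "bytes:", $10}'
# prints: status: 200 bytes: 48
```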

Related

What is this second number after HTTP status code?

I ran grep " 500 " on the request log and got this. I understand that the HTTP code is 200, but what is this 501?
POST /api/v1/url HTTP/1.1" 200 501
I suppose your server is Apache and the logs are in CLF (Common Log Format).
In that case, 501 is the total size of the server's response body, in bytes.
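One practical note: grep " 500 " can also match a byte count (or any other numeric field) that happens to be 500. Anchoring on the ninth whitespace-separated field, the status code in CLF, avoids that. A minimal sketch over a sample log written to a temp file:

```shell
# Sample CLF log: one 200 response and one 500 response.
cat > /tmp/sample_access.log <<'EOF'
10.0.0.1 - - [01/Jan/2020:00:00:00 +0000] "POST /api/v1/url HTTP/1.1" 200 501
10.0.0.1 - - [01/Jan/2020:00:00:01 +0000] "POST /api/v1/url HTTP/1.1" 500 312
EOF
# Match only lines whose status field ($9 in CLF) is exactly 500 --
# unlike grep " 500 ", this cannot hit byte counts or timings.
awk '$9 == 500' /tmp/sample_access.log
```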

LinkedIn Link Sharing: Open Graph Image Issue

NOTE: I've seen a variety of similar questions to this one so I'm going to try and be succinct in describing the issue along with what I've done so far.
The issue is that LinkedIn is failing to properly scrape images from articles on a WordPress site. The site is using All-in-One SEO to add the appropriate meta tags and, judging by Facebook's sharing and object debuggers, is doing so correctly.
Here's a sample article that demonstrates the issue.
Upon entering the URI into a LinkedIn article, LinkedIn attempts to fetch the page's data. It returns the title and description but leaves an empty space where the image would presumably display.
In tailing the access logs, I've seen LinkedIn hitting the site along with 200 status codes for the page itself and the image:
[ip redacted] - - [29/Mar/2017:19:50:44 +0000] "GET /linkedin-test/?refTest=LI17 HTTP/1.1" 200 23758 "-" "LinkedInBot/1.0 (compatible; Mozilla/5.0; Jakarta Commons-HttpClient/3.1 +http://www.linkedin.com)" 0.906 "[ips redacted]"
[ip redacted] - - [29/Mar/2017:19:50:44 +0000] "GET /wp-content/uploads/2017/03/modern-architecture-skyscrapers-modest-with-images-of-modern-architecture-ideas-on-gallery.jpg HTTP/1.1" 200 510088 "-" "LinkedInBot/1.0 (compatible; Mozilla/5.0; Jakarta Commons-HttpClient/3.1 +http://www.linkedin.com)" 0.000 "[ips redacted]"
Following some other Stack Overflow threads, I experimented with the following:
Attempt to bust LinkedIn's cache with query strings
Result: No change with query strings or completely new URLs
Verify image dimensions for og:image resource are not too small
Result: No change, even using images that match or exceed the dimensions indicated in LinkedIn's knowledge base
Revise all meta tags to include prefix="og: http://ogp.me/ns#"
Result: No change
I feel like I've hit a wall so any suggestions or thoughts are definitely welcome!
Thanks!
This seems to work fine for me:
https://www.linkedin.com/sharing/share-offsite/?url=https%3A%2F%2Fdev-agilealliance.pantheonsite.io%2F
Source: the official Microsoft documentation for Sharing on LinkedIn.
So, what changed? I did some digging.
Let's look at the 2018 robots.txt file for dev-agilealliance.pantheonsite.io and compare it to the working 2020 robots.txt file. It seems pretty clear why things were blocked in 2018...
# Pantheon's documentation on robots.txt: http://pantheon.io/docs/articles/sites/code/bots-and-indexing/
User-agent: *
Disallow: /
And this is the working robots.txt in 2020...
User-agent: RavenCrawler
User-agent: rogerbot
User-agent: dotbot
User-agent: SemrushBot
User-agent: SemrushBot-SA
User-agent: PowerMapper
User-agent: Swiftbot
Allow: /
Of course, it could also be that sharing services are much more lenient in 2020 than they were in 2017! But I can confirm that things do seem to be working now with the configuration above.
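A quick way to confirm this kind of blocking is to check robots.txt for a catch-all Disallow. A rough, local sketch (real robots.txt parsing has group- and path-matching rules this grep ignores); the file contents are copied from the 2018 version above:

```shell
# Reproduce the 2018 robots.txt locally and do a crude check for the
# catch-all rule that was keeping LinkedInBot (and everyone else) out.
cat > /tmp/robots_2018.txt <<'EOF'
User-agent: *
Disallow: /
EOF
if grep -A1 '^User-agent: \*' /tmp/robots_2018.txt | grep -q '^Disallow: /'; then
  echo "all user agents disallowed - LinkedInBot cannot fetch the og:image"
fi
```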

Nagios returning OK with check_http and status 303

Probably an easy question but search is not helping.
# ../libexec/check_http -H google.co.uk
Provides:
HTTP OK: HTTP/1.1 301 Moved Permanently - 592 bytes in 0.153 second response time|time=0.152933s;;;0.000000 size=592B;;;0
But
# ../libexec/check_http -H google.co.uk.thisisnotarealurl
Provides:
HTTP OK: HTTP/1.0 303 See Other - 212 bytes in 0.161 second response time|time=0.161133s;;;0.000000 size=212B;;;0
How can it be showing HTTP OK when the site doesn't exist?
Nagios is showing the site is OK whether it exists or not, is this normal?
Status 303 means the resource is present, just somewhere else; the same goes for 301, with slightly different semantics. That's not a failure, so yes, this is normal: by default check_http treats any valid HTTP response, including redirects, as OK. If you want redirects flagged instead, the plugin's --onredirect option (e.g. --onredirect=critical) changes that behaviour.
The real question is why you get a 303 for google.co.uk.thisisnotarealurl at all. Maybe something in your network setup (a DNS proxy that always returns a result? See also James_pic's comment!)
What do you get if you enter that hostname into a browser on the same machine, with the same setup, as Nagios?

Nginx giving 400 error:

I am using Nginx to handle hits of API. I checked the access log and found that sometimes Nginx is giving 400 error.
GET /url to hit/ HTTP/1.1" 400 172 "-" "-"
What is the 172 in the above log, and how can I solve this error in Nginx?
172 corresponds to the size of the server response, in bytes. As for the 400 itself: a 400 with "-" for both the referrer and user agent is often a client that opened a connection and then sent a malformed or empty request, so it may not be something you can "solve" on the Nginx side.
Source: https://easyengine.io/tutorials/nginx/log-parsing/
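For reference, nginx's default "combined" log format puts the status and body-byte fields in the same positions as Apache's CLF, followed by the quoted referrer and user agent (both "-" here). A small sketch using a made-up log line:

```shell
# Hypothetical nginx "combined" log line; $9 is the status, $10 the body
# bytes, $11 the quoted referrer, $12 the quoted user agent.
line='203.0.113.5 - - [01/Jan/2020:00:00:00 +0000] "GET /some/url/ HTTP/1.1" 400 172 "-" "-"'
echo "$line" | awk '{print "status:", $9, "body bytes:", $10, "referrer:", $11, "user agent:", $12}'
# prints: status: 400 body bytes: 172 referrer: "-" user agent: "-"
```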

How can I measure my (SAMP) server's bandwidth usage?

I'm running a Solaris server to serve PHP through Apache. What tools can I use to measure the bandwidth my server is currently using? I use Google analytics to measure traffic, but as far as I know, it ignores file size. I have a rough idea of the average size of the pages I serve, and can do a back-of-the-envelope calculation of my bandwidth usage by multiplying page views (from Google) by average page size, but I'm looking for a solution that is more rigorous and exact.
Also, I'm not trying to throttle anything, or implement usage caps or anything like that. I'd just like to measure the bandwidth usage, so I know what it is.
An example of what I'm after is the usage meter that Slicehost provides in their admin website for their users. They tell me (for another site I run) how much bandwidth I've used each month and also divide the usage for uploading and downloading. So, it seems like this data can be measured, and I'd like to be able to do it myself.
To put it simply, what is the conventional method for measuring the bandwidth usage of my server?
This depends on your setup. If you have a (near-)dedicated physical interface for your web server, you could gather stats straight from the interface.
Methods to do this could include SNMP (try net-snmp) or ifconfig, combined with RRDTool or simple logging to flat files.
An alternative is using the Apache log, which could look like this:
192.168.101.155 - - [17/Apr/2005:20:39:19 -0700] "GET / HTTP/1.1" 200 1456
192.168.101.155 - - [17/Apr/2005:20:39:19 -0700] "GET /apache_pb.gif HTTP/1.1" 200 2326
192.168.101.155 - - [17/Apr/2005:20:39:19 -0700] "GET /favicon.ico HTTP/1.1" 404 303
192.168.101.155 - - [17/Apr/2005:20:39:42 -0700] "GET /index.html.ca HTTP/1.1" 200 1663
192.168.101.155 - - [17/Apr/2005:20:39:42 -0700] "GET /apache_pb.gif HTTP/1.1" 304 -
192.168.101.155 - - [17/Apr/2005:20:39:43 -0700] "GET /favicon.ico HTTP/1.1" 404 303
192.168.101.155 - - [17/Apr/2005:20:40:01 -0700] "GET /apache_pb.gif HTTP/1.1" 304 -
192.168.101.155 - - [17/Apr/2005:20:40:09 -0700] "GET /apache_pb.gift HTTP/1.1" 404 306
192.168.101.155 - - [17/Apr/2005:20:40:09 -0700] "GET /favicon.ico HTTP/1.1" 404 303
The last number is the number of bytes transferred in the response body, excluding the headers(!). A "-" means no body was sent, as with the 304 responses above. See the Apache log documentation on the %b format directive.
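As a sketch, that byte column can be totalled directly with awk (assuming CLF; "-" entries, like the 304s above, are skipped, and header bytes are not counted, so this undercounts actual wire traffic):

```shell
# Three of the sample lines, written to a temp file for illustration.
cat > /tmp/access_sample.log <<'EOF'
192.168.101.155 - - [17/Apr/2005:20:39:19 -0700] "GET / HTTP/1.1" 200 1456
192.168.101.155 - - [17/Apr/2005:20:39:42 -0700] "GET /apache_pb.gif HTTP/1.1" 304 -
192.168.101.155 - - [17/Apr/2005:20:39:43 -0700] "GET /favicon.ico HTTP/1.1" 404 303
EOF
# Sum the 10th field, skipping "-" (no body sent); headers are excluded.
awk '$10 != "-" { total += $10 } END { printf "%d bytes transferred\n", total }' /tmp/access_sample.log
# prints: 1759 bytes transferred
```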
I am just guessing, but I think the usual approach is to use the same tools and services that are used to deliver QoS (Quality of Service) features. Somewhere on the server itself, or on the network routers around it, there will be services enabled that measure the size of the packets flowing out of your server; the same services can be used to limit bandwidth for customers on whom such limits need to be enforced. I have not heard of an application that runs on the server itself just to measure bandwidth. It should be possible to write one, but that is not the usual way such measurements are collected. I suspect this answer will end up not being Solaris-specific.