Access log structure
Each log line is structured as follows. Note that some fields are quoted and some are not. If more fields are added in the future, they will be appended on the right side.

Field names:

Field                   Description
remote_addr             Client IP
timestamp               Timestamp
status                  Response status code sent to the client
bytes_sent              Number of bytes sent in the response
method                  HTTP request method
request                 The complete URL (including the query string)
proto_version           Protocol version (1.0/1.1)
blocked                 Was it blocked by Reblaze?
is_human                Was it marked as human?
block_reason            If blocked or exceptionally passed, the reason why
geoip_country_name      Country name
geoip_country_code      Country code
request_id              Unique ID of this request within Reblaze
captured_vector         The attack vector that was captured
request_time            The time it took our system to process the request
upstream_addr           The address of the upstream server(s) that Reblaze contacted
upstream_response_time  The time Reblaze waited for the upstream server(s) to return the response
domain_name             The domain name of the server group in Reblaze
host                    The Host header of this request (same as the domain name, or one of its aliases)
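
The timestamp field is documented only as "Timestamp". The example line on this page shows 1465461803.223, which looks like Unix epoch seconds with a millisecond fraction. Assuming that interpretation holds (it is inferred from the example, not a documented guarantee), a minimal sketch for converting a few fields of an already-parsed record could look like this. The function name convert_types and the dictionary shape are illustrative only.

```python
from datetime import datetime, timezone

def convert_types(record):
    """Return a copy of `record` with a few string fields converted.

    Assumptions (not confirmed by the field table): `timestamp` is Unix
    epoch seconds, `status` and `bytes_sent` are integers, and
    `request_time` is a duration in seconds.
    """
    typed = dict(record)
    typed["timestamp"] = datetime.fromtimestamp(float(record["timestamp"]), tz=timezone.utc)
    typed["status"] = int(record["status"])
    typed["bytes_sent"] = int(record["bytes_sent"])
    typed["request_time"] = float(record["request_time"])
    return typed

# Values taken from the example line on this page.
raw = {"timestamp": "1465461803.223", "status": "200",
       "bytes_sent": "3158", "request_time": "0.729"}
typed = convert_types(raw)
print(typed["timestamp"].isoformat())
```

Fields such as upstream_response_time are left as strings here, because they may hold a placeholder like "-" rather than a number.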

Example line (sliced into smaller parts):

15.14.13.12 1465461803.223 200 3158 \
"POST /payment_service_api/get_all_payment_methods.json HTTP/1.1" "0" "0" \
"" "Singapore" "SG" "foobar-rbzr131343635343631383032ce9df9a6a90fa3bc" "-" \
0.729 "8.12.40.38:443" "0.418" "secure.foobar.com" "secure.foobar.com" "-" \
"REEBONZ 5.2.2 rv:3 (iPhone; iPhone OS 9.3.2; en_SG)" "-" \
"eyJob3N0Ijoic2VjdXJlLnJlZWJvbnouY29tIiwieC1uZXdyZWxpYy1pZCI6IlVnWURVRkJBQ1FzSlZGbGFCUT09IiwiY29udGVudC10eXBlIjoiYXBwbGljYXRpb25cL3gtd3d3LWZvcm0tdXJsZW5jb2RlZDsgY2hhcnNldD11dGYtOCIsImNvbm5lY3Rpb24iOiJjbG9zZSIsImNvbnRlbnQtbGVuZ3RoIjoiMTQ0IiwiYWNjZXB0LWVuY29kaW5nIjoiZ3ppcCIsInBvc3RfcmVxdWVzdF9ib2R5Ijp7ImNvdW50cnlfY29kZSI6IlNHIiwic2lnbmF0dXJlIjoiMGE2YzQwZTI3ZGRmMGYyNjcxMzM2YjhiYTljYjVmYzIiLCJkYXRldGltZSI6IjIwMTYtMDYtMDkgMTY6NDM6MjEiLCJkcnVwYWxfdWlkIjoiNTI0NTgxNiIsInBsYXRmb3JtX25hbWUiOiJNb2JpbGUiLCJidV9jb2RlIjoiMDEifSwidXNlci1hZ2VudCI6IlJFRUJPTlogNS4yLjIgcnY6MyAoaVBob25lOyBpUGhvbmUgT1MgOS4zLjI7IGVuX1NHKSJ9" \
"AS4773 MobileOne Ltd. Mobile/Internet Service Provider Singapore" "200" \
"/payment_service_api/get_all_payment_methods.json" "foobar-rbzr1" "0" "0" "0" "0" \
"0" "83d90abb12608623ea23442273249146" "466" "max-age=0, private, must-revalidate" \
"-" "-" "text/html; charset=utf-8" "-" "TLSv1.2" "ECDHE-RSA-AES128-GCM-SHA256" "-"
The headers field is a base64-encoded JSON string containing the header names and values.
If a request is blocked, either by ACL or by WAF/IPS, the blocked field is set to "1"
and the block_reason field contains a description.
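
As an illustration of how the blocked and block_reason fields might be used together, here is a minimal sketch that tallies block reasons across parsed records. It assumes each record is a dictionary keyed by the field names listed above. The function name and the sample reason strings are hypothetical, not values taken from a real log.

```python
from collections import Counter

def count_block_reasons(records):
    """Count block_reason values over records whose blocked field is "1"."""
    return Counter(
        rec.get("block_reason", "")
        for rec in records
        if rec.get("blocked") == "1"
    )

# Hypothetical records; the reason strings are made up for this sketch.
sample = [
    {"remote_addr": "15.14.13.12", "blocked": "0", "block_reason": ""},
    {"remote_addr": "203.0.113.7", "blocked": "1", "block_reason": "example-acl-reason"},
    {"remote_addr": "203.0.113.8", "blocked": "1", "block_reason": "example-acl-reason"},
]
reasons = count_block_reasons(sample)
print(reasons.most_common())
```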

Decoding the headers field of the example line yields the following:

{
    "host": "secure.reebonz.com",
    "x-newrelic-id": "UgYDUFBACQsJVFlaBQ==",
    "content-type": "application/x-www-form-urlencoded; charset=utf-8",
    "connection": "close",
    "content-length": "144",
    "accept-encoding": "gzip",
    "post_request_body": {
        "country_code": "SG",
        "signature": "0a6c40e27ddf0f2671336b8ba9cb5fc2",
        "datetime": "2016-06-09 16:43:21",
        "drupal_uid": "5245816",
        "platform_name": "Mobile",
        "bu_code": "01"
    },
    "user-agent": "REEBONZ 5.2.2 rv:3 (iPhone; iPhone OS 9.3.2; en_SG)"
}
Note that within the headers, we add an entry for the post_request_body.
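
The decoding step itself can be sketched with the Python standard library. This is a minimal example, assuming the field uses standard, padded base64 and that an empty value or "-" means no headers were logged. The function name decode_headers is illustrative, and the sample value is built inside the snippet rather than copied from a real log.

```python
import base64
import json

def decode_headers(encoded):
    """Decode the base64 headers field into a dict.

    Returns an empty dict for an empty value or the "-" placeholder
    (an assumption about how missing headers are logged).
    """
    if not encoded or encoded == "-":
        return {}
    return json.loads(base64.b64decode(encoded))

# Build a small sample value the same way the log field is described:
# a JSON object of headers, base64-encoded.
sample = base64.b64encode(json.dumps({
    "host": "secure.foobar.com",
    "post_request_body": {"bu_code": "01"},
}).encode("utf-8")).decode("ascii")

headers = decode_headers(sample)
print(headers["host"], headers["post_request_body"])
```

Malformed input raises ValueError (both base64 and JSON errors derive from it), so callers that process untrusted logs may want to catch that.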

Python Parser Example:

import re

# Verbose (re.X) pattern: whitespace and "#" comments inside it are ignored,
# so each line documents the field it captures.
access_line_rec = re.compile(r'''  # -- line by line
    (\S+)\s              # remote_addr
    (\S+)\s              # timestamp
    (\S+)\s              # status
    (\S+)\s              # bytes_sent
    "(\S+)\s             # method
    (.+)                 # request
    \s(HTTP/\d\.\d)"\s   # protocol/version
    "([^"]*)"\s          # blocked
    "([^"]*)"\s          # is_human
    "([^"]*)"\s          # block_reason
    "([^"]*)"\s          # geoip_country_name
    "([^"]*)"\s          # geoip_country_code
    "([^"]*)"\s          # request_id
    "([^"]*)"\s          # captured_vector
    (\S+)\s              # request_time
    "([^"]*)"\s          # upstream_addr
    "([^"]*)"\s          # upstream_response_time
    "([^"]*)"\s          # domain_name
    "([^"]*)"\s          # host
    "([^"]*)"\s          # referer
    "([^"]*)"\s          # user_agent
    "([^"]*)"\s          # cookie
    "([^"]*)"\s          # request_headers (base64-encoded JSON)
    "([^"]*)"\s          # organization
    "([^"]*)"\s          # upstream_status
    "([^"]*)"\s          # uri
    "([^"]*)"\s          # hostname
    "([^"]*)"\s          # is_cloud
    "([^"]*)"\s          # is_tor
    "([^"]*)"\s          # is_vpn
    "([^"]*)"\s          # is_anonymizer
    "([^"]*)"\s          # is_proxy
    "([^"]*)"\s          # rbzsessionid
    "([^"]*)"\s          # request_length
    "([^"]*)"\s          # sent_http_cache_control
    "([^"]*)"\s          # sent_http_expires
    "([^"]*)"\s          # cookie_rbzid
    "([^"]*)"\s          # sent_http_content_type
    "([^"]*)"\s          # browsersig
    "([^"]*)"\s          # ssl_protocol
    "([^"]*)"\s          # ssl_cipher
    "([^"]*)"            # cache_status
    (.*)                 # anything else (fields added in the future)
    ''', re.X)

# One name per capture group, in the same order as the pattern above.
names = ("remote_addr", "timestamp", "status",
         "bytes_sent", "method", "request", "proto_version",
         "blocked", "is_human", "block_reason",
         "geoip_country_name", "geoip_country_code",
         "request_id", "captured_vector", "request_time", "upstream_addr",
         "upstream_response_time", "domain_name", "host", "referer",
         "user_agent", "cookie", "request_headers", "organization", "upstream_status",
         "uri", "hostname", "is_cloud", "is_tor", "is_vpn", "is_anonymizer",
         "is_proxy", "rbzsessionid", "request_length", "sent_http_cache_control",
         "sent_http_expires", "cookie_rbzid", "sent_http_content_type", "browsersig",
         "ssl_protocol", "ssl_cipher", "cache_status", "anything_else")

def parse_line(line, as_dict=False):
    """Parse one access log line.

    Returns a tuple of field values, or a dict keyed by `names` when
    as_dict is True. Returns None if the line does not match.
    """
    rmatch = access_line_rec.match(line)
    if not rmatch:
        return None
    g_match = rmatch.groups()
    if as_dict:
        # TODO: check whether named groups would be faster
        return dict(zip(names, g_match))
    return g_match
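
Because every field is either a quoted string or a run of non-space characters, and new fields are appended on the right, a tokenizer-based approach is one possible alternative to a fixed pattern. The sketch below is not the parser above and is not an official Reblaze tool. It names only the leading fields, keeps the quoted request line ("METHOD URL PROTOCOL") as a single token, and collects any remaining fields in an "extra" list. All names in it are illustrative, and the sample line is a shortened, made-up one.

```python
import re

# A token is either a double-quoted string (possibly empty)
# or a run of non-space characters.
TOKEN_RE = re.compile(r'"([^"]*)"|(\S+)')

# Leading fields only; the quoted request line stays one token here.
LEADING_FIELDS = ("remote_addr", "timestamp", "status", "bytes_sent",
                  "request_line", "blocked", "is_human", "block_reason",
                  "geoip_country_name", "geoip_country_code")

def tokenize(line):
    """Split a log line into field values, stripping surrounding quotes."""
    return [m.group(1) if m.group(1) is not None else m.group(2)
            for m in TOKEN_RE.finditer(line)]

def parse_leading(line):
    """Map the first tokens to LEADING_FIELDS; keep the rest under "extra"."""
    tokens = tokenize(line)
    record = dict(zip(LEADING_FIELDS, tokens))
    record["extra"] = tokens[len(LEADING_FIELDS):]
    return record

# Shortened, hypothetical line: ten leading fields plus two trailing ones.
sample = ('15.14.13.12 1465461803.223 200 3158 '
          '"POST /api/items.json HTTP/1.1" "0" "0" "" "Singapore" "SG" '
          '"-" "new-field"')
rec = parse_leading(sample)
print(rec["request_line"], rec["extra"])
```

This relies on field values never containing a literal double quote. If a logged value can contain one, the token pattern would need to handle that escaping.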