Whenever you change the configuration of the Reblaze platform, you must save your changes, and in almost all cases, you must publish them to the cloud as well.
If you do not save your changes, they will be lost as soon as you navigate to a different page within the user interface; you will not be prompted to confirm abandoning them.
The Publishing process checks the saved changes for errors. If no errors are found, the changes are pushed to the cloud.
The Reblaze interface offers several ways to save changes.
In many places, a Save Changes button appears in the upper right part of the screen.
In other places, the Save functionality is part of a pull-down menu, designated by three vertical dots.
Remember that you must select Save Changes, in whichever form the button is offered, whenever you make any edits within the Reblaze interface.
Also remember that in almost all cases, Save Changes only saves your changes within your local session. To push them to your planet in the cloud, you must Publish Changes as well.
When editing Dynamic Rules, saving your changes will apply them to your planet.
In all other cases, after Saving your changes, you will also need to Publish them, which will push everything to the cloud.
Rather than trying to remember which activities require publishing and which do not, it is best to get into the habit of always Publishing after Saving.
In some cases, choosing the Save button will be followed by an immediate prompt to Publish. In other cases it will not, but Publishing is still required.
Publishing is done on the Planet Overview page, via the button on the upper right. The outcome will appear on the screen once it is complete.
As described in Traffic Concepts, Reblaze includes an Active Challenge process out of the box. This is very useful in distinguishing humans from bots.
Active Challenges work well, but an even better option is Passive Challenges.
Active Challenges temporarily redirect the user's browser. This can affect site metrics gathered by products such as Google Analytics. (Specifically, the initial referrer information is lost.) Passive Challenges are simple pieces of Javascript. They do not redirect the user's browser; they merely ask it to solve a challenge, and then insert the Reblaze cookies.
Active Challenges will not occur when site content is served from a CDN. Passive Challenges can still detect bots in this situation.
Most importantly, Passive Challenges allow Reblaze to use Biometric Bot Detection—an advanced and sophisticated means of distinguishing humans from automated traffic sources.
With Biometric Bot Detection, Reblaze continually gathers and analyzes stats such as client-side I/O events triggered by the user's keyboard, mouse, scroll, touch, zoom, device orientation, movements, and more. Based on these metrics, the platform uses Machine Learning to construct and maintain behavioral profiles of legitimate human visitors. Reblaze learns and understands how actual humans interact with the web apps it is protecting. Continuous multivariate analysis verifies that each user is indeed conforming to expected behavioral patterns, and is thus a human user with legitimate intentions.
We recommend that all customers enable Passive Challenges if it is possible to do so. Biometric Bot Detection provides much more robust detection of automated traffic than is possible without it.
Implementing Passive Challenges is simple. Place this Javascript code within the pages of your web applications:
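The exact snippet is specific to your Reblaze deployment. As an illustration only (the src path below is a hypothetical placeholder), it takes the general form of a script tag:

```html
<!-- Illustrative only: the src path is a placeholder. -->
<!-- Use the snippet provided for your own Reblaze deployment. -->
<script src="https://www.example.com/path/to/reblaze-challenge.js"></script>
```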
The code snippet can go into the header or at the end of the page. The best practice is to place it within the header. This ensures that subsequent calls contain authenticating cookies.
(This matters for the first page served to a visitor. Subsequent pages will already have the authenticating cookies to include with their requests.)
Most customers set up the code snippet as a tag within their tag manager. This makes it simple to install it instantly across their entire site/application.
If desired, the script tag can include async and/or defer attributes:
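A hypothetical illustration, using the same placeholder path as above:

```html
<!-- Illustrative only: placeholder path, with async/defer attributes added. -->
<script src="https://www.example.com/path/to/reblaze-challenge.js" async defer></script>
```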
These usually are not necessary, and their effect will depend on the placement of the script within the page. Their use is left to your discretion.
To test the implementation, use a browser to visit a page containing the Javascript snippet. Once it runs, the browser should have a cookie named rbzid.
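One quick way to check is through the browser's developer tools (e.g., the cookie storage viewer), or with a minimal Javascript sketch in the console (assuming the cookie is not flagged HttpOnly):

```javascript
// Logs whether a cookie named "rbzid" is present for the current site.
console.log(document.cookie.split('; ').some(function (c) {
  return c.indexOf('rbzid=') === 0;
}));
```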
There are two primary situations where customers sometimes want to disable Active Challenges:
When a customer needs site analytics to correctly reflect all referrers. (Active Challenges can interfere with this.)
For API endpoints. Active Challenges are designed to verify the client's browser environment; for most API calls, there is no browser environment to verify. (For users of our Mobile SDK, this is not a problem. They can still use active challenges for these endpoints.)
Other than those situations, Active Challenges can be very beneficial.
We recommend that you keep Active Challenges enabled if possible. They automatically eliminate almost all DDoS traffic, scanning tools, and other hostile bot traffic.
If you wish to turn off Active Challenges, do the following.
For specific URLs/locations: remove the default "Deny Bot" ACL Policy from all Profiles that are assigned to those locations.
For an entire site/application: remove the "Deny Bot" ACL Policy from all Profiles within the site.
For specific traffic sources: Add an ACL Policy that will 'Allow' those specific requestors. The requestors should be defined as narrowly as possible.
Turning off Active Challenges will disable the direct blocking of bots (where a requestor is blocked merely for being identified as a bot). However, automated traffic will still be excluded via all other relevant means.
If you merely remove the Deny Bot ACL Policy from the relevant Profiles, then bots will still be excluded by the other active ACL Policies, Dynamic Rules, content filtering, and so on. If instead you add an "Allow" ACL Policy for specific requestors, then other ACL Policies will not block those requestors; they will be exempted from ACL Policy filtering.
If you have not enabled Passive Challenges (and successfully tested them), disabling Active Challenges is not recommended.
This section describes best practices for some common tasks.
Leveraging Reblaze's traffic data
A typical security solution provides information only about the traffic that it has blocked. In contrast, Reblaze shows all of the traffic that it receives.
This provides you with a powerful ability to dig into your traffic data and gain deep understanding about the nature and disposition of incoming requests.
Metrics about blocked requests are very useful, but their usefulness is multiplied when you can compare them to passed requests. By showing all requests, Reblaze allows you to understand what "normal" requests look like. This gives you more insights into abnormal traffic.
The discussion below assumes you have already read the preceding sections of this documentation.
Reblaze provides multiple ways to view your traffic. The discussion below will focus on the View Log. Similar tactics can be applied when viewing other parts of the interface.
Sometimes the View Log is used to answer a specific question about a request or a traffic source. At other times, it might be a more general exploration; for example, a beginning-of-the-day review of what happened over the last 24 hours.
Let's discuss the latter scenario (a general exploration). Because the View Log shows all requests, it can be overwhelming. It's helpful to start by excluding requests that aren't as relevant to your current purpose. Examples:
Exclude passed requests, and show only the challenges or blocked requests: reason:challenge (to show only challenges), or is:blocked (to show only blocked requests).
Exclude requests being rejected by the origin (i.e., the upstream server): reason:!by origin
Exclude requests from a banned IP, e.g. ip:!1.2.3.4
Another approach is to exclude them by blocking reason, e.g. reason:!autoban
Exclude requests that are obviously invalid, e.g. those with unrecognized host headers.
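These exclusion filters can be chained with commas (the syntax is described under "How to filter results" below). For example, to show only blocked requests while excluding those rejected by the origin:

is:blocked, reason:!by origin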
Eventually, you can filter the display down to a list of challenges or blocked requests that might produce some insights.
From this point, you are looking for possible patterns, or unusual outliers, and considering possible actions to take. For example:
Is the same IP failing multiple challenges? It might be interesting to filter on that IP only, and go through all of its activity, to see what that traffic source was trying to do. (You can see that a challenge is being failed when the challenge itself appears in the logs, but it is not followed by a successful request by the IP for the same URI.)
Are there many blocked requests for the same URI, but from different traffic sources? This might be a False Positive alarm. See below for more on this.
There is no set procedure for this process. Essentially, you are browsing the list of challenges or blocked requests, thinking about why you are seeing the entries there, and asking further questions.
Sometimes, request data will be encoded. Here's an example of an XSS attempt.
The script in the request is URL-encoded. To see it in plain text:
Double-click on the reason (the yellow label beginning with XSS).
The full contents of the label will appear in the query box, decoded.
For some forms of encoding, more processing is required.
Reblaze does not perform all possible forms of decoding in the interface.
Double-encoded requests will only have their first level of encoding undone.
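For instance, here is a minimal Javascript sketch of what double-encoding looks like, and why a second decoding pass is needed:

```javascript
// A double-encoded "<script>" tag: the "%" characters were themselves encoded as "%25".
var raw = "%253Cscript%253E";

var once = decodeURIComponent(raw);   // "%3Cscript%3E" -- still encoded
var twice = decodeURIComponent(once); // "<script>"     -- fully decoded

console.log(once, twice);
```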
A “False Positive” (FP) alarm occurs when a security system misinterprets a non-malicious activity as an attack. These errors are a critical issue for cybersecurity.
Although it might seem that FP errors do not necessarily have serious consequences, incorrect security alerts can lead to significant monetary losses. For example, an e-commerce site might incorrectly exclude real online shoppers. Or, it might reject "good" web crawlers, which would reduce its Internet visibility.
FP diagnosis is often done when Reblaze is first deployed, before it has been fine-tuned for the customer's applications. It can also be done later, when you discover that certain requests are being blocked, but you do not understand why.
When examining blocked request(s), it can be helpful to ask questions such as the following.
Sometimes an application will expect and accept inputs that Reblaze blocks by default. Example: the site might use a CMS which accepts HTML/Javascript POST requests. However, out of the box, Reblaze is often configured to block requests containing HTML or Javascript. Thus, Reblaze would need to be re-configured to match the application it is protecting. (In this example, it's a simple toggle switch in the WAF/IPS Policies interface.)
If the range of valid inputs has changed, but Reblaze was not re-configured, then FP errors can result.
Is the block happening to multiple users?
If a single IP is being blocked, but other requestors for the same URI are being allowed through, the block is more likely to be the correct response to this IP's request. This is especially true when the single IP has attempted other questionable activities.
Conversely, if multiple requestors are being blocked, and they seem to be innocuous otherwise, this indicates the problem might be a FP.
There are some situations where a request is blocked even though Reblaze has no obvious reason to be blocking it. This can occur when Reblaze is not actually the entity making the decision.
The upstream server can reject requests. These requests can be displayed in the View Log with reason:by origin as the filter.
Reblaze can use ACLs with external sources of information, e.g. Spamhaus. Example: a request is blocked with a reason of acl-ip. This indicates that a Reblaze ACL blocked the request because of its IP. But what if you didn't configure any ACLs to reject specific IPs? The answer is that there are several ACLs which rely on external lists of hostile IPs, such as Spamhaus DROP and EDROP.
Example: a web user visited a landing page, entered contact details into a form, and then tried to proceed to the next page. However, the request to proceed was blocked.
This could be a FP due to "junk input." Perhaps the user entered a phone number of "1111111111" and it was rejected by the upstream application. Or perhaps the page itself autocompleted a field and inadvertently created junk input on its own.
If requestors are being blocked for violating rate limits, and the rate limits are very stringent, then perhaps the limits are too tight, and they need to be relaxed.
Or, perhaps the block is the result of content filtering. This feature is powerful, and it is possible to mistakenly configure it to be too restrictive.
For example, suppose requests are being blocked with a reason of acl-custom-sig, indicating that a custom signature matched them. Looking up the custom signature shows that its "Is matching with" condition is the following regex:
(?i)(select|create|drop|\<|>|alert)\W
The admin wrote this regex in order to identify SQL injection attempts (i.e., SELECT, CREATE, and DROP commands). Normally, SQL and code injection attempts are blocked automatically by the WAF, but occasionally customers choose to disable this.
Now let's examine one of the blocked requests in the View Log. Its Capture Vector is this (URL-encoded):
https://www.pinterest.com/pin/create/button/?url=https%3A%2F%2Fexample.com%2Fpitbull-2012%3Frt%3Dstorefront%26rp%3Dpitbull%26rn%3DPitbull%26s%3Dhanes-S04V%26c%3DBlack%26p%3DFRONT&media=&description=
Decoded with an external URL-decoding tool, it becomes:
https://www.pinterest.com/pin/create/button/?url=https://example.com/pitbull-2012?rt=storefront&rp=pitbull&rn=Pitbull&s=hanes-S04V&c=Black&p=FRONT&media=&description=
Pasting the regex and the decoded string into a regex-testing tool shows which part of the string matches: here, the substring create/ within the URL path.
We see that the expression meant to identify the SQL command CREATE also matches a URL generated by a user who was trying to feature this site's product on Pinterest.
This indicates that the regex should probably be modified, so that it only accomplishes its intended purpose. In this example, it could be modified as follows:
(?i)(\sselect|\screate|\sdrop|\<|>|alert)\W
This would require a space to be found before each potential SQL command, thus eliminating matches when those words are found in a URL.
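As a sanity check, here is a minimal Javascript sketch (not part of Reblaze itself) comparing both versions of the signature against the decoded Capture Vector from above:

```javascript
// The original signature and the revised one ((?i) becomes the /i flag in Javascript).
var original = /(select|create|drop|<|>|alert)\W/i;
var revised  = /(\sselect|\screate|\sdrop|<|>|alert)\W/i;

var url = "https://www.pinterest.com/pin/create/button/?url=https://example.com/" +
          "pitbull-2012?rt=storefront&rp=pitbull&rn=Pitbull&s=hanes-S04V&c=Black&p=FRONT" +
          "&media=&description=";

console.log(original.test(url)); // true  -- "create/" in the path matches: a false positive
console.log(revised.test(url));  // false -- "create" is not preceded by whitespace
```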
As mentioned earlier, there is no set procedure for the process of identifying the underlying reasons why requests are blocked. It requires digging into the requests, gathering data, and asking questions.
Hopefully the examples above are helpful in illustrating the necessary thought process, and some of the tools that are available.
How to filter results
The Reblaze query command box allows you to query the data in order to present only the records that you want. The structure of a filter is operator:value.
For example, to view only traffic from IP 10.11.12.13:
ip:10.11.12.13
An exclamation mark ("!") can be used with various filters as a "NOT" operator. So, to exclude requests from an IP, add "!" as a prefix:
ip:!10.11.12.13
Note that filters can be combined to fine-tune a search, by chaining them with a comma. For example, in order to show all traffic from France that was blocked:
country:France, is:blocked
The fastest way to build query strings is to double-click on labels of existing log entries. This will add the value of each label as a filter to the query box. When you have constructed the entire query string, click on the search button, or place the cursor into the text field and hit "Return" on your keyboard.
For example, suppose you are looking at an entry for a blocked request (with the IPs censored for privacy reasons), and you want to quickly see what other requests produced a 403 error that came in from the United States on a cloud IP. Double-clicking on the "403" label, the "United States" label, and the "Cloud" label builds the query string for you automatically.
Then select the "search" button, or place the cursor within the query box and hit "Return" on your keyboard, to process the query.
In some situations, this technique has limited usefulness. In the XSS example shown earlier, double-clicking on the label beginning with "XSS" would add a (very) long filter string to the query, because that label contains the full script from the injection attempt (although the label itself only displays the first few characters in the interface).
If this is what you want (perhaps you need to search only for logged requests which contain that exact script), then this is effective. But if you wanted to build a more general query, then this is not useful.
A better approach is to choose a representative substring.
Queries can be based on substrings. Internally, Reblaze executes a "contains" search, so queries will return everything that contains the string that was specified.
In some situations, you'll want to build queries that produce exact matches. But in many other situations, it's faster just to search for a relevant substring instead.
Examples:
reason:XSS will show all records with "XSS" in their reason label.
url:login will include all URLs that include "login".
ip:191 will display all IP addresses containing "191".
Regular expressions can be applied to the filter value to make the search more accurate. The regex matching is done by wrapping the search value with re(REGEX here). Examples:
is:re(200|401)
ua:re(iOS|iPhone|iPad)
If you wish to display login requests with method “POST” which were blocked, but not by the origin (upstream server):
url:/login, is:POST, is:blocked, is:!origin
This displays those requests which produced 502 or 504 error codes:
is:re(502|504)
This displays the requests that contain a specific value in an argument:
arg_name:text, arg_value:free_text
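For instance (using a hypothetical argument name and value), to find requests in which an argument named email contains "gmail":

arg_name:email, arg_value:gmail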
Are there a lot of requests for a Wordpress file, but your site does not use Wordpress? These are coming from malicious traffic sources. It might be useful to set up a Dynamic Rule, e.g. to Ban requestors who submit more than one request for that file in a three-minute period.
Sometimes an encoded string needs more processing than the interface provides. In those cases, you can highlight the string in the query box, copy it, and run it through an external decoding tool. Likewise, to understand why a condition matched a request (for example, a custom signature that produces reason:acl-custom-sig, as in the False Positive walkthrough above), an online regex tester is useful: paste in the regex and the decoded string, and see if/where a match occurs.
| Name | Explanation |
| ---- | ----------- |
| After | Log records registered after a given date and time. |
| Before | Log records registered before a given date and time. |
| arg_name | Search for an argument by name. |
| arg_value | Search for an argument value. |
| asn | Autonomous System Number for the IP. |
| city | Name of the city associated with the IP address. |
| country | Name of the country. |
| cookie_rbzid | Reblaze ID cookie value (base64). |
| cookie_name | Search by cookie name. |
| cookie_value | Search by cookie value. |
| domain_name | The main domain name in Reblaze (not necessarily the actual Host header). |
| ext | File extension. |
| header_name | Name of a header. |
| header_value | Value found in header_name. |
| host | HTTP Host header: usually, the domain name of the application. |
| id | Reblaze unique request ID. |
| ip | IP address of the web client. |
| is | A special operator that can be used for almost all attributes of a request, e.g. to filter by HTTP method, "flags", or protocol version. For instance, to get blocked POST requests made by a human, type: is:blocked, is:human, is:post. |
| mime | MIME type (Multipurpose Internet Mail Extensions). |
| origin | The IP address of the origin server. |
| path | The path of the request (the URL without the query string part). |
| proxy | The IP address of the particular Reblaze server. |
| rbzid | Hash value of the Reblaze ID cookie value (md5). |
| reason | Reason the request was blocked or passed. |
| ref | HTTP Referer header. |
| sslcipher | The chosen SSL cipher. |
| sslprotocol | The chosen SSL protocol. |
| status | Response status code (regex is acceptable, e.g. [45]\d\d). |
| ua | HTTP User Agent header. |
| upstream_status | Response status code from upstream servers (regex is acceptable, e.g. [45]\d\d). |
| url | A substring to search for within the complete URL request line (including the query string). |