Merge lp:~maxiberta/canonical-identity-provider/update-robots.txt into lp:canonical-identity-provider/release

Proposed by Maximiliano Bertacchini
Status: Merged
Approved by: Daniel Manrique
Approved revision: no longer in the source branch.
Merge reported by: Otto Co-Pilot
Merged at revision: not available
Proposed branch: lp:~maxiberta/canonical-identity-provider/update-robots.txt
Merge into: lp:canonical-identity-provider/release
Diff against target: 25 lines (+13/-4)
1 file modified
src/webui/templates/static/robots.txt (+13/-4)
To merge this branch: bzr merge lp:~maxiberta/canonical-identity-provider/update-robots.txt
Reviewer Review Type Date Requested Status
Daniel Manrique (community) Approve
Review via email: mp+354785@code.launchpad.net

Commit message

Update robots.txt.

Description of the change

- Anchor URLs with a trailing '$' where appropriate.
- Duplicate rules with a percent-encoded '+' (currently cowboyed in prod as per https://portal.admin.canonical.com/C114092); see the sketch after this list. This should not be needed, but Bing seems happier with it, and it won't hurt anyway.
- Add /saml endpoint exclusion.
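
For context on the duplicated rules: a crawler that percent-encodes '+' before matching compares the literal string '/%2Bopenid' against the rules, so it never matches a rule written as '/+openid'. A minimal sketch of that failure mode, assuming plain prefix matching; the helper and example paths below are illustrative only, not ISD code:

def is_disallowed(path, rules):
    # Simple prefix matching, as in the original robots.txt draft spec.
    return any(path.startswith(rule) for rule in rules)

rules = ["/+openid", "/+logout"]

print(is_disallowed("/+openid", rules))    # True  - literal '+' form matches
print(is_disallowed("/%2Bopenid", rules))  # False - encoded form misses,
                                           # hence the duplicated '/%2B...' rules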

Revision history for this message
Daniel Manrique (roadmr) wrote :

This is the current robots.txt on production (cowboy):

User-agent: *
Disallow: /+bad-token
Disallow: /+deactivated
Disallow: /+logout
Disallow: /+logout-to-confirm
Disallow: /+openid
Disallow: /+saml
Disallow: /+suspended

# 2018-09-06 maxiberta cowboy (lp:1787823)
Disallow: /%2Bbad-token$
Disallow: /%2Bdeactivated$
Disallow: /%2Blogout$
Disallow: /%2Blogout-to-confirm$
Disallow: /%2Bopenid$
Disallow: /%2Bsaml$
Disallow: /%2Bsuspended$

I spot a few differences, see below, but +1'd in case I'm just misunderstanding where the $ ending is needed.

review: Approve
Revision history for this message
Maximiliano Bertacchini (maxiberta) wrote :

Currently in production, all rules with a '+' are open-ended (i.e. with an implicit '*' at the end), while all rules with '%2B' are exact matches. The proposed unified rules use exact matching where possible, and open-ended matching where URL variations ending in /* exist. Does that make sense? Thanks!
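
To make the distinction concrete, here is a rough sketch of that matching behaviour, where a trailing '$' anchors the rule and anything else is treated as an open-ended prefix (ignoring '*' wildcards inside rules; the function and sample paths are illustrative only, not any crawler's actual implementation):

def rule_matches(rule, path):
    # Trailing '$' anchors the rule to an exact match; otherwise the rule
    # is an open-ended prefix (implicit '*' at the end).
    if rule.endswith("$"):
        return path == rule[:-1]
    return path.startswith(rule)

print(rule_matches("/+logout$", "/+logout"))          # True
print(rule_matches("/+logout$", "/+logout/confirm"))  # False - '$' stops here
print(rule_matches("/+saml", "/+saml/process"))       # True  - open-ended rule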

Revision history for this message
Daniel Manrique (roadmr) wrote :

Makes sense. Let's go then!

review: Approve
Revision history for this message
Otto Co-Pilot (otto-copilot) wrote :

Preview Diff

=== modified file 'src/webui/templates/static/robots.txt'
--- src/webui/templates/static/robots.txt 2018-08-24 16:15:01 +0000
+++ src/webui/templates/static/robots.txt 2018-09-12 14:54:50 +0000
@@ -1,8 +1,17 @@
 User-agent: *
 Disallow: /+bad-token
-Disallow: /+deactivated
-Disallow: /+logout
+Disallow: /+deactivated$
+Disallow: /+logout$
 Disallow: /+logout-to-confirm
-Disallow: /+openid
+Disallow: /+openid$
 Disallow: /+saml
-Disallow: /+suspended
+Disallow: /+suspended$
+Disallow: /saml
+
+Disallow: /%2Bbad-token
+Disallow: /%2Bdeactivated$
+Disallow: /%2Blogout$
+Disallow: /%2Blogout-to-confirm
+Disallow: /%2Bopenid$
+Disallow: /%2Bsaml
+Disallow: /%2Bsuspended$